We put five leading large language models head-to-head in a March Madness bracket challenge, and the results revealed more about LLM usability, safety settings, and creativity than about basketball predictions.

Some models produced clear, coherent brackets, while others delivered inconsistent formatting and overwhelming data dumps. Models should be trained to produce user-friendly output, especially in real-world applications where clarity can make or break usability.

When it came to safety, not all models wanted to play ball. Mistral and Gemini refused the task (likely flagging it as gambling-adjacent), while GPT-4o partially complied, providing only Final Four predictions. We take this as a positive: these models are increasingly being trained to avoid generating content that could be unethical or harmful.

Gemini and Mistral scored bonus points for creativity. Instead of simple refusals, they offered detailed guides on building your own bracket. Gemini went one step further and created an entirely fictional bracket with made-up schools, details about various teams’ strengths, and strategies. While this may not be useful for your office pool, it did showcase the model’s capacity for imaginative generation, a capability that can come in handy for requests that require storytelling or simulation.

Who called it right?
Do you have job openings for Shona language?
The case for continued model training is being solidified! Imagine how big the gap might be as these models become industry-, context-, or task-specialized! Then take that a step further to multiple models that feed off each other in the enterprise! Welcome to why Agentic AI transformation is here for the long haul!
Great analysis! When it comes to output usability, I’m curious, did any of the models evaluate the quality of their own responses? Do reasoning-focused models handle this differently, or do they all just generate and move on?
Taylor Trepagnier you let me down. Should have had the Gators going all the way. Typical LSU fan 😂. Great read!