Mathematicians explain why predictive algorithms still won’t get you a perfect March Madness bracket

A week into the tournament, and let us guess—your bracket’s busted.


It’s that time of year again.

March Madness is upon us, in all its basketball, school-spirit, gambling, money-on-the-line, eyes-glued-to-tiny-phone-screen-sports-apps, hidden-browser-tab-at-work, legendary-triumph, devastating-loss, stunning-upset, tears-streaming-down-a-six-foot-tall-college-senior’s-face, crazed-alumni, heart-of-a-champion, Cinderella-story, buzzer-beater glory.

And now, a week into the tournament and heading into the Sweet Sixteen, your bracket’s busted.

I’m not psychic, just data-driven. We’ve passed 48 of the 63 total games in the tournament, and the probability that your bracket is still in play is incredibly small: somewhere around 1 in 280,000,000,000,000. (That’s 280 trillion, to save you the squinting.)

At least, those would be the odds if your bracket were completely random—if you drafted it by flipping a coin to decide which team wins each face-off. With that strategy, each pick has a 50% chance of being correct; compound that chance across 48 games and your odds shrink exponentially—to 1 in 1/(0.50)^48 ≈ 280 trillion, to be exact. The whole tournament? You land that shot once every 1/(0.50)^63 ≈ 9.2 quintillion tries.
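Back-of-the-envelope, the coin-flip math works out like this (a quick sketch in Python; the game counts come straight from the tournament’s single-elimination structure):

```python
# Each coin-flip pick is right with probability 1/2, so a streak of n
# correct picks happens once in 2**n tries.
odds_48_games = 2 ** 48  # games played through the first two rounds
odds_all_63 = 2 ** 63    # every game in the tournament

print(f"1 in {odds_48_games:,}")  # 1 in 281,474,976,710,656 (~280 trillion)
print(f"1 in {odds_all_63:,}")    # 1 in 9,223,372,036,854,775,808 (~9.2 quintillion)
```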

A quintillion comes up so rarely in life that it’s hard to fathom. According to a report on the NCAA website, there are an estimated 7.5 quintillion grains of sand on Earth; if you had to guess which one specific grain, from any beach in the world, I was thinking of, your chances would be 23% better than those of a perfect bracket. And if you had to guess where a single acorn was hidden among the planet’s three trillion trees, your chances would still be three million times better than a perfect bracket.
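Those comparisons check out with simple division (the sand and tree counts below are the estimates cited above, not measurements of mine):

```python
BRACKETS = 2 ** 63     # ~9.2 quintillion equally likely coin-flip brackets
SAND_GRAINS = 7.5e18   # estimated grains of sand on Earth, per the NCAA
TREES = 3e12           # estimated trees on the planet

# Guessing one specific grain of sand beats the bracket odds by about 23%:
print(f"{BRACKETS / SAND_GRAINS - 1:.0%}")  # 23%
# Finding one hidden acorn is roughly three million times easier:
print(f"{BRACKETS / TREES:,.0f}")
```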

As Tim Chartier, a mathematics and computer science professor at Davidson College (where Golden State Warriors star Steph Curry once played), tells Fast Company, you could generate a billion brackets per second and it would still take nearly 300 years to cover the 9.2 quintillion possible versions of events.

Then again, the quintillion figure is mostly irrelevant. Nobody actually picks a bracket by flipping a coin, and if you’re at all trying to identify the better team, you’re probably beating 50%, and your odds of a perfect bracket improve exponentially.

Could it reach the realm of possibility? According to Chartier, who has researched “bracketology” for years and even developed a fan-friendly “March Mathness” website that generates weighted brackets, the answer is, not really. There are modeling machines that crunch data from millions of factors to determine the most likely outcome of each game, but Chartier says despite years of tinkering, their accuracy plateaus at around 70%.

If you picked the correct winner of each game 70% of the time, your perfect-bracket odds would be 1 in 1/(0.70)^63 ≈ 5.7 billion. Still a tough shot.
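The same exponent applies at any accuracy level, so you can see how sharply per-game skill pays off (a sketch; 70% is the plateau Chartier describes, and the other values are just for comparison):

```python
# A perfect bracket needs all 63 picks right, which happens with
# probability p**63 when each pick is correct with probability p.
for p in (0.5, 0.6, 0.7, 0.8):
    print(f"accuracy {p:.0%}: 1 in {(1 / p) ** 63:,.0f}")
```

At 70%, the odds land near 1 in 5.7 billion, matching the figure above; even 80% accuracy still leaves odds of roughly 1 in 1.3 million.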

This makes plain why Warren Buffett—who made a fortune through smart investing—has made a sport out of offering riches to anybody who achieves a perfect bracket: Once it was $1 billion; another time, $1 million per year for life. Being a numbers guy, he knows it will never happen.

Algorithms get us closer—but no cigar

Dropping from a quintillion to a billion is an improvement of many orders of magnitude, and predictive algorithms achieve it by analyzing a range of statistics. For a game between Duke and Kansas, a model could consider how often Duke wins against Kansas, how big the score difference tends to be, how many games Duke has won all season, how many it won consecutively heading into this match (a hot streak), whether star players are injured, the coach’s track record, how often 2 seeds (Duke) defeat 1 seeds (Kansas), and millions of other factors of increasing complexity, down to early-game three-point blocks and late-game free-throw percentages. Each of these factors is then weighted by importance.
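One simple way to combine weighted factors is a logistic model. This is only an illustrative sketch: the feature names, numbers, and weights below are invented, and real systems fit far more factors to historical data.

```python
from math import exp

def win_probability(features: dict[str, float], weights: dict[str, float]) -> float:
    """Combine weighted factor differences into a win probability
    via a logistic (sigmoid) function."""
    score = sum(weights[name] * value for name, value in features.items())
    return 1 / (1 + exp(-score))

# Each feature is Duke's edge over Kansas on one factor (made-up numbers):
duke_vs_kansas = {
    "head_to_head_margin": 0.3,  # Duke historically outscores Kansas slightly
    "season_win_diff": -0.1,     # Kansas won a few more games this season
    "hot_streak": 0.5,           # Duke enters on a winning streak
    "star_injured": -1.0,        # Duke's star player is hurt
    "seed_matchup": -0.2,        # 2 seeds beat 1 seeds less than half the time
}
weights = {
    "head_to_head_margin": 0.8,
    "season_win_diff": 0.6,
    "hot_streak": 0.4,
    "star_injured": 0.9,
    "seed_matchup": 0.5,
}
p = win_probability(duke_vs_kansas, weights)
print(f"Model gives Duke a {p:.0%} chance")  # Model gives Duke a 35% chance
```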

The feats these models can accomplish have grown massively in the past few years, Mark Ward, a statistics professor at Purdue and director of the data science-focused Data Mine, tells Fast Company. They can now trawl information from newspapers, social media, or Wikipedia to glean insights better than humans: “It’s able to discern things that you and I may think are just qualitative—not hard numbers, not quantitative information—but it’s able to garner whether written sentences are favorable for a team because of the way they’ve been trained.”

To drive home the point, Ward brings up AlphaGo, an AI that learned to beat humans at the complex game of Go. It does so by analyzing zillions of possible outcomes that result from a single move and deciding which are most probable, given data from past matches. AIs can do this for sports—like EA Sports’ video game Madden does for the NFL’s Super Bowl, by simulating countless possible games, each player like a chess piece with custom stats. However, as Chartier notes, the Madden approach isn’t that useful for March Madness, as most college players have stats from only one, or at most three, tournament appearances.
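The simulation idea can be sketched in a few lines: play out a toy single-elimination bracket many times with made-up team strengths (everything here, from the ratings to the win-probability rule, is an assumption for illustration):

```python
import random

def simulate_bracket(strengths: list[float], rng: random.Random) -> int:
    """Play one single-elimination tournament; return the champion's index."""
    alive = list(range(len(strengths)))
    while len(alive) > 1:
        winners = []
        for a, b in zip(alive[::2], alive[1::2]):
            # Team a wins with probability proportional to relative strength.
            p_a = strengths[a] / (strengths[a] + strengths[b])
            winners.append(a if rng.random() < p_a else b)
        alive = winners
    return alive[0]

rng = random.Random(0)
strengths = [9.0, 3.0, 5.0, 7.0]  # made-up ratings for a 4-team field
counts = [0] * 4
for _ in range(10_000):
    counts[simulate_bracket(strengths, rng)] += 1
print([c / 10_000 for c in counts])  # estimated championship odds per team
```

Running many simulated tournaments turns each team’s raw rating into an estimated probability of winning it all, which is the same shape of output a Madden-style Super Bowl simulation produces.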

Ultimately, says Ward, “they’re just models,” and luck and randomness can always thwart them.

That’s echoed by Chartier, who cites the inevitability of unknowables—the X factors that he hasn’t figured out how to account for. For one, emotional stress can wreak havoc: “Higher seeds, when they get in trouble, sense the weight of history’s shadow falling upon them, because you don’t want to be the No. 2 seed losing to the No. 15 seed. Let alone, the No. 15 seed seeing the rays of hope and success shining on them. That can sometimes raise the level of play. That’s difficult to quantify.” And sometimes, teams gel midway through the tournament and start behaving differently. Then, there are always the Cinderella stories that soar out of left field—this year, the St. Peter’s Peacocks; and last year, the Oral Roberts Golden Eagles. No model of Chartier’s foresaw those.

Even if your pick wins on an edge-of-your-seat, final-second buzzer-beater, “did you really predict that?” Chartier laughs, “’cause it easily could have gone the other way. . . . Sometimes people watch Moneyball and think it’s possible to get 90% accuracy, but there’s always some element of luck you can’t predict.” (The famed KenPom rankings system does its best with a “luck” offset, calculated as the deviation between a team’s actual winning percentage and what you’d expect based on its statistics.)
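A luck-style offset is easy to sketch. KenPom’s actual calculation is more involved; here, as a stand-in for “what you’d expect based on statistics,” I use the Pythagorean expectation with a basketball exponent of 10.25, a common but assumed choice:

```python
def pythagorean_expectation(points_for: float, points_against: float,
                            exponent: float = 10.25) -> float:
    """Expected winning percentage implied by scoring averages alone."""
    pf, pa = points_for ** exponent, points_against ** exponent
    return pf / (pf + pa)

def luck(wins: int, losses: int, points_for: float, points_against: float) -> float:
    """Actual minus expected winning percentage; positive means a team
    has won more than its scoring margin suggests."""
    actual = wins / (wins + losses)
    return actual - pythagorean_expectation(points_for, points_against)

# Hypothetical team: a 20-10 record despite a nearly even scoring margin,
# so the model flags it as having been lucky (about +0.13).
print(f"{luck(20, 10, 75.0, 74.0):+.3f}")
```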

What Chartier has found matters most: the toughness of a team’s regular-season schedule (“it isn’t just your record, it’s the strength of the teams you beat”); so-called “math mojo” (winning against good teams as the season progresses); and not missing home (winning against good teams on the road). “If you play all your hard teams at home, early in the season, it’s less predictive of March Madness performance,” he says.

The NBA, he concedes, is more predictable. Even though its players are closer in skill—which makes individual games harder to call—they’re also more consistent. (When a player gets “hot hands” at the college level, for example, the variability is far more extreme than in the pros.) And NBA championships are decided by seven-game series, versus the NCAA’s single elimination. The longer the playoffs, the more likely the better team wins.
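That last point is easy to verify: if the better team wins any single game with probability p, its chance of taking a best-of-7 series is strictly higher. A short sketch (the 60% figure is an arbitrary example):

```python
from math import comb

def best_of_7(p: float) -> float:
    """Probability of winning a best-of-7 series, given a per-game
    win probability p: win game 4+j after exactly j earlier losses."""
    return sum(comb(3 + j, j) * p ** 4 * (1 - p) ** j for j in range(4))

p = 0.6  # a team that wins 60% of individual games...
print(f"single game: {p:.0%}, best of 7: {best_of_7(p):.0%}")  # ...wins ~71% of series
```

A single-elimination format, by contrast, gives the stronger team only its single-game probability every round, which is why March Madness stays so much harder to call.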