Like most people, Sean J. Taylor would have never picked the Ravens to be in Sunday’s Super Bowl XLVII.
Taylor, a fifth-year PhD candidate at NYU’s Stern School of Business, works in an abstract realm where mathematics meets social media. He attempts to describe, and predict, outcomes. In a talk to about 200 rapt members of the NY Open Statistical Programming Meetup at Pivotal Labs Thursday, Taylor said prediction is hard.
To rank teams, colleges, people or even influence, an outcome is required. Outcomes in social networks include having the most influence, the highest Klout score or the most followers on Twitter. One measurement might be whether the follow relationship is reciprocated. “I win against somebody who follows me but I don’t follow them back.”
“You can rank people the same way you rank sports teams,” he said. But what Taylor found is that even the best algorithms (the BTL model, Pythagorean Wins and even his own optimal model) reach a prediction ceiling of roughly 75 to 80 percent. They never reach 100 percent.
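For readers unfamiliar with the two named models, here is a minimal Python sketch of their textbook forms. It is not Taylor's code. The Bradley-Terry-Luce (BTL) model gives each team a positive strength and sets the chance that one team beats another to its share of the combined strength. Pythagorean Wins estimates a team's winning percentage from points scored and allowed. The function names, the sample numbers and the exponent of 2.37 (a value commonly cited for the NFL) are assumptions made for illustration.

```python
def btl_win_probability(strength_a, strength_b):
    """Bradley-Terry-Luce: chance that team A beats team B.

    Strengths are positive numbers that would normally be fitted
    from past results; here they are simply passed in.
    """
    return strength_a / (strength_a + strength_b)


def pythagorean_expectation(points_for, points_against, exponent=2.37):
    """Pythagorean expectation: estimated winning percentage.

    The default exponent is an assumed NFL value, not one given in the talk.
    """
    scored = points_for ** exponent
    allowed = points_against ** exponent
    return scored / (scored + allowed)


# Hypothetical inputs, for illustration only.
print(btl_win_probability(2.0, 1.0))            # a team twice as strong
print(16 * pythagorean_expectation(400, 350))   # expected wins in 16 games
```

Both functions only describe how likely a win is. Neither can remove the upsets that keep prediction accuracy below 100 percent.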
After the talk, he said, “A lot of that is just inherent unpredictability. No ranking would do really well with predictability because the games are often upsets.” Taylor wants to study other sports to see if they have similar upset rates. He thinks there might be some kind of human need to have a certain level of unpredictability.
A consensus model, like one created by ESPN’s own sports experts, fares even worse than algorithmic rankings. “Coming to consensus with experts is not a robust statistical procedure,” he said. “It’s hard to say they had any opportunity to do well in prediction.” But just like BTL, ESPN does a good job of describing the data.
There are ways to improve data and make it a better fit. Instead of estimating teams, you can estimate features of teams that contribute to winning. “That’s what Brian Burke of NFL Stats does,” he said. Burke has a list of events that correlate with success and lead to victory over time—and not things that are just flukes. According to Burke, a defensive touchdown is unsustainable. Instead, a home team advantage, penalties, interceptions, fumbles and catch rates are all things that can be measured and put into a model.
Burke gives the 49ers a 62 percent chance of winning and the Ravens 38 percent. In Burke’s model the 49ers have better metrics, hence the higher probability.
Taylor didn’t reveal his exact method for obtaining his final Super Bowl outcome. He did say that at the beginning of the season, he creates a graph of the teams and expands it each week as teams continue to play each other. A topological sort in igraph creates a ranking.
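The graph-and-sort idea can be sketched in a few lines. Taylor used igraph. The sketch below swaps in Python's standard-library `graphlib` module (Python 3.9 or later) so that it runs without extra packages, and it is not his actual method. The team names and results are hypothetical, and `rank_teams` is a name made up for this example.

```python
from graphlib import TopologicalSorter


def rank_teams(games):
    """Order teams so that every winner appears before the teams it beat.

    games is a list of (winner, loser) pairs. Raises graphlib.CycleError
    if the results contain a cycle, since no consistent order exists then.
    """
    sorter = TopologicalSorter()
    for winner, loser in games:
        # Declare the winner as a predecessor of the loser.
        sorter.add(loser, winner)
    return list(sorter.static_order())


# Hypothetical results with no cycles.
games = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
ranking = rank_teams(games)
print(ranking)  # ['A', 'B', 'C', 'D']
```

A topological sort only works on a graph without cycles, which is why the next problem matters.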
But there’s a problem. Complex cycles are embedded in the data. “For instance, this year, the Eagles beat the Giants, which is pretty awesome,” Taylor, an Eagles fan, said.
But the Giants also beat the Eagles, which creates a two-cycle. The Eagles beat the Ravens. And the Ravens beat the Giants, which, together with the Giants’ win over the Eagles, forms a three-team cycle. Cycles can get very long. And ranking systems don’t accommodate them.
According to Taylor, that can be corrected by finding a minimum feedback arc set: the smallest set of games whose removal leaves the graph free of cycles. “But sometimes the results look weird,” he said. “You get stuff like the Giants are the fourth-best team in the league.”
In addition, NFL data is small. “With small data sets, methods matter. And it makes predicting the postseason even harder,” he said. Taylor’s prediction is Ravens 24, 49ers 28. But if rankings were 100 percent accurate, the outcome would be very boring.
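To make the correction concrete, here is a brute-force Python sketch applied to the four results mentioned in the talk. It is an illustration, not Taylor's implementation: it tries removing ever larger sets of games until the remaining graph can be sorted, which is only practical for tiny examples. The function names are hypothetical, and the standard-library `graphlib` module (Python 3.9 or later) stands in for igraph.

```python
from graphlib import CycleError, TopologicalSorter
from itertools import combinations


def rank_teams(games):
    """Winners-first order of teams; raises CycleError if results conflict."""
    sorter = TopologicalSorter()
    for winner, loser in games:
        sorter.add(loser, winner)  # the winner must precede the loser
    return list(sorter.static_order())


def minimum_feedback_arc_set(games):
    """Return (removed_games, ranking) for a smallest set of games whose
    removal makes the remaining results cycle-free (exhaustive search)."""
    indices = range(len(games))
    for size in range(len(games) + 1):
        for dropped in combinations(indices, size):
            kept = [games[i] for i in indices if i not in dropped]
            try:
                ranking = rank_teams(kept)
            except CycleError:
                continue  # still cyclic, try another set
            return [games[i] for i in dropped], ranking


# (winner, loser) pairs for the results cited in the talk.
games = [
    ("Eagles", "Giants"),
    ("Giants", "Eagles"),
    ("Eagles", "Ravens"),
    ("Ravens", "Giants"),
]
removed, ranking = minimum_feedback_arc_set(games)
print(removed)  # [('Giants', 'Eagles')]
print(ranking)  # ['Eagles', 'Ravens', 'Giants']
```

Here a single result, the Giants' win over the Eagles, sits on both cycles, so discarding it is enough. With a full season of games there can be several equally small sets to remove, which is one plausible reason the resulting rankings can look odd.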