
Why Didn't the Data Geeks Predict Eric Cantor's Defeat?


On Friday afternoon, the campaign for House Majority Leader Eric Cantor released an internal poll showing him up 34 points on tea party challenger Dave Brat. On Tuesday, Brat beat Cantor handily, and the result shocked the entire political world. Journalists on Twitter freaked out.

You can find similar tweets from just about every political reporter in D.C. In fact, Fox News, MSNBC, and CNN weren’t even covering the primary at the time, assuming that nothing newsworthy would happen. But a few minutes after 8 p.m., the Associated Press called the race for Brat, and Cantor gave his concession speech not long after. One question remained on everyone’s mind: How did this happen, and why did no one see it coming?

It’s impossible to know how reliable the internal poll showing Cantor with a 34-point lead really was. The campaign could have exaggerated it to create a favorable news cycle for the majority leader. But the Washington Post’s Robert Costa reported on Twitter that Cantor’s friends said the majority leader had been told he was up 20-30 points, and the Post’s Chris Cillizza tweeted that GOP strategists were blaming Cantor’s consultants. Clearly, something went wrong with Cantor’s polling team.

But the episode also says something about the limits of data. The only other recent public poll of the race, conducted by Vox Populi for the Daily Caller, found Cantor up 52-40 with a margin of error of 9 points. Even if Cantor’s internal numbers were wildly off, that poll still showed him with a solid advantage. In most races, a 12-point lead would be a big advantage; in Cantor’s, it was surprisingly small, but it still indicated a relatively easy victory for the majority leader. With the limited knowledge we had, Cantor seemed to be in a good position.

But the key word there is “limited.” As with most congressional primaries, we didn’t know much about this race. Independent pollsters generally stay away from contests like these, because district-level polling is very hard to do accurately. In other words, data can only tell you so much. When you can’t survey a sample large enough to represent a given population, a poll isn’t going to be very useful: you may get the outcome right some of the time, but other times you’ll be wildly inaccurate. That’s not helpful for a candidate.
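To get a rough sense of why sample size matters so much, here is a minimal back-of-the-envelope sketch in Python. The sample sizes below are illustrative assumptions, not figures reported by either pollster; it simply applies the textbook worst-case margin-of-error formula for a simple random sample, under which the error shrinks only with the square root of the number of respondents.

import math

def margin_of_error(sample_size, z=1.96):
    # Worst-case 95% margin of error for a simple random sample,
    # assuming a true proportion of 0.5 (maximum variance).
    return z * math.sqrt(0.25 / sample_size)

def sample_size_for(margin, z=1.96):
    # Respondents needed to reach a given margin of error under
    # the same worst-case assumption.
    return math.ceil(z ** 2 * 0.25 / margin ** 2)

print(f"n = 120   -> +/-{margin_of_error(120):.1%}")        # roughly +/-9 points
print(f"n = 1000  -> +/-{margin_of_error(1000):.1%}")       # roughly +/-3 points
print(f"+/-3 points requires n = {sample_size_for(0.03)}")  # about 1,068 respondents

A national or statewide poll can afford the larger sample on the second line; a single congressional district often cannot, which is roughly the difference between a 3-point and a 9-point margin of error.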

This is not to absolve Cantor’s campaign: His team should have had a much, much better feel for the political landscape in his district. They expected to win easily. Instead, they were blown out. If the polling team was really off by around 50 points—with a margin of error of around 5 points—they deserve considerable blame. As Vox’s Ezra Klein writes, there seems to be a fundamental problem with Republican campaign infrastructure.

One irony of Tuesday’s events is that the past few months have seen the coronation of data journalism, with the creation of several new sites: Vox, the New York Times’s The Upshot, and, most notably, Nate Silver’s FiveThirtyEight. These quants focus on numbers to tell stories, and increasingly we have relied on them to guide us through the election season. Silver and his fellow travelers often reminded us that their forecasts were just that, forecasts, and did not guarantee victory for one candidate or the other. On Tuesday night, we—journalists, pundits, lobbyists, staffers, and everyone else following the primary season—were reminded of that fundamental truth.