Stop Explaining Away Inconvenient Polling Data

One of the big takeaways from this election season is the failure of polling. The pollsters failed to catch Bernie Sanders’ strength in the Michigan primary. They missed Hillary Clinton’s general election weakness in several normally-blue states in the industrial Midwest. They missed Trump’s over-performance among black and Latino voters. Failures all, and much remarked upon. What gets less attention, and is to my mind a greater failure, is the establishment (both in the sense of their political loyalties and their faith in political science conventional wisdom) punditocracy’s repeated insistence on ignoring consistent, reliable polling data that supported conclusions they did not want to acknowledge.

The most obvious example of this was the insistence, in the face of month after month of public opinion polls showing him with a large and growing lead, that Donald Trump could not win the Republican nomination and that a figure from the party establishment would win it instead. An old poli-sci book titled “The Party Decides” enjoyed a revival because it backed up this excuse to ignore the evidence. Initially, I certainly acknowledge, it was quite reasonable to look at Trump’s polling leads with skepticism. Back in 2012, figures like Herman Cain took their turns at the top of the Republican polls, before Mitt Romney surged into the lead. And historically, there is a strong pattern of anti-establishment Republicans enjoying some time in the sun before fading away. But several months after Trump first topped the polls, that lead was growing, and had proven itself to be enduring. Nonetheless, the establishment had its story and was sticking to it, so instead of acknowledging the strength of the data and adjusting their understanding to the facts, they doubled down on their certainty – that is, on the primacy of their narratives and their own status as authorities.

Similarly, there has long been robust and enduring data demonstrating the superior strength of Bernie Sanders to Hillary Clinton in a general election matchup vs. Republican candidates. There were head-to-head polls. There were the favorability polls showing Sanders was the most popular politician in America while Clinton was grotesquely unpopular. There was a recent report done of post-election preferences showing Sanders winning a ridiculous Electoral College map. Notably, no other candidate who lost a 2016 primary, nor any third-party candidate, managed to become the most popular politician in America or win a massive landslide in a candidate-preference survey conducted after the election. Only Bernie Sanders. Surely, that tells us something about his strength as a candidate, no? Against this, there was and is…nothing. No polling showing Clinton doing better. No deep-dives into the cross tabs or the qualitative data showing that Clinton had better penetration into certain electoral segments. Nothing, zero, zilch with which to refute the data showing Sanders’ greater strength. Yet the response to this data – data during the primaries, from the general election period, and from the post-election period – from the poli-sci and pundit establishment was and continues to be to dismiss it and explain it away in the same cloud of authority-asserting verbiage they used to explain why we should ignore Trump’s primary polling.

The point here is not that we should never question the strength of polling data. It is true that the imperfect relationship between general-election head-to-head polling conducted during primaries and actual general election results makes that polling less than wholly reliable (though it does become increasingly reliable after February). It is also true that Hillary Clinton was enduring more attacks from the Republicans than was Bernie Sanders. However, even with these disclaimers, the data was and is clear and overwhelming. Professional political data analysis – think, for example, of poll aggregators – often consists of taking a large amount of evidence that, in each individual case, is of low reliability, and considering it in the aggregate in order to draw a more reliable conclusion than one could glean from any one piece of data. We can do this in a mathematical sense with an aggregation of polls that ask the same question in order to lower the margin of error, and we can do it logically with public opinion research that looks into different questions.
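To make the mathematical half of that concrete, here is a minimal sketch of how pooling polls shrinks the margin of error. Everything in it is an assumption for illustration: the poll numbers are hypothetical, the function names `margin_of_error` and `pool_polls` are my own, and the formula treats every poll as an independent simple random sample of the same population asking the same question. Real aggregators do considerably more (weighting by recency, correcting for house effects), and pooling does nothing about an error that all the polls share.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p
    from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

def pool_polls(polls):
    """Pool polls that ask the same question, weighting each by sample size.
    Returns the pooled proportion and the combined sample size."""
    total_n = sum(n for _, n in polls)
    pooled_p = sum(p * n for p, n in polls) / total_n
    return pooled_p, total_n

# Hypothetical polls: (share supporting a candidate, sample size)
polls = [(0.52, 600), (0.49, 800), (0.53, 500), (0.51, 1000)]

for p, n in polls:
    print(f"single poll: {p:.1%} +/- {margin_of_error(p, n):.1%} (n={n})")

pooled_p, total_n = pool_polls(polls)
pooled_moe = margin_of_error(pooled_p, total_n)
print(f"pooled:      {pooled_p:.1%} +/- {pooled_moe:.1%} (n={total_n})")
```

Under those assumptions, the pooled margin is narrower than that of any single poll, because the margin of error falls roughly with the square root of the combined sample size. Four individually shaky polls still add up to a firmer estimate than any one of them alone.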

But that’s not what happened. Instead, the establishment voices looked at large piles of data, picked out each individual piece, declared it to be less-than-definitive, threw it out, and drew the conclusion “We’ve got nuthin’.” There was a massive pile of evidence of Trump’s strength in the primary, and none that would lead to the conclusion that he would not win. There was, and continues to be, a massive pile of data showing that Sanders would have been the stronger general election candidate, and absolutely no evidence in the other direction – but even in hindsight, the same voices who told us that the data showing a Trump lead in the primary was meaningless tell us to ignore the data about Sanders. That big pile of data? It’s out in the garage, overflowing two 50-gallon trash barrels, because no one piece proved the case to their satisfaction. Against this, they have literally nothing. But when you ask them if the most popular politician in America would have out-performed the least popular nominee in the history of the Democratic Party, the answer you get is:


If not a mere recitation of primary-vintage oppo research and a sneer.

This is not intellectual rigor. This is not reality-based analysis. This is not a good-faith effort to use the evidence available to us to draw the best conclusion. There is a large body of evidence in one direction, and absolute bupkis in the other. A reality- and evidence-based discussion about the candidates’ relative strength would revolve around how confident we should be in the conclusion that Sanders would have been a stronger candidate. There is a good case to be made that we should only conclude that with a low-ish degree of confidence, rather than a high-ish one! But to proclaim that the entire body of evidence tells us nothing whatsoever is turf-protection, ass-covering, and social-group signaling, and it needs to stop. Especially in light of the complete collapse of the narratives on which their dismissals were based, the insistence by the Democratic Iraq War Pundits on ignoring both our lying eyes and our lying sustained-and-robust polling data needs to be ignored going forward.



  1. I think the answer you’d have gotten is that the primary-vintage oppo research was just too good, that Bernie had so much in his past to make him unacceptable, which Hillary had not used on him but the Republicans would, that there is just no possible way those numbers would hold up for even half a second.

    I don’t know how much credence I give that. Like you say, it asks us to ignore the evidence.


    • I think the idea that his good numbers would regress given the reality of being a candidate and Trump (and the press) focusing their fire on him is reasonable. The idea that he would have hit Hillary-level unfavourable numbers is…speculative. How this would relate to election prospects is tricky.

      We can reasonably suspect that the FBI wouldn’t have been so strongly against him. Which would have helped.

      (My guess is that he would have done about as well as Clinton because of polarisation, but because there was less of the establishment as dead set against him from the start, he would have had a better shot at winning. I.e., no Comey letters.)


      • I think it is certain that Sanders’ numbers would have come down had he actually been the candidate. Please don’t misunderstand – I don’t dismiss all criticisms and contextualization of the polling data as illegitimate. Some points, like this one, reduce the confidence we can have in the evidence significantly. Other arguments I’ve seen offered reduce it much less. But still, even with all of the questions raised about how much weight each piece of evidence deserves, we’re still left with a strong preponderance in favor of one conclusion and nothing on the other side of the scale.

        The most interesting analysis I read of how the counterfactual would play out posited that Sanders would overperform in the Great Lakes states, but that he would do worse than Clinton in the Confederate states, including Virginia, the only one Clinton won. The question is, how much worse?

        She won the state by five points, and had only a so-so turnout among black and Latino voters. We can reasonably posit that Sanders would have gotten more votes from younger voters of all races, and from union households, which would have to make his underperformance among older voters of color, or some other Hillary demographics, add up to more than six points. I just don’t see where those points would come from, given that Clinton’s vote total among her strongest demographics wasn’t sky-high enough for Bernie’s underperformance to be that large.


      • Can we reasonably suspect that the FBI wouldn’t have hated him as much as Hillary? He’s to her left, after all, and I do think ideological right-wing bias played a role in the FBI’s attitude (the ghost of Hoover coming back to haunt us). Do you think the FBI was dead-set against Hillary for reasons that had significantly more to do with personal Clinton-hatred than with the left/right divide?


      • “Can we reasonably suspect that the FBI wouldn’t have hated him as much as Hillary?”

        I think so. It’s not a slam dunk, as you point out, because of ideology, but, for example, a lot of this was driven by a hack book on Clinton.

        What I’ve read is that a lot of the FBI agents were personally driven by the idea that the Clintons have gotten away with stuff. It takes time to build up that sort of animosity.

        But sure, it’s only “reasonably suspect”.


  2. Sam Wang ate a bug! Now *that’s* accountability!

    I think it’s right to note that Bernie didn’t face a full opposition gauntlet and that his positive numbers may be inflated as a result. But *given Trump* (and even Clinton) we should presume that a fair number of Democrats would have come home, regardless. It’s tempting to think that there was a special hatred of Clinton, but Gore’s and Kerry’s experience would suggest otherwise.

    I don’t think that The Party Decides enjoyed a revival per se. The Party Decides was the pretty standard poli-sci view at the time and the authors acknowledged that Trump in the primaries was a pretty clear test (and that his being nominated broke their model). (Cf Bernstein’s mea culpa as well.)

    I think it’s still open whether Trump was strong or lucky. If we grant that in polarised times, there’s a strong tendency to vote party, then it’s very hard to be an impossible-to-win candidate. Add in a fundamentals edge, and it’s even harder to be an impossible-to-win candidate. Throw in a crappy campaign, then you need some other factors. Trump had some helpful factors.

    (I do think the poli-sci claim that campaigns/candidates matter much less than we think was supported, but then any claim that Bernie would have tanked is correspondingly weakened.)


  3. Here’s an interesting argument against Bernie by Kevin Drum:

    I.e., basically, Bernie is way more liberal than any prior winning Democrat.

    It’s interesting, but it seems sui generis. If Trump gets normalised, why wouldn’t Bernie? Contrariwise, aren’t all Democrats attacked as wildly leftwing? (Indeed, that’s *part* of the normalisation. Republicans believe the worst and Democrats *mostly* dismiss it as bullshit or as a plus!).

    • I would argue that, once again, we have extensive polling data about opinions of Sanders and his campaign that we can draw on here. Sanders’ left-wing politics were not a mystery. I read that as evidence that people were willing to put up with policy they disagreed with because they liked Bernie so much, in a year when they hated everyone else.


