One pollster’s explanation for why the polls got it wrong
What the hell happened with the polls this year?
Yes, the polls correctly predicted that Joe Biden would win the presidency. But they got all kinds of details, and a number of Senate races, badly wrong. FiveThirtyEight’s polling models projected that Biden would win Wisconsin by 8.3 points; with basically all the votes in, he won by a mere 0.63 points, a miss of more than 7 points. In the Maine Senate race, FiveThirtyEight estimated that Democrat Sara Gideon would beat Republican incumbent Susan Collins by 2 points; Gideon lost by 9 points, an 11-point miss.
Biden’s lead was robust enough to hold even with this kind of polling error, but the leads of candidates like Gideon (or apparently, though it’s not officially called yet, Cal Cunningham in North Carolina) were not. Not all ballots have been counted yet, which could change polling-miss estimates, but a miss is already evident in states like Wisconsin and Maine where the votes are almost all in.
To try to make sense of the massive failure of polling this year, I reached out to the smartest polling guy I know: David Shor, an independent data analyst and veteran of the Obama presidential campaigns who operated a massive web-based survey at Civis Analytics before leaving earlier this year. He now advises super PACs on ad testing. Since 2016, Shor’s been trying to sell me, and basically anyone else who’ll listen, on a particular theory of what went wrong in polling that year, and what he thinks went wrong with polling in 2018 and 2020, too.
The theory is that the kind of people who answer polls are systematically different from the kind of people who refuse to answer polls — and that this has recently begun biasing the polls in a systematic way.
This challenges a core premise of polling, which is that you can use the responses of poll takers to infer the views of the population at large — and that if there are differences between poll takers and non-poll takers, they can be statistically “controlled” for by weighting according to race, education, gender, and so forth. (Weighting increases and decreases the importance of responses from particular groups in a poll to better match their share of the actual population.) If these two groups do differ systematically, that means the results are biased.
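To make the weighting step concrete, here is a minimal sketch in Python. Every number in it is hypothetical and chosen only for illustration, and the function names `poststratify` and `weighted_support` are invented for this example. It weights on a single variable (education); real pollsters weight on several at once.

```python
# Hypothetical population shares (in practice these would come from the census).
population_share = {"college": 0.35, "non_college": 0.65}

# Hypothetical poll of 100 people: (education group, supports candidate A?).
# College graduates are overrepresented: 60 of 100 respondents.
respondents = (
    [("college", True)] * 36 + [("college", False)] * 24
    + [("non_college", True)] * 16 + [("non_college", False)] * 24
)


def poststratify(respondents, population_share):
    """Return one weight per group: population share / sample share."""
    n = len(respondents)
    counts = {}
    for group, _ in respondents:
        counts[group] = counts.get(group, 0) + 1
    return {g: population_share[g] / (counts[g] / n) for g in counts}


def weighted_support(respondents, weights):
    """Weighted share of respondents who support candidate A."""
    total = sum(weights[g] for g, _ in respondents)
    yes = sum(weights[g] for g, supports in respondents if supports)
    return yes / total


weights = poststratify(respondents, population_share)
raw = sum(supports for _, supports in respondents) / len(respondents)
adjusted = weighted_support(respondents, weights)

print(f"raw support:      {raw:.2f}")
print(f"weighted support: {adjusted:.2f}")
```

In this made-up poll, raw support is 52 percent, and weighting pulls it down to 47 percent because the overrepresented college group counts for less. The correction only works if, within each education group, the people who answered resemble the people who did not.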
The assumption that poll respondents and non-respondents are basically similar, once properly weighted, used to be roughly right — and then, starting in 2016, it became very, very wrong. People who don’t answer polls, Shor argues, tend to have low levels of trust in other people more generally. These low-trust folks used to vote similarly to everyone else. But as of 2016, they don’t: they tend to vote for Republicans.
Now, in 2020, Shor argues that the differences between poll respondents and non-respondents have gotten larger still. In part due to Covid-19 stir-craziness, Democrats, and particularly highly civically engaged Democrats who donate to and volunteer for campaigns, have become likelier to answer polls. It’s something to do when we’re all bored, and it feels civically useful. This biased the polls, Shor argues, in deep ways that even the best polls (including his own) struggled to account for.
Liberal Democrats answered more polls, so the polls overrepresented liberal Democrats and their views (even after weighting), and thus the polls gave Biden and Senate Democrats inflated odds of winning.
Shor and I talked on Zoom last Thursday about the 2020 polling miss, how he’s trying to prevent it from happening again (at least with his own survey), and why qualitative research is vulnerable to these same problems. A transcript, edited for length and clarity, follows.
So, David: What the hell happened with the polls this year?
So the basic story is that, particularly after Covid-19, Democrats got extremely excited, and had very high rates of engagement. They were donating at higher rates, etc., and this translated to them also taking surveys, because they were locked at home and didn’t have anything else to do. There’s some pretty clear evidence that that’s nearly all of it: it was partisan non-response. Democrats just started taking a bunch of surveys [when they were called by pollsters, while Republicans did not].
Just to put some numbers on that, if you look at the early vote results, and compare it with the cross tabs of what public polls said early voters were going to be, it’s pretty clear that early voters were considerably less Democratic than people thought. Campaign pollsters can actually join survey takers to voter files, and starting in March, the percentage of our survey takers who were, say, ActBlue donors, skyrocketed. The average social trust of respondents went up, core attitudes changed — basically, liberals just started taking surveys at really high rates. That’s what happened.
You mentioned social trust. Walk me through your basic theory about how people who agree to take surveys have higher levels of social trust, and how that has biased the polls in recent years.
For three cycles in a row, there’s been this consistent pattern of pollsters overestimating Democratic support in some states and underestimating support in other states. This has been pretty consistent. It happened in 2018. It happened in 2020. And the reason that’s happening is because the way that [pollsters] are doing polling right now just doesn’t work.
Poll Twitter tends to ascribe these mystical powers to these different pollsters. But they’re all doing very similar things. Fundamentally, every “high quality public pollster” does random digit dialing. They call a bunch of random numbers, roughly 1 percent of people pick up the phone, and then they ask stuff like education, and age, and race, and gender, sometimes household size. And then they weight it up to the census, because the census says how many adults do all of those things. That works if people who answer surveys are the same as people who don’t, once you control for age and race and gender and all this other stuff.
But it turns out that people who answer surveys are really weird. They’re considerably more politically engaged than normal. I put in a five-factor test [a kind of personality survey] and they have much higher agreeableness [a measure of how cooperative and warm people are], which makes sense, if you think about literally what’s happening.
They also have higher levels of social trust. I use the General Social Survey’s question, which is, “Generally speaking, would you say that most people can be trusted or that you can’t be too careful in dealing with people?” The way the GSS works is they hire tons of people to go get in-person responses. They get a 70 percent response rate. We can basically believe what they say.
It turns out, in the GSS, that 70 percent of people say that people can’t be trusted. And if you do phone surveys, and you weight, you will get that 50 percent of people say that people can be trusted. It’s a pretty massive gap. [Sociologist] Robert Putnam actually did some research on this, but people who don’t trust people and don’t trust institutions are way less likely to answer phone surveys. Unsurprising! This has always been true. It just used to not matter.
It used to be that once you control for age and race and gender and education, people who trusted their neighbors basically voted the same as people who didn’t trust their neighbors. But then, starting in 2016, suddenly that shifted. If you look at white people without college education, high-trust non-college whites tended toward [Democrats], and low-trust non-college whites heavily turned against us. In 2016, we were polling this high-trust electorate, so we overestimated Clinton. These low-trust people still vote, even if they’re not answering these phone surveys.
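The arithmetic behind that argument can be sketched in a few lines of Python. The trust shares follow the figures quoted above (30 percent high-trust in the population, 50 percent among phone respondents). The response rates and the Democratic support levels are hypothetical values picked to reproduce that gap, and `polled_vs_true` is a name invented for this example.

```python
def polled_vs_true(trust_share, response_rate, dem_support):
    """Compare Democratic support among respondents with the true figure.

    All three arguments are dicts keyed by trust group, describing a single
    demographic cell (so demographic weighting cannot change the result).
    """
    true = sum(trust_share[g] * dem_support[g] for g in trust_share)
    # Share of the cell that actually answers, by trust group.
    responders = {g: trust_share[g] * response_rate[g] for g in trust_share}
    total = sum(responders.values())
    polled = sum(responders[g] / total * dem_support[g] for g in trust_share)
    return polled, true


trust_share = {"high": 0.30, "low": 0.70}
# Hypothetical response rates: high-trust people answer 7/3 as often,
# which makes respondents 50 percent high-trust.
response_rate = {"high": 0.021, "low": 0.009}

# Scenario 1 (hypothetical): trust is unrelated to vote choice.
same_vote = {"high": 0.45, "low": 0.45}
# Scenario 2 (hypothetical): low-trust voters lean Republican.
split_vote = {"high": 0.55, "low": 0.35}

polled_1, true_1 = polled_vs_true(trust_share, response_rate, same_vote)
polled_2, true_2 = polled_vs_true(trust_share, response_rate, split_vote)

print(f"trust unrelated to vote: polled {polled_1:.2f}, true {true_1:.2f}")
print(f"trust related to vote:   polled {polled_2:.2f}, true {true_2:.2f}")
```

In the first scenario the poll matches reality even though respondents are far more trusting than the population. In the second, the same respondents overstate Democratic support by about 4 points (45 percent polled versus 41 percent true), and no amount of weighting by age, race, gender, or education would remove it.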
So that’s 2016. Same story in 2018 and 2020?
The same biases happened again in 2018, which people didn’t notice because Democrats won anyway. What’s different about this cycle is that in 2016 and 2018, the national polls were basically right. This time, we’ll see when all the ballots get counted, but the national polls were pretty wrong. If you look at why, I think the answer is related, which is that people who answer phone surveys are considerably more politically engaged than the overall population.
If you match to vote history, literally 95 percent of people who answer phone surveys vote. That’s the problem with “likely voter screens” [which try to improve polls by limiting them to the respondents likeliest to vote]. If you restrict to people who have never voted in an election before, 70 percent of phone survey takers vote. If you restrict to people who say they will definitely not vote, 76 percent of those people vote.
Normally that doesn’t matter, because political engagement is actually not super correlated with partisanship. That is normally true, and if it wasn’t, polling would totally break. In 2020, it broke. There were very, very high levels of political engagement by liberals during Covid. You can see in the data it really happened around March. Democrats’ public Senate polling started surging in March. Liberals were cooped up, because of Covid, and so they started answering surveys more and being more engaged.
This gets to something that’s really scary about polling, which is that polling is fundamentally built on this assumption that people who answer surveys are the same as people who don’t, once you condition on enough things. That can be true at any given time. But these things that we’re trying to measure are constantly changing. And so you can have a method that worked in past cycles suddenly break.
Why can’t you just fix that by weighting? Why not just control the results by social trust to get around that problem?
You can know from the GSS, say, how many people nationwide have low levels of social trust. But that doesn’t tell you — what about likely voters? Or what about likely voters in Ohio’s 13th Congressional District? How does that break out by race or gender or education? How does that interact with turnout? All that stuff becomes quite hard.
There’s a reason pollsters don’t weight by everything. Say you have 800 responses. The more variables you weight by, the lower your effective sample size is. Once the number of things you control for increases past a certain point, traditional techniques start to fail and you need to start doing machine learning and modeling.
This is the bigger point about the industry I’m trying to make. There used to be a world where polling involved calling people, applying classical statistical adjustments, and putting most of the emphasis on interpretation. Now you need voter files and proprietary first-party data and teams of machine learning engineers. It’s become a much harder problem.
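The trade-off between weighting and effective sample size mentioned above can be illustrated with Kish’s approximation, a standard survey-statistics formula: the effective sample size is the square of the sum of the weights divided by the sum of the squared weights. Shor does not say which formula he has in mind, so treat this as one common way to quantify the idea. The weight values below are hypothetical.

```python
def effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


# 800 respondents, as in the example above.
# No weighting: every respondent counts equally.
unweighted = [1.0] * 800

# Hypothetical mild weighting: half the sample is down-weighted to 0.5,
# the other half up-weighted to 1.5.
mild = [0.5] * 400 + [1.5] * 400

# Hypothetical heavier weighting, as might result from adjusting on more
# variables: weights spread further from 1.
heavy = [0.25] * 400 + [1.75] * 400

n_unweighted = effective_sample_size(unweighted)
n_mild = effective_sample_size(mild)
n_heavy = effective_sample_size(heavy)

print(n_unweighted, n_mild, n_heavy)
```

With these made-up weights, 800 interviews behave like 640 under the mild scheme and like 512 under the heavier one. The more the weights vary, which tends to happen as more variables are added, the less information the same 800 responses carry.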
One reaction I’ve seen from several quarters is that 2020 shows that quantitative methods aren’t enough to understand the electorate, and pollsters need to do more to incorporate ethnographic techniques, deep interviews, etc. In a way you’re proposing the opposite: Pollsters need to get way more sophisticated in their quantitative methods to overcome the biases that wrecked the polls this year. Am I understanding that right?
I mean, I’m not a robot. Qualitative research and interpretation are important for winning elections. But I think it’s a misunderstanding of why polls were wrong.
A lot of people think that the reason why polls were wrong was because of “shy Trump voters.” You talk to someone, they say they’re undecided, or they say they’re gonna vote for Biden, but it wasn’t real. Then, maybe if you had a focus group, they’d say, “I’m voting for Biden, but I don’t know.” And then your ethnographer could read the uncertainty and decide, “Okay, this isn’t really a firm Biden voter.” That kind of thing is very trendy, as an explanation.
But it’s not why the polls were wrong. It just isn’t. People tell the truth, when you ask them who they’re voting for. They really do, on average. The reason why the polls are wrong is because the people who were answering these surveys were the wrong people. If you do your ethnographic research, if you try to recruit these focus groups, you’re going to have the same biases. They recruit focus groups by calling people! Survey takers are weird. People in focus groups are even weirder. Qualitative research doesn’t solve the problem of one group of people being really, really excited to share their opinions, while another group isn’t. As long as that bias exists, it’ll percolate down to whatever you do.