With roughly three weeks before Americans choose a new president, voters have begun paying much closer attention to the latest polls.
Many of them are wondering: Why are the results all over the place?
Last week, Democratic presidential candidate Hillary Clinton appeared to be widening her lead over GOP rival Donald Trump in national voter polls, hitting a double-digit advantage in a few surveys. But the results span a wide range.
On Sunday, for example, results from a joint NBC News/Wall Street Journal poll showed Clinton holding an 11-point margin over Trump, 48-37 percent, with a margin of error of about 3 points.
On the same day, an ABC News/Washington Post poll showed Clinton leading Trump by just a 4-point margin, 47-43 percent, which was roughly the same as the survey’s margin of error.
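For context, the margin of error quoted alongside a poll is typically a 95 percent confidence interval that depends almost entirely on the sample size. The short Python sketch below shows the standard textbook formula; the sample size of 1,000 respondents is an assumption chosen for illustration, not a figure reported by either survey.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95 percent margin of error for a simple random sample of size n,
    using the conservative assumption that support is evenly split (p = 0.5)."""
    return z * math.sqrt(p * (1 - p) / n)

# Roughly 1,000 respondents (an assumed figure) yields the familiar
# "about 3 points" quoted with national polls.
print(round(margin_of_error(1000) * 100, 1))  # ~3.1 percentage points
```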
Why do the margins vary so widely from one poll to another? The short answer: Opinion polling includes a lot of guesswork and assumptions, and pollsters make different choices when setting up their surveys.
Despite decades of refinements based on rigorous statistical science, modern opinion polls are still just rough estimates of how survey respondents might actually cast their votes. No matter how vigilant, the designers of these polls have to deal with a substantial amount of noise in the data they collect.
For starters, voters may say one thing to a pollster and then change their minds days later. That’s why results can swing so widely early in the race, but tend to solidify as Election Day approaches.
This year, a four-way race further complicates pollsters’ lives. Give people a choice that includes Libertarian Party candidate Gary Johnson and Green Party hopeful Jill Stein, and fewer of them will pick Clinton or Trump. In a two-way match-up, the margins will be different.
Most polls also rely on an extremely small sample — a few thousand at most — and then weight the results based on the demographic groups represented by the people who responded to the survey.
Given the relatively small sample, just a few thousand responses to estimate the mood of the entire electorate, the final results can be skewed by a few dozen opinions.
For example, to estimate the Election Day votes of the 130 million people who voted for president in 2012, a pollster might make just 1,300 phone calls, which means each response has to stand in for 100,000 voters. If the original sample includes fewer 50-something women than the national average, those responses are given more weight, while another group that’s over-represented in the sample, say, 20-something men, is underweighted.
But there are different ways to weight any given sample, and those choices go a long way toward explaining the wide variance that shows up from one poll to another. One pollster, for example, might weight its raw responses to match the demographic balance in the latest census data. Another might use exit polling data from the last election cycle to set its demographic targets.
That’s why small shifts from one survey to the next can be magnified in the final results.
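To make that concrete, here is a minimal Python sketch of this kind of post-stratification weighting. The demographic groups, the raw responses and both sets of population targets are invented for illustration; no actual pollster’s data or targets appear here. The point is only that the same 1,000 responses produce different headline numbers depending on which targets are used.

```python
from collections import Counter

# Hypothetical raw responses: (demographic group, candidate choice).
# The groups and counts are invented purely for illustration.
responses = (
    [("women_50s", "Clinton")] * 80 + [("women_50s", "Trump")] * 60 +
    [("men_20s", "Clinton")] * 120 + [("men_20s", "Trump")] * 140 +
    [("everyone_else", "Clinton")] * 310 + [("everyone_else", "Trump")] * 290
)

def weighted_support(responses, targets):
    """Post-stratification: scale each group so its share of the weighted
    sample matches the target population share, then tally candidate support."""
    group_counts = Counter(group for group, _ in responses)
    n = len(responses)
    weights = {g: share / (group_counts[g] / n) for g, share in targets.items()}
    totals = Counter()
    for group, choice in responses:
        totals[choice] += weights[group]
    grand_total = sum(totals.values())
    return {c: round(100 * v / grand_total, 1) for c, v in totals.items()}

# Two plausible but different sets of population targets (both made up):
census_style_targets = {"women_50s": 0.20, "men_20s": 0.15, "everyone_else": 0.65}
exit_poll_style_targets = {"women_50s": 0.24, "men_20s": 0.12, "everyone_else": 0.64}

print(weighted_support(responses, census_style_targets))     # Clinton ~51.9, Trump ~48.1
print(weighted_support(responses, exit_poll_style_targets))  # Clinton ~52.3, Trump ~47.7
```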
Some online surveys use larger samples; an internet “poll” can generate tens of thousands of responses. But these online polls rarely ask for enough demographic information, such as race, education, age and gender, to allow for the statistical weighting required to make the survey scientifically valid. So online polls generally say a lot more about a website’s audience than about the attitudes of a “typical” voter.
That’s why most political analysts look at the average of several polls, as well as shifts in direction within an individual poll that uses a consistent methodology.
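A basic polling average is simple enough to sketch in a few lines of Python. The first two entries below echo the NBC News/Wall Street Journal and ABC News/Washington Post results cited above; the other two are placeholders, and real-world averages also screen polls by date, sample and methodology.

```python
# Recent national polls as (pollster, Clinton %, Trump %).
# The first two echo the surveys cited above; the rest are placeholders.
polls = [
    ("NBC News/WSJ", 48, 37),
    ("ABC News/Washington Post", 47, 43),
    ("Placeholder Poll C", 45, 41),
    ("Placeholder Poll D", 46, 42),
]

clinton_avg = sum(clinton for _, clinton, _ in polls) / len(polls)
trump_avg = sum(trump for _, _, trump in polls) / len(polls)

print(f"Polling average: Clinton {clinton_avg:.1f}, Trump {trump_avg:.1f} "
      f"(margin {clinton_avg - trump_avg:+.1f})")
```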
Poll results can also be skewed by the methods used to collect data. Most pollsters attempt to collect their raw responses from a “random” pool of voters, but that’s harder than it might seem.
For one thing, it’s not always easy to predict how many of the people who respond to a poll will actually vote. That’s why many polls target both registered voters and people who say they are “likely” voters. That likelihood is self-reported, so it’s impossible to confirm whether they’ll actually turn out to vote.
Pollsters also have to be careful how they generate their lists of “random” telephone numbers, especially as more cellphone users abandon traditional landline service. Those “cord cutters,” for example, tend to be younger than the average landline customer.