Over the weekend, Mark Blumenthal of the Huffington Post published a lengthy review of the Gallup poll’s methodology. It is a technical read, but I encourage you to give it a careful look.
Blumenthal’s bottom line is that Gallup tends to place the president’s job approval rating about 2.5 points below the average of similar polls, and that a portion of this can be chalked up to under-sampling non-whites.
What to make of this? I have five points to offer.
1. Blumenthal’s technical analysis appears sound. Basically, a series of difficult choices has to be made about how to sample non-white adults. Blumenthal argues that the sum of Gallup’s choices results in fewer non-whites being sampled than in other polls, according to the most up-to-date demographic information about America. This is a persuasive point.
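To see why the share of non-whites in a sample matters, here is a minimal sketch of the arithmetic in Python. Every number in it is a hypothetical value chosen for illustration, not a figure from Gallup, Pew, or Blumenthal’s review, and the function name `topline_approval` is my own.

```python
def topline_approval(nonwhite_share, approve_nonwhite, approve_white):
    """Blend two subgroup approval rates into one top-line number.

    All inputs are fractions between 0 and 1.
    """
    return nonwhite_share * approve_nonwhite + (1 - nonwhite_share) * approve_white


# Hypothetical inputs: the same subgroup approval rates, two different sample mixes.
approval_fuller_sample = topline_approval(0.30, 0.75, 0.38)
approval_under_sample = topline_approval(0.26, 0.75, 0.38)

# Gap between the two top-line numbers, in percentage points.
gap_points = (approval_fuller_sample - approval_under_sample) * 100
print(round(gap_points, 2))
```

Under these made-up inputs, trimming the non-white share by four points pulls the top-line approval number down by roughly a point and a half, even though nobody’s opinion has changed.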
2. Gallup hasn’t done anything wrong. This is where I start to diverge from Blumenthal. To be clear, he does not accuse Gallup of doing anything unethical. Instead, his argument is that Gallup’s choices are methodologically inappropriate.
The problem is that Blumenthal looks at Gallup in isolation.
Think of it this way: Suppose that I am asked to describe my cat, Popper, to somebody who has never met a cat before. I tell him that she is very stuck up; she acts as if the world revolves around her; she gets in fights all the time, in particular with her younger sisters; and she generally behaves as though I am her pet, not vice versa. That person might conclude that Popper is a bad cat, but she is in fact typical of the species. Because I only discussed one cat, without noting how cats in general are prickly and self-involved, I gave the false sense that Popper is somehow worse than others.
And so it goes with Blumenthal’s analysis. Yes, Gallup is making decisions about how to sample, and these decisions create a “house effect,” which in this case cuts against the president. However, house effects are endemic to polling. It is an inexact science, and every pollster must make trade-offs similar to the ones Gallup is making. Some pollsters make choices that cut against this president; others make choices that cut in his favor.
The only fair way to criticize Gallup for its choices is to examine them in relation to those of other pollsters: why did Gallup make one choice that another pollster did not, what effect did these different choices have on the final result, and whose decision is more defensible? Yes, Gallup is under-sampling non-whites, but maybe achieving a proper balance among non-whites creates imbalances in other aspects of the poll, which Gallup judged to be more harmful to its accuracy. Blumenthal cannot address this (very reasonable) possibility because he has not looked at other polls.
3. Other polls have similar issues. Blumenthal compares the Gallup poll to Pew, using the latter as a baseline. However, he fails to note that Pew has a “house effect” that is roughly as substantial as Gallup’s.
Specifically, consider the following graph, which I created by using Pollster’s superb polling widget. The black and red lines track President Obama’s job approval and disapproval since 2010 among the major media polls of adults, conducted via live interviews. The dots are Pew’s disapproval numbers.
As we can see, Pew systematically underestimates Obama’s disapproval rating relative to the other polls.
Why is this? Quite possibly, it relates to my previous point: Pew is confronting the same difficult challenges that Gallup is facing, making different decisions on how to deal with them, and thus arriving at a different result.
By the way, this is the source of all house effects: assumptions about a whole host of questions must be made, and those assumptions can add up to small yet persistent statistical biases.
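For readers who want to see what a house effect looks like as a number, here is a rough sketch in Python. It assumes one simple definition: a pollster’s average gap from the mean of the other polls taken in the same period. The pollster names and readings are hypothetical placeholders, not real polling data, and this is not necessarily how Pollster or Blumenthal computes the figure.

```python
from statistics import mean


def house_effect(periods, pollster):
    """Average gap between one pollster and the mean of the others.

    `periods` is a list of dicts, one per time period, each mapping a
    pollster name to its approval reading in percentage points.
    """
    gaps = []
    for readings in periods:
        if pollster not in readings:
            continue  # this pollster did not publish in this period
        others = [value for name, value in readings.items() if name != pollster]
        if others:
            gaps.append(readings[pollster] - mean(others))
    if not gaps:
        raise ValueError("no overlapping periods to compare")
    return mean(gaps)


# Hypothetical readings for three made-up pollsters over three periods.
periods = [
    {"Pollster A": 45.0, "Pollster B": 48.0, "Pollster C": 47.0},
    {"Pollster A": 44.0, "Pollster B": 46.0, "Pollster C": 47.0},
    {"Pollster A": 46.0, "Pollster B": 49.0, "Pollster C": 48.0},
]

effect_a = house_effect(periods, "Pollster A")
print(effect_a)
```

A negative result means the pollster tends to run below its peers, and a positive one means it runs above. Running the same calculation for every pollster, rather than just one, is exactly the kind of side-by-side comparison I am arguing for.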
4. It is misleading to single out Gallup. In my years of watching polls, I’ve concluded that all pollsters fall into one of three categories:
(a) Hacks. These pollsters skew their numbers to score some political point.