Three Frequently Asked Questions

There are a number of questions about the polls that we get asked time and time again. Over the next few months, as we run up to the general election that will probably be held next year, I am going to try to answer some of them here. Let us begin with some questions about sampling:

"How can a sample of only 1,000 or 2,000 possibly reflect the opinions of 42 million Britons within a 3% margin of error?"

There is a time-honoured answer to this question that goes back to George Gallup, the American who first developed opinion polling in the 1930s: if you have a large bowl of soup, you don't have to drink the whole bowl to decide if it has too much salt in it - just stir it well, and one spoonful will suffice.

Of course, finding a representative sample is not really as easy as stirring soup. The theory of representative samples is derived from the mathematical science called statistics, which dictates how to judge the probability of different events. The study of probability was originally developed to understand the gambling odds involved in various permutations of dice throws or playing cards, and we can use a simple example to illustrate the theory behind sampling. Suppose you have four playing cards in front of you: a heart, a club, a diamond and a spade. If you were to shuffle them together and pick two of them at random, it is not too hard to work out how likely it is that you will pick one red and one black card. There are six different possible pairs that you could pick. Two of those six are pairs of the same colour (the diamond and the heart, or the spade and the club); the other four are mixed. So there are four chances in six, or about 66.7%, that you will pick one red and one black card.
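The count of pairs can be checked by brute force. What follows is a minimal Python sketch rather than part of the original argument; the card labels and the function name `mixed_pair_probability` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

# Illustrative labels (an assumption for this sketch): each card is (suit, colour).
CARDS = [
    ("heart", "red"),
    ("diamond", "red"),
    ("club", "black"),
    ("spade", "black"),
]

def mixed_pair_probability(cards):
    """Return the chance that two cards picked at random differ in colour."""
    # Every unordered pair of distinct cards, each counted once.
    pairs = list(combinations(cards, 2))
    # Keep only the pairs made up of one red and one black card.
    mixed = [pair for pair in pairs if pair[0][1] != pair[1][1]]
    return Fraction(len(mixed), len(pairs))

probability = mixed_pair_probability(CARDS)
print(probability, float(probability))
```

Listing the pairs explicitly only works for tiny packs, but it confirms the figures in the text: six pairs in all, four of them mixed.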

Now suppose you have the whole pack of 52 cards. You can work out in exactly the same way how likely you would be to pick one red and one black card, although it will take you a lot longer to count all the possible pairs; but instead of counting, a simple mathematical calculation will tell you how many pairs there must be. (The first card can be any of 52, and the second any of the remaining 51, so there are 52 x 51 possibilities. This is double the total number of possible pairs, because you can pick each possible pair two ways depending on which of the two cards comes out first.) From here it is a simple step to working out the formula that will tell you how likely it is that any six cards you pick will be split equally between red and black, or any fourteen cards. Or, indeed, how likely you are to get any other combination of red and black.
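The calculation for the full pack can be sketched in a few lines of Python. This is an illustration under the assumption of a standard pack of 26 red and 26 black cards; the function names `pairs_in_pack` and `prob_red_count` are invented for the example.

```python
from fractions import Fraction
from math import comb

def pairs_in_pack(pack_size=52):
    """Count unordered pairs: ordered picks (52 x 51) divided by two."""
    return pack_size * (pack_size - 1) // 2

def prob_red_count(drawn, reds_drawn, reds_in_pack=26, pack_size=52):
    """Chance that `drawn` randomly picked cards include exactly `reds_drawn` reds."""
    blacks_in_pack = pack_size - reds_in_pack
    # Ways to choose the red cards times ways to choose the black cards ...
    ways = comb(reds_in_pack, reds_drawn) * comb(blacks_in_pack, drawn - reds_drawn)
    # ... divided by all the ways to choose `drawn` cards from the pack.
    return Fraction(ways, comb(pack_size, drawn))

print(pairs_in_pack())                 # number of possible pairs in a full pack
print(float(prob_red_count(2, 1)))     # one red and one black from two cards
print(float(prob_red_count(6, 3)))     # six cards split equally
print(float(prob_red_count(14, 7)))    # fourteen cards split equally
```

Setting `reds_in_pack=2` and `pack_size=4` reproduces the four-card example, which is a useful check that the general formula agrees with the hand count.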

Now suppose you don't know for certain that the 52 cards in front of you are necessarily a full pack: there may be more than 26 red cards, or there may be fewer. But if you pick, say, ten of them at random and find that five are red, you can work out how likely that would be if 26 of the whole pack were red, and how likely if 30 were red, and how likely if 35 were red, and so on. In short, this gives you a best guess for the number of red cards in the whole pack, and a margin of error (or "confidence interval"). Then, when you know how likely it is that any given number of randomly drawn cards will have the same split of reds and blacks as the whole pack - what we call being representative - you can work out how many cards you need to pick before you reach a given level of certainty.
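The reasoning in the other direction, from the sample back to the pack, can be sketched as follows. This is an illustrative Python example, assuming a 52-card pack with an unknown number of reds and a draw of ten cards of which five are red; the name `likelihood` is an assumption of the sketch.

```python
from math import comb

def likelihood(reds_in_pack, drawn=10, reds_drawn=5, pack_size=52):
    """How likely the observed draw would be if the pack held `reds_in_pack` reds."""
    blacks_in_pack = pack_size - reds_in_pack
    ways = comb(reds_in_pack, reds_drawn) * comb(blacks_in_pack, drawn - reds_drawn)
    return ways / comb(pack_size, drawn)

# Compare the three compositions mentioned in the text.
for reds in (26, 30, 35):
    print(reds, round(likelihood(reds), 3))

# The best guess is the composition that makes the observed draw most likely.
best_guess = max(range(53), key=likelihood)
print("best guess:", best_guess)
```

Compositions close to the best guess are nearly as likely as the best guess itself, and compositions far from it are much less likely. That spread is what the margin of error summarises.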

Sample survey reliability works the same way - but on a much larger scale. Instead of 52 cards, we have 42 million adults to pick from. Using exactly the same mathematical principles (although the formulas have got a great deal more complicated now), we can find that 19 out of 20 of all the samples which could possibly be drawn will yield an outcome within 3% of the true percentage among the population. The most likely outcome is the true percentage of whatever it is we are measuring; next most likely are outcomes very close to this true percentage. Only one in 20 of the possible samples falls outside this 3% range. This is what is meant by the "3% margin of error" that is often - especially in the USA though sometimes here as well - tagged onto the end of news reports of poll stories.
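For the simplest case, a pure random sample, the usual textbook approximation puts the 19-in-20 margin of error at 1.96 x sqrt(p(1-p)/n), where p is the true proportion and n the sample size. The Python sketch below applies that approximation. It is an illustrative assumption, not the calculation used for any particular poll, and the function name `margin_of_error` is invented for the example. The size of the population does not appear because, with 42 million adults, its effect on samples of this size is negligible.

```python
from math import sqrt

# Normal-distribution multiplier that covers about 19 samples in 20 (95%).
Z_95 = 1.96

def margin_of_error(sample_size, proportion=0.5):
    """Approximate 95% margin of error for a pure random sample.

    proportion=0.5 is the worst case: it gives the widest margin.
    """
    return Z_95 * sqrt(proportion * (1 - proportion) / sample_size)

for n in (1000, 2000):
    print(n, f"{margin_of_error(n):.1%}")
```

With 1,000 respondents the margin comes out at roughly three percentage points, and with 2,000 it is a little over two.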

"Is a larger sample always better than a smaller sample?"

All other things being equal, yes, but all other things are rarely equal. A small but representative sample is far better than a large and unrepresentative sample. In the early days of polling, George Gallup and his rivals, Archibald Crossley and Elmo Roper, made their reputations by correctly predicting the result of the 1936 US Presidential election using small but scientifically-selected samples. The long-established straw poll organised by the magazine Literary Digest, which analysed millions (sic) of postal responses but drew on a sample biased by its reliance on directories of telephone and car owners, called the result disastrously wrong. In just the same way today, worthless phone-in polls organised by newspapers or TV stations may get hundreds of thousands of responses but are unable to control the composition of their samples (or, quite often, to prevent people voting more than once).

The important rule in sampling is not how many poll respondents are selected but, instead, how they are selected. In theory, the most reliable sample selects poll respondents randomly - which does not mean haphazardly! A random sample, also called a probability sample, is one that ensures that everybody in the population being surveyed has an equal chance of being selected for interview. In practice, a pure random sample is not always practical. Apart from any other considerations, it takes a lot of time, because once you have selected your target respondents you have to make repeated attempts to contact them. Even after weeks, the chances are that the response rate will be a long way short of 100%, and of course those who have not been contacted or who have refused to take part may well be systematically different from those who have been interviewed. Besides, an opinion poll which takes weeks to conduct may be out of date by the time it has been completed, and of very little use when the issues or political opinions in view might change daily! For this reason, most polls in Britain use quota sampling, which pre-determines the demographic make-up of the sample to ensure it matches the profile of the whole adult population - although they often incorporate elements of random sampling as well, such as selecting telephone numbers by random digit dialling or, in the case of face-to-face interviews, randomly selecting the areas where the interviews are conducted. The quotas ensure, in theory, that we don't over-represent any groups that are easier to find, and that every potential respondent who won't take part is replaced with somebody of similar views who will. Historically in Britain, the record of quota samples in predicting elections has been better than that of random samples.

For any given sampling design, in theory the larger the sample the more accurate the survey can be expected to be; but the benefits of increasing the sample size beyond a certain point are small. In the case of pure random samples, the mathematics dictates that doubling the sample size narrows the margin of error by only about 30%, because the margin shrinks with the square root of the sample size, so there is a point beyond which the extra expense of larger samples is wasted. One frequent reason for larger samples is to allow small subgroups of the population (first-time voters, for example, or readers of a particular newspaper) to be examined and compared.
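The diminishing return can be made concrete by turning the question round and asking how large a sample a given margin of error requires. The Python sketch below uses the standard approximation for a pure random sample at the 19-in-20 level, with the worst case of a 50/50 split. Both the approximation and the function name `required_sample_size` are assumptions made for illustration.

```python
from math import ceil

# Normal-distribution multiplier that covers about 19 samples in 20 (95%).
Z_95 = 1.96

def required_sample_size(margin):
    """Smallest pure random sample whose 95% margin of error is at most `margin`.

    Assumes the worst case of a 50/50 split, where the margin is
    Z_95 * sqrt(0.25 / n); solving for n gives (Z_95 / (2 * margin)) ** 2.
    """
    return ceil((Z_95 / (2 * margin)) ** 2)

for margin in (0.03, 0.015):
    print(f"{margin:.1%}", required_sample_size(margin))
```

Under these assumptions, halving the margin from 3% to 1.5% takes about four times as many interviews, which is why very large samples are rarely worth the cost for the headline figure alone.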

"If the polls are representative of the whole public, why have I never been polled?"

There is a story that George Gallup used to answer this complaint with the comment that "You are more likely to be struck by lightning than to be interviewed in one of my polls", until one lady retorted, "But Dr Gallup, I've been struck by lightning twice". But Gallup's logic is still fundamentally true, even with the vastly increased number of polls and surveys that are conducted today. There are about 42 million adults in Great Britain, and most polls have a sample size of 1,000 adults. In any given year, MORI conducts only just over 400,000 interviews - and that includes not only opinion polls (which are a comparatively small part of our business), but all our other market research work as well. So we ought to get round to you around once every hundred years - and that assumes, of course, that nobody is ever interviewed twice, which isn't true either. (MORI can strike twice, like the lightning in the case of Dr Gallup's unfortunate questioner.)

Of course, getting a representative sample is only one part of the battle. Even when the opinions of our 1,000 respondents exactly mirror those of the whole adult population, we have still got to ask them the right questions and get them to give us the right answers. More on that next month.
