“The truth is rarely pure and never simple,” Oscar Wilde famously pointed out.
In the audience measurement business a less famous British commentator, Rodney Harris, noted in the 1980s that:
“Media research is not designed to find out the truth. It is a treaty between interested parties.”
Certainly, if by truth we mean perfection, then audience measurement falls down. There are no perfect audience numbers; all are estimates. Short of interviewing everybody in the population and achieving error-free recall or tracking of behaviour, all measurement systems are imperfect.
So we employ statistical techniques to help us. Four stand out: sample weighting, ascription, data fusion and modelling.
It is true that we can tell what blood type a person has or whether they are afflicted with certain conditions from examining very small amounts of blood. And it is equally true that we can learn a lot from survey samples about people’s opinions and behaviour.
But sampling people is not exactly the same as taking a blood sample. People are not homogeneous. Our samples need to be representative across a range of demographic and other characteristics that influence whatever behaviour we are measuring.
But it can never be flawless. Reporting samples may be skewed towards certain groups which are easier to recruit or who respond more readily. To the extent these biases are known, they can be compensated for by various kinds of sample weighting.
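As a rough illustration of how such weighting works, the sketch below scales each demographic group in a skewed sample up or down to match its known population share. The group names and shares are invented for the example, not drawn from any real survey.

```python
# Cell-weighting sketch: each group's weight is its population share
# divided by its sample share. Shares below are illustrative only.
population_share = {"16-34": 0.30, "35-54": 0.35, "55+": 0.35}

# A sample skewed towards older respondents, who respond more readily.
sample = ["16-34"] * 20 + ["35-54"] * 35 + ["55+"] * 45

sample_share = {g: sample.count(g) / len(sample) for g in population_share}
weights = {g: population_share[g] / sample_share[g] for g in population_share}

# Under-recruited groups get a weight above 1, over-recruited below 1;
# a weighted estimate multiplies each respondent's answers by their
# group's weight before averaging.
```

Real services use more elaborate schemes (for example rim weighting across several targets at once), but the principle of compensating for known recruitment biases is the same.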
Then there is the issue of information ‘gaps’, where some people have not answered every question or reported on all the days they were asked to report on. Here, answers can be ‘ascribed’ or filled in on their behalf using information we have collected both from them and similar respondents. This is common practice in many audience measurement studies.
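One simple form of ascription is sometimes called hot-deck imputation: a missing answer is copied from a ‘donor’ respondent with matching characteristics. The sketch below assumes invented field names and a deliberately minimal matching rule.

```python
# Hot-deck ascription sketch: fill a respondent's missing answer from
# the first complete respondent in the pool who shares the same age
# group and region. All field names are illustrative.
respondents = [
    {"age": "35-54", "region": "north", "watched_news": True},
    {"age": "35-54", "region": "north", "watched_news": None},  # gap
    {"age": "16-34", "region": "south", "watched_news": False},
]

def ascribe(incomplete, pool, field):
    """Copy `field` from a demographically matching donor, if any."""
    for donor in pool:
        if (donor[field] is not None
                and donor["age"] == incomplete["age"]
                and donor["region"] == incomplete["region"]):
            incomplete[field] = donor[field]
            return

ascribe(respondents[1], respondents, "watched_news")
```

Production systems typically pick donors on many more criteria, or probabilistically, but the gap-filling idea is as above.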
Where we want to collect more information than it is reasonable to ask a single group of people to provide, we can use statistical ‘fusion’ to join together different surveys.
Similar participants from each study will first be matched on as many criteria as possible (e.g. on gender, age group, region and any other characteristics considered pivotal to the behaviours being measured). Survey data on each matching respondent is then merged so that we can look at the answers to questions asked on both studies in a single database.
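The matching-and-merging step can be sketched as follows; the two surveys, their fields, and the matching keys are illustrative assumptions, and real fusions match on many more criteria with tie-breaking rules.

```python
# Fusion sketch: match respondents across two surveys on shared
# demographics, then merge each matched pair into a single record.
tv_survey = [
    {"gender": "F", "age": "16-34", "region": "north", "tv_hours": 12},
    {"gender": "M", "age": "55+", "region": "south", "tv_hours": 25},
]
print_survey = [
    {"gender": "F", "age": "16-34", "region": "north", "reads_daily": True},
    {"gender": "M", "age": "55+", "region": "south", "reads_daily": False},
]

MATCH_KEYS = ("gender", "age", "region")

def fuse(survey_a, survey_b):
    """Merge records from both surveys that agree on all MATCH_KEYS."""
    index = {tuple(r[k] for k in MATCH_KEYS): r for r in survey_b}
    fused = []
    for rec in survey_a:
        match = index.get(tuple(rec[k] for k in MATCH_KEYS))
        if match is not None:
            fused.append({**rec, **match})  # one combined record
    return fused

fused = fuse(tv_survey, print_survey)
```

The fused database then lets an analyst cross-tabulate TV viewing against readership even though no single respondent answered both surveys.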
Again, this is common practice. Many readership surveys fuse data from separate online measurement services to report on cross-platform audiences for newspapers and magazines. Out of Home measurement integrates travel survey data with traffic flow information. Increasingly, TV audience measurement combines panel data with internet traffic data to report on total video usage from all sources.
Which brings us to the ‘modelling’ of behaviours. Nielsen, for example, has recently decided to ‘assign’ rather than collect viewer demographics in several US cities. Here, TV set usage is captured automatically by ‘set-meters’ (which detect when the set is switched on and which channel it is tuned to), while a separate sample of people has, until now, filled out a one-week diary to tell us who was watching.
Assigning viewer data to set behaviour data is not a new idea. In an experiment carried out in Boston in 2000, viewing was modelled from household data and compared to results of peoplemeters and diaries running in parallel in preparation for launch of a new peoplemeter service in 2002.
Two sets of household information were looked at: the tuning behaviour captured by the set-meter and the demographic composition of each household in the sample.
The thinking was that, for example, in a household where only one person lived, it was reasonable to assign any viewing to that person. In a household with a teenager whose bedroom set is tuned to MTV, it is equally reasonable to assume that the teenager is the one watching.
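Rules of this kind can be expressed very simply; the sketch below encodes just the two examples above, with an invented household structure, and is in no way a reconstruction of the actual Boston model.

```python
# Rule-based viewer assignment sketch: tuning data plus household
# composition implies who is probably watching. Household fields and
# rules are illustrative assumptions.
def assign_viewer(household, set_location, channel):
    members = household["members"]
    if len(members) == 1:
        return members[0]  # single-person home: assign to that person
    for m in members:
        # A teenager's own bedroom set tuned to a youth channel -> teen.
        if (m["role"] == "teen"
                and m.get("bedroom") == set_location
                and channel == "MTV"):
            return m
    return None  # no rule fires: would need probabilistic modelling

home = {"members": [
    {"name": "parent", "role": "adult"},
    {"name": "alex", "role": "teen", "bedroom": "bedroom2"},
]}
viewer = assign_viewer(home, "bedroom2", "MTV")
```

A real model replaces the final `None` with probabilities estimated from peoplemeter data, rather than leaving viewing unassigned.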
In 75% of quarter hour time periods examined, modelled results were closer to the peoplemeter outcomes than the diary was (though far from identical). The recommendation of the researchers was that diaries should be dropped in favour of larger set-meter samples and modelling.
Not everybody agrees that we can so easily predict who in a household is watching at any given time. But the same could perhaps be said for diaries – especially if the kind of people who agree to keep a viewing diary are not entirely representative of the population and/or do not log all of their viewing, conscientiously, as it happens. The set-meter does not depend on respondents actively logging their viewing, so is arguably a more reliable measure. Increasing the sample of set-meter homes will improve accuracy more cost-effectively than adding diary sweeps.
In the end we must make choices: choices about how to improve imperfect measurement systems. We aim for the truth – but we won’t attain the whole truth. Falling survey response and tough challenges managing respondent compliance mean statistical approaches like sample weighting, ascription, data fusion and modelling will play a growing role in audience measurement.