The challenge today is that clients want us to collect more information, not less.
Where statistical adjustment can help
Marketers want to know everything they can about their target consumer in order to maximise the return on their research investment. But no individual respondent will agree to answer such a large number of questions. And many of the questions will be impossible to answer accurately. There are two closely related statistical techniques used to help address this: data ascription and data fusion.
With data ascription, where answers to a survey are missing or incomplete, we can infer what those answers would have been by looking at the answers given by similar respondents.
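The idea of inferring a missing answer from similar respondents can be sketched in a few lines of Python. This is only an illustrative sketch, not a production method: the function name `ascribe_missing`, the field names and the similarity rule (count of matching characteristics) are all assumptions made for the example, and real ascription models are usually considerably more sophisticated.

```python
def ascribe_missing(respondents, target, hooks):
    """Fill missing `target` answers from the most similar complete respondent.

    Illustrative assumption: similarity is the number of `hooks`
    (shared characteristics) on which two respondents agree.
    """
    donors = [r for r in respondents if r.get(target) is not None]
    if not donors:
        raise ValueError("no respondent has answered the target question")

    completed = []
    for r in respondents:
        row = dict(r)  # copy, so the original records are left untouched
        if row.get(target) is None:
            # Pick the donor agreeing on the most hook variables.
            best = max(donors, key=lambda d: sum(d[h] == r[h] for h in hooks))
            row[target] = best[target]
            row["ascribed"] = True  # flag inferred answers for later analysis
        completed.append(row)
    return completed


# Hypothetical survey records; respondent 3 skipped the last question.
survey = [
    {"id": 1, "age_band": "18-34", "region": "North", "reads_news_online": "yes"},
    {"id": 2, "age_band": "55+", "region": "South", "reads_news_online": "no"},
    {"id": 3, "age_band": "18-34", "region": "North", "reads_news_online": None},
]

result = ascribe_missing(survey, "reads_news_online", ["age_band", "region"])
print(result[2]["reads_news_online"], result[2].get("ascribed"))
```

Here respondent 3 shares both age band and region with respondent 1, so respondent 1's answer is ascribed. Flagging ascribed values keeps them distinguishable from answers that were actually given.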
With data fusion, we design two (or more) questionnaires, each sharing certain core questions in common, but with separate sets of questions on other topics. These questionnaires can either be served simultaneously to separate but similar samples of people or be asked at different times.
The assumption is that we can then ‘match’ people answering the different questionnaires using the known demographic and other characteristics of each sample member, as well as the answers they give to other common questions.
We then take answers to the first set of questions and ascribe them to matching respondents who answered the second set of questions and vice versa. This gives us a larger database of answers than we could have had with a single sample of people.
Fusion is not a single technique – different approaches can be taken depending on the objectives. The principles for any approach are similar, however, and follow these steps:
- Set the objectives
- Analyse the datasets and determine the relationships that need to be preserved
- Choose a fusion method
- Select the variables and metrics to be fused and prepare the input datasets
- Define critical variables if required
- Identify the ‘hooks’ to be used in the fusion – either as matching or modelling variables
- If necessary, attach importance weights to the hooks
- For “row-wise” data fusion methods, choose a distance metric
- Run the matching process or modelling process
- Create a new fused dataset
- Validate the data
- Provide fusion diagnostics
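Several of the steps above (critical variables, weighted hooks, a distance metric, matching, building the fused dataset and basic diagnostics) can be illustrated with a minimal "row-wise" sketch in Python. Everything here is an assumption made for illustration: the function names, the field names, the hook weights, the weighted-mismatch distance and the rule of falling back to all donors when no donor matches on the critical variables. Real fusions typically use richer distance metrics and constraints on how often a donor can be used.

```python
from collections import Counter


def weighted_distance(a, b, hook_weights):
    """Sum of the importance weights of the hooks on which a and b differ."""
    return sum(w for hook, w in hook_weights.items() if a[hook] != b[hook])


def fuse(recipients, donors, hook_weights, fused_vars, critical=()):
    """Attach each recipient's nearest donor's answers for `fused_vars`."""
    fused = []
    for rec in recipients:
        # Critical variables must match exactly. Illustrative assumption:
        # if no donor matches, fall back to the full donor pool.
        pool = [d for d in donors if all(d[c] == rec[c] for c in critical)] or donors
        donor = min(pool, key=lambda d: weighted_distance(rec, d, hook_weights))
        row = dict(rec)
        for var in fused_vars:
            row[var] = donor[var]
        # Keep diagnostics alongside the fused answers.
        row["donor_id"] = donor["id"]
        row["match_distance"] = weighted_distance(rec, donor, hook_weights)
        fused.append(row)
    return fused


# Hypothetical samples: A answered media questions, B answered purchase questions.
sample_a = [
    {"id": "A1", "sex": "F", "age_band": "18-34", "region": "North", "tv_hours": 2.5},
    {"id": "A2", "sex": "M", "age_band": "55+", "region": "South", "tv_hours": 4.0},
]
sample_b = [
    {"id": "B1", "sex": "F", "age_band": "18-34", "region": "South", "buys_online": "weekly"},
    {"id": "B2", "sex": "M", "age_band": "55+", "region": "South", "buys_online": "rarely"},
    {"id": "B3", "sex": "F", "age_band": "35-54", "region": "North", "buys_online": "monthly"},
]

hook_weights = {"age_band": 2.0, "region": 1.0}  # age judged more important than region
fused = fuse(sample_a, sample_b, hook_weights, ["buys_online"], critical=("sex",))

# Simple diagnostics: average match quality and how often each donor was used.
mean_distance = sum(row["match_distance"] for row in fused) / len(fused)
donor_usage = Counter(row["donor_id"] for row in fused)

for row in fused:
    print(row["id"], row["buys_online"], row["donor_id"], row["match_distance"])
print("mean distance:", mean_distance, "max donor usage:", max(donor_usage.values()))
```

In this toy example A1 is matched to B1 rather than B3, because a mismatch on region (weight 1.0) costs less than a mismatch on age band (weight 2.0). A high mean distance or heavy reuse of a few donors would be a signal to revisit the hooks or their weights before validating the fused data.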
There are at least three reasons why we might want to use it to enhance our survey data:
- To increase the scope of our survey coverage.
- To increase the granularity of our reporting.
- To improve the speed of reporting.
- The digital revolution means advertisers and media companies need more information than ever before on media usage.
- Yet people are less willing than they have been in the past to participate in surveys and, even when they can be persuaded to do so, want to be engaged rather than bored.
- This conundrum of needing more information while finding it harder to collect from surveys alone is likely to get harder rather than easier as time passes.
In the audience measurement domain...
- Techniques like data ascription, data fusion and audience modelling are allowing us to collect and report more and better data, enabling us to keep pace with increasingly complex client needs.
- The practical application of data science demands a high level of skill and expertise, as well as experience – many of the decisions and choices made in building fusions and ascriptions, for example, are not black and white, demanding judgement and a deep knowledge of the context.
We believe data science will play a growing role in the processing of audience measurement data.