Copy Testing U.S.-Style
'England and America are two countries separated by a common language,' said George Bernard Shaw (1856-1950).
As an observer of global copy testing, I have heard and read many articles from the UK that reference, sometimes in a disparaging fashion, the way copy testing is practised in the US. People in the US are equally perplexed by the 'British' philosophy. It is apparent that Shaw's sentiment can easily be applied to copy testing (or maybe we should say pre-testing?). Practitioners in the UK look at the US and wonder why, and vice versa.
In the interest of an open understanding and sharing of ideas, this article explores some of these myths, and reviews the reasons behind US practices. (Note: this is written with the obvious bias of someone who has been involved in US copy testing for nearly 20 years.) First, an exploration of the myths about US practice.
Belief number 1: Copy testing is prevalent in US advertising. Both a myth and reality. In reality, less than half the top 100 advertisers in the US regularly use copy testing. Testing is more prevalent among the largest advertisers.
There is also a skew by industry, with testing almost universal within fmcg and pharmaceutical industries, and rare among retailers, entertainment, technology and services (Figure 1).
Why the difference? Marketers at fmcg and pharmaceutical companies tend to have classic marketing training with an emphasis on analysis, risk reduction and optimising opportunities. This is conducive to using copy testing as an aid to judgment. Other industries tend to have less classically trained marketers who rely less on research.
Belief number 2: Copy testing is used as a substitute for judgment, not an aid to judgment. Patently a myth, but based on a version of reality. Virtually every company gives brand managers the right to air copy they deem to be effective. But in companies with a culture of quantitative testing, test results are a powerful aid to judgment. This is particularly true where companies have validation experience and an ingrained belief that the copy-test measures they use are linked to the marketplace. So going against the results of a test requires explanation.
Belief number 3: US copy testing is obsessed with persuasion. And a corollary: persuasion testing is synonymous with the brand-switch measure. A second corollary: recall is dead as a measure in US copy testing. These are all variations on the one-size-fits-all theory of advertising and are most assuredly myths.
Very few companies use persuasion as a sole measure of advertising effectiveness. The norm is to use multiple measures, to understand overall how the ad affects consumers. These measures generally include a measure of persuasion as well as measures of recall or intrusiveness, communication, imagery and likeability. The key to the successful use of a copy-test system has moved beyond deciding which single measure to believe in, to how you build a composite picture of advertising success.
As for persuasion itself, one frequently heard myth is that the only persuasion measurement used in the US is the classic brand-switch measure. In fact, many people hear 'persuasion testing' and think of brand switching. While this may have been true ten or so years ago, it is no longer the case. For example, our approach at Ipsos-ASI is to offer a variety of measures, so that we can best match the measurement to the market situation, not the other way around. We do offer the classic brand-switch measure, because, frankly, it works - when applied in appropriate situations. But it is used in less than half our tests. Other measures, including purchase intent and frequency, are used more often. Even clients who use the brand-switch measure do not use it exclusively. They use it when appropriate and use others when they are appropriate. Persuasion is not a measure, it is a concept: the concept of increasing the likelihood of purchase among those exposed vs those not exposed. How you measure that can, and should, vary.
To paraphrase Mark Twain, reports of the death of related recall are greatly exaggerated. In fact, related recall and similar measures are probably the most widely used measures in the industry. The reason is that as more and more work is done to understand how copy-test measures relate to the marketplace, it has become obvious that recall matters. This does not mean that recall is predictive of sales effects. It most assuredly is not. By the same token, neither is any other measure. Remember, the `one measure fits all' theory is a myth. But we have found that recall used in conjunction with other measures is predictive.
In fact, a primary measure used in our Next*TV copy-test system is the Copy Effect Index. Copy effect is a composite measure, combining recall and (appropriate) persuasion measures into a single score. The beauty of this is twofold. First, it does link to the marketplace, as we find a good fit between the Copy Effect Index and advertising volume identified through market-mix models. This is discussed later, but the important thing is that the fit is always better when we look at recall in combination with persuasion than for either measure alone.
Secondly, copy effect acknowledges the trade-off between these two ideas. Ads that may not have a particularly strong or persuasive message can offset this with strong recall, and vice versa.
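The actual Copy Effect Index formula is proprietary and not described here, but the trade-off logic can be sketched: index each measure against a category norm, then combine them multiplicatively so that strength on one dimension can offset weakness on the other. The function name, the norm values and the multiplicative form below are all illustrative assumptions, not the real calculation.

```python
def composite_copy_index(recall, persuasion, recall_norm, persuasion_norm):
    """Hypothetical composite score in the spirit of a copy effect index.

    Each raw measure is indexed against its category norm, then the two
    indices are multiplied so a strong showing on recall can compensate
    for a weaker persuasion result, and vice versa. Illustrative only:
    the real Copy Effect Index formula is proprietary.
    """
    recall_index = recall / recall_norm
    persuasion_index = persuasion / persuasion_norm
    return round(100 * recall_index * persuasion_index)

# An ad with below-norm persuasion but well-above-norm recall still
# scores above the norm of 100 (all numbers are invented for the sketch):
score = composite_copy_index(recall=0.40, persuasion=0.09,
                             recall_norm=0.25, persuasion_norm=0.10)
print(score)  # 144
```

The multiplicative combination is one simple way to express the offsetting behaviour the article describes; an additive or weighted scheme would behave differently at the extremes.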
Other suppliers report similar findings. So while I would not call it a consensus, there is now a general belief that some type of effective reach measure, such as related recall, used in conjunction with other measures, is a key to understanding in-market advertising effectiveness.
Why do we act this way? You will note that thus far priority has been given to the use of measures that evaluate the success of the ad rather than exploring why it is successful. This is a fair observation and a fairly precise summary of the US market.
Copy testing can serve two roles.
- To minimise risk by ensuring the quality of copy on-air.
- To maximise opportunities by understanding, from existing creative, how to produce better creative in future.
The order in which a client prioritises these objectives sets the tone for if and how copy testing is used.
For the most part, the culture in the US, particularly among larger advertisers, is to minimise risk. Where profits are reported quarterly, budgets are tight and advertising is a huge expenditure, the financial microscope is turned on advertising. And the numbers are substantial. According to Advertising Age, the ten largest US advertisers spent nearly $21 billion on advertising alone in 2001. General Motors spent $3.4 billion and Procter & Gamble, the second largest advertiser, $2.5 billion. We are talking real money, money that the companies expect will provide a return on investment as strong as any other capital expenditure. There is a need to know, and know early, if that expenditure is going to be productive. This is the key driver of the evaluation-oriented focus of US copy-test systems. Moreover, it is becoming clear to advertisers that quality of copy provides the greatest leverage, increasing the need to ensure quality copy. Results have convinced advertisers that no amount of money can offset weak copy and that copy varies tremendously in its ability to drive sales. Data from market-mix modelling, for example, show that the sales impact of different ads for a brand can vary by as much as 4:1, from the strongest to the weakest. Differences of this magnitude are by no means unusual. Clearly, an advertiser would be much better off only airing the stronger ads.
There are clearly ads that do not work (and others that work exceptionally well), but how does the advertiser know which is which? How do you identify effective copy? Three approaches are available.
Judgment - used in almost all circumstances, but, as the data above indicate, fairly limited in accuracy. It is very difficult to predict how consumers will react to any situation, since the variables that influence their reactions are so complicated and interrelated.
Qualitative or communication and reaction testing - these help you understand certain aspects of consumer reaction, but do not tap into the key reactions.
Quantitative copy testing - a path that more and more advertisers are taking, but not without proof of the validity and predictiveness of the testing system.
The emphasis on validation is linked to that on risk reduction. If copy testing is used to reduce risk, the advertiser needs to know that the system can identify effective and ineffective ads. In fact, evidence of validation has been a cost of entry into the industry for a while, but the emphasis is growing. Several of our larger clients now have their own ongoing validation programmes to continue fine-tuning their learning.
The reason for increased emphasis on validation is that it has become easier over time. For years, efforts to link test results to marketplace results were dismissed because it was 'too difficult to identify the effects' or 'too many variables beyond advertising impact sales'. Before the advent of improved marketing databases, this was true and it was easy to confuse effects such as pricing with advertising effects.
But these excuses no longer hold. Using market-mix modelling, we can identify the sales impact of advertising and understand how copy-test measures fit with sales data. This has provided important new learning.
For example, modelling and validation work resulted in the development of the Copy Effect Index. I mentioned earlier the relationship between the Copy Effect Index and sales effects for one brand. The relationship between copy effect and volumetric impact of commercials is a strong one, as Figure 2, which compares Copy Effect Indices with a standardised sales effect, demonstrates.
It is not perfect; never believe perfect validation data. But it is better than guessing or not testing at all.
But the emphasis on minimising risk is not the end of advertisers' needs, it is only the beginning. They want to know why, how and how to make it better. Once the advertisers have selected a system, they turn their attention to this issue. And most of our time with clients is spent answering these questions, not simply giving ads a thumbs up or down.

One last comment on the use of copy testing - it is used to eliminate ideas. At the 2002 WARC Pre-testing Conference, Michel Joannic of Kraft Jacobs Suchard presented a paper describing how the company used pre-testing and the decisions it had made, including deciding not to proceed with certain pieces of copy. The fact that copy-testing data could be used to eliminate ideas shocked some people. But not every piece of copy has the potential to be strong. Some are strong when hatched, some need care and nurturing, and some simply need to be discarded so that efforts can be placed against more productive ideas. Effective copy testing should help you tell the difference between these three.
An eye to the future
While the US approach makes sense in light of marketplace needs, it does not mean that everything is fine. There are a number of areas for improvement.
One of the most promising is developing an approach that can help understand how specific pieces of copy link to brand equity. The measures commonly in use today capture short-term sales effects. At the same time, much modelling work shows that the major contribution of advertising is to long-term effects or brand equity.
How can we understand the effect copy has on equity? First, we need a coherent, validated model of how equity works. Our Equity*Builder provides this model. It looks beyond attitudinal equity to the broad concept of brand health. Brand health has three components: brand equity (attitudes/beliefs in the brand), brand involvement and price.
The equity component has five key questions. We have incorporated these into our copy testing for the last year to 'hook' copy testing to Equity*Builder. This allows a reading of the consumer response to each individual component of equity as well as the calculation of an equity index for a specific ad. We are starting to find some interesting relationships. The good news is that it appears that we will be able to measure the ability of an ad to sustain or contribute to equity. We find that advertising can increase consumers' perception of the difference/relevance/popularity, and so on, of the brand, and that this effect is variable. Some ads show no impact, others show quite strong impact.
We are also finding an unexpected relationship. Ads with higher equity tend to get better copy-test scores, as both recall (and its components, attention and linkage) and persuasion scores are stronger.
Importantly, though, ads with strong copy effect (sales effect) do not necessarily show stronger results on the key equity questions.
So there may be three types of ads: those that show little short-term success, those with good short-term sales success but little potential to contribute to equity and those with good potential to drive short-term sales and long-term equity.
This is very preliminary. Our next step is to relate changes in the copy test to changes we read in equity over time. This takes time, but we believe it has great potential to expand the considerations of advertisers when selecting advertising.
Concluding thoughts
This article aims to give some insight into how copy testing is used in the US today. I make no claim that this is optimal, and I have identified areas where it can be improved. But it is serving its purpose well. It is helping advertisers manage the risks involved in advertising and spend their money wisely. In doing so, it is promoting the cause of advertising and helping advance the learning necessary to keep advertising an important part of the marketing process. An important piece of evidence of the value that advertisers see in testing is that during the recent economic downturn, market research spending has remained flat to slightly down, yet spending on copy testing is growing. This is the most obvious sign of copy testing's value to advertisers.
Some of the myths and beliefs about copy testing today have merit, but many are just myths. And these get in the way of the productive use of copy-test data, so exposing them is warranted.
This article has been reprinted with the permission of Admap; for more details go to www.warc.com/admap.