Implicit Association Testing in Advertising Research: A Critique of Venkatraman et al. 2015

By Aaron Reid

September 4, 2015

On the additive nature of neurophysiological measures in advertising research.

A few weeks ago, I was telling a colleague about the findings from a recent study on “neuromarketing” techniques applied to advertising testing published in the Journal of Marketing Research (paraphrased below).
I told him that they tested five techniques (including implicit association techniques) and concluded that only fMRI added predictive value above and beyond traditional research methods in predicting ad success. Given what he has seen Sentient produce on the additive predictive accuracy of implicit measures over the past decade, he scoffed: “How did they test the technique?”
“Get this,” I said. “They took a ‘salient’ image from each of the 30-second spots and used it as a representation of the ad. Then they captured the implicit valence associated with that specific image and used it as a measure of the ‘desirability’ of the ad itself.”
“That’s a lot of weight placed on one image from an ad,” he said. “And beyond that, it’s not even a brand impression impact variable! And they expected that to be predictive of the success of the ad?”
“Yes. Can you believe it? And this is a peer-reviewed article!”
“Makes you wonder who the ‘peers’ are reviewing the article.”
“Exactly. I think this is a case where the scientific-practitioners, who have been applying these techniques for years, know more about the appropriate application of behavioral science techniques to business than the pure-play academics.”
“You know,” he said, “when you think about it, it’s akin to testing a set of explicit questions that you’ve created, finding that they are not predictive of some behavior of interest, and concluding that the ‘questionnaire’ method does not have additive predictive value beyond other measures.”
“Exactly,” I laughed. “Wouldn’t you wonder if you were asking the right questions first, before you concluded that the entire approach had no added value? But it does speak volumes to where we are as an industry in applying these new methods, as well as where Academia is in identifying who the appropriate peers are to be recruited for designing and reviewing these studies.”
“I wonder how long it will take for that peer review paradigm to shift,” he reflected.
“I’m not sure,” I said. “But I do know that continuing to publish applied validation studies, and focusing on scientific integrity in our methods rather than trying to make a quick buck on ‘shiny-new-object’ trends with pseudo-scientific techniques, is surely the path to long-lasting impact on our industry and the advancement of applied behavioral science.”
“To be fair to the authors,” I continued, “this is a really important study. It represents the first real foray in designing a study that evaluates multiple non-conscious methods in their ability to predict real-world behavioral metrics. For that, it should be lauded. And naturally, we should expect to find some failings in the design and an over-eagerness in the conclusions drawn. I’ve done that myself on many occasions.”
To advance the industry, we need to recognize the vision of researchers like Venkatraman et al. while simultaneously challenging it. We attempt to do both in our critique below.

An Analysis of “Predicting Advertising Success Beyond Traditional Measures…”

The 2015 study “Predicting Advertising Success Beyond Traditional Measures: New Insights from Neurophysiological Methods and Market Response Modeling,” from researchers Vinod Venkatraman, Angelika Dimoka, Paul A. Pavlou, Khoi Vo, William Hampton, Bryan Bollinger, Hal E. Hershfield, Masakazu Ishihara, and Russell S. Winer, represents an important contribution to the growing literature on advanced market research techniques in advertising testing. The authors sought to answer a critical question facing the market research industry:

“A common question that practitioners ask is whether neurophysiological measures are ‘valuable’, that is, do they contribute anything beyond traditional measures in predicting ad success?” – Venkatraman et al. (2015)

However, we believe that while the effort and critical thinking at the essence of the paper is essential, the conclusions of the paper understate the additive predictive accuracy advantage of several applied consumer neuroscience techniques.

Venkatraman et al.’s Implicit Association Testing Study Design

The authors constructed an experiment designed to evaluate the ability of five consumer neuroscience measurement techniques to add predictive accuracy above and beyond traditional measures.
The study evaluated 37 thirty-second ads for 15 unique brands from six different companies, and then used the individual level data to predict actual in-market ad elasticities.
The five consumer neuroscience measurement techniques evaluated included:

EEG (electroencephalography)
fMRI (functional magnetic resonance imaging)
Biometrics (including skin conductance, respiration, and heart rate)
Eye-tracking
Implicit Association Testing

The conclusion: of the five consumer neuroscience measures tested, only fMRI added an advantage over traditional measures in predicting real world ad elasticities.
Our assessment: the conclusions of the paper are overstated and the design and analyses do not adequately assess the potential of several consumer neuroscience techniques.
Before detailing the nature of the fallacies in the design and analysis of this study, let’s first talk about the many merits of this paper.

What We Loved About the Venkatraman et al. Study

First, the pursuit of the objective of this research is exactly what we need as an industry. We must continue to invest in building case studies of the predictive validity of consumer neuroscience methods. Within those validation studies, we need to use actual in-market behavior as our dependent variables, and the authors of this study have gone to great lengths to obtain real world ad-performance metrics to serve as dependent variables for their analyses.
Second, the introduction reads as a bona fide historical how-to of ad testing. This alone is worth the paper it’s published on for any young researcher seeking to understand the paradigms of traditional and advanced neurophysiological copy testing. Venkatraman and colleagues describe the theoretical AIDA model (Attention, Interest, Desire and Action) and transition into today’s Attention, Affect, Memory and Desirability approach.
The detail provided describes both the traditional self-report measures of these ad performance constructs (relevancy, liking, recall, and purchase intent, etc.) as well as corollary neurophysiological measures (eye-tracking, physiological arousal and valence, fMRI hippocampus activity, and ventral striatum activation).¹
Third, the study reveals that greater activation of the ventral striatum, averaged across an ad, is predictive of in-market ad elasticities above and beyond traditional explicit measures. This finding replicates and extends previous work showing activation of the ventral striatum as a predictively valid measure of desirability (see Knutson et al., 2007), and is an important contribution of the paper to the literature.
Fourth, it is no small challenge to conceive a study with an experimental design that is able to create fair comparisons across multiple data collection methodologies. The design of this experiment is a significant first attempt to establish a fair comparison across many neurophysiological measures. As a first-of-its-kind study (comparing EEG, fMRI, biometrics, eye-tracking and implicit association measures), it is truly ground-breaking.

Biases and Potential Flaws We Identified

However, as with any significant effort to break new ground in experimental design, it is very difficult to design without your own biases influencing the direction of your path.
In this case, the experimental design and analyses are biased toward accommodating the data collection requirements of fMRI (is it a surprise, then, that we see the results favoring fMRI?).
For all the ingenuity in the design of this study, the researchers fail to use implicit association testing, EEG and biometrics according to their best practices in applied consumer neuroscience. These failures call into question the authors’ conclusion that “our findings have important implications for practice in that they help to elucidate which particular methods and which exact measures better predict real-world advertising success.”
Two primary issues with the design and analysis of this research are addressed below:

Critique of Implicit Association Testing Study Design and Analysis

The design of the implicit association test (IAT) does not adequately assess the primary value of implicit association testing in copy research. Namely, the research design fails to assess the impact of ad exposure on the implicit associations with the brand or product being advertised.
The analyses of the biometrics and EEG fail to capitalize on the primary merits of these techniques in copy testing. Namely, the analyses presented do not assess the insight and predictive accuracy advantage of capturing moment-by-moment neurophysiological responses to ads.

Let’s dive into the details.

Great finding, but let’s not call it a “buy button,” okay?

The fMRI results demonstrate the power of measuring the ventral striatum during the processing of an ad. Activation of this part of the brain is indicative of processing stimuli that are “wanted” versus stimuli that are merely “liked.”
Thus, the measure is a good representation of the desirability construct of the current ad-effectiveness model. Perhaps the most significant (and valid) finding of the study is that greater activation in the ventral striatum (VS) is significantly related to ad elasticities above and beyond traditional measures.
In fact, it is quite possible that this relationship may be even stronger than what was revealed in the study. The authors contend that “the extent of reward-related activation in the brain during the actual ad provides a better and more direct measure of desirability.”
However, for that statement to be true, the reward activation observed in the brain would need to be associated with the product of interest in the consumer’s mind. Reward activation in the brain during an ad may or may not be associated with the product or brand of interest. It is quite possible that elements of an ad produce reward processing that does not transfer to the brand.
If that is the case, it suggests that the additional predictive utility of VS activation reported in this paper is likely understated relative to the potential of VS activation associated directly with the product or brand of interest.

Evaluation of Ads vs Evaluation of Outcomes of Ads

This limitation of the design and analysis plan for assessing desirability is consistent across measurement tools in the study. Essentially, what the research design measures is an evaluation of the ad itself, not an evaluation of the outcome of the ad.
That is, this study lacks a measure of the impact of ad exposure on desirability toward the product or brand. This leaves us to conjecture that the responses observed during the processing of the ad translate into an impact of that ad on the desirability of the product.
What is really needed in the application of consumer neuroscience to ad testing is a measure of impact. Fortunately, that’s the beauty of properly designed implicit association testing for copy research: it is an implicit impact variable.
Unfortunately, the design of this study does not properly use IAT in this manner, thereby woefully misrepresenting the additive predictive utility of implicit association testing in copy research.

The creative misuse of implicit association measures in copy testing

The implicit association portion of the test was completed among 186 participants (note that this does not meet the minimum recommended sample size from Sentient Prime implicit research technology studies of n=200).
The implicit association method utilized was a modified version of the original IAT (Greenwald, McGhee and Schwartz, 1998). Participants sorted images and words into categories (one measure: positive/negative, second measure: indoor/outdoor, and all four unique category combinations) and response times were recorded.
The words represented positive and negative valence (e.g., love, death). The images were “salient,” unbranded screenshots from the target ads, and foil images drawn from competitive ads (not previously viewed). The process proceeded according to standard original IAT protocol.
The researchers calculated two implicit measures from this exercise:

Implicit Valence: the differential response time for target ad images when paired with positive versus negative words.
Implicit Memory: the difference in response latencies to target ad images versus foil ad images (“previously seen images are likely to be retrieved faster”).
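
To make the two latency-based measures above concrete, here is a minimal Python sketch. It assumes each measure is a simple difference in mean response time (in milliseconds); the exact scoring procedure used in the paper is not described in this post, and the function names and sample latencies are hypothetical.

```python
from statistics import mean

def implicit_valence(rt_paired_with_positive, rt_paired_with_negative):
    """Mean latency with negative pairing minus mean latency with positive pairing.

    A positive value means target images were sorted faster when paired
    with positive words (assumed scoring, not the paper's exact formula).
    """
    return mean(rt_paired_with_negative) - mean(rt_paired_with_positive)

def implicit_memory(rt_target_images, rt_foil_images):
    """Mean latency for foil images minus mean latency for target images.

    A positive value means previously seen (target) images were
    retrieved faster than unseen foils.
    """
    return mean(rt_foil_images) - mean(rt_target_images)

# Hypothetical latencies (ms) for one participant and one ad image.
valence = implicit_valence([600, 650, 700], [700, 750, 800])
memory = implicit_memory([620, 640, 660], [700, 720, 740])
print(valence, memory)
```

Note that both scores rest entirely on the single image chosen to stand in for the ad, which is the weakness discussed next.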

There are numerous problems with these particular applications of IAT for ad testing.
First, each of these measures is dependent on images chosen to represent the ad. For the memory measures, the intent is simply to have an implicit indicator of ease of recognition, and the application of IAT here is quite creative.
However, note that the data will only be as good as the chosen images are at equally representing each ad in the memory of each individual (the authors selected an average “salient” image from each ad and applied it at the individual level). There is a tremendous amount of noise inherent in a measure of that type, and it is remarkable that it is related to ad elasticity at all.
The implicit valence measures are even more problematic. Again, the IAT is implemented using “salient” non-branded images from the ad as measures of emotional association with the ad itself. Best practices in the application of implicit association measures to ad testing call for the evaluation of the impact on associations with a target product or brand following exposure to the ad.
Using a non-branded moment from the ad to assess the positive to negative association with the ad, and then using that data to predict an in-market behavioral outcome, necessarily handicaps the potential contribution of IAT in forecasting ad success.
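
By contrast, an impact-oriented design measures how brand associations differ after ad exposure. The Python sketch below is one possible way to express that, assuming an exposed group and an unexposed control group, each with per-participant implicit brand association scores. The group design, the score scale, the function name and the data are all hypothetical illustrations, not a description of any vendor's actual method.

```python
from statistics import mean

def implicit_lift(exposed_scores, control_scores):
    """Difference in mean implicit brand association between groups.

    exposed_scores: scores from participants who saw the ad.
    control_scores: scores from participants who did not.
    A positive value suggests the ad strengthened positive
    associations with the brand (assumed design, for illustration).
    """
    return mean(exposed_scores) - mean(control_scores)

# Hypothetical per-participant brand association scores (ms advantage
# for brand + positive pairings over brand + negative pairings).
exposed = [120, 80, 100]
control = [60, 40, 50]
print(implicit_lift(exposed, control))
```

The key difference from the valence measure in the study is the object being scored: the brand itself, not a screenshot of the ad.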

An appropriate design, but fallacious analysis of EEG and Biometrics

Interestingly, for the biometrics and the EEG measures, the pitfalls appeared in the analysis of the results. The researchers do offer that these consumer neuroscience measures were not evaluated for their full potential. This suggests that this study may offer additional promise for future analyses of the data if the authors are willing to share.
The EEG, biometrics and fMRI analyses in this study focused on aggregated responses across the entire ad (note again that the implicit association measures in comparison were on a single salient screenshot, biasing results against the IAT). Reasonably, it is suggested in the paper that EEG and biometrics could potentially be better at identifying particularly impactful moments of the ad (e.g., branding moments or final seconds) and that these moments might be related to ad success. In fact, the authors suggest that the EEG and biometric methods might be superior to fMRI in “identifying interesting temporal components within each ad.”
There is significant untapped potential in this dataset. Analyses of the EEG and biometrics that address moments of greatest approach versus avoidance emotional response have the potential to provide diagnostic insight on components of the ads that have the strongest relationships with ad elasticities. Furthermore, sharpening the analyses beyond the blunt mean response to the entire ad could very well tip the balance on additive predictive value in the direction of the EEG and biometrics.
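
A small Python sketch illustrates why a whole-ad mean can hide exactly the information that moment-by-moment measures are good at capturing. The two per-second response traces are invented for illustration, and the choice of the final two seconds as the window of interest is an assumption, not something taken from the paper.

```python
from statistics import mean

def whole_ad_mean(trace):
    """Average response across the entire ad."""
    return mean(trace)

def window_mean(trace, start, end):
    """Average response over seconds [start, end)."""
    return mean(trace[start:end])

def peak_second(trace):
    """Index of the second with the strongest response."""
    return max(range(len(trace)), key=lambda i: trace[i])

# Hypothetical per-second approach scores for two six-second ads.
ad_a = [4, 4, 2, 2, 0, 0]  # strong open, weak close
ad_b = [0, 0, 2, 2, 4, 4]  # weak open, strong close

# Identical when averaged across the whole ad...
print(whole_ad_mean(ad_a), whole_ad_mean(ad_b))
# ...but very different in the final (e.g. branding) seconds.
print(window_mean(ad_a, 4, 6), window_mean(ad_b, 4, 6))
```

An analysis limited to the whole-ad mean would treat these two ads as equivalent, even though their closing moments differ sharply.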

Our Conclusions

The depth of our experience testing the implicit impact of ad exposure at Sentient leads us to conclude that if the Venkatraman et al. (2015) experiment had been designed to test the implicit lift in brand or product affinity (as opposed to using a non-branded moment of the ad as a measure of ad evaluation), it is likely that the implicit association measures would have demonstrated significant lift in predictive accuracy above and beyond traditional measures.
Future research on the application of implicit association measures to copy testing would profit from this more appropriate approach. The current conclusions on the relative efficacy of the implicit association testing in predicting ad success in this paper are therefore not warranted.
Regarding the EEG and biometrics, the authors appropriately suggest that future research should focus on investigating the predictive advantages of each method’s specific strengths.
For EEG that includes temporal resolution. For implicit associations that must include change in associations with the brand or product advertised as an outcome measure.

The Need for Expert Design and Humble Conclusions

Given the above, it is fair to say that fMRI adds predictive value above and beyond traditional methods. Those are exciting results for the practice of applied consumer neuroscience.
However, the conclusion that the authors and the early reviewers in the market research industry want to run with, that only fMRI provides additive value, is spurious.
This presents a difficult duality for practitioners seeking to apply consumer neuroscience techniques.
On one hand, we want to place our trust in academic studies, which should hold scientific integrity as their benchmark, rather than rely on the unpublished IP of savvy research vendors.
However, if the researchers in Academia do not have enough experience in the practical application of these advanced methods, can we really expect that their studies will utilize all of the best practices in the industry?
The answer may lie in finding research partners that bridge the gap between behavioral science and business and are focused on in-market validation of their methods and metrics. This may also be an appropriate manner for identifying the appropriate peers for collaboration and review of the applied research.
As Sentient Decision Science has shown through thousands of studies and hundreds of millions of subconscious association measurements, the addition of true implicit association measures to traditional conscious measures of brand preference significantly improves the accuracy of predicting real consumer behavior.
You can learn more about how to properly apply consumer neuroscience techniques in ad testing using our Subtext™ approach, or contact us at info@sentientdecisionscience.com.