Don’t miss Ipsos experts who’ll be speaking at this year’s Federal Committee on Statistical Methodology annual conference.
Hitting the Target? The Use of Targeted Samples in Probability-based Samples.
Speakers: Randall K. Thomas, Ipsos Public Affairs; Frances M. Barlas, Ipsos Public Affairs; Megan A. Hendrich, Ipsos Public Affairs; Kathleen Santos, Ipsos Public Affairs
Address-based sampling (ABS) studies often attempt to recruit people with specific characteristics (e.g., black people, 18 to 24 year olds) at higher rates to improve the precision of group estimates. Households in the sample frame can have commercially sourced information appended that is useful for targeted sample selection. Targeted samples can improve study efficiency and lower study costs. Though many researchers treat individuals with the desired characteristics from a targeted sample as equivalent to those from a non-targeted sample, individuals from targeted samples might differ. In the study we report, we were interested in obtaining higher rates of cigarette smokers. We first selected a non-targeted general population sample before selecting a targeted smoking sample. We compared the smokers obtained from each sample type and found that targeting was indeed associated with a higher prevalence of smokers, but we also found differences in smoker characteristics: smokers from targeted households were older, smoked more, started smoking younger, and were less likely to use other tobacco products. We examined how weighting could offset the effect of targeting, but such adjustments can come at a cost in the unequal weighting effect. Our results show that caution is recommended when using targeted samples.
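For readers unfamiliar with the unequal weighting effect mentioned above, the following is a minimal sketch, assuming the commonly used Kish approximation (n times the sum of squared weights, divided by the squared sum of weights). The abstract does not state which formula the authors used, and the function names and toy weights here are illustrative only.

```python
def unequal_weighting_effect(weights):
    """Kish approximation: n * sum(w^2) / (sum(w))^2. Equals 1.0 when all weights are equal."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def effective_sample_size(weights):
    """Nominal sample size deflated by the unequal weighting effect."""
    return len(weights) / unequal_weighting_effect(weights)


# Toy weights (made up for illustration, not from the study).
equal_weights = [1.0] * 100
uneven_weights = [0.5] * 50 + [1.5] * 50

uwe_equal = unequal_weighting_effect(equal_weights)
uwe_uneven = unequal_weighting_effect(uneven_weights)
n_eff_uneven = effective_sample_size(uneven_weights)
```

Under this approximation, the more the weights must vary to bring a targeted sample back in line with the population, the smaller the effective sample size becomes, which is the cost the abstract refers to.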
Scrubbed Clean: Does Data Cleaning Improve the Quality of Analytic Models?
Speakers: Megan A. Hendrich, Ipsos Public Affairs; Randall K. Thomas, Ipsos Public Affairs; Frances M. Barlas, Ipsos Public Affairs
Many researchers believe that data cleaning leads to better data quality and often clean out participants exhibiting sub-optimal behaviors (e.g., speeding, response non-differentiation, or nonresponse), sometimes using aggressive cleaning criteria that remove up to 20% of their sample. Surprisingly, most research has failed to find that data cleaning reduces bias for point estimates (e.g., proportions, means). In this study, we were interested in assessing whether data cleaning affects covariance, specifically in multiple regression models. In an online study with over 9,000 completes from three different sample sources (a probability-based sample and two opt-in samples), we examined regression coefficients for two different multiple regression models (political attitudes predicting party identification, and demographics and life experiences predicting life satisfaction). We deleted cases in gradations from 2.5% up to 50% of the sample based on speed of completion, weighted each dataset, and then ran the regression analyses. We found that small to moderate amounts of data cleaning did not substantially affect the direction or degree of coefficients; however, some coefficients became more unstable at 30% deletion and higher. We urge caution with any data cleaning protocol that might eliminate a higher proportion of participants, since this may actually increase bias in covariance.
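The graded deletion procedure described above can be sketched roughly as follows. This is a simplified illustration, not the authors' analysis: it uses a single predictor instead of multiple regression, omits the weighting step, and runs on made-up data. The field names (`seconds`, `x`, `y`), the helper functions, and the intermediate deletion level of 10% are all assumptions.

```python
def drop_fastest(cases, fraction):
    """Remove the fastest `fraction` of cases, ranked by completion time in seconds."""
    n_drop = round(len(cases) * fraction)
    ordered = sorted(cases, key=lambda c: c["seconds"])
    return ordered[n_drop:]


def ols_slope(cases, x, y):
    """Slope of a one-predictor least-squares fit of y on x."""
    xs = [c[x] for c in cases]
    ys = [c[y] for c in cases]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((a - mean_x) ** 2 for a in xs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    return sxy / sxx


# Made-up cases: completion time plus one predictor and one outcome.
cases = [{"seconds": 60 + 10 * i, "x": float(i), "y": 2.0 * i + 1.0} for i in range(40)]

# Refit the model at each deletion level and keep the coefficient for comparison.
slopes_by_level = {}
for level in (0.025, 0.10, 0.30, 0.50):
    kept = drop_fastest(cases, level)
    slopes_by_level[level] = ols_slope(kept, "x", "y")
```

Comparing the stored coefficients across deletion levels is the kind of stability check the abstract describes; with real data the comparison would be made on weighted multiple regression coefficients.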
Effects of and Reactions to Response Formats for Race and Ethnicity Measurement in the U.S.
Speakers: Frances M. Barlas, Ipsos Public Affairs; Kip Davis, Ipsos Public Affairs; Randall Thomas, Ipsos Public Affairs; Megan Hendrich, Ipsos Public Affairs
Much has been made of the large change in the racial composition of the United States indicated by the latest Decennial Census figures, with many more Americans identifying as more than one race than 10 years ago. Typically, a question about Latino origin is asked prior to a question about racial identity. An often-discussed modification is to merge the two questions (race and Latino ethnicity) into a single question. Another proposed modification is to add a Middle Eastern/North African (MENA) category. We conducted a study testing 6 versions of race and ethnicity questions, randomly assigning one per respondent, to test combining race and Latino ethnicity as well as the provision of a MENA option. Over 7,000 respondents from Ipsos’ KnowledgePanel, a probability-based online panel, completed this web-based experiment. We report on the similarities and differences we obtained between formats. We further asked respondents about their experiences answering questions about race and ethnicity generally and about the question format that they were asked in this study. Latino respondents across all question formats had the most difficulty in answering the question, particularly with the more traditional two-question format. High rates of respondents of color indicated that they had a difficult time answering the question, that they could not locate themselves in response lists, that they were concerned about data privacy, that they were concerned about answering the questions, and that they were offended when asked such questions. Given the increasing diversity of the United States and our interest in ensuring the most representative samples in our surveys, it is critical that we consider revised, more inclusive, and more relevant race and ethnicity questions for our respondents.
It’s a Trap!: Use of Trap Questions and Data Quality
Speakers: Mina Muller, Ipsos Public Affairs; Frances M. Barlas, Ipsos Public Affairs; Randall K. Thomas, Ipsos Public Affairs; Megan A. Hendrich, Ipsos Public Affairs
Researchers often have concerns over data quality due to inattentive or unmotivated respondents. Various measures have been developed to assess whether respondents are providing accurate responses. Besides speeding and response non-differentiation, researchers sometimes embed trap questions that can be used to detect when someone is not paying attention. A compliance trap directs a respondent to select a particular response (e.g., “Select ‘Somewhat agree’ for this item”) regardless of the question. We studied whether cleaning out respondents who fail such traps would improve data quality. In a study with over 3,500 completes from an online probability-based panel, respondents were randomly assigned to experience two compliance traps or not. We examined respondent reactions following the presentation of the traps, and we also looked at any reduction in bias using 10 demographic items for which we had benchmark values. We also looked at how trap failure was related to speed of completion (another indicator of sub-optimal behavior). For both trap conditions, we found that higher trap failure had a modest association with faster completion times. We found no difference in average bias between the full samples and the samples from which participants who failed traps had been removed. It appears that trap failures are not as closely related to data quality as many have believed.
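As a rough illustration of the bias comparison described above, the sketch below computes average bias as the mean absolute difference between sample estimates and benchmark values. That definition is an assumption, since the abstract does not specify how average bias was calculated, and the item names and percentages are invented purely for illustration.

```python
def average_absolute_bias(estimates, benchmarks):
    """Mean absolute difference (in percentage points) across all benchmark items."""
    items = sorted(benchmarks)
    return sum(abs(estimates[k] - benchmarks[k]) for k in items) / len(items)


# Invented values for illustration only; these are not results from the study.
benchmarks = {"female": 51.0, "age_65_plus": 22.0, "college_degree": 35.0}
full_sample = {"female": 53.0, "age_65_plus": 25.0, "college_degree": 39.0}
after_trap_cleaning = {"female": 54.0, "age_65_plus": 24.0, "college_degree": 40.0}

bias_full = average_absolute_bias(full_sample, benchmarks)
bias_cleaned = average_absolute_bias(after_trap_cleaning, benchmarks)
```

Computing the same summary for the full sample and for the sample with trap failures removed gives two numbers that can be compared directly, which is the comparison the abstract reports on.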
For more information about this conference, please visit the FCSM website.
Randall K. Thomas, SVP and Chief Survey Methodologist, Public Affairs