Population surveys are one of the most important tools for tapping how much citizens know about science and technology, how they perceive potential risks and benefits, and what their attitudes are about emerging technologies or research on particular applications. The encyclopedia, including the full chapter on surveys, is scheduled to appear with Sage in July 2010.
Sample surveys are defined as systematic studies of a geographically dispersed population by interviewing a sample of only certain members in an attempt to generalize to their population. Two terms of this definition are particularly important: “systematic” and “generalizable.”
The idea of systematically studying a population is a first main goal of sample surveys. Surveys therefore typically rely on a standardized questionnaire in order to gather reliable and valid information from a wide variety of respondents. Reliability, in this context, refers to the idea that the same instrument – applied to comparable samples – will produce consistent results. But reliability is not enough. It is entirely possible, for example, that a questionnaire consistently measures the wrong construct. Validity therefore adds a second quality criterion, and refers to the idea that questionnaires need to provide not just consistent but also unbiased and accurate measurements of people’s behaviors, attitudes, etc.
Reliability and validity are tied to a number of factors in the survey process. But two aspects are particularly important when constructing a questionnaire: the overall structure of the questionnaire and wording of specific questions.
When structuring a survey questionnaire, the first concern is length. If a survey takes too much time to complete, it will likely result in significant incompletion rates. Unfortunately, the respondents who tend to drop out of lengthy surveys are not a random subset of the population. Rather, they tend to be – among other characteristics – younger, more mobile, and employed full-time. As a result, excessively long survey instruments often produce samples that are plagued by systematic non-response among particular groups in the population, and are therefore limited in terms of their generalizability (see below).
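The effect of systematic drop-out on a survey estimate can be illustrated with a little arithmetic. The sketch below is only an illustration under invented assumptions: the two groups, their population shares, their levels of support, and their completion rates are all hypothetical numbers, not findings from any survey.

```python
# Hypothetical population split into two groups. All numbers are invented
# for illustration: "share" is the group's share of the population,
# "support" is the true proportion favoring more research, and
# "completion" is the proportion who finish a lengthy questionnaire.
groups = {
    "younger, mobile, employed full-time": {
        "share": 0.40, "support": 0.65, "completion": 0.50,
    },
    "everyone else": {
        "share": 0.60, "support": 0.45, "completion": 0.90,
    },
}


def true_support(groups):
    """Support in the full population, weighting each group by its share."""
    return sum(g["share"] * g["support"] for g in groups.values())


def observed_support(groups):
    """Support among completed interviews only (no weighting correction)."""
    completed = sum(g["share"] * g["completion"] for g in groups.values())
    supporters = sum(
        g["share"] * g["completion"] * g["support"] for g in groups.values()
    )
    return supporters / completed


population_value = true_support(groups)
sample_value = observed_support(groups)
print(f"true support:     {population_value:.3f}")
print(f"observed support: {sample_value:.3f}")
print(f"bias:             {sample_value - population_value:+.3f}")
```

In this made-up scenario the group that drops out more often is also the more supportive one, so the completed interviews understate support in the population. A larger sample would not fix this, because the error comes from who responds rather than from how many respond.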
A second concern with respect to questionnaire construction is the way questions are ordered on the questionnaire. Well-constructed questionnaires typically ask easy-to-answer questions first and sensitive or embarrassing questions later in the questionnaire. One of the most common pitfalls in survey instruments is priming, i.e., the notion that some questions can make certain considerations (for instance, risks or benefits of a specific technology) more salient in a respondent’s mind and therefore influence how he or she answers subsequent questions (for an overview, see Zaller & Feldman, 1992).
In addition to questionnaire structure, the wording of specific questions is a critical variable in building a valid instrument. In particular, well-constructed questionnaires use language and terminology that is designed to avoid biases. Such biases may stem from language that is likely to be more accessible to some respondents than others (e.g., terms that are more likely to be understood by certain ethnic groups or education-based cohorts) or that favors respondents who are more interested in or know more about science and technology in the first place. Any wording that feeds into these potential biases introduces systematic measurement error, since it does not produce an equally valid measure across all groups of the population.
These concerns about systematic measurement error are particularly relevant for a researcher’s ability to generalize from a sample to the general population. This is both a statistical and a substantive problem.
From a statistical perspective, surveys are designed to allow researchers to make inferences from observed sample statistics (e.g., 52 percent of the sample favor more research on a particular technology) to unobservable population parameters (the proportion of people favoring this research in the population). For surveys based on probability sampling (i.e., surveys that give each person in the population a known, nonzero chance of being selected into the sample, which in the simplest designs is the same chance for everyone) the margin of error provides an indicator of how close the statistic observed in a sample is likely to be to the population parameter, and how certain researchers can be about this inference (usually calculated at a confidence level of 95%). For the example above, a margin of error of +/-3 percentage points would therefore indicate that we can be 95% certain that the true level of support for more research in the population falls somewhere between 49% and 55%.
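The 52 percent example can be worked through with the standard large-sample approximation for a proportion. This is a minimal sketch, assuming a simple random sample. The sample size of 1,067 is a hypothetical value chosen only because it yields a margin of error close to 3 points, and the function names are illustrative.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample and uses the normal approximation;
    z = 1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


def confidence_interval(p_hat, n, z=1.96):
    """Lower and upper bounds of the interval around the sample proportion."""
    moe = margin_of_error(p_hat, n, z)
    return p_hat - moe, p_hat + moe


p_hat = 0.52  # sample statistic from the example in the text
n = 1067      # hypothetical sample size, not taken from the text

moe = margin_of_error(p_hat, n)
low, high = confidence_interval(p_hat, n)
print(f"margin of error: +/-{moe * 100:.1f} points")
print(f"95% interval: {low * 100:.1f}% to {high * 100:.1f}%")
```

Because the margin of error shrinks with the square root of the sample size, halving it requires roughly four times as many respondents. Complex sampling designs and weighting generally call for adjustments that this approximation ignores.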
But generalizability of survey results goes beyond just statistical considerations – especially for scientific issues, such as nanotechnology or stem cell research. Given the interplay between societal dynamics, scientific complexities, and a lack of widespread awareness, some have raised concerns about the appropriateness of using large scale surveys to tap public reactions to science and technology. These concerns typically fall into one of two categories that are both extremely important for any type of polling: first, what are we doing with people who are not fully aware or knowledgeable about the issue that we are interested in, and, second, can we capture an issue in all its complexities in a short survey?
The concern about unaware respondents is not unique to polling about science and technology. Political surveys routinely show that large proportions of the U.S. public are unable to accurately place presidential candidates relative to one another, even on simple issues, such as gun control (e.g., Patterson, 2002). And in fact, attitude formation about political and scientific issues – for many citizens – has little to do with awareness of or knowledge about the specifics of a particular issue (Scheufele, 2006).
In order to make sure that all respondents have the same minimal baseline understanding of the technology that is being studied, surveys typically provide a short introduction to the issue as part of the question. Ideally, this introduction is comprehensive, but does not influence answers to subsequent questions by priming respondents about particular risks or benefits of the technology.
The second concern often raised about the substantive generalizability of survey results on science and technology is how much detail a telephone survey can go into. Some have argued, in fact, that the systematic nature of standardized surveys is directly at odds with the need for an in-depth and contextualized understanding of how citizens interact with emerging technologies.
And of course these critics are right to a certain degree. Phone surveys, for instance, have clear constraints with respect to length and to the number of questions that can be asked about a single topic. Respondents participate on a voluntary basis and they spend a substantial amount of time on the phone with the interviewer. If researchers ask too many questions about a given topic or if the interview is too long, people tend to get bored or even annoyed and hang up. And this is not just a problem of having fewer respondents overall. Rather, as outlined earlier, if an interview is too long or goes into too much detail it usually creates problems with representativeness.
What we end up with, in this case, is a sample of people that is no longer representative of the overall population. And that, of course, hurts the validity of a poll because it no longer does what it is intended to do, i.e., capture the opinions of everybody in a given population, not just people who are more interested in a given issue or who happen to have more time to respond to a pollster's questions.
As a result, it is important to understand surveys for what they are, i.e., one method of data collection that allows researchers to tap behaviors, levels of knowledge, and public attitudes toward science and technology in a very systematic and generalizable fashion. This comes with trade-offs related to the complexity of data that surveys provide. In particular, large-scale population surveys are concerned with social patterns across large groups of respondents, and pay less attention to the potential complexity of a particular respondent’s belief system, for instance, and how it has developed over the course of his or her life.
Surveys can also be limited in how much they allow for causal inferences. This is particularly problematic for cross-sectional surveys, i.e., data collections at one point in time. Cross-sectional surveys may show a statistical correlation between exposure to science news in newspapers and scientific literacy, for instance, but they typically cannot provide conclusive evidence on the direction of this link. In other words, are knowledgeable respondents more likely to read the science section in newspapers, or does exposure to science news promote learning about science? Answers to these questions are typically provided by other research designs, some survey-based and some not.
Among the survey-based approaches that allow researchers to make some inferences about causality are longitudinal survey designs. These fall into three categories. Trend studies use multiple data collections with different samples to track responses to the same question over time. While trend studies can help researchers identify aggregate-level changes, they do not provide insights into how individual respondents change over time. Panel studies address this problem by providing multiple data collections over time for the exact same set of respondents. Cohort studies, finally, are concerned with the effects that socialization or other influences have during certain periods of people’s lives. Is there a difference, for example, between respondents who went to college during the first moon landing and those who went to college in the 1990s with respect to levels of interest in science and technology and science media use over the course of their lives? In order to answer these questions, cohort analyses examine different subgroups (or cohorts), often defined by age, and compare their development as they grow older.
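The difference between aggregate-level and individual-level change can be made concrete with a toy data set. The sketch below uses six invented respondents interviewed in two waves; the identifiers and answers are hypothetical. For simplicity the same six people stand in for both waves, whereas a real trend study would draw a fresh sample each time.

```python
# Hypothetical two-wave panel: respondent id -> (wave 1 answer, wave 2 answer),
# where 1 means the respondent favors more research and 0 means they do not.
panel = {
    "r1": (1, 0),
    "r2": (0, 1),
    "r3": (1, 1),
    "r4": (0, 0),
    "r5": (1, 0),
    "r6": (0, 1),
}


def aggregate_support(panel, wave):
    """Share of respondents in favor in a given wave (0 = first, 1 = second).

    This is the only quantity a trend study could compare across waves.
    """
    answers = [responses[wave] for responses in panel.values()]
    return sum(answers) / len(answers)


def switchers(panel):
    """Respondents whose answer changed between waves.

    Identifying them requires re-interviewing the same people, i.e., panel data.
    """
    return sorted(rid for rid, (w1, w2) in panel.items() if w1 != w2)


wave1 = aggregate_support(panel, 0)
wave2 = aggregate_support(panel, 1)
changed = switchers(panel)
print(f"support in wave 1: {wave1:.2f}")
print(f"support in wave 2: {wave2:.2f}")
print(f"respondents who changed their answer: {changed}")
```

In this invented example overall support is identical in both waves, so a trend comparison would suggest that nothing happened, even though four of the six respondents changed their minds. Only the panel structure reveals that individual-level movement.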
Dillman, D. A. (2007). Mail and Internet surveys: The tailored design method (2nd ed.). New York, NY: Wiley.
Patterson, T. E. (2002). The vanishing voter: Public involvement in an age of uncertainty. New York, NY: Alfred A. Knopf.
Scheufele, D. A. (2006). Messages and heuristics: How audiences form attitudes about emerging technologies. In J. Turney (Ed.), Engaging science: Thoughts, deeds, analysis and action (pp. 20-25). London: The Wellcome Trust.
Zaller, J., & Feldman, S. (1992). A simple theory of survey response: Answering questions versus revealing preferences. American Journal of Political Science, 36(3), 579-616.
Sunday, October 11, 2009
Here are a few excerpts from an entry on "Surveys" I just wrote for Susanna Priest's forthcoming Encyclopedia of Science and Technology Communication.