11 Sampling Methods for Web and E-mail Surveys

                                                                               Ronald D. Fricker, Jr

ABSTRACT

This chapter is a comprehensive overview of sampling methods for web and e-mail (‘Internet-based’) surveys. It reviews the various types of sampling method – both probability and non-probability – and examines their applicability to Internet-based surveys. Issues related to Internet-based survey sampling are discussed, including difficulties assembling sampling frames for probability sampling, coverage issues, and nonresponse and selection bias. The implications of the various survey mode choices on statistical inference and analyses are summarized.

INTRODUCTION

In the context of conducting surveys or collecting data, sampling is the selection of a subset of a larger population to survey. This chapter focuses on sampling methods for web and e-mail surveys, which taken together we call ‘Internet-based’ surveys. In our discussion we will frequently compare sampling methods for Internet-based surveys to various types of non-Internet-based surveys, such as those conducted by postal mail and telephone, which in the aggregate we refer to as ‘traditional’ surveys.

The chapter begins with a general overview of sampling. Since there are many fine textbooks on the mechanics and mathematics of sampling, we restrict our discussion to the main ideas that are necessary to ground our discussion on sampling for Internet-based surveys. Readers already well versed in the fundamentals of survey sampling may wish to proceed directly to the section on Sampling Methods for Internet-based Surveys.

WHY SAMPLE?

Surveys are conducted to gather information about a population. Sometimes the survey is conducted as a census, where the goal is to survey every unit in the population. However, it is frequently impractical or impossible to survey an entire population, perhaps owing to either cost constraints or some other practical constraint, such as that it may not

[17:36 4/3/2008 5123-Fielding-Ch11.tex]   Paper: a4   Job No: 5123    Fielding: Online Research Methods (Handbook)   Page: 195   195–217
196                        THE SAGE HANDBOOK OF ONLINE RESEARCH METHODS

be possible to identify all the members of the population.

An alternative to conducting a census is to select a sample from the population and survey only those sampled units. As shown in Figure 11.1, the idea is to draw a sample from the population and use data collected from the sample to infer information about the entire population. To conduct statistical inference (i.e., to be able to make quantitative statements about the unobserved population statistic), the sample must be drawn in such a fashion that one can both calculate appropriate sample statistics and estimate their standard errors. To do this, as will be discussed in this chapter, one must use a probability-based sampling methodology.

Figure 11.1 An illustration of sampling. When it is impossible or infeasible to observe a population statistic directly, data from a sample appropriately drawn from the population can be used to infer information about the population. [Diagram: a Sample is drawn from the Population; the Sample statistic is used for inference about the Unobserved population statistic.]

A survey administered to a sample can have a number of advantages over a census, including:

• lower cost
• less effort to administer
• better response rates
• greater accuracy.

The advantages of lower cost and less effort are obvious: keeping all else constant, reducing the number of surveys should cost less and take less effort to field and analyze. However, that a survey based on a sample rather than a census can give better response rates and greater accuracy is less obvious. Yet, greater survey accuracy can result when the sampling error is more than offset by a decrease in nonresponse and other biases, perhaps due to increased response rates. That is, for a fixed level of effort (or funding), a sample allows the surveying organization to put more effort into maximizing responses from those surveyed, perhaps via more effort invested in survey design and pre-testing, or perhaps via more detailed non-response follow-up.

What does all of this have to do with Internet-based surveys? Before the Internet, large surveys were generally expensive to administer and hence survey professionals gave careful thought to how to best conduct a survey in order to maximize information accuracy while minimizing costs. However, as illustrated in Figure 11.2, the Internet now provides easy access to a plethora of inexpensive survey software, as well as to millions of potential survey respondents, and it has lowered other costs and barriers to surveying. While this is good news for survey researchers, these same factors have also facilitated a proliferation of bad survey-research practice.

Figure 11.2 Banners for various Internet survey software (accessed January 2007)

For example, in an Internet-based survey the marginal cost of collecting additional data can be virtually zero. At first blush, this seems to be an attractive argument in favour of attempting to conduct censuses, or for simply surveying large numbers of individuals without regard to how the individuals are recruited into the sample. And, in fact, these approaches are being used more frequently with Internet-based surveys, without much thought being given to alternative sampling strategies or to the potential impact such choices have on the accuracy of the survey results. The result is a proliferation of poorly conducted ‘censuses’ and surveys based on large convenience samples that are likely to yield less accurate information than a well-conducted survey of a smaller sample.

Conducting surveys, as in all forms of data collection, requires making compromises. Specifically, there are almost always trade-offs to be made between the amount of data that can be collected and the accuracy of the data collected. Hence, it is critical for researchers to have a firm grasp of the trade-offs they implicitly or explicitly make when choosing a sampling method for collecting their data.

AN OVERVIEW OF SAMPLING

There are many ways to draw samples from a population – and there are also many ways that sampling can go awry. We intuitively think of a good sample as one that is representative of the population from which the sample has been drawn. By ‘representative’ we do not necessarily mean the sample matches the population in terms of observable characteristics, but rather that the results from the data we collect from the sample are consistent with the results we would have obtained if we had collected data on the entire population.
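The point made above, that a large convenience sample can yield less accurate information than a well-conducted survey of a smaller sample, can be illustrated with a small simulation. The sketch below is not part of the original chapter and should be read as one possible illustration under stated assumptions: the population, the quantity measured (weekly hours online), the opt-in mechanism, and the function name simulate are all hypothetical. It draws a small simple random sample, for which a standard error can be estimated, and compares it with a much larger self-selected sample in which heavier users are assumed to be more likely to respond.

```python
import random
import statistics


def simulate(seed=0, pop_size=100_000, srs_size=500):
    """Compare a small simple random sample with a large opt-in sample.

    Every number here is invented for illustration only.
    """
    rng = random.Random(seed)

    # Hypothetical population: weekly hours online (negative draws set to 0).
    population = [max(0.0, rng.gauss(10, 6)) for _ in range(pop_size)]
    true_mean = statistics.fmean(population)

    # Probability sample: every unit has the same known chance of selection,
    # so a standard error for the sample mean can be estimated.
    srs = rng.sample(population, srs_size)
    srs_mean = statistics.fmean(srs)
    srs_se = statistics.stdev(srs) / srs_size ** 0.5

    # Opt-in sample: assume the chance of responding grows with hours online
    # (propensity = hours / 40, capped at 1). No selection probabilities are
    # known to the analyst, so no valid standard error is available.
    opt_in = [x for x in population if rng.random() < min(1.0, x / 40)]
    opt_in_mean = statistics.fmean(opt_in)

    return {
        "true_mean": true_mean,
        "srs_n": srs_size,
        "srs_mean": srs_mean,
        "srs_se": srs_se,
        "opt_in_n": len(opt_in),
        "opt_in_mean": opt_in_mean,
    }


if __name__ == "__main__":
    for key, value in simulate().items():
        print(f"{key:>12}: {value:,.2f}")
```

Under these assumptions the opt-in sample is many times larger than the simple random sample, yet its mean overstates the population mean, whereas the estimate from the small random sample typically falls within a few standard errors of it. Collecting more opt-in responses would not remove the discrepancy, because it comes from who responds rather than from how many respond.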


Of course, the phrase ‘consistent with’ is vague and, if this were an exposition of the mathematics of sampling, would require a precise definition. However, we will not cover the details of survey sampling here.¹ Rather, in this section we will describe the various sampling methods and discuss the main issues in characterizing the accuracy of a survey, with a particular focus on terminology and definitions, in order that we can put the subsequent discussion about Internet-based surveys in an appropriate context.

Sources of error in surveys

The primary purpose of a survey is to gather information about a population. However, even when a survey is conducted as a census, the results can be affected by several sources of error. A good survey design seeks to reduce all types of error – not only the sampling error arising from surveying a sample of the population. Table 11.1 below lists the four general categories of survey error as presented and defined in Groves (1989) as part of his ‘Total Survey Error’ approach.

Table 11.1 Sources of survey error according to Groves (1989)

Type of error   Definition
Coverage        ‘…the failure to give any chance of sample selection to some persons in the population’.
Sampling        ‘…heterogeneity on the survey measure among persons in the population’.
Nonresponse     ‘…the failure to collect data on all persons in the sample’.
Measurement     ‘…inaccuracies in responses recorded on the survey instruments’.

Errors of coverage occur when some part of the population cannot be included in the sample. To be precise, Groves specifies three different populations:

1 The population of inference is the population that the researcher ultimately intends to draw conclusions about.
2 The target population is the population of inference less various groups that the researcher has chosen to disregard.
3 The frame population is that portion of the target population which the survey materials or devices delimit, identify, and subsequently allow access to (Wright and Tsao, 1983).

The survey sample then consists of those members of the sampling frame that were chosen to be surveyed, and coverage error is the difference between the frame population and the population of inference.

The two most common approaches to reducing coverage error are:

• obtaining as complete a sampling frame as possible (or employing a frameless sampling strategy in which most or all of the target population has a positive chance of being sampled);
• post-stratifying to weight the survey sample to match the population of inference on some observed key characteristics.

Sampling error arises when a sample of the target population is surveyed. It results from the fact that different samples will generate different survey data. Roughly speaking, assuming a random sample, sampling error is reduced by increasing the sample size.

Nonresponse errors occur when data is not collected on either entire respondents (unit nonresponse) or individual survey questions (item nonresponse). Groves (1989) calls nonresponse ‘an error of nonobservation’. The response rate, which is the ratio of the number of survey respondents to the number sampled, is often taken as a measure of how well the survey results can be generalized. Higher response rates are taken to imply a lower likelihood of nonresponse bias.

Measurement error arises when the survey response differs from the ‘true’ response. For example, respondents may not answer sensitive questions honestly for a variety of reasons, or respondents may misinterpret or make errors in answering questions. Measurement error is reduced in a variety of ways, including careful testing and revision of the survey instrument and questions, choice of survey mode or modes, etc.

Sampling methods

Survey sampling can be grouped into two broad categories: probability-based sampling (also loosely called ‘random sampling’) and non-probability sampling. A probability-based sample is one in which the respondents are selected using some sort of probabilistic mechanism, and where the probability with which every member of the frame population could have been selected into the sample is known. The sampling probabilities do not necessarily have to be equal for each member of the sampling frame.

Types of probability sample include:

• Simple random sampling (SRS) is a method in which any two groups of equal size in the population are equally likely to be selected. Mathematically, simple random sampling selects n units out of a population of size N such that every sample of size n has an equal chance of being drawn.
• Stratified random sampling is useful when the population is comprised of a number of homogeneous groups. In these cases, it can be either practically or statistically advantageous (or both) to first stratify the population into the homogeneous groups and then use SRS to draw samples from each group.
• Cluster sampling is applicable when the natural sampling unit is a group or cluster of individual units. For example, in surveys of Internet users it is sometimes useful or convenient to first sample by discussion groups or Internet domains, and then to sample individual users within the groups or domains.
• Systematic sampling is the selection of every kth element from a sampling frame or from a sequential stream of potential respondents. Systematic sampling has the advantage that a sampling frame does not need to be assembled beforehand. In terms of Internet surveying, for example, systematic sampling can be used to sample sequential visitors to a website. The resulting sample is considered to be a probability sample as long as the sampling interval does not coincide with a pattern in the sequence being sampled and a random starting point is chosen.

There are important analytical and practical considerations associated with how one draws and subsequently analyzes the results from each of these types of probability-based sampling scheme, but space limitations preclude covering them here. Readers interested in such details should consult texts such as Kish (1965), Cochran (1977), Fink (2003), or Fowler (2002).

Non-probability samples, sometimes called convenience samples, occur when either the probability with which each unit or respondent is included in the sample cannot be determined, or it is left up to each individual to choose to participate in the survey. For probability samples, the surveyor selects the sample using some probabilistic mechanism and the individuals in the population have no control over this process. In contrast, for example, a web survey may simply be posted on a website where it is left up to those browsing through the site to decide to participate in the survey (‘opt in’) or not. As the name implies, such non-probability samples are often used because it is somehow convenient to do so.

While in a probability-based survey participants can choose not to participate in the survey (‘opt out’), rigorous surveys seek to minimize the number who decide not to participate (i.e., nonresponse). In both cases it is possible to have bias, but in non-probability surveys the bias has the potential to be much greater, since it is likely that those who opt in are not representative of the general population. Furthermore, in non-probability surveys there is often no way to assess the potential magnitude of the bias, since there is generally no information on those who chose not to opt in.

Non-probability-based samples often require much less time and effort, and thus usually are less costly to generate, but generally they do not support statistical inference. However, non-probability-based samples can be useful for research in other ways. For example, early in the course of research, responses from a convenience sample might be useful in developing research hypotheses. Responses from convenience samples might also be useful for identifying


               issues, defining ranges of alternatives, or                    Taking larger samples will not correct for
               collecting other sorts of non-inferential data.             bias, nor is a large sample evidence of a lack
               For a detailed discussion on the application                of bias. For example, an estimate of average
               of various types of non-probability-based                   computer usage based on a sample of Internet
               sampling method to qualitative research, see                users will likely overestimate the average
               Patton (2002).                                              usage in the general population regardless
                  Specific types of non-probability samples                of how many Internet users are surveyed.
               include the following.                                      Randomization is used to minimize the chance
                                                                           of bias. The idea is that by randomly choosing
               • Quota sampling requires the survey researcher             potential survey respondents the sample is
                 only to specify quotas for the desired number             likely to ‘look like’ the population, even in
                 of respondents with certain characteristics. The          terms of those characteristics that cannot be
                 actual selection of respondents is then left up           observed or known.
                 to the survey interviewers who must match the                Variance, on the other hand, is simply a
                 quotas. Because the choice of respondents is left
                                                                           measure of variation in the observed data.
                 up to the survey interviewers, subtle biases may
                                                                           It is used to calculate the standard error of a
                 creep into the selection of the sample (see, for
                 example, the Historical Survey Gaffes section).           statistic, which is a measure of the variability
               • Snowball sampling is often used when the                  of the statistic. The precision of statistical
                 desired sample characteristic is so rare that it is       estimates drawn via probabilistic sampling
                 extremely difficult or prohibitively expensive to          mechanisms is improved by larger sample
                 locate a sufficiently large number of respondents          sizes.
                 by other means (such as simple random sampling).
                 Snowball sampling relies on referrals from initial
                 respondents to generate additional respondents.           Some important sources of bias
                 While this technique can dramatically lower search
                 costs, it comes at the expense of introducing             Bias can creep into survey results in many
                 bias because the technique itself substantially           different ways. In the absence of significant
                 increases the likelihood that the sample will not         nonresponse, probability-based sampling is
                 be representative of the population.                      assumed to minimize the possibility of bias.
               • Judgement sampling is a type of convenience sam-          Convenience sampling, on the other hand, is
                 pling in which the researcher selects the sample          generally assumed to have a higher likelihood
                 based on his or her judgement. For example, a
                                                                           of generating a biased sample. However,
                 researcher may decide to draw the entire random
                                                                           even with randomization, surveys of and
                 sample from one ‘representative’ Internet-user
                 community, even though the population of interest         about people may be subject to other kinds
                 includes all Internet users. Judgment sampling            of bias. For example, respondents may be
                 can also be applied in even less structured               inclined to over-or understate certain things
                 ways without the application of any random                (‘sensitivity bias’), particularly with socially
                 sampling.                                                 delicate questions (such as questions about
                                                                           income or sexual orientation, for example).
                                                                           Here we just focus on some of the more
               Bias versus variance                                        common sources of bias related to sampling.
               If a sample is systematically not representative
               of the population of inference in some way,                 • Frame coverage bias occurs when the sampling
                                                                             frame misses some important part of the
               then the resulting analysis is biased. For exam-
                                                                             population. For example, an e-mail survey using
               ple, results from a survey of Internet users
                                                                             a list of e-mail addresses will miss those without
               about personal computer usage is unlikely                     an e-mail address.
               to accurately quantify computer usage in                    • Selection bias is an error in how the individual or
               the general population, simply because the                    units are chosen to participate in the survey. It can
               sample is comprised only of those who use                     occur, for example, if survey participation depends
               computers.                                                    on the respondents having access to particular

[17:36 4/3/2008 5123-Fielding-Ch11.tex]   Paper: a4   Job No: 5123   Fielding: Online Research Methods (Handbook)            Page: 200   195–217
SAMPLING METHODS FOR WEB AND E-MAIL SURVEYS                                       201

  equipment, such as Internet-based surveys that miss those without Internet access.
• Size bias occurs when some units have a greater chance of being selected than others. For example, in a systematic sample of website visitors, frequent site visitors are more likely to be selected into the sample than those who visit infrequently. In a similar vein, when selecting from a frame consisting of e-mail addresses, individuals with multiple e-mail addresses would have a higher chance of being selected into a sample.
• Nonresponse bias occurs if those who refuse to answer the survey are somehow systematically different from those who do answer it.

Historical survey gaffes

A famous example of a survey that reached exactly the wrong inferential conclusion as a result of bias, in this case frame coverage and nonresponse bias, is the ‘Literary Digest’ poll in the 1936 United States presidential election. As described in Squires (1988), for their survey ‘Literary Digest’ assembled a sampling frame from telephone numbers and automobile registration lists. While using telephone numbers today might result in a fairly representative sample of the population, in 1936 only one in four households had a telephone and those were the more well-to-do. Compounding this, automobile registration lists only further skewed the frame towards individuals with higher incomes.

‘Literary Digest’ mailed 10 million straw-vote ballots, of which 2.3 million were returned, an impressively large number, but it represented less than a 25 percent response rate. Based on the poll data, ‘Literary Digest’ predicted that Alfred Landon would beat Franklin Roosevelt 55 percent to 41 percent. In fact, Roosevelt beat Landon by 61 percent to 37 percent. This was the largest error ever made by a major poll and is considered to be one of the causes of the demise of ‘Literary Digest’ in 1938.

Gallup, however, called the 1936 presidential election correctly, even though he used significantly less data. But even Gallup, a pioneer in modern survey methods, didn’t always get it right. In the 1948 United States presidential election between Harry S Truman and Thomas E. Dewey, Gallup used a quota sampling method in which each pollster was given a set of quotas of types of people to interview, based on demographics. While that seemed reasonable at the time, the survey interviewers, for whatever conscious or subconscious reason, were biased towards interviewing Republicans more often than Democrats. As a result, Gallup predicted a Dewey win of 49.5 percent to 44.5 percent, but almost the opposite occurred, with Truman beating Dewey with 49.5 percent of the popular vote to Dewey’s 45.1 percent (a difference of almost 2.2 million votes).2

SAMPLING METHODS FOR INTERNET-BASED SURVEYS

This section describes specific types of Internet-based survey and the sampling methods that are applicable to each. We concentrate on differentiating whether particular sampling methods and their associated surveys allow for generalization of survey results to populations of inference or not, providing examples of some surveys that were done appropriately and well, and others that were less so. Examples that fall into the latter category should not be taken as a condemnation of a particular survey or sampling method, but rather as illustrations of inappropriate application, execution, analysis, etc. Couper (2000: 465–466) perhaps said it best,

    Any critique of a particular Web survey approach must be done in the context of its intended purpose and the claims it makes. Glorifying or condemning an entire approach to survey data collection should not be done on the basis of a single implementation, nor should all Web surveys be treated as equal.

Furthermore, as we previously discussed, simply because a particular method does not allow for generalizing beyond the sample does not imply that the methods and resulting data are not useful in other research contexts.

As in Couper (2000), Table 11.2 lists the most common probability and non-probability sampling methods, and indicates which Internet-based survey mode or modes


                                Table 11.2 Types of Internet-based survey and associated
                                sampling methods
                                Sampling method                                             Web            E-mail
                                Probability-based
                                Surveys using a list-based sampling frame                   ✓              ✓
                                Surveys using non-list-based random sampling                ✓              ✓
                                Intercept (pop-up) surveys                                  ✓
                                Mixed-mode surveys with Internet-based option               ✓              ✓
                                Pre-recruited panel surveys                                 ✓              ✓
                                Non-probability
                                Entertainment polls                                         ✓
                                Unrestricted self-selected surveys                          ✓
                                Surveys using ‘harvested’ e-mail lists (and data)           ✓              ✓
                                Surveys using volunteer (opt-in) panels                     ✓

may be used with each method. For example, it is possible to conduct both web and e-mail surveys using a list-based sampling frame methodology. Conversely, while it is feasible to conduct an entertainment poll by e-mail, virtually all such polls are conducted via web surveys.

Surveys using a list-based sampling frame

Sampling for Internet-based surveys using a list-based sampling frame can be conducted just as one would for a traditional survey using a sampling frame. Simple random sampling in this situation is straightforward to implement and requires nothing more than contact information (generally an e-mail address for an Internet-based survey) on each unit in the sampling frame. Of course, though only contact information is required to field the survey, having additional information about each unit in the sampling frame is desirable to assess (and perhaps adjust for) nonresponse effects.

While Internet-based surveys using list-based sampling frames can be conducted either via the web or by e-mail, if an all-electronic approach is preferred the invitation to take the survey will almost always be made via e-mail. And, because e-mail lists of general populations are generally not available, this survey approach is most applicable to large homogeneous groups for which a sampling frame with e-mail addresses can be assembled (for example, universities, government organizations, large corporations, etc.). Couper (2000) calls these ‘list-based samples of high-coverage populations’.

In more complicated sampling schemes, such as stratified sampling, auxiliary information about each unit, such as membership in the relevant strata, must be available and linked to the unit’s contact information. And more complicated multi-stage and cluster sampling schemes can be difficult or even impossible to implement for Internet-based surveys. First, implementing them without directly contacting respondents will likely require significant auxiliary data, which is unlikely to be available except in the case of specialized populations. Second, if (non-Internet-based) contact is required, then the researchers are likely to have to resort to the telephone or mail in order to ensure that sufficient coverage and response rates are achieved.

An example of a multi-stage sampling procedure, used for an Internet-based survey of real-estate journalists for which no sampling frame existed, is reported by Jackob et al. (2005). For this study, the researchers first assembled a list of publications that would have journalists relevant to the study. From this list a stratified random sample of publications was drawn, separately for each of five European countries. They then contacted the managing editor at each sampled publication and obtained the necessary contact information on all of the journalists that were ‘occupied with


real-estate issues’. All of the journalists identified by the managing editors were then solicited to participate in a web survey. Jackob et al. (2005) concluded that it ‘takes a lot of effort especially during the phase of preparation and planning’ to assemble the necessary data and then to conduct an Internet-based survey using a multi-stage sampling methodology.

Surveys using non-list-based random sampling

Non-list-based random sampling methods allow for the selection of a probability-based sample without the need to actually enumerate a sampling frame. With traditional surveys, random digit dialing (RDD) is a non-list-based random sampling method that is used mainly for telephone surveys.

There is no equivalent of RDD for Internet-based surveys. For example, it is not possible (practically speaking) to generate random e-mail addresses (see the Issues and Challenges in Internet-based Survey Sampling section). Hence, with the exception of intercept surveys, Internet-based surveys requiring non-list-based random sampling depend on contacting potential respondents via some traditional means such as RDD, which introduces other complications and costs. For example, surveyors must either screen potential respondents to ensure they have Internet access or field a survey with multiple response modes. Surveys with multiple response modes introduce further complications, both in terms of fielding complexity and possible mode effects (again, see the Issues and Challenges in Internet-based Survey Sampling section).

Intercept surveys

Intercept surveys on the web are pop-up surveys that frequently use systematic sampling of every kth visitor to a website or web page. These surveys seem to be most useful as customer-satisfaction surveys or marketing surveys. This type of systematic sampling can provide information that is generalizable to particular populations, such as those that visit a particular website/page. The surveys can be restricted to only those with certain IP (Internet Protocol) addresses, allowing one to target more specific subsets of visitors, and ‘cookies’ can be used to restrict the submission of multiple surveys from the same computer.

A potential issue with this type of survey is nonresponse. Coomly (2000) reports typical response rates in the 15 to 30 percent range, with the lowest response rates occurring for poorly targeted and/or poorly designed surveys. The highest response rates were obtained for surveys that were relevant to the individual, either in terms of the particular survey questions or, in the case of marketing surveys, the commercial brand being surveyed.

As discussed in Couper (2000), an important issue with intercept surveys is that there is no way to assess nonresponse bias, simply because no information is available on those that choose not to complete a survey. Coomly (2000) hypothesizes that responses may be biased towards those who are more satisfied with a particular product, brand, or website; towards those web browsers who are more computer and Internet savvy; and away from heavy Internet browsers who are conditioned to ignore pop-ups. Another source of nonresponse bias for intercept surveys implemented as pop-up browser windows may be pop-up blocker software, at least to the extent that pop-up blocker software is used differentially by various portions of the web-browsing community.

Pre-recruited panel surveys

Pre-recruited panels are, generally speaking, groups of individuals who have agreed in advance to participate in a series of surveys. For Internet-based surveys requiring probability samples, these individuals are generally recruited via some means other than the web or e-mail, most often by telephone or postal mail.

For a longitudinal effort consisting of a series of surveys, researchers may recruit


panel members specifically for that effort. For smaller efforts or for single surveys, a number of companies maintain panels of individuals, pre-recruited via a probability-based sampling methodology, from which sub-samples can be drawn according to a researcher’s specification. Knowledge Networks, for example, recruits all of its panel members via telephone using RDD, and it provides equipment and Internet access to those that do not have it in an attempt to maintain a panel that is a statistically valid cross section of the population (see Pineau and Dennis, 2004, for additional detail).

Pre-recruited, Internet-enabled panels can provide the speed of Internet-based surveys while simultaneously eliminating the often-lengthy recruitment process normally required. As such, they can be an attractive option to researchers who desire to field an Internet-based survey, but who require a sample that can be generalized to populations outside of the Internet-user community.

However, pre-recruited panels are not without their potential drawbacks. In particular, researchers should be aware that long-term panel participants may respond differently to surveys and survey questions than first-time participants (called ‘panel conditioning’ or ‘time-in-sample bias’). Also, nonresponse can be an issue if the combined loss of potential respondents throughout all of the recruitment and participation stages is significant. However, as Couper (2000) concludes, ‘… in theory at least, this approach begins with a probability sample of the full (telephone) population, and assuming no nonresponse error permits inference to the population…’.

Entertainment polls

Internet-based entertainment polls are ‘surveys’ conducted purely for their entertainment value (though they are sometimes passed off to be more than what they are). On the Internet, they largely consist of websites where any visitor can respond to a posted survey. An example of an Internet-based entertainment poll is The Weekly Web Poll (www.weeklywebpoll.com). The telephone equivalents of these types of polls are call-in polls (or cell-phone text-message polls) such as those advertised on various television shows, where viewers can vote for their favourite contestant or character. Of course, Internet-based entertainment polls are as unscientific as call-in telephone polls.

Surveys using ‘harvested’ e-mail lists

Harvested e-mail lists are sets of e-mail addresses collected from postings on the web and from individuals who are (wittingly or unwittingly) solicited for their e-mail addresses. There are many commercial entities (‘e-mail brokers’) that sell lists of e-mail addresses or access to lists of e-mail addresses (just Google ‘buy e-mail list’). Lists can also be assembled from resources on the web. For example, lists of Yahoo e-mail address holders by name or geographic area can be created by anyone via the Yahoo! People Search (http://email.people.yahoo.com/py/). Similarly, World Email.com (www.worldemail.com) has an e-mail search feature by name.

However, it is important to note that harvesting e-mail addresses and distributing unsolicited e-mail related to surveys could be a violation of professional ethical standards and/or illegal. For example, Article 13(1) of the European Union’s Privacy and Electronic Communications Directive prohibits the sending of unsolicited commercial e-mail. In a similar vein, the Council of American Survey Research Organizations (CASRO) Code of Standards and Ethics for Survey Research clearly states that using harvested e-mail addresses and sending unsolicited e-mail is unethical (www.casro.org/codeofstandards.cfm). For a more detailed discussion of the ethical considerations and implications, see Eynon et al., and Charlesworth (this volume), and Krishnamurthy (2002) and the references contained therein.

Samples derived from harvested e-mail lists are non-probability samples because they are based on a convenience sample of e-mail addresses. For example, Email


Marketing Blitz (www.email-marketing-blitz.com/customized_email_list.htm) says, ‘Our targeted optin [sic] email lists are updated monthly and gathered through special interest websites, entertainment websites and special alliances.’ Such samples should not be confused with list-based probability samples where the e-mail addresses in the list-based sample represent a (virtually) complete list of the e-mail addresses of some target population.

The efficacy of sending unsolicited surveys to a list of purchased or otherwise procured e-mail addresses is questionable. Not only do e-mail addresses turn over quite frequently, but many of those on the list may have been recruited without their knowledge, or may have inadvertently agreed by failing to uncheck a box when they signed up for something else. As a result, response rates are likely to be extremely low.

Unrestricted self-selected surveys

As with entertainment polls, unrestricted, self-selected surveys are surveys that are open to the public for anyone to participate in. They may simply be posted on a website so that anyone browsing through may choose to take the survey, or they may be promoted via website banners or other Internet-based advertisements, or they may be publicized in traditional print and broadcast media. Regardless of how they are promoted (or not), the key characteristics of these types of survey are that there are no restrictions on who can participate, and it is up to the individual to choose to participate (opt in).

For example, Berson et al. (2002) conducted a web-based survey ‘to better understand the risks to adolescent girls online’ by posting a link to their survey on the Seventeen Magazine Online website. Via the survey, the authors collected data on 10,800 respondents with ‘identified behaviours that put them at risk’. The researchers were careful to appropriately qualify their results:

    The results highlighted in this paper are intended to explore the relevant issues and lay the groundwork for future research on youth in cyberspace. This is considered an exploratory study which introduces the issues and will need to be supplemented with ongoing research on specific characteristics of risk and prevention intervention. Furthermore, the generalizability of the study results to the larger population of adolescent girls needs to be considered. Due to anonymity of the respondents, one of the limitations of the research design is the possibility that the survey respondents did not represent the experience of all adolescent girls or that the responses were exaggerated or misrepresented.

Unrestricted, self-selected surveys are a form of convenience sampling and, as such, the results cannot be generalized to a larger population. But as Berson et al. illustrate, that does not necessarily negate their usefulness for research.

The web can also facilitate access to individuals who are difficult to reach either because they are hard to identify, locate, or perhaps exist in such small numbers that probability-based sampling would be unlikely to reach them in sufficient numbers. Coomber (1997) describes such a use of the web for fielding a survey to collect information from drug dealers about drug adulteration/dilution. By posting invitations to participate in a survey on various drug-related discussion groups, Coomber collected data from 80 survey respondents (that he deemed reliable) located in 14 countries on four different continents. The sample was certainly not generalizable, but it also provided data that was unlikely to be collected in any other way, and which Coomber found consistent with other research.

In addition, Alvarez et al. (2002) proposed that these types of non-probability sample can be useful and appropriate for conducting experiments (say, in the design of web pages or web surveys) by randomly assigning members of the sample to control and experimental groups. In terms of psychology experiments, Siah (2005) states, ‘For experimental research on the Internet, the advantage of yielding a heterogeneous sample seems persuasive considering that the most common criticism of psychological research is its over-reliance on college student samples.’
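
The random assignment that Alvarez et al. (2002) propose can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used in any of the cited studies: the function name assign_groups, the respondent identifiers, and the seed value are all assumptions introduced here. The sketch shuffles a volunteer sample and splits it into control and experimental groups, so that group membership does not depend on how respondents happened to arrive at the survey.

```python
import random

def assign_groups(respondent_ids, seed=None):
    """Randomly split respondents into control and experimental groups.

    Hypothetical helper for illustration; not taken from the cited studies.
    """
    rng = random.Random(seed)        # local generator, reproducible when seeded
    shuffled = list(respondent_ids)  # copy so the caller's data is left untouched
    rng.shuffle(shuffled)            # random permutation of the sample
    half = len(shuffled) // 2
    # First half becomes the control group, the remainder the experimental group.
    return {"control": shuffled[:half], "experimental": shuffled[half:]}

# Ten made-up respondent identifiers from a self-selected web sample.
respondents = [f"r{i:02d}" for i in range(1, 11)]
groups = assign_groups(respondents, seed=2002)
print(groups["control"])
print(groups["experimental"])
```

Randomization of this kind supports comparisons between the two groups within the sample. It does not change the fact that the sample itself is a convenience sample, so the results still cannot be generalized to a larger population.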


Volunteer (opt-in) panels

Volunteer (opt-in) panels are similar in concept to the pre-recruited panels, except the volunteers are not recruited using a probability-based method. Rather, participants choose to participate, perhaps after coming across a solicitation on a website. In this regard, volunteer panels are similar to unrestricted, self-selected surveys except that those who opt in do so to take a continuing series of surveys. Harris Interactive manages such a volunteer panel. Its website (www.harrispollonline.com/became.asp) states, ‘You may have become a member of the Harris Poll Online in one of several ways:

• By registering directly with us through our website (http://www.harrispollonline.com); or
• By opting in to participate in the Harris Poll Online as a result of an offering made in conjunction with one of our many online partners.’

Often these panels are focused on market research, soliciting consumer opinions about commercial products, and participants sometimes do it for monetary incentives. For example, the Harris website states,

    We offer the opportunity to earn HIPoints for the majority of our studies. On occasion a study will be conducted that will not have HIPoints associated with it, but this only occurs in exceptions. Once you’ve accumulated enough points you may redeem them for your choice of a variety of great rewards (www.harrispollonline.com/benefit.asp).

ISSUES AND CHALLENGES IN INTERNET-BASED SURVEY SAMPLING

All survey modes have their strengths and weaknesses; Internet-based surveys are no different in this regard. The various strengths and weaknesses are more or less important, depending on the survey’s purpose.

issues and challenges related to sampling for Internet-based surveys.

Sampling Frame and Coverage Challenges

A frequent impediment to conducting large-scale, Internet-based surveys is the lack of a sampling frame. Simply put, no single registry or list of e-mail addresses exists and thus list-based sampling frames are generally available only for specific populations (government organizations, corporations, etc.).

Compounding this difficulty, and leaving aside the issue of population coverage to be discussed shortly, it is impossible to employ a frameless sampling strategy, since for all practical purposes one cannot assemble random e-mail addresses. Of course, it is theoretically possible to ‘construct’ e-mail addresses by repeatedly randomly concatenating letters, numbers, and symbols, but the sheer variety of e-mail addresses means most of the constructed addresses will not work. More importantly, the unstructured nature of the Internet means that even if one could tolerate the multitude of undeliverable e-mail messages that would result, they would not be useful as the basis for a probability sample.

In terms of coverage, it is widely recognized that Internet-based surveys using only samples of Internet users do not generalize to the general public. While Internet penetration into households continues at a rapid pace, the penetration is far from complete (compared to, say, the telephone) and varies widely by country and region of the world.3 The point is, if the target of inference is the general public, considerable coverage error remains for any sample drawn strictly from Internet users.

Now, even if there is minimal coverage error for a particular Internet-based survey effort, when using only an Internet-based survey mode the target population must also
               Drawing an appropriate sample that will                      be sufficiently computer-literate and have
               provide the data necessary to appropriately                  both regular and easy access to the Internet
               address the research objective is critical.                  to facilitate responding to the survey. Simply
               Hence, in this section we focus on the                       put, just because an organization maintains a


list of e-mail addresses for everyone in the organization it does not necessarily follow that every individual on the list has equal access. Lack of equal access could result in significant survey selection and nonresponse biases.

Mixed-mode surveys using Internet-based and traditional media

For some surveys it may be fiscally and operationally possible to contact respondents by some mode other than e-mail, such as mail or telephone. In these cases the survey target population can be broader than that for which an e-mail sampling frame is available, up to and including the general population. But at present such a survey must also use multiple survey modes to allow respondents without Internet access the ability to participate. Mixed-mode surveys may also be useful for alleviating selection bias for populations with uneven or unequal Internet access, and the sequential use of survey modes can increase response rates.

For example, Dillman (2007: 456) describes a study in which surveys that were fielded using one mode were then followed up with an alternate mode three weeks later. As shown in Table 11.3, in all cases the response rate increased after the follow-up. Now, of course, some of this increase can be attributed simply to the fact that a follow-up effort was conducted. However, the magnitude of the increases also suggests that offering a different response mode in the follow-up could be beneficial.

However, mixed-mode surveys are subject to other issues. Two of the most important are mode effects and respondent mode preferences. Mode effects arise when the type of survey affects how respondents answer questions. Comparisons between Internet-based surveys and traditional surveys have found conflicting results, with some researchers reporting mode effects and others not. See, for example, the discussion and results in Schonlau et al. (2004: 130). Though not strictly a sampling issue, the point is that researchers should be prepared for the existence of mode effects in a mixed-mode survey. Vehovar and Manfreda’s overview chapter (this volume) explores in greater detail the issues of combining data from Internet-based and traditional surveys.

In addition, when Internet-based surveys are part of a mixed-mode approach, it is important to be aware that the literature currently seems to show that respondents will tend to favour the traditional survey mode over an Internet-based mode. See, for example, the discussions in Schonlau et al. (2002) and Couper (2000: 486–487). Fricker and Schonlau (2002), in a study of the literature on web-based surveys, found ‘that for most of the studies respondents currently tend to choose mail when given a choice between web and mail. In fact, even when respondents are contacted electronically it is not axiomatic that they will prefer to respond electronically’.

The tendency to favour non-Internet-based survey modes led Schonlau et al. (2002: 75) to recommend for mixed-mode mail and web surveys that:

    … the most effective use of the Web at the moment seems to involve a sequential fielding scheme in which respondents are first encouraged to complete the survey via the Web and then nonrespondents are subsequently sent a paper survey in the mail. This approach has the advantage of maximizing the potential for cost savings from

Table 11.3 As reported in Dillman (2007), using an alternate survey mode as a follow-up to an initial survey mode can result in higher overall response rates

Initial survey mode      Follow-up survey mode       Response rate increase
(response rate)          (combined response rate)    (percentage points)
Mail (75%)               Telephone (83%)             8
Telephone (43%)          Mail (80%)                  37
IVR5 (28%)               Telephone (50%)             22
Web (13%)                Telephone (48%)             35
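
The arithmetic behind the last column of Table 11.3 can be sketched as follows. This is only an illustrative calculation on the four rows of the table; the function names and the additional "relative gain" figure are assumptions introduced here and do not appear in Dillman (2007).

```python
# Response rates (in percent) transcribed from Table 11.3.
# Each row: (initial mode, initial rate, follow-up mode, combined rate).
TABLE_11_3 = [
    ("Mail", 75, "Telephone", 83),
    ("Telephone", 43, "Mail", 80),
    ("IVR", 28, "Telephone", 50),
    ("Web", 13, "Telephone", 48),
]


def follow_up_gain(initial_pct, combined_pct):
    """Return (gain in percentage points, gain relative to the initial rate)."""
    points = combined_pct - initial_pct
    relative = points / initial_pct
    return points, relative


def summarize(rows=TABLE_11_3):
    """Compute the gain for every row of the table."""
    summary = []
    for initial_mode, initial_pct, follow_mode, combined_pct in rows:
        points, relative = follow_up_gain(initial_pct, combined_pct)
        summary.append((initial_mode, follow_mode, points, round(relative, 2)))
    return summary


if __name__ == "__main__":
    for initial_mode, follow_mode, points, relative in summarize():
        print(f"{initial_mode} then {follow_mode}: "
              f"+{points} points ({relative:.0%} relative to the initial rate)")
```

The percentage-point differences reproduce the table's last column. The relative figure is included only to show that the same follow-up looks much larger when the initial response rate is low, as it is for the web mode.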


    using Internet while maintaining the population coverage and response rates of a mail survey.

Web-based recruitment issues and effects

Whether e-mail addresses are constructed, assembled from third-party sources, or harvested directly from the web, there is the issue of unsolicited survey e-mail as spam. For example, Sheehan (1999) conducted a survey with e-mail addresses harvested from www.Four11.com and stated, ‘Several individuals receiving the solicitation e-mail censured the researchers for sending out unsolicited e-mails, and accused the researchers of “spamming”.’ They further recounted that ‘One [ISP] system operator [who observed a large number of e-mail messages originating from a single address] then contacted his counterpart at our university.’

In addition, distributing an unsolicited Internet-based survey is not without its perils. For example, Andrews et al. (2002) report on a study of ‘hard-to-involve Internet users’ – those who lurk in, but do not participate publicly in, online discussion forums. In their study, an invitation to participate in a web survey was posted as a message to 375 online community discussion boards. While they collected 1,188 valid responses (out of 77,582 discussion board members), they also ‘received unsolicited email offers, some of which were pornographic in content or aggressive in tone’ and they had their web server hacked twice, once with the infection of a virus.

In spite of the challenges and possible perils, it is possible to recruit survey participants from the web. For example, Alvarez et al. (2002) conducted two Internet-based recruitment efforts – one using banner advertisements on web pages and another using a subscription check box. In brief, their results were as follows.

• In the first recruitment effort, Alvarez et al. ran four ‘banner’ campaigns in 2000 with the intention of recruiting survey participants using web-page banner advertisements. In the first campaign, which is representative of the other three, an animated banner advertisement resulted in more than 3.5 million ‘impressions’ (the number of times the banner was displayed), which resulted in the banner being clicked 10,652 times, or a rate of 3 clicks per 1,000 displays. From these 10,652 clicks, 599 survey participants were recruited.
• In the second recruitment effort, the authors ran a ‘subscription’ campaign in 2001 in which they arranged with a commercial organization to have a check box added to subscription forms on various websites. Essentially, Internet users who were registering for some service were given an opportunity to check a box on the service’s subscription form indicating their willingness to participate in a survey. As part of this effort, the authors conducted two recruitment drives, each of which was intended to net 10,000 subscriptions. Across the two campaigns, 6,789 new survey participants were obtained from 21,378 subscribers.

The good news from the Alvarez et al. (2002) study is that, even though the banner approach yielded fewer new survey participants, both methods resulted in a significant number of potential survey respondents over a relatively short period of time: 3,431 new subjects over the course of six or seven weeks from the banner campaigns, and 6,789 new subjects over the course of three weeks from the subscription campaigns. Each banner subject cost about $7.29 to recruit, while the subscription subjects cost only $1.27 per subject. (Unfortunately, the authors did not present any data on survey completion rates, so we do not know whether there were differences between the two samples that might have favored one over the other.)

The bad news is that the two groups differed significantly in all of the demographic categories collected (gender, age, race, and education) and they differed in how they answered questions on exactly the same survey. In addition, both groups differed significantly from the demographics of the Internet population as measured by the August 2000 Current Population Survey. The problem, of course, is that there are clear effects associated with how subjects


are recruited, such that the resulting samples are different even from the general Internet population. Shillewaert et al. (1998) found similar recruitment method biases. Hence, while it is possible to ethically recruit survey participants from the web, it seems that the recruitment methodology affects the types of individual that self-select into the sample.

Improving response rates for Internet-based surveys

Response rates have a direct effect on sampling: the higher the response rate, the fewer people need to be sampled to achieve a desired number of survey completions. In addition, higher response rates are associated with lower nonresponse bias.

Unfortunately, in a summary of the academic survey-related literature up through 2001, Fricker and Schonlau (2002) concluded that ‘Web-only research surveys have currently only achieved fairly modest response rates, at least as documented in the literature.’ S. Fricker et al. (2005) similarly summarized the state of affairs as ‘Web surveys generally report fairly low response rates.’

A good illustration of this is the Couper et al. (1999) study in which employees of five US federal government statistical agencies were randomly given a mail or e-mail survey. Comparable procedures were used for both modes, yet higher response rates were obtained for mail (68–76 percent) than for e-mail (37–63 percent) across all of the agencies.

Incentives are a common and effective means for increasing response rates in traditional surveys. Göritz (2006) is an excellent review of the use of incentives in survey research that distinguishes their use in traditional surveys from Internet-based surveys and provides a nice discussion of the issues associated with using incentives in Internet-based surveys. Open issues include:

• how best to deliver an incentive electronically;
• whether it is better to provide the incentive prior to a respondent taking the survey or after;
• whether incentives have different effects for individuals taking a survey one time versus pre-recruited panel members who take a series of surveys.

Individual studies of Internet-based surveys have generally found incentives to have little or no effect. For example, Coomly (2000) found that incentives had little effect on response rates for pop-up surveys, and Kypri and Gallagher (2003) found no effect in a web-based survey. However, Göritz (2006) conducted a meta-analysis of 32 experiments evaluating the impact of incentives on survey ‘response’ (the fraction of those solicited to take the survey that actually called up the first page of the survey), and 26 experiments evaluating the effect of incentives on survey ‘retention’ (the fraction of those who viewed the first page that actually completed the survey). From the meta-analysis, Göritz concluded that ‘material incentives promote response and retention in Web surveys’ where ‘material incentives increase the odds of a person responding by 19% over the odds without incentives’ and ‘an incentive increased retention by 4.2% on average’.

In addition to incentives, Dillman (2007) and Dillman et al. (1999) have put forward a number of survey procedural recommendations to increase survey response rates, based on equivalent methods for traditional surveys, which we will not re-cover here since they are mainly related to survey design and fielding procedures. While we do note that the recommendations seem sensible, Couper (2000) cautions that ‘there is at present little experimental literature on what works and what does not’.

Bigger samples are not always better

With Internet-based surveys using a list-based sampling frame, rather than sending the survey out to a sample, researchers often simply send the survey out to the entire sampling frame. That is, researchers naively conducting (all electronic) Internet-based surveys – where the marginal costs for additional surveys can


be virtually nil – often fail to recognize the ‘trade-off between easy, low cost access to large numbers of patients [participants] and the representativeness in the population being studied’ (Soetikno et al., 1997). As we previously discussed, for both probability and non-probability-based samples, larger sample sizes do not necessarily mean the sample is more representative of any greater population: a sample can be biased whether it is large or small.

One might argue that in these situations the researchers are attempting to conduct a census, but in practice they are forgoing a probability sample in favour of a convenience sample by allowing members of the sampling frame to opt into the survey. Dillman et al. (1999) summarized this practice as follows: ‘…the ease of collecting hundreds, thousands, or even tens of thousands of responses to web questionnaires at virtually no cost, except for constructing and posting, appears to be encouraging a singular emphasis on the reduction of sampling error’. By this Dillman et al. mean that researchers who focus only on reducing sampling error by trying to collect as large a sample as possible miss the point that it is equally important to reduce coverage, measurement, and nonresponse error in order to be able to accurately generalize from the sample data.

A myopic focus on large sample sizes – and the idea that large samples equate to sample representativeness which equates to generalizability – occurs with convenience sample-based web and e-mail surveys as well. ‘Survey2000’ is an excellent example of this type of focus. A large-scale, unrestricted, self-selected survey, conducted as a collaborative effort between the National Geographic Society (NGS) and some academic researchers, Survey2000 was fielded in 1998. The survey was posted on the National Geographic Society’s website and participants were solicited both with a link on the NGS homepage and via advertisements in NGS periodicals, other magazines, and newspapers.

Upon completion of the effort, Witte et al. (2000) report that more than 80,000 surveys were initiated and slightly more than 50,000 were completed. While this is an impressively large number of survey completions, the unrestricted, self-selected sampling strategy clearly results in a convenience sample that is not generalizable to any larger population. Yet, Witte et al. (2000) go to extraordinary lengths to rationalize that their results are somehow generalizable, while simultaneously demonstrating that the results of the survey generally do not correspond to known population quantities.

Misrepresenting convenience samples

A related and significant concern with non-probability-based sampling methods, for both Internet-based and traditional surveys, is that survey accuracy is characterized only in terms of sampling error and without regard to the potential biases that may be present in the results. While this has always been a concern with all types of survey, the ease and spread of Internet-based surveys seem to have exacerbated the practice. For example, the results of an ‘E-Poll’ were explained as follows:

    THE OTHER HALF / E-Poll® Survey of 1,007 respondents was conducted January 16–20, 2003. A representative group of adults 18+ were randomly selected from the E-Poll online panel. At a 95% confidence level, a sample error of +/− 3% is assumed for statistics based on the total sample of 1,007 respondents. Statistics based on sub-samples of the respondents are more sensitive to sampling error. (From a press release posted on the E-Poll website.)

No mention was made in the press release that the ‘E-Poll online panel’ consists of individuals who had chosen to participate in online polls, nor that they were unlikely to be representative of the general population. Rather, it leaves readers with an incorrect impression that the results apply to the general population when, in fact, the margin of error for this particular survey is valid only for adult members of that particular E-Poll online panel.
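
To see where a figure such as ‘+/− 3%’ typically comes from, the following is a minimal sketch of the usual normal-approximation margin of error for a proportion under simple random sampling. The press release does not state its formula, so the worst-case proportion p = 0.5 and the critical value z = 1.96 used here are conventional assumptions, and the function name is purely illustrative.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a normal-approximation confidence interval for a proportion.

    Assumes simple random sampling from the population of inference.
    p = 0.5 is the worst case; z = 1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    # E-Poll sample size quoted in the press release.
    print(f"n = 1,007:  +/- {margin_of_error(1007):.1%}")
    # Roughly the number of completed Survey2000 questionnaires.
    print(f"n = 50,000: +/- {margin_of_error(50000):.1%}")
```

The calculation depends only on the sample size. It quantifies sampling error alone and is silent about coverage, selection, and nonresponse bias, which is why a very large self-selected sample can report a tiny margin of error and still fail to generalize.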
