OCCUPANCY MODELING AND RESAMPLING OVERCOMES LOW TEST SENSITIVITY TO PRODUCE ACCURATE SARS-COV-2 PREVALENCE ESTIMATES

Page created by Harold Mcgee

Health & Fitness

English

Like
Share
Embed
Fullscreen
Slides
Download HTML
Download PDF
Abuse

←

→

Page content transcription

If your browser does not render page correctly, please read the page content below

OCCUPANCY MODELING AND RESAMPLING OVERCOMES LOW TEST SENSITIVITY TO PRODUCE ACCURATE SARS-COV-2 PREVALENCE ESTIMATES

Sanderlin et al. BMC Public Health   (2021) 21:577
https://doi.org/10.1186/s12889-021-10609-y

 RESEARCH ARTICLE                                                                                                                                   Open Access

Occupancy modeling and resampling
overcomes low test sensitivity to produce
accurate SARS-CoV-2 prevalence estimates
Jamie S. Sanderlin1*, Jessie D. Golding2, Taylor Wilcox2, Daniel H. Mason2, Kevin S. McKelvey2,
Dean E. Pearson3,4 and Michael K. Schwartz2

  Abstract
  Background: We evaluated whether occupancy modeling, an approach developed for detecting rare wildlife
  species, could overcome inherent accuracy limitations associated with rapid disease tests to generate fast, accurate,
  and affordable SARS-CoV-2 prevalence estimates. Occupancy modeling uses repeated sampling to estimate
  probability of false negative results, like those linked to rapid tests, for generating unbiased prevalence estimates.
  Methods: We developed a simulation study to estimate SARS-CoV-2 prevalence using rapid, low-sensitivity, low-
  cost tests and slower, high-sensitivity, higher cost tests across a range of disease prevalence and sampling
  strategies.
  Results: Occupancy modeling overcame the low sensitivity of rapid tests to generate prevalence estimates
  comparable to more accurate, slower tests. Moreover, minimal repeated sampling was required to offset low test
  sensitivity at low disease prevalence (0.1%), when rapid testing is most critical for informing disease management.
  Conclusions: Occupancy modeling enables the use of rapid tests to provide accurate, affordable, real-time
  estimates of the prevalence of emerging infectious diseases like SARS-CoV-2.
  Keywords: Occupancy modeling, Optimal sampling, Repeated sampling, Sampling strategies

Background                                                                             in place” rules [1, 2]. Effective community-level monitor-
The emergence of severe acute respiratory syndrome                                     ing, particularly at low disease prevalence, is essential to
coronavirus 2 (SARS-CoV-2) as a worldwide pandemic                                     inform disease management to both suppress the initial
has demonstrated both the logistical challenge of effect-                              invasion and to maintain low prevalence thereafter. To
ively monitoring an emerging infectious disease and the                                date, attention has focused on inadequate testing access,
enormous health and socio-economic costs associated                                    test supply chain disruptions, and testing delays [3–5],
with failing to meet this challenge. The ability to accur-                             but, as these problems are resolved, the clear challenge
ately track an emerging infectious disease within a popu-                              remaining lies in test efficacy and optimal sampling
lation is critical for balancing direct health impacts of                              strategies needed to provide accurate community-level
the disease against socio-economic impacts of commu-                                   metrics of disease dynamics.
nity mitigation strategies for the disease through “shelter                              Effectively managing an emerging infectious disease
                                                                                       requires logistically feasible, fast, and accurate
                                                                                       community-level monitoring to inform real-time deci-
* Correspondence: jamie.l.sanderlin@usda.gov
1
 USDA Forest Service, Rocky Mountain Research Station, 2500 S. Pine Knoll              sions about community mitigation actions [2, 6]. A pri-
Dr., Flagstaff, AZ, USA                                                                mary hurdle to achieving accurate community-level
Full list of author information is available at the end of the article

                                        © The Author(s). 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License,
                                        which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give
                                        appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if
                                        changes were made. The images or other third party material in this article are included in the article's Creative Commons
                                        licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons
                                        licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain
                                        permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
                                        The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the
                                        data made available in this article, unless otherwise stated in a credit line to the data.

Sanderlin et al. BMC Public Health   (2021) 21:577                                                               Page 2 of 10

monitoring is the tradeoff between speed and accuracy            for deploying rapid tests to optimize accuracy while con-
inherent in disease tests; rapid tests (e.g., antigen tests      sidering logistical and cost constraints?” We conducted a
developed by Abbott [7]) are more error prone and                series of simulations to answer these questions by
therefore have a lower sensitivity for disease detection         deploying two types of tests for SARS-CoV-2: a rapid,
[8–10]. Historically, and throughout the current pan-            low sensitivity, cheaper test and a slower, high sensitiv-
demic, disease testing has focused on individual testing,        ity, yet more expensive test. We examined a range of
emphasizing accurate diagnosis of symptomatic individ-           SARS-CoV-2 prevalence values. In addition, because we
uals conducted with slower, higher sensitivity tests [11].       were interested in effective sampling strategies, we ex-
For SARS-CoV-2 and other emerging diseases, initial              amined how the proportion of individuals initially sam-
sampling tends to target the most severe cases, whereby          pled, number of repeat tests, and proportion of
moderate and mild cases are not sampled, leading to an           individuals with repeat tests affected prevalence
underestimation of true community prevalence [4].                estimates.
   Novel modeling approaches developed for sampling
rare animals in wildlife sciences hold great potential to        Methods
address the inherent limitations of rapid tests [12]. Be-        In its origins, occupancy modeling relies on repeated
cause many wildlife species of concern are rare and/or           sampling for the presence of an organism at a site [16,
sparsely distributed, this discipline has spent decades de-      17]. Here, the organism we wish to detect is SARS-CoV-
veloping tools to improve population estimation with             2, and the place, or site, is the individual where the dis-
imperfect detection [13–15]. Imperfect detection trans-          ease might be found. A site—in this case a person—is
lates to false negative and/or false positive errors as it re-   “occupied” if the organism (the virus) is present, and
lates to disease prevalence. One such class of models is         “unoccupied” otherwise; occupied and unoccupied are
called occupancy models [16, 17]. Occupancy models               analogous to infected and uninfected. The occupancy
address imperfect detection by collecting repeated sam-          modeling goal as applied here is to determine the pro-
ples (observations) of a species to statistically model          portion of sites (people) within the population (county)
sampling error rates and use this information to improve         that are occupied (infected) given imperfect detection
estimates of species’ presence/absence. Importantly, oc-         from false negatives (test sensitivity). Thus, occupancy
cupancy models can account for imperfect detection               modeling provides an appropriate statistical design for
arising from biological processes and the effort to ob-          estimating disease prevalence.
serve these processes. For example, occupancy modeling
has been used in India to overcome issues of sampling            Model
for tigers (Panthera tigris) linked to their rarity, camou-      We used a Bayesian hierarchical occupancy model [21],
flage, stealth, and nocturnal behaviors [18]. Occupancy          which includes biological and observation processes.
modeling has also been used to generate unbiased esti-           The following describes the variables used to model each
mates of pathogen occurrence and prevalence in wildlife          process.
[19]. Importantly, due to more resource limitations in
wildlife sciences compared to human health fields, wild-         Biological process
life researchers have developed tools for optimizing sam-        Following a single-season occupancy model [16, 17],
pling designs [20] that can be adapted to generate               where I represents infected and U represents uninfected,
efficacious and accurate sampling designs for estimating         we modeled true SARS-CoV-2 presence in an individual
human disease prevalence.                                        (j) as a latent (not directly observed) Bernoulli variable
   Here we test whether occupancy modeling frameworks            (zj) with probability of success as the average infection
that account for imperfect detection from false negatives        probability within individuals in the community (ψI):
(test sensitivity) due to biological and observation pro-
cesses could be used to address low test sensitivity in              z j  Bernoulli ðΨI Þ:                              ð1Þ
SARS-CoV-2 rapid tests. We focused on U.S. county-
level estimates, although this approach may be applied           Number of infected individuals (NI) and uninfected indi-
at larger geographic or population scales. We specifically       viduals (NU) can be estimated as derived parameters,
ask two questions: 1) “Can an occupancy modeling                 and NPOP (number of individuals in the population) is
framework applied to rapid, low-sensitivity tests provide        known:
similar accuracy (e.g., less bias and greater precision) es-
timates of SARS-CoV-2 prevalence as higher sensitivity,                    XN POP
more accurate tests?”; 2) “Given a fixed number of tests            NI ¼       j¼1
                                                                                      z j;                               ð2Þ
(due to laboratory capacity, fixed budget, or combination
of these constraints), what is the best sampling strategy        and NU = NPOP − NI.

Sanderlin et al. BMC Public Health   (2021) 21:577                                                                Page 3 of 10

Observational process                                           individuals initially sampled to represent low (0.001),
We modeled the observation process as a simple ran-             medium (0.01), and high (0.05) proportions within the
dom sample (independent of symptoms) with imperfect             county. Because occupancy modeling relies on repeated
detection from false negatives (low test sensitivity) as a      testing, we modeled sampling with a single repeat test (2
zero-inflated binomial process. In occupancy modeling,          tests total) or 4 repeat tests (5 tests per individual total).
there is a formal requirement for closure at the site level,    We also varied proportion of individuals that were re-
i.e., with disease prevalence, each sampled person in the       peatedly sampled to represent 10, 50 and 100% of the
population must remain either infected or uninfected            sampled individuals repeat tested to account for the po-
throughout the sampling period. Test sensitivity is esti-       tential that some individuals initially sampled are unwill-
mated through repeated independent samples at a site.           ing to be resampled in a single visit.
Individuals sampled (i.e., sites) would therefore have             The combination of all parameter values described
multiple tests (Nsample,j) taken at a single ‘visit’ during     above resulted in 108 unique simulation scenarios (Add-
the sampling period. This is difficult to achieve with Re-      itional file 1), which were created with program R [22].
verse Transcription quantitative PCR (RT-qPCR)-based            To create a simulation scenario, we simulated a popula-
tests but is amenable to rapid and inexpensive tests such       tion (25,000) with occupied and unoccupied individuals
as antigen tests developed by Abbott [7] or other similar       (Eq. 1) with one of the three prevalence values. We then
tests. We modeled infected individual detection (yj) as a       simulated sampling that population with imperfect de-
binomial random variable, with number of binomial tri-          tection from false negatives (test sensitivity; Eq. 3) and
als represented as number of repeat tests per sampled           varied observation process parameters described above
individual (Nsample,j) and success of those trials as an un-    (proportion of individuals initially sampled, number of
known probability (muj), dependent on true SARS-CoV-            repeat tests, and proportion of individuals that were re-
2 presence in an individual (zj) and test sensitivity (ptest)   peat sampled). Next, we used observed data for a single
(muj = zj × ptest):                                             scenario as an input in the Bayesian hierarchical model
                                                              (Eq. 1–3) Markov chain Monte Carlo (MCMC) process,
     y j  Binomial N sample; j ; mu j :                 ð3Þ    ran the model in JAGS [23] using the rjags, jagsUI [24],
                                                                and coda [25] packages, to obtain posterior estimates of
In this case, infected individual detections (yj) and num-      SARS-CoV-2 prevalence (ψI) and test sensitivity (ptest).
ber of repeat tests per individual (Nsample,j) represent        That was for one replicate of the simulation-estimation
known real world data from a simple random sample of            process for one scenario. We used 100 replicates of the
a hypothetical county with SARS-CoV-2.                          simulation-estimation process for each simulation sce-
                                                                nario. For each of the 100 simulation-estimation repli-
Simulations                                                     cates, we used independent, non-informative priors for
To evaluate multiple biological and observational process       ψI and ptest and ran three parallel chains (length = 10,000
scenarios, we conducted a series of simulations that varied     iterations, burn-in = 1000 iterations, no thinning) to esti-
values within both processes. We used information from          mate the posterior distribution median of model param-
literature (if available) to support selection of low,          eters and 95% Bayesian credible intervals (BCI) for each
medium, and high levels for each parameter (Add-                replicate. We assessed model convergence by using R      ^ <
itional file 1). We used the median U.S. county population      1.1 [26]. We then compared true values we used to gen-
size of 25,000 (Census Bureau 2019) across all simulations      erate the biological and sampling processes to estimated
to represent population size (Npop) to reflect realistic        prevalence (ψI) and test sensitivity (ptest) for each repli-
SARS-CoV-2 U.S. sampling scenarios. To account for              cate of the simulation-estimation process over all simu-
SARS-CoV-2 prevalence affecting test sensitivity, we ex-        lation scenarios.
amined simulations with three values of very low (0.001),          For all simulations, we assumed the population was
low (0.01), and moderate (0.1) prevalence (ψI). Within the      closed to movement during the sampling time frame
observation process, we modeled deployment of two               and each individual was available for sampling in the
SARS-CoV-2 test types by modeling two test sensitivity          county. We also assumed disease state (i.e., occupied or
(ptest) values based on known test sensitivities (Additional    unoccupied) did not change during the sample period.
file 1): low (0.30) and high (0.78).                            These conditions equate to a short sampling time win-
   In addition, because we were interested in optimal           dow (point prevalence).
sampling strategies (question 2), we modified total num-
ber of tests per individual (Nsample,j) by varying propor-
tion of individuals initially sampled, number of repeat         Evaluating simulations for occupancy modeling (question 1)
tests per individual, and proportion of individuals with        To evaluate if an occupancy modeling framework with
repeat tests. We used three values for proportion of            rapid tests could provide accurate SARS-CoV-2

Sanderlin et al. BMC Public Health               (2021) 21:577                                                           Page 4 of 10

prevalence estimates, we examined relative root mean                     function was to minimize prevalence RRMSE subject to
square error (RRMSE) for prevalence (ψI) and test sensi-                 constraints:
tivity (ptest). RRMSE, or accuracy, is the combination of
bias and precision defined as:                                              C ¼ C 0 þ C s  N sample ≤ d;
                  rﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ
                              P                       2                   RRMSE ≤ e:
                   ð1=r Þ n θ^i − θii¼1
    RRMSE ¼                                                    ;   ð4Þ     We constrained total cost below d to evaluate a range
                                     θ
                                                                         of sampling strategies given a fixed number of tests
where r is number of replicates, θ^i is the estimated par-               available (due to laboratory capacity, fixed budget, or
ameter (posterior median) at replicate i, θi is the true                 combination of these constraints) and RRMSE below e
parameter value at replicate i, and θ is the mean of the                 to represent a desired amount of prevalence accuracy.
true parameter values over all replicates. We also calcu-                We demonstrated the optimization process graphically:
lated relative bias (RBIAS), percent coverage, and Bayes-                optimal sampling strategy was determined by examining
ian Credible Interval (BCI) length (Additional file 2,                   where accuracy (RRMSE) asymptotes given costs (i.e.,
Additional file 3).                                                      there is marginal gain for additional sampling) before
                                                                         cost constraints.
Evaluating simulations for optimal sampling strategies
(question 2)                                                             Results
To evaluate optimal sampling strategies given fixed re-                  Overall, we found that occupancy modeling in conjunc-
sources (i.e., number of tests available, fixed budget) at               tion with resampling strategies can overcome low test
the county level, we used a constrained optimization                     sensitivity associated with rapid tests to provide accurate
framework [27] consisting of three components: (i) deci-                 SARS-CoV-2 prevalence estimates comparable to those
sion variables (proportion of individuals initially sam-                 of more accurate but slower tests. In addition, we identi-
pled, number of repeat tests per individual, proportion                  fied optimal sampling strategies using cost constraints
of individuals with repeat tests), (ii) objective function               across all prevalence levels. The specific results of each
(minimize SARS-CoV-2 prevalence RRMSE), and (iii)                        are     discussed     below     for      simulation    data
constraints (total number of samples represented as a                    (Additional file 4).
cost). We illustrate this framework using a cost con-
straint, but this framework can also include a time con-                 Evaluating simulations for occupancy modeling (question
straint, as quicker results could influence individual                   1)
behavior and contribute to slowing disease spread [6].                   Accounting for biological and observation processes
The cost function was:                                                   Relative bias and accuracy (RRMSE) of SARS-CoV-2
                                                                         prevalence were influenced by both true prevalence (bio-
    C ¼ C s  N sample ;                                           ð5Þ   logical process) and sampling strategy (observation
                                                                         process). Estimates were influenced most by prevalence
where C was total cost; Cs was per sample cost for col-                  magnitude, followed by sampling strategy (Figs. 1-3,
lecting sample, sample storage, sample preparation, and                  Additional file 2). Not surprisingly, overall, SARS-CoV-2
test materials; and Nsample was total number of samples                  prevalence accuracy was lower and more variable among
(sum of samples from initially sampled individuals                       sampling strategies when the disease was extremely rare
within the county and from all repeated tests for a sub-                 (true prevalence 0.001). Accuracy improved as preva-
set of individuals). For ptest of 0.3 associated with a rapid            lence increased and/or a greater proportion of the popu-
test, Cs was $5 [28]. For ptest of 0.78 associated with a                lation was sampled. With true prevalence of 0.01 and
RT-qPCR test, Cs was $100. We expect laboratory costs                    0.1, accuracy increases (RRMSE decreases) were smaller
to vary by county and laboratory technician and collec-                  with increases in the proportion of a population initially
tion staff salary, and thus present a general cost function              sampled.
to illustrate a framework to evaluate accuracy and asso-
ciated costs with decision variables for different test                  Improvement with repeated testing
types. We also recognize that start-up costs for labora-                 More repeat tests greatly improved SARS-CoV-2 preva-
tories may be substantial, thus we assume counties will                  lence accuracy (RRMSE) estimates, especially with fewer
utilize laboratories that already have necessary equip-                  total tests (Fig. 2). For example, when a county had 1%
ment and technical expertise.                                            prevalence and 250 tests to allocate, if 50 people get 5
  Given our RRMSE prevalence simulation values for                       repeat tests the RRMSE was reduced by 2.2% compared
each combination of decision variables, our objective                    to a scenario with 250 people that got a single test.

Sanderlin et al. BMC Public Health      (2021) 21:577                                                                                     Page 5 of 10

 Fig. 1 Prevalence accuracy as a function of percent of the population infected. Accuracy (relative root mean square error) of prevalence (Ψ) as a
 function of true prevalence, or percent of the population infected, from simulation scenarios of 100% of the initial sample with 5 repeat tests. The
 inset a) is for simulation scenarios with 10% of the population infected (true prevalence)

 Fig. 2 Prevalence accuracy as a function of total tests. Accuracy (relative root mean square error) of prevalence (Ψ) as a function of total tests
 from simulation scenarios of 1% infection rate and test sensitivity of 0.3. Note that the percent of the population initially sampled is not the same
 with one test (no repeat) versus 5 tests (4 repeats)

Sanderlin et al. BMC Public Health (2021) 21:577 Page 6 of 10

Fig. 3 Prevalence accuracy as a function of true test sensitivity and costs. Accuracy (relative root mean square error) of prevalence (Ψ) as a
function of a) true test sensitivity from simulation scenarios of 5% of the population initially sampled and 100% of that sample with 5 repeat
tests, and b) costs associated with the two test types

Occupancy modeling overcomes lower test sensitivity of Evaluating simulations for optimal sampling strategies
rapid tests (question 2)
Occupancy modeling overcame the lower test sensitivity of Across all disease prevalence levels, the optimal (lowest
rapid tests compared with high-accuracy tests, i.e., SARS- RRMSE relative to cost constraints) sampling strategy
CoV-2 prevalence accuracy was similar for low-sensitivity for a fixed proportion of the initially sampled population
rapid and higher-sensitivity, slower tests for all SARS-CoV- with repeat tests was: i) with 1% of population initially
2 prevalence levels and sampling strategies when occupancy sampled, and ii) sampling occurring five times per indi-
models were applied (Fig. 3, Additional file 2). For example, vidual with repeat tests (1 initial test and 4 repeat tests)
with 5% of the population initially sampled and 100% of (Fig. 4, Supplementary Figs. 10 and 11 in Additional file
that sample with 5 repeat tests, we found prevalence 2). Our simulation scenarios considered resampling a
RRMSE was similar for prevalence of 0.1% (RRMSE = subset of the initial group sampled, in addition to 100%
100.29 for ptest = 0.3, RRMSE = 100.30 for ptest = 0.78; Fig. of the initial group sampled. Similar accuracy can be
3a). However, rapid tests (ptest = 0.3) provide similar accur- achieved with only a subset of the initial group sampled,
acy at reduced costs with occupancy models (Fig. 3b). for reduced overall costs (Fig. 4, Additional file 2).

Sanderlin et al. BMC Public Health      (2021) 21:577                                                                                       Page 7 of 10

 Fig. 4 Optimal sampling designs with prevalence accuracy as a function of costs. Accuracy (relative root mean square error) of prevalence (Ψ) as
 a function of costs (USD) using arbitrary cost constraint (dotted vertical line) of $7500 USD for a) a subset of the simulation scenarios with the
 true test sensitivity of 0.3 and 50% of the initially sampled population with repeat tests; b) a subset of the simulation scenarios with the true test
 sensitivity of 0.3, true population infection rate (true prevalence) of 1, and 100% of the initially sampled population with repeat tests. Optimal
 designs are indicated within the figures with arrows

Discussion                                                                     time data for informed disease management. We dem-
Mitigating the impacts of emerging infectious disease                          onstrate how occupancy modeling can overcome low
like SARS-CoV-2 requires rapid testing to generate real-                       test sensitivity with rapid COVID-19 surveillance

Sanderlin et al. BMC Public Health (2021) 21:577 Page 8 of 10

schemes to generate accurate (high-precision, low-bias) monitoring data that is required to inform disease man-
SARS-CoV-2 prevalence estimates. Moreover, the ability agement actions and minimize human health impacts [2,
of this approach to offset test sensitivity with rapid tests 6]. Resolving this sampling challenge is essential, espe-
at low disease prevalence is crucial because decisive dis- cially as winter arrives in the northern hemisphere where
ease management actions are most critical at low preva- onset of additional respiratory diseases with similar
lence levels such as during disease onset and resurgence symptomology (e.g., rhinoviruses, seasonal corona-
following control. Rapid tests are also inexpensive and viruses, influenza) will confound SARS-CoV-2 detection.
logistically easy to administer, enabling the additional We demonstrate how occupancy modeling can help to
sampling effort requisite for resampling designs [10] overcome low test sensitivity to produce accurate disease
with quick turn-around times. While our modeling ef- prevalence estimates for real-time, informed decision
forts targeted the challenge associated with low sensitiv- making, even at low disease prevalence levels when de-
ity tests, occupancy modeling holds potential to address cisive action is most meaningful. We also show the opti-
other rapid testing limitations for improved disease mal sampling strategy in combination with occupancy
management. modeling will be equally effective for community-level
To further advance rapid testing designs for disease inference at different points in the course of an epi-
monitoring, other shortfalls will also need to be ad- demic. Finally, we demonstrate that additional testing
dressed. In addition to low test sensitivity or false nega- beyond the optimal sampling strategy in combination
tive results, rapid tests also produce false positive results, with occupancy modeling will not substantively improve
or low specificity [29]. False positive results can impact prevalence estimates, allowing funds to be directed to
patient risk and costs with unnecessary sequestration or the most pertinent disease mitigation measures.
even worse – uninfected individuals being assigned to
COVID-19 hospital wards where they may become in- Abbreviations
fected [29]. We did not account for false positives. How- BCI: Bayesian Credible Interval; MCMC: Markov Chain Monte Carlo;
RBIAS: relative bias; RRMSE: relative root mean square error; RT-qPCR: Reverse
ever, similar methods exist to account for false positive Transcription quantitative Polymerase Chain Reaction; SARS-CoV-2: severe
detections [30, 31]. Moreover, more sophisticated ap- acute respiratory syndrome coronavirus 2; SIR: susceptible infected recovered
proaches can be applied to refine estimates under com-
plex, real-world conditions that account for symptoms at
the time of testing and incorporate more underlying bio- Supplementary Information
The online version contains supplementary material available at https://doi.
logical processes, including an instantaneous model org/10.1186/s12889-021-10609-y.
using stratified sampling of states (symptomatic vs.
asymptomatic) with no transitions among states using Additional file 1. Simulation study parameter combinations. Table of
single-season, multi-state occupancy models with state simulation study parameter combinations for evaluating the effectiveness
of different sampling and biological parameters.
uncertainty [32, 33], and Hidden Markov models with
Additional file 2. Simulation summary. Document describing the
SIR (susceptible infected recovered) models using state simulation summary for relative root mean square error, relative bias,
transitions both in discrete and continuous time [34]. percent coverage, and Bayesian Credible Interval length.
An alternative to stratified sampling could include col- Additional file 3. Simulation code. R code of simulations (GitHub:
lection of site-level covariates to improve detection at https://github.com/jamiesanderlin/Sanderlin-et-al-occupancy-modeling-
and-SARS-CoV-2-prevalence).
sample collection time (i.e., symptomatic status, symp-
Additional file 4. Simulation data. Simulation data summarized by
tom start date) an approach common in occupancy simulation parameter combination. These data were used for all plots
models in the wildlife literature [18, 19]. We stress that presented within the paper.
the approach we present is designed to assess disease Additional file 5. Individual-level inference from occupancy model re-
prevalence across a population. It is not intended for de- sults. Document describing individual-level inference from occupancy
model results.
termining infection status of individuals (although see
Additional file 5). Nor is it appropriate for circumstances
where institutions seek to create an infection-free group. Acknowledgments
Such objectives require high test sensitivity at the indi- Joshua Christensen, M.D. and two anonymous reviewers provided invaluable
comments on previous manuscript drafts.
vidual level. Nonetheless, information from such
individual-oriented testing could be incorporated into
prevalence estimates in models like the one we Authors’ contributions
All authors conceived of the presented idea. JSS, JDG, and TW developed
introduced. the models and simulation code. DHM and TW performed literature review
to inform simulation parameter values. JSS, JDG, DHM, MKS, and TW ran the
Conclusions simulations. JSS analyzed the simulation results. JDG and JSS created the
figures. JSS, DEP, KSM, MKS, and JDG were key contributors to writing, and
For emerging infectious diseases like COVID-19, rapid TW and DHM provided editorial comments. All authors discussed the results,
testing is essential for generating the real-time disease read, and approved the final manuscript.

Sanderlin et al. BMC Public Health             (2021) 21:577                                                                                               Page 9 of 10

Authors’ information                                                                   6.    Larremore DB, Wilder B, Lester E, Shehata S, Burke JM, Hay JA, et al. Test
JSS is a Quantitative Vertebrate Ecologist with USDA Forest Service Rocky                    sensitivity is secondary to frequency and turnaround time for COVID-19
Mountain Research Station, Flagstaff, Arizona, USA, with research interests                  screening. Sci Adv. 2021;7(1):1–11. https://doi.org/10.1126/sciadv.abd5393.
including cost-effective sampling designs for monitoring, Bayesian hierarch-           7.    Abbott Press Releases [Internet]. PRNewswire. [cited 2020 Dec 5]. [posted
ical models, and wildlife population and community dynamics. JDG is the                      2020 Aug 16]. Available from: https://abbott.mediaroom.com/2020-08-26-A
Multispecies Mesocarnivore Monitoring Program Leader with USDA Forest                        bbotts-Fast-5-15-Minute-Easy-to-Use-COVID-19-Antigen-Test-Receives-FDA-
Service, National Genomics Center for Wildlife and Fish Conservation, Rocky                  Emergency-Use-Authorization-Mobile-App-Displays-Test-Results-to-Help-Our-
Mountain Research Station, Missoula, Montana, USA. TW is a Research Geneti-                  Return-to-Daily-Life-Ramping-Production-to-50-Million-Tests-a-Month
cist with USDA Forest Service, National Genomics Center for Wildlife and Fish          8.    Dao Thi VL, Herbst K, Boerner K, Meurer M, Kremer LPM, Kirrmaier D, et al. A
Conservation, Rocky Mountain Research Station, Missoula, Montana, USA.                       colorimetric RT-LAMP assay and LAMP-sequencing for detecting SARS-CoV-
DHM is an eDNA technician with USDA Forest Service, National Genomics                        2 RNA in clinical samples. Sci Transl Med. 2020;12(eabc7075).
Center for Wildlife and Fish Conservation, Rocky Mountain Research Station,            9.    Meyerson NR, Yang Q, Clark SK, Paige CL, Fattor WT, Gilchrist AR, et al. A
Missoula, Montana, USA. KSM is a Research Ecologist with USDA Forest Ser-                    community-deployable SARS-CoV-2 screening test using raw saliva with 45
vice, National Genomics Center for Wildlife and Fish Conservation, Rocky                     minutes sample-to-results turnaround. medRxiv. 2020;20150250. Available
Mountain Research Station, Missoula, Montana, USA. DEP is a Research Ecolo-                  from: https://doi.org/10.1101/2020.07.16.20150250.
gist with USDA Forest Service, Rocky Mountain Research Station, Missoula,              10.   Ramdas K, Darzi A, Jain S. “Test, re-test, re-test”: using inaccurate tests to
Montana, USA and Division of Biological Sciences, University of Montana,                     greatly increase the accuracy of COVID-19 testing. Nat Med. 2020;26(6):810–
Missoula, MT, USA. MKS is Director of USDA Forest Service, National Genom-                   1. https://doi.org/10.1038/s41591-020-0891-7.
ics Center for Wildlife and Fish Conservation, Rocky Mountain Research Sta-            11.   Vogels CBF, Brito AF, Wyllie AL, Fauver JR, Ott IM, Kalinich CC, Petrone ME,
tion, Missoula, Montana, USA and Program Manager for Wildlife and                            Casanovas-Massana A, Catherine Muenker M, Moore AJ, Klein J, Lu P, Lu-
Terrestrial Ecosystems, Rocky Mountain Research Station, USDA Forest                         Culligan A, Jiang X, Kim DJ, Kudo E, Mao T, Moriyama M, Oh JE, Park A, Silva
Service.                                                                                     J, Song E, Takahashi T, Taura M, Tokuyama M, Venkataraman A, Weizman
                                                                                             OE, Wong P, Yang Y, Cheemarla NR, White EB, Lapidus S, Earnest R, Geng B,
Funding                                                                                      Vijayakumar P, Odio C, Fournier J, Bermejo S, Farhadian S, dela Cruz CS,
This research was supported in part by the U.S. Department of Agriculture,                   Iwasaki A, Ko AI, Landry ML, Foxman EF, Grubaugh ND. Analytical sensitivity
Forest Service.                                                                              and efficiency comparisons of SARS-CoV-2 RT–qPCR primer–probe sets. Nat
                                                                                             Microbiol. 2020;5(10):1299–305. https://doi.org/10.1038/s41564-020-0761-6.
Availability of data and materials                                                     12.   McClintock BT, Nichols JD, Bailey LL, Mackenzie DI, Kendall WL, Franklin AB.
All data generated or analyzed during this study are included in this                        Seeking a second opinion: uncertainty in disease ecology. Ecol Lett. 2010;
published article [and its supplementary information files].                                 13(6):659–74. https://doi.org/10.1111/j.1461-0248.2010.01472.x.
                                                                                       13.   Williams BK, Nichols JD, Conroy MJ. Analysis and management of animal
Declarations                                                                                 populations - modeling, estimation, and decision making. San Diego,
                                                                                             California: Academic Press; 2002. 817 p.
Ethics approval and consent to participate                                             14.   Jolly GM. Estimates from capture-recapture data with both death and
Not applicable.                                                                              immigration-stochastic model. Biometrika. 1965;52(1):225–47. https://doi.
                                                                                             org/10.1093/biomet/52.1-2.225.
Consent for publication                                                                15.   Seber GAF. A note on the multiple-recapture census. Biometrika. 1965;52(1):
Not applicable.                                                                              249–59. https://doi.org/10.1093/biomet/52.1-2.249.
                                                                                       16.   Mackenzie DI, Nichols JD, Royle JA, Pollock KH, Bailey LL, Hines JE. Occupancy
Competing interests                                                                          estimation and modeling. New York, New York: Academic Press; 2006.
The authors declare that they have no competing interests.                             17.   MacKenzie DI, Nichols JD, Lachman GB, Droege S, Royle AA, Langtimm CA.
                                                                                             Estimating site occupancy rates when detection probabilities are less than
Author details                                                                               one. Ecology. 2002;83(8):2248–55. https://doi.org/10.1890/0012-9658(2002
1
 USDA Forest Service, Rocky Mountain Research Station, 2500 S. Pine Knoll                    )083[2248:ESORWD]2.0.CO;2.
Dr., Flagstaff, AZ, USA. 2USDA Forest Service, National Genomics Center for            18.   Karanth KU, Gopalaswamy AM, Kumar NS, Vaidyanathan S, Nichols JD,
Wildlife and Fish Conservation, Rocky Mountain Research Station, Missoula,                   Mackenzie DI. Monitoring carnivore populations at the landscape scale:
MT, USA. 3USDA Forest Service, Rocky Mountain Research Station, Missoula,                    occupancy modelling of tigers from sign surveys. J Appl Ecol. 2011;48(4):
MT, USA. 4Division of Biological Sciences, University of Montana, Missoula,                  1048–56. https://doi.org/10.1111/j.1365-2664.2011.02002.x.
MT, USA.                                                                               19.   Mosher BA, Brand AB, Wiewel ANM, Miller DAW, Gray MJ, Miller DL, Grant
                                                                                             EHC. Estimating occurrence, prevalence, and detection of amphibian
Received: 6 December 2020 Accepted: 8 March 2021                                             pathogens: insights from occupancy models. J Wildl Dis. 2019;55(3):563–75.
                                                                                             https://doi.org/10.7589/2018-02-042.
                                                                                       20.   Sanderlin JS, Block WM, Ganey JL. Optimizing study design for multi-species
References                                                                                   avian monitoring programmes. J Appl Ecol. 2014;51(4):860–70. https://doi.
1. Bryant James, Allen D, Block S, Cohen J, Eckersley P, Eifler M, Gostin L, et al.          org/10.1111/1365-2664.12252.
    Roadmap to pandemic resilience [Internet]. Edmond J. Safra Center for              21.   Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian data analysis. 2nd ed.
    Ethics, Harvard University; 2020. p. 56. Available from: https://ethics.harvard.         New York: Chapman and Hall/CRC; 2004.
    edu/covid-roadmap                                                                  22.   R Core Team. R: a language and environment for statistical computing
2. Mina MJ, Parker R, Larremore DB. Rethinking Covid-19 test sensitivity - a                 [internet]. Vienna, Austria: R Foundation for statistical Computing; 2020.
    strategy for containment. N Engl J Med [Internet]. 2020;1–2. Available from:             Available from: https://www.r-project.org
    nejm.org                                                                           23.   Plummer M. JAGS: A program for analysis of Bayesian graphical models using
3. Harris JE. Overcoming reporting delays is critical to timely epidemic                     Gibbs sampling. Proc 3rd Int Work Distrib Stat Comput (DSC 2003). 2003;20–22.
    monitoring: the case of COVID-19 in new York City. medRxiv [internet].             24.   Kellner K. jagsU: a wrapper around “jags” to streamline ‘JAGS’ analyses. jR
    2020;20159418. Available from: https://doi.org/https://doi.org/10.1101/2020.             package version 1.4.2. 2016. Available from: https://cran.r-project.org/web/pa
    08.02.20159418.                                                                          ckages/jagsUI/index.html.
4. Wu SL, Mertens AN, Crider YS, Nguyen A, Pokpongkiat NN, Djajadi S, et al.           25.   Plummer M, Best N, Cowles K, Vines K. CODA: Convergence diagnosis and
    Substantial underestimation of SARS-CoV-2 infection in the United States.                output analysis for MCMC. R News. 2006;6(1):7–11. Available from: https://
    Nat Commun. 2020;11(1).                                                                  cran.r-project.org/web/packages/coda/index.html.
5. Woloshin S, Patel N, Kesselheim AS. False negative tests for SARS-CoV-2             26.   Brooks SP, Gelman A. General methods for monitoring convergence of
    infection - challenges and implications. N Engl J Med [Internet]. 2020;38(1):            iterative simulations general methods for monitoring convergence of
    1–2. Available from: nejm.org                                                            iterative simulations. J Comput Graph Stat. 1998;7(4):434–55.

Sanderlin et al. BMC Public Health           (2021) 21:577                            Page 10 of 10

27. Taha HA. Operations research: an introduction. 9th ed. Prentice Hall: New
    Jersey, USA; 2011. 14 p.
28. Abbott's USD's 15-minute BinaxNOW COVID-19 Ag Card becomes first
    diagnostic test with Read-Result Test card to receive FDA EUA. HospiMedica
    International Staff writers. [cited 2020 Dec 5]. 2020. Available from: https://
    www.hospimedica.com/covid-19/articles/294784210/abbotts-usd-5-15-
    minute-binaxnow-covid-19-ag-card-becomes-first-diagnostic-test-with-read-
    result-test-card-to-receive-fda-eua.html.
29. Brooks ZC, Das S. Impact of prevalence, sensitivity, and specificity on patient
    risk and cost. Am J Clin Pathol. 2020;154(5):575–84. https://doi.org/10.1093/a
    jcp/aqaa141.
30. Miller DA, Nichols JD, McClintock BT, Grant EHC, Bailey LL, Weir LA.
    Improving occupancy estimation when two types of observational error
    occur: non-detection and species misidentification. Ecology. 2011;92(7):
    1422–8. https://doi.org/10.1890/10-1396.1.
31. Ruiz-Gutierrez V, Hooten MB, Campbell Grant EH, Ruiz-Gutiérrez V, Hooten
    MB, Campbell Grant EH. Uncertainty in biological monitoring: a framework
    for data collection and analysis to account for multiple sources of sampling
    bias. Methods Ecol Evol. 2016;7(8):900–9. https://doi.org/10.1111/2041-21
    0X.12542.
32. Nichols JD, Hines JE, Mackenzie DI, Seamans ME, Gutiérrez RJ. Occupancy
    estimation and modeling with multiple states and state uncertainty.
    Ecology. 2007;88(6):1395–400. https://doi.org/10.1890/06-1474.
33. Gimenez O, Blanc L, Besnard A, Pradel R, Doherty PF, Marboutin E, et al.
    Fitting occupancy models with E-SURGE: hidden Markov modelling of
    presence-absence data. Methods Ecol Evol. 2014;5(6):592–7. https://doi.org/1
    0.1111/2041-210X.12191.
34. Cooch EG, Conn PB, Ellner SP, Dobson AP, Pollock KH. Disease dynamics in
    wild populations: Modeling and estimation: a review. J Ornithol. 2012;
    152(SUPPL. 2):485–509. https://doi.org/10.1007/s10336-010-0636-3.

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.

You can also read