STRs vs. SNPs: thoughts on the future of forensic DNA testing

Forensic Sci Med Pathol (2007) 3:200–205
DOI 10.1007/s12024-007-0018-1


John M. Butler · Michael D. Coble · Peter M. Vallone

Accepted: 4 January 2007 / Published online: 12 September 2007
© Humana Press Inc. 2007

Abstract Largely due to technological progress coming from the Human Genome and International HapMap Projects, the issue has been raised in recent years within the forensic DNA typing community of the potential for single nucleotide polymorphism (SNP) markers as possible replacements of the currently used short tandem repeat (STR) loci. Our human identity testing project team at the U.S. National Institute of Standards and Technology (NIST) has explored numerous SNP and STR loci and assays as well as developing miniSTRs for degraded DNA samples. Based on their power of discrimination, use in deciphering mixture components, and ability to be combined in multiplex assays in order to recover information from low amounts of biological material, we believe that STRs rather than SNPs will fulfill the dominant role in human identity testing for the foreseeable future. However, SNPs may play a useful role in specialized applications such as mitochondrial DNA (mtDNA) testing, Y-SNPs as lineage markers, ancestry informative markers (AIMs), the prediction of phenotypic traits, and other potential niche forensic casework applications.

Keywords DNA · DNA typing · DNA profiling · Short tandem repeat · Single nucleotide polymorphism · MiniSTR · STR · SNP · mtDNA
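
As a rough illustration of the "power of discrimination" argument made in the abstract, the sketch below counts the unordered genotypes possible at a locus with k alleles, k(k+1)/2, which reproduces the 28 and 3 genotypes listed in Fig. 1. The second function is a deliberately crude, hypothetical comparison: it treats every genotype as equally likely and asks how many biallelic loci offer at least as many genotype combinations as a given STR panel. It assumes seven alleles per STR locus, as drawn in Fig. 1, and it is not the frequency-based calculation behind published estimates. Function names are illustrative only.

```python
def genotype_count(n_alleles: int) -> int:
    """Unordered diploid genotypes at one locus: n homozygotes + n*(n-1)/2 heterozygotes."""
    return n_alleles * (n_alleles + 1) // 2


def biallelic_loci_needed(str_loci: int, str_alleles: int = 7) -> int:
    """Smallest number of biallelic loci whose genotype combinations reach
    those of `str_loci` STR loci with `str_alleles` alleles each.

    Assumption: all genotypes are counted equally; allele frequencies,
    linkage and population structure are ignored.
    """
    target = genotype_count(str_alleles) ** str_loci  # combinations across the STR panel
    per_snp = genotype_count(2)                        # 3 genotypes per biallelic SNP
    n_snps, combinations = 0, 1
    while combinations < target:                       # exact integer arithmetic, no logs
        combinations *= per_snp
        n_snps += 1
    return n_snps


if __name__ == "__main__":
    print(genotype_count(7), genotype_count(2))        # STR vs SNP genotypes per locus
    for panel in (13, 15):
        print(panel, biallelic_loci_needed(panel))
```

Under these counting assumptions the result is 40 and 46 biallelic loci for panels of 13 and 15 seven-allele STR loci, which is of the same order as the 40–60 SNPs cited later in the article. Real panels depend on allele frequencies, so this is only an order-of-magnitude sanity check.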
Official disclaimer: Contribution of the U.S. National Institute of Standards and Technology. Not subject to copyright. Points of view in this document are those of the authors and do not necessarily represent the official position or policies of the U.S. Department of Justice. Certain commercial equipment, instruments and materials are identified in order to specify experimental procedures as completely as possible. In no case does such identification imply a recommendation or endorsement by the National Institute of Standards and Technology nor does it imply that any of the materials, instruments, or equipment identified are necessarily the best available for the purpose. This work was funded in part by the National Institute of Justice through interagency agreement 2003-IJ-R-029 with the NIST Office of Law Enforcement Standards.

J. M. Butler (✉) · M. D. Coble · P. M. Vallone
National Institute of Standards and Technology, Biochemical Science Division, 100 Bureau Drive, Mail Stop 8311, Building 227, Room A243, Gaithersburg, MD 20899, USA
e-mail:

Present Address:
M. D. Coble
Armed Forces DNA Identification Laboratory, Research Section, 1413 Research Blvd., Bldg 101, Rockville, MD 20850, USA

Both length and sequence genetic variation exist in human populations. These forms of variation enable forensic DNA testing because many different alleles can exist in non-coding regions of the genome. When information from multiple unlinked genetic markers is combined, high powers of discrimination are possible. Length variation, in the form of short tandem repeat (STR) markers, has become the primary means for forensic DNA profiling over the past decade [1]. Large national DNA databases now exist containing millions of STR profiles based on a few core STR markers [2, 3].

While current STR systems and DNA databases are working well, the question often posed is “what does the future hold for forensic DNA testing?” Will another set of genetic markers, such as single nucleotide polymorphisms (SNPs), replace the core STR loci already so effectively in use? These questions were addressed in 2000 by the Research and Development Working Group of the National Commission on the Future of DNA Evidence [4] and will

be considered here briefly based on the current state of the science with STRs and SNPs.

For a number of years, largely due to technology progress coming from the Human Genome and International HapMap Projects [5], the primary potential replacement for STRs has usually been proposed to be SNPs, which are sequence variants that occur on average every several hundred bases throughout the human genome [6]. Abundant SNP loci have been characterized and studied in various human populations [7].

Most articles to date discussing the potential applications of SNPs in forensic DNA testing have focused on technology reviews [8, 9] or the number of markers needed to generate equivalent powers of discrimination in single source samples [10–12]. Recently, work on improving the level of multiplex amplification has been reported [13, 14], as has the selection of optimal SNP loci for use in forensic panels [15]. However, direct comparisons between SNPs and STRs suggest that SNPs are not ready to replace STRs as the workhorse of forensic DNA identification markers.

Potential SNP advantages

The primary reason provided for considering SNPs with forensic applications centers around the fact that a higher recovery of information from degraded DNA samples is theoretically possible since a smaller target region is needed. Only a single nucleotide needs to be measured with SNP markers instead of an array of nucleotides (sometimes hundreds of nucleotides in length) as with STRs (Fig. 1). While this argument has been made for many years, a recent direct comparison of SNPs and STRs found that STR markers, when shortened to miniSTRs, perform quite well on degraded DNA material [13]. A wide variety of STR loci exist, and ones with moderate allele ranges and small amplicon sizes can work effectively with compromised DNA samples. Primer redesign to create smaller amplicons with core STR loci [16] and a number of new STR loci [17] has been described recently.

Another advantage of SNPs is that they possess mutation rates approximately 100 thousand times lower than STRs (10^-8 vs. 10^-3). Thus, theoretically SNPs, being more stable in terms of inheritance, could aid parentage testing in some cases or kinship analysis such as is performed with identifying mass disaster victims [18]. However, Amorim and Pereira [19] note some unexpected drawbacks of using SNPs in forensic kinship investigations based on simulations. They predicted that a battery of 45 SNPs would produce a higher frequency of cases where statistical evidence would be inconclusive when applied to routine paternity investigations [19].

Fig. 1 Comparison of (A) STRs and (B) SNPs in terms of the number of possible allele combinations and relative size of the target region. (A) A conventional STR or miniSTR locus with seven alleles (7 to 13 repeats) has 28 possible genotypes (7,7 through 13,13). (B) A biallelic SNP with alleles C and T has 3 possible genotypes (CC, CT, TT).

Significant SNP disadvantages

Several significant disadvantages exist with SNP markers when considered as a possible replacement for currently used STR loci, with the top two being the number of loci needed and the inability to easily decipher mixtures. First, because SNPs are not as polymorphic as STRs, more SNPs are required to reach equivalent powers of discrimination or random match probabilities. Numbers on the order of 40–60 SNPs have been suggested in order to approximate the power of the 13–15 STR loci commonly in use today [10, 12, 19].

Remember that 15 STRs can be routinely amplified simultaneously in a single multiplex amplification reaction from minimal amounts (e.g., 500 pg) of DNA template using commercially available kits such as PowerPlex 16 and Identifiler. While multiplex PCR amplification of such a large number of SNPs (e.g., ~50) has only recently been demonstrated in a research setting [14], routine production and commercialization of robust assays containing upwards of 100 oligonucleotide PCR primers will not be trivial. Likewise, the expense of examining more loci will be higher.

Perhaps more importantly, data interpretation becomes increasingly difficult with more loci and amplification products. Issues with locus drop out will become more significant when three to five times more loci are involved in comparison to traditional STR typing. In addition, assays with a larger number of loci are more sensitive to the quantity and integrity of the input DNA template, particularly when trying to amplify limited DNA materials.

Current SNP typing for use in HapMap population studies (e.g., [7]) or other clinical genetics projects involves attempting to type many thousands of SNP loci in a relatively small number of samples. If a few dozen or even hundreds of loci fail to produce a result on a sample, these data are excluded from the final analysis or further attempts are made using a replenishable supply of DNA. This type of data loss when attempting to perform a direct comparison between a suspect and evidence is undesirable or even unacceptable under the current paradigm of sample matching performed with STR typing on only a dozen or so loci. Will investigators or the courts care that a fraction of the SNP loci attempted failed to produce a result? With limited amounts of starting material in many situations, there may not be opportunities to repeat the testing in an effort to recover the lost loci. Furthermore, the loci missing on reference samples may be different from those that failed on the evidentiary material, leaving even less of an overlap of successfully typed loci for comparison purposes.

When attempting to analyze a greater number of genetic loci, there will be an increased complexity of data to be examined. Depending on the SNP detection platform, the
data to be interpreted will vary. With so many loci being typed, software interpretation will become more reliant on expert computer systems as the primary form of data analysis. More loci mean more peak signals and the potential for more artifacts. Thus, detailed data analysis will become more tedious and practically impossible without validated expert systems. Although the argument has been made that data interpretation will be easier with SNPs because they do not possess stutter products or microvariants, in a certain sense these biological amplification artifacts provide greater confidence in results, i.e., that a measured peak with a stutter product is truly an STR allele rather than a spike or a fluorescent dye artifact.

However, from our point of view, the most significant disadvantage of SNP typing is that the limited number of alleles (typically two) for each SNP locus limits or prevents reliable mixture interpretation. A major advantage with STRs in a forensic setting is that many possible alleles exist, providing the possibility that the multiple contributors to a mixture will have distinguishable (non-overlapping) alleles. Figure 2 shows SNP and STR typing data obtained on the same mixed DNA sample. While the STR results clearly suggest more than one contributor based on the number of alleles present at multiple loci, an analyst would find it difficult to determine that a mixture is present based on the six SNP loci shown.

Fig. 2 Mixture detection on the same samples with (A) SNPs and (B) STRs. The top panel in each section contains a mixture possessing as one of its components the same DNA that is shown as a single source in the bottom panel. Six different SNP loci (from a total of 70 SNPs evaluated; see [20]) are shown in (A). Note that only SNP locus 2 in the top panel of (A) has an allele imbalance suggesting that a possible second contributor is present. On the other hand, in the top panel of (B), all 5 STR loci shown from the green dye channel of the Identifiler kit contain three alleles, making mixture detection much easier.

Another challenge preventing routine application of SNPs today is that multiple possible detection platforms exist for SNP typing [8, 9, 21, 22]. The various chemistries involved are far from being standardized. Without consensus throughout the community on the SNP markers to be employed or the detection platform(s) to be utilized, SNP typing cannot gain a foothold over the dominance of the widely used STR typing systems.

Likely future role of SNPs

However, SNPs may play a useful role in niche applications such as mitochondrial DNA (mtDNA), Y-SNPs, ancestry informative markers (AIMs), predicting phenotypic traits, and other potential forensic casework applications. Coding region SNPs can fulfill a useful role for separating common HV1/HV2 mitochondrial DNA types [23, 24], and assays have been developed to reliably examine mtDNA coding region SNP variation [25, 26]. While Y-SNPs have limited utility for individualizing a sample, they may, depending on the population(s) of interest, be helpful in aiding estimations of ethnic origin [27, 28].

Predicting ancestry [29, 30] or phenotypic characteristics, such as red hair color [31] or eye color [32], is another role that SNPs may play in the future where the amount of sample is not limited. In these cases, SNP typing could help provide investigators with information about a perpetrator based solely on the biological evidence left at the crime
scene. Research in these areas is on-going and not yet ready for routine use. Thus, it is important to stress that while we do not think autosomal SNPs will replace autosomal STRs anytime soon, the utility of various SNP assays should still be evaluated.

It will be valuable to understand which SNP markers may be useful when/if the technology platforms catch up to the level acceptable for forensic usage. Since SNP markers are abundant, a larger pool of potential markers is available for testing. The majority of SNPs are bi-allelic so a search for ‘most informative’ loci is not required (conversely a useful STR should have multiple well-populated allelic states). For human identification purposes, loci with low FST values and a heterozygosity of >40% are generally adequate for forensic usage [15]. The abundance of SNPs means that ‘poor’ markers can be thrown out since the information content is essentially equivalent between loci (this would be done in the initial selection phase of a SNP panel).

It is important to keep in mind that an infrastructure exists that supports STR typing. Databases with core STR loci contain millions of profiles and are still being populated. More than 10 years of expertise in STR typing is present in many forensic laboratories. A question that should be asked is, “Is it worth the time and cost to convert over to a new marker/technology?” The main proposed benefits of SNPs are the potential for use on degraded samples and possibly rapid, high-throughput analysis. However, it appears that miniSTRs compete well in the arena of degraded samples [13], and a high-throughput platform/chemistry for SNPs that is robust enough for forensic use (at least as good as STRs) has not been fully developed.

Table 1 A simple summary comparing the characteristics of STRs and SNPs

Characteristics                                                   STRs/miniSTRs   SNPs
Success with degraded DNA                                         +               +
Power of discrimination (without significant multiplexing)        +               –
Mixture detection/interpretation                                  +               –
Other applications: ethnicity estimation, physical traits, etc.   –               +
Commercial testing kits available                                 +               –
Consistent platform/assay for analysis                            +               –

Summary

The overall points made in this article are summed up in Table 1. While automation of SNP detection has improved significantly in recent years, enabling millions of SNPs to be examined on hundreds of samples in a relatively short period of time (e.g., [7]), due to the large number of loci that must be co-amplified and the inability to easily decipher mixtures, we do not feel that SNPs stand on the horizon as future markers (i.e., replacing STRs) for widespread use in forensic DNA testing. That being said, there are and will likely be important
niche roles for SNP typing. We agree with Gill et al. [33] that autosomal SNPs will likely supplement STR results rather than supplant them due to the immense investment already made in national DNA databases.

Educational message

1. The two primary advantages for SNPs include (a) potential ability to work well on degraded DNA because a small target region can be amplified and (b) lower mutation rates compared to STRs, which could aid kinship testing.
2. Significant disadvantages for SNPs include needing 40–60 loci to obtain equivalent match probabilities as 13–15 STRs commonly used today and the greater difficulty with mixture interpretation due to a limited number of alleles compared to multi-allelic STR markers.
3. SNPs do have a potential future role in aiding investigators with predicting ancestry or phenotypic characteristics although research is still on-going in these areas.

Acknowledgements The assistance of Amy Decker with performing SNP typing and Becky Hill with performing STR and miniSTR typing in our laboratory is gratefully acknowledged. This work was funded in part by the National Institute of Justice through interagency agreement 2003-IJ-R-029 with the NIST Office of Law Enforcement Standards.

References

1. Butler JM. Forensic DNA typing: Biology, technology, and genetics of STR markers. 2nd ed. New York: Elsevier; 2005.
2. Gill P. Role of short tandem repeat DNA in forensic casework in the UK – past, present, and future perspectives. BioTechniques 2002;32:366–72.
3. Butler JM. Genetics and genomics of core short tandem repeat loci used in human identity testing. J Forensic Sci 2006;51:253–65.
4. National Institute of Justice. The future of forensic DNA testing: predictions of the research and development working group. Nov 2000, Washington, DC; available at nij/pubs-sum/183697.htm
5. International HapMap Consortium. A haplotype map of the human genome. Nature 2005;437:1229–320.
6. Brookes AJ. The essence of SNPs. Gene 1999;234:177–86.
7. Hinds DA, Stuve LL, Nilsen GB, Halperin E, Eskin E, Ballinger DG, Frazer KA, Cox DR. Whole-genome patterns of common DNA variation in three human populations. Science 2005;307:1072–9.
8. Budowle B. SNP typing strategies. Forensic Sci Int 2004;S139–42.
9. Sobrino B, Brion M, Carracedo A. SNPs in forensic genetics: a review on SNP typing methodologies. Forensic Sci Int 2005;154:181–94.
10. Chakraborty R, Stivers DN, Su Y, Budowle B. The utility of short tandem repeat loci beyond human identification: implications for development of new DNA typing systems. Electrophoresis 1999;20:1682–96.
11. Krawczak M. Informativity assessment for biallelic single nucleotide polymorphisms. Electrophoresis 1999;20:1676–81.
12. Gill P. An assessment of the utility of single nucleotide polymorphisms (SNPs) for forensic purposes. Int J Legal Med
13. Dixon LA, Dobbins AE, Pulker HK, Butler JM, Vallone PM, Coble MD, Parson W, Berger B, Grubwieser P, Mogensen HS, Morling N, Nielsen K, Sanchez JJ, Petkovski E, Carracedo A, Sanchez-Diz P, Ramos-Luis E, Brion M, Irwin JA, Just RS, Loreille O, Parsons TJ, Syndercombe-Court D, Schmitter H, Stradmann-Bellinghausen B, Bender K, Gill P. Analysis of artificially degraded DNA using STRs and SNPs – results of a collaborative European (EDNAP) exercise. Forensic Sci Int 2006;164(1):33–44.
14. Sanchez JJ, Phillips C, Borsting C, Balogh K, Bogus M, Fondevila M, Harrison CD, Musgrave-Brown E, Salas A, Syndercombe-Court D, Schneider PM, Carracedo A, Morling N. A multiplex assay with 52 single nucleotide polymorphisms for human identification. Electrophoresis 2006;27:1713–24.
15. Kidd KK, Pakstis AJ, Speed WC, Grigorenko EL, Kajuna SL, Karoma NJ, Kungulilo S, Kim JJ, Lu RB, Odunsi A, Okonofua F, Parnas J, Schulz LO, Zhukova OV, Kidd JR. Developing a SNP panel for forensic identification of individuals. Forensic Sci Int 2006;164(1):20–32.
16. Butler JM, Shen Y, McCord BR. The development of reduced size STR amplicons as tools for analysis of degraded DNA. J Forensic Sci 2003;48:1054–64.
17. Coble MD, Butler JM. Characterization of new miniSTR loci to aid analysis of degraded DNA. J Forensic Sci 2005;50:43–53.
18. Biesecker LG, Bailey-Wilson JE, Ballantyne J, Baum H, Bieber FR, Brenner C, Budowle B, Butler JM, Carmody G, Conneally PM, Duceman B, Eisenberg A, Forman L, Kidd KK, LeClair B, Niezgoda S, Parsons T, Pugh E, Shaler R, Sherry ST, Sozer A, Walsh A. DNA identifications after the 9/11 World Trade Center attack. Science 2005;310:1122–3.
19. Amorim A, Pereira L. Pros and cons in the use of SNPs in forensic kinship investigation: a comparative analysis with STRs. Forensic Sci Int 2005;150:17–21.
20. Vallone PM, Decker AE, Butler JM. Allele frequencies for 70 autosomal SNP loci with U.S. Caucasian, African-American, and Hispanic samples. Forensic Sci Int 2005;149:279–86.
21. Gut IG. Automation in genotyping of single nucleotide polymorphisms. Hum Mutat 2001;17:475–92.
22. Syvanen AC. Accessing genetic variation: genotyping single nucleotide polymorphisms. Nat Rev Genet 2001;2:930–42.
23. Coble MD, Just RS, O’Callaghan JE, Letmanyi IH, Peterson CT, Irwin JA, Parsons TJ. Single nucleotide polymorphisms over the entire mtDNA genome that increase the power of forensic testing in Caucasians. Int J Legal Med 2004;118:137–46.
24. Coble MD, Vallone PM, Just RS, Diegoli TM, Smith BC, Parsons TJ. Effective strategies for forensic analysis in the mitochondrial DNA coding region. Int J Legal Med 2006;120:27–32.
25. Brandstatter A, Parsons TJ, Parson W. Rapid screening of mtDNA coding region SNPs for the identification of west European Caucasian haplogroups. Int J Legal Med 2003;117(5):291–8.
26. Vallone PM, Just RS, Coble MD, Butler JM, Parsons TJ. A multiplex allele-specific primer extension assay for forensically informative SNPs distributed throughout the mitochondrial genome. Int J Legal Med 2004;118:147–57.
27. Vallone PM, Butler JM. Y-SNP typing of U.S. African American and Caucasian samples using allele-specific hybridization and primer extension. J Forensic Sci 2004;49:723–32.
28. Wetton JH, Tsang KW, Khan H. Inferring the population of origin of DNA evidence within the UK by allele-specific hybridization of Y-SNPs. Forensic Sci Int 2005;152:45–53.
29. Shriver MD, Kittles RA. Genetic ancestry and the search for personalized genetic histories. Nat Rev Genet 2004;5(8):611–8.
30. Frudakis T, Venkateswarlu K, Thomas MJ, Gaskin Z, Ginjupalli S, Gunturi S, Ponnuswamy V, Natarajan S, Nachimuthu PK. A classifier for the SNP-based inference of ancestry. J Forensic Sci 2003;48(4):771–82.
31. Grimes EA, Noake PJ, Dixon L, Urquhart A. Sequence polymorphism in the human melanocortin 1 receptor gene as an indicator of the red hair phenotype. Forensic Sci Int 2001;122:124–9.
32. Sturm RA, Frudakis TN. Eye colour: portals into pigmentation genes and ancestry. Trends Genet 2004;20(8):327–32.
33. Gill P, Werrett DJ, Budowle B, Guerrieri R. An assessment of whether SNPs will replace STRs in national DNA databases – joint considerations of the DNA working group of the European Network of Forensic Science Institutes (ENFSI) and the Scientific Working Group on DNA Analysis Methods (SWGDAM). Sci Justice 2004;44:51–3.