Forensic Science International: Genetics
Forensic Science International: Genetics 7 (2013) 98–115
journal homepage: www.elsevier.com/locate/fsig

The HIrisPlex system for simultaneous prediction of hair and eye colour from DNA

Susan Walsh a, Fan Liu a, Andreas Wollstein a, Leda Kovatsi b, Arwin Ralf a, Agnieszka Kosiniak-Kamysz c, Wojciech Branicki d,e, Manfred Kayser a,*

a Department of Forensic Molecular Biology, Erasmus MC University Medical Centre Rotterdam, The Netherlands
b Laboratory of Forensic Medicine & Toxicology, School of Medicine, Aristotle University of Thessaloniki, Greece
c Department of Dermatology, Collegium Medicum of the Jagiellonian University, Kraków, Poland
d Section of Forensic Genetics, Institute of Forensic Research, Kraków, Poland
e Department of Genetics and Evolution, Institute of Zoology, Jagiellonian University, Kraków, Poland

Article history: Received 2 May 2012; received in revised form 25 June 2012; accepted 23 July 2012

Keywords: Hair colour; Eye colour; Forensic DNA Phenotyping; DNA intelligence; Externally visible characteristics

Abstract

Recently, the field of predicting phenotypes of externally visible characteristics (EVCs) from DNA genotypes with the final aim of concentrating police investigations to find persons completely unknown to investigating authorities, also referred to as Forensic DNA Phenotyping (FDP), has started to become established in forensic biology. We previously developed and forensically validated the IrisPlex system for accurate prediction of blue and brown eye colour from DNA, and recently showed that all major hair colour categories are predictable from carefully selected DNA markers. Here, we introduce the newly developed HIrisPlex system, which is capable of simultaneously predicting both hair and eye colour from DNA. HIrisPlex consists of a single multiplex assay targeting 24 eye and hair colour predictive DNA variants including all 6 IrisPlex SNPs, as well as two prediction models, a newly developed model for hair colour categories and shade, and the previously developed IrisPlex model for eye colour. The HIrisPlex assay was designed to cope with low amounts of template DNA, as well as degraded DNA, and preliminary sensitivity testing revealed full DNA profiles down to 63 pg input DNA. The power of the HIrisPlex system to predict hair colour was assessed in 1551 individuals from three different parts of Europe showing different hair colour frequencies. Using a 20% subset of individuals, while 80% were used for model building, the individual-based prediction accuracies employing a prediction-guided approach were 69.5% for blond, 78.5% for brown, 80% for red and 87.5% for black hair colour on average. Results from HIrisPlex analysis on worldwide DNA samples imply that HIrisPlex hair colour prediction is reliable independent of bio-geographic ancestry (similar to previous IrisPlex findings for eye colour). We furthermore demonstrate that it is possible to infer with a prediction accuracy of >86% if a brown-eyed, black-haired individual is of non-European (excluding regions nearby Europe) versus European (including nearby regions) bio-geographic origin solely from the strength of HIrisPlex eye and hair colour probabilities, which can provide extra intelligence for future forensic applications. The HIrisPlex system introduced here, including a single multiplex test assay, an interactive tool and prediction guide, and recommendations for reporting final outcomes, represents the first tool for simultaneously establishing categorical eye and hair colour of a person from DNA.
The practical forensic application of the HIrisPlex system is expected to benefit cases where other avenues of investigation, including STR profiling, provide no leads on who the unknown crime scene sample donor or the unknown missing person might be.

1872-4973 © 2012 Elsevier Ireland Ltd. Open access under CC BY-NC-ND license.
http://dx.doi.org/10.1016/j.fsigen.2012.07.005
* Corresponding author. Tel.: +31 10 7038073; fax: +31 10 7044575. E-mail address: email@example.com (M. Kayser).

1. Introduction

Over the last few years, the prediction of externally visible characteristics (EVCs) from DNA has been an interesting topic of study for many reasons, in particular, its anticipated use within forensic genetics [1–3] resulting in the chosen term Forensic DNA Phenotyping (FDP). The ability to predict the physical appearance of an individual directly from crime scene material can in principle help police investigations by limiting a large number of potential suspects in cases where perpetrators unknown to the investigating authorities are involved. These include cases where conventional STR profiling could not provide a hit within the forensic DNA (profile) database, or could not provide a match with a suspect singled-out by police investigation, or cases where an STR profile could simply not be generated due to low quality and/or quantity of DNA available.
Using EVC information obtained from the crime scene material via FDP, police would then proceed with more concentrated enquiries, and finally request standard forensic STR profiling only for the reduced number of EVC matching suspects aiming at DNA individualisation for court room use. Obviously, the more EVCs that are predictable from crime scene material, the better a person's appearance can be described, and in turn the smaller the number of appearance-matching potential suspects for subsequent forensic STR profiling. Also in missing person cases where a body was found decomposed with no EVC information discernible from visual inspection, or body parts that do not provide EVC information including bones, FDP is expected to provide leads for finding the right antemortem samples or family members for final STR-based identification.

The use of DNA (or other biomarkers) for investigative purposes termed 'DNA intelligence', rather than for identification purposes in the court room as currently applied in forensics, marks a completely new application of DNA in forensics and is currently at the early stages of development. At present there is only one FDP tool available that has already been developmentally validated for forensic use and that is the IrisPlex system, capable of predicting eye colour from DNA [4,5]. Although other studies have suggested DNA markers and methods for predicting externally visible traits, most notably eye colour [6–18], none of them introduced a tool that had undergone systematic forensic developmental validation testing as of yet. The IrisPlex system allows the prediction of eye colour from minute amounts of DNA (31 pg DNA input full profiles) and has proven to be 94% accurate for predicting blue and brown eye colour when tested on a European set of >3800 individuals. However, work is on-going with regards to identifying the underlying genes and developing predictive DNA markers for several other EVCs such as skin colour [8,20,21], hair colour [6,8], body height [22,23], male baldness, and hair morphology [25,26].

The previous progress on categorical eye colour DNA predictability together with the strong genetic and phenotypic relationship between eye and hair colour variation, as well as the increased understanding of the genetic basis of hair colour, all suggest that hair colour may represent the next-promising candidate EVC for DNA prediction after eye colour. Hair colour (as well as eye colour) is generally known to be highly variable in people of (at least partial) European descent and those from nearby regions such as the Middle East and parts of Western Asia, with individuals displaying numerous variations of hair colour shade that are usually summarised in four main categories of colour such as red, blond, brown and black. In contrast, people from any other parts of the world (and without European/nearby genetic admixture) usually display the ancestral black hair colour (together with the ancestral brown eye colour) phenotype. Variation in hair (and eye) colour is assumed to be of European origin and is thought to have reached its currently observed frequencies via sexual selection (i.e. mate choice preferences). The genetic basis of human hair colour variation has been studied considerably in the last few years. Recent studies either employing the candidate gene approach or genome-wide association and/or linkage analysis have identified genes and DNA variants likely to be involved in human hair colour variation [6–8,12,14,29–33]. Some preliminary attempts have already been made towards the prediction of hair colour from informative DNA variants. In fact, an early red hair prediction protocol based on a combination of non-synonymous single nucleotide polymorphisms (SNPs) in the MC1R gene that incur the red hair phenotype effect was already developed for forensic use more than ten years ago and its accuracy was 84% in the prediction of red hair individuals. Sulem et al. in their genome-wide association study for European pigmentation traits developed a hair colour prediction tool, which was capable of excluding red and either blond or brown hair colour in its prediction for many of their individuals. More recently, Valenzuela et al. assessed 75 SNPs from 24 genes previously implicated in hair, skin and eye colour in samples of various bio-geographic origins (Europe and elsewhere) and found that three of them, i.e. rs12913832 (HERC2), rs16891982 (SLC45A2) and rs1426654 (SLC24A5) combined gave the best prediction for light and dark hair colour.

Armed with previous knowledge on hair colour associated DNA variants and in considering the most up-to-date list of DNA variants related to human hair colour variation available at the time, we recently performed an evaluation of 46 SNPs from 13 genes for model-based population-wise hair colour prediction aiming to find a set of most hair colour predictive DNA variants. In this previous study we identified a set of 13 DNA markers (2 MC1R combined marker sets and 11 single DNA markers) from 11 genes (MC1R, HERC2, OCA2, SLC45A2 (MATP), KITLG, EXOC2, TYR, SLC24A4, IRF4, PIGU/ASIP and TYRP1) containing the most hair colour predictive information. This DNA marker set provided a high degree of population-based, prevalence-adjusted overall prediction accuracy as expressed by the area under the receiver operating characteristic curve (AUC), with estimates at 0.93 for red, 0.87 for black, 0.82 for brown, and 0.81 for blond hair colour, where 1 means completely accurate prediction. However, the genotyping methodology used in this previous screening study did not allow simultaneous genotyping of all 22 identified hair colour predictive DNA markers in a single reaction as would be appreciated in forensic DNA analysis where there can be limited amounts of starting material. Furthermore, in the previous study, only samples with hair colour genotypes and phenotypes from a single country in Eastern Europe, i.e. Poland, were available, whereas the inclusion of individuals from other European regions, such as Western and Southern parts, would be beneficial in order to enrich with individuals displaying hair colours such as brown and black that are more common in these parts of Europe.

In the present study, we developed and evaluated the sensitivity of a single-tube multiplex assay targeting the 22 previously recognised hair colour predictive DNA variants as well as the six eye colour predictive SNPs from our previously developed IrisPlex system (four of which are overlapping). We employed the SNaPshot technology because it can be easily implemented in forensic DNA laboratories as no additional equipment or serious interference with protocols is needed to apply it. Furthermore, we assessed the power of the 22 DNA variants to predict hair colour categories, as well as hair colour shade, via model-based prediction studies using an expanded database of hair colour genotype and phenotype data for >1500 individuals from Eastern, Western and Southern parts of Europe that displayed varying degrees of hair colouration. Moreover, we investigated, by analysing a worldwide set of individuals from 51 populations (HGDP-CEPH), whether or not the reliability of hair colour prediction available with these 22 DNA variants depends on knowledge of bio-geographic ancestry. We present and make available for future use the first system for parallel prediction of hair and eye colour from DNA, which we termed HIrisPlex, consisting of a single multiplex assay for 24 eye and/or hair colour predictive DNA variants and two prediction models, i.e. a newly developed model for hair colour and shade prediction and the previously developed IrisPlex model for eye colour prediction. An interactive spreadsheet tool for obtaining individual hair colour, hair colour shade, and eye colour prediction probabilities from HIrisPlex genotypes as well as a prediction guide for accurate interpretation of individual hair colour and shade probabilities are made available to enhance the practical use of the HIrisPlex system in future applications such as forensics.
2. Materials and methods

2.1. Subjects, imagery and hair and eye colour classification

DNA samples and hair colour information were collected from 1551 European subjects living in Poland (n = 1093), the Republic of Ireland (n = 339) and Greece (n = 119). All participants gave informed consent. The study was approved in part by the Ethics Committee of the Jagiellonian University, number KBET/17/B/2005 and the Commission on Bioethics of the Regional Board of Medical Doctors in Krakow number 48 KBL/OIL/2008. Hair and eye colour phenotypes were collected by a combination of self-assessment and professional single observer grading (Polish data). The professional grader (AKK) for the Polish dataset is a medical doctor (dermatologist) who evaluated hair colour upon observation, and questioning of individuals in circumstances where hair was dyed or grey. For hair colour phenotype self-assessment, individuals were asked to record in the questionnaire the colour of their hair during their 20s, and at what age grey/white hairs started to appear (Irish collection); this avoided the effects of hair greying and whitening on phenotyping. Sample collection in Ireland included high-resolution eye and hair photographic imagery. In brief, hair and eye images were taken using a Nikon D3100 with an AF-S Micro Nikkor 60 mm macro lens; the aperture, shutter speed and ISO were fixed to f = 22, 1/125, and 200 respectively. A ring flash (model Speedlight SB-R200) and an average distance of 7 cm was used from the eye and from the back of the head for hair imagery. This ensured consistent sampling and regulated lighting conditions, including lens settings of a 0.2 and 0.23 fixed focal length. All individuals were asked to fill in a questionnaire that included basic information, such as gender and age as well as data concerning eye and hair pigmentation phenotype. However, due to many Irish individuals having dyed or grey hair, self-reported hair colour classifications were used for this set in model training. For the Greek collection, a buccal swab was taken from each individual and a self-reported questionnaire regarding hair and eye colour information was collected. For both the Irish and Greek set, hair colour was classified into 7 categories: blond (5.9%), light-brown (34%), dark-brown (45.2%), auburn (5.7%), blond-red (1.3%), red (2.2%), and black (5.7%). For the Polish dataset, this data was collected as previously reported and hair colour was classified into 7 categories: blond (13.7%), dark-blond (44.2%), brown (22.6%), auburn (1%), blond-red (3.9%), red (3.8%), and black (10.8%). For hair colour prediction analyses, we grouped blond and dark-blond into one blond category (42.6%), light brown and dark brown into one brown category (39.3%) and auburn, blond-red, and red into one red category (8.8%) with black as an additional fourth category (9.3%). Eye colour was classified into 3 categories: blue, brown and intermediate (including green). The term category in this context refers to the grouping of similar phenotypic colours into one group to separate them from another colour group, i.e. blond category, black category. Table 1 displays the numbers of hair and eye colour phenotypes including sex, within all 3 populations sampled. Notably, red hair in the Polish population and green eye colour in the Irish population were intentionally enriched due to their rare occurrence; therefore both phenotypes do not reflect natural population frequencies.

2.2. DNA samples and HIrisPlex genotyping

DNA from the Polish samples was extracted as described previously. Saliva samples collected from individuals in Ireland were extracted using the Puregene DNA isolation kit (Qiagen, Hilden, Germany). Buccal swabs collected from individuals in Greece were extracted using an in-house organic extraction protocol. DNA from the H952 subset of the HGDP-CEPH panel that represents 952 individuals from 51 worldwide populations [36,37] was purchased from CEPH. Due to lack of DNA in some samples belonging to the HGDP-CEPH 952 set, 7 individuals could not be genotyped by the HIrisPlex assay, and therefore the final number of worldwide samples was 945.

All samples were genotyped using the HIrisPlex assay. The assay includes 23 SNPs and 1 insertion/deletion (INDEL) polymorphism, altogether 24 DNA variants, from 11 genes: MC1R, HERC2, OCA2, SLC24A4, SLC45A2, IRF4, EXOC2, TYRP1, TYR, KITLG, and PIGU/ASIP. Further information on these 24 markers can be found in Table 2, including primer sequences. The 24 PCR primer pairs were designed using the default parameters of the program Primer3Plus, which is a free web-based design software. PCR fragments were designed to be as short as possible to cater for degraded DNA, and therefore all are less than 160 bp in length. To reduce the possibility of primer pairs interacting with each other, the program Autodimer was used to analyse primer sequences. Surrounding sequence regions were also searched with BLAST against dbSNP to reduce the chance of a primer's location covering a known SNP site that would interfere with efficient primer binding.

For the population genotyping, genomic DNA quantities ranging from 300 pg to 3 ng in 1 µl formats were amplified per individual in a 10 µl reaction volume consisting of 1× PCR buffer, 2.5 mM MgCl2, 220 µM of each dNTP, and 1.75 U AmpliTaq Gold DNA polymerase (Applied Biosystems Inc., Foster City, CA) including PCR primer concentrations found in Table 2. Thermocycling was performed on the 96-well GeneAmp® PCR system 9700 (Applied Biosystems) under the following conditions: (1) 95 °C for 10 min, (2) 33 cycles of 95 °C for 30 s and 61 °C for 30 s, (3) 5 min at 61 °C. PCR products were cleaned with ExoSAP-IT (USB Corp., Cleveland, OH), as recommended by the manufacturer. Following removal of unincorporated dNTPs and primers, the multiplex SBE (single base extension) assay was performed using 2 µl of product with 1 µl of ABI SNaPshot kit (Applied Biosystems, Foster City, CA) reaction mix in a total reaction volume of 5 µl. Single base extension (SBE) primer sequences and concentrations used in the assay can be found in Table 2. Thermocycling conditions were as follows: 96 °C for 2 min and 25 cycles of 96 °C for 10 s, 50 °C for 5 s and 60 °C for 30 s. Products were cleaned using SAP (USB Corp.), following the manufacturer's guidelines, and 1 µl of cleaned product was run on the ABI 3130xl Genetic Analyser (Applied Biosystems) with POP-7 on a 36 cm capillary

Table 1
Phenotype frequencies according to hair and eye colour categories (including sex) for the full combined set of individuals from Poland, Ireland and Greece.

Hair colour
Population  Blond  Dark blond*/light brown  Dark brown  Brown red/auburn  Blond red  Red  Black  Total
Poland      150    483*                     247         11                43         41   118    1093
Ireland     16     111                      158         23                6          10   15     339
Greece      11     45                       49          3                 0          0    11     119
Total       177    639                      454         37                49         51   144    1551

Eye colour
Population  Blue  Intermediate (green, heterochromia)  Brown  Total
Poland      590   164                                  339    1093
Ireland     172   90                                   77     339
Greece      13    15                                   91     119
Total       775   269                                  507    1551

Sex
Population  Male  Female  Total
Poland      449   644     1093
Ireland     77    262     339
Greece      51    68      119
Total       577   974     1551

* represents individuals who were reported as dark blond in the dark blond/light brown category.
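The grouping of the seven recorded hair colour classes into the four prediction categories described in Section 2.1 can be checked against the Table 1 counts. The following is a minimal sketch, not the authors' code: the dictionary keys and the function name `group_counts` are hypothetical, and it assumes that the shared "dark blond*/light brown" column holds dark blond individuals for Poland (grouped as blond) and light brown individuals for Ireland and Greece (grouped as brown), as the table footnote and the text suggest.

```python
# Hair colour counts copied from Table 1 (key names are illustrative assumptions).
TABLE1_HAIR = {
    "Poland": {"blond": 150, "dark_blond_or_light_brown": 483, "dark_brown": 247,
               "brown_red_auburn": 11, "blond_red": 43, "red": 41, "black": 118},
    "Ireland": {"blond": 16, "dark_blond_or_light_brown": 111, "dark_brown": 158,
                "brown_red_auburn": 23, "blond_red": 6, "red": 10, "black": 15},
    "Greece": {"blond": 11, "dark_blond_or_light_brown": 45, "dark_brown": 49,
               "brown_red_auburn": 3, "blond_red": 0, "red": 0, "black": 11},
}


def group_counts(table):
    """Collapse the seven recorded classes into blond, brown, red and black."""
    totals = {"blond": 0, "brown": 0, "red": 0, "black": 0}
    for population, row in table.items():
        # Assumption: the shared column is dark blond for Poland, light brown elsewhere.
        shared_group = "blond" if population == "Poland" else "brown"
        totals["blond"] += row["blond"]
        totals[shared_group] += row["dark_blond_or_light_brown"]
        totals["brown"] += row["dark_brown"]
        totals["red"] += row["brown_red_auburn"] + row["blond_red"] + row["red"]
        totals["black"] += row["black"]
    return totals


grouped = group_counts(TABLE1_HAIR)
n_total = sum(grouped.values())
percentages = {k: round(100 * v / n_total, 1) for k, v in grouped.items()}
print(grouped, n_total, percentages)
```

Under these assumptions the four category shares agree with the percentages quoted in the text (42.6%, 39.3%, 8.8% and 9.3% of 1551 individuals).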
Table 2 Information about the 24 DNA variants of the HIrisPlex assay, including PCR and single base extension (SBE) primer sequences and concentrations. Assay SNP CHR Position Gene Major Minor PCR primers Concentration Product SBE Concentration position Allele Allele size primers 1 N29insA 16 89985753 Exonic MC1R C insA MC1Rset1F Set1 0.55 mm GCAGGGATCCCAGAGAAGAC 117bp CCCCAGCTGGGGCTGGCTGCCAA 1.3 mm 2 rs11547464 16 89986091 Exonic MC1R G A MC1Rset1R 0.55 mm TCAGAGATGGACACCTCCAG ttttttttttttGCCATCGCCGTGGACC 0.1 mm 3 rs885479 16 89986154 Exonic MC1R C T MC1Rset2F Set2 0.5 mm CTGGTGAGCTTGGTGGAGA 158bp ttttttttttttttttttGATGGCCGCAACGGCT 1.25 mm 4 rs1805008 16 89986144 Exonic MC1R C T MC1Rset2R 0.5 mm TCCAGCAGGAGGATGACG tttttttttttttACAGCATCGTGACCCTGCCG 0.375 mm 5 rs1805005 16 89985844 Exonic MC1R G T MC1Rset3F Set3 0.5 mm GTCCAGCCTCTGCTTCCTG 147bp tttttttttttttttTGGTGGAGAACG- 0.75 mm CGCTGGTG 6 rs1805006 16 89985918 Exonic MC1R C A MC1Rset3R 0.5 mm AGCGTGCTGAAGACGACAC ttttttttttttttttttttCTGCCTGGC- 0.75 mm CTTGTCGGA 7 rs1805007 16 89986117 Exonic MC1R C T MC1Rset4F Set4 0.4 mm CAAGAACTTCAACCTCTTTCTCG 106bp tttttttttttttttttttttttttCTCCATCTTC- 1 mm TACGCACTG 8 rs1805009 16 89986546 Exonic MC1R G C MC1Rset4R 0.4 mm CACCTCCTTGAGCGTCCTG ttttttttttttttttttttttttttttttATCTGC- 0.4 mm AATGCCATCATC 9 Y152OCH 16 89986122 Exonic MC1R C A ttttttttttttttttttttttttttttttCATCTT- 0.6 mm S. Walsh et al. 
/ Forensic Science International: Genetics 7 (2013) 98–115 CTACGCACTGCGCTA 10 rs2228479 16 89985940 Exonic MC1R G A ttttttttttttttttttttttttttttttttttttC- 0.375 mm TGGTGAGCGGGAGCAAC 11 rs1110400 16 89986130 Exonic MC1R T C ttttttttttttttttttttttttttttttCTTCTA- 0.3 mm CGCACTGCGCTACCACAGCA 12 rs28777 5 33994716 Intronic SLC45A2 A C rs28777_F Set5 0.4 mm TACTCGTGTGGGAGTTCCAT 150bp tttttttttttttttttttttttttttttttttttttttCA- 1.2 mm TGTGATCCTCACAGCAG rs28777_R 0.4 mm TCTTTGATGTCCCCTTCGAT 13 rs16891982 5 33987450 Exonic SLC45A2 G C Rs16891982_F Set6 0.4 mm TCCAAGTTGTGCTAGACCAGA 128bp tttttttttttttttttttttttttttttttttttttttttttt- 1 mm AAACACGGAGTTGATGCA Rs16891982_R 0.4 mm CGAAAGAGGAGTCGAGGTTG 14 rs12821256 12 87852466 Intergenic KITLG A G rs12821256_F Set7 0.4 mm ATGCCCAAAGGATAAGGAAT 118bp ttttttttttttttttttttttttttttttttttttttt- 0.1 mm GGAGCCAAGGGCATGTTACTACGGCAC rs12821256_R 0.4 mm GGAGCCAAGGGCATGTTACT 15 rs4959270 6 402748 Intergenic EXOC2 C A Rs4959270_F Set8 0.4 mm TGAGAAATCTACCCCCACGA 140bp ttttttttttttttttttttttttttttttttttttttttt- 0.375 mm GGAACACATCCAAACTATGACACTATG Rs4959270_R 0.4 mm GTGTTCTTACCCCCTGTGGA 16 rs12203592 6 341321 Intronic IRF4 C T rs12203592_F Set9 0.4 mm AGGGCAGCTGATCTCTTCAG 126bp tttttttttttttttttttttttttttttttttttttttttttt- 0.3 mm tTCCACTTTGGTGGGTAAAAGAAGG rs12203592_R 0.4 mm GCTTCGTCATATGGCTAAACCT 17 rs1042602 11 88551344 Exonic TYR G T rs1042602 _F Set10 0.4 mm CAACACCCATGTTTAACGACA 124bp ttttttttttttttttttttttttttttttttttttttttttttt- 1.25 mm tttttttTCAATGTCTCTCCAGATTTCA rs1042602 _R 0.4 mm GCTTCATGGGCAAAATCAAT 18 rs1800407 15 25903913 Exonic OCA2 G A rs1800407_F Set11 0.4 mm AAGGCTGCCTCTGTTCTACG 124bp tttttttttttttttttttttttttttttttttttttttttttttt- 0.1 mm tttttttttttttttGCATACCGGCTCTCCC rs1800407_R 0.4 mm CGATGAGACAGAGCATGATGA 19 rs2402130 14 91870956 Intronic SLC24A4 A G rs2402130_F Set12 0.4 mm ACCTGTCTCACAGTGCTGCT 150bp tttttttttttttttttttttttttttttttttttttttttttttttt- 0.75 mm ttttttttttttTGAACCATACGGAGCCCGTG rs2402130_R 0.4 mm 
TTCACCTCGATGACGATGAT 20 rs12913832 15 26039213 Intronic HERC2 C T rs12913832_F Set13 0.4 mm TCAACATCAGGGTAAAA- 150bp ttttttttttttttttttttttttttttttttttttttttttttttt- 1.2 mm ATCATGT tttttttttttttttttTAGCGTGCAGAACTTGACA rs12913832_R 0.4 mm GGCCCCTGATGATGATAGC 21 rs2378249 20 32681751 Intronic ASIP/PIGU T C rs2378249_F Set14 0.4 mm CGCATAACCCATCCCTCTAA 136bp tttttttttttttttttttttttttttttttttttttttttttttt- 0.18 mm ttttttttttttttttttttCCACACCTCTCCTCAGCCCA rs2378249_R 0.4 mm CATTGCTTTTCAGCCCACAC 101
Table 2 (Continued)

Assay position  SNP         CHR  Position  Region      Gene     PCR primers (set, concentration)
22              rs12896399  14   91843416  Intergenic  SLC24A4  Rs12896399_F, Rs12896399_R (Set15, 0.4 µM each)
23              rs1393350   11   88650694  Intronic    TYR      Rs1393350_F, Rs1393350_R (Set16, 0.4 µM each)
24              rs683       9    12699305  Exonic      TYRP1    rs683_F, rs683_R (Set17, 0.4 µM each)

The remaining column values for these three rows are listed in extraction order; their row assignment is not recoverable from the transcription.
Major/minor alleles: G, G, T, C, T, T
PCR primer sequences: GGGAAGGTGAATGATAACACG, GACCCTGTGTGAGACCCAGT, CACAAAACCACCTGGTTGAA, TGAAAGGGTCTTCCCAGCTT, CTGGCGATCCAATTCTTTGT, TTTCTTTATCCCCCTGATGC
Product sizes: 125bp, 124bp, 138bp
SBE primer fragments: ttttttttttttttttttttttttttttttttttttttttttt- ttttttttttttttttttttttttttttttttttttttttt- ttttttttttttttttttttttttttttttttttttttt- ttttttttttttttttttttttttttCATTTGTA- AAAAGTATGCCTAGAACTTTAAT tttttttttttttttttttttttttttGCTTTG- ttttttttttttttttttttttTCTTTAGGT- AAAGACCACACAGATTT CAGTATATTTTGGG
SBE primer concentrations: 1.125 µM, 0.175 µM, 1.1 µM

array following the SNaPshot kit sample preparation guidelines; however, run parameters of 2.5 kV for 10 s injection voltage and run time of 500 s at 60 °C were used for increased sensitivity.

For assay sensitivity studies, genotyping results from two different individuals were assessed from serial dilutions of DNA input samples of 500 pg, 250 pg, 125 pg, 63 pg and 31 pg. Each result was investigated for allelic drop out, which includes peaks below the 50-rfu threshold that cannot be called. The determination of sensitivity was based on the production of a full profile in every replicate at a particular DNA input level.

2.3. HIrisPlex DNA variants and their use for eye/hair colour prediction including in a worldwide sample

The HIrisPlex assay consists of 24 DNA variants (23 SNPs and 1 INDEL). Six of these markers, rs12913832 (HERC2), rs1800407 (OCA2), rs12896399 (SLC24A4), rs16891982 (SLC45A2 (MATP)), rs1393350 (TYR) and rs12203592 (IRF4), are taken from the IrisPlex system, which has already been well established [4,5,19,42], and are used for the eye colour prediction part of the HIrisPlex system. The minor allele counts of these 6 SNPs are input into the HIrisPlex prediction tool and used to predict the eye colour of the individual with the IrisPlex model as previously published [5,19], with the highest probability of the three categories, brown, blue or intermediate, being the predicted eye colour. The 22 DNA variants used for hair colour prediction are Y152OCH, N29insA, rs1805006, rs11547464, rs1805007, rs1805008, rs1805009, rs1805005, rs2228479, rs1110400 and rs885479 from the MC1R gene, rs1042602 (TYR), rs4959270 (EXOC2), rs28777 (SLC45A2 (MATP)), rs683 (TYRP1), rs2402130 (SLC24A4), rs12821256 (KITLG), rs2378249 (PIGU/ASIP), rs12913832 (HERC2), rs1800407 (OCA2), rs16891982 (SLC45A2 (MATP)) and rs12203592 (IRF4), based on our previous publication for hair colour prediction. When their minor alleles are input into the HIrisPlex prediction tool, they are used to predict the hair colour of the individual using the HIrisPlex hair prediction model developed in this paper. From the four hair colour categories of blond, brown, red and black, the highest probability value is indicative of the predicted hair colour following guidelines that are published within this paper and described in the next section.

For worldwide hair colour prediction, we assessed the HIrisPlex assay performance on 945 samples from 51 populations of the HGDP-CEPH set. The MapViewer 7 (Golden Software, Inc., Golden, CO, USA) package was used to plot the predicted hair colour categories and the distribution of SNP genotypes on the world map. A non-metric multidimensional scaling (MDS) plot was produced to illustrate the pairwise FST distances of the 24 eye and hair colour SNPs between populations, using SPSS 17.0.2 for Windows (SPSS Inc., Chicago, USA). Analysis of molecular variance (AMOVA) (Excoffier 1992) was performed using Arlequin v3.11. A threshold assessment of prediction probabilities for each hair colour category was also carried out, including a combined eye and hair colour prediction probability threshold in the inference of a non-European individual with black hair and brown eyes. For the assessment of an age-dependent hair colour change, a Pearson correlation was calculated and the graph plotted using SPSS 17.0.2 for Windows (SPSS Inc., Chicago, USA).

2.4. Prediction modelling for hair colour

To develop a hair colour prediction model using samples from several sites with varying levels of hair colour due to their position within Europe (central, western and southern Europe), we took a random subset of 80% of the samples from each site: Poland (n = 875), Ireland (n = 272) and Greece (n = 96). This 80% subset was used to train the model, which was based on Multinomial Logistic
Regression (MLR), as previously published by Liu et al. In brief, individuals were categorised according to their hair phenotypes and were split into 4 categories, Blond (n = 529), Brown (n = 490), Red (n = 109) and Black (n = 115). For their genotypes, 22 of the 24 HIrisPlex DNA variants (as described above) were used to test for hair colour differentiation and use in the prediction model. By inputting the minor allele of each DNA variant together with the individual's phenotype and applying MLR, alpha and beta values are generated that form the core of the prediction model. This model then allows the probabilistic prediction of an individual's hair colour category solely based on the input of the 22 variant minor alleles into the HIrisPlex hair colour prediction tool. To assess the effect of the light and dark shades of hair colour that may be contributed from blond and black respectively, a similar approach was used that compared the individuals grouped in the light category (blond, n = 529) versus a dark category (black, n = 115). Red hair individuals (n = 109) were omitted from this analysis as their resulting colour is based upon an MC1R cumulative mutation and not on the continuous spectrum of light to dark (i.e. blond to black). Brown hair individuals (n = 490) were omitted, as only the extremes of light and dark were required. Therefore, using this two-pronged model approach, a predicted hair colour is generated with an approximate indication of the colour being light or dark (i.e. light brown, dark brown) due to the influence of the genotypes commonly associated with the light/dark categories of blond and black respectively. The further 20% of the combined dataset (total n = 308), i.e. from Poland (n = 218), Ireland (n = 67) and Greece (n = 23), was used to assess the accuracy of the prediction model in terms of the final hair colour prediction being correct or incorrect based on colour category, shade and use of the hair colour prediction guide that is described in detail in Section 3, and an assessment of optimal category thresholds was undertaken. The steps to take when acquiring a prediction based on colour and shade are outlined in a guide provided below.

3. Results and discussion

3.1. HIrisPlex genotyping assay – design and sensitivity

The HIrisPlex assay was designed with the intention to cope with low template and degraded DNA, a standard concern when genotyping forensic casework samples. Therefore, care was taken to ensure small PCR amplicon sizes of

Fig. 1. An assessment of the HIrisPlex assay's sensitivity on two individuals ascertained to have high numbers of heterozygote alleles (7 and 11, respectively) for quantified DNA input at 500 pg, 125 pg, 63 pg and 31 pg. Full profiles were observed down to 63 pg DNA input with drop out occurring at 31 pg DNA input for insertion 1 (N29insA), SNP 15 (rs4959270), SNP 17 (rs1042602), SNP 18 (rs1800407), and SNP 23 (rs1393350) including a C allele drop in at SNP 9 (Y152OCH).
sensitive IrisPlex assay may be applied subsequently and may provide a full 6-SNP profile for eye colour prediction on critical DNA samples.

3.2. HIrisPlex model-based hair colour prediction

MC1R polymorphisms are largely recessive when considered individually, but also interact with each other through a genetic mechanism known as "compound heterozygosity" [47–49]. In our previous population-based hair colour prediction study, the MC1R variants Y152OCH, N29insA, rs1805006, rs11547464, rs1805007, rs1805008, rs1805009, rs1805005, rs2228479, rs1110400 and rs885479 were all collapsed into two markers, MC1R-R (R/R, R/wt, wt/wt) and MC1R-r (r/r, r/wt, wt/wt), depending on the penetrance of the mutant alleles. Thus, the total 22 hair colour markers were considered as 13 markers in our previous prediction analysis, namely MC1R-R, MC1R-r, rs1042602 (TYR), rs4959270 (EXOC2), rs28777 (SLC45A2 (MATP)), rs683 (TYRP1), rs2402130 (SLC24A4), rs12821256 (KITLG), rs2378249 (PIGU/ASIP), rs12913832 (HERC2), rs1800407 (OCA2), rs16891982 (SLC45A2 (MATP)) and rs12203592 (IRF4).

In the current study, we had two main reasons for the development of a new hair colour prediction model utilising a 22 DNA variant set without collapsing into MC1R-R and MC1R-r. First, we were able to produce a larger dataset that provides a broader representation of Europe and its highly variable hair colour regions. Notably, we not only increased the sample size relative to our previous study by 3-fold, but in addition to considering more Eastern Europeans from Poland (also used before) we also added individuals from Western Europe, i.e. Ireland, and from Southern Europe, i.e. Greece. These three countries display very different hair colour phenotype frequencies (Table 1), which would also impact on the modelling. The use of samples from three European regions and countries provides an increase in overall sample size and also a better representation of the hair colour phenotype variation across Europe, but this also increases the different genotype combinations observable. Second, some of the MC1R variants also contribute to hair colours other than red (as seen in Table 3). As many individuals from Ireland display a higher frequency of MC1R mutations, up to 75% noted in a previous study with 30% of these being double mutations, and 78% in our own set contain at least one of the MC1R mutations without displaying the red hair phenotype, other Europeans from other regions to which our hair colour prediction tool may be applied in the future may also reflect this. Therefore, a new hair colour prediction model was developed to examine the input of each single DNA variation for hair colour categorical prediction, including the individual impact of all MC1R variants separately.

Fig. 2 shows a hypothesised tree model illustrating how each of the 22 DNA variants contributes towards a categorical hair colour prediction as inferred from our current data. This scenario represents the extreme of a 2 minor allele input for each single DNA variant and the largest single hair colour category effect that is seen on the model's prediction, based on that input. However, it is important to note here (and as further outlined below) that it is the combination of all 22 DNA variants together in a single model that finally allows the prediction of hair colours as we suggest with this study.

Fig. 2. Hypothesised scenario of the effect of each HIrisPlex SNP's minor allele input on the model for hair colour prediction as a homozygous genotype (the minor allele input is 2). The highest effect in terms of probability for a certain hair colour category is noted and the SNP is named near that category within the figure.

Table 3 provides a measure of the strength of each DNA variant's contribution towards each hair colour category prediction using beta values including p-values obtained from the MLR model. The analysis is based on the combined 80% model-building subset of 1243 Polish, Irish and Greek individuals assigned into a red versus non-red colour category, which then displays each DNA variant's contribution towards red hair colour within the model. For the other categories (i.e. blond versus non-blond, brown versus non-brown and black versus non-black), we used a total set of 1134 individuals representing the 80% model-building subset but now omitting the red hair individuals from the analyses due to their rare DNA variants and the fact that red hair is not a continuous colour but more a combined MC1R mutation effect on colour change [47,51]. As the probability values shown suggest, the results for hair colour variation from blond via brown to black (without red) are consistent with our previous findings in several DNA variants, i.e. rs12913832 (HERC2) and rs12203592 (IRF4), with high statistical support (P-values in the range 10⁻⁶ to 10⁻¹⁶) in the present enlarged dataset considering Poland, Ireland and Greece. Although less powerful, additional DNA variants also show significant evidence (p < 0.05) for some hair colours, such as rs2402130 (SLC24A4), rs12821256 (KITLG), rs4959270 (EXOC2), rs1805006 (MC1R), rs1805007 (MC1R) and rs1805008 (MC1R) for blond, rs1805006 (MC1R) and rs2402130 (SLC24A4) for brown, and rs1805007 (MC1R) for black. Red hair colour prediction is observed with highest probability values (P-values in the range 10⁻⁸ to 10⁻¹⁶) for several of the individually considered MC1R variants as expected, i.e. rs1805008, rs1805007, rs1805009 and rs11547464, and with somewhat less statistical strength (p < 0.05) for other MC1R variants, i.e. rs1805005, rs1805006 and rs1110400. However, due to the very low frequency in our set of individuals of the generally rare MC1R variant allele at N29insA (INDEL) and Y152OCH, their contribution towards red hair probabilities are particularly high (Table 3; red
Fig. 3. Illustrative example of the model performance of the HIrisPlex system for hair colour prediction in 44 individuals. The 44-individual set was taken from the Irish collection for which hair imagery was noted as neither grey nor dyed. These individuals are only a visual example of how the model performs. Probability values are given for all four hair-colour categories (black, brown, red and blond), with the highest probability value and the category for which a colour is called highlighted. Dark and light probability values are also indicated, which show the amount of black and blond contribution and effect towards the final colour prediction using the guide in Fig. 4. The individual hair figures are ordered from left to right, top to bottom, starting with the highest probabilities for black to the lowest in column one; for the remaining columns the order is the lowest to the highest probabilities for brown, red and blond, respectively.
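The way per-category probabilities such as those shown in Fig. 3 arise from an MLR model can be illustrated with a minimal sketch. It is not the published HIrisPlex tool: the function names, the choice of black as the reference category, and all alpha and beta values other than the red-hair beta of 22 quoted in the text for N29insA are assumptions made purely for illustration. The sketch converts minor allele counts (0, 1 or 2 per variant) into four category probabilities and then calls the highest-probability category, optionally requiring p > 0.7.

```python
import math

CATEGORIES = ("blond", "brown", "red", "black")


def category_probabilities(minor_allele_counts, alphas, betas, reference="black"):
    """Multinomial-logistic probabilities for the four hair colour categories.

    minor_allele_counts: {variant: 0, 1 or 2}
    alphas: {category: intercept} for every non-reference category
    betas:  {category: {variant: coefficient}} for every non-reference category
    The reference category (an assumption here) has a linear score of zero.
    """
    scores = {}
    for category in CATEGORIES:
        if category == reference:
            scores[category] = 0.0
            continue
        scores[category] = alphas[category] + sum(
            betas[category].get(variant, 0.0) * count
            for variant, count in minor_allele_counts.items()
        )
    # Subtract the largest score before exponentiating for numerical stability.
    largest = max(scores.values())
    weights = {c: math.exp(s - largest) for c, s in scores.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


def call_category(probabilities, threshold=None):
    """Return the highest-probability category, or 'undefined' if a
    threshold is given and the top probability does not exceed it."""
    best = max(probabilities, key=probabilities.get)
    if threshold is not None and probabilities[best] <= threshold:
        return "undefined"
    return best


if __name__ == "__main__":
    # Hypothetical coefficients; only the value 22 for N29insA (red) is quoted in the text.
    alphas = {"blond": 0.0, "brown": 0.0, "red": 0.0}
    betas = {"blond": {}, "brown": {}, "red": {"N29insA": 22.0}}
    probs = category_probabilities({"N29insA": 1}, alphas, betas)
    print(probs, call_category(probs, threshold=0.7))
```

With a beta of this size, a single minor allele drives the red probability to essentially 1, which mirrors the behaviour described for the rare N29insA and Y152OCH alleles.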
hair beta values of 22 and 19.4 respectively), i.e. the presence of an A allele at N29insA or Y152OCH produces red hair prediction probabilities of 1. This effect is not mirrored in the other MC1R variants investigated and reflects the presence of these very rare alleles (heterozygote and homozygote state) within all individuals displaying a red hair phenotype in our model training set at a very low frequency (n = 6). Although this does not affect the final prediction of red hair, it is important to note the abnormally high probability values for red when these rare variants are present. Notably, some DNA variants outside the MC1R gene also show significant red hair colour probabilities (p < 0.05), i.e. rs12913832 (HERC2) and rs2378249 (PIGU/ASIP).

Fig. 3 provides the results of HIrisPlex prediction for a subset of 44 Irish individuals where high-resolution non-dyed hair colour imagery was available to illustrate the model's performance. The individuals' natural hair colour images were ordered according to their predicted hair colour category probability values achieved via HIrisPlex analysis, while the actual hair colour phenotypes were not considered in the ordering. From left to right, top to bottom, the images are ordered from the highest to lowest HIrisPlex prediction probabilities for black hair and then the lowest to highest prediction probabilities for brown, red and blond hair, respectively. As evident, there is a high correlation between the predicted hair colour category from HIrisPlex and the hair colour phenotype observed from visual inspection of these images.

Table 4 shows the accuracy of hair colour prediction in the 20% model-testing subset of the Polish, Irish and Greek individuals (n = 308). It is important to emphasise here that these individuals were not used for model building. The highest probability category approach (as opposed to the prediction-guide approach explained in the next paragraph) considers the colour category with the highest predicted probability as the final predicted colour and does not take other categories into account for the final prediction. Using this approach, we tested various probability thresholds, from no threshold to p > 0.7, which we previously recommended for eye colour prediction using the IrisPlex system [4,19]. As seen in Table 4, using the p > 0.7 (B) threshold increases the percentage of correct calls relative to the value obtained without using any threshold (A) for some hair colours, such as red hair by 10% (i.e. from 89.5% without threshold to 100% with threshold) and blond by 6.5% (i.e. from 57.2% to 63.6%), whereas no difference was seen for brown hair at 75%, and for black we saw a decrease by 8.5% (i.e. from 28.6% to 20%). The low prediction accuracy obtained with this approach for black hair may reflect the difficulty of defining the true black hair colour phenotype relative to the dark brown phenotype within this European dataset, where black hair is rare. Notably, the low correct call rate of 28.6% for black (without using a threshold) is mainly caused by 30 individuals with non-black self-reported phenotypes that were predicted as black by the HIrisPlex model. Of these, almost all (i.e. 90%) had the brown–dark brown phenotype. We could speculate that at least some of them may have been self-categorised as black if black hair colour were more frequent in the sampled populations and therefore easier to differentiate from dark brown in the phenotyping procedure. Although red hair is also rare in the European population (albeit enriched in our Polish dataset), this problem is less expected for red hair, as red is usually well differentiable from other hair colours, perhaps with the exception of the blond-red individuals. The prediction accuracy for blond hair, being lower than those for red and brown hair colour with and without threshold, is partly due to another phenomenon that will be discussed in detail in Section 3.3: age-dependent hair colour changes. As brown hair is the intermediary stage between blond and black, no prediction threshold for this category is required, as can be seen in Table 4. Even at the 75% correct call rate, 5 of the 8 incorrectly predicted individuals defined themselves as being dark blond. Since we know an overlap exists between light-brown and dark-blond in people's perception and definition of colour, it is best to consider dark blond the same

Table 4. HIrisPlex hair colour prediction accuracies obtained from a separate model-testing set of 308 individuals from Poland, Ireland and Greece (these individuals were not considered for prediction model building, for which a different set of 1243 individuals was used) using two approaches: the highest probability category approach (with and without thresholds) and the prediction guide approach (see Fig. 4 for the prediction guide).
(A) Highest probability category approach, no threshold p value

                       Observed categories
                       Red          Blond         Brown         Black         Total prediction (n)
Red predicted          17 (89.5%)   1 (5.3%)      1 (5.3%)      0 (0%)        19 (100%)
Blond predicted        8 (3.7%)     123 (57.2%)   68 (31.6%)    16 (7.4%)     215 (100%)
Brown predicted        2 (6.3%)     5 (15.6%)     24 (75%)      1 (3.1%)      32 (100%)
Black predicted        1 (2.4%)     2 (4.8%)      27 (64.3%)    12 (28.6%)    42 (100%)
Total phenotype (n)    28           131           120           29            308

(B) Highest probability category approach, >0.7 p threshold

                       Observed categories
                       Red          Blond         Brown         Black         Total prediction (n)
Red predicted          8 (100%)     0 (0%)        0 (0%)        0 (0%)        8 (100%)
Blond predicted        2 (1.8%)     70 (63.6%)    33 (30%)      5 (4.6%)      110 (100%)
Brown predicted        0 (0%)       0 (0%)        3 (75%)       1 (25%)       4 (100%)
Black predicted        0 (0%)       0 (0%)        4 (80%)       1 (20%)       5 (100%)
Total phenotype (n)    10           70            40            7             127
Undefined              18 (64.3%)   61 (46.6%)    80 (66.7%)    22 (75.9%)    181 (58.8%)

(C) Prediction guide approach

                             Observed categories
                             Red         Blond        d-blond/l-brown   D-Brown      Black        Total prediction (n)
Red predicted                24 (80%)    0            6 (20%)           0 (0%)       0 (0%)       30 (100%)
Blond/d-blond predicted      2 (1.7%)    32 (26.4%)   52 (43%)          30 (24.8%)   5 (4.1%)     121 (100%)
L-brown/d-brown predicted    2 (1.3%)    9 (6%)       59 (39.6%)        58 (38.9%)   21 (14.1%)   149 (100%)
Black/d-brown predicted      0           0            1 (12.5%)         4 (50%)      3 (37.5%)    8 (100%)
Total phenotype (n)          28          41           118               92           29           308
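The percentages in Table 4 and the counts of missed individuals quoted in the text can be re-derived from the raw counts. The following is a minimal sketch using only the no-threshold counts of the highest probability category approach as transcribed above; the variable and function names are hypothetical. Row percentages give the correct call rate per predicted category, and column totals minus the diagonal give the number of individuals of each observed phenotype that were missed.

```python
# Counts from Table 4, highest probability category approach, no threshold.
# Rows are predicted categories; columns follow the OBSERVED order.
OBSERVED = ("red", "blond", "brown", "black")
COUNTS = {
    "red": (17, 1, 1, 0),
    "blond": (8, 123, 68, 16),
    "brown": (2, 5, 24, 1),
    "black": (1, 2, 27, 12),
}


def correct_call_rate(predicted):
    """Percentage of calls for a predicted category that match the phenotype."""
    row = COUNTS[predicted]
    return 100.0 * row[OBSERVED.index(predicted)] / sum(row)


def missed(observed):
    """Number and percentage of individuals with this observed phenotype
    whose highest-probability category was a different colour."""
    column = OBSERVED.index(observed)
    total = sum(row[column] for row in COUNTS.values())
    n_missed = total - COUNTS[observed][column]
    return n_missed, 100.0 * n_missed / total


if __name__ == "__main__":
    for colour in OBSERVED:
        n, pct = missed(colour)
        print(f"{colour}: correct calls {correct_call_rate(colour):.1f}%, "
              f"missed {n} ({pct:.1f}%)")
```

The two measures answer different questions: the correct call rate is conditional on the prediction, whereas the missed fraction is conditional on the true phenotype, which is why red hair can show 89.5% correct calls while 11 of 28 red-haired individuals are still missed.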
colour as light brown. Therefore, brown hair colour may also be seen at black and blond category predictions [...] 0.7 p threshold) or nearly all (89.5% without threshold) individuals for which the red hair category was the highest prediction probability were correctly predicted, as seen in Table 4. Notably, the two individuals that were incorrectly predicted red without using a threshold defined themselves as blond and brown, respectively; upon inspection of a hair image of the latter individual that was available to us, it did in fact display light red hints of colour. This reflects another example of how the phenotyping procedure, particularly self-reported hair colour grading as done in our Irish and Greek datasets, influences DNA prediction accuracy. However, it is important to point out here that for 11 (39%) individuals that had defined themselves as having red hair, the red hair probability was not the highest, relative to probabilities for non-red hair colour, and these individuals were therefore missed by HIrisPlex using this highest-probability approach. Furthermore, for 8 (6%) of the phenotypic blond, 96 (80%) of the phenotypic brown, and 17 (59%) of the phenotypic black hair individuals, the highest predicted hair colour category did not correspond to the phenotypic hair colour category [...] highest-probability approach that we aimed to overcome by [...] in combination with light/dark hair colour shade probabilities as obtained from the HIrisPlex genotype data (Fig. 4, see also Section 3.5 for additional practical recommendations).

The reason for considering light/dark shade prediction in addition to categorical hair colour prediction in the final approach is that the 22 DNA variants not only impact on the main hair colour categories, but also on more detailed hair colour information, which is difficult to measure; hence, we express it as light/dark prediction probabilities. For this, we took the individuals from the black category, now termed dark, and the individuals from the blond category, now termed light, and designed an additional prediction model for light and dark colour shade. Therefore, the HIrisPlex genotype input finally provides the core prediction colour category with an added level or shade, i.e. light or dark. This part of the prediction should be useful as additional information to the initial prediction category, e.g. to differentiate light blond from dark blond (light brown), or light brown from dark brown/black. It becomes particularly beneficial in the lower hair colour category prediction probability levels (i.e. category prediction
genotype combinations. A >0.9 threshold is used for light versus dark shade prediction. As seen in Table 4(C), using the prediction guide approach the correct call percentages were considerably higher for all hair colours than using the highest probability category approach, except for red hair. In fact, using the prediction guide approach we obtained on average 69.5% correct calls for blond, 78.5% for brown and 87.5% for black. Black hair prediction in particular was strongly improved by using the prediction guide approach, with an increase of almost 60% on average relative to the highest probability category approach without a threshold. For an explanation of why blond is the least accurately predictable hair colour with currently available DNA markers, also after applying the prediction guide, see Section 3.3. Although we saw an apparent decrease of accurate prediction for red hair with the prediction guide approach (80% versus 89.5% with the highest-probability approach without threshold), this can be explained by the total number of red predictions made by the models and whether they were correct or not. In particular, for the highest probability approach the model was incorrect at predicting red only 2 times but missed 11 actual reds from our dataset. The prediction guide approach, although inaccurate for red hair prediction for 6 individuals, managed to predict 24 out of the 28 actual red hair phenotypes from our test set. In summary, the numbers of individuals in our 308 model-test set that were missed by HIrisPlex hair colour prediction using the prediction guide approach were 4 (14%) of the phenotypic red, 8 (19.5%) of the phenotypic blond, 7 (6%) of the phenotypic d-blond/l-brown, 28 (31%) of the phenotypic d-brown and 26 (90%) of the phenotypic black, with an overall hair colour prediction accuracy of 76%. All are considerably less than what was missed when applying the highest probability category approach, apart from black hair, where we believe phenotyping inaccuracy/perception of colour plays a role as discussed above, as 21 of those individuals were predicted as having d-brown hair and may have in fact displayed d-brown hair that was perceived as black within Europe. We therefore recommend using the prediction guide approach for properly interpreting HIrisPlex genotype data and the probability values derived from our prediction tool to infer the most likely hair colour phenotype in future practical applications.

There are several important differences between eye and hair colour, both on the phenotypic as well as the genotypic levels, that may play a role in why some eye colours (i.e. blue and brown) appear to be currently predictable from DNA with higher accuracy than some hair colours (i.e. all non-red hair colours). The SNP rs12913832 from the HERC2 gene plays a major role in the functional aspects of iris pigmentation [10,52], and its proposed model of action reflects a type of on/off switch from the absence of the T allele (and the homozygous presence of the C-allele) resulting in blue eye colour, to the presence of one or two T allele(s) reflecting brown eye colour. Indeed, it has been shown recently via a series of functional genetic experiments that the rs12913832 T-allele leads to binding of several transcription factors and a chromatin loop with the promoter of the neighbouring pigmentation gene OCA2, leading to elevated OCA2 expression and dark pigmentation. In contrast, when the rs12913832 C-allele is present, transcription factor binding, loop formation and OCA2 expression are all reduced, leading to light pigmentation. Because of its strong functional involvement, HERC2 rs12913832 shows the strongest predictive power on categorical eye colour, with an AUC of 0.877 for blue and 0.899 for brown for this SNP alone, and it also shows a strong impact on quantitative eye colour variance, explaining on average 46% of the H and S spectrum. When comparing this to its impact on hair colour, the percentage of residual variation from black to blond explained by HERC2 rs12913832 was 10.7% in a previous study. However, the effect of rs12913832 is considerably less on hair colour than it is on eye colour for reasons yet to be unveiled, and there are no other high-impact hair colour SNPs that take its place. For instance, in our full dataset using 1551 individuals, the correlation of rs12913832 with eye colour is nearly twice as high (Pearson correlation r² = 0.46, p = 2.2e−16) as its correlation with hair colour (Pearson correlation r² = 0.24, p = 2.2e−16). Furthermore, the colour distribution of European hair appears much wider than that of European eyes, requiring the combination of several similar gene effects. Thus, categorical hair colour prediction is expected to be more error-prone, especially when involving factors such as shade and intensity, at least with the DNA markers known thus far. Additional effects such as environmental contributions, particularly lifetime effects that are much stronger on certain hair colours than they are on all eye colours, also influence hair colour prediction accuracy more so than eye colour prediction accuracy and will be discussed in the following section (see Section 3.3).

3.3. Age-dependent hair colour changes and consequences for hair colour prediction

Age-dependent changes in hair colour are evident from anecdotal knowledge. The most often observed age-dependent hair colour change occurs from light blond during childhood towards dark blond/light brown as an adult, but change can also occur from light brown to dark brown/almost black. Hormonal changes during adolescence have been suggested as a possible explanation, but the molecular basis is yet to be unveiled. In order to study the effect of age-dependent hair colour change on hair colour prediction from child to adulthood, we recorded via questionnaires in the Irish sample set hair colour during childhood and adulthood separately, including the approximate age of the hair colour change. Of the 339 Irish individuals, 157 had current images in which the hair was not dyed and not grey, and from these the 8 individuals that were classified as blond in adulthood were 100% correctly predicted by the HIrisPlex system following the prediction guide approach. However, for 14 individuals with light brown to black phenotypes the HIrisPlex model faltered and gave a high blond prediction probability (>0.7 p) with high light shade probabilities (>0.9 p). On further examination of these incorrectly predicted individuals, 8 (57%) of them noted that a darkening of hair colour from blond to brown had occurred in their younger lives at ages ranging from 9 to 12 years. Furthermore, we found a high and statistically significant correlation (Pearson correlation r² = 0.81, p < 0.01) between the increase in brown (darkening of hair) and the increase in age since the hair colour change occurred for those Irish individuals for whom such data were available to us (Fig. 7), which substantiates that the hair colour change observed is age dependent in these individuals. From these data we can see that our current HIrisPlex system works to a high degree of accuracy for hair colour prediction, but there may be processes that alter the hair colour over an individual's lifetime (possibly molecular processes) without changing the HIrisPlex predicted hair colour of the individual. For instance, an adult that had blond hair as a young child, but now displays light–dark brown/black hair colour, is likely to display blond HIrisPlex genotypes and therefore a blond hair colour prediction will be obtained. This is due to the fact that the hair colour SNPs included in the HIrisPlex system, as well as any additional hair colour associated DNA variant available today, were identified in studies dealing with adults, and not in studies that particularly searched for bio-markers informative for the age-dependent hair colour change, which are yet to be carried out. It is important to note therefore that the HIrisPlex model cannot distinguish between these change-affected individuals and blonds who remain blond from childhood to adulthood, and thus a HIrisPlex prediction of blond hair may be inaccurate to a certain
degree (30% (Supplementary Table 3) in our dataset). This limitation in DNA-based hair colour prediction will remain as long as bio-markers informative for indicating age-dependent hair colour changes are not identified. Furthermore, this age-dependent study was conducted using images solely taken from a small Irish set (childhood hair colour was not available for the Polish and the Greek sets); it is worth mentioning that this may reflect a trend in other countries within Europe, but we do not have this information at present. Therefore, more samples and increased accuracy testing of the HIrisPlex system on a broader collection around Europe would be advantageous to get a better measure of this phenomenon. Furthermore, efforts should be directed towards finding the processes/genes responsible for age-dependent hair colour changes and developing respective bio-markers that may increase hair colour prediction accuracy in the future.

A different aspect of age-dependent hair colour change is the loss of hair colour when turning grey and white at a more or less advanced age, which likely represents a different mechanism of action than changing from one hair colour to another. We examined the Irish population of 339 individuals for which we had questionnaire information on the age at which grey or white hairs had started to grow. As shown in Supplementary Fig. 1, after the age of 30 there are more individuals starting to produce grey or white hairs relative to those who do not, confirming anecdotal knowledge. However, we have no data on how long it will take for those individuals who started to have grey hairs to turn grey to a substantially obvious phenotypic degree. For practical considerations, knowing the natural hair colour during youth of an individual who now, at more advanced age, displays an obvious grey or white hair phenotype will not be directly useful in an investigative search, but this information can still be useful, albeit less strongly, when asked for natural hair colour prior to greying in these questionable individuals during a police inquiry. For differentiating whether a crime scene sample donor still had his/her natural hair colour, or had perhaps turned grey or white already, a molecular age estimation performed on crime scene samples such as blood would be useful in combination with the HIrisPlex application. Previously, our group developed a DNA test for chronological age, which allows age-group estimation on an accurate level. Obviously, any dyed hair colour, as long as it produces a hair colour different from the natural hair colour category, would not be identifiable with HIrisPlex or any other DNA-based hair colour prediction tool. However, in general it is believed that many people who dye their hair as a result of hair greying, and with the intention of hiding the fact that their hair has greyed, try to achieve their natural hair colour category via dyeing, especially in the case of men, to avoid the stigma associated with hair colouring. In such cases HIrisPlex hair colour prediction can still be useful even though the hair is dyed.

3.4. HIrisPlex analysis on a worldwide scale

Because the HIrisPlex hair prediction model was created using individuals solely from Europe, as it should be for a European trait, to verify its use outside of Europe we performed HIrisPlex analysis on worldwide DNA samples from the H952 subset of the HGDP-CEPH panel that represents 952 individuals from 51 populations [36,37]. Due to lack of DNA in some samples, a final number of 945 worldwide samples were used. Fig. 5 displays the prediction of the four hair-colour categories blond, brown, red and black on a worldwide scale. This figure does not use any threshold parameters and therefore it is worth noting that the prediction levels of blond hair in Europe (especially with probability values