  • Review Article

Genotyping errors: causes, consequences and solutions

Key Points

  • Although genotyping errors affect most data and can markedly influence the biological conclusions of a study, they are too often neglected.

  • Genotyping errors can result from very diverse, complex, and sometimes cryptic origins and are linked to the primary DNA sequence itself, the low quality or quantity of the DNA sample, biochemical artefacts or human factors.

  • Although several estimates of genotyping error rates are commonly used, the error rate per locus is considered to be the most universal metric, as it allows comparisons to be made between studies and different types of markers.

  • Even low rates of genotyping error can markedly affect linkage and association studies, individual identification and population genetic studies.

  • The optimal strategies to limit the occurrence and the impact of genotyping errors are case-specific and will be determined by several factors (for example, biological question, tolerable error rate, equipment and technical skills locally available).

  • General recommendations are provided to help researchers build their own procedure for dealing with genotyping errors: limiting the production of errors during genotyping, cleaning the dataset after genotyping, and analysing the data in a way that takes the errors into account.

  • Providing information about the methods for error detection and error rate estimation in published work would make it possible to assign a quality index to each genotype, and would allow the scientific community to critically assess unexpected results.

Abstract

Although genotyping errors affect most data and can markedly influence the biological conclusions of a study, they are too often neglected. Errors have various causes, but their occurrence and effect can be limited by considering these causes in the production and analysis of the data. Procedures that have been developed for dealing with errors in linkage studies, forensic analyses and non-invasive genotyping should be applied more broadly to any genetic study. We propose a protocol for estimating error rates and recommend that these measures be systematically reported to attest the reliability of published genotyping studies.

Figure 1: The recent increase in the number of papers that deal with genotyping errors.
Figure 2: Flow chart that shows the important steps in a genotyping process for limiting the occurrence and effect of genotyping errors.
Figure 3: The use of blind replicates to estimate the error rate.
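
As a rough illustration of the idea behind Figure 3, an error rate per locus can be estimated from blind replicates by comparing two independent genotyping runs of the same individual. The sketch below assumes the estimator is the number of replicated single-locus genotypes showing at least one allelic mismatch, divided by the number of replicated single-locus genotypes. The function name, data layout and example genotypes are hypothetical, and loci that were typed only once are simply skipped.

```python
from typing import Dict, Optional, Tuple

Genotype = Optional[Tuple[str, str]]  # None marks a missing genotype


def error_rate_per_locus(original: Dict[str, Genotype],
                         replicate: Dict[str, Genotype]) -> float:
    """Fraction of replicated single-locus genotypes with at least one allelic mismatch."""
    compared = 0
    mismatched = 0
    for locus, first in original.items():
        second = replicate.get(locus)
        if first is None or second is None:
            continue  # a locus typed only once cannot be compared
        compared += 1
        # Allele order within a genotype carries no meaning, so sort before comparing.
        if sorted(first) != sorted(second):
            mismatched += 1
    if compared == 0:
        raise ValueError("no replicated genotypes to compare")
    return mismatched / compared


# Hypothetical genotypes for one individual typed twice at four loci.
first_run = {"L1": ("120", "124"), "L2": ("98", "98"),
             "L3": ("150", "152"), "L4": None}
second_run = {"L1": ("124", "120"), "L2": ("98", "100"),
              "L3": ("150", "152"), "L4": ("77", "77")}
rate = error_rate_per_locus(first_run, second_run)
```

Here three loci are comparable and one of them (L2) disagrees between runs. In practice the counts would be pooled over many replicated individuals before dividing.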


Acknowledgements

The authors are grateful to G. Luikart for fruitful discussions and comments on the manuscript and to the persons from the Scandinavian Brown Bear Research Project who provided the bear samples. They thank three anonymous reviewers for providing references and helpful comments.

Author information

Corresponding author

Correspondence to François Pompanon.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Related links

FURTHER INFORMATION

An alphabetical list of genetic analysis software

DNA Advisory Board Quality Assurance Standards for Forensic DNA Testing Laboratories

European Molecular Genetics Quality Network

ISI Web of Science

Minimum Information About a Microarray Experiment

PARENTE

Programs useful for detecting genotyping and pedigree errors

UCLA Human Genetics Software Distribution

Glossary

PATERNITY EXCLUSION

The elimination of a male as the potential father of a given offspring, owing to incompatibility between the multilocus genotypes of the two individuals concerned.
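
The definition above can be sketched as a simple compatibility check. This is a minimal illustration, assuming diploid codominant markers and ignoring the mother's genotype: a locus counts as incompatible when the candidate father shares no allele with the offspring. The function names and the `tolerated_mismatches` parameter are hypothetical; the parameter is included only to show how a single genotyping error could otherwise cause a false exclusion.

```python
from typing import Dict, List, Tuple

Genotype = Tuple[str, str]


def incompatible_loci(offspring: Dict[str, Genotype],
                      candidate: Dict[str, Genotype]) -> List[str]:
    """Loci at which the candidate father shares no allele with the offspring."""
    incompatible = []
    for locus, child_alleles in offspring.items():
        if locus not in candidate:
            continue  # locus not typed in the candidate: no evidence either way
        if not set(child_alleles) & set(candidate[locus]):
            incompatible.append(locus)
    return incompatible


def is_excluded(offspring: Dict[str, Genotype],
                candidate: Dict[str, Genotype],
                tolerated_mismatches: int = 0) -> bool:
    """Exclude the candidate when incompatible loci exceed the tolerated number."""
    return len(incompatible_loci(offspring, candidate)) > tolerated_mismatches


# Hypothetical two-locus genotypes.
child = {"A": ("120", "124"), "B": ("98", "98")}
male = {"A": ("124", "130"), "B": ("100", "102")}
mismatches = incompatible_loci(child, male)
```

With strict exclusion (`tolerated_mismatches=0`) this male is excluded because of locus B alone; tolerating one mismatch keeps him as a candidate.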

NON-INVASIVE GENOTYPING

Genotyping from samples that are collected without capturing the animal (such as hair or faeces).

AMPLIFIED FRAGMENT-LENGTH POLYMORPHISMS

A PCR-based DNA fingerprinting technique that reveals polymorphisms in restriction-enzyme recognition sites by generating dozens of dominant marker bands.

MICROSATELLITE

A class of repetitive DNA that is made up of repeats that are 2–8 nucleotides in length. They can be highly polymorphic and are frequently used as molecular markers in population genetics studies.

SIZE HOMOPLASY

The occurrence of alleles of the same size that are not the result of common ancestry (not homologous), but that arose independently in different ancestors by parallel or convergent mutations.

ALLELIC DROPOUTS

The stochastic non-amplification of an allele; that is, amplification of only one of the two alleles present at a heterozygous locus.

FALSE ALLELE

An allele-like artefact that is generated by PCR.

ALLELE CALLING

The determination of an allele from an electropherogram or a fluorescent profile.

REPLICATED GENOTYPES

Genotypes that are produced from different (preferentially independent) samples from the same individual.

STUTTER BANDS

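Artefactual bands produced during the PCR amplification of microsatellites, typically by polymerase slippage, that differ in size from the true allele by one or more repeat units.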

HAPLOTYPE

The combination of alleles found at neighbouring loci on a single chromosome or haploid DNA molecule.

FST ESTIMATES

Statistics that were first defined by Sewall Wright to describe the genetic structure at different hierarchical levels (individuals, subpopulations and total populations).

POPULATION BOTTLENECK

A marked reduction in population size that often results in the loss of genetic variation and more frequent matings among closely related individuals.

HARDY–WEINBERG TEST

A test that assesses whether the frequency of each diploid genotype at a locus equals that expected from the random union of alleles.
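
One common form of this test, for a single locus with two alleles, is a chi-square goodness-of-fit comparison of observed and expected genotype counts. The sketch below is an illustration under that assumption, not necessarily the procedure used in the studies discussed here. The function name is hypothetical, the p-value uses one degree of freedom, and the chi-square approximation becomes unreliable when expected counts are small.

```python
import math
from typing import Tuple


def hardy_weinberg_chi2(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Chi-square statistic and p-value (1 d.f.) for a biallelic locus."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    # Genotype counts expected from the random union of alleles.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts showing a heterozygote deficit, as null alleles
# or allelic dropout might produce.
chi2_deficit, p_deficit = hardy_weinberg_chi2(40, 20, 40)
```

A significant heterozygote deficit is only a hint of genotyping error; population structure or inbreeding can produce the same signal.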

MAXIMUM LIKELIHOOD APPROACH

A statistical approach that is used to make inferences about the combination of parameter values that gives the greatest probability of obtaining the observed data.

POPULATION ADMIXTURE

A process that leads to a composite gene pool in which at least some individuals come from more than one population.

LIKELIHOOD RATIO TEST

A method for hypothesis testing. The maximum likelihood of the data under a full model is compared with the maximum likelihood under a restricted model, and the likelihood ratio (LR) test statistic is computed. If the LR is significant, the full model provides a better fit to the data than does the restricted model.
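
A small worked case may make the definition concrete. The sketch below assumes a binomial setting chosen purely for illustration: `k` mismatching genotypes are observed among `n` replicated genotypes, the restricted model fixes the error rate at a hypothetical value `p0`, and the full model lets the rate take its maximum likelihood value `k / n`. The statistic is twice the difference in maximised log-likelihoods and is compared with a chi-square distribution with one degree of freedom, which is an asymptotic approximation.

```python
import math
from typing import Tuple


def _xlogy(x: float, y: float) -> float:
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def binomial_lrt(k: int, n: int, p0: float) -> Tuple[float, float]:
    """LR statistic and p-value for 'rate = p0' (restricted) versus a free rate (full)."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n and n > 0")
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")
    p_hat = k / n  # maximum likelihood estimate under the full model
    # The binomial coefficient is identical in both models and cancels out.
    ll_full = _xlogy(k, p_hat) + _xlogy(n - k, 1.0 - p_hat)
    ll_restricted = _xlogy(k, p0) + _xlogy(n - k, 1.0 - p0)
    lr = max(0.0, 2.0 * (ll_full - ll_restricted))
    p_value = math.erfc(math.sqrt(lr / 2))  # chi-square survival function, 1 d.f.
    return lr, p_value


# Hypothetical data: 20 mismatches in 100 replicates, tested against a 5% rate.
lr_stat, lr_p = binomial_lrt(20, 100, 0.05)
```

In this example the free-rate model fits far better than the fixed 5% rate, so the restricted model would be rejected.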

SHORT-ALLELE DOMINANCE

The preferential PCR amplification of the shorter allele in a heterozygous individual. This is equivalent to a long-allele dropout.

PROBABILITY OF IDENTITY

The overall probability that two individuals drawn at random from a given population share identical genotypes at all typed loci.
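
For illustration, a widely used expression for this quantity at one locus is the sum of the squared expected genotype frequencies, that is, the sum of p_i^4 over alleles plus the sum of (2 p_i p_j)^2 over pairs of distinct alleles. The sketch below assumes Hardy–Weinberg proportions, unrelated individuals and independent loci, so that the multilocus value is the product of the single-locus values. The function names and allele frequencies are hypothetical.

```python
from itertools import combinations
from math import prod
from typing import Sequence


def locus_pi(freqs: Sequence[float]) -> float:
    """Probability that two random individuals share a genotype at one locus."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


def multilocus_pi(loci: Sequence[Sequence[float]]) -> float:
    """Product of single-locus values, assuming the loci are independent."""
    return prod(locus_pi(freqs) for freqs in loci)


# Two hypothetical loci, each with two equally frequent alleles.
pi_one = locus_pi([0.5, 0.5])
pi_two = multilocus_pi([[0.5, 0.5], [0.5, 0.5]])
```

The value drops quickly as loci are added, which is why individual identification relies on several polymorphic markers; related individuals in the sample would make the true probability higher than this estimate.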

EFFECTIVE POPULATION SIZE

The size of the ideal population in which the effects of random drift would be the same as those seen in the actual population.

DIRECTED-ERROR MODEL

A model postulating that there is a greater probability for a particular allele to be consistently incorrectly genotyped.

About this article

Cite this article

Pompanon, F., Bonin, A., Bellemain, E. et al. Genotyping errors: causes, consequences and solutions. Nat Rev Genet 6, 847–859 (2005). https://doi.org/10.1038/nrg1707

  • DOI: https://doi.org/10.1038/nrg1707
