The utility of single nucleotide polymorphisms in inferences of population history

https://doi.org/10.1016/S0169-5347(03)00018-1

Abstract

Single nucleotide polymorphisms (SNPs) represent the most widespread type of sequence variation in genomes, yet they have only emerged recently as valuable genetic markers for revealing the evolutionary history of populations. Their occurrence throughout the genome also makes them ideal for analyses of speciation and historical demography, especially in light of recent theory suggesting that many unlinked nuclear loci are needed to estimate population genetic parameters with statistical confidence. In spite of having lower variation compared with microsatellites, SNPs should make the comparison of genomic diversities and histories of different species (the core goal of comparative biogeography) more straightforward than has been possible with microsatellites. The most pervasive, but correctable, complication to SNP analysis is a bias towards analyzing only the most variable loci, an artifact that is usually introduced by the limited number of individuals used to screen initially for polymorphisms. Although the use of SNPs as markers in population studies is still new, innovative methods for SNP identification, automated screening, haplotype inference and statistical analysis might quickly make SNPs the marker of choice.

Application of SNPs in studies of population history

It is only recently that a conceptual framework for the population genetic analysis of SNPs, founded appropriately on coalescent theory, has been developed [9,11]. Perhaps the most important nuance to SNP analysis is the need to correct for the ascertainment bias [33] that arises as a by-product of how the SNPs are identified and/or screened [11]. Before starting a SNP study of population history, the researcher must make two decisions that will determine the necessary ascertainment correction.
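To illustrate why a small discovery panel favours the most variable loci, the following is a minimal sketch, not the correction developed in the cited work. It assumes a bi-allelic site and a panel of chromosomes drawn at random from the same population; under those assumptions a site with allele frequency p appears polymorphic in a panel of n chromosomes with probability 1 − pⁿ − (1 − p)ⁿ. The function names `discovery_probability` and `ascertained_spectrum` are hypothetical and chosen for this example only.

```python
def discovery_probability(p: float, panel_size: int) -> float:
    """Probability that a bi-allelic site with population allele frequency p
    shows both alleles in a random panel of `panel_size` chromosomes.

    Assumption (not from the article): the panel is a random sample from the
    same population in which p is defined.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if panel_size < 1:
        raise ValueError("panel_size must be at least 1")
    # The site is missed only if every sampled chromosome carries the same allele.
    return 1.0 - p ** panel_size - (1.0 - p) ** panel_size


def ascertained_spectrum(freqs, weights, panel_size):
    """Reweight a discrete allele-frequency distribution by discovery probability.

    `freqs` are allele frequencies, `weights` their proportions among all sites.
    Returns the proportions expected among sites that pass discovery.
    """
    raw = [w * discovery_probability(p, panel_size) for p, w in zip(freqs, weights)]
    total = sum(raw)
    if total == 0.0:
        raise ValueError("no site can be discovered with this panel")
    return [r / total for r in raw]


if __name__ == "__main__":
    # Rare variants are far less likely to be discovered than common ones.
    for p in (0.01, 0.1, 0.5):
        print(p, discovery_probability(p, panel_size=4))
```

In this sketch, common variants are over-represented among discovered sites relative to their share in the genome, which is the skew that an ascertainment correction has to undo.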

Mutation pattern

Unlike microsatellites, which have mutation rates per generation of the order of 10⁻⁴, SNPs have relatively low mutation rates (10⁻⁸–10⁻⁹). Multiple mutations at a single site are thus unlikely, and so most SNPs are bi-allelic, a quality that facilitates high-throughput genotyping and minimizes recurrent substitutions at a single site that would confound the population history. The restriction to four character states might make SNPs less informative than microsatellites for parentage analyses
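The contrast between the two mutation rates can be made concrete with a rough calculation. The sketch below assumes that mutations fall on a genealogy as a Poisson process with mean equal to the mutation rate times the total branch length in generations; this model and the branch length of 10⁵ generations are illustrative assumptions, not values from the article. The function name `prob_recurrent_mutation` is hypothetical.

```python
import math


def prob_recurrent_mutation(mu: float, total_branch_length: float) -> float:
    """Probability of two or more mutations at one site (or locus) on a genealogy.

    Assumes a Poisson number of mutations with mean mu * total_branch_length,
    where mu is the per-generation mutation rate and total_branch_length is
    the summed length of all branches in generations (illustrative model).
    """
    if mu < 0.0 or total_branch_length < 0.0:
        raise ValueError("mu and total_branch_length must be non-negative")
    lam = mu * total_branch_length
    # P(X >= 2) = 1 - exp(-lam) - lam * exp(-lam); expm1 limits rounding error
    # when lam is very small.
    return -math.expm1(-lam) - lam * math.exp(-lam)


if __name__ == "__main__":
    branch_length = 1e5  # hypothetical total tree length in generations
    print("microsatellite:", prob_recurrent_mutation(1e-4, branch_length))
    print("SNP:", prob_recurrent_mutation(1e-8, branch_length))
```

With these assumed inputs, recurrent mutation is close to certain at a microsatellite locus but has a probability on the order of 10⁻⁷ at a single nucleotide site, consistent with most SNPs being bi-allelic.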

Recombination

A complication of an approach that assays variation at many loci across the genome is that recombination cannot be ignored, as it is in mitochondrial and human Y chromosome studies. Recombination influences both the interpretation and the sampling strategy of SNP variation. Nachman [46] has reviewed the dramatic effects of recombination on the level of SNP variation in humans, in which SNP variation is low in regions of low recombination, and high in regions of high recombination. The same

Conclusion

SNPs have the potential to place historical demography and speciation studies on a common molecular framework, one that is easily comparable to the decades of mtDNA work already undertaken [77]. Their simplicity, ease of modeling and sheer abundance will make them powerful contributors to the new era of using multiple biparentally inherited, potentially recombining loci to infer population histories. The challenge for evolutionary biologists will be to harness and assay variation at large

Acknowledgements

For providing helpful comments, we thank A. Di Rienzo, M. Hare, H.L. Gibbs, B. Jennings, C. Moritz, M. Nachman, R. Nielsen, P. Palsbøll, M. Slatkin, J. Wakeley and two anonymous reviewers. We thank L. Knowles for providing us with a prepublication copy of her manuscript. Work on this article was made possible in part by support from National Science Foundation grants DBI-9974235 (to R.T.B.), DEB 0108249 (to S.V.E. and P.B.), DEB 0129487 (to S.V.E.) and DEB9815650 (to J. Felsenstein); and

Glossary

Ascertainment bias:
bias introduced into an analysis because of arbitrary decisions made during data sampling. In SNP or microsatellite studies, ascertainment bias can arise if only the most variable loci are analysed or if only a small panel of individuals is used to discover variation.
Coalescent theory:
population genetic framework that allows one to calculate the probability of obtaining a given genealogical structure for many contemporary samples under many different population genetic models.
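As a small illustration of the coalescent entry, the sketch below simulates the time to the most recent common ancestor (TMRCA) of n sampled lineages under the standard neutral coalescent for a single constant-size population. Time is in coalescent units, scaled so that any pair of lineages coalesces at rate 1; the function names are hypothetical, and this simple model is an assumption for illustration rather than one of the specific models analysed in the article.

```python
import random


def expected_tmrca(n: int) -> float:
    """Expected TMRCA of n lineages under the standard coalescent,
    in units where each pair of lineages coalesces at rate 1."""
    if n < 2:
        raise ValueError("need at least two lineages")
    # Sum over k of 1 / C(k, 2) telescopes to 2 * (1 - 1/n).
    return 2.0 * (1.0 - 1.0 / n)


def simulate_tmrca(n: int, rng: random.Random) -> float:
    """Draw one TMRCA by summing exponential waiting times between coalescences."""
    if n < 2:
        raise ValueError("need at least two lineages")
    t = 0.0
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2.0  # number of lineage pairs while k lineages remain
        t += rng.expovariate(rate)
    return t


def mean_simulated_tmrca(n: int, replicates: int, seed: int = 0) -> float:
    """Average TMRCA over independent simulated genealogies."""
    rng = random.Random(seed)
    return sum(simulate_tmrca(n, rng) for _ in range(replicates)) / replicates


if __name__ == "__main__":
    n = 10
    print("expected:", expected_tmrca(n))
    print("simulated mean:", mean_simulated_tmrca(n, replicates=20000, seed=1))
```

Individual simulated genealogies vary widely around the expectation, which is one way to see why many unlinked loci are needed to estimate population parameters with confidence.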

References (78)

  • M. Stephens. A new statistical method for haplotype reconstruction from population data. Am. J. Hum. Genet. (2001)
  • M. Przeworski. Adjusting the focus on human variation. Trends Genet. (2000)
  • J.C. Avise. Molecular Markers, Natural History and Evolution. (1994)
  • L.L. Knowles et al. Statistical phylogeography. Mol. Ecol. (2002)
  • N. Takahata et al. Pre-speciation coalescence and the effective size of ancestral populations.
  • S.V. Edwards et al. Perspective: gene divergence, population divergence, and the variance in coalescence time in phylogeographic studies. Evolution (2000)
  • N. Rosenberg et al. Genealogical trees, coalescent theory and the analysis of genetic polymorphisms. Nat. Rev. Genet. (2002)
  • J. Wakeley et al. Estimating ancestral population parameters. Genetics (1997)
  • M.K. Kuhner. Usefulness of single nucleotide polymorphism data for estimating population parameters. Genetics (2000)
  • H.C. Harpending. Genetic traces of ancient demography. Proc. Natl. Acad. Sci. U. S. A. (1998)
  • R. Nielsen. Estimation of population parameters and recombination rates using single nucleotide polymorphisms. Genetics (2000)
  • R.M. Kliman. The population genetics of the origin and divergence of the Drosophila simulans complex species. Genetics (2000)
  • C.A. Machado. Inferring the history of speciation from multilocus DNA sequence data: the case of Drosophila pseudoobscura and close relatives. Mol. Biol. Evol. (2002)
  • F.S. Collins. A DNA polymorphism discovery resource for research on human genetic variation. Genome Res. (1998)
  • A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature (2001)
  • L. Picoult-Newberg. Mining SNPs from EST databases. Genome Res. (1999)
  • R. Sachidanandam. A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature (2001)
  • K. Lindblad-Toh. Large-scale discovery and genotyping of single-nucleotide polymorphisms in the mouse. Nat. Genet. (2000)
  • R.A. Hoskins. Single nucleotide polymorphism markers for genetic mapping in Drosophila melanogaster. Genome Res. (2001)
  • R.J. Cho. Genome-wide mapping with biallelic markers in Arabidopsis thaliana. Nat. Genet. (1999)
  • S.R. Wicks. Rapid gene mapping in Caenorhabditis elegans using a high density polymorphism map. Nat. Genet. (2001)
  • E.A. Winzeler. Direct allelic variation scanning of the yeast genome. Science (1998)
  • M.R. Barnes. SNP and mutation data on the Web: hidden treasures for uncovering. Comp. Funct. Genomics (2002)
  • F.X. Villablanca. Invasion genetics of the Mediterranean fruit fly: variation in multiple nuclear introns. Mol. Ecol. (1998)
  • C.R. Primmer. Single-nucleotide polymorphism characterization in species with limited available sequence information: high nucleotide diversity revealed in the avian genome. Mol. Ecol. (2002)
  • V.L. Friesen. PCR primers for the amplification of five nuclear introns in vertebrates. Mol. Ecol. (1999)
  • W.F. Dietrich. Identification and analysis of DNA polymorphisms.
  • S.A. Karl et al. PCR-based assays of Mendelian polymorphisms from anonymous single-copy nuclear DNA: techniques and applications for population genetics. Mol. Biol. Evol. (1993)
  • J.M. Bradeen et al. Conversion of an AFLP fragment linked to the carrot Y2 locus to a simple, codominant, PCR-based marker form. Theor. Appl. Genet. (1998)