  • Review Article
The study of structured populations — new hope for a difficult and divided science

Key Points

  • Population genetics can be used to study the history of natural populations. However, it is a difficult science because natural populations have complex geographies and histories.

  • With the advent of DNA-sequence-based data sets drawn from natural populations, two main schools of study developed: the phylogeographic approach, which uses the data to estimate the evolutionary tree, or gene tree, and then attempts to interpret the history of the populations from which the samples came; and the summary-statistics approach, which is an outgrowth of mathematical population genetics and proceeds by mathematically fitting specific population-genetic models to the data.

  • The phylogeographic approach has the advantage of not being constrained by specific models, and lends itself to exploratory types of analysis. However, it is highly dependent on gene-tree estimates, which are often incorrect. This method can be misleading if investigators focus on just a single gene or a stretch of tightly linked sequence, such as mitochondrial DNA, and overlook the large stochastic variance that arises among genes in populations.

  • Summary-statistic approaches can be mathematically sophisticated and provide ways to compare models and assess the sources of variance in the process that gave rise to the data. However, these methods are often highly constrained by the available models and are difficult to apply if investigators have little knowledge of the locations and boundaries of populations in nature. Also, they do not usually take full advantage of all of the information that is available in the data.

  • In recent years, a new family of methods has begun to offer the advantages of the phylogeographic approach, using all of the information in the data and allowing diverse models to be considered, together with the mathematical sophistication of the summary-statistics methods. These are probabilistic methods in which gene trees have a role, but in a framework in which they are used strictly in conjunction with their probability. As these methods continue to develop, they offer the promise of increased flexibility and applicability to a wide range of questions in the history of populations.

Abstract

Natural populations, including those of humans, have complex geographies and histories. Studying how they evolve is difficult, but it is possible with population-based DNA sequence data. However, the study of structured populations is divided by two distinct schools of thought and analysis. The phylogeographic approach is fundamentally graphical and begins with a gene-tree estimate. By contrast, the more traditional approach of using summary statistics is fundamentally mathematical. Both approaches have limitations, but there is promise in newer probabilistic methods that offer the flexibility and data exploitation of the phylogeographic approach in an explicitly model-based mathematical framework.

Figure 1: Models of population structure.
Figure 2: Contrasting phylogeographic and summary-statistic methods.
Figure 3: The stochastic variance of gene trees.
Figure 4: The isolation-with-migration model.

Acknowledgements

We are grateful to M. Hare, Y.-J. Won and two anonymous referees for helpful suggestions and corrections. This work was supported in part by a grant from the National Institutes of Health to J.H.

Author information

Corresponding author

Correspondence to Jody Hey.

Related links

FURTHER INFORMATION

Batwing

Genetree

GeoDis

LAMARC

MDIV

Glossary

DEMOGRAPHIC HISTORY

The reproductive history of a population or group of populations. This can include population sizes, sex ratios, migration rates, population-splitting events, variation in reproductive rates and times among organisms, as well as variation over time in all of these quantities.

POISSON DISTRIBUTION

A probability distribution that is commonly used to describe the frequency at which similar but independent events can be expected to occur over a given period of time.

GENE EXCHANGE

The process by which genetic material is shared among organisms, which can occur through sexual reproduction or lateral genetic transfer.

GENETIC DRIFT

Random changes in gene frequency in a population that occur when a finite number of progeny are formed by the random sampling of gametes from the parents.

HARDY-WEINBERG

A classical mathematical principle in population genetics that describes the expected frequencies of genotypes for one locus after one generation of random mating if the allele frequencies in the parents are known.
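As a minimal sketch of this principle, assume a single locus with two alleles (labelled A and a here purely for illustration) and a known frequency p for allele A. After one generation of random mating, the expected genotype frequencies are p^2, 2pq and q^2, where q = 1 - p. The function name is hypothetical.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies at a two-allele locus.

    p is the frequency of allele A in the parents (an illustrative label);
    the frequency of the other allele, a, is q = 1 - p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}


freqs = hardy_weinberg(0.3)
for genotype, value in freqs.items():
    print(f"{genotype}: {value:.2f}")
```

The three frequencies always sum to one, which gives a quick sanity check on any input.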

EVOLUTIONARY TREE

A graph or branching diagram that describes the pattern of evolutionary ancestry (historical relationships) among a group of organisms.

GENE TREE

A graph or branching diagram that describes the pattern of ancestry among homologous DNA sequences from different individuals of a population or species.

PHYLOGENETIC TREE

A graph or branching diagram that describes the pattern of ancestry among different species or other taxa.

SYSTEMATICS

A branch of biology that deals with the classification of living organisms on the basis of their evolutionary relationships. This differs from 'taxonomy' as organisms are grouped on the basis of shared ancestry, not just on their similarities (which might or might not correspond to shared evolutionary history).

COALESCENT THEORY

A mathematical approach that models the depths of gene trees for samples that are drawn from one or more closely related populations.
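The depth of a gene tree can be illustrated with a small simulation. The sketch below assumes the standard neutral coalescent for a single, constant-size, randomly mating population (the simplest case, not a structured model), with time measured in units of 2N generations for a diploid population of size N. The function name and parameter values are illustrative assumptions.

```python
import random


def simulate_tree_depth(n, rng):
    """Draw one gene-tree depth (time to the most recent common ancestor)
    for a sample of n sequences under the standard neutral coalescent.
    Time is in units of 2N generations (an assumed scaling)."""
    if n < 2:
        raise ValueError("need at least two sampled sequences")
    depth = 0.0
    for k in range(n, 1, -1):
        # While k lineages remain, each of the k*(k-1)/2 pairs
        # coalesces at rate 1, so the waiting time is exponential.
        rate = k * (k - 1) / 2.0
        depth += rng.expovariate(rate)
    return depth


rng = random.Random(1)  # fixed seed so the run is repeatable
depths = [simulate_tree_depth(10, rng) for _ in range(20000)]
mean_depth = sum(depths) / len(depths)
expected = 2.0 * (1.0 - 1.0 / 10)  # theoretical mean depth for n = 10
print(f"simulated mean depth {mean_depth:.3f}, theoretical mean {expected:.3f}")
print(f"shallowest {min(depths):.3f}, deepest {max(depths):.3f}")
```

The wide spread between the shallowest and deepest simulated trees shows how much gene-tree depth can vary among independent loci that share the same population history.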

ESTIMATOR

A method for calculating an estimate of a parameter in a model.

SUMMARY STATISTIC

A number that is calculated from a data set and that captures much of the information in the data. For a set of DNA sequences, one commonly used summary statistic is S, the number of variable sites in the sample. Fitting models to summary statistics is often easier than fitting them to the full data.
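As an illustration of the statistic S, the sketch below counts the variable sites in a set of aligned sequences. It assumes the sequences are already aligned, of equal length and free of gaps or ambiguity codes; the function name and the toy sequences are hypothetical.

```python
def segregating_sites(seqs):
    """Count the alignment columns in which more than one base appears."""
    if not seqs:
        return 0
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    # zip(*seqs) walks the alignment column by column.
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


sample = ["ACGT", "ACGA", "TCGA"]  # toy alignment, not real data
print(segregating_sites(sample))  # first and last columns vary, so S = 2
```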

OUTGROUP

A sample or group of samples that are included in an evolutionary tree because they are known, or assumed, to connect directly to the root of the tree (that is, to the node of the tree that represents the common ancestor of all samples in the tree).

HOMOPLASY

Identical character states (for example, the same nucleotide base in a DNA sequence) that are not the result of common ancestry (not homologous), but arose independently in different ancestors by parallel or convergent mutations.

LINKAGE BLOCK

A region of DNA that is inherited as a single unit owing to a lack of recombination, such as the mitochondrial DNA of metazoans. The histories of genes that are located in such regions are not independent, and are equally affected by all the selective forces that have acted anywhere in the linkage block.

ALLOPATRIC DIFFERENTIATION

The process of divergence between populations or species that are geographically separated.

INFERENCE KEY

A list of paired rules that are used for diagnosis or identification. Keys are a classic tool for identifying organisms to the species level, on the basis of the presence or absence of specific morphological characters or character states. A similar tool is used in nested-clade analysis to distinguish between different historical scenarios.

HEURISTIC

A method of inference that relies on educated guesses or simplifications that limit the parameter space over which solutions are searched. This approach is not guaranteed to find the correct answer.

STOCHASTIC VARIANCE

In the context of gene histories, this is the variation in gene trees and mutations among unlinked genes that have passed through the same demographic history of populations of organisms.

MONOPHYLY

The property that is attributed to a group of samples in an evolutionary tree that all share the same common ancestor exclusive of other samples in the tree. A set of samples that constitute an entire branch on an evolutionary tree is said to be monophyletic.

F-STATISTICS

A method of summary statistics that was devised by Sewall Wright to describe correlations among alleles that are sampled at different hierarchical levels (individuals, subpopulations and total populations). F-statistics are frequently used to describe the presence of population structure.
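One common way to express Wright's F_ST for a single two-allele locus is as the proportional reduction in expected heterozygosity within subpopulations relative to the total population, F_ST = (H_T - H_S) / H_T. The sketch below assumes equally sized subpopulations and known allele frequencies, with no correction for sampling; it is an illustration with a hypothetical function name, not an estimator taken from the article.

```python
def fst_biallelic(freqs):
    """F_ST from the frequency of one allele in each subpopulation.

    Assumes equally weighted subpopulations and exact frequencies.
    """
    if not freqs:
        raise ValueError("need at least one subpopulation")
    if any(not 0.0 <= p <= 1.0 for p in freqs):
        raise ValueError("allele frequencies must lie between 0 and 1")
    p_bar = sum(freqs) / len(freqs)
    # Mean expected heterozygosity within subpopulations.
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    # Expected heterozygosity of the pooled population.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        # Locus is fixed everywhere; treated as no differentiation
        # (an assumed convention for this sketch).
        return 0.0
    return (h_t - h_s) / h_t


print(fst_biallelic([0.5, 0.5]))  # identical frequencies: no structure
print(fst_biallelic([0.2, 0.8]))  # divergent frequencies: positive F_ST
```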

SINGLETON MUTATIONS

Polymorphic sites in which a rare base is found in only one of the sampled sequences.

POLYMORPHIC-SITE FREQUENCY DISTRIBUTION

A polymorphic site in a DNA sequence can be described by the frequency of one of its variable bases. The distribution of these values for all the polymorphic sites in a sample can be described using a histogram or bar chart. The shape of the histogram can provide qualitative information on the processes that are involved in the history of the sample.
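The distribution described above, and the singleton count from the previous entry, can be tabulated as in the following sketch. It assumes aligned sequences, considers only sites with exactly two bases, and records the count of the rarer base at each site (often called a folded spectrum, a choice made here because no outgroup is assumed). The function name and toy sequences are hypothetical.

```python
from collections import Counter


def folded_site_frequencies(seqs):
    """Map the count of the rarer base to the number of sites with that count."""
    spectrum = Counter()
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) == 2:  # skip invariant sites and sites with 3+ bases
            spectrum[min(counts.values())] += 1
    return dict(spectrum)


sample = ["ACGT", "ACGA", "TAGA", "AATA"]  # toy alignment, not real data
spectrum = folded_site_frequencies(sample)
singletons = spectrum.get(1, 0)
print(spectrum)    # three sites with a rare base seen once, one seen twice
print(singletons)  # 3
```

Plotting the values of `spectrum` as a bar chart gives the histogram that the definition refers to.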

About this article

Cite this article

Hey, J., Machado, C. The study of structured populations — new hope for a difficult and divided science. Nat Rev Genet 4, 535–543 (2003). https://doi.org/10.1038/nrg1112

  • DOI: https://doi.org/10.1038/nrg1112
