Genetics in geographically structured populations: defining, estimating and interpreting F_ST

Holsinger, Kent E.; Weir, Bruce S.

doi:10.1038/nrg2611

Review Article
Published: September 2009

Nature Reviews Genetics volume 10, pages 639–650 (2009)

Key Points

Wright's F-statistics, and especially F_ST, provide important insights into the evolutionary processes that influence the structure of genetic variation within and among populations, and they are among the most widely used descriptive statistics in population and evolutionary genetics.
F_ST is a property of the distribution of allele frequencies among populations. It reflects the joint effects of drift, migration, mutation and selection on the distribution of genetic variation among populations.
F_ST has a central role in population and evolutionary genetics and has wide applications in fields from disease association mapping to forensic science.
F_ST can be used to describe the distribution of genetic variation among any set of samples, but it is most usefully applied when the samples represent discrete units rather than arbitrary divisions along a continuous distribution.
Statistics related to F_ST can be useful for haplotype or microsatellite data if an appropriate measure of evolutionary distance among alleles is available.
Comparison of an estimate of F_ST from marker data with an estimate of Q_ST from continuously varying trait data can be used to detect selection, but the estimate of F_ST may depend on the choice of marker and the estimate of Q_ST may differ from neutral expectations if there is a non-additive component of genetic variance.
Although the simple relationship between F_ST and migration rates in Wright's island model makes it tempting to infer migration rates from F_ST, caution is needed if such an approach is to be used.
If estimates of F_ST from many loci are available, it may be possible to identify certain loci as 'outliers' that may have been subject to different patterns of selection or to different demographic processes.
Case–control association-mapping studies must account for the possibility that population substructure, rather than a causal relationship, explains an observed association between a marker and a disease. The genomic control method uses background estimates of F_ST to control for such substructure.
In forensic applications, match probabilities are sometimes calculated for subpopulations that lack specific allele frequency data. A θ correction, in which θ is F_ST, is used to calculate the match probability using allele frequency information from the broader population of which the subpopulation is part.
The massive amount of data that is being generated by population genomics projects can be understood fundamentally as allelic variation at individual loci. We therefore expect F-statistics to be at least as useful in understanding these data sets as they have been in population and evolutionary genetics for most of the last century.
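
The θ correction described in the forensic key point can be illustrated with a short sketch. It assumes the widely used Balding–Nichols form of the correction; the equations used in the Review itself are not part of this preview, so the formulas, the function name `match_probability` and its parameters are illustrative assumptions rather than the authors' own notation.

```python
def match_probability(p_i, p_j=None, theta=0.0):
    """Genotype match probability with a theta (F_ST) correction.

    Assumed (Balding-Nichols) form, not taken from the Review text:
      homozygote A_i A_i:   [2t + (1-t)p_i][3t + (1-t)p_i] / [(1+t)(1+2t)]
      heterozygote A_i A_j: 2[t + (1-t)p_i][t + (1-t)p_j] / [(1+t)(1+2t)]
    where t = theta and p_i, p_j are allele frequencies in the broader
    population. Pass only p_i for a homozygous profile.
    """
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must lie in [0, 1)")
    for p in (p_i, p_j):
        if p is not None and not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")

    denom = (1.0 + theta) * (1.0 + 2.0 * theta)
    if p_j is None:
        # Homozygous profile
        num = (2.0 * theta + (1.0 - theta) * p_i) * (3.0 * theta + (1.0 - theta) * p_i)
    else:
        # Heterozygous profile
        num = 2.0 * (theta + (1.0 - theta) * p_i) * (theta + (1.0 - theta) * p_j)
    return num / denom


# With theta = 0 the expressions reduce to Hardy-Weinberg proportions
# (p_i squared for a homozygote, 2 * p_i * p_j for a heterozygote).
uncorrected = match_probability(0.1)
# A positive theta raises the match probability for a rare homozygote,
# which is the conservative direction in this setting.
corrected = match_probability(0.1, theta=0.03)
```

The choice of θ is itself an estimate of F_ST for the relevant subpopulation, so the result is only as reliable as that estimate.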

Abstract

Wright's F-statistics, and especially F_ST, provide important insights into the evolutionary processes that influence the structure of genetic variation within and among populations, and they are among the most widely used descriptive statistics in population and evolutionary genetics. Estimates of F_ST can identify regions of the genome that have been the target of selection, and comparisons of F_ST from different parts of the genome can provide insights into the demographic history of populations. For these reasons and others, F_ST has a central role in population and evolutionary genetics and has wide applications in fields that range from disease association mapping to forensic science. This Review clarifies how F_ST is defined, how it should be estimated, how it is related to similar statistics and how estimates of F_ST should be interpreted.
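
As a rough illustration of F_ST as a property of the distribution of allele frequencies among populations, the sketch below uses Wright's classical parametric form for a single biallelic locus, F_ST = Var(p) / (p̄(1 − p̄)). It treats the population frequencies as known and ignores sampling error, so it is not one of the estimators the Review discusses. The second function encodes the island-model expectation F_ST ≈ 1/(4Nm + 1), the simple relationship that the Key Points caution against inverting uncritically. Both function names are hypothetical.

```python
def fst_from_frequencies(freqs):
    """Parametric F_ST for one biallelic locus.

    freqs: allele frequency of the same allele in each population,
    assumed known without error and weighted equally.
    Returns Var(p) / (p_bar * (1 - p_bar)).
    """
    k = len(freqs)
    if k < 2:
        raise ValueError("need at least two populations")
    p_bar = sum(freqs) / k
    if p_bar <= 0.0 or p_bar >= 1.0:
        raise ValueError("locus is monomorphic across populations")
    # Population (not sample) variance of allele frequencies
    var_p = sum((p - p_bar) ** 2 for p in freqs) / k
    return var_p / (p_bar * (1.0 - p_bar))


def island_model_fst(n_e, m):
    """Equilibrium expectation under Wright's island model (approximation).

    n_e: effective size of each population; m: migration rate.
    Assumes many populations, small m, and negligible mutation.
    """
    if n_e <= 0 or m < 0:
        raise ValueError("n_e must be positive and m non-negative")
    return 1.0 / (4.0 * n_e * m + 1.0)


identical = fst_from_frequencies([0.5, 0.5])      # no differentiation
fixed = fst_from_frequencies([0.0, 1.0])          # alternative alleles fixed
one_migrant = island_model_fst(n_e=100, m=0.01)   # 4Nm = 4
```

Identical frequencies give 0 and fixation of alternative alleles gives 1, which brackets the range of the statistic in this parametric form.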

**Figure 1: Locus-specific estimates of F_ST on human chromosome 7.**

Acknowledgements

We thank R. Prunier and K. Theiss for their helpful comments on earlier versions of this Review. The work in the laboratories of the authors was supported in part by grants from the US National Institutes of Health (1 R01 GM 068449-01A1 to K.E.H.; 1 R01 GM 075091 to B.S.W.).

Author information

Authors and Affiliations

Department of Ecology and Evolutionary Biology, U-3043, University of Connecticut, Storrs, Connecticut 06269-3043, USA
Kent E. Holsinger
Department of Biostatistics, University of Washington, Box 357232, Seattle, Washington 98195, USA
Bruce S. Weir

Corresponding authors

Correspondence to Kent E. Holsinger or Bruce S. Weir.

Glossary

Genetic drift: The random fluctuations in allele frequencies over time that are due to chance alone.
Short tandem repeat loci: Loci consisting of short sequences (2–6 nucleotides) that are repeated multiple times. Alleles at short tandem repeat loci differ from one another in their number of repeats.
Variance: A measure of the amount of variation around a mean value.
Diversifying selection: Selection in which different alleles are favoured in different populations. It is often a consequence of local adaptation (in which genotypes from different populations have higher fitness in their home environments owing to historical natural selection).
Hardy–Weinberg proportions: When the frequency of each diploid genotype at a locus equals that expected from the random union of alleles. That is, the genotypes AA, Aa and aa will be at frequencies p², 2pq and q², respectively.
Heterozygote advantage: A pattern of natural selection in which heterozygotes are more likely to survive than homozygotes.
Likelihood: A mathematical function that describes the relationship between the unknown parameters of a statistical distribution — for example, the mean and variance of the allele frequency distribution among populations or the allele frequency in a particular population — and the data. It is directly proportional to the probability of the data given the unknown parameters.
Prior distribution: A statistical distribution used in Bayesian analysis to describe the probability that parameters take on a particular value before examining any data. It expresses the level of uncertainty about those parameters before the data have been analysed.
Posterior distribution: A statistical distribution used in Bayesian analysis to describe the probability that parameters take a particular value after the data have been analysed. It reflects both the likelihood of the data given particular parameters and the prior probability that parameters take particular values.
Markov chain Monte Carlo methods: Methods that implement a computational technique that is widely used for approximating complex integrals and other functions. In this context, these methods are used to approximate the posterior distribution of a Bayesian model.
Multinomial distribution: A statistical distribution that describes the probability of obtaining a sample with a specified number of objects in each of several categories. The probability is determined by the total sample size and the probability of drawing an object from each category. The binomial distribution is a special case of the multinomial distribution in which there are two categories.
Additive genetic variance: The part of the total genetic variation that is due to the main (or additive) effects of alleles on a phenotype. The additive variance determines the degree of resemblance between relatives and therefore the response to selection.
Stabilizing selection: Selection in which either the same allele or the same genotype is favoured in different populations.
Effective population size: Formulated by Wright in 1931, the effective population size reflects the size of an idealized population that would experience drift in the same way as the actual (census) population. The effective population size can be lower than the census population size owing to various factors, including a history of population bottlenecks and reduced recombination.
Coalescent-based approaches: Approaches that use statistical properties of the genealogical relationship among alleles under particular demographic and mutational models to make inferences about the effective size of populations and about rates of mutation and migration.
Conditional autoregressive scheme: A statistical approach developed for analysis of data in which a random effect is associated with the spatial location of each observation. The magnitude of the random effect is determined by a weighted average of the random effects of nearby positions. In most applications, the weights of the averages are inversely related to the spatial distance between two sample points.
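
Two of the glossary entries above lend themselves to a small worked example: Hardy–Weinberg proportions and the multinomial distribution. The sketch below is a minimal illustration; the function names `hardy_weinberg` and `multinomial_pmf` are hypothetical and not drawn from the Review.

```python
from math import factorial, prod


def hardy_weinberg(p):
    """Expected genotype frequencies at a biallelic locus.

    p is the frequency of allele A; q = 1 - p is the frequency of a.
    Returns the proportions p^2, 2pq and q^2 for AA, Aa and aa.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}


def multinomial_pmf(counts, probs):
    """Probability of observing `counts` objects in each category.

    counts: non-negative integers, one per category.
    probs: category probabilities, assumed to sum to 1.
    With two categories this is the binomial probability.
    """
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    # Multinomial coefficient n! / (c_1! * c_2! * ...), kept as an integer
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    return coef * prod(p ** c for p, c in zip(probs, counts))


# Probability of sampling 1 AA, 2 Aa and 1 aa individuals when
# genotypes are in Hardy-Weinberg proportions with p = 0.5.
hw = hardy_weinberg(0.5)
prob = multinomial_pmf([1, 2, 1], [hw["AA"], hw["Aa"], hw["aa"]])
```

Combining the two functions in this way corresponds to the sampling model in which genotype counts are multinomial draws from Hardy–Weinberg proportions, an assumption that population structure can violate.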

About this article

Cite this article

Holsinger, K., Weir, B. Genetics in geographically structured populations: defining, estimating and interpreting F_ST. Nat Rev Genet 10, 639–650 (2009). https://doi.org/10.1038/nrg2611

Issue Date: September 2009
DOI: https://doi.org/10.1038/nrg2611
