A model for the length of tracts of identity by descent in finite random mating populations

https://doi.org/10.1016/S0040-5809(03)00071-6

Abstract

Linkage disequilibrium (LD) reflects coinheritance of an ancestral segment by chromosomes in a population. To begin to understand the effects of population history on the extent of LD, we model the length of a tract of identity-by-descent (IBD) between two chromosomes in a finite, random mating population. The variance of IBD tract length is large: a model described by Stam (Genet. Res. Cambridge 35 (1980) 131) underestimates this variance. Using Fisher's concept of junctions, we predict the mean length of an IBD tract, given the age of the population and the population sizes over time. We also derive results for subdivided populations, given times of subdivision events and sizes of the resulting subpopulations. The model demonstrates that population growth and subdivision strongly affect the expected length of an IBD tract in small populations. These effects are less dramatic in large populations.

Introduction

An isolated population is one that is descended from a small group of individuals (founders) and in which population growth is due almost exclusively to births within the population, rather than immigration from outside. Interest in the genetics of isolated human populations has recently been revived among medical geneticists, because it is hoped that diseases for which there are several susceptibility loci in large outbred populations may be more homogeneous in small isolated populations (Ardlie et al., 2002). Isolated human populations may differ from one another in many aspects of their history. Populations are founded at different times, by founder groups of different sizes, experience different types of growth, and may have varying levels of internal subdivision (see Chapman and Thompson (2000) for a brief survey of the variety of histories seen in human populations). It is important to understand the potential effects of these aspects of a population's history on its genetics, both in order to assess its utility for genetic studies, and to understand the results of such studies.

The effects of population history on linkage disequilibrium are of particular interest, since the observation of association between marker alleles or haplotypes in a region and a disease phenotype of interest can be used to infer the existence of a disease susceptibility locus in that region (Couzin, 2002). Recently, attention has focused on blocks of genome in which linkage disequilibrium is high (Daly et al., 2001; Patil et al., 2001; Gabriel et al., 2002). That is, the population variation within each such block is characterized by a limited number of high-frequency haplotypes, which have not been broken up by recombination. Linkage disequilibrium reflects coinheritance of a particular ancestral haplotype block by multiple chromosomes in the population. By considering the length of a block of chromosome inherited intact from a common ancestor by two chromosomes sampled from a population, we begin to address the question of the genetic distance over which disequilibrium extends, and hence how the feasibility of disequilibrium mapping may be affected by population history.

Two chromosomes are said to be identical by descent (IBD) at a particular point if they are copies of the same ancestral chromosome at that point. IBD is therefore defined relative to some set of ancestors. In this paper, founder individuals are assumed to be non-inbred and unrelated, and therefore IBD is measured relative to the set of founder chromosomes. It is well known that the probability of IBD at a single locus is affected by population size, age, and internal subdivision (Crow and Kimura, 1970), but the effects of these factors on the lengths of IBD tracts have not been studied. In this paper, we develop a model for the length of a random tract of IBD between two chromosomes from a finite monoecious random mating population. We assume that population sizes at each generation are known, and that crossovers occur according to a Poisson process along the chromosome. The model is used to explore the effects of different types of growth and subdivision on the expected length of an IBD tract between two randomly sampled chromosomes.
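
The Poisson assumption for crossovers can be illustrated with a minimal Python sketch. It assumes that position is measured in Morgans, so that the process has unit rate (one expected crossover per Morgan per meiosis); the function name and the example chromosome length are hypothetical and are not taken from the paper.

```python
import random

def crossover_positions(length_morgans, rng):
    """Sample crossover positions on [0, length_morgans).

    Assumption: crossovers form a unit-rate Poisson process along the
    chromosome, so gaps between successive crossovers are independent
    exponential variables with mean 1 Morgan.
    """
    positions = []
    x = rng.expovariate(1.0)          # distance to the first crossover
    while x < length_morgans:
        positions.append(x)
        x += rng.expovariate(1.0)     # distance to the next crossover
    return positions

rng = random.Random(1)                # seeded only for reproducibility
print(crossover_positions(1.5, rng))  # hypothetical 1.5 Morgan chromosome
```

Under this construction the number of crossovers on a chromosome of length L Morgans is Poisson with mean L, and the positions come out already sorted.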

This work relies on Fisher's theory of junctions (Fisher, 1954). A junction is a point on the chromosome where DNA from two distinct ancestors meets. A junction is therefore formed by a crossover event which occurs in a region where the chromosomes crossing over are not identical by descent (non-IBD). Junctions are thus also defined relative to the founder chromosomes. Fisher (1954) distinguished external and internal junctions. When the ancestral types of two chromosomes are compared, a junction is internal if the chromosomes are IBD on both sides of it, or non-IBD on both sides of it. External junctions mark the ends of IBD and non-IBD tracts between a pair of chromosomes.
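
The distinction between internal and external junctions can be sketched in Python as follows. The data layout is an assumption made for illustration: each chromosome is a sorted list of (start position, founder label) pairs beginning at position 0, with adjacent segments carrying different labels, so that every segment start after the first is a junction. The function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

def founders_around(chrom, x):
    """Return the founder labels carried just left of x and at x."""
    starts = [start for start, _ in chrom]
    left = chrom[bisect_left(starts, x) - 1][1]
    right = chrom[bisect_right(starts, x) - 1][1]
    return left, right

def classify_junctions(chrom_a, chrom_b):
    """Label each junction seen in either chromosome as internal or external.

    A junction is external when IBD status between the two chromosomes
    changes across it, and internal when the status is the same (IBD or
    non-IBD) on both sides.
    """
    positions = sorted({s for s, _ in chrom_a[1:]} | {s for s, _ in chrom_b[1:]})
    result = []
    for x in positions:
        a_left, a_right = founders_around(chrom_a, x)
        b_left, b_right = founders_around(chrom_b, x)
        ibd_left = a_left == b_left
        ibd_right = a_right == b_right
        kind = "external" if ibd_left != ibd_right else "internal"
        result.append((x, kind))
    return result

# Hypothetical example with founder chromosomes labelled A, B and C.
chrom_a = [(0.0, "A"), (0.3, "B"), (0.7, "C")]
chrom_b = [(0.0, "A"), (0.5, "B"), (0.7, "C")]
print(classify_junctions(chrom_a, chrom_b))
```

In this example the junctions at 0.3 and 0.5 bound a non-IBD tract and are external, while the junction at 0.7 is shared by both chromosomes, leaves them IBD on both sides, and is internal.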

Stam (1980) extended Fisher's ideas to a random mating population of constant size, and obtained the expected number of external junctions in a pair of chromosomes from a given generation. He assumed that the IBD process is Markov, and hence that IBD and non-IBD tract lengths are exponentially distributed. This assumption yields an expression for the mean length of an IBD tract between two chromosomes in a particular generation, as a function of population size. The model developed here is different from that of Stam (1980). First, we consider four distinct types of junctions which may be observed when comparing two chromosomes—two of these are external, and two are internal. We model the sequence of junction types along the pair of chromosomes as a first-order Markov chain. Second, we do not make specific distributional assumptions about the lengths of either IBD or non-IBD tracts.
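
The link between a Markov assumption and exponential tract lengths can be made explicit with a short worked equation. As an illustration only, suppose IBD status along the chromosome is a two-state continuous Markov process that leaves the IBD state at a constant rate λ per unit of genetic distance (λ is notation introduced here, not taken from Stam (1980)). The length T of an IBD tract then satisfies

```latex
f_T(t) = \lambda e^{-\lambda t}, \quad t \ge 0, \qquad
\mathrm{E}[T] = \frac{1}{\lambda}, \qquad
\mathrm{Var}(T) = \frac{1}{\lambda^{2}} = \mathrm{E}[T]^{2}.
```

Under this assumption the variance is fixed by the mean, so a model that matches the mean tract length cannot independently match its variance.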

Wiuf and Hein (1997) consider a related problem, the moments of the number of ancestral chromosome segments, in terms of the long-run equilibrium between recombination and coancestry. In contrast, our interest is in the coancestry of chromosome segments relative to a small founder population at a recent point in history. This would be of interest in a study of haplotype diversity in a genetic isolate or the subdivided populations of an endangered species.

In Section 2, we show how IBD tracts and non-IBD tracts can be described as a particular sequence of junction types with intervening segments of varying lengths. We describe a model for IBD tract length which consists of two parts – one for the sequence of junction types along the pair of chromosomes, and a second for the lengths of the intervening segments. Simulation studies validating the models are also described, and the model is discussed in relation to the work of Stam (1980). Finally, in Section 3, we apply these models to examine the effects of population size and growth type on the expected length of an IBD tract between two chromosomes chosen from within the same population. We also consider the effect of subdivision on the expected length of IBD tracts between chromosomes chosen from different subpopulations.

Section snippets

IBD tracts in terms of junctions and segments

Fig. 1 depicts two chromosomes sampled from a population some time after founding. Different patterns represent different ancestral chromosomes. Tracts of IBD and non-IBD are indicated above the chromosomes by white (IBD) and black (non-IBD) bars. A single tract of IBD or non-IBD is made up of a variable number of segments, where a segment is defined as the piece of the chromosome between two neighbouring junctions. In order to describe precisely the tracts of IBD and non-IBD, we classify each

Application to growing populations with and without subdivision

To demonstrate the potential effects of different types of population growth on the mean length of an IBD tract, we consider an example. Consider a population which has grown to 100 times its initial size, over a period of 100 generations. This example reflects the age of modern Finnish (Nevanlinna, 1972) and Japanese (Benedict, 1989) populations. We consider initial population sizes (N0) of 20, 100, and 500 individuals, and for each we consider five growth scenarios:

  • Linear growth: Expansion by
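
The growth example can be made concrete with a rough Python sketch. It rests on two assumptions that are not taken from the paper: that linear growth means adding a constant number of individuals per generation, and that single-locus IBD follows the standard recursion for a monoecious random mating population with selfing allowed, F_t = 1/(2N_{t-1}) + (1 - 1/(2N_{t-1})) F_{t-1} with F_0 = 0. It computes the single-locus IBD probability, not the tract-length model developed in the paper, and the function names are hypothetical.

```python
def linear_sizes(n0, fold=100, generations=100):
    """Population sizes N_0..N_T, growing linearly from n0 to fold * n0."""
    step = (fold - 1) * n0 / generations
    return [round(n0 + step * t) for t in range(generations + 1)]

def single_locus_ibd(sizes):
    """Probability that two gene copies from the final generation are IBD.

    Assumptions: founders are non-inbred and unrelated (F_0 = 0), and two
    gene copies descend from the same parental copy with probability
    1 / (2 * N) where N is the size of the parental generation.
    """
    f = 0.0
    for n_parents in sizes[:-1]:
        p_same = 1.0 / (2 * n_parents)
        f = p_same + (1.0 - p_same) * f
    return f

for n0 in (20, 100, 500):
    sizes = linear_sizes(n0)
    print(n0, sizes[-1], round(single_locus_ibd(sizes), 4))
```

Because each generation contributes a term of order 1/(2N), the smallest founder group accumulates the most IBD over the same 100 generations, which is consistent with the stronger effects reported for small populations.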

Conclusions

In this paper we have developed a model for the length of an IBD tract between two chromosomes chosen from a random mating population. The sequence of junction types along the pair of chromosomes being compared is well approximated by a first-order Markov chain, and we allow for different length distributions for IBD and non-IBD segments. Prediction of the mean length of an IBD tract using this model does not require the assumption of an exponential distribution for the IBD tract lengths, as

Acknowledgements

The authors are grateful for support from the Burroughs Wellcome Fund for the Program in Mathematical and Molecular Biology (N.H.C.) and National Institutes of Health Grant GM-46255 (E.A.T.).

References (14)

  • Ardlie, K.G., et al., 2002. Patterns of linkage disequilibrium in the human genome. Nat. Rev. Genet.
  • Benedict, R., 1989. The Chrysanthemum and the Sword.
  • Chapman, N.H., 2001. Genome descent in isolated populations. Ph.D. Thesis, University of Washington, Seattle,...
  • Chapman, N.H., Thompson, E.A., 2000. Linkage disequilibrium mapping: the role of population history, size and...
  • Couzin, J., 2002. New mapping project splits the community. Science.
  • Crow, J.F., et al., 1970. An Introduction to Population Genetics Theory.
  • Daly, M.J., et al., 2001. High resolution haplotype structure in the human genome. Nat. Genet.