On the inference of ancestries in admixed populations

  Sriram Sankararaman (1,4), Gad Kimmel (1,2,4), Eran Halperin (2), and Michael I. Jordan (1,3,5)

  1 Computer Science Division, University of California Berkeley, Berkeley, California 94720, USA;
  2 International Computer Science Institute, Berkeley, California 94704, USA;
  3 Department of Statistics, University of California Berkeley, Berkeley, California 94720, USA
  4 These authors contributed equally to this work.

Abstract

Inference of ancestral information in recently admixed populations, in which every individual has mixed ancestry (e.g., African Americans in the United States), is a challenging problem. Several previous model-based approaches to admixture have been based on hidden Markov models (HMMs) and Markov hidden Markov models (MHMMs). We present an augmented form of these models that can be used to predict historical recombination events and can model background linkage disequilibrium (LD) more accurately. We also study some of the computational issues that arise in using such Markovian models on realistic data sets. In particular, we present an effective initialization procedure that, when combined with expectation-maximization (EM) algorithms for parameter estimation, yields high accuracy at significantly decreased computational cost relative to the Markov chain Monte Carlo (MCMC) algorithms that have generally been used in earlier studies. We present experiments exploring these modeling and algorithmic issues in two scenarios: the inference of locus-specific ancestries in a population that is assumed to originate from two unknown ancestral populations, and the inference of allele frequencies in one ancestral population given those in another.
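
As a rough illustration of the kind of HMM-based locus-specific ancestry inference the abstract refers to, the sketch below runs the standard forward-backward algorithm on a single haplotype under a basic two-ancestry HMM. It is not the augmented model, the initialization procedure, or the EM estimation presented in the paper. The function name `posterior_ancestry`, the form of the transition matrix, and the parameters `pi` (admixture proportions) and `r` (per-SNP probability of redrawing the ancestry) are assumptions made for illustration, and the ancestral allele frequencies are treated as known.

```python
def posterior_ancestry(alleles, freqs, pi=(0.5, 0.5), r=0.1):
    """Return P(ancestry k at SNP j | all alleles) for one haplotype.

    alleles: list of 0/1 observed alleles, one per SNP.
    freqs:   freqs[k][j] = assumed frequency of allele 1 at SNP j in ancestry k
             (values strictly between 0 and 1).
    pi, r:   hypothetical admixture proportions and ancestry-redraw probability.
    """
    n, K = len(alleles), len(pi)

    def emit(k, j):
        # Probability of the observed allele at SNP j under ancestry k.
        p = freqs[k][j]
        return p if alleles[j] == 1 else 1.0 - p

    # Assumed transition: keep the ancestry with prob. 1 - r, else redraw from pi.
    trans = [[(1.0 - r) * (a == b) + r * pi[b] for b in range(K)]
             for a in range(K)]

    # Forward pass, normalized at every SNP to avoid underflow.
    prev = [pi[k] * emit(k, 0) for k in range(K)]
    total = sum(prev)
    prev = [x / total for x in prev]
    fwd = [prev]
    for j in range(1, n):
        cur = [emit(b, j) * sum(prev[a] * trans[a][b] for a in range(K))
               for b in range(K)]
        total = sum(cur)
        prev = [x / total for x in cur]
        fwd.append(prev)

    # Backward pass, also normalized at every SNP.
    bwd = [[1.0] * K for _ in range(n)]
    for j in range(n - 2, -1, -1):
        cur = [sum(trans[a][b] * emit(b, j + 1) * bwd[j + 1][b]
                   for b in range(K))
               for a in range(K)]
        total = sum(cur)
        bwd[j] = [x / total for x in cur]

    # Combine both passes and renormalize to get per-SNP posteriors.
    post = []
    for j in range(n):
        w = [fwd[j][k] * bwd[j][k] for k in range(K)]
        total = sum(w)
        post.append([x / total for x in w])
    return post
```

Calling `posterior_ancestry([1, 1, 1, 0, 0, 0], [[0.9] * 6, [0.1] * 6])` returns one probability pair per SNP. In a setting like the one the abstract describes, the allele frequencies and transition parameters would themselves be unknown and would have to be estimated, for example by EM, rather than supplied as inputs.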

Footnotes

  • 5 Corresponding author. E-mail jordan{at}cs.berkeley.edu; fax (510) 642-5775.

  • Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.072751.107.

    • Received October 25, 2007.
    • Accepted February 13, 2008.
