Reconstructing contiguous regions of an ancestral genome

Jian Ma1,5,6, Louxin Zhang2, Bernard B. Suh3, Brian J. Raney3, Richard C. Burhans1, W. James Kent3, Mathieu Blanchette4, David Haussler3, and Webb Miller1

1 Center for Comparative Genomics and Bioinformatics, Penn State University, University Park, Pennsylvania 16802, USA;
2 Department of Mathematics, National University of Singapore, Singapore 117543;
3 Center for Biomolecular Science and Engineering, University of California Santa Cruz, Santa Cruz, California 95064, USA;
4 School of Computer Science, McGill University, Montreal, Quebec H3A 2B4, Canada
5 Present address: Center for Biomolecular Science and Engineering, University of California, Santa Cruz, California 95064, USA.

Abstract

This article analyzes mammalian genome rearrangements at higher resolution than has been published to date. We identify 3171 intervals, covering ∼92% of the human genome, within which we find no rearrangements larger than 50 kilobases (kb) in the lineages leading to human, mouse, rat, and dog from their most recent common ancestor. Combining intervals that are adjacent in all contemporary species produces 1338 segments that may contain large insertions or deletions but that are free of chromosome fissions or fusions as well as inversions or translocations >50 kb in length. We describe a new method for predicting the ancestral order and orientation of those intervals from their observed adjacencies in modern species. We combine the results from this method with data from chromosome painting experiments to produce a map of an early mammalian genome that accounts for 96.8% of the available human genome sequence data. The precision is further increased by mapping inversions as small as 31 bp. Analysis of the predicted evolutionary breakpoints in the human lineage confirms certain published observations but disagrees with others. Although only a few mammalian genomes are currently sequenced to high precision, our theoretical analyses and computer simulations indicate that our results are reasonably accurate and that they will become highly accurate in the foreseeable future. Our methods were developed as part of a project to reconstruct the genome sequence of the last common ancestor of humans, dogs, and most other placental mammals.
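The step of combining intervals that are adjacent in all contemporary species can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each genome is given as a list of chromosomes and each chromosome as a list of signed integers, where the integer identifies a conserved interval and the sign gives its orientation. The function names `adjacencies` and `merge_conserved` and the toy genomes are hypothetical.

```python
def adjacencies(genome):
    """Return the set of oriented adjacencies in a genome.

    Reading a chromosome on the opposite strand turns the pair (x, y)
    into (-y, -x), so both forms are stored.
    """
    adj = set()
    for chrom in genome:
        for x, y in zip(chrom, chrom[1:]):
            adj.add((x, y))
            adj.add((-y, -x))
    return adj


def merge_conserved(reference, others):
    """Merge consecutive reference intervals whose adjacency is present,
    with consistent orientation, in every other genome."""
    other_adj = [adjacencies(g) for g in others]
    segments = []
    for chrom in reference:
        if not chrom:
            continue
        current = [chrom[0]]
        for x, y in zip(chrom, chrom[1:]):
            if all((x, y) in adj for adj in other_adj):
                current.append(y)          # adjacency conserved everywhere
            else:
                segments.append(current)   # breakpoint in at least one genome
                current = [y]
        segments.append(current)
    return segments


# Toy data (invented for illustration, not from the paper).
human = [[1, 2, 3, 4, 5]]
mouse = [[1, 2, -4, -3, 5]]      # intervals 3 and 4 inverted together
dog = [[-2, -1], [3, 4, 5]]      # fission between intervals 2 and 3

segments = merge_conserved(human, [mouse, dog])
print(segments)
```

In this toy case intervals 1 and 2 merge, as do 3 and 4, because those adjacencies survive in every genome once strand is taken into account. Interval 5 stays on its own because the adjacency between 4 and 5 is broken in the mouse genome.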

OPEN ACCESS ARTICLE