Darwinian and demographic forces affecting human protein coding genes

  1. Rasmus Nielsen (1,2,9),
  2. Melissa J. Hubisz (3,4),
  3. Ines Hellmann (1,2),
  4. Dara Torgerson (3),
  5. Aida M. Andrés (5),
  6. Anders Albrechtsen (1,2),
  7. Ryan Gutenkunst (4),
  8. Mark D. Adams (6),
  9. Michele Cargill (7),
  10. Adam Boyko (4),
  11. Amit Indap (4),
  12. Carlos D. Bustamante (4) and
  13. Andrew G. Clark (8)
  1. Department of Biology, University of Copenhagen, 2100 Kbh Ø, Denmark;
  2. Departments of Integrative Biology and Statistics, UC Berkeley, Berkeley, California 94720, USA;
  3. Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA;
  4. Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14853, USA;
  5. Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA;
  6. Department of Genetics, Case Western Reserve University, Cleveland, Ohio 44106, USA;
  7. Navigenics, Redwood Shores, California 94065, USA;
  8. Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA

    Abstract

    Past demographic changes can produce distortions in patterns of genetic variation that can mimic the appearance of natural selection unless the demographic effects are explicitly removed. Here we fit a detailed model of human demography that incorporates divergence, migration, admixture, and changes in population size to directly sequenced data from 13,400 protein-coding genes from 20 European-American and 19 African-American individuals. Based on this demographic model, we use several new and established statistical methods for identifying genes with extreme patterns of polymorphism likely to be caused by Darwinian selection, providing the first genome-wide analysis of allele frequency distributions in humans based on directly sequenced data. The tests are based on observations of excesses of high-frequency derived alleles, excesses of low-frequency derived alleles, and excesses of differences in allele frequencies between populations. We detect numerous new genes with strong evidence of selection, including a number of genes related to psychiatric and other diseases. We also show that microRNA-controlled genes evolve under extremely high constraints and are more likely to undergo negative selection than other genes. Furthermore, we show that genes involved in muscle development have been subject to positive selection during recent human history. In accordance with previous studies, we find evidence for negative selection against mutations in genes associated with Mendelian disease and positive selection acting on genes associated with several complex diseases.
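
The three signals named in the abstract (excess high-frequency derived alleles, excess low-frequency derived alleles, and allele frequency differences between populations) can be illustrated with textbook summaries of the derived site frequency spectrum. The sketch below is not the authors' method: the abstract does not name the statistics used, and no demographic correction is applied here. The function names, the unnormalized contrasts, and the simple two-population F_ST are all assumptions chosen for illustration.

```python
from typing import Dict, Sequence, Tuple


def harmonic(n: int) -> float:
    """Sum of 1/k for k = 1 .. n-1 (Watterson's normalizing constant)."""
    return sum(1.0 / k for k in range(1, n))


def theta_estimates(sfs: Sequence[float], n: int) -> Tuple[float, float, float]:
    """Three estimators of theta from an unfolded site frequency spectrum.

    sfs[i-1] is the number of sites carrying i derived copies among
    n sampled chromosomes, for i = 1 .. n-1.
    Returns (theta_W, theta_pi, theta_H).
    """
    if len(sfs) != n - 1:
        raise ValueError("sfs must have n - 1 entries")
    pairs = n * (n - 1)
    theta_w = sum(sfs) / harmonic(n)
    theta_pi = sum(2 * i * (n - i) * x for i, x in enumerate(sfs, start=1)) / pairs
    theta_h = sum(2 * i * i * x for i, x in enumerate(sfs, start=1)) / pairs
    return theta_w, theta_pi, theta_h


def sfs_contrasts(sfs: Sequence[float], n: int) -> Dict[str, float]:
    """Unnormalized contrasts between the estimators.

    pi_minus_w < 0 suggests an excess of low-frequency derived alleles.
    pi_minus_h < 0 suggests an excess of high-frequency derived alleles.
    """
    theta_w, theta_pi, theta_h = theta_estimates(sfs, n)
    return {"pi_minus_w": theta_pi - theta_w, "pi_minus_h": theta_pi - theta_h}


def fst_single_site(p1: float, p2: float) -> float:
    """Simple F_ST for one biallelic site with equal population weights.

    p1 and p2 are derived allele frequencies in the two populations.
    Returns 0.0 when the site is monomorphic in the pooled sample.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    n = 40  # 20 diploid individuals give 40 chromosomes
    singletons_only = [10] + [0] * (n - 2)
    print(sfs_contrasts(singletons_only, n))
    print(fst_single_site(0.9, 0.1))
```

Under the standard neutral expectation, where the count of sites with i derived copies is proportional to 1/i, all three estimators agree, so both contrasts are zero. A gene-level scan would compare such contrasts against a null distribution simulated under the fitted demographic model rather than against zero.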
