Whole-Genome Duplications in the Ancestral Vertebrate Are Detectable in the Distribution of Gene Family Sizes of Tetrapod Species

Journal of Molecular Evolution

Abstract

A clustering of all protein-coding genes from the complete genomes of five tetrapod species into gene families shows a clear deviation from the expected power-law distribution of gene family size. We hypothesize that at least part of this deviation results from the two whole-genome duplications (WGDs) that are now known, with reasonable certainty, to have occurred prior to the fish-tetrapod split. We build a model of homologous gene family evolution and perform simulations to show that speciations alone cannot produce a distribution that resembles the empirical data. To replicate the features of the empirical distribution, the simulation must incorporate two WGD events. In addition, a significant number of genes must have a higher duplicate retention rate following WGD than following small-scale duplication (SSD). This requirement is consistent with what is known about duplicate retention following a WGD, namely, that genes belonging to specific functional classes, such as genes regulating transcription, are much more likely to be retained following WGD than following SSD. We conclude that the deviation from the power law that we observe in the empirical data is the result of the two WGDs that occurred in the ancestral vertebrate. This implies that the two ancient WGDs continue to have a structural effect on gene families approximately 500 million years after the initial events. On the one hand, this is a surprising result, given the limited retention of duplicates generated by a WGD and the continual SSD, which further weakens the signal created by the fraction of duplicate pairs that are retained. On the other hand, the capacity of WGD to change the architecture of gene families in a profound and lasting way is consistent with the observed correlation between WGDs and important evolutionary transitions.
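The kind of simulation described above can be illustrated with a toy birth-death process. The sketch below is not the authors' model: the function name `simulate_family_sizes`, every parameter, and all default values are assumptions chosen for illustration, and speciation is omitted entirely. Each gene is duplicated (SSD) or lost with a fixed per-step probability, and at chosen steps a WGD copies every gene, with each copy retained with probability `wgd_retention`.

```python
import random
from collections import Counter


def simulate_family_sizes(n_families=1000, steps=100, dup_rate=0.01,
                          loss_rate=0.01, wgd_steps=(), wgd_retention=0.3,
                          seed=0):
    """Toy birth-death simulation of gene family sizes.

    All parameter names and default values are illustrative assumptions,
    not values taken from the paper.
    """
    rng = random.Random(seed)
    wgd_steps = set(wgd_steps)
    sizes = [1] * n_families  # every family starts as a single gene

    for step in range(steps):
        if step in wgd_steps:
            # WGD: every gene is copied once; each copy survives with
            # probability wgd_retention.
            sizes = [n + sum(rng.random() < wgd_retention for _ in range(n))
                     for n in sizes]
        next_sizes = []
        for n in sizes:
            # SSD and loss are drawn independently for each gene.
            gains = sum(rng.random() < dup_rate for _ in range(n))
            losses = sum(rng.random() < loss_rate for _ in range(n))
            next_sizes.append(n + gains - losses)
        sizes = next_sizes

    # Map family size -> number of families; extinct families are dropped.
    return Counter(n for n in sizes if n > 0)
```

Comparing `simulate_family_sizes()` with `simulate_family_sizes(wgd_steps=(10, 20))` on a log-log plot of family size against family count is one way to look for the kind of deviation from a power law that the abstract describes. Whether this toy process reproduces the empirical pattern is not claimed here.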

Acknowledgment

This work was funded by FUGE, the functional genomics platform of the Norwegian Research Council.

Author information

Correspondence to Timothy Hughes or David A. Liberles.

About this article

Cite this article

Hughes, T., Liberles, D.A. Whole-Genome Duplications in the Ancestral Vertebrate Are Detectable in the Distribution of Gene Family Sizes of Tetrapod Species. J Mol Evol 67, 343–357 (2008). https://doi.org/10.1007/s00239-008-9145-x
