Proteome Evolution and the Metabolic Origins of Translation and Cellular Life

Journal of Molecular Evolution

Abstract

The origin of life has puzzled molecular scientists for over half a century, yet fundamental questions remain unanswered, including which came first, the metabolic machinery or the encoding nucleic acids. In this study we take a protein-centric view and explore the ancestral origins of proteins. Protein domain structures in proteomes are highly conserved and embody molecular functions and interactions that are needed for cellular and organismal processes. Here we use domain structure to study the evolution of molecular function in the protein world. Timelines describing the age and function of protein domains at fold, fold superfamily, and fold family levels of structural complexity were derived from a structural phylogenomic census in hundreds of fully sequenced genomes. These timelines unfold congruent hourglass patterns in rates of appearance of domain structures and functions, functional diversity, and hierarchical complexity, and reveal a gradual build-up of protein repertoires associated with metabolism, translation, and DNA, in that order. The most ancient domain architectures were hydrolase enzymes, and the first translation domains had catalytic functions for aminoacylation and the molecular switch-driven transport of RNA. Remarkably, the most ancient domains had metabolic roles, did not interact with RNA, and preceded the gradual build-up of translation. In fact, the first translation domains also had a metabolic origin and were only later followed by specialized translation machinery. Our results explain how the generation of structure in the protein world and the concurrent crystallization of translation and diversified cellular life created further opportunities for proteomic diversification.


Abbreviations

aRS: Aminoacyl-tRNA synthetase

F: Fold

FSF: Fold superfamily

FF: Fold family

Nd: Node distance

r-Protein: Ribosomal protein

SCOP: Structural classification of proteins

Acknowledgments

A substantial portion of this work is part of DCA’s undergraduate thesis. We thank Ajith Harish and Feng-Jie Sun for providing data on RNA-protein interactions, Minglei Wang for phylogenomic reconstruction, and Rakhee Kalelkar for help with construction of Z-diagrams. Research was supported by the National Science Foundation (MCB-0749836), the Illinois C-FAR program, CREES-USDA, and the International Atomic Energy Agency in Vienna. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funding agencies.

Author information

Corresponding author

Correspondence to Gustavo Caetano-Anollés.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 733 kb)

About this article

Cite this article

Caetano-Anollés, D., Kim, K.M., Mittenthal, J.E. et al. Proteome Evolution and the Metabolic Origins of Translation and Cellular Life. J Mol Evol 72, 14–33 (2011). https://doi.org/10.1007/s00239-010-9400-9
