Modular Assembly of Genes and the Evolution of New Functions

Genetica

Abstract

Modular assembly of novel genes from existing genes has long been thought to be an important source of evolutionary novelty. Thanks to major advances in genomic studies, it has now become clear that this mechanism contributed significantly to the evolution of novel biological functions in different evolutionary lineages. Analyses of completely sequenced bacterial, archaeal and eukaryotic genomes have revealed that modular assembly of novel constituents of various eukaryotic intracellular signalling pathways played a major role in the evolution of eukaryotes. Comparison of the genomes of single-celled eukaryotes, multicellular plants and animals has also shown that the evolution of multicellularity was accompanied by the assembly of numerous novel extracellular matrix proteins and extracellular signalling proteins that are absolutely essential for multicellularity. There is now strong evidence that exon-shuffling played a general role in the assembly of the modular proteins involved in extracellular communications of metazoa. Although some of these proteins seem to be shared by all major groups of metazoa, others are restricted to certain evolutionary lineages. The genomic features of the chordates appear to have favoured intronic recombination, as evidenced by the fact that exon-shuffling continued to be a major source of evolutionary novelty during vertebrate evolution.

Patthy, L. Modular Assembly of Genes and the Evolution of New Functions. Genetica 118, 217–231 (2003). https://doi.org/10.1023/A:1024182432483
