  • Review Article

Who's your neighbor? New computational approaches for functional genomics

Abstract

Several recently developed computational approaches in comparative genomics go beyond sequence comparison. By analyzing phylogenetic profiles of protein families, domain fusions, gene adjacency in genomes, and expression patterns, these methods predict many functional interactions between proteins and help deduce specific functions for numerous proteins. Although some of the resultant predictions may not be highly specific, these developments herald a new era in genomics in which the benefits of comparative analysis of the rapidly growing collection of complete genomes will become increasingly obvious.
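
As a rough illustration of the phylogenetic-profile idea mentioned above, the following minimal Python sketch treats each protein family as a presence/absence vector across genomes and reports families whose vectors match. The family names, the profile values, and the helper functions (`hamming`, `is_informative`, `linked_pairs`) are hypothetical and chosen only for illustration; this is not the authors' implementation, and real analyses use many more genomes and more careful statistics.

```python
from itertools import combinations

# Hypothetical presence/absence profiles: one position per genome,
# 1 = family present, 0 = absent. All values are invented for illustration.
PROFILES = {
    "family_1": (1, 0, 1, 1, 0),
    "family_2": (1, 0, 1, 1, 0),
    "family_3": (0, 1, 0, 0, 1),
    "family_4": (1, 1, 1, 1, 1),
}


def hamming(p, q):
    """Number of genomes in which two profiles disagree."""
    return sum(a != b for a, b in zip(p, q))


def is_informative(profile):
    """A family present in every genome or in none carries no co-occurrence signal."""
    return 0 < sum(profile) < len(profile)


def linked_pairs(profiles, max_distance=0):
    """Pairs of informative families whose profiles differ in at most max_distance genomes."""
    informative = {name: p for name, p in profiles.items() if is_informative(p)}
    return [
        (a, b)
        for a, b in combinations(sorted(informative), 2)
        if hamming(informative[a], informative[b]) <= max_distance
    ]


if __name__ == "__main__":
    # family_1 and family_2 share an identical profile, so they are
    # reported as candidates for a functional link.
    print(linked_pairs(PROFILES))
```

Identical or near-identical profiles only suggest a functional link; as the abstract notes, such predictions may not be highly specific, so a match is best read as a hypothesis to test rather than an assignment.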

Figure 1: Context-based approaches in comparative genomics.
Figure 2: Phylogenetic patterns, domain fusion, and gene clustering help predict functional pathways.

Acknowledgements

We thank L. Aravind, Arcady Mushegian, and Yuri Wolf for numerous helpful discussions of the issues considered in this article, and Martijn Huynen and Peer Bork for sending us preprints of their publications.

Author information

Corresponding author

Correspondence to Eugene V. Koonin.

About this article

Cite this article

Galperin, M., Koonin, E. Who's your neighbor? New computational approaches for functional genomics. Nat Biotechnol 18, 609–613 (2000). https://doi.org/10.1038/76443

  • DOI: https://doi.org/10.1038/76443
