Abstract
Several recently developed computational approaches in comparative genomics go beyond sequence comparison. By analyzing phylogenetic profiles of protein families, domain fusions, gene adjacency in genomes, and expression patterns, these methods predict many functional interactions between proteins and help deduce specific functions for numerous proteins. Although some of the resultant predictions may not be highly specific, these developments herald a new era in genomics in which the benefits of comparative analysis of the rapidly growing collection of complete genomes will become increasingly obvious.
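As a rough illustration of the first of these approaches, the sketch below compares phylogenetic profiles, meaning the set of genomes in which each protein family is present, and reports families whose profiles match. It is a minimal example under stated assumptions, not the procedure used in the article: the family names, genome names, the `PROFILES` table, the Jaccard similarity measure, and the threshold are all hypothetical choices made for illustration.

```python
from itertools import combinations

# Hypothetical presence/absence data: protein family -> genomes encoding it.
# Family and genome names are placeholders, not data from the article.
PROFILES = {
    "famA": {"g1", "g2", "g4"},
    "famB": {"g1", "g2", "g4"},
    "famC": {"g2", "g3"},
    "famD": {"g1", "g2", "g3", "g4"},
}


def jaccard(a, b):
    """Shared genomes divided by all genomes in either profile (1.0 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def linked_families(profiles, threshold=1.0):
    """Return (family, family, score) for pairs whose similarity meets the threshold."""
    pairs = []
    for name_a, name_b in combinations(sorted(profiles), 2):
        score = jaccard(profiles[name_a], profiles[name_b])
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    return pairs


if __name__ == "__main__":
    # Families with identical profiles are candidates for a functional link.
    for name_a, name_b, score in linked_families(PROFILES):
        print(f"{name_a} <-> {name_b}: similarity {score:.2f}")
```

Lowering `threshold` admits near-matching profiles, which trades specificity for coverage; this is one way in which such predictions can become less specific, as the abstract cautions.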
Acknowledgements
We thank L. Aravind, Arcady Mushegian, and Yuri Wolf for numerous helpful discussions of the issues considered in this article, and Martijn Huynen and Peer Bork for sending us preprints of their publications.
Cite this article
Galperin, M., Koonin, E. Who's your neighbor? New computational approaches for functional genomics. Nat Biotechnol 18, 609–613 (2000). https://doi.org/10.1038/76443
DOI: https://doi.org/10.1038/76443