Annotation Transfer Between Genomes: Protein–Protein Interologs and Protein–DNA Regulogs

Haiyuan Yu¹, Nicholas M. Luscombe¹, Hao Xin Lu¹, Xiaowei Zhu¹, Yu Xia¹, Jing-Dong J. Han², Nicolas Bertin², Sambath Chung¹, Marc Vidal², and Mark Gerstein¹,³

¹ Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, USA
² Dana-Farber Cancer Institute and Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA

Abstract

Proteins function mainly through interactions, especially with DNA and other proteins. While some large-scale interaction networks are now available for a number of model organisms, their experimental generation remains difficult. Consequently, interolog mapping (the transfer of interaction annotation from one organism to another using comparative genomics) is of significant value. Here we quantitatively assess the degree to which interologs can be reliably transferred between species as a function of the sequence similarity of the corresponding interacting proteins. Using interaction information from Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, and Helicobacter pylori, we find that protein–protein interactions can be transferred when a pair of proteins has a joint sequence identity >80% or a joint E-value <10⁻⁷⁰. (These “joint” quantities are the geometric means of the identities or E-values for the two pairs of interacting proteins.) We generalize our interolog analysis to protein–DNA binding, finding that such interactions are conserved at specific thresholds between 30% and 60% sequence identity, depending on the protein family. Furthermore, we introduce the concept of a “regulog”: a conserved regulatory relationship between proteins across different species. We map interologs and regulogs from yeast to a number of genomes with limited experimental annotation (e.g., Arabidopsis thaliana) and make these available through an online database at http://interolog.gersteinlab.org. Specifically, we are able to transfer ∼90,000 potential protein–protein interactions to the worm. We test a number of these in two-hybrid experiments and are able to verify 45 overlaps, which we show to be statistically significant.
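
The "joint" quantities described above can be illustrated with a short sketch. It assumes that the percent identity and E-value for each of the two ortholog pairs are already available from a sequence comparison. The function names are hypothetical and are not taken from the paper or its database. Only the geometric-mean definition and the two thresholds (80% identity, E-value of 1e-70) come from the abstract.

```python
import math

# Thresholds quoted in the abstract for protein-protein interolog transfer.
JOINT_IDENTITY_THRESHOLD = 80.0   # percent
JOINT_EVALUE_THRESHOLD = 1e-70


def joint_identity(identity_a, identity_b):
    """Geometric mean of two percent identities (hypothetical helper)."""
    return math.sqrt(identity_a * identity_b)


def joint_evalue(evalue_a, evalue_b):
    """Geometric mean of two E-values (hypothetical helper).

    Computed in log space so that very small E-values do not underflow
    when multiplied. An E-value reported as 0 gives a joint value of 0.
    """
    if evalue_a <= 0 or evalue_b <= 0:
        return 0.0
    mean_log = (math.log10(evalue_a) + math.log10(evalue_b)) / 2
    return 10 ** mean_log


def is_transferable(identity_a, identity_b, evalue_a, evalue_b):
    """Apply the abstract's rule: joint identity > 80% OR joint E-value < 1e-70."""
    return (
        joint_identity(identity_a, identity_b) > JOINT_IDENTITY_THRESHOLD
        or joint_evalue(evalue_a, evalue_b) < JOINT_EVALUE_THRESHOLD
    )
```

For example, two ortholog pairs at 90% and 85% identity give a joint identity of about 87.5%, which passes the identity criterion regardless of their E-values.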

Footnotes

  • [Supplemental material is available online at www.genome.org. The interologs and regulogs mapped from yeast to other genomes are available online at http://interolog.gersteinlab.org.]

  • Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1774904.

  • ³ Corresponding author. E-mail Mark.Gerstein@yale.edu; fax 1 360 838 7861.

  • Received July 19, 2003.

  • Accepted March 18, 2004.