The value of high quality protein–protein interaction networks for systems biology

https://doi.org/10.1016/j.cbpa.2006.10.005

Protein–protein interaction (PPI) networks contain a large amount of useful information for the functional characterization of proteins and promote the understanding of the complex molecular relationships that determine the phenotype of a cell. Recently, large human interaction maps have been generated with high-throughput technologies such as the yeast two-hybrid system. However, they are static and incomplete and do not provide immediate clues about the cellular processes that convert genetic information into complex phenotypes. Refined multiple-aspect PPI screening and confirmation strategies will have to be put in place to increase the validity of interaction maps. Integration of interaction data with other qualitative and quantitative information (e.g. protein expression or localization data) will be required to construct networks of protein function that reflect dynamic processes in the cell. In this way, combined PPI networks can become valuable resources for a systems-level understanding of cellular processes and complex phenotypes.

Introduction

Protein–protein interactions (PPIs) are fundamental to all biological processes. A comprehensive determination of all PPIs that can take place in a cell would be an invaluable asset for the understanding of biology at a systems level. In the past five years, PPI and other high throughput (HTP) research to define protein function and regulation has been predominantly focused on yeast.

The yeast genome was one of the first to be fully sequenced. The mostly intron-lacking genes coding for ∼6200 proteins were cloned with relative ease. Yeast geneticists have been successful in large-scale recombinational cloning, whole-genome tagging and gene deletion analysis. Approximately 1100 genes essential for yeast viability have been determined, and several phenotypic screens have been performed with the remaining gene deletion strains [1]. Systematic yeast two-hybrid (Y2H) and mass spectrometry-based biochemical affinity co-purification approaches have revealed most of the ∼18 000 PPIs. Synthetic lethality screens have revealed another 4000 functional relationships between yeast proteins [2]. The subcellular localization of most yeast proteins has been determined with high precision, as evident from 80% agreement between two independent studies [3, 4]. Protein abundances have also been measured using the complete set of chromosomally TAP-tagged yeast genes [5] as well as GFP-tagged strains [6]. Transcription profiles have been recorded under numerous conditions. Therefore, yeast has become the central organism for ‘systems biology of the cell’ [7], with distinct types of qualitative and quantitative data having been integrated. The first dynamic network models of cellular processes such as the yeast cell cycle have been successfully generated [8••]. It has become obvious that qualitative data are highly useful and probably essential for time-resolved, quantitative descriptions. The generation of quantitative data is probably best performed in a framework of qualitative results, such as interaction or localization data [9].

Genome-wide approaches for mammalian systems are advancing, largely profiting from the comprehensive yeast studies. However, integrative network studies for human systems are still in their infancy for several reasons. Human gene annotation is much more complicated; the cloning of full-length ORFs is a major challenge for functional genome research [10]. Recombinant protein expression is difficult, and genetic manipulation is not as powerful and efficient as in model organisms. Phenotypic readouts are much more complex than in yeast. Nevertheless, several proteomics studies have revealed the first cell-type-specific protein compositions [11]. Together with protein localization studies, they have started defining cell-type-specific and tissue-specific parts of the human proteome [12••]. After the publication of smaller, focused PPI maps, the first genome-scale human PPI networks were generated experimentally [13••, 14••].

Here, we focus on recent approaches to determine PPIs of higher eukaryotes. We discuss the quality of the interaction data, the results achieved and the potential application of human protein interaction networks for a systems level understanding of cellular processes.

Yeast two-hybrid based PPI networks

The Y2H system is a genetic technique where two interacting proteins reconstitute transcriptional activity of a split transcription factor in the nucleus of yeast, activating reporter genes (Figure 1a). Demonstrated first for yeast proteins and later for Drosophila melanogaster and Caenorhabditis elegans, the method can be used for the identification of PPIs in an HTP manner. Several large-scale studies have been published describing two-hybrid-based interaction networks of human or mouse

Towards high quality interaction networks

Although much information is provided by HTP interaction studies that greatly helps in developing new biological hypotheses and designing experiments, interaction networks are still incomplete and error prone. From a comparison of methods and strategies of the different Y2H studies, several issues emerge that are relevant for the improvement of the quality and usefulness of the data, which is paramount for the generation of a comprehensive human protein interaction map.

  1. Well annotated, fully

Alternative two-hybrid methods

One clear bias of the classical Y2H approach is that membrane proteins are strongly underrepresented in the resulting interaction data. This is probably due to difficulties in the expression of membrane proteins and their inability to enter the nucleus. A concise account of alternative methods for the analysis of membrane protein interactions using yeast-based technologies has been given previously [29]. The split ubiquitin system, for example, is based on the functional reconstitution of the

Protein arrays for the determination of PPIs

Protein array technology [32] is becoming increasingly important for the identification of PPIs. Developments circumventing the difficulties in mammalian protein purification turned protein arrays into useful tools for interaction screening. In situ synthesis of proteins in reticulocyte cell extracts on antibody-coated chips, for example, has eliminated the need to purify proteins [33]. These chips carry sufficient protein to perform interaction studies, as was shown for a 29 × 29 interaction

Literature collection and computational prediction of human PPIs

Recently, a somewhat provocative study came to the conclusion that new interaction knowledge tends to attach to islands of highly interconnected known facts [36]. To bridge these gaps between the islands, a few initiatives are trying to simply collect protein interaction knowledge from the literature, indeed creating large, extremely valuable protein interaction datasets. The most ‘puristic’ approach is to read through the scientific literature and collect reported PPIs one by one. This is what

Combining PPI interaction data

There is a broad spectrum of methods to be used for the generation of protein interaction data. The methods differ to such an extent that they result in complementary rather than overlapping data. Therefore, combination of different techniques for the identification of PPIs is a good way to draw a more comprehensive picture (Figure 2b). In a recent study, more than 600 interactions for the yeast chaperone Hsp90 were identified with a combination of four different methods [44••]. Y2H and tandem

Making PPI networks dynamic

The results of comprehensive PPI studies are large, highly connected, static PPI maps. Much has been learned from the analysis of topology, motifs and statistical parameters, treating the networks as single entities [45]. Nevertheless, this view does not account for spatial and temporal aspects, and obviously does not reflect the actual situation in a cell.
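The contrast between a static map and a spatially resolved view can be illustrated with a minimal sketch. It assumes a toy undirected edge list and a toy localization annotation; the protein names, the compartments and the helper functions `degrees` and `colocalized` are all hypothetical and are not taken from any of the studies discussed here. The sketch first computes a simple topological parameter (the number of interaction partners per protein) on the whole map, and then recomputes it on the subnetwork of interactions whose partners share a compartment.

```python
from collections import defaultdict

# Hypothetical toy data: protein names, interactions and compartments are
# invented for illustration and do not come from any published dataset.
INTERACTIONS = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
LOCALIZATION = {
    "A": {"nucleus", "cytoplasm"},
    "B": {"nucleus"},
    "C": {"nucleus"},
    "D": {"cytoplasm"},
    "E": {"membrane"},
}


def degrees(edges):
    """Count distinct interaction partners per protein in an undirected edge list."""
    partners = defaultdict(set)
    for a, b in edges:
        partners[a].add(b)
        partners[b].add(a)
    return {protein: len(found) for protein, found in partners.items()}


def colocalized(edges, localization, compartment):
    """Keep only interactions whose two partners are both annotated to the compartment."""
    return [
        (a, b)
        for a, b in edges
        if compartment in localization.get(a, set())
        and compartment in localization.get(b, set())
    ]


# Topology of the static map, treating the network as a single entity.
static_degrees = degrees(INTERACTIONS)

# The same parameter restricted to interactions that could occur in the nucleus.
nuclear_edges = colocalized(INTERACTIONS, LOCALIZATION, "nucleus")
nuclear_degrees = degrees(nuclear_edges)

print("static:", static_degrees)
print("nuclear:", nuclear_degrees)
```

In this toy example, protein A has three partners in the static map but only two in the nuclear subnetwork, which shows how a topological parameter can change once spatial information is taken into account. Temporal information such as expression profiles could be applied in the same way, as a further filter on the edge list.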

The identification of interaction clusters, believed to represent functional cellular modules or protein complexes, remains a key issue in

Conclusion

Current approaches to determine human PPIs have room for improvement, both in terms of coverage and accuracy. However, experimental and computational interactome research will ultimately yield a reliable global map of protein relationships. Using large-scale PPI maps as frameworks, subnetworks of several distinct states of dynamic cellular processes will be describable through integration of other types of information such as protein expression, localization and gene regulatory data. Perturbed

References and recommended reading

Papers of particular interest, published within the annual period of review, have been highlighted as:

  • • of special interest

  • •• of outstanding interest

Acknowledgements

We would like to thank S. Schnögl and M. Lalowski for critical reading and suggestions, and we acknowledge the support of the German BMBF (NGFN2, KB-P04T03, 01GR0471).

References (49)

  • H.W. Mewes et al. MIPS: analysis and annotation of proteins from whole genomes in 2005. Nucleic Acids Res (2006)
  • W.K. Huh et al. Global analysis of protein localization in budding yeast. Nature (2003)
  • A. Kumar et al. Subcellular localization of the yeast proteome. Genes Dev (2002)
  • S. Ghaemmaghami et al. Global analysis of protein expression in yeast. Nature (2003)
  • J.R. Newman et al. Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature (2006)
  • U. de Lichtenberg et al. Dynamic complex formation during the yeast cell cycle. Science (2005)
  • P. Aloy et al. Structural systems biology: modelling protein interactions. Nat Rev Mol Cell Biol (2006)
  • D.S. Gerhard et al. The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC). Genome Res (2004)
  • M.O. Collins et al. Molecular characterization and comparison of the components and multiprotein complexes in the postsynaptic proteome. J Neurochem (2006)
  • M. Uhlen et al. A human protein atlas for normal and cancer tissues based on antibody proteomics. Mol Cell Proteomics (2005)
  • U. Stelzl et al. A human protein-protein interaction network: a resource for annotating the proteome. Cell (2005)
  • H. Goehler et al. A protein interaction network links GIT1, an enhancer of huntingtin aggregation, to Huntington's disease. Mol Cell (2004)
  • F. Colland et al. Functional proteomics mapping of a human signaling pathway. Genome Res (2004)
  • H. Suzuki et al. The mammalian protein-protein interaction database and its viewing system that is linked to the main FANTOM2 viewer. Genome Res (2003)