The value of high quality protein–protein interaction networks for systems biology
Introduction
Protein–protein interactions (PPIs) are fundamental to all biological processes. A comprehensive determination of all PPIs that can take place in a cell would be an invaluable asset for the understanding of biology at a systems level. In the past five years, PPI and other high-throughput (HTP) research to define protein function and regulation has focused predominantly on yeast.
The yeast genome was one of the first to be fully sequenced. The mostly intron-lacking genes coding for ∼6200 proteins were cloned with relative ease. Yeast geneticists have been successful in large-scale recombinational cloning, whole genome tagging and gene deletion analysis. Approximately 1100 genes essential for yeast viability have been determined, and several phenotypic screens have been performed with the remaining gene deletion strains [1]. Systematic yeast two-hybrid (Y2H) and mass spectrometry-based biochemical affinity co-purification approaches have revealed most of the ∼18 000 PPIs. Synthetic lethality screens have revealed another 4000 functional relationships between yeast proteins [2]. The subcellular localization of most yeast proteins has been determined with high precision, as is evident from the 80% agreement between two independent studies [3, 4]. Protein abundances have also been measured using the complete set of chromosomally TAP-tagged yeast genes [5] as well as GFP-tagged strains [6]. Transcription profiles have been recorded under numerous conditions. Therefore, yeast has become the central organism for ‘systems biology of the cell’ [7], with distinct types of qualitative and quantitative data having been integrated. The first dynamic network models of cellular processes such as the yeast cell cycle have been successfully generated [8••]. It has become obvious that qualitative data are highly useful and probably essential for time-resolved, quantitative descriptions. The generation of quantitative data is probably best performed in a framework of qualitative results, such as interaction or localization data [9].
Genome-wide approaches for mammalian systems are advancing, largely profiting from the comprehensive yeast studies. However, integrative network studies for human systems are still in their infancy for several reasons. Human gene annotation is much more complicated; the cloning of full length ORFs is a major challenge for functional genome research [10]. Recombinant protein expression is difficult and genetic manipulation is not as powerful and efficient as in model organisms. Phenotypical readouts are much more complex than in yeast. Nevertheless, several proteomics studies have revealed the first cell type specific protein compositions [11•]. Together with protein localization studies they have started defining cell type and tissue specific parts of the human proteome [12••]. After the publication of smaller, focused PPI maps, the first genome-scale human PPI networks were generated experimentally [13••, 14••].
Here, we focus on recent approaches to determine PPIs of higher eukaryotes. We discuss the quality of the interaction data, the results achieved and the potential application of human protein interaction networks for a systems level understanding of cellular processes.
Yeast two-hybrid based PPI networks
The Y2H system is a genetic technique where two interacting proteins reconstitute transcriptional activity of a split transcription factor in the nucleus of yeast, activating reporter genes (Figure 1a). Demonstrated first for yeast proteins and later for Drosophila melanogaster and Caenorhabditis elegans, the method can be used for the identification of PPIs in an HTP manner. Several large-scale studies have been published describing two-hybrid-based interaction networks of human or mouse
Towards high quality interaction networks
Although HTP interaction studies provide much information that greatly helps in developing new biological hypotheses and designing experiments, interaction networks are still incomplete and error-prone. A comparison of the methods and strategies of the different Y2H studies raises several issues relevant to improving the quality and usefulness of the data, which is paramount for generating a comprehensive human protein interaction map.
1. Well annotated, fully
Alternative two-hybrid methods
One clear bias of the classical Y2H approach is that membrane proteins are strongly underrepresented in the resulting interaction data. This is probably due to difficulties in the expression of membrane proteins and their inability to enter the nucleus. A concise account of alternative methods for the analysis of membrane protein interactions using yeast-based technologies has been given previously [29]. The split ubiquitin system, for example, is based on the functional reconstitution of the
Protein arrays for the determination of PPIs
Protein array technology [32] is becoming increasingly important for the identification of PPIs. Developments circumventing the difficulties in mammalian protein purification turned protein arrays into useful tools for interaction screening. In situ synthesis of proteins in reticulocyte cell extracts on antibody-coated chips, for example, has eliminated the need to purify proteins [33]. These chips carry sufficient protein to perform interaction studies, as was shown for a 29 × 29 interaction
Literature collection and computational prediction of human PPIs
Recently, a somewhat provocative study came to the conclusion that new interaction knowledge tends to attach to islands of highly interconnected known facts [36]. To bridge these gaps between the islands, a few initiatives are trying to simply collect protein interaction knowledge from the literature, indeed creating large, extremely valuable protein interaction datasets. The most ‘puristic’ approach is to read through the scientific literature and collect reported PPIs one by one. This is what
Combining PPI interaction data
There is a broad spectrum of methods to be used for the generation of protein interaction data. The methods differ to such an extent that they result in complementary rather than overlapping data. Therefore, combination of different techniques for the identification of PPIs is a good way to draw a more comprehensive picture (Figure 2b). In a recent study, more than 600 interactions for the yeast chaperone Hsp90 were identified with a combination of four different methods [44••]. Y2H and tandem
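The idea of combining complementary datasets can be illustrated with a minimal Python sketch. It is not the procedure used in any of the cited studies: the protein labels, method names and both helper functions are hypothetical and serve only to show how interaction sets from different techniques might be merged and how their overlap might be quantified. Interactions are treated as undirected pairs.

```python
from collections import defaultdict


def combine_interactions(datasets):
    """Merge undirected PPI lists from several methods.

    `datasets` maps a method name to an iterable of (protein_a, protein_b)
    pairs. Returns a dict mapping each unordered pair to the set of methods
    that reported it.
    """
    support = defaultdict(set)
    for method, pairs in datasets.items():
        for a, b in pairs:
            # frozenset makes (a, b) and (b, a) the same interaction
            support[frozenset((a, b))].add(method)
    return dict(support)


def jaccard_overlap(pairs_a, pairs_b):
    """Fraction of interactions shared by two datasets (Jaccard index)."""
    set_a = {frozenset(p) for p in pairs_a}
    set_b = {frozenset(p) for p in pairs_b}
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Toy data only: labels and method names are placeholders, not real results.
toy_datasets = {
    "Y2H": [("A", "B"), ("B", "C")],
    "TAP-MS": [("B", "A"), ("C", "D")],
}

combined = combine_interactions(toy_datasets)
multi_method = [pair for pair, methods in combined.items() if len(methods) > 1]
overlap = jaccard_overlap(toy_datasets["Y2H"], toy_datasets["TAP-MS"])
```

In this toy case the union contains three interactions, only one of which is reported by both methods. A low overlap of this kind is what the text describes as complementary rather than overlapping data.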
Making PPI networks dynamic
The results of comprehensive PPI studies are large, highly connected, static PPI maps. Much has been learned from the analysis of topology, motifs and statistical parameters treating the networks as single entities [45]. Nevertheless, this view does not account for spatial and temporal aspects, and obviously does not reflect the actual situation in a cell.
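One simple way to picture the step from a static map to state-specific subnetworks is sketched below in Python. It is an assumption-laden illustration, not a published method: the function name, the protein labels and the per-state presence sets (which could in principle be derived from expression or localization data) are all hypothetical. An interaction is kept for a given state only if both partners are present in that state, which is a necessary but not a sufficient condition for the interaction to occur.

```python
def active_subnetwork(edges, present_proteins):
    """Return the interactions whose two partners are both present."""
    return {
        (a, b)
        for a, b in edges
        if a in present_proteins and b in present_proteins
    }


# Toy static map and toy presence sets; all labels are placeholders.
static_map = {("A", "B"), ("B", "C"), ("C", "D")}
presence_by_state = {
    "state_1": {"A", "B", "C"},
    "state_2": {"B", "C", "D"},
}

# One subnetwork per cellular state, all derived from the same static map.
snapshots = {
    state: active_subnetwork(static_map, present)
    for state, present in presence_by_state.items()
}
```

Here the interaction between B and C appears in both states, whereas the other two interactions are each restricted to a single state.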
The identification of interaction clusters, believed to represent functional cellular modules or protein complexes, remains a key issue in
Conclusion
Current approaches to determine human PPIs have room for improvement, both in terms of coverage and accuracy. However, experimental and computational interactome research will ultimately yield a reliable global map of protein relationships. Using large-scale PPI maps as frameworks, subnetworks of several distinct states of dynamic cellular processes will be describable through integration of other types of information such as protein expression, localization and gene regulatory data. Perturbed
References and recommended reading
Papers of particular interest, published within the annual period of review, have been highlighted as:
• of special interest
•• of outstanding interest
Acknowledgements
We would like to thank S. Schnögl and M. Lalowski for critical reading and suggestions, and we acknowledge the support of the German BMBF (NGFN2, KB-P04T03, 01GR0471).
References (49)
- et al. Analyzing protein function on a genomic scale: the importance of gold-standard positives and negatives for network prediction. Curr Opin Microbiol (2004)
- et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature (2005)
- et al. A protein-protein interaction network for human inherited ataxias and disorders of Purkinje cell degeneration. Cell (2006)
- et al. A protein interaction framework for human mRNA degradation. Genome Res (2004)
- et al. Analysis of a high-throughput yeast two-hybrid system and its use to predict the function of intracellular proteins encoded within the human MHC class III region. Genomics (2004)
- et al. A strategy for constructing large protein interaction maps using the yeast two-hybrid system: regulated expression arrays and two-phase mating. Genome Res (2003)
- et al. Analysis of membrane protein interactions using yeast-based technologies. Trends Biochem Sci (2002)
- et al. Emergent behavior of growing knowledge about molecular interactions. Nat Biotechnol (2005)
- et al. A network-based analysis of systemic inflammation in humans. Nature (2005)
- et al. Maximizing the potential of functional genomics. Nat Rev Genet (2004)
- MIPS: analysis and annotation of proteins from whole genomes in 2005. Nucleic Acids Res
- Global analysis of protein localization in budding yeast. Nature
- Subcellular localization of the yeast proteome. Genes Dev
- Global analysis of protein expression in yeast. Nature
- Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature
- Dynamic complex formation during the yeast cell cycle. Science
- Structural systems biology: modelling protein interactions. Nat Rev Mol Cell Biol
- The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC). Genome Res
- Molecular characterization and comparison of the components and multiprotein complexes in the postsynaptic proteome. J Neurochem
- A human protein atlas for normal and cancer tissues based on antibody proteomics. Mol Cell Proteomics
- A human protein-protein interaction network: a resource for annotating the proteome. Cell
- A protein interaction network links GIT1, an enhancer of huntingtin aggregation, to Huntington's disease. Mol Cell
- Functional proteomics mapping of a human signaling pathway. Genome Res
- The mammalian protein-protein interaction database and its viewing system that is linked to the main FANTOM2 viewer. Genome Res