Journal of Molecular Biology
Protein–Protein Interactions: Hot Spots and Structurally Conserved Residues often Locate in Complemented Pockets that Pre-organized in the Unbound States: Implications for Docking
Introduction
Protein–protein interactions are critical for practically all biological functions, including signal transduction, metabolism, vesicle transport and mitogenic processes. To comprehend the mechanism of biological processes, the function of the proteins must be considered within the context of other interacting proteins. Considerable efforts have centered on studies of the principles governing protein–protein interactions, including interface residue contacts, morphology, hydrophobic patches, conservation, residue propensities and secondary structures.1, 2, 3, 4, 5 Identification of binding sites is critical for establishing protein–protein interaction networks, cellular pathways and regulation. In addition, it can assist in drug design and quaternary structure prediction.
Residues at protein–protein binding sites are more conserved than on the rest of the protein surface.4, 6, 7, 8, 9, 10, 11, 12 Aided by a conservation analysis of interfacial residues, predictions of protein binding sites have recently achieved significant success.6, 12, 13 On the basis of the work of Tsai et al. in 1996,14 Keskin et al. have constructed a substantially enlarged dataset of non-redundant protein–protein interfaces.15 Unlike other sequence-non-redundant datasets, this dataset is derived through structural comparisons of interfaces, independent of both the sequence order at the interface and the folds of the two parent protein chains.9 Following a selection of 3799 non-redundant interface clusters,15 we have obtained 67 clusters with 343 members and over 3600 structurally conserved residues, allowing a statistical analysis. This database is useful, since it may be explored for important features in protein–protein interactions that may be independent of sequence order or backbone folds.
In this study, we examine the geometrical features of the structurally conserved residues in the form of pockets and voids, which are known to populate protein interfaces abundantly.16, 17 First, we study unfilled pockets (Figure 1). These are pockets that remain after protein–protein association. They represent geometrical features of packing defects at the protein interface. Second, we study complemented pockets (Figure 1). These pockets are present when the two proteins are separated, but disappear following association. They represent binding regions on interfaces that have a non-trivial geometric shape and fit tightly. Details on pocket identification are described in Materials and Methods.
Alanine scanning of interfaces has shown that a few key residues can contribute dominantly to the binding free energy of protein–protein complexes.18, 19, 20 A residue is defined as a hot spot if its mutation to alanine causes a significant change in binding free energy (ΔΔG≥2 kcal/mol; 1 cal=4.184 J). The hot spots collected by Bogan & Thorn21, 22 have been shown to overlap remarkably well with structurally conserved residues.9, 10 Thus, we study the relationship between interfacial pockets and experimental energy hot spots. In silico, efficient energy functions have been developed for the prediction of the experimentally measured free energy change upon alanine substitution.23, 24 However, there are no general principles to address the question of what makes an interfacial residue a hot spot. With remarkable foresight, Bogan & Thorn proposed that some hot spots are largely surrounded by hydrophobic O-rings.21 Nevertheless, prediction of hot spots has remained a difficult task.18, 25 Hydrophobicity, shape, charge and interfacial residue type have been shown to be inadequate for explaining or predicting the energy hot spots.5, 23 To address this question, we investigate the geometrical features of the hot spots, focusing in particular on whether they are located in tightly fitting interface indentations.
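The hot spot definition above is a simple threshold on ΔΔG, and it can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the residue labels and ΔΔG values in the usage lines are hypothetical, while the 2 kcal/mol cut-off and the conversion factor 1 cal=4.184 J come from the text. The 4 kcal/mol "red-hot" cut-off is the one used elsewhere in this Introduction.

```python
CAL_TO_J = 4.184          # 1 cal = 4.184 J (from the text)
HOT_SPOT_KCAL = 2.0       # hot spot: ddG >= 2 kcal/mol (from the text)
RED_HOT_KCAL = 4.0        # red-hot: ddG >= 4 kcal/mol (from the text)


def kcal_to_kj(ddg_kcal):
    """Convert a free energy change from kcal/mol to kJ/mol."""
    return ddg_kcal * CAL_TO_J


def classify_residue(ddg_kcal):
    """Label a residue by its alanine-scanning ddG (kcal/mol)."""
    if ddg_kcal >= RED_HOT_KCAL:
        return "red-hot"
    if ddg_kcal >= HOT_SPOT_KCAL:
        return "hot spot"
    return "not a hot spot"


if __name__ == "__main__":
    # Hypothetical residues and ddG values, for illustration only.
    example = {"res_1": 0.4, "res_2": 2.0, "res_3": 4.5}
    for name, ddg in example.items():
        print(name, classify_residue(ddg), f"{kcal_to_kj(ddg):.2f} kJ/mol")
```

Note that every red-hot residue is also a hot spot under the 2 kcal/mol definition; the sketch simply reports the stricter label when it applies.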
Our analysis indicates that whereas most interfaces have packing defects in the form of unfilled pockets, they also have tight-fitting regions characterized by complemented pockets. Importantly, these regions are enriched in structurally conserved residues. For the cases where both proteins in the complex were alanine-scanned,22 the complemented pockets and protruding residues identify 62% of all known hot spots, and 60% of the residues in complemented pockets are hot spots. We further examine the red-hot residues (ΔΔG≥4 kcal/mol). We find that 93% (13/14) of the red-hot residues are protruding or complemented pocket residues. Our results point toward the crucial role of local tight packing in non-trivial geometrical shapes in protein–protein interactions. Evolution has optimized tight fitting through a complemented pocket hot region organization. These concave indentations may provide sites for drug discovery.
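The percentages above are of two kinds: the share of known hot spots that a set of residues recovers, and the share of a set of residues that turn out to be hot spots. A minimal sketch of both quantities follows. The function names and the residue labels are hypothetical and are not the data behind the 62% and 60% figures; only the 13/14 count is taken from the text.

```python
def coverage(predicted, known_hot_spots):
    """Fraction of the known hot spots that appear in the predicted set."""
    known = set(known_hot_spots)
    if not known:
        raise ValueError("no known hot spots given")
    return len(set(predicted) & known) / len(known)


def precision(predicted, known_hot_spots):
    """Fraction of the predicted residues that are known hot spots."""
    pred = set(predicted)
    if not pred:
        raise ValueError("no predicted residues given")
    return len(pred & set(known_hot_spots)) / len(pred)


if __name__ == "__main__":
    # Hypothetical residue labels, for illustration only.
    known = {"A:10", "A:25", "B:7", "B:8", "B:40"}
    pocket_or_protruding = {"A:10", "B:7", "B:40", "A:31"}
    print(f"coverage:  {coverage(pocket_or_protruding, known):.0%}")
    print(f"precision: {precision(pocket_or_protruding, known):.0%}")
    # The red-hot count reported in the text: 13 of 14 residues.
    print(f"red-hot:   {13 / 14:.0%}")
```

Reporting both numbers matters because a residue set can recover many hot spots simply by being large; the second figure shows how selective it is.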
We further analyze a set of 31 protein–protein complexes compiled by Chen et al.,26 which have been crystallized in the unbound state: 18 of these contain complemented pockets. In 16 of these 18, these pockets pre-exist in the unbound state, with a low root-mean-square deviation (RMSD) between the atoms lining the corresponding pockets. This observation supports the suggestion that pre-formed pockets may be a highly populated native state feature. Combined, our results point toward the mechanistic role of the hot spots in protein–protein binding and suggest a possible scheme for identification of a hot spot in an unbound protein structure. As such, they may have applications in protein docking experiments.
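The comparison of bound and unbound pockets rests on the RMSD between corresponding pocket-lining atoms. The sketch below shows the standard RMSD formula for two equally sized, already paired and superimposed coordinate lists. It is an illustration under those assumptions, not the authors' protocol: the function name and the coordinates are hypothetical, and no superposition step is performed.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two lists of (x, y, z) tuples with one-to-one atom pairing.

    Assumes the two structures are already superimposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Hypothetical pocket-lining atom coordinates (in angstroms).
    bound = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
    unbound = [(0.1, 0.0, 0.0), (1.5, 0.2, 0.0), (1.4, 1.5, 0.1)]
    print(f"pocket RMSD: {rmsd(bound, unbound):.2f}")
```

A low value for the atoms lining a pocket indicates that the pocket geometry in the unbound structure is close to that in the complex.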
Imperfect fit: unfilled pockets on protein–protein interfaces
Shape complementarity between interacting partners is a fundamental aspect of protein–protein interactions. In 1986, Connolly27 used shape complementarity alone to address the docking problem. Achieving a good match of shapes is the focus of many protein docking methods.28, 29, 30, 31 To understand how well protein chains pack together and how shape matching can aid in docking, we analyze the distribution of unfilled pockets. Our result is consistent with the report by Hubbard & Argos,16
Discussion and Conclusions
Protein–protein interfaces are “porous” and are rarely packed perfectly.16 Unfilled and complemented pockets are often scattered across the interfaces. Most large protein–protein complexes have a significant number of unfilled pockets. The size of the unfilled pockets in an interface is correlated with the interface size. Experimentally, it is unclear whether unfilled pockets contain water molecules or how the dynamics of water molecules entering and escaping these pockets may affect binding
Dataset and definition of protein–protein interfaces
The dataset of structurally conserved residues is taken from Keskin et al.15 These entries consist of 21,686 two-chain interfaces. An iterative pairwise structural comparison14, 48, 49, 50 and a heuristic clustering procedure were employed to cluster these interfaces. Following six cycles, 3799 clusters were obtained. To further eliminate redundancy, if one of two sequences in a cluster shares a sequence similarity greater than 50% with the other, it is deleted from the cluster. This yields a dataset with
Acknowledgements
We thank Dr L. Admin, J. F. Zhang, A. Binkowski, R. Jackups, P. Freeman, and J. Tseng for helpful discussions. We thank Drs C.-J. Tsai, Y. Pan, K. Gunasekaran, D. Zanuy, H.-H (G). Tsai and members of the Nussinov-Wolfson group, in particular Maxim Shatsky for help with MultiProt, and Inbal Halperin for the hot spot collection. We thank Dr Jacob V. Maizel for encouragement. We thank Dr A. Gursoy and S. Aytuna for their helpful discussions. This work is supported by grants from National Science
References (57)
- et al. Morphology of protein–protein interfaces. Structure (1998)
- et al. The atomic structure of protein–protein recognition sites. J. Mol. Biol. (1999)
- et al. Evolutionary predictions of binding surfaces and interactions. Curr. Opin. Struct. Biol. (2002)
- et al. Protein functional epitopes: hot spots, dynamics and combinatorial libraries. Curr. Opin. Struct. Biol. (2001)
- et al. An evolutionary trace method defines binding surfaces common to protein families. J. Mol. Biol. (1996)
- et al. A dataset of protein–protein interfaces generated with a sequence-order-independent comparison technique. J. Mol. Biol. (1996)
- et al. Shape complementarity at protein/protein interfaces. J. Mol. Biol. (1993)
- Unraveling hot spots in binding interfaces: progress and challenges. Curr. Opin. Struct. Biol. (2002)
- et al. Anatomy of hot spots in protein interfaces. J. Mol. Biol. (1998)
- et al. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J. Mol. Biol. (2002)