Protein–Protein Interactions: Hot Spots and Structurally Conserved Residues often Locate in Complemented Pockets that Pre-organized in the Unbound States: Implications for Docking

doi:10.1016/j.jmb.2004.09.051

Journal of Molecular Biology

Volume 344, Issue 3, 26 November 2004, Pages 781-795

https://doi.org/10.1016/j.jmb.2004.09.051

Energetic hot spots account for a significant portion of the total binding free energy and correlate with structurally conserved interface residues. Here, we map experimentally determined hot spots and structurally conserved residues to investigate their geometrical organization. Unfilled pockets are pockets that remain unfilled after protein–protein complexation, while complemented pockets are pockets that disappear upon binding, representing tightly fit regions. We find that structurally conserved residues and energetic hot spots are strongly favored to be located in complemented pockets, and are disfavored in unfilled pockets. For the three available protein–protein complexes with complemented pockets where both members of the complex were alanine-scanned, 62% of all hot spots (ΔΔG>2 kcal/mol) are within these pockets, and 60% of the residues in the complemented pockets are hot spots. 93% of all red-hot residues (ΔΔG≥4 kcal/mol) either protrude into or are located in complemented pockets. The occurrence of hot spots and conserved residues in complemented pockets highlights the role of local tight packing in protein associations, and rationalizes their energetic contribution and conservation. Complemented pockets and their corresponding protruding residues emerge among the most important geometric features in protein–protein interactions. By screening the solvent, this organization shields backbone hydrogen bonds and charge–charge interactions. Complemented pockets often pre-exist binding. For 18 protein–protein complexes with complemented pockets whose unbound structures are available, the pockets are found to pre-exist in the unbound structures in 16. The root-mean-square deviation of the atoms lining the pockets between the bound and unbound states is as small as 0.9 Å, suggesting that such pockets constitute features of the populated native state that may be used in docking.

Introduction

Protein–protein interactions are critical for practically all biological functions, including signal transduction, metabolism, vesicle transport and mitogenic processes. To comprehend the mechanism of biological processes, the function of the proteins must be considered within the context of other interacting proteins. Considerable efforts have centered on studies of the principles governing protein–protein interactions, including interface residue contacts, morphology, hydrophobic patches, conservation, residue propensities and secondary structures.1, 2, 3, 4, 5 Identification of binding sites is critical for establishing protein–protein interaction networks, cellular pathways and regulation. In addition, it can assist in drug design and quaternary structure prediction.

Residues at protein–protein binding sites are more conserved than on the rest of the protein surface.4, 6, 7, 8, 9, 10, 11, 12 Aided by a conservation analysis of interfacial residues, predictions of protein binding sites have recently achieved significant success.6, 12, 13 On the basis of the work of Tsai et al. in 1996,¹⁴ Keskin et al. have constructed a substantially enlarged dataset of non-redundant protein–protein interfaces.¹⁵ Unlike other sequence-non-redundant datasets, this dataset is derived through structural comparisons of interfaces, independent of both the sequence order at the interface and the folds of the two parent protein chains.⁹ Following a selection of 3799 non-redundant interface clusters,¹⁵ we have obtained 67 clusters with 343 members and over 3600 structurally conserved residues, allowing a statistical analysis. This database is useful, since it may be explored for important features in protein–protein interactions that may be independent of sequence order or backbone folds.

In this study, we examine the geometrical features of the structurally conserved residues in the form of pockets and voids, known to abundantly populate protein interfaces.16, 17 First, we study unfilled pockets (Figure 1). These are pockets that remain after protein–protein association. They represent packing defects at the protein interface. Second, we study complemented pockets (Figure 1). These pockets are present when the two proteins are separated, but disappear following association. They represent binding regions on interfaces that have a non-trivial geometric shape and fit tightly. Details on pocket identification are described in Materials and Methods.
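The two pocket definitions above can be expressed as a small decision rule. The following is only an illustrative sketch: the record type `PocketObservation` and the function `classify_pocket` are hypothetical names introduced here, and the sketch assumes that some pocket-detection step (not shown, and not the authors' actual procedure) has already reported whether a given pocket is detected on the separated chains and in the assembled complex.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PocketObservation:
    """Hypothetical record of whether one pocket is detected in each state."""
    present_when_separated: bool  # detected with the two chains taken apart
    present_in_complex: bool      # still detected in the assembled complex


def classify_pocket(obs: PocketObservation) -> str:
    """Label a pocket according to the two definitions given in the text."""
    if obs.present_in_complex:
        # Still empty after association: a packing defect at the interface.
        return "unfilled"
    if obs.present_when_separated:
        # Present on the separated chain but gone after association:
        # the partner protein fills it, so the region is tightly fit.
        return "complemented"
    # Not detected in either state, so there is nothing to classify.
    return "none"


if __name__ == "__main__":
    print(classify_pocket(PocketObservation(True, False)))  # complemented
    print(classify_pocket(PocketObservation(True, True)))   # unfilled
```

The rule checks the complex first because, under the definitions above, a pocket that survives association counts as unfilled regardless of what is seen on the separated chains.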

Alanine scanning of interfaces has shown that a few key residues can contribute dominantly to the binding free energy of protein–protein complexes.18, 19, 20 A residue is defined as a hot spot if there is a significant binding free energy change (ΔΔG≥2 kcal/mol) when mutated to alanine (1 cal=4.184 J). The hot spots collected by Bogan & Thorn21, 22 have been shown to overlap remarkably well with structurally conserved residues.9, 10 Thus, we study the relationship between interfacial pockets and experimental energy hot spots. In silico, efficient energy functions have been developed for the prediction of the experimentally measured free energy change by alanine substitution.23, 24 However, no general principles explain what makes an interfacial residue a hot spot. With remarkable foresight, Bogan & Thorn proposed that some hot spots are largely surrounded by hydrophobic O-rings.²¹ Nevertheless, prediction of hot spots has remained a difficult task.18, 25 Hydrophobicity, shape, charge and interfacial residue type have been shown to inadequately explain or predict the energy hot spots.5, 23 To address this question, we investigate the geometrical features of the hot spots, particularly focusing on whether they are located in tightly fit interface indentations.
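As a minimal sketch of the threshold-based definition above, the snippet below labels a residue from its alanine-scanning ΔΔG and converts kcal/mol to kJ/mol using the stated factor (1 cal = 4.184 J). The 2 kcal/mol cutoff comes from this paragraph and the 4 kcal/mol "red-hot" cutoff from elsewhere in the text; the function and constant names are hypothetical and chosen only for illustration.

```python
CAL_TO_J = 4.184          # 1 cal = 4.184 J, as stated in the text
HOT_SPOT_KCAL = 2.0       # hot spot: ddG >= 2 kcal/mol on mutation to alanine
RED_HOT_KCAL = 4.0        # red-hot: ddG >= 4 kcal/mol


def kcal_to_kj(value_kcal_per_mol: float) -> float:
    """Convert an energy from kcal/mol to kJ/mol."""
    return value_kcal_per_mol * CAL_TO_J


def classify_residue(ddg_kcal_per_mol: float) -> str:
    """Label a residue by its binding free energy change on alanine mutation."""
    if ddg_kcal_per_mol >= RED_HOT_KCAL:
        # Red-hot residues are a subset of the hot spots.
        return "red-hot"
    if ddg_kcal_per_mol >= HOT_SPOT_KCAL:
        return "hot spot"
    return "not a hot spot"


if __name__ == "__main__":
    for ddg in (0.5, 2.0, 4.3):
        print(ddg, "kcal/mol:", classify_residue(ddg))
```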

Our analysis indicates that whereas most interfaces have packing defects in the form of unfilled pockets, they also have tight-fitting regions characterized by complemented pockets. Importantly, these regions are enriched in structurally conserved residues. For the cases where both proteins in the complex were alanine-scanned,²² the complemented pockets and protruding residues identify 62% of all known hot spots, and 60% of the residues in complemented pockets are hot spots. We further examine the red-hot residues (ΔΔG≥4 kcal/mol). We find that 93% (13/14) of the red-hot residues are protruding or complemented pocket residues. Our results point toward the crucial role of local tight packing in non-trivial geometrical shapes in protein–protein interactions. Evolution has optimized tight fitting through a complemented pocket hot region organization. These concave indentations may provide sites for drug discovery.
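The two percentages quoted above measure different things: the share of hot spots that fall in complemented pockets, and the share of pocket residues that are hot spots. The sketch below shows how such figures could be computed from two residue sets. The function names are hypothetical, and the residue identifiers in the usage example are invented toy data, not values from the study; only the 13/14 ratio is taken from the text.

```python
from typing import Set, Tuple


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    if whole == 0:
        raise ValueError("whole must be non-zero")
    return 100.0 * part / whole


def overlap_statistics(hot_spots: Set[str],
                       pocket_residues: Set[str]) -> Tuple[float, float]:
    """Return (% of hot spots found in pockets, % of pocket residues that are hot spots)."""
    shared = hot_spots & pocket_residues
    return (percent(len(shared), len(hot_spots)),
            percent(len(shared), len(pocket_residues)))


if __name__ == "__main__":
    # Toy identifiers for illustration only (chain:residue number).
    hot = {"A:10", "A:12", "B:7", "B:40"}
    pocket = {"A:10", "B:7", "B:9"}
    print(overlap_statistics(hot, pocket))

    # The red-hot figure quoted in the text: 13 of 14 residues.
    print(round(percent(13, 14)))
```

The 13 of 14 ratio evaluates to about 92.9%, which rounds to the 93% reported.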

We further analyze a set of 31 protein–protein complexes compiled by Chen et al.,²⁶ which have been crystallized in the unbound state: 18 of these contain complemented pockets. In 16 of these 18, these pockets pre-exist in the unbound state, with a low root-mean-square deviation (RMSD) between the atoms lining the corresponding pockets. This observation supports the suggestion that pre-formed pockets may be a highly populated native state feature. Combined, our results point toward the mechanistic role of the hot spots in protein–protein binding and suggest a possible scheme for identification of a hot spot in an unbound protein structure. As such, they may have applications in protein docking experiments.
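The RMSD comparison described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes that the pocket-lining atoms of the bound and unbound structures have already been superimposed and are listed in the same order, and the function name `rmsd` and the coordinates in the usage example are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(bound: Sequence[Coord], unbound: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two matched, pre-superimposed atom lists."""
    if len(bound) != len(unbound) or len(bound) == 0:
        raise ValueError("atom lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (bx, by, bz), (ux, uy, uz) in zip(bound, unbound):
        # Squared Euclidean distance between corresponding atoms.
        squared_sum += (bx - ux) ** 2 + (by - uy) ** 2 + (bz - uz) ** 2
    return math.sqrt(squared_sum / len(bound))


if __name__ == "__main__":
    # Invented coordinates in angstroms, for illustration only.
    bound_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
    unbound_atoms = [(0.1, 0.0, 0.0), (1.5, 0.2, 0.0), (1.4, 1.5, 0.3)]
    print(f"{rmsd(bound_atoms, unbound_atoms):.2f} A")
```

A low value for the pocket-lining atoms indicates that the pocket has nearly the same shape before and after binding, which is the sense in which the text calls such pockets pre-existing.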

Imperfect fit: unfilled pockets on protein–protein interfaces

Shape complementarity between interacting partners is a fundamental aspect of protein–protein interactions. Shape complementarity alone was used by Connolly in 1986²⁷ to address the docking problem. Achieving a good match of shapes is the focus of many protein docking methods.28, 29, 30, 31 To understand how well protein chains pack together and how shape matching can aid in docking, we analyze the distribution of unfilled pockets. Our result is consistent with the report by Hubbard & Argos,¹⁶

Discussion and Conclusions

Protein–protein interfaces are “porous” and are rarely packed perfectly.¹⁶ Unfilled and complemented pockets are often scattered across the interfaces. Most large protein–protein complexes have a significant number of unfilled pockets. The size of unfilled pockets in the interfaces is correlated with the interface size. Experimentally, it is unclear whether unfilled pockets contain water molecules or how the dynamics of water molecules entering and escaping these pockets may affect binding

Dataset and definition of protein–protein interfaces

The dataset of structurally conserved residues is taken from Keskin et al.¹⁵ These entries consist of 21,686 two-chain interfaces. An iterative pairwise structural comparison14, 48, 49, 50 and a heuristic clustering procedure were employed to cluster these interfaces. Following six cycles, 3799 clusters were obtained. To further eliminate redundancy, if two sequences in a cluster share a sequence similarity greater than 50%, one of them is deleted from the cluster. This yields a dataset with

Acknowledgements

We thank Dr L. Admin, J. F. Zhang, A. Binkowski, R. Jackups, P. Freeman, and J. Tseng for helpful discussions. We thank Drs C.-J. Tsai, Y. Pan, K. Gunasekaran, D. Zanuy, H.-H. (G.) Tsai and members of the Nussinov-Wolfson group, in particular Maxim Shatsky for help with MultiProt, and Inbal Halperin for the hot spot collection. We thank Dr Jacob V. Maizel for encouragement. We thank Dr A. Gursoy and S. Aytuna for their helpful discussions. This work is supported by grants from National Science