Principles for designing ideal protein structures

Abstract

Unlike random heteropolymers, natural proteins fold into unique ordered structures. Understanding how these are encoded in amino-acid sequences is complicated by energetically unfavourable non-ideal features—for example kinked α-helices, bulged β-strands, strained loops and buried polar groups—that arise in proteins from evolutionary selection for biological function or from neutral drift. Here we describe an approach to designing ideal protein structures stabilized by completely consistent local and non-local interactions. The approach is based on a set of rules relating secondary structure patterns to protein tertiary motifs, which make possible the design of funnel-shaped protein folding energy landscapes leading into the target folded state. Guided by these rules, we designed sequences predicted to fold into ideal protein structures consisting of α-helices, β-strands and minimal loops. Designs for five different topologies were found to be monomeric and very stable and to adopt structures in solution nearly identical to the computational models. These results illuminate how the folding funnels of natural proteins arise and provide the foundation for engineering a new generation of functional proteins free from natural evolution.

Figure 1: Fundamental rules.
Figure 2: Derivation of secondary structure lengths from the rules for five protein topologies.
Figure 3: Characterization of design for each of the five folds.
Figure 4: Comparison of computational models with experimentally determined structures.

Accession codes

Data deposits

The NMR structures of the five designs have been deposited in the RCSB Protein Data Bank under the accession numbers 2KL8 (Di-I_5), 2LV8 (Di-II_10), 2LN3 (Di-III_14), 2LVB (Di-IV_5) and 2LTA (Di-V_7). NMR data have been deposited in the Biological Magnetic Resonance Data Bank under the accession numbers 16387 (Di-I_5), 18558 (Di-II_10), 18145 (Di-III_14), 18561 (Di-IV_5) and 18465 (Di-V_7).

Acknowledgements

We thank N. Grishin for suggesting target folds for design, P. Rajagopal for one-dimensional NMR measurements of Folds-I and -II, and J. Siegel for mass spectrometry measurements. We also thank P.-S. Huang and Y.-E. A. Ban for computational tools; J. L. Gallaher for experimental assistance; J. Castellanos for help with designing Fold-IV; H.-W. Lee, K. Pederson and J. Prestegard for measurements of residual dipolar couplings; and S. Khare, F. DiMaio, I. Andre, S. Fleishman, J. Mills, S. Takada, S. Fuchigami and G. Chikenji for comments on the manuscript. This work was supported by HHMI, DOE, DARPA, DTRA and the National Institute of General Medical Sciences Protein Structure Initiative (PSI:Biology) programme, grant U54 GM094597. N.K. was also supported by a Japan Society for the Promotion of Science (JSPS) Postdoctoral Fellowship for Research Abroad.

Author information

Contributions

N.K., R.T.-K., G.L., G.T.M. and D.B. designed the research. N.K. performed folding simulations and analysed natural proteins. N.K. wrote program code. N.K. and R.T.-K. performed computational design work: Di-I_5 and Di-IV_5 were designed by N.K., and Di-II_10, Di-III_14 and Di-V_7 were designed by R.T.-K. R.T.-K. expressed, purified and characterized the designed proteins by biochemical assay. R.X. and T.B.A. prepared isotope-enriched protein samples for NMR structure determination. G.L. collected NMR data and determined the solution NMR structures. N.K., R.T.-K., G.L., G.T.M. and D.B. wrote the manuscript.

Corresponding authors

Correspondence to Gaetano T. Montelione or David Baker.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Information

This file contains Supplementary Figures 1-14, Supplementary Tables 1-8, Supplementary Discussions 1-2, Supplementary Methods 1-5 and Supplementary references (see contents for further details). (PDF 4484 kb)

Supplementary Data

This file contains Supplementary Data for Rosetta command lines to perform the design protocol. (ZIP 6 kb)

About this article

Cite this article

Koga, N., Tatsumi-Koga, R., Liu, G. et al. Principles for designing ideal protein structures. Nature 491, 222–227 (2012). https://doi.org/10.1038/nature11600

  • DOI: https://doi.org/10.1038/nature11600
