Function and structure of inherently disordered proteins

https://doi.org/10.1016/j.sbi.2008.10.002

The application of bioinformatics methodologies to proteins inherently lacking 3D structure has brought increased attention to these macromolecules. Here topics concerning these proteins are discussed, including their prediction from amino acid sequence, their enrichment in eukaryotes compared to prokaryotes, their more rapid evolution compared to structured proteins, their organization into specific groups, their structural preferences, their half-lives in cells, their contributions to signaling diversity (via high contents of multiple-partner binding sites, post-translational modifications, and alternative splicing), their distinct functional repertoire compared to that of structured proteins, and their involvement in diseases.

Introduction

Many entire proteins and localized protein regions fail to fold into a 3D structure, yet carry out function. Rather than a linear sequence-to-structure-to-function paradigm, such proteins have been described by a trinity in which function arises from different forms (structured globules, collapsed disordered ensembles, and extended disordered ensembles) and from transitions between different forms such as a disorder-to-structure transition upon binding [1]. The collapsed disordered ensembles were originally thought to be exclusively native molten globules (MGs), with collapse driven by hydrophobic interactions. Recent studies, however, agree with earlier work suggesting that water is a poor solvent for the peptide backbone; thus, polar but uncharged model sequences form compact random coils [2••, 3••], while extended random coils result when polypeptide chains contain significant net charge (Rohit Pappu, unpublished). Water being a poor solvent for polypeptides was also invoked to explain the occurrence of a fourth protein form, the pre-MG, that occurs as an intermediate between the MG and the random coil during protein unfolding [4]. Much more work is needed to understand and relate the various nonstructured protein ensembles and to determine whether the relationship between structure and function should be assembled into a trinity, a quartet, or an even more complicated arrangement.

Here we provide an overview of these proteins, including their structures, functions, and regulation. In all these aspects, the set of non-folding proteins and regions is found to differ greatly from the set of proteins that fold into globular 3D structures.


Prediction of non-folding proteins and regions

Since the amino acid sequence contains the information for protein folding, it was reasoned that, for proteins that do not fold into 3D structures, the amino acid sequence should also specify protein non-folding. To test this hypothesis, predictors were developed to identify sequences that fail to fold [5, 6]. The fact that predictor accuracy was significantly better than expected by chance suggested that the information for failure to fold into a 3D structure is, indeed, likely to be inherent
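The idea that sequence alone can signal non-folding can be illustrated with a very simple composition-based score. The sketch below is not the predictors of [5, 6]. It is a minimal whole-sequence charge-hydropathy calculation, assuming the Kyte-Doolittle hydropathy scale rescaled to 0-1 and the commonly quoted boundary 2.785·⟨H⟩ − |⟨R⟩| − 1.151. The function names `charge_hydropathy` and `fold_score` are hypothetical, histidine is treated as uncharged, and no sliding window is used.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def charge_hydropathy(sequence):
    """Return (mean hydropathy on a 0-1 scale, absolute mean net charge)."""
    # Ignore any character that is not a standard amino acid letter.
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    n = len(residues)
    # Rescale hydropathy from [-4.5, 4.5] to [0, 1] before averaging.
    mean_h = sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in residues) / n
    # Assumption: K and R count +1, D and E count -1, all others 0.
    positive = sum(1 for aa in residues if aa in "KR")
    negative = sum(1 for aa in residues if aa in "DE")
    mean_r = abs(positive - negative) / n
    return mean_h, mean_r


def fold_score(sequence):
    """Positive values suggest a foldable composition, negative values
    a composition typical of extended disorder (whole-sequence estimate)."""
    mean_h, mean_r = charge_hydropathy(sequence)
    return 2.785 * mean_h - mean_r - 1.151
```

A highly charged, low-hydropathy sequence gives a negative score and a hydrophobic one a positive score. Real predictors use windowed features and trained classifiers, so this sketch only shows why composition carries a usable signal.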

Frequency of disordered regions

Disorder predictions have been carried out for many whole proteomes. They indicate that the fraction of proteins with substantial amounts of disorder is much higher in eukaryotes than in archaea or eubacteria, with multicellular eukaryotes having much more predicted disorder than unicellular eukaryotes [11]. These results were confirmed and substantially extended to include functional classification using an improved predictor of disorder [12]. Integrating the results from these and other sources gives some rules

Protein evolution

Non-folding proteins and regions might be expected to change more rapidly during evolution than structured proteins because buried amino acids are highly constrained while disordered regions are not constrained by structure. For example, plots of sequence variability (measured by sequence entropy over alignments) were found to exhibit nearly linear dependence on the inverse of the packing density, until a low packing density was reached at which point sequence variability remained roughly
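Sequence variability measured as "sequence entropy over alignments" is usually taken to mean the Shannon entropy of each alignment column. The sketch below shows one way to compute it, assuming an alignment given as equal-length strings, gaps written as `-` and excluded from the counts, and entropy in bits. The function names are hypothetical, and the cited study may have used a different normalization or gap treatment.

```python
import math
from collections import Counter


def column_entropy(column, gap="-"):
    """Shannon entropy (bits) of one alignment column, ignoring gaps."""
    residues = [c for c in column if c != gap]
    if not residues:
        return 0.0  # an all-gap column carries no variability information
    n = len(residues)
    counts = Counter(residues)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())


def alignment_entropy(alignment):
    """Per-column entropy for a list of aligned, equal-length sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*alignment) walks the alignment column by column.
    return [column_entropy(column) for column in zip(*alignment)]
```

A fully conserved column scores zero, and a column with four equally frequent residues scores two bits. Plotting such per-column values against inverse packing density is the kind of comparison the paragraph above describes.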

Partitioning unstructured proteins and regions into groups

Grouping proteins according to structure and function has proven very useful for studying structured proteins. Associating a new protein with an existing structure–function group (by sequence and/or structure alignment) provides important basic information and quickly identifies critical experiments for further characterization. Given the broad array of disordered protein types, their lack of 3D structure, and their sequence variability, it has so far proven difficult to cluster various

Do inherently unstructured proteins retain any preference for certain structures or are they totally unstructured?

One of the key open questions regarding inherently unstructured proteins is whether, in solution, they retain some preferred structure(s), or are just a plethora of many different conformations, rather like ‘cooked spaghetti’. A recent careful study of residual structure in disordered peptides and unfolded proteins was carried out via multivariate analysis and ab initio simulation of Raman optical activity [21]. This study showed striking differences between the structural characteristics of

Do non-folding proteins have a shorter half-life than other proteins?

Targeted turnover of proteins is a key element in the regulation of many cellular processes. The underlying physicochemical and/or sequential signals are not, however, fully understood. This is particularly pertinent in light of recent recognition that intrinsically unstructured/disordered proteins, common in eukaryotic cells, are extremely susceptible to proteolytic degradation in vitro. An in vivo high-throughput study of the half-lives of all yeast gene products [26] indicated that, in

Functionality of inherently disordered proteins and regions

Non-folding proteins and regions carry out pivotal biological functions, participating in various signaling and regulatory pathways, via specific protein–protein, protein–nucleic acid, and protein–ligand interactions [29, 30, 31, 32]. Enzymatically controlled sites of post-translational modification (PTM) such as acetylation, hydroxylation, ubiquitination, methylation, and phosphorylation, as well as sites of proteolytic attack, are frequently associated with regions of intrinsic disorder [29].

Involvement of inherently disordered proteins in diseases

The fact that many proteins are either wholly intrinsically disordered, or contain large stretches of intrinsically disordered sequences, has been followed by a growing realization that nonstructured proteins are associated with a broad range of human diseases, which led to the introduction of the D2 (disorder in disorders) concept [52]. Diseases involving protein disorder come in a variety of flavors, but we here restrict ourselves to discussing recent work concerning the amyloid diseases, in

Conclusions

The sequence-to-structure-to-function paradigm for proteins was developed from the study of enzymes. Bioinformatics studies indicate that this paradigm applies to enzymes, as well as to transport proteins.

In contrast, proteins and regions of proteins involved in signaling, control, and regulation often use inherently unstructured sequences as the basis for function. There are many structured signaling domains, but these often bind to unstructured protein partners. Moreover, there are numerous

References and recommended reading

Papers of particular interest, published within the period of review, have been highlighted as:

  • • of special interest

  • •• of outstanding interest

Acknowledgements

This work was supported in part by the grants R01 LM007688-01A1 (to AKD and VNU) and GM071714-01A2 (to AKD and VNU) from the National Institutes of Health and from the Program of the Russian Academy of Sciences ‘Molecular and Cellular Biology’ (to VNU), by the Divadol Foundation, the Benoziyo Center for Neuroscience, the Kimmelman Center, Autism Speaks, the Israel Science Foundation, the Nalvyco Foundation, the Neuman Foundation, a research grant from Mr. Erwin Pearl, the European Commission

References (71)

  • H.T. Tran et al.

    Role of backbone-solvent interactions in determining conformational equilibria of intrinsically disordered proteins

    J Am Chem Soc

    (2008)
  • V.N. Uversky

    Protein folding revisited. A polypeptide chain at the folding–misfolding–nonfolding cross-roads: which way to go?

    Cell Mol Life Sci

    (2003)
  • P. Romero et al.

    Identifying disordered regions in proteins from amino acid sequence

  • V.N. Uversky et al.

    Why are “natively unfolded” proteins unstructured under physiologic conditions?

    Proteins

    (2000)
  • P. Radivojac et al.

    Intrinsic disorder and functional proteomics

    Biophys J

    (2007)
  • T. Ishida et al.

    Prediction of disordered regions in proteins based on the meta approach

    Bioinformatics

    (2008)
  • L. Bordoli et al.

    Assessment of disorder predictions in CASP7

    Proteins

    (2007)
  • A.K. Dunker et al.

    Intrinsic protein disorder in complete genomes

    Genome Inform Ser Workshop Genome Inform

    (2000)
  • J.J. Ward et al.

    Prediction and functional analysis of native disorder in proteins from the three kingdoms of life

    J Mol Biol

    (2004)
  • R.L. Jernigan et al.

    Packing regularities in biological structures relate to their dynamics

    Methods Mol Biol

    (2007)
  • Y.S. Lin et al.

    Proportion of solvent-exposed amino acids in a protein and rate of protein evolution

    Mol Biol Evol

    (2007)
  • G.W. Daughdrill et al.

    Dynamic behavior of an intrinsically unstructured linker domain is conserved in the face of negligible amino acid sequence conservation

    J Mol Evol

    (2007)
  • G. Fiorin et al.

    Unwinding the helical linker of calcium-loaded calmodulin: a molecular dynamics study

    Proteins

    (2005)
  • A. Nagy et al.

    Hierarchical extensibility in the PEVK domain of skeletal-muscle titin

    Biophys J

    (2005)
  • H.J. Dyson et al.

    Intrinsically unstructured proteins and their functions

    Nat Rev Mol Cell Biol

    (2005)
  • S. Vucetic et al.

    Flavors of protein disorder

    Proteins

    (2003)
  • F. Zhu et al.

    Residual structure in disordered peptides and unfolded proteins from multivariate analysis and ab initio simulation of Raman optical activity data

    Proteins

    (2008)
  • A. Paz et al.

    Biophysical characterization of the unstructured cytoplasmic domain of the human neuronal adhesion protein Neuroligin 3

    Biophys J

    (2008)
  • J. Prilusky et al.

    FoldIndex©: a simple tool to predict whether a given protein sequence is intrinsically unfolded

    Bioinformatics

    (2005)
  • S.J. Whittington et al.

    Urea promotes polyproline II helix formation: implications for protein denatured states

    Biochemistry

    (2005)
  • A. Belle et al.

    Quantification of protein half-lives in the budding yeast proteome

    Proc Natl Acad Sci U S A

    (2006)
  • P. Tompa et al.

    Structural disorder serves as a weak signal for intracellular protein degradation

    Proteins

    (2008)
  • H. Xie et al.

    Functional anthology of intrinsic disorder. 1. Biological processes and functions of proteins with long disordered regions

    J Proteome Res

    (2007)
  • S. Vucetic et al.

    Functional anthology of intrinsic disorder. 2. Cellular components, domains, technical terms, developmental processes, and coding sequence diversities correlated with long disordered regions

    J Proteome Res

    (2007)
  • H. Xie et al.

    Functional anthology of intrinsic disorder. 3. Ligands, post-translational modifications, and diseases associated with intrinsically disordered proteins

    J Proteome Res

    (2007)