Constructing ensembles for intrinsically disordered proteins
Highlights
► We critically discuss methods for modeling intrinsically disordered proteins.
► Both the advantages and limitations of existing methods are analyzed.
► We further outline the major challenges to modeling these proteins and discuss new methods for the validation of these approaches.
Introduction
Thermal fluctuations cause proteins to sample a variety of conformations during their biological lifetime, and the probability of each conformation is determined by the topography of the underlying energy landscape. Folded proteins exhibit energy landscapes that have a well-defined global energy minimum (Figure 1a). By contrast, intrinsically disordered proteins (IDPs) are a class of polypeptides with relatively flat energy landscapes (Figure 1b); consequently, these proteins sample a relatively large and diverse set of conformations at room temperature [1, 2]. A great deal of interest in understanding the structure of IDPs has emerged because of their proposed role in neurodegenerative disorders such as Parkinson's and Alzheimer's diseases [3, 4, 5, 6, 7, 8, 9, 10, 11]. A detailed characterization of these systems could therefore pave the way to the development of new therapeutics through structure-based drug design [12, 13].
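As a rough illustration of how landscape topography sets conformational probabilities, the sketch below assumes canonical (Boltzmann) statistics and compares two made-up sets of conformer energies. The energy values, the function names, and the use of the exponential of the Shannon entropy as an "effective number of states" are illustrative assumptions, not quantities taken from the article.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def boltzmann_populations(energies, temperature=298.0):
    """Return normalized Boltzmann probabilities for energies in kcal/mol."""
    beta = 1.0 / (K_B * temperature)
    e_min = min(energies)  # shift energies for numerical stability
    factors = [math.exp(-beta * (e - e_min)) for e in energies]
    partition = sum(factors)
    return [f / partition for f in factors]


def effective_states(populations):
    """Exponential of the Shannon entropy: roughly how many states are populated."""
    entropy = -sum(p * math.log(p) for p in populations if p > 0.0)
    return math.exp(entropy)


# Hypothetical landscapes (assumed values, for illustration only):
# one deep minimum versus a nearly flat surface.
folded_like = [0.0, 4.0, 4.5, 5.0, 5.5]
idp_like = [0.0, 0.2, 0.3, 0.1, 0.4]

p_folded = boltzmann_populations(folded_like)
p_idp = boltzmann_populations(idp_like)

print("folded-like effective states:", round(effective_states(p_folded), 2))
print("IDP-like effective states:", round(effective_states(p_idp), 2))
```

With a single deep minimum nearly all of the population sits in one conformer, whereas on the flat landscape the population is spread over most of the available conformers.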
The earliest attempts at modeling disordered protein states were aimed at describing folded proteins under denaturing conditions [14, 15, 16, 17]. Denatured proteins and IDPs share the characteristic that experimental observables correspond to averages over a diverse ensemble of conformations. Therefore, the typical approach to constructing an ensemble, both for folded proteins under denaturing conditions and for IDPs, is to generate a set of conformations whose ensemble averages agree with experimental values. Formulated in this way, the approach is straightforward: generate a diverse set of conformations, then find a subset of structures and their relative stabilities (or weights) that agree with experiment. In other cases, the ensemble is constructed using purely theoretical methods and the predicted data are compared to experiment [18, 19]. While important insights have been obtained using this latter approach, using experimental data to guide the construction of the ensemble helps to limit the space of possible solutions.
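The weight-fitting step can be sketched in a few lines. The example below is a minimal illustration, not any published protocol: it assumes each conformer already has predicted values for every observable, measures agreement with a plain sum of squared deviations, and uses an exponentiated-gradient update (chosen here only because it keeps the weights non-negative and normalized). All names and numbers are hypothetical.

```python
import math


def ensemble_average(weights, predictions):
    """Weighted average of each observable over all conformers."""
    n_obs = len(predictions[0])
    return [sum(w * p[k] for w, p in zip(weights, predictions))
            for k in range(n_obs)]


def chi_square(weights, predictions, observed):
    """Sum of squared deviations between ensemble averages and data."""
    avg = ensemble_average(weights, predictions)
    return sum((a - o) ** 2 for a, o in zip(avg, observed))


def fit_weights(predictions, observed, steps=5000, rate=0.5):
    """Minimize chi_square over the probability simplex (exponentiated gradient)."""
    n = len(predictions)
    weights = [1.0 / n] * n  # start from uniform weights
    for _ in range(steps):
        avg = ensemble_average(weights, predictions)
        residual = [a - o for a, o in zip(avg, observed)]
        # Gradient of chi_square with respect to each weight.
        grad = [2.0 * sum(r * p[k] for k, r in enumerate(residual))
                for p in predictions]
        # Multiplicative update followed by renormalization.
        weights = [w * math.exp(-rate * g) for w, g in zip(weights, grad)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights


# Hypothetical predicted observables: 4 conformers, 2 observables each.
predictions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.9]]
observed = [0.52, 0.49]  # assumed "experimental" averages

uniform = [0.25] * 4
fitted = fit_weights(predictions, observed)
print("chi-square, uniform weights:", chi_square(uniform, predictions, observed))
print("chi-square, fitted weights:", chi_square(fitted, predictions, observed))
```

In a real application the squared deviations would normally be scaled by the experimental and prediction uncertainties for each observable; that scaling is omitted here for brevity.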
In practice, constructing an ensemble from experimental data is challenging because the amount of data typically available pales in comparison to the number of parameters needed to uniquely define the ensemble. In other words, there are typically many different ensembles that agree with any given set of experimental data, so the optimization problem described above leads to degenerate solutions. In light of this, how does one reliably infer, from the available experimental data, a set of conformations and weights that capture the essential features of the energy landscape? In this article, we review recent advances in this area and discuss the advantages and limitations of various techniques.
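A toy calculation makes the degeneracy concrete. The sketch below assumes three conformers and a single scalar observable (the values are invented); three very different weight vectors reproduce the same ensemble average exactly, and so does any mixture of them.

```python
def ensemble_average(weights, values):
    """Weighted average of a scalar observable over conformers."""
    return sum(w * v for w, v in zip(weights, values))


def mix(weights_a, weights_b, t):
    """Convex combination of two weight vectors (0 <= t <= 1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(weights_a, weights_b)]


# Hypothetical per-conformer values of one observable (assumed numbers).
observable = [10.0, 20.0, 30.0]
target = 20.0  # assumed experimental ensemble average

candidates = {
    "single conformer": [0.0, 1.0, 0.0],
    "two extremes": [0.5, 0.0, 0.5],
    "uniform": [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0],
}

averages = {name: ensemble_average(w, observable)
            for name, w in candidates.items()}
for name, value in averages.items():
    print(name, "->", round(value, 6))

# Any mixture of two solutions is itself a solution.
blended = mix(candidates["single conformer"], candidates["two extremes"], 0.3)
print("blended ->", round(ensemble_average(blended, observable), 6))
```

One data point plus the normalization condition leaves a whole family of weight vectors undetermined, which is the situation described above in miniature.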
Sources of experimental data
To date, most of the experimental measurements that have been used to guide the construction of unfolded ensembles correspond to observables obtained via NMR spectroscopy. Examples of such measurements include chemical shifts, which provide information about local conformational preferences [20, 21, 22••], scalar couplings, which report on backbone dihedral angles [23•], residual dipolar couplings (RDCs), which report on the angle of a bond relative to an external frame of reference [8, 22••,
Validation of ensemble building methods
Before discussing specific algorithms for constructing ensembles, it is useful to introduce a technique that we will refer to as the reference ensemble method, which has become a standard tool for evaluating the performance of these methods [22••, 45•, 46, 47]. The reference ensemble method is illustrated in Figure 2. A reference ensemble is a predefined ‘truth,’ that is, a prespecified set of conformations and their statistical weights that can be used to calculate synthetic experimental
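A minimal sketch of this kind of validation follows, under stated assumptions: the reference weights and per-conformer predictions are invented, synthetic data are taken to be the reference ensemble averages without added noise, and the Jensen-Shannon divergence is used as one possible measure of how far a candidate's weights lie from the reference. None of these choices is taken from the article.

```python
import math


def ensemble_average(weights, predictions):
    """Weighted average of each observable over all conformers."""
    n_obs = len(predictions[0])
    return [sum(w * p[k] for w, p in zip(weights, predictions))
            for k in range(n_obs)]


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two weight vectors."""
    m = [(a + b) / 2.0 for a, b in zip(p, q)]

    def kl(a, b):
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0.0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def evaluate(candidate, reference, predictions):
    """Return (squared data error, weight divergence) for a candidate ensemble."""
    synthetic = ensemble_average(reference, predictions)  # synthetic "experiment"
    predicted = ensemble_average(candidate, predictions)
    data_error = sum((a - b) ** 2 for a, b in zip(predicted, synthetic))
    return data_error, js_divergence(reference, candidate)


# Hypothetical reference ensemble: 3 conformers, 2 observables each.
predictions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
reference = [0.25, 0.25, 0.5]
candidate = [0.5, 0.5, 0.0]

data_error, weight_error = evaluate(candidate, reference, predictions)
print("data error:", data_error)
print("weight divergence:", round(weight_error, 4))
```

Because the true weights are known by construction, a candidate that fits the synthetic data perfectly can still be scored on how well it recovers the underlying ensemble, which ordinary goodness-of-fit to real data cannot reveal.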
Ensemble-restrained MD simulations
Restrained MD simulations introduce a term into the potential function that biases the simulation towards regions of conformational space that agree with experimental observations. For an IDP, the restraints should be applied to an entire ensemble rather than an individual structure [45•]. This is accomplished by simulating multiple replicas of the protein in parallel and calculating the biasing potential based on averages taken over all of the replicas [45•, 48]. Ganguly and Chen [29••] used
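The difference between restraining an ensemble and restraining each structure can be illustrated with a toy energy function. The sketch below assumes a harmonic bias on a single scalar observable; the force constant, its scaling with the number of replicas, and the observable values are assumptions, and real implementations differ in these conventions.

```python
def ensemble_bias(values, target, k=1.0):
    """Harmonic bias on the replica-averaged observable.

    Returns the bias energy and its derivative with respect to each
    replica's observable value.
    """
    n = len(values)
    average = sum(values) / n
    deviation = average - target
    energy = 0.5 * k * deviation * deviation
    # Every replica feels the same derivative, scaled by 1/n.
    gradient = [k * deviation / n] * n
    return energy, gradient


def per_structure_bias(values, target, k=1.0):
    """Harmonic bias applied to every replica individually, for comparison."""
    return sum(0.5 * k * (v - target) ** 2 for v in values)


# Hypothetical observable values in four replicas (assumed numbers).
replica_values = [2.0, 4.0, 6.0, 8.0]
experimental_value = 5.0

energy, gradient = ensemble_bias(replica_values, experimental_value)
print("ensemble-averaged bias:", energy)
print("per-structure bias:", per_structure_bias(replica_values, experimental_value))
```

Here the replicas differ widely yet their average matches the target, so the ensemble-averaged bias vanishes, while a per-structure restraint would penalize every replica and push them all toward the same value.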
Ensemble construction using a predefined conformational library
Another method for constructing ensembles for IDPs is to first generate a library of conformations and then to select a subset of conformations from this library such that averages calculated from this subset agree with the experimental data. The initial conformational library may be generated with MD, perhaps using techniques to enhance conformational sampling (see [50] for a review) such as replica exchange [51], accelerated MD [46] or quenched MD [21], by piecing together small peptide
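The selection step can be sketched as follows. This is an illustration under simplifying assumptions: the library is tiny, each selected conformer is weighted equally, agreement is a plain sum of squared deviations, and the search is exhaustive. Realistic libraries are far too large for exhaustive enumeration and call for heuristic searches; the library values and function names are hypothetical.

```python
from itertools import combinations


def subset_error(subset, library, observed):
    """Squared deviation between the unweighted subset average and the data."""
    n_obs = len(observed)
    average = [sum(library[i][k] for i in subset) / len(subset)
               for k in range(n_obs)]
    return sum((a - o) ** 2 for a, o in zip(average, observed))


def best_subset(library, observed, size):
    """Exhaustively search all subsets of a given size (small libraries only)."""
    return min(combinations(range(len(library)), size),
               key=lambda subset: subset_error(subset, library, observed))


# Hypothetical library: 6 conformers, 2 predicted observables each.
library = [
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.5, 0.2],
    [0.9, 0.9],
]
observed = [0.5, 0.4]  # assumed experimental averages

chosen = best_subset(library, observed, size=2)
print("selected conformers:", chosen)
print("error:", subset_error(chosen, library, observed))
```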
Degeneracy and model construction
Degeneracy of the ensembles with respect to the experimental measurements is one problem that plagues the construction of IDP ensembles. At its core, the problem of degeneracy arises because in practice the number of experimental constraints is small relative to the number of degrees of freedom that are needed to uniquely specify the ensemble. Fisher et al. [22••] used the reference ensemble method to show that one can often find many sets of statistical weights (for a prespecified set of
Conclusions and future directions
Any comprehensive description of an IDP necessitates the construction of an ensemble (a finite collection of conformations and weights) that captures the essence of the conformational distribution of the protein. A variety of different approaches have been developed for constructing ensembles for IDPs, each of which has its own advantages and limitations. In the past few years, a number of advances have been made in our ability to model the conformational ensembles of IDPs. Many of these advances,
References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest
References (61)
- et al. Mutations of tau protein in frontotemporal dementia promote aggregation of paired helical filaments by enhancing local beta-structure. J Biol Chem (2001)
- Intrinsically disordered proteins are potential drug targets. Curr Opin Chem Biol (2010)
- et al. Characterization of long-range structure in the denatured state of staphylococcal nuclease. II. Distance restraints from paramagnetic relaxation and calculation of an ensemble of structures. J Mol Biol (1997)
- et al. Characterization of long-range structure in the denatured state of staphylococcal nuclease. I. Paramagnetic relaxation enhancement by nitroxide spin labels. J Mol Biol (1997)
- et al. Calculation of ensembles of structures representing the unfolded state of an SH3 domain. J Mol Biol (2001)
- et al. Quantitative description of backbone conformational sampling of unfolded proteins at amino acid resolution from NMR residual dipolar couplings. J Am Chem Soc (2009)
- et al. Defining long-range order and local disorder in native alpha-synuclein using residual dipolar couplings. J Am Chem Soc (2005)
- et al. Fluctuations in protein structure from X-ray diffraction. Annu Rev Biophys Bioeng (1984)
- et al. Multiple conformations of full-length p53 detected with single-molecule fluorescence resonance energy transfer. Proc Natl Acad Sci U S A (2009)
- et al. Protein backbone chemical shifts predicted from searching a database for torsion angle and sequence homology. J Biomol NMR (2007)
- Fast and accurate predictions of protein NMR chemical shifts from interatomic distances. J Am Chem Soc
- NMR characterization of long-range order in intrinsically disordered proteins. J Am Chem Soc
- Constructing atomic-resolution RNA structural ensembles using MD and motionally decoupled NMR RDCs. Methods
- Constructing RNA dynamical ensembles by combining MD and motionally decoupled NMR RDCs: new insights into RNA dynamics and adaptive ligand recognition. Nucleic Acids Res
- Residual structure within the disordered C-terminal segment of p21(Waf1/Cip1/Sdi1) and its implications for molecular recognition. Protein Sci
- Inferential structure determination. Science
- Similarity measures for protein ensembles. PLoS ONE
- BioMagResBank database with sets of experimental NMR constraints corresponding to the structures of over 1400 biomolecules deposited in the Protein Data Bank. J Biomol NMR
- DisProt: the database of disordered proteins. Nucleic Acids Res
- Finding order within disorder: elucidating the structure of proteins associated with neurodegenerative disease. Future Med Chem
- The unfoldomics decade: an update on intrinsically disordered proteins. BMC Genom
- Assembly of Tau protein into Alzheimer paired helical filaments depends on local sequence motif (306 VQIVYK 311) forming beta-structure. Proc Natl Acad Sci U S A
- Structure, microtubule interactions, and paired helical filament aggregation by tau mutants of frontotemporal dementias. Biochemistry
- Conformational changes specific for pseudophosphorylation at serine 262 selectively impair binding of tau to microtubules. Biochemistry
- Global hairpin folding of tau in solution. Biochemistry
- Structural polymorphism of 441-residue tau at single residue resolution. PLoS Biol
- Highly populated turn conformations in natively unfolded tau protein identified from residual dipolar couplings and molecular simulation. J Am Chem Soc
- Domain conformation of tau protein studied by solution small-angle X-ray scattering. Biochemistry
- Aggregation analysis of the microtubule binding domain in tau protein by spectroscopic methods. J Biochem
- Ensemble docking of multiple protein structures: considering protein structural variations in molecular docking. Proteins