The traditional view: Linker histones and chromatin fiber condensation

Nucleosomes and nucleosomal arrays

Chromosomal DNA is compacted by, and made accessible through, hierarchical levels of ordered chromatin condensation and decondensation. Chromatin is a dynamic nucleoprotein structure formed from histone proteins, DNA, and numerous chromatin-associated proteins. Nucleosomes, the fundamental building blocks of chromatin, are made up of approximately 147 base pairs (bp) of DNA and an octamer of core histone proteins 1. The histone octamer consists of two molecules each of histones H2A, H2B, H3, and H4 2, and is stable only when wrapped by nucleosomal DNA, or under high-salt solution conditions in vitro 3, 4. Approximately 1.65 superhelical turns of nucleosomal DNA are wrapped around the histone octamer to form the nucleosome, resulting in the first level of DNA condensation. The nucleosome is stabilized by extensive charge-dipole interactions between the main chains of the histones and DNA phosphates, and by hydrogen bonding between the many histone arginine residues inserted into the minor grooves of the DNA 5. The canonical alpha-helical histone-fold motifs of the core histones (Figure 1A) bind nucleosomal DNA and make up the structured core of the nucleosome, while the N-terminal “tail” domains (NTDs) pass outside the gyres of the DNA and extend beyond the nucleosome core structure 6. The NTDs are between 14 and 38 amino acids in length, highly basic (enriched in lysine and arginine residues), and largely devoid of regular secondary structure 7. The NTDs and the histone-fold domains 8 are sites of numerous, combinatorial post-translational modifications that influence the accessibility of the nucleosomes to chromatin-associated proteins, transcription factors, and other regulatory proteins, and regulate chromatin condensation (for reviews, see 9, 10, 11). A polymer of nucleosomes assembled on a single DNA molecule is known as a nucleosomal array. The nucleosomes in a nucleosomal array are connected by core histone-free, extra-nucleosomal DNA termed linker DNA.
Linker DNA is distinguished from nucleosomal DNA in that it is not constrained in the superhelix by the core histones, and it is much more susceptible to nuclease digestion 12. As will be discussed below, linker DNA possesses unique morphology in folded nucleosomal arrays and linker histone-bound chromatin fibers.

Figure 1

The linker histone, particularly the long CTD, has the potential to mediate multiple, simultaneous interactions when bound to the nucleosome. Ribbon diagrams of the 3D structures of Xenopus laevis core histone H2B (A) and Gallus gallus linker histone H1 (B). Models for histones H2B and H1 were derived from PDB entries 1AOI 1 and the first model from entry 1GHC 121, respectively. The histone tails were constructed in Coot 122 by making virtual concatamers of the H2B NTD residues visible in the X-ray structure. Images were made with the program VMD 123. Every effort was made to ensure the C- and N-terminal extensions were created to scale, approximately 3.5 Å per residue for an extended peptide devoid of secondary structure.

Structural dynamics of nucleosomal arrays

Nucleosomal arrays, either extracted from living cells or reconstituted from purified components, undergo a hierarchical series of salt-dependent condensation transitions 13, 14, 15, 16. The primary chromatin structure corresponds to an extended 'beads on a string' conformation, and is observed under very low salt concentrations (and in the absence of linker histone) 17. When imaged under these conditions, the individual nucleosomes do not come in close contact, and the linker DNA is nearly maximally extended, with inter-nucleosome distances of 150-200 Å (this length correlates to 30-50 bp of exposed linker DNA) 18, 19. The addition of salts (e.g., 1-2 mM MgCl2 or 100-200 mM NaCl 20) causes short-range, intra-array nucleosome-nucleosome interactions, resulting in folding of the array into secondary chromatin structures. The endpoint of salt-dependent folding of arrays is widely thought to be the canonical 30-nm fiber described in early physical studies of fragmented chromatin and also observed more recently using model systems 12, 14, 15, 21, 22, 23 (for reviews, see 24, 25, 26, 27). The nature of the structure of the 30-nm fiber itself has been studied and debated at length, yielding two basic models (for reviews, see 21, 25, 26). A two-start helix, consisting of a zig-zag arrangement of stacked nucleosomes, with 5-6 nucleosomes per 11-nm helical rise, was proposed based on biochemical 23 and crystallographic data obtained with a tetranucleosome array having a very short nucleosome repeat length (167 bp) in the absence of linker histone 28. This was reinforced by EM of cross-linked arrays and computer modeling 29. In this study, both simulated and formaldehyde cross-linked EM structures were used to determine internucleosome connectivities. 
The dominant connectivities in both simulated and cross-linked arrays were found to be between nucleosomes N and N ± 2, though, importantly, the addition of magnesium ions raised the proportion of nearest neighbor (N ± 1) and N ± 3 contacts, indicative of (linker) DNA bending to accommodate greater compaction. These data were thus found to be consistent with the two-start, zig-zag, twisted ribbon model.

Alternatively, EM studies of long polynucleosome arrays of varying repeat length, in the presence and absence of linker histone and salt, have provided convincing evidence for the one-start, interdigitated solenoid structure, with 11 nucleosomes per 11-nm rise 30, 31. This structure requires 6 consecutive nucleosomes containing linker histone in order to complete one helical turn, and thus, the structure is stabilized through N ± 6 connections. These findings agree with early work on endogenous chromatin fragments purified from chicken erythrocyte nuclei, in which neutron scattering supported the notion that linker histone is essential for forming the 30-nm fiber 32. A more recent study used magnetic tweezers to pull on long nucleosome arrays with a 197-bp repeat length, with and without linker histone. Based on the slopes of the stretching curves, the arrays respond in a manner consistent with Hooke's law, suggesting a solenoidal rather than a zig-zag structure 33. Shorter repeat length arrays (167 bp, more similar to the studies from the Richmond lab 23, 28) are shorter and stiffer when folded, consistent with the two-start helix. Thus, nucleosome repeat length appears to be a predictor of the specific, perhaps local, higher-order structure of the 30-nm fiber.
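To make the Hooke's law argument concrete, the following minimal sketch fits a linear spring, F = k(x - x0), to a force-extension curve by ordinary least squares. It is an illustration only and is not the analysis used in the magnetic tweezers study 33. The function name `fit_hookean_spring`, the units, and all numerical values are assumptions chosen for the example; they are not measured data.

```python
from typing import Sequence, Tuple


def fit_hookean_spring(
    extension_nm: Sequence[float], force_pn: Sequence[float]
) -> Tuple[float, float]:
    """Least-squares fit of F = k * (x - x0).

    Returns (k, x0): the stiffness in pN/nm and the rest length in nm.
    """
    n = len(extension_nm)
    if n != len(force_pn) or n < 2:
        raise ValueError("need two or more paired (extension, force) points")
    mean_x = sum(extension_nm) / n
    mean_f = sum(force_pn) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in extension_nm)
    sxf = sum((x - mean_x) * (f - mean_f) for x, f in zip(extension_nm, force_pn))
    if sxx == 0 or sxf == 0:
        raise ValueError("cannot fit a spring to data with zero spread or zero slope")
    stiffness = sxf / sxx                      # slope of the stretching curve
    rest_length = mean_x - mean_f / stiffness  # extension at which F = 0
    return stiffness, rest_length


# Hypothetical, noise-free curve generated from k = 0.02 pN/nm and x0 = 10 nm.
extension = [20.0, 30.0, 40.0, 50.0, 60.0]
force = [0.02 * (x - 10.0) for x in extension]

k, x0 = fit_hookean_spring(extension, force)
print(f"stiffness = {k:.3f} pN/nm, rest length = {x0:.1f} nm")
```

In this picture, a stiffer folded array corresponds to a larger fitted slope k and a shorter array to a smaller rest length x0, which is the kind of comparison drawn above between the 197-bp and 167-bp repeat length arrays.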

At a slightly higher concentration of divalent salt than those that induce folding, nucleosomal arrays reversibly self-associate into tertiary chromatin structures. This structural transition was observed in the earliest studies of chromatin: more than 50 years ago, it was reported that isolated rat and chicken nuclei underwent a reversible and cooperative transition from a homogeneous to a granular state when the buffer included 5 mM MgCl2 34, 35, 36, 37. Nucleosomal array model systems and homogeneous, recombinant core histone octamers have been used to characterize the self-association transition extensively 20, 38, 39. The use of octamers in which the NTDs had been removed through recombinant DNA techniques 40 showed that the NTDs of H2A, H2B, H3 and H4 contribute additively and independently to nucleosome array oligomerization. Collectively, nucleosomal array folding and self-association are correlated with short-range folding and long-range fiber-fiber interactions, respectively, present in eukaryotic interphase chromosome fibers 14, 16.

Linker histones

The folded secondary structures formed by nucleosomal arrays at physiological ionic strength are intrinsically unstable and require other proteins to “lock” the arrays into stable higher-order secondary chromatin structures 14. These proteins have been termed chromatin architectural proteins 41. The most abundant chromatin architectural proteins in higher eukaryotes are the linker histones, a family of proteins structurally distinct from the core histones. Unlike the highly evolutionarily conserved core histones, which most likely evolved from an archaebacterial ancestor 42, the sequences of the linker histone variants are more highly variable across species (for a review, see 43) and are thought to have evolved from eubacteria 44. Linker histones are present at an average of nearly one molecule per nucleosome in living cells 45, 46, with fewer in species with shorter nucleosome repeat lengths. The stoichiometry is closer to one molecule per nucleosome in highly condensed heterochromatin, and somewhat lower in decondensed euchromatin 46. A nucleosome bound by linker histones is called a chromatosome, while a nucleosome array assembled with linker histones is called a chromatin fiber. For the purpose of simplicity, within this review we will focus on linker histone H1 and its sequence variants (for a review, see 47).

Linker histones have a tripartite structure unlike that of the core histones (Figure 1B), with unstructured N- (13-40 amino acids in length) and C-terminal (100 amino acids) domains flanking a well-folded 'globular domain' (GD) of 80 amino acids 48. No specific function has been observed for the NTD; thus, its nature remains largely enigmatic. The structure of the central, globular domain 49, 50, 51 contains at least two separate DNA-binding sites: the first involves a classical winged-helix motif and the second a cluster of conserved basic residues on the opposite face of that domain 51. These two DNA-binding sites allow the linker histone globular domain to bridge different DNA molecules and form tram-track structures 52, and explain the preferential binding of linker histone to DNA crossovers 53 and four-way junctions 54. Linker histones bind asymmetrically to the nucleosomes of chromatin fibers at the nucleosomal DNA entry and exit sites 55, 56, 57, 58 and increase the micrococcal nuclease protection of nucleosomes from 146 to 168 bp 59.

Linker histones and the structural dynamics of chromatin fibers

Early studies of native chromatin fibers revealed distinct differences in the fiber morphology when stripped of linker histone to form nucleosomal arrays 17, 60. Specifically, H1-containing chromatin fibers had a more regular helical appearance. In H1-stripped arrays, by contrast, the addition of salt caused “clumping” rather than formation of the defined fibers of regular diameter seen with H1. Thus, it was proposed that H1 must be bound near the entry-exit sites to alter the DNA paths and “induce” folding of the chromatin fiber into regular 30-nm diameter structures. As discussed above, many studies have established that nucleosomal arrays in salt equilibrate between extended and highly folded conformations. Three decades after the original research, we now know that the effects of linker histones are to (1) convert a heterogeneous population of folded nucleosomal arrays into stable 30-nm diameter structure(s), and (2) cause chromatin fibers to self-associate at much lower salt concentrations than nucleosomal arrays. These and related results have been interpreted to mean that linker histones stabilize the intrinsic condensed structures formed by nucleosomal arrays 13, 20, 61. Similar conclusions have been derived from recent single-molecule magnetic tweezer studies 33. Of note, H1-dependent formation of stable 30-nm structures is cooperative 30 and dependent on the linker histone being bound to 5-7 contiguous nucleosomes 32. In addition to the stabilizing function, there seems to be little doubt that linker histones promote increased compaction and regularity of the 30-nm structures 30. Thus, linker histones both stabilize the folded fiber and induce specific structural features such as the stem motif described below.

The mechanism through which the linker histones influence chromatin folding has been investigated using both conventional and cryo-electron microscopy to image linker histone-containing trinucleosome fragments from chicken erythrocytes 62. These studies resulted in particularly high-resolution images of a 3D zig-zag conformation showing that linker histones decrease the entry-exit angle of DNA from the nucleosome. In addition, the entering and exiting DNA strands were seen to emerge tangentially from a single origin close to the nucleosome when the linker histone was present. This close approach of the entry-exit DNA is known as a stem or apposed stem motif, and has been studied in great detail by the Woodcock lab and others 62, 63, 64. The three possible orientations of the entry-exit DNA are such that the DNA linking number changes to -2 (the strands cross), -1 (the strands come to parallel, then diverge), or 0 (the strands cross one another and cross the nucleosome). In order to bring the entry-exit strands this close, sufficient screening of the DNA negative charge is likely provided by the large number of positively charged amino acids in the linker histone CTD (and possibly the H3 NTD 65).

The H1 CTD has been directly implicated in chromatin folding: a proteolytically truncated H1 lacking the CTD was not capable of stabilizing 30-nm diameter secondary chromatin structures when bound to fibers 66, 67. The amino acid composition (40% Lys and Arg, < 10% hydrophobic residues) of the CTD is ideal for allowing ionic interactions of a largely extended domain with the linker DNA. In addition, alternating Lys and Ala residues interspersed with Pro create a uniform charge distribution 68 predisposed to form proline-kinked α-helical elements 69. Two recent papers investigating the role of the H1 CTD in chromatin condensation have provided surprising insight into how this domain functions. In the first paper, the folding and self-association dynamics of chromatin fibers bound to a series of mouse H1-0 CTD truncation mutants were determined. The results indicated that there are two discontinuous “sub-domains” within the CTD that mediate H1 function in stabilizing the condensed chromatin 70. These sub-domains are defined by residues 97-121 (which directly abut the globular domain) and 145-169. Residues 122-144 and 170-196 could be deleted without affecting the function. In the second paper, the role of the unique CTD amino-acid composition 71 was determined by studying the H1 isoforms and performing sub-domain swapping and directed mutagenesis experiments 72. Results showed that the H1 isoforms, which vary significantly in their primary sequence but not in their amino-acid composition, all functioned identically in chromatin condensation assays in vitro. Randomization of amino acids 97-121, while maintaining the unique H1 CTD amino-acid composition, had no effect on condensation transitions, ruling out primary sequence as a determinant of function. Surprisingly, sub-domain “swap” mutants, in which residues 122-144 or 170-196 replaced residues 97-121, functioned as well as wild-type protein in mediating chromatin condensation pathways.
These results indicate that the molecular determinants of H1 CTD function in chromatin condensation are its unique amino-acid composition and the specific location of sub-domains of the CTD relative to the globular domain, not its primary sequence.
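The distinction between amino-acid composition and primary sequence can be illustrated with a short sketch that shuffles a region of a peptide in place, which changes the order of residues while leaving their counts untouched. This is an assumption-laden illustration and not the mutagenesis design used in ref. 72. `TOY_PEPTIDE` is a made-up Lys/Ala/Pro-rich string, not a real H1 sequence, and the helper names `basic_fraction` and `shuffle_region` are hypothetical.

```python
import random
from collections import Counter

# Hypothetical 30-residue peptide, invented for illustration (not an H1 sequence).
TOY_PEPTIDE = "KAAKPKAAKSPKKAKAAKPKKAAKSPAKAK"


def basic_fraction(seq: str) -> float:
    """Fraction of residues that are Lys (K) or Arg (R)."""
    return sum(seq.count(residue) for residue in "KR") / len(seq)


def shuffle_region(seq: str, start: int, end: int, seed: int = 0) -> str:
    """Randomize the order of residues start..end (1-based, inclusive).

    Residues outside the region are untouched, so the overall amino-acid
    composition is preserved while the primary sequence may change.
    """
    if not 1 <= start <= end <= len(seq):
        raise ValueError("region must lie within the sequence")
    region = list(seq[start - 1:end])
    random.Random(seed).shuffle(region)  # seeded for reproducibility
    return seq[:start - 1] + "".join(region) + seq[end:]


shuffled = shuffle_region(TOY_PEPTIDE, start=5, end=20, seed=1)

print(TOY_PEPTIDE)
print(shuffled)
print("same composition:", Counter(shuffled) == Counter(TOY_PEPTIDE))
print("basic fraction:", round(basic_fraction(shuffled), 2))
```

Because shuffling only permutes residues, any property that depends on composition alone, such as the fraction of basic residues, is identical before and after, whereas sequence-specific features are not guaranteed to survive.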

The determinants of H1 CTD function have been linked to intrinsic protein disorder 72. Intrinsically disordered protein domains lack native structure but often undergo a disorder-to-order transition concomitant with binding to nucleic acids or other proteins 71, 73, 74. There is substantial evidence that the H1 CTD is intrinsically disordered. The CTD and peptides derived from the CTD are random coils in aqueous solution 75, 76. Trifluoroethanol increases the α-helical content of the CTD, as does sodium perchlorate, indicating that the CTD has an inherent propensity to form α-helical structures. Importantly, a peptide corresponding to residues 99-121 of the H1 CTD becomes α-helical upon binding to DNA 77, 78. This peptide closely corresponds to the functional domain that abuts the H1 globular domain and was shown to be important for chromatin condensation (see above, and 70). Most recently, the full-length H1 CTD was studied in a variety of solution conditions in the presence and absence of DNA, and was shown to adopt a full complement of secondary structures 79. Taken together, these data suggest that the CTD mediates stabilization of condensed chromatin fibers by assuming α-helical and/or β-strand structure(s) upon binding to linker DNA. As we will see below, intrinsic disorder also appears to play a role in some CTD-mediated protein-protein interactions.

The emerging view: Linker histones and protein-protein interactions

Linker histones bind many different nuclear and cytosolic proteins

The previous sections describe the traditional, well-established functions of H1 as a chromatin architectural protein. However, increasing evidence has accumulated indicating that linker histones also act by interacting with many different non-histone nuclear and cytosolic proteins. Specifically, at least 16 examples of linker histone-protein interactions can be found scattered throughout the literature (summarized in Table 1). Taken together, the data in Table 1 clearly indicate that H1 functions in part by participating in protein-protein interactions. It is therefore appropriate to ask: how many more H1-interacting proteins remain to be identified? Two groups have recently utilized distinct approaches to examine H1-protein interactions in vivo. Fang-Lin Sun and co-workers 80 used antibodies against the N- and C-terminal regions of H1 to co-IP nuclear proteins from Drosophila melanogaster Kc cells. The H1-binding proteins they identified include H2B and H3, 40S and 60S ribosome components, and two proteins (hnRNP48 and hnRNP36) involved in pre-mRNA processing and cytoplasmic export 81, 82. However, they identified only “the most prominent bands present between 15 and 50 kDa in the gels”, and thus many other interesting H1-binding proteins may have been present but not accounted for.

Table 1 Specific H1-protein interactions

The work from the lab of Woojin An 83 provides more direct insight into the breadth of interactions in which H1 is involved. Using a cell line stably expressing a tandem-tagged H1.2 construct (Flag-HA-H1.2), a series of separation steps (P-11 phosphocellulose and M2 agarose chromatography, glycerol gradient sedimentation) and mass spectrometry were used to identify the binding partners. A stained SDS-PAGE gel demonstrates the presence of nearly 20 prominent bands and many other less abundant stained bands, strongly suggesting that the number of H1-specific binding partners is much larger than indicated in Table 1. In terms of specific proteins, a protein kinase (DNA-PK) and phosphatase (PP1), Poly-ADP-Ribose Polymerase 1 (PARP1) 84, cell-survival transcription factor YB1 85, and the DNA/RNA binding protein PURα 86 were identified.

Mechanisms of linker histone-protein interactions

As with chromatin condensation, it is of interest to know which linker histone domain(s) mediate protein-protein interactions and the mechanism(s) through which they act. The most thoroughly described interaction at the molecular level involves the apoptotic nuclease DNA Fragmentation Factor, DFF40 (for reviews, see 87, 88). DFF40 exists in the nucleus in complex with its chaperone and inhibitor DFF45 89 [caspase-activated DNase (CAD) and inhibitor of CAD in mice]. DFF45 is cleaved at two caspase-3 sites late in the apoptotic pathway, releasing DFF40 from the DFF40/DFF45 dimer and allowing DFF40 to form enzymatically active homo-oligomers 90, 91, 92. Free DFF40 is highly specific and active in cleaving the linker DNA between nucleosomes, in the process releasing small chromatin fragments. Consistent with this preference for linker DNA, DFF40-dependent DNA cleavage was soon found to be greatly stimulated by linker histones 89, 91, 93.

A detailed biochemical analysis subsequently investigated the mechanism of DFF40-linker histone interactions 94. The first unexpected observation was that the H1 CTD mediated the protein-protein interaction. Using the same H1-0 mutants as Lu and Hansen 70, it was shown that progressive deletion of the 72 most C-terminal residues of the H1 CTD led to progressive loss of DFF40-dependent DNA cleavage. Of note, no further effect was observed when the 24-residue region immediately abutting the globular domain was deleted. Thus, the regions of the CTD needed to mediate chromatin condensation and DFF40 interactions appear to be distinct. The CTD also functioned independently of the rest of the protein; a number of different 48-residue peptides derived from the H1 CTD led to activation levels similar to those of full-length H1. Interestingly, the amino-acid composition of all the activating peptides was very similar. Consistent with this observation, six of the mouse somatic H1 isoforms, which differ significantly in their primary sequence but not amino-acid composition, activated DFF40 equally well. Thus, amino-acid composition appears to be an important determinant of H1-DFF40 interactions in addition to H1-chromatin interactions. These studies further showed that DFF40-H1 interactions enhanced DNA binding by DFF40, and that incubation of DFF40 with the isolated H1 CTD or any of the 48-residue peptides significantly enhanced DNA binding, similar to the effect on DNA cleavage. The binding of H1 to the barrier to autointegration factor (BAF) protein is also dependent on the H1 CTD 95, although mechanistic studies of the BAF-H1 interaction were not performed. Taken together, these results support the conclusion that the H1 CTD is capable of mediating interactions with both DNA and specific proteins, thus implicating the CTD as an important determinant of linker histone multifunctionality.

Unfortunately, the majority of studies of H1-dependent protein-protein interactions described have not distinguished whether the GD, NTD, or CTD is responsible for the interaction. The H1 CTD can form an amphipathic (basic residues on one face, hydrophobic residues on the opposing face) α-helix 78, and thus contains a hydrophobic surface, possibly utilized for protein binding. Further, it has been shown that the H1 NTD (residues 11-23, which abut the GD) can form a non-amphipathic α-helix 96. As no chromatin-condensing function has been ascribed to the NTD, perhaps this is also a site of H1-protein interactions, as suggested for Msx-1 97 and HP1 98, 99.

Prothymosin α (ProTα) co-IPs with H1, is able to extract a fraction of H1 from reconstituted chromatin, and gel shifts both the full-length H1 and the globular domain 100. ProTα is small (12 kDa), highly acidic, and predicted to be largely unstructured. The interaction with the GD is intriguing, as is the observation that ProTα interferes with H1-chromatin binding, since H1 binding to nucleosomes occurs in part through the GD. However, none of the assays looked for contacts with the linker histone CTD or NTD, and in light of the biochemical nature of ProTα, it seems possible that the basic, unstructured H1 CTD may be able to interact with the highly acidic ProTα. Similar hypotheses can be made for parathymosin, which is biochemically nearly identical to ProTα.

While we can now say with some certainty that H1 functions in part through protein-protein interactions, much more work will be necessary to define the extent to which the three linker histone domains function as protein-protein interfaces. However, in view of what little is known, it seems likely that all three domains are able to serve in this role depending on the specific protein-protein interaction in question.

Linker histone post-translational modifications and protein-protein interactions

The H1 isoforms are phosphoproteins, as numerous serine and threonine residues in the N- and C-terminal domains are phosphorylated in response to various cellular stimuli 101, 102, 103, 104, 105. More recent proteomic approaches have uncovered additional and specific combinations of modifications besides phosphorylation, such as ubiquitylation 106, lysine acetylation 107, and lysine methylation (for excellent reviews, see 101, 108). One intriguing example involving protein-protein interactions comes from the Schneider lab, who determined that the chromodomain of the protein HP1 specifically recognizes methylated lysine 26 (K26Me) of H1 isoform 1.4 99. Of note, phosphorylation of the adjacent serine 27 (S27Phos) blocked binding of HP1, creating an on-off switch for HP1 binding. Interestingly, a more recent study from the Reinberg lab showed that the mono- or di-methylation of H1 lysine 26 (K26Me/Me) leads to binding of a different chromatin-condensing protein (L3MBTL1) at Rb-regulated genes 27, again establishing a link between post-translational modifications of the H1 NTD and H1-mediated protein-protein interactions. Such specific, post-translational modification-controlled recruitment of H1 by a heterochromatin-specific protein such as HP1 acting through H1 N-terminal residues indicates a function other than nucleosomal DNA binding for the linker histone NTD.

Concluding remarks

It is clear from both the large number of specific H1-dependent protein-protein interactions, and the cellular processes with which these H1-interacting proteins are involved, that the linker histones are much more than just nucleosome-binding proteins that stabilize higher-order chromatin structures. Rather, it appears that H1 is also a multifunctional recruitment hub for a number of overlapping (and often opposing) processes centered on the genomic DNA. It is interesting that many of the interactions are with proteins thought to be primarily localized to the cytosol. In the future, more effort needs to be focused on determining the full breadth of H1-mediated protein-protein interactions, as well as on mapping the domains responsible and identifying the mechanisms through which they act. This will help to fill the many gaps in our current understanding of the molecular basis for the multifunctional nature of the linker histone family.