Chapter 3 - Natural History of the Eukaryotic Chromatin Protein Methylation System

https://doi.org/10.1016/B978-0-12-387685-0.00004-4

In eukaryotes, methylation of nucleosomal histones and other nuclear proteins is a central aspect of chromatin structure and dynamics. The past 15 years have seen an enormous advance in our understanding of the biochemistry of these modifications, and of their role in establishing the epigenetic code. We provide a synthetic overview, from an evolutionary perspective, of the main players in the eukaryotic chromatin protein methylation system, with an emphasis on catalytic domains. Several components of the eukaryotic protein methylation system had their origins in bacteria. In particular, the Rossmann fold protein methylases (PRMTs and DOT1), and the LSD1 and jumonji-related demethylases and oxidases, appear to have emerged in the context of bacterial peptide methylation and hydroxylation systems. These systems were originally involved in synthesis of peptide secondary metabolites, such as antibiotics, toxins, and siderophores. The peptidylarginine deiminases appear to have been acquired by animals from bacterial enzymes that modify cell-surface proteins. SET domain methylases, which display the β-clip fold, apparently first emerged in prokaryotes from the SAF superfamily of carbohydrate-binding domains. However, even in bacteria, a subset of the SET domains might have evolved a chromatin-related role in conjunction with a BAF60a/b-like SWIB domain protein and topoisomerases. By the time of the last eukaryotic common ancestor, multiple SET and PRMT methylases were already in place and are likely to have mediated methylation at the H3K4, H3K9, H3K36, and H4K20 positions, and carried out both asymmetric and symmetric arginine dimethylation. Inference of H3K27 methylation in the ancestral eukaryote appears uncertain, though it was certainly in place a little later in eukaryotic evolution. Current data suggest that unlike SET methylases, which are universally present in eukaryotes, demethylases are not. 
They appear to be absent in the earliest-branching eukaryotic lineages, and emerged later along with several other chromatin proteins, such as the Dot1 methylase, prior to divergence of the kinetoplastid-heterolobosean lineage from the remaining eukaryotes. This period also corresponds to the point of origin of DNA cytosine methylation by DNMT1. Origin of major lineages of SET domains such as the Trithorax, Su(var)3-9, Ash1, SMYD, TTLL12, and E(Z) might have played the initial role in the establishment of multiple distinct heterochromatic and euchromatic states that are likely to have been present, in some form, through much of eukaryotic evolution. Elaboration of these chromatin states might have gone hand-in-hand with acquisition of multiple jumonji-related and LSD1-like demethylases, and functional linkages with the DNA methylation and RNAi systems. Throughout eukaryotic evolution, there were several lineage-specific expansions of SET domain proteins, which might be related to a special transcription regulation process in trypanosomes, acquisition of new meiotic recombination hotspots in animals, and methylation and associated modifications of the diatom silaffin proteins involved in silica biomineralization. The use of specific domains to “read” the methylation marks appears to have been present in the ancestral eukaryote itself. Of these, the chromo-like domains appear to have been acquired from bacterial secreted proteins that might have a role in binding cell-surface peptides or peptidoglycan. Domain architectures of the primary enzymes involved in the eukaryotic protein methylation system indicate key features relating to their interactions with each other and with other modifications in chromatin, such as acetylation. They also emphasize the profound functional distinction between the roles of demethylation and deacetylation in the regulation of chromatin dynamics.

Introduction

The three superkingdoms of life utilize very distinct strategies for packaging their genomic DNA. Most bacteria utilize members of the IHF/HU family as their primary DNA-packaging protein.1 In addition, certain bacteria, such as chlamydiae, have specialized DNA-packaging proteins of the HC1/HC2 family that function in establishing the condensed chromatin that is typical of certain stages of their life cycle.2, 3 Archaea show a surprising diversity of DNA-packaging proteins that include members of the Alba, MC1, Sac7/Cren7/Sso7, and histone fold families.4, 5 The histone fold proteins, which they share with eukaryotes, are primarily observed in only two of the great divisions of archaea, namely the euryarchaea and the thaumarchaea. The archaeal histones represent a packaging strategy that appears to have been the precursor of the eukaryotic system. Currently characterized archaeal nucleosomes comprise a single histone subunit or a pair of distinct subunits, assembling into a tetramer that wraps ~80 base pairs (bp) of DNA around it (comparable to the eukaryotic histone H3–H4 tetrasomes).4 The origin of the eukaryotes was accompanied by a dramatic development of this ancestral histone template. First, there was proliferation followed by divergence resulting in four distinct histones (H2A, H2B, H3, and H4) that are conserved throughout eukarya.4 Second, these histones assembled into an octamer, as opposed to the archaeal tetrasome, and wrapped nearly twice as much DNA (~146 bp).4 Third, the eukaryotic histones acquired extensions to the N-terminus and/or C-terminus of the globular DNA-binding histone fold that are enriched in positively charged residues.6 These extensions are known as the histone tails and provide additional surfaces that neutralize the negative charges of the DNA backbone.

Emergence of the histone octamer-based packaging in eukaryotes was also accompanied by several other major structural innovations pertaining to chromosomal organization.5 In the common ancestor of all extant eukaryotes, a transition was made from the predominantly circular chromosomes of prokaryotes to multiple linear ones whose ends are capped by telomeres. Further, the chromosomes were separated from the rest of the cell by a membrane bilayer, resulting in the quintessential feature of the eukaryotes, the nucleus.7, 8, 9 Emergence of the nucleus decoupled cytoplasmic translation from nuclear transcription and marked a major departure from the prokaryotic situation. This appears to have relaxed the constraints on eukaryotic genes, allowing them to be colonized by introns, as mRNA was no longer translated during transcription.6 In turn, the emergence of introns favored the origin of a new set of large protein complexes: the spliceosomal complexes that associated with transcribed genes and acted on the intron-containing primary transcripts.10 Emergence of the nucleus also appears to have favored the emergence of a distinct subnuclear organelle, the nucleolus, where the ribosomal proteins could be combined with the freshly synthesized rRNAs to generate functional ribosomal subunits.11 Thus, the landscape of eukaryotic chromatin diverged considerably from that of the prokaryotes, with spliceosomal, rRNA processomal, and telomerase ribonucleoprotein complexes adding to the protein and nucleic acid mass of the chromosomes beyond just the genome and the histone octamers.

In terms of protein structure, the origin of the eukaryotes was characterized by an expansion of low-complexity sequences in proteins.6, 12 These form nonglobular segments of proteins that typically exist as disordered or unstructured random coils, and tend to be enriched in one or a few amino acids. In addition to histone tails, such low-complexity regions are also abundant in eukaryotic nuclear proteins such as transcription factors (TFs) and spliceosomal proteins (e.g., RGG and SR repeats), and might play roles in protein–protein interactions and low-specificity nucleic acid interactions.12 These low-complexity regions offered a niche for the diversification of a veritable ecosystem of enzymes in eukaryotes that catalyze addition of covalent modifications to the amino acid side chains or the N- and C-termini of polypeptides.13, 14, 15, 16, 17 There also arose a corresponding array of enzymes that catalyzed the removal of such covalent modifications, to restore the given peptide to its unmodified state. In addition to histone tails, the other targets of this flux of modifications were peptides from proteins that are either transient or long-term residents of chromosomes. These modifications span a dramatic range in terms of molecular weight and biochemical diversity.13, 14, 15, 16, 17 The simplest of these are low-molecular-weight adducts (methyl, phosphate, and acetyl groups). Somewhat higher molecular weight modifications include mono-ADP-ribosylation, biotinylation, and spermidinylation. The largest modifications involve addition of whole biopolymers such as branched or linear poly-ADP-ribose (up to 200 ADP-ribose units), peptides such as polyglutamate or polyglycine (up to 20 amino acids), and polypeptides of the ubiquitin family such as ubiquitin (Ub) and SUMO. In addition to these adducts, there are covalent modifications that directly modify the amino acid side chain. 
These include citrullination that results from the deimination of the guanidino group of arginine (releasing the ammonium ion) and ornithination that results from hydrolysis of the guanidino group (releasing urea).18, 19 Other direct modifications are hydroxylations of the side chains of proline, lysine, and asparagine, which generate the corresponding hydroxy amino acids.16, 20, 21

Among chromatin proteins, the ɛ-amino group of lysine is the most prominent target for modification, and receives adducts such as methyl, acetyl, biotinyl, and ubiquitin-like polypeptides.16 The target amino group can accept up to three methyl groups, resulting in distinct mono-, di-, and trimethyl forms of lysine. In contrast, the guanidino group of the other basic amino acid, arginine, is primarily the target for a single type of adduct, the methyl group. In this case, methylation can result in three distinct modifications, namely monomethylarginine and either asymmetric dimethylarginine, where both methyl groups are linked to a single nitrogen atom of the guanidino group, or symmetric dimethylarginine, with one methyl group on each of the two available nitrogens.19 The alcoholic amino acids serine and threonine are the primary targets of phosphorylation, but tyrosine is also similarly modified, predominantly in the animal lineage.15 Serine and threonine can also be glycosylated with N-acetylglucosamine, the significance of which is only beginning to be understood.22 The acidic side chain of glutamate is a target for several modifications such as mono- and poly-ADP-ribosylation, polyglutamylation, polyglycination, and potentially also methylation.13, 14 The amino termini of chromatin proteins are also often subject to processing followed by acetylation. These adducts, along with direct modifications of side chains (hydroxylation and citrullination), have profound consequences for the biochemical properties of histones and other chromatin proteins. The most prevalent modifications of histones are acetylation and methylation.15, 17 The former has been observed on at least 13 lysine side chains distributed across the four standard octameric histones. 
Methylation target sites are also distributed across the four core histones, with six of those being arginine and the remaining seven being lysine.15, 17, 23 These are followed by phosphorylation with at least six sites, ubiquitin-system modifications with at least five target sites and poly-ADP-ribosylation on a single site across the classic core histones or their variants like centromeric H3.15, 24, 25 Other than the core histones, the linker histone H1 is also subject to various modifications, such as methylation (e.g., at H1.4K26).26

In a direct sense, all of these modifications can affect both the surface electrostatics and the net size of the modified polypeptide, and sterically affect its interactions with nucleic acids and proteins. For example, the acetylation of lysines can reduce the net positive charge, phosphorylation and polyglutamylation can increase the net negative charge, and ubiquitination and poly-ADP-ribosylation can drastically alter the size of the polypeptide.13, 14, 15, 17 Additionally, many of these modifications carry epigenetic information, commonly termed “the histone code.” The introduction of these modifications by specific enzymes can be seen as a coding step, in which extragenetic information is “written” into the histones and transmitted through subsequent cell divisions.15, 17 Discrimination between modified and unmodified peptides by specific peptide-binding domains, which might then recruit other chromatin remodeling or modifying activities to chromatin, can be conceptualized as the “interpretation” of the epigenetic code.17, 27 Finally, the removal of these marks by other enzymes can be conceived as “resetting” of the epigenetic information and usually accompanies major differentiation events or transitions such as postzygotic development. These protein-based marks also functionally interact with both DNA modifications and the RNAi system to comprise the complete complex of epigenetic coding in eukaryotes.17 Over the past two decades, biochemical and biological studies have unleashed an avalanche of information regarding the structural, mechanistic, and organismal dimensions of these systems of epigenetic information. In particular, a combination of computational analysis of protein sequences and structures and experimental investigations has identified most of the major enzyme classes involved in the generation and erasure of epigenetic marks on proteins, as well as the domains that discriminate among them.

A key realization from the studies on chromatin protein modifications has been that, though most eukaryotes possess sizeable complements of proteins catalyzing the major modifications, they can all be unified into a relatively small set of protein superfamilies. Likewise, a relatively small set of structural scaffolds has been used repeatedly among the binding domains that discriminate modified from unmodified peptides in chromatin. Protein acetyltransferases can be unified as members of the GCN5-like acetyltransferase (GNAT) fold.28, 29 The deacetylases belong to two major folds, namely the HDAC-arginase-like fold that contains the prototypical histone deacetylase Rpd3, and the classical Rossmann fold, which includes the deacetylases of the Sir2 superfamily.30, 31, 32 Among kinases, most belong to the eukaryote-type protein kinase fold, though the recently characterized WSTF (which phosphorylates H2A.X on tyrosine 142) appears to define a novel structural scaffold for protein kinases.33 Ubiquitin- and SUMO-conjugating systems follow a three-enzyme cascade (E1, E2, E3), of which all histone-modifying E3s contain a treble-clef domain of the RING finger superfamily as their catalytic element.34, 35, 36 The deubiquitinating isopeptidases acting on histones contain a catalytic domain with either the papain-like fold or the metal-dependent JAB domain of the deaminase-like fold. Likewise, the catalytic domains of methylating and demethylating enzymes, which are the focus of this chapter, belong to a small set of ancient structural scaffolds.

The chromatin protein methylation system can be defined as comprising lysine and arginine methylases, the corresponding demethylases, and the arginine deiminases that regulate arginine methylation by its conversion to citrulline. Domains that discriminate methylated peptides from their unmethylated counterparts (i.e., readers of the epigenetic code established by the above enzymes) may also be considered as an immediate extension of this system. In contrast to the several surveys that discuss chromatin protein methylation from a functional angle with a focus on human or yeast models, we adopt an evolutionary perspective and exploit the genomic information that has become available across the eukaryotic tree. We present a structural overview of the main types of protein methylases, demethylases and deiminases followed by an evolutionary consideration of each of the catalytic domains. We then briefly survey the structural diversity of the peptide-binding domains involved in discrimination of methyl marks and their potential role in recruiting other activities to chromatin. Thereafter, we consider the major trends in the domain architectures of enzymes belonging to the methylation system and discuss the emergent syntactical features in the context of the functions of these proteins. Finally, we try to place the evolutionary history of protein methylation in the context of the other major mediators of epigenetic information, namely the DNA methylation and the RNAi systems.

Section snippets

The Categories of Protein Methylases and Their Role in Chromatin Protein Methylation

Protein methylases have evolved among two structurally unrelated folds. The first group of protein methylases belongs to the classical methyltransferase superfamily along with numerous other methylases, and possesses the Rossmann fold.37, 38, 39 The second group of currently known protein methylases, the SET domain superfamily, contains the β-clip fold.40 Among the classical Rossmann fold-type methylases are several distinct protein methylase families, and two of these methylate histones and

Enzymatic Mechanisms That Preempt or Reverse the Action of Protein Methylases in Chromatin

Protein methylation in chromatin proteins, both by Rossmann fold and SET domain methylases, is modulated by catalytic activities that either preempt or reverse the methyl marks.15 The primary preemptive mechanism that has been characterized is citrullination of histone arginines. Demethylation affects both methylated arginines and lysines, and represents an important regulatory mechanism.

Domains Involved in Discrimination of Methylated Peptides

The discrimination of the methylation states of modified peptides in chromatin proteins is mediated by a number of structurally diverse domains. A comprehensive discussion of the proteins containing these domains is beyond the scope of this work, as it would involve most major groups of chromatin proteins. Hence, in this chapter, we briefly discuss the major structural scaffolds involved in binding modified peptides and discriminating their methylation status. Currently, methylated peptide

Associations with DNA-Binding and Modified-Peptide-Recognition Domains

Analysis of domain architectures, and the network representation of the total set of architectures that are found among enzymes involved in the methylation system, reveal certain interesting patterns with considerable functional significance (Fig. 6). First, there is a striking difference in the architectures of the Rossmann fold methylases and deiminases on one side, and the SET domain methylases and JOR/JmjC and LSD1 demethylases on the other side (Fig. 6). The former show practically no

Evolutionary Considerations

Complementary evidence from comparative genomics, sequence analysis, and structural biology has uncovered several key aspects of the provenance and the history of the eukaryotic protein methylation system and its integration with other regulatory mechanisms such as DNA methylation and RNAi. The evidence from comparative genomics indicates that many key players in each of these mechanisms have emerged in the bacterial world, as a part of different systems that were under selection for

General Conclusions

The past 15 years have witnessed an extraordinary expansion of studies pertaining to chromatin protein methylation and its functional significance.17 In the face of the enormous literature that has accumulated in this field, it is difficult to discern key new directions that might help in filling major lacunae. However, one important direction of study would be to create comprehensive regulatory networks that link all methylated proteins to their corresponding methylating or demethylating enzymes

Acknowledgments

Work by the authors is supported by the intramural funds of the National Library of Medicine, National Institutes of Health, USA. We would like to acknowledge the numerous contributions of various researchers in the protein methylation and chromatin field, which we were regrettably unable to cite due to the sheer enormity of the literature under review.

References (267)

  • A.F. Neuwald et al. GCN5-related histone N-acetyltransferases belong to a diverse superfamily that includes the yeast SPT10 protein. Trends Biochem Sci (1997)
  • A.M. Burroughs et al. Evolutionary genomics of the HAD superfamily: understanding the structural adaptations and catalytic diversity in a superfamily of phosphoesterases and allied enzymes. J Mol Biol (2006)
  • A.M. Burroughs et al. Anatomy of the E2 ligase fold: implications for enzymology and evolution of ubiquitin/Ub-like protein conjugation. J Struct Biol (2008)
  • H.L. Schubert et al. Many paths to methyltransfer: a chronicle of convergence. Trends Biochem Sci (2003)
  • R.S. Lipson et al. Two novel methyltransferases acting upon eukaryotic elongation factor 1A in Saccharomyces cerevisiae. Arch Biochem Biophys (2010)
  • T.R. Porras-Yakushi et al. Yeast ribosomal/cytochrome c SET domain methyltransferase subfamily: identification of Rpl23ab methylation sites and recognition motifs. J Biol Chem (2007)
  • T.R. Porras-Yakushi et al. A novel SET domain methyltransferase modifies ribosomal protein Rpl23ab in yeast. J Biol Chem (2005)
  • M. Sadaie et al. A conserved SET domain methyltransferase, Set11, modifies ribosomal protein Rpl12 in fission yeast. J Biol Chem (2008)
  • H. Demirci et al. Multiple-site trimethylation of ribosomal protein L11 by the PrmA methyltransferase. Structure (2008)
  • W. An et al. Ordered cooperative functions of PRMT1, p300, and CARM1 in transcriptional activation by p53. Cell (2004)
  • S.A. Blythe et al. beta-Catenin primes organizer gene expression by recruiting a histone H3 arginine 8 methyltransferase, Prmt2. Dev Cell (2010)
  • S.L. Chen et al. The coactivator-associated arginine methyltransferase is necessary for muscle differentiation: CARM1 coactivates myocyte enhancer factor-2. J Biol Chem (2002)
  • B.D. Strahl et al. Methylation of histone H4 at arginine 3 occurs in vivo and is mediated by the nuclear receptor coactivator PRMT1. Curr Biol (2001)
  • T.B. Miranda et al. Protein arginine methyltransferase 6 specifically methylates the nonhistone chromatin protein HMGA1a. Biochem Biophys Res Commun (2005)
  • J.Y. Chan et al. Physical and functional interactions between hnRNP K and PRMT family proteins. FEBS Lett (2009)
  • C. Kim et al. Regulation of post-translational protein arginine methylation during HeLa cell cycle. Biochim Biophys Acta (2010)
  • J.C. Fisk et al. A type III protein arginine methyltransferase from the protozoan parasite Trypanosoma brucei. J Biol Chem (2009)
  • J.H. Lee et al. PRMT7, a new protein arginine methyltransferase that synthesizes symmetric dimethylarginine. J Biol Chem (2005)
  • A. Frankel et al. PRMT3 is a distinct member of the protein arginine N-methyltransferase family. Conferral of substrate specificity by a zinc-finger domain. J Biol Chem (2000)
  • J. Mellor. Linking the cell cycle to histone modifications: DOT1, G1/S, and cycling K79me2. Mol Cell (2009)
  • Y. Okada et al. hDOT1L links histone methylation to leukemogenesis. Cell (2005)
  • Q. Feng et al. Methylation of H3-lysine 79 is mediated by a new family of HMTases without a SET domain. Curr Biol (2002)
  • M. Dlakic. Chromatin silencing protein and pachytene checkpoint regulator DOT1p has a methyltransferase fold. Trends Biochem Sci (2001)
  • J.M. Schulze et al. Linking cell cycle to histone modifications: SBF and H2B monoubiquitination machinery and cell-cycle regulation of H3K79 dimethylation. Mol Cell (2009)
  • S. Oh et al. A lysine-rich region in DOT1p is crucial for direct interaction with H2B ubiquitylation and high level methylation of H3K79. Biochem Biophys Res Commun (2010)
  • N. Levesque et al. Loss of H3 K79 trimethylation leads to suppression of Rtt107-dependent DNA damage sensitivity through the translesion synthesis pathway. J Biol Chem (2010)
  • F. Boissier et al. Further insight into S-adenosylmethionine-dependent methyltransferases: structural characterization of Hma, an enzyme essential for the biosynthesis of oxygenated mycolic acids in Mycobacterium tuberculosis. J Biol Chem (2006)
  • B.M. Harvey et al. Insights into polyether biosynthesis from analysis of the nigericin biosynthetic gene cluster in Streptomyces sp. DSM4137. Chem Biol (2007)
  • E.J. Richards et al. Epigenetic codes for heterochromatin formation and silencing: rounding up the usual suspects. Cell (2002)
  • E.V. Koonin et al. The impact of comparative genomics on our understanding of evolution. Cell (2000)
  • T. Nakamura et al. ALL-1 is a histone methyltransferase that assembles a supercomplex of proteins involved in transcriptional regulation. Mol Cell (2002)
  • H. Wang et al. Purification and functional characterization of a histone H3-lysine 4-specific methyltransferase. Mol Cell (2001)
  • N. Sirinupong et al. Crystal structure of cardiac-specific histone methyltransferase SmyD1 reveals unusual active site architecture. J Biol Chem (2010)
  • C. Derunes et al. Characterization of the PR domain of RIZ1 histone methyltransferase. Biochem Biophys Res Commun (2005)
  • R. Margueron et al. Ezh1 and Ezh2 maintain repressive chromatin through different mechanisms. Mol Cell (2008)
  • L.B. Pedersen et al. Purification of recombinant Chlamydia trachomatis histone H1-like protein Hc2, and comparative functional analysis of Hc2 and Hc1. Mol Microbiol (1996)
  • E. Perara et al. A developmentally regulated chlamydial gene with apparent homology to eukaryotic histone H1. Proc Natl Acad Sci USA (1992)
  • J.N. Reeve et al. Archaeal histones: structures, stability and DNA binding. Biochem Soc Trans (2004)
  • L. Aravind et al. The two faces of Alba: the evolutionary connection between proteins participating in chromatin structure and RNA metabolism. Genome Biol (2003)
  • V. Anantharaman et al. Comparative genomics of protists: new insights into the evolution of eukaryotic signal transduction and gene regulation. Annu Rev Microbiol (2007)