Introduction

Enzymatic modifications of cytosine bases and histone proteins in the nucleosome core provide heritable epigenetic information in vertebrates that is not encoded in the nucleotide sequence of the cell. Chromatin replication during S phase of the cell cycle offers a window of opportunity for these enzymes and accessory factors to load onto the newly synthesized DNA and robustly propagate all the modification marks found in the parental cells. Failure to maintain correct epigenetic information leads to catastrophic consequences for the cell, including incorrect gene expression and apoptosis [1–3]. Importantly, cytosine methylation in mammalian cells is faithfully preserved between successive cell divisions. The preservation of DNA methylation during cell division is catalyzed by DNA (cytosine-5) methyltransferases (DNMTs). The major maintenance enzyme is DNA (cytosine-5) methyltransferase 1 (DNMT1), which has a strong preference for hemimethylated DNA in vitro [4]. Previous studies demonstrate that DNMT1 is stably associated with the newly replicated origin in mammalian cells [5–7]. Therefore, it is plausible that DNMT1 can methylate the newly synthesized daughter strands soon after replication by reading the methylation pattern of the parental strand [8]. Similarly, there is strong evidence supporting the heritability of histone modifications in multicellular organisms. The strongest evidence links histone H3K27 and H3K4 modifications, catalyzed by the Polycomb group (PcG) and trithorax group (trxG) proteins, to mitotic inheritance of lineage-specific gene expression patterns [9]. While some components of trxG and PcG possess histone methyltransferase activities for H3, other components interpret those histone marks. These two protein complexes have been shown to be critical regulators of numerous developmental genes. They silence or activate a gene by binding to specific regions of the gene and post-translationally modifying the histones. 
Recent work has also demonstrated that PcG-mediated gene silencing may involve non-coding RNA and the RNAi machinery [9]. Thus, the model of histone modification inheritance is more complex than that of DNA methylation, owing to replication-independent histone deposition on DNA [10]. Nevertheless, there is strong speculation that a large number of histone modifications may follow epigenetic inheritance mechanisms. Recent advances in sequencing technology, higher computational capacity and highly specific antibodies against histone modifications have resulted in greater understanding of epigenetic marks in the context of the mammalian genome. The flurry of research activity has resulted in several excellent publications [11–13]. In this review, we give readers a general overview of the epigenetic mechanisms in mammals by discussing histone and DNA modifications, along with the involvement of RNA, in both developmental and biological contexts.

DNA methylation and its implications in epigenetic regulation

In the mammalian genome, DNA methylation occurs by covalent modification of the fifth carbon (C5) of the cytosine base, and the majority of these modifications are present at CpG dinucleotides. However, in mouse embryonic stem cells, the genomic DNA contains methylated CpA, CpT and CpG sequences [14] instead of the exclusive CpG methylation that is predominantly found in somatic cells. Nevertheless, 5-methyl cytosine (Me5C) accounts for about 1% of total DNA bases and is therefore estimated to represent 70–80% of all CpG dinucleotides in the genome [15]. The CpG dinucleotides are distributed unevenly across the human genome and are concentrated in dense pockets called CpG islands (CGIs). The methylation pattern in any given cell is the outcome of independent but dynamic processes of methylation and demethylation. In the mammalian genome, methylation patterns in differentiated somatic cells are generally stable and heritable. However, reprogramming (demethylation/remethylation) of the methylation pattern takes place during two developmental stages, in germ cells and in preimplantation embryos. In contrast to the genome-wide demethylation that occurs in primordial germ cells, the genomes of mature sperm and eggs in mammals are highly methylated as compared to somatic cells [8, 16].

Although the majority of CpGs are methylated in the genome, CpG dinucleotides within CGI promoters are typically unmethylated during development and in normal (non-neoplastic/non-senescent) tissue types. The CGIs are genomic DNA regions with a high frequency of CpG dinucleotides. Typically, a CGI is a region of at least 200 bp with a GC content greater than 50% and an observed/expected CpG ratio greater than 0.6 [17]. Comprehensive analysis of CGIs in human chromosomes 21 and 22 by Takai and Jones [18] revealed that regions of DNA of greater than 500 bp with a G+C content equal to or greater than 55% and an observed/expected CpG ratio of 0.65 were more likely to be associated with the 5′ regions of genes. With this definition most of the Alu-repetitive elements were excluded. These islands overlap with promoter regions of 50–60% of human genes [19]. However, a subset of promoter CGIs are methylated in a tissue-specific manner during development, showing an exception to the general rule that CGI methylation in normal tissue is limited to X-inactivation and imprinted genes [8, 20]. This observation was supported further by recent genome-wide profiling of DNA methylation demonstrating that non-X-linked promoter CGIs are methylated in normal tissues and escape methylation in germ line cells [21]. Another study has estimated 6–8% of CGIs to be methylated in the genomic DNA of human brain, blood, muscle and spleen [22]. Interestingly, in the same study, CGIs displayed tissue-specific methylation of genes essential for development, suggesting a programmed mechanism of DNA methylation. Another means of DNA methylation propagation is methylation spreading, which follows the genome-wide demethylation that starts shortly after fertilization. Remethylation of most of the genome occurs after the blastocyst stage [23] and continues at a slower pace during the rest of the developmental period. 
Even though the phenomenon of spreading is not fully understood, it has been proposed to be a self-perpetuating interaction between chromatin-modifying proteins and DNA methylation [24]. Indeed, many of the chromatin modification enzymes responsible for gene silencing are found associated with each other in mammalian cells. Examples of DNMT1-associated proteins are HDAC1 [25], histone methyltransferase G9a [26], ATP-dependent chromatin remodeling enzyme SNF2H [27], and Polycomb protein EZH2 [28]. Therefore, the hypothesis that initial DNA or histone methylation attracts repressive complexes and creates a transcriptionally unfavorable chromatin conformation is very plausible. This alteration in chromatin structure, in turn, influences the nearby chromatin and makes it more prone to methylation spreading. This phenomenon is well documented in Arabidopsis, where tandem repeats upstream of the endogenous SDC gene recruit non-CG DNA methylation directed by histone methylation and siRNA, and display spreading of siRNAs and methylation beyond the repetitive DNA [29]. Existing evidence in mammalian cells shows that certain cis-acting elements dispersed throughout the genome can act either as a methylation signal element or as a methylation boundary during methylation spreading. For example, in the mouse Aprt gene, two upstream B1 repetitive DNA elements were identified as providing a de novo methylation signal for spreading [30]. These elements reside in large stretches of DNA dubbed methylation centers. Other retrotransposon elements such as B2, Alu (human equivalent of B1), and LINE-1 (long interspersed nuclear element-1) are also considered to possess de novo signaling activity for methylation spreading [24]. In contrast, the Sp1 binding sites within the Aprt promoter provide the counteracting force against spreading. 
Indeed, site-directed mutation of one or more Sp1 sites eliminates the binding of transcription factors and allows methylation to spread at the Aprt promoter [31]. Similarly, (ATAA)n repeat sequences in the human GSTP1 gene, and Sp1 and CTCF elements in the BRCA1 gene, act as boundary elements that prevent methylation from spreading onto CGIs [32, 33]. Recent genome-wide DNA methylation analysis discovered an overrepresentation of putative zinc finger binding sites at the boundaries of methylation-resistant CGIs. This observation suggested that these sites may reinforce transcription factor binding and thereby block methylation spreading and promote transcription [34]. A dynamic equilibrium between methylation spreading and its suspension is likely to be responsible for establishing and maintaining stable DNA methylation patterns in human somatic cells. Furthermore, a study combining bioinformatic approaches with methylation data from chromosome 21 has demonstrated that DNA sequence, repeat frequencies, and predicted DNA structures correlate with the methylation status of CGIs [35].
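
The CGI criteria quoted above can be made concrete with a minimal sketch. The function names are hypothetical, and the observed/expected ratio is computed with the commonly used formula (number of CpG × length) / (number of C × number of G), which the text itself does not spell out. For simplicity every threshold is applied as "greater than or equal to", although the definitions in [17] and [18] differ on whether each bound is strict.

```python
def cgi_stats(seq):
    """Return (length, GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides on the given strand
    gc_fraction = (c + g) / n if n else 0.0
    expected = (c * g) / n if n else 0.0  # CpGs expected from base composition
    obs_exp = cpg / expected if expected else 0.0
    return n, gc_fraction, obs_exp


def meets_cgi_criteria(seq, min_len, min_gc, min_obs_exp):
    """Check a sequence against one set of CGI thresholds (bounds simplified to >=)."""
    n, gc_fraction, obs_exp = cgi_stats(seq)
    return n >= min_len and gc_fraction >= min_gc and obs_exp >= min_obs_exp


# Threshold sets taken from the definitions cited in the text.
CLASSIC = dict(min_len=200, min_gc=0.50, min_obs_exp=0.60)       # [17]
TAKAI_JONES = dict(min_len=500, min_gc=0.55, min_obs_exp=0.65)   # [18]

if __name__ == "__main__":
    toy = "CG" * 150  # artificial 300 bp sequence, for illustration only
    print(cgi_stats(toy))
    print(meets_cgi_criteria(toy, **CLASSIC))
    print(meets_cgi_criteria(toy, **TAKAI_JONES))
```

In practice these statistics are evaluated over sliding windows along a chromosome rather than on a single fixed sequence; the sketch only shows how the three quantities are derived and compared.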

Aberrant gene expression is one of the key features associated with complex diseases such as cancer, type II diabetes, schizophrenia and autoimmune disease. These diseases are known to be heritable, although they do not follow clear Mendelian inheritance patterns. Several lines of evidence suggest that epigenetic abnormalities, together with genetic alterations, are responsible for the deregulation of key regulator genes resulting in these diseases. The epigenetic mechanism provides an alternative explanation for some of the features of complex diseases, including late onset, gender effects, parent-of-origin effects, and fluctuation of symptoms [36]. For example, in cancer cells, normally unmethylated CGIs are often hypermethylated, silencing flanking tumor suppressor genes during neoplasia [37, 38]. On the other hand, demethylation (hypomethylation) of normally methylated CGIs can lead to unscheduled activation of genes, as was first shown at the MAGE-1 locus, which is normally expressed only in germ line cells but is activated in human tumors [39]. Indeed, the pattern of cancer-associated methylation of CGIs also depends on other factors, such as cell lineage and environmental stimuli. Apart from cancer, a rare genetic disease, ICF (immunodeficiency, centromeric instability and facial anomalies) syndrome, has been linked to the methyltransferase machinery. ICF patients have mutations in the DNMT3B gene, which lead to hypomethylation of satellite DNA and specific chromosomal decondensation [40]. Thus, DNA methylation and its enzymatic apparatus play a significant role during normal embryonic development and in disease.

Histone modification and chromatin function

In eukaryotes, DNA is packaged as chromatin in the nucleus. Chromatin is further organized into two general structural states, silent heterochromatin and active euchromatin. Heterochromatin regions correspond to the bulk of nuclear material and constitute both telomeres and pericentric regions, and these areas tend to be rich in repetitive sequences and low in gene content. The rest of the genome is considered to be euchromatin, which contains most of the genes and is considered transcriptionally active. The nucleosome is the basic unit of chromatin, consisting of 147 bp of DNA wrapped around a histone octamer. Two copies of each of the following core histones are present in a mononucleosome: H2A, H2B, H3 and H4. All of them have a globular C-terminal domain and an unstructured N-terminal tail [41]. Interestingly, a variety of modifications are associated with these tails [42]. Histone modifications include methylation of arginine (R); methylation, acetylation, ubiquitination and sumoylation of lysines (K); and phosphorylation of serine (S) and threonine (T) (refer to the next section and Table 1 in this review for a detailed description of histone modification enzymes and their cognate substrates) [42, 43]. Even though the significance of most of these modifications is not fully understood, recent advances in the field have determined that lysine acetylation and methylation are key modulator marks for transcriptional activation or repression [42]. Acetylation/deacetylation of lysines correlates with chromatin accessibility and transcription, whereas lysine methylation exerts a different effect depending on the number of methyl groups and the position of the lysine residue. For example, modifications that are localized to inactive genes, such as trimethylation of H3K9 and H3K27, are often referred to as heterochromatin modifications. 
Trimethylation of histone H3 lysine 4 (H3K4) and acetylation of H3 and H4 are associated with active transcription and termed as euchromatin modifications [44, 45]. Plasticity of euchromatin keeps DNA open for biological activity, thus genes can be transcriptionally turned on or off. A combination of high levels of acetylation and trimethylation of H3K4, H3K36, and H3K79 can be detected in transcriptionally active genes, while low levels of these modifications are associated with inactive state. However, it should be noted that acetylation is exclusively associated with active chromatin. Most of the acetylated residues reside in N-terminal tails of histones except for H3K56, which resides in the core domain [42]. Similar to lysine methylation, arginine methylation can be either an active or repressive mark for transcription. Indeed, coactivator-associated arginine methyltransferase CARM1/PRMT4 is physically associated with histone acetyltransferases and acts in a cooperative manner mediating the function of nuclear factor κB (NF-κB) [46]. Another arginine methyltransferase, PRMT5 has been identified to be present in a promoter complex, and it is proposed that PRMT5 may function as a transcription repressor [47]. Similarly, histone lysine methyltransferase G9a also exhibits dual functional specificity acting as both transcriptional suppressor [48] and activator [49]. Apart from gene expression or repression, chromatin modification appears to play a vital role in other cellular processes such as the response to DNA damage. In response to DNA damage, histone modifications may assist in marking the site of damage and may provide a platform for repair to take place. For example, phosphorylation of H2A.X variant in response to DNA double strand breaks (DSBs), provides signals for nonhomologous end joining repair pathway in mammals [50]. 
Furthermore, methylation of H4K20 and cell cycle checkpoint protein Crb2 are associated with ionizing radiation-induced DNA damage in fission yeast that results in cell cycle arrest at G2/M. Similar DNA damage response is also observed in human cells. For example, mono- and dimethyl H4K20 are involved in recognition and recruitment of p53BP1, a homologue of Crb2, to damaged DNA sites [51].

Table 1 Histone-modifying enzymes.

Recently, a central role of the acetyltransferase HBO1 (histone acetyltransferase binding to origin recognition complex) was demonstrated during DNA replication [52]. In that study, HBO1 was found in a complex with the ING family of tumor suppressors, which are associated with cell proliferation. Low levels of HBO1 and ING5 correlate with a reduction in DNA synthesis and affect progression into S phase [52]. These results suggest that H4 acetylation by HBO1 is an important event for DNA synthesis. Furthermore, HBO1, ING4, ING5, and p53 have been shown to associate in shared protein complexes in cells. The tumor suppressor p53 physically interacts with HBO1 and negatively regulates its HAT activity in vitro and in cells, and thus connects p53-responsive stress signaling and HBO1-dependent chromatin modification pathways [53]. Another histone modification, phosphorylation, may play a role in condensation/decondensation of chromatin during replication in mammalian cells. For example, phosphorylation of H3S10 may function to displace the HP1 complex from H3K9 methylated chromatin to facilitate cellular events for decondensation [54].

In summary, DNA methylation and histone modifications are important for the coordinated transcription, replication and repair process. In all those complex cellular events, cross-talk between DNA methylation and histone modifications may help to maintain correct and ordered recruitment of protein factors onto chromatin for coordinated function. Therefore, deregulation of cross-talk(s) can lead to aberrant outcomes of important biological processes in living cells.

Enzymes that participate in chromatin modifications

As described before, chromatin modifications in mammals occur at two distinct levels, DNA methylation and histone modifications. Several mammalian DNMTs have been identified and grouped into two major classes depending on their substrate preference and the resulting function (reviewed in [55, 56]). DNMT3A and DNMT3B are de novo methyltransferases that are responsible for establishing cytosine methylation patterns on unmethylated DNA. Global de novo methylation occurs during early embryogenesis, when DNA methylation marks are re-established after genome-wide demethylation for epigenetic reprogramming. Once established, DNA methylation patterns should be stably maintained over cell divisions. This function is fulfilled by the maintenance methyltransferase DNMT1 through its preference for hemimethylated DNA [4, 57], which allows it to copy preexisting methylation patterns onto the newly synthesized DNA strands during DNA replication. Furthermore, different isoforms of DNMT1 (Dnmt1s and Dnmt1o) participate in maintenance of methylation imprints in preimplantation mouse embryos [58]. Thus, cooperative function between DNMTs provides a way of passing and maintaining epigenetic information between successive cell generations. The targeted deletion of these de novo and maintenance methyltransferases results in various developmental defects [36]. Unlike DNMT1 and DNMT3A/B, which contain both regulatory and catalytic domains, the DNA methyltransferase DNMT2 has only the catalytic domain and exhibits only weak methyltransferase activity in vitro, and its absence causes no discernible effects on global CpG methylation or developmental phenotype [36]. DNMT3L is a DNMT3-related protein that is expressed only in germ cells and at the stage where de novo methylation occurs [59]. It lacks enzymatic activity but modulates the catalytic activity of DNMT3A and DNMT3B by physically associating with them. 
Crystal structures for some of these mammalian DNMTs (mouse Dnmts) have been solved, providing additional biochemical and structural insights into the function of the enzymes [60]. To date, the available structures include the PWWP domain of Dnmt3b [61], full-length Dnmt3L with a bound histone H3 N-terminal tail peptide [62], and a complex between the C-terminal domains of Dnmt3a and Dnmt3L [63].
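
The maintenance logic described above, in which DNMT1 restores full methylation at hemimethylated sites after replication, can be illustrated with a toy sketch. This is a deliberately simplified illustration and not a biochemical model: each duplex is represented as two lists of booleans (one entry per CpG site, `True` meaning methylated), fidelity is assumed to be perfect, and the function names are hypothetical.

```python
def replicate(duplex):
    """Semiconservative replication: each parental strand pairs with a new,
    unmethylated strand, so methylated sites become hemimethylated."""
    top, bottom = duplex
    blank = [False] * len(top)
    daughter_a = (list(top), list(blank))
    daughter_b = (list(blank), list(bottom))
    return daughter_a, daughter_b


def maintain(duplex):
    """Toy maintenance step: a site methylated on either strand ends up
    methylated on both; fully unmethylated sites are left untouched."""
    top, bottom = duplex
    restored = [t or b for t, b in zip(top, bottom)]
    return (list(restored), list(restored))


if __name__ == "__main__":
    pattern = [True, False, True, True]
    parent = (list(pattern), list(pattern))
    for daughter in replicate(parent):
        print("after replication:", daughter)
        print("after maintenance:", maintain(daughter))
```

Because the toy maintenance step never methylates a site that is unmethylated on both strands, it only copies an existing pattern; establishing new marks would correspond to the separate de novo activity attributed to DNMT3A and DNMT3B.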

While correct establishment and maintenance of DNA methylation patterns are critical for normal development, DNA demethylation is equally important for precise execution of developmental programs, as evidenced by epigenetic reprogramming events in early embryos and primordial germ cells (PGCs). It is unknown whether DNA demethylation requires demethylase activities or can occur passively through DNA replication in the absence of DNMT1. Although no DNA demethylase activity has been convincingly identified, several mechanisms have been proposed to account for the loss of DNA methylation [64]. For example, DNA deaminases of the Aid/Apobec family have been shown to catalyze deamination of 5-methylcytosine, resulting in a T:G mismatch, which may lead to DNA demethylation if the mismatch is repaired [65]. Interestingly, a recent study has proposed that DNMTs themselves have dual roles in CpG methylation and in active demethylation of 5-methyl CpGs through deamination during periodic methylation/demethylation cycles of the pS2 gene promoter upon activation by estrogens [66]. Although the precise roles of DNMT3A/B in this process are unclear, the study has demonstrated that DNMT3A/B can deaminate both cytosine and 5-methylcytosine in vitro, and that concordant recruitment of DNMT3A/B, DNA glycosylase, and other base excision repair proteins occurs during pS2 promoter demethylation.

Covalent modifications of histones add multiple layers of complexity to chromatin, ranging from small chemical changes such as acetylation and methylation to large peptide additions such as ubiquitylation and sumoylation. Over the past 10 years, several families of histone-modifying enzymes have been identified, as summarized in Table 1. Recent reviews have extensively covered histone modifications, their mechanisms of action, and the biological functions derived from individual or combined modifications [42, 67]. Of particular interest, a new nomenclature for some families of chromatin-modifying enzymes has recently been proposed, since the current nomenclature of the enzymes is rather inconsistent and often creates additional complexity [68]. Most histone modifications are dynamically regulated, as evidenced by the identification of many enzymes that can remove the modification. One well-studied example is histone demethylation, which is carried out by two classes of enzymes, amine oxidases such as LSD1 and hydroxylases of the JmjC family [69]. In contrast, arginine demethylation activity has not been identified yet, although the deimination process converting an arginine to citrulline has been proposed as an alternative mechanism to antagonize arginine methylation [70].

As supported by the number and type of histone-modifying enzymes (Table 1), lysine has emerged as a crucial amino acid residue for histone modifications over the past decade. Interestingly, lysine modifications of non-histone proteins are also mediated by some of the known histone-modifying enzymes, and can be reversed by antagonizing activities just as observed for histone modification. For example, lysine methylation has recently been identified as a novel modification of the p53 tumor suppressor in addition to previously known modifications such as acetylation and ubiquitylation [71]. Histone-modifying enzymes involved in methylation/demethylation of p53 include SMYD2, SET9, and LSD1.

In summary, dynamic modifications of DNA/histones and non-histone proteins by chromatin-modifying enzymes reflect their functional diversity and regulatory complexity.

Other nuclear proteins crucial for epigenetic modifications

Chromatin modifications can directly change chromatin structure by altering the physical properties of individual nucleosomes, primarily by neutralization or addition of charge to target residues. This affects histone-DNA interactions and creates either a more open chromatin architecture or higher-order structures through differential modulation of internucleosomal contacts [67]. In most cases, however, the epigenetic roles of chromatin modifications are augmented by many specialized sets of nuclear proteins that do not participate in chromatin modifications per se but are critical for epigenetic gene regulation. Among many proteins that fall into this category, three types of proteins/complexes are briefly reviewed in this section: chromatin remodeling complexes, effector proteins with various binding modules for different modifications, and insulator proteins.

Chromatin remodeling complexes are energy-driven, multi-protein machinery that allows access to specific DNA regions or histones by altering nucleosomal positions, histone-DNA interactions, and histone octamer positions (Fig. 1A). These chromatin remodellers have a catalytic ATPase to induce changes in local chromatin structure covering one or two nucleosomes. The ATPases in chromatin remodeling complexes are grouped into three subfamilies: the SWI/SNF ATPases, the imitation switch (ISWI) ATPases, and the chromodomain and helicase-like domain (CHD) ATPases. Several recent reviews have summarized the current understanding of diversity and specialization of chromatin remodeling complexes and modulation of remodeller activity by nucleosome modifications [72, 73].

Figure 1

Other nuclear proteins crucial for epigenetic modifications and gene regulation. (A) A simplified example of the role of chromatin remodeling complexes recruited by transcription factors or specific modifications on chromatin [72]. ATP hydrolysis-driven repositioning of nucleosomes exposes an occluded DNA region to allow access of the transcriptional machinery. (B) Modified histones serve as recognition sites for effector proteins. The illustration shows that trimethylated H3K9 (hexagons) is recognized by HP1, which recruits SUV39H1 and DNMTs to facilitate further H3K9 methylation, HP1 binding, and DNA methylation on adjacent nucleosomes, resulting in repressive chromatin spreading [75, 76]. (C) A model illustrating two major functions of insulators. An insulator (I) placed between an enhancer (E) and a promoter blocks enhancer-promoter communication, thereby preventing inappropriate activation of promoters by distant enhancers (left panel). Insulators can also function as a chromatin barrier that limits heterochromatin spreading and prevents repression of neighboring genes (right panel). Condensed heterochromatin is decorated with repressive marks such as H3K9 methylation (hexagons) and DNA methylation (circles).

In many cases, chromatin modifications serve as recognition sites for the recruitment of effector modules that read and implement modification-encoded biological messages [42, 74]. Several distinct binding modules have been identified in various nuclear proteins, coupling a particular histone modification with cognate effector proteins (Table 2). Thus, the composition of modifications on a given histone can either recruit or occlude a set of proteins. Effector proteins may alter chromatin structure by binding two or more nucleosomes as found with HP1 and Polycomb group proteins [74]. Effector proteins can also act as an adaptor to attract additional chromatin-modifying enzymes or remodeling complexes to augment the chromatin alteration initiated by the modification. Such an example can be found in HP1 binding to trimethylated H3K9 [75] and DNMT1 [76]. These initial interactions can recruit SUV39H1 and/or DNMT1 and promote further H3K9 methylation, HP1 binding, and DNA methylation, which may in turn result in further transcriptional gene silencing or chromatin repression [77] (Fig. 1B). Similarly, the PHD domain of BPTF, a component of the NURF chromatin remodeling complex, recognizes trimethylated H3K4 and brings the remodeller with it [78]. Some other effector proteins have enzymatic activities themselves, as exemplified by CHD1 remodeling ATPase, which binds to trimethylated H3K4 and introduces active structure remodeling [42]. Similar effector proteins have been identified for DNA methylation. A series of methyl CpG-binding proteins, such as MBDs and MeCP2, has demonstrated the ability to interpret DNA methylation marks in different biological contexts (reviewed in [56]). Specifically, it has been demonstrated that interpretation of DNA methylation marks by MBDs and MeCP2 has additional assurance via recruitment of HDACs for gene silencing [25, 79, 80]. 
Recently, another methyl CpG-binding protein UHRF1 has been shown to recruit DNMT1 itself onto chromatin to facilitate the faithful inheritance of genomic methylation patterns [81, 82].

Table 2 Effector proteins containing specific binding modules for histone modifications.

Finally, insulators are DNA elements that can protect a gene from neighboring transcriptional influences to prevent inappropriate activation or repression of the gene. Insulators have two well-known functions, enhancer blocking and formation of a chromatin barrier, so that they can either prevent distal enhancers from activating a promoter or block heterochromatin spreading that may lead to silencing of neighboring genes (Fig. 1C). A chromatin insulator protein, CTCF (CCCTC-binding factor), plays important roles in many aspects of epigenetic regulation including genomic imprinting, X-chromosome inactivation, transcription of non-coding RNAs at repetitive elements, and long-range chromatin interactions (reviewed in [83, 84]). Thus, CTCF-binding sites establish epigenetic boundaries by which correct gene expression is ensured during development, and also contribute to higher-order genome organization within the nucleus.

Chromatin modification and its role in development

Epigenetic mechanisms can affect long-term gene expression, which constitutes the basis for the accurate execution of developmental programs and the maintenance of the cell types over subsequent cell divisions. PcG and trxG genes were first discovered in Drosophila melanogaster as master regulators of homeotic (Hox) gene expression. Polycomb complexes function as repressors of target genes, whose action is balanced by an antagonistic effect of trithorax complexes working on the identical DNA regulatory elements. These elements, PcG or trxG response elements (PREs/TREs), recruit PcG or trxG proteins to form multimeric complexes on PREs/TREs and mediate epigenetic inheritance of silent or active chromatin states through cell divisions, respectively. These PcG and trxG complexes are not required for the initial establishment of homeotic gene expression pattern, but are essential for maintenance of the established state throughout the rest of development (reviewed in [9]). Although PREs/TREs have only been identified and characterized in Drosophila, PcG and trxG genes are also conserved in mammals and play an important role in many developmental processes such as cell lineage specification and stem cell maintenance (reviewed in [85]). For example, recent genome-wide PcG profiling in mouse and human embryonic stem (ES) cells has revealed that most PcG targets in ES cells are regulators of differentiation pathways, suggesting that the PcG proteins keep stem cells in a pluripotent state by suppressing cell fate-specific genes [86, 87]. These PcG target genes can be activated upon differentiation to result in specific cell types with a concomitant loss of PcG proteins, suggesting a possibility that trxG proteins may be involved in this activation by replacing PcG proteins on the target genes.

PcG proteins form two distinct multi-protein complexes, Polycomb repressive complex 1 (PRC1) and PRC2, although mammals have two additional PRC2-related complexes. The components of each complex in different organisms and their recruitment/mechanism of action have been reviewed comprehensively [88, 89]. The catalytically active component of PRC2, EZH2, catalyzes trimethylation of histone H3K27, and this enzymatic activity is required for PRC2-mediated silencing. The H3K27 methylation mark deposited by PRC2 recruits PRC1 via its chromodomain-containing components, which is believed to facilitate condensation of chromatin structure. Other properties of PRC1 also contribute to transcriptional silencing. PRC1-mediated ubiquitylation of histone H2A is critical for Hox gene silencing by an unknown mechanism [90]. In mammalian cells this robust PcG-mediated repression appears to be stabilized by DNA methylation, since EZH2 can directly recruit DNA methyltransferases to target genes [28]. Furthermore, H3K27 methylation by PcG predisposes the marked genes to de novo methylation, leading to aberrant silencing in cancer cells [91]. Although it remains unknown in mammalian cells, there may be additional mechanisms other than histone/DNA modifications in PcG-mediated repression, since studies in Drosophila have implicated other silencing mechanisms such as direct interactions with the transcriptional machinery and transcription of non-coding RNA (ncRNA) (reviewed in [9]). In fact, PcG complexes have been shown to participate in gene silencing during X-chromosome inactivation and genomic imprinting, where ncRNAs play a critical role in silencing mechanisms, which is reviewed later in this contribution. As the mechanistic opposite of PcG, trxG proteins also form several multimeric complexes. The trxG-associated MLL1 has been shown to catalyze histone H3K4 trimethylation, which is recognized by BPTF, a subunit of the NURF chromatin remodeling complex. 
This targeting of the remodeling complex to histones methylated by trxG is thought to facilitate active chromatin formation by repositioning nucleosomes on the promoter [78].

In addition to activation of genomic programs leading to specific cell types, another equally important epigenetic event during development is that a cell must silence alternative gene expression programs specific to other cell types to secure its fate. The best example of this lineage restriction is found in neurogenesis, during which neural cell fates are acquired in the developing nervous system, and neuron-specific genes are repressed in non-neuronal cells outside the nervous system. This suggests that neuronal chromatin is epigenetically programmed in different cellular contexts. REST (repressor element 1-silencing transcription factor), a repressor of neuronal genes containing a conserved RE1, provides a link between epigenetic mechanisms and neurogenesis by establishing silent chromatin states in cooperation with other corepressors and chromatin modifiers (reviewed in [92]). The corepressor CoREST confers more specialized repression mechanisms, such that the REST-CoREST complex recruits various chromatin modifiers for long-term silencing of neuronal genes in terminally differentiated non-neuronal cells. Chromatin-modifying enzymes and other epigenetic silencing factors involved in this process have been extensively reviewed [92].

In contrast to stable and inheritable silencing of neuronal chromatin in terminally differentiated non-neuronal cells, the situation in ES cells and neuronal progenitors imposes another epigenetic constraint on gene expression, since these cells must be able to relieve the silent chromatin state upon differentiation to allow lineage-specific gene expression. A comparative analysis of neuronal gene chromatin in terminally differentiated fibroblasts and pluripotent ES cells has revealed that stem cells and progenitors possess a poised chromatin status for subsequent neuronal differentiation with distinct differences in epigenetic marks and transcriptional features [93]. This study suggests that the core REST complex establishes characteristic chromatin states by recruiting different chromatin modifiers in non-neuronal and ES cells, emphasizing the role of REST and its corepressors in building plasticity of neuronal chromatin.

Taken together, epigenetic mechanisms set a fundamental basis for maintenance of ES cell identity and long-term cellular memory that are crucial for normal development.

Dosage compensation in mammals

In mammals, females have two X chromosomes (XX), while males have only one (XY). This chromosomal difference between the sexes creates a need for dosage compensation systems to adjust the gene dose of X-linked genes. Mammalian dosage compensation is accomplished by silencing of one of the two X chromosomes in females, a process referred to as X chromosome inactivation (XCI) (reviewed in [94]). In mouse and human embryos, XCI is initiated in early development. XCI is regulated by a cis-acting master switch locus, the X-inactivation center (Xic), which includes the ncRNA gene Xist (X inactive specific transcript) and its antisense transcription unit Tsix/Xite (Xist spelled backward due to its antisense orientation to Xist). The Xic senses the number of X chromosomes and produces the noncoding Xist RNA from one of the two chromosomes to trigger silencing in cis. Therefore, the initiation of this random inactivation presents important questions on how cells count the number of X chromosomes and choose which one to inactivate. Recent progress in understanding the mechanisms driving the XCI counting and choice has indicated that multiple regulatory systems may be involved, thus giving rise to multiple models for the initiation of random XCI (reviewed in [95]). Among these interesting findings, trans-interaction of X chromosomes via a novel X-pairing region of Xic has been observed, suggesting that the homologous pairing may enable a cell to detect the number of X chromosomes and coordinate Xist/Tsix expression to determine the future active and inactive X chromosomes (Xa and Xi, respectively) [96]. Another recent study supports an alternative mechanism, a stochastic model where each X chromosome has an independent probability to initiate XCI within a certain time span. This study suggests the presence of a novel X-encoded trans-acting XCI activator involved in initiation of XCI, based on observations in tetraploid ES cells [97].
In contrast to random inactivation, in some mammals the parental origin determines which X chromosome is to be inactivated (reviewed in [94]). All tissues of marsupials and the extra-embryonic tissues of mice display imprinted inactivation of the paternal X chromosome. The molecular basis underlying the preferential paternal X inactivation and the nature of imprints are not currently well understood.

Once the future Xi is chosen, XCI starts with the accumulation of Xist RNA along the Xi. The Xist expression is regulated by the Tsix gene that acts primarily in the nucleus and is transcribed in the antisense direction over the Xist gene [98]. The Xist RNA coating-induced silencing accompanies multiple layers of epigenetic modifications on the Xi, which lock in and stably maintain the inactive state through cell divisions (reviewed in [94, 95]). Chromosome-wide studies revealed various X-linked histone modifications, including hypoacetylation of histone H4 [99], trimethylation of H3K9 and H3K27 [100, 101], H4K20 monomethylation [102], H2AK119 monoubiquitylation [103], as well as substitution of core histone H2A with the histone variant macroH2A [104]. In addition to the histone modification profile, the Xa and Xi allele-specific DNA methylation patterns have also been established [105]. Analysis of Dnmt1 −/− embryos has shown that methylation is required for stable maintenance of gene silencing on the Xi [106]. As discussed above, a wide range of chromatin modifiers are known to be involved in XCI, including PcG complexes, histone deacetylases, and DNMT (reviewed in [94]). Although the exact combination of histone modifications on the Xi may vary during development and in different lineages and cell types, the order of chromatin modifications leading to X inactivation was postulated based on the observations during female mouse ES cell differentiation (reviewed in [94, 95]). First, Xist RNA transcription and accumulation on the Xi in cis trigger silencing through as yet unknown mechanisms. Then, recruitment of PRC1 and PRC2 mediates H2AK119 monoubiquitylation and H3K27 trimethylation, respectively. At this early stage of XCI, the inactivation process is reversible and dependent on the presence of Xist RNA.
As cell differentiation proceeds, the Xi undergoes deposition of histone macroH2A and histone H4 hypoacetylation, followed by promoter-specific DNA methylation on the Xi. At this phase, the XCI is irreversible and Xist RNA is not required for maintenance of the inactive state. Apart from chromatin modifications on Xi, the Xi also shows the shift to late replication during random inactivation [107] and Xist RNA defines a repressive nuclear compartment early on in the XCI process [108]. Thus, the epigenetic marks and temporal/spatial segregation mechanisms contribute to the initiation and maintenance of XCI. Despite significant progress in understanding of molecular mechanisms of XCI, there are still many unanswered questions. For example, the counting and choice process of random inactivation awaits further elucidation of its molecular basis. Similarly, the mechanisms by which Xist RNA triggers recruitment of chromatin-modifying complexes remain unknown. Furthermore, it is still elusive how cis-acting elements and trans-acting factors coordinate and spread silencing across the chromosome.
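
The postulated order of events can be written down as a small ordered data structure. The following Python sketch is only an illustration of the two phases described above; the class name, field names and helper function are hypothetical, and the grouping of events into exactly two stages is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XciStage:
    """One phase of X inactivation as described in the text (illustrative only)."""
    name: str
    events: tuple          # events in the order given in the text
    reversible: bool       # can the inactive state still be reversed?
    requires_xist: bool    # does silencing depend on the presence of Xist RNA?


# Order postulated from female mouse ES cell differentiation (see text).
# Collapsing the process into two stages is a simplification for this sketch.
XCI_STAGES = (
    XciStage(
        name="early",
        events=(
            "Xist RNA transcription and accumulation on the Xi in cis",
            "PRC1 recruitment: H2AK119 monoubiquitylation",
            "PRC2 recruitment: H3K27 trimethylation",
        ),
        reversible=True,
        requires_xist=True,
    ),
    XciStage(
        name="late",
        events=(
            "macroH2A deposition",
            "histone H4 hypoacetylation",
            "promoter-specific DNA methylation",
        ),
        reversible=False,
        requires_xist=False,
    ),
)


def silencing_survives_xist_loss(stage_name: str) -> bool:
    """Return True if the inactive state is maintained without Xist RNA."""
    for stage in XCI_STAGES:
        if stage.name == stage_name:
            return not stage.requires_xist
    raise ValueError(f"unknown stage: {stage_name}")
```

Under these assumptions, `silencing_survives_xist_loss("early")` is false and `silencing_survives_xist_loss("late")` is true, mirroring the switch from Xist-dependent, reversible silencing to Xist-independent, irreversible silencing.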

Genomic imprinting

Diploid organisms such as mammals carry two copies of autosomal genes, one from each parent. In most cases, both parental alleles have equal potential to be expressed in cells. However, a subset of autosomal genes is subject to genomic imprinting, by which expression is limited to one of the two parental alleles depending on the parent-of-origin of the gene. Genomic imprinting is an epigenetic mechanism conserved in placental mammals, and failure to establish correct imprinting has been shown to cause defects in embryonic and neonatal growth and can result in neurological disorders such as Prader-Willi syndrome [109]. To date, ∼80 imprinted genes have been identified in mice, the majority of which are clustered in the genome while there are some solo imprinted genes [110]. Each imprinted gene cluster often encompasses several protein-coding genes over 100–3000 kb of DNA, and at least one non-coding RNA (ncRNA) gene [111]. For example, the maternally imprinted Igf2r cluster spans ∼500 kb, and contains three maternally expressed protein-coding genes and the 108 kb Air ncRNA gene that is essential for imprinted gene expression [112, 113]. Expression of imprinted genes in each cluster is generally controlled by a single major cis-acting element, the imprinting control region (ICR) [114]. ICRs are CpG-rich DNA sequences that are methylated in only one of the two parental gametes, and thus carry the parental information. This DNA methylation imprint is acquired during gametogenesis. Prior to sex determination, the parental imprints are erased in germ cells formed in the embryonic gonad. As the embryo develops into a male or female, gametic imprints are placed on paternally imprinted genes during sperm production and on maternally imprinted genes during egg formation, respectively. After fertilization, this methylation imprint is maintained on the same parental chromosome through cell divisions.
Establishment and maintenance of the imprints require an array of epigenetic machinery. Gametic imprints are established in germ cells by the de novo methyltransferase Dnmt3a [115]. Another member of the Dnmt3 family, Dnmt3L, has been shown to be essential for maternal imprinting in female germ cells, whereas its disruption in male germ cells results in meiotic catastrophe caused by retrotransposon reactivation [116, 117]. These imprinted marks are stably propagated through successive cell divisions by the maintenance methyltransferase Dnmt1 and its oocyte-specific isoform Dnmt1o [118, 119]. Furthermore, these gametic imprints can be erased in germ lines during genome-wide reprogramming by unknown demethylation mechanism(s). Although DNA methylation is the most important mechanism for imprinting, it does not appear to be the only mechanism. Histone modification by a mouse PcG protein Eed has been demonstrated to affect a few paternally repressed genes; however, it has a relatively minor effect compared to that of DNA methylation and may only contribute to maintenance of imprints [120]. Similarly, the absence of histone methyltransferase G9a has been shown to exert pronounced effects on paternal repression of placenta-specific imprinted genes [121].

As mentioned above, each imprinted gene cluster contains at least one ncRNA gene that plays a crucial role in silencing of the multiple protein-coding genes in the cluster by cis-acting mechanisms. Current understanding of imprinted gene expression/silencing has been gained from the studies on six well-characterized clusters, of which four (Igf2r/Air, IC2/Kcnq1, Gnas, and PWS/AS) are maternally imprinted, and two (Igf2/H19 and Dlk1) are paternally imprinted by DNA methylation acquired in the male gamete (as reviewed in [111]). Despite the differences in gene organization and ICR functions in different clusters, a few common features of imprinted gene expression/silencing can be derived. The unmethylated ICRs are implicated in all six clusters as positive regulators of ncRNA expression. In maternally imprinted clusters (Igf2r/Air and IC2/Kcnq1), the unmethylated ICR works as a promoter for a paternally expressed ncRNA that is in an antisense orientation to at least one of the genes in the cluster. While deletion of the methylated maternal ICR has no effect on maternally inherited alleles, deletion of the unmethylated paternal ICR reverses the parental-specific expression pattern such that ncRNA expression is lost and biallelic gene expression is obtained by abrogation of paternal silencing. Truncation of ncRNAs at these loci also has similar effects on paternal gene expression, relieving silencing of the paternally inherited alleles (reviewed in [122]). ICRs in paternally imprinted clusters appear to utilize different mechanisms to control the imprinted gene expression. For example, the H19 ncRNA at the Igf2/H19 locus is expressed from the unmethylated maternal chromosome but the ICR does not act as a promoter. Rather, it serves as a boundary element for CTCF (CCCTC-binding factor), a chromatin insulator protein [83].
The CTCF protein binds the unmethylated maternal ICR, blocking the interaction of downstream enhancers with the Igf2 and Ins promoters, while it does not affect the interaction between the enhancers and the H19 ncRNA promoter. On the paternal chromosome, the DNA methylation imprint prevents CTCF binding, thus allowing the enhancers to drive the expression of Igf2 and Ins genes [123]. Interestingly, the CTCF protein has been shown to have an additional function at the Igf2/H19 locus, protecting the maternal allele from methylation post-fertilization [124].
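
The enhancer-blocking logic at the Igf2/H19 locus can be summarized as a simple truth table. The Python sketch below is a deliberately minimal illustration, not a quantitative model; the function name and dictionary keys are hypothetical, and treating H19 as expressed only from the allele with an unmethylated ICR is a simplifying assumption based on the statement that H19 is expressed from the unmethylated maternal chromosome.

```python
def igf2_h19_allele_expression(icr_methylated: bool) -> dict:
    """Boolean sketch of the CTCF insulator model for one parental allele.

    icr_methylated=False corresponds to the maternal allele,
    icr_methylated=True to the paternal allele (see text).
    """
    # The DNA methylation imprint prevents CTCF binding to the ICR.
    ctcf_bound = not icr_methylated
    # Bound CTCF blocks the downstream enhancers from reaching Igf2 and Ins.
    enhancers_reach_igf2_ins = not ctcf_bound
    return {
        "CTCF_bound": ctcf_bound,
        "Igf2": enhancers_reach_igf2_ins,
        "Ins": enhancers_reach_igf2_ins,
        # Assumption for this sketch: H19 is expressed only when the ICR
        # is unmethylated; CTCF does not block the enhancer-H19 interaction.
        "H19": not icr_methylated,
    }


maternal = igf2_h19_allele_expression(icr_methylated=False)
paternal = igf2_h19_allele_expression(icr_methylated=True)
```

With these assumptions, the `maternal` allele expresses H19 but not Igf2 or Ins, and the `paternal` allele shows the reciprocal pattern, which is the parent-of-origin-specific expression described for this cluster.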

Although there is an obvious involvement of ncRNAs in imprinted gene silencing, it is unclear how they can repress even non-overlapping genes that lie several hundred kilobase pairs away on either side of the imprinted ncRNA gene. The major question on this issue is whether imprinted ncRNAs silence genes through the transcript itself or through the act of transcription. Several models have recently been reviewed to address this question [122]. Given the similarities in silencing mechanisms between genomic imprinting and X-chromosome inactivation, many useful insights into imprinting mechanisms may be obtained by examining whether what is known about X-chromosome inactivation can be applied to genomic imprinting. Another important question that remains unanswered is how the gametic methylation machinery distinguishes parental-specific alleles and establishes DNA methylation marks at different regions at different loci.

Inheritance of silent loci and genome defense

Completion of the human and mouse genome sequences revealed that transposable elements (TEs) play a major role in shaping the mammalian genome, in particular, in its evolution [125, 126]. These elements account for 45% and 37% of the human and mouse genomes, respectively. Families of repetitive elements include long terminal repeat (LTR) retrotransposons, long interspersed nuclear elements (LINEs), short interspersed nuclear elements (SINEs), and DNA transposons.

Retrotransposons transpose with the help of reverse transcriptase and they can be divided into two subfamilies depending on the presence or absence of direct repeats at the ends of the element, called LTRs. LINE elements do not contain LTRs and account for 17% of the total human genome. A small percentage of these autonomous non-LTR retrotransposons in the human genome remain active [127]. Intracisternal A-particles (IAPs), MaLR and Etn elements are active LTR retrotransposons present in the mouse genome [128]. In contrast, SINE elements are non-autonomous, non-LTR retrotransposons. The Alu repeats are the most common SINE family in the human genome and account for 10% of the whole genome mass [129]. B1 and B2 are the major SINE elements in the mouse genome [130]. DNA transposons do not require reverse transcriptase for integration into the genome. Instead, a self-encoded protein called transposase recognizes the terminal inverted repeats (TIRs) of the DNA transposon for genome integration. To date, no evidence has been available for the presence of active DNA transposons, although many copies of inactive fossil DNA transposons are present [125].
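
The classification criteria above can be condensed into a short decision function. This Python sketch is only an illustration; the function and argument names are hypothetical, and splitting non-LTR retrotransposons into LINEs and SINEs purely by autonomy is a simplification that follows the description in the text rather than a complete taxonomy.

```python
def classify_te(uses_reverse_transcriptase: bool,
                has_ltr: bool = False,
                autonomous: bool = True) -> str:
    """Assign a transposable element to one of the four families in the text."""
    if not uses_reverse_transcriptase:
        # Integration relies on a self-encoded transposase recognizing TIRs.
        return "DNA transposon"
    if has_ltr:
        return "LTR retrotransposon"
    # Non-LTR retrotransposons: autonomous (LINE) or non-autonomous (SINE).
    return "LINE" if autonomous else "SINE"


# Examples taken from the text.
EXAMPLES = {
    "IAP (mouse)": classify_te(True, has_ltr=True),
    "LINE element": classify_te(True, has_ltr=False, autonomous=True),
    "Alu (human)": classify_te(True, has_ltr=False, autonomous=False),
    "B1 (mouse)": classify_te(True, has_ltr=False, autonomous=False),
    "fossil DNA transposon": classify_te(False),
}
```

Here `EXAMPLES` maps each element named in the text to its family under the stated criteria, for instance Alu and B1 to "SINE" and IAP to "LTR retrotransposon".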

There are many ways that transposable elements can interfere with the structure of the genome and the regulation of gene expression. They include insertion, deletion or inversion of genomic sequences. Recombination between non-allelic repeats can lead to rearrangements/translocations, and strong constitutive promoters of retrotransposons can express chimeric mRNA [128, 131]. Transposable elements can also serve as promoters, enhancers, silencers, and alternative splicing sites and thereby modulate the expression of related genes [132]. Despite the huge number of these transposons and the different modes of gene disruption associated with them, the damage that transposons cause to their host is generally minor. For instance, only 1 in 600 germ line mutations in humans can be attributed to transposon insertions [133]. In fact, the damage caused by transposons is largely limited by active repression of these endogenous parasitic elements. Most transposon copies reside in heterochromatin, which by definition contains regions of silent DNA, so that they pose little harm to the host genome.

The genome structure of mammals (and other vertebrates) is protected against these parasitic transposable elements. DNA cytosine methylation and modification of histone tails (methylation at H3K9 and deacetylation) are associated with the host-defense system [134, 135]. Drosophila suffers from abundant transposon-mediated mutations and lacks DNA methylation, which adds supportive evidence to the above scenario [136]. In mouse, the transcription of IAP is normally repressed but is greatly induced in embryos lacking DNMT1, demonstrating that methylation is responsible for the repressed state of these elements [137]. Human endogenous retroviruses (HERVs) resemble simple retroviruses in structure. The demethylation of HERVs has been examined in a limited number of cancers (germ cell tumors and cancers of the ovary, testicles and bladder). In these cases, HERV hypomethylation increases with malignancy [138]. In vitro transcription assays using site-specific mutagenesis and methylation demonstrate that methylation of critical CpG dinucleotides within the LINE promoter is enough to ensure repression of transcription. In a number of cancers, hypomethylation of LINE elements is evident, compared to their normal counterparts or unaffected adjacent tissues [139, 140]. LINE hypomethylation can occur early in cancer initiation, notably in colon and prostate cancers. In most other cancers studied (leukemias, urothelial, ovarian and breast cancers), LINE demethylation increases with the degree of malignancy. Therefore, depending on the cancer type, LINE hypomethylation may be useful as an early detector of cancer or a prognostic indicator [141].

Modification of histones also plays a role in suppressing TE transcription. Chromatin associated with TEs is enriched for methylation of histone H3K9, which is a signal for transcriptional suppression. Mutation in Suv39, an H3K9 methyltransferase, leads to reactivation of TE transcription in mouse ES cells [135]. In A. thaliana, DDM1 is required for TE silencing. Mutation of ddm1 results in a loss of DNA and H3K9 methylation, also leading to active TEs [142]. Lsh1, the mouse homolog of ddm1, is required for TE suppression, and elevated TE transcripts were observed in mutant Lsh1 −/− embryos [143].

Another layer of TE regulation comes from RNA interference (RNAi), which is mediated through micro-RNAs (miRNAs). These ∼22-nucleotide-long small RNA molecules can negatively control their target gene expression. To date, more than 460 miRNAs have been documented [144]. Although RNAi-mediated DNA methylation and TE silencing are well understood for A. thaliana [145], the mechanism by which RNAi mediates chromatin modification is not established in mammals. It is known that the DNA methyltransferase DNMT3A binds to artificially introduced siRNA and directs DNA methylation, which is consistent with a requirement of this enzyme in the downstream event in RNAi [146]. However, complete understanding of RNAi-regulated epigenetic mechanisms in mammals still awaits further investigation.

Future prospects

Research in the last two decades demonstrated an emerging pattern of cross-talk between different epigenetic pathways. Some of these pathways were similar and conserved between yeast and mammalian cells. For example, a cross-talk between RNAi pathways and histone modification reading protein Chp1 of yeast is similar to the Xist RNA of the mammalian cells that plays a role in deposition of DNA and histone methylation marks for X chromosome inactivation, although yeast cells are devoid of DNA methylation. One of the nagging but difficult questions in epigenetic mechanisms is the timing of the events. It is plausible to imagine that chromatin replication during S phase of the cell cycle may offer a greater flexibility for such information to pass from one generation to the next. This hypothesis is supported by the presence of several complexes of epigenetic factors such as DNMT1-G9a-PCNA [26], CAF1-MBD1-SETDB1 [147], DNMT1-HDACs [25, 148] and the Polycomb protein EZH2-DNMT1 complex that directs H3K27 methylation [28] during mammalian chromatin replication. However, these observations do not answer all the questions. Indeed, mislocalization of DNMT1 from the replication fork only had a small effect on the overall genomic methylation by reducing the methylation efficiency [149]. Perhaps there are post-replicative chromatin modifications that occur after the initial wave of replicative chromatin modification during cell division. Currently, it is not known what roles modified histones play after the semi-conservative chromatin replication. With the recent discovery of several histone demethylases that can erase epigenetic marks, epigenetic modifications appear to be more reversible than fixed. This brings us to another challenging area of how epigenetic marks are erased or rewritten during development and diseases.
These phenomena are also not understood during ES cell development, especially how a multi-potent stem cell can give rise to several different cell types, each being genetically identical but with unique epigenetic signatures and different cellular phenotypes. Such distinctive epigenetic phenotypes are hallmarks of adult monozygotic human twins [150]. Finally, we need a better understanding of the molecular phenomenon of epigenetics in mammalian development and diseases. With modern technological innovations, such as high-throughput DNA sequencing, whole genome bisulfite sequencing and chromatin immunoprecipitation-sequencing, we can explore chromatin modifications in a more efficient manner. What we know today is just a small percentage of the exciting field of epigenetics.

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.