
  • Analysis

Discovery and characterization of chromatin states for systematic annotation of the human genome

Abstract

A plethora of epigenetic modifications have been described in the human genome and shown to play diverse roles in gene regulation, cellular differentiation and the onset of disease. Although individual modifications have been linked to the activity levels of various genetic functional elements, their combinatorial patterns are still unresolved and their potential for systematic de novo genome annotation remains untapped. Here, we use a multivariate Hidden Markov Model to reveal 'chromatin states' in human T cells, based on recurrent and spatially coherent combinations of chromatin marks. We define 51 distinct chromatin states, including promoter-associated, transcription-associated, active intergenic, large-scale repressed and repeat-associated states. Each chromatin state shows specific enrichments in functional annotations, sequence motifs and specific experimentally observed characteristics, suggesting distinct biological roles. This approach provides a complementary functional annotation of the human genome that reveals the genome-wide locations of diverse classes of epigenetic function.
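To make the idea of decoding 'chromatin states' from combinations of marks concrete, the following is a minimal sketch of Viterbi decoding in a toy multivariate Hidden Markov Model. It is not the authors' implementation. It assumes, beyond what the abstract states, that the genome is divided into fixed-size bins, that each mark is reduced to present (1) or absent (0) in each bin, and that marks are independent given the hidden state. The two states, the three marks and every probability below are hypothetical values chosen only for illustration.

```python
import math

def log_emission(obs, probs):
    """Log-probability of a binary mark vector, assuming independent Bernoulli marks."""
    return sum(math.log(p if o else 1.0 - p) for o, p in zip(obs, probs))

def viterbi(observations, init, trans, emit):
    """Return the most likely hidden-state path for a sequence of binary mark vectors."""
    n_states = len(init)
    # score[k]: best log-probability of any path that ends in state k at the current bin
    score = [math.log(init[k]) + log_emission(observations[0], emit[k])
             for k in range(n_states)]
    back = []  # back[t][k]: best predecessor of state k at bin t + 1
    for obs in observations[1:]:
        prev, score, pointers = score, [], []
        for k in range(n_states):
            best_j = max(range(n_states),
                         key=lambda j: prev[j] + math.log(trans[j][k]))
            pointers.append(best_j)
            score.append(prev[best_j] + math.log(trans[best_j][k])
                         + log_emission(obs, emit[k]))
        back.append(pointers)
    # Trace back from the best final state
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical toy model: state 0 is "promoter-like", state 1 is "background".
init = [0.5, 0.5]
trans = [[0.8, 0.2],   # transitions out of state 0
         [0.1, 0.9]]   # transitions out of state 1
emit = [[0.90, 0.80, 0.10],   # P(mark present | state 0) for three hypothetical marks
        [0.05, 0.05, 0.10]]   # P(mark present | state 1)

# Six consecutive bins, each a presence/absence vector over the three marks
bins = [(0, 0, 0), (1, 1, 0), (1, 0, 0), (1, 1, 0), (0, 0, 0), (0, 0, 0)]
path = viterbi(bins, init, trans, emit)
print(path)
```

Because the transition matrix favors staying in the same state, the decoded path tends to form contiguous segments rather than switching bin by bin, which is one way to read the abstract's phrase 'spatially coherent'. In a real analysis the parameters would be learned from data rather than set by hand.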


Figure 1: Example of chromatin state annotation.
Figure 2: Chromatin state definition and functional interpretation.
Figure 3: Promoter and transcribed chromatin states show distinct functional and positional enrichments.
Figure 4: SNP and GWAS enrichments for chromatin states.
Figure 5: Discovery power of chromatin states for genome annotation.
Figure 6: Recovery of chromatin states with subsets of marks.



Acknowledgements

We thank P. Kheradpour for regulatory motif instances and M.F. Lin for predicted new exons. We thank M. Garber, A. Siepel, K. Lindblad-Toh, and E. Lander for use of comparative information on 29 mammals. We thank B. Bernstein, N. Shoresh, C. Epstein and T. Mikkelsen for helpful discussions. We thank L. Goff, C. Bristow, R. Sealfon and all members of the MIT CompBio Group for comments, feedback and support. This material is based upon work supported by the National Science Foundation under award no. 0905968 and funding from the US National Human Genome Research Institute (NHGRI) under awards U54-HG004570 and RC1-HG005334.

Author information


Contributions

J.E. and M.K. developed the method, analyzed results and wrote the paper.

Corresponding author

Correspondence to Manolis Kellis.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Tables 1 and 2, Supplementary Notes and Supplementary Figs. 1–41 (PDF 5184 kb)


About this article

Cite this article

Ernst, J., Kellis, M. Discovery and characterization of chromatin states for systematic annotation of the human genome. Nat Biotechnol 28, 817–825 (2010). https://doi.org/10.1038/nbt.1662


  • DOI: https://doi.org/10.1038/nbt.1662
