Analyzing ChIP-seq Data: Preprocessing, Normalization, Differential Identification, and Binding Pattern Characterization

  • Protocol
Book: Next Generation Microarray Bioinformatics

Part of the book series: Methods in Molecular Biology (MIMB, volume 802)

Abstract

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is a high-throughput antibody-based method for studying genome-wide protein–DNA binding interactions. Compared with older ChIP-chip experiments, ChIP-seq allows scientists to obtain more accurate data with genome-wide coverage, using less starting material and less time. Herein we describe step-by-step guidelines for analyzing ChIP-seq data, including data preprocessing, nonlinear normalization to enable comparison between different samples and experiments, a statistics-based method to identify differential binding sites using mixture modeling and local false discovery rates (fdrs), and binding pattern characterization. In addition, we provide a sample analysis of ChIP-seq data using the steps provided in the guidelines.
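
To illustrate how mixture modeling and local fdrs can flag differential binding sites, the following is a minimal sketch and not the model used in the protocol. It assumes that normalized between-sample differences per genomic bin are available, and it swaps in a deliberately simple two-component mixture of zero-mean normals (a narrow null component and a wide non-null component) fitted by EM. The function names, the starting values, the simulated data, and the 0.1 cutoff are all hypothetical choices made for illustration.

```python
import math
import random


def normal_pdf(x, sigma):
    """Density of a zero-mean normal with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def local_fdr(x, pi0, s0, s1):
    """Posterior probability that x belongs to the null (narrow) component."""
    null = pi0 * normal_pdf(x, s0)
    alt = (1.0 - pi0) * normal_pdf(x, s1)
    total = null + alt
    # If both densities underflow, x is extreme; treat it as non-null.
    return null / total if total > 0.0 else 0.0


def fit_scale_mixture(xs, n_iter=200):
    """EM for pi0 * N(0, s0^2) + (1 - pi0) * N(0, s1^2), with s0 < s1 at start."""
    n = len(xs)
    sd = math.sqrt(sum(x * x for x in xs) / n)
    pi0, s0, s1 = 0.8, 0.5 * sd, 2.0 * sd  # assumed starting values
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # E-step: responsibility of the null component for each observation.
        r = [local_fdr(x, pi0, s0, s1) for x in xs]
        r_sum = min(max(sum(r), eps), n - eps)
        # M-step: update the mixing weight and the two standard deviations.
        pi0 = r_sum / n
        s0 = math.sqrt(max(sum(ri * x * x for ri, x in zip(r, xs)) / r_sum, eps))
        s1 = math.sqrt(
            max(sum((1.0 - ri) * x * x for ri, x in zip(r, xs)) / (n - r_sum), eps)
        )
    return pi0, s0, s1


# Simulated differences: 900 unchanged bins and 100 bins with altered binding.
random.seed(0)
diffs = [random.gauss(0.0, 1.0) for _ in range(900)]
diffs += [random.gauss(0.0, 5.0) for _ in range(100)]

pi0, s0, s1 = fit_scale_mixture(diffs)
called = [x for x in diffs if local_fdr(x, pi0, s0, s1) < 0.1]
print(f"pi0={pi0:.2f}, s0={s0:.2f}, s1={s1:.2f}, bins called: {len(called)}")
```

A bin is called differential when its local fdr, the estimated probability that it comes from the null component, falls below the chosen cutoff. In real use the mixture family and the cutoff would come from the protocol itself rather than from this sketch.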



Author information

Corresponding author

Correspondence to Shili Lin.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Taslim, C., Huang, K., Huang, T., Lin, S. (2012). Analyzing ChIP-seq Data: Preprocessing, Normalization, Differential Identification, and Binding Pattern Characterization. In: Wang, J., Tan, A., Tian, T. (eds) Next Generation Microarray Bioinformatics. Methods in Molecular Biology, vol 802. Humana Press. https://doi.org/10.1007/978-1-61779-400-1_18

  • DOI: https://doi.org/10.1007/978-1-61779-400-1_18

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-61779-399-8

  • Online ISBN: 978-1-61779-400-1

  • eBook Packages: Springer Protocols
