High-resolution human core-promoter prediction with CoreBoost_HM

  Xiaowo Wang (1,2,3),
  Zhenyu Xuan (2,3),
  Xiaoyue Zhao (2),
  Yanda Li (1) and
  Michael Q. Zhang (2,4)

  (1) MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, China
  (2) Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
  (3) These authors contributed equally to this work.

    Abstract

    Correctly locating the gene transcription start site and the core-promoter is important for understanding the mechanisms of transcriptional regulation. Here we integrate specific genome-wide histone modification features with DNA sequence features to predict RNA polymerase II core-promoters in the human genome. Our new predictor, CoreBoost_HM, outperforms existing promoter prediction algorithms by providing significantly higher sensitivity and specificity at high resolution. We demonstrate that even though the histone modification data used in this study are from a specific cell type (CD4+ T cells), our method can be used to identify both active and repressed promoters. We have applied it to search the upstream regions of microRNA genes and show that CoreBoost_HM can accurately identify the known promoters of intergenic microRNAs. We also identified a few intronic microRNAs that may have their own promoters. These results suggest that our new method can help to identify and characterize the core-promoters of both coding and noncoding genes.

    Footnotes

    • 4 Corresponding author.

      E-mail: mzhang@cshl.org; fax: (516) 367-8461.

    • [Supplemental material is available online at www.genome.org.]

    • Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.081638.108.

      • Received June 3, 2008.
      • Accepted October 27, 2008.