The RNA polymerase II core promoter — the gateway to transcription

https://doi.org/10.1016/j.ceb.2008.03.003

The RNA polymerase II core promoter is generally defined to be the sequence that directs the initiation of transcription. This simple definition belies a diverse and complex transcriptional module. There are two major types of core promoters — focused and dispersed. Focused promoters contain either a single transcription start site or a distinct cluster of start sites over several nucleotides, whereas dispersed promoters contain several start sites over 50–100 nucleotides and are typically found in CpG islands in vertebrates. Focused promoters are more ancient and widespread throughout nature than dispersed promoters; however, in vertebrates, dispersed promoters are more common than focused promoters. In addition, core promoters may contain many different sequence motifs, such as the TATA box, BRE, Inr, MTE, DPE, DCE, and XCPE1, that specify different mechanisms of transcription and responses to enhancers. Thus, the core promoter is a sophisticated gateway to transcription that determines which signals will lead to transcription initiation.

Introduction

The RNA polymerase II core promoter comprises the sequences that direct the initiation of transcription (for reviews, see [1, 2•, 3•, 4•, 5•]). Thus, in principle, the core promoter could be as simple as a single motif that serves as a universal transcription start site, or as complex as a unique set of sequence instructions for each promoter. Historically, the former model has often been presumed to be true, but emerging data indicate that there is considerable diversity in core promoter structure and function.

The objective of this review is to provide an overview of current topics that relate to the core promoter, with a particular emphasis on sequence motifs in core promoters. In addition, we have annotated core promoter-related data in papers that were published in the past two years. It should further be noted that the properties of core promoters and their cognate factors are not likely to be strictly absolute; hence, the principles and ideas described in this essay should be taken only as current working models.

Section snippets

Focused versus dispersed core promoters

The vast majority of research on core promoters has been devoted to the study of focused core promoters (Figure 1). In focused core promoters (also referred to as single-peak, or SP, promoters), there is either a single transcription start site or a distinct cluster of start sites in a short region of several nucleotides. Most eukaryotic core promoters appear to be focused core promoters. In vertebrates, however, only about one-third or less of core promoters are focused core promoters;

The initiator (Inr)

The initiator (Inr) motif encompasses the transcription start site [1, 12]. Based on functional assays, the Inr consensus was determined to be YYANWYY in humans and TCAKTY in Drosophila (degenerate nucleotides are indicated according to the IUPAC nucleotide code). The A nucleotide in the middle of the Inr consensus is often the +1 start site in focused core promoters. Inr-like sequences have also been described in Saccharomyces cerevisiae (e.g., see [13] and references therein). The Inr is

The TATA box and BRE

The TATA box, which is the most ancient and most widely used core promoter motif throughout nature, was aptly the first eukaryotic core promoter element to be identified (Michael L Goldberg, PhD thesis, Stanford University, 1979). The TATA box has a consensus of TATAWAAR, where the upstream T nucleotide is most commonly at −31 or −30 relative to the A+1 (or G+1) in the Inr (see, for instance [10•, 19•]). The TATA box is recognized and bound by TBP, which is a subunit of the TFIID complex in

The DPE and MTE

The DPE (downstream core promoter element) was identified as a downstream TFIID recognition sequence that is important for basal transcription activity [25]. The DPE is conserved from Drosophila to humans, and is located from +28 to +33 relative to the A+1 in the Inr. The DPE consensus is RGWYVT in Drosophila [26]. The DPE consensus in humans has yet to be determined; however, mammalian core promoters containing sequences that conform to the Drosophila consensus have been found to possess DPE

The DCE and XCPE1 motifs

The DCE (downstream core element) was originally found in the human beta-globin promoter [32], and has also been characterized in the adenovirus major late promoter [33]. The DCE occurs frequently with the TATA box, and appears to be distinct from the DPE. The DCE consists of three subelements: SI, CTTC from +6 to +11; SII, CTGT from +16 to +21; and SIII, AGC from +30 to +34. Photocrosslinking studies revealed that the DCE is in proximity to TAF1.
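The consensus sequences quoted in the sections above (for example, TCAKTY for the Drosophila Inr and RGWYVT for the Drosophila DPE at +28 to +33) are written in the IUPAC degenerate code and are defined at fixed positions relative to the +1 start site. The following Python sketch, which is not part of the review, illustrates one way such a positional consensus could be checked against a sequence. The function names `consensus_to_regex` and `matches_at`, the synthetic test sequence, and the choice to place the Inr so that its A falls at +1 (that is, starting at −2) are assumptions made for illustration; the numbering assumes the usual promoter convention in which there is no position 0.

```python
import re

# IUPAC degenerate nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string (e.g. 'RGWYVT') into a regex."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


def matches_at(seq, plus_one_index, consensus, start_pos):
    """Return True if `consensus` matches `seq` beginning at promoter
    position `start_pos`, where `plus_one_index` is the 0-based index of
    the +1 nucleotide. Assumes no position 0: -1 immediately precedes +1.
    """
    offset = start_pos - 1 if start_pos > 0 else start_pos
    i = plus_one_index + offset
    if i < 0 or i + len(consensus) > len(seq):
        return False  # the motif window falls outside the sequence
    window = seq[i:i + len(consensus)].upper()
    return consensus_to_regex(consensus).fullmatch(window) is not None


# Synthetic (hypothetical) sequence: an Inr-like TCAGTT with its A at +1,
# and a DPE-like AGACGT occupying +28 to +33.
demo = "G" * 8 + "TCAGTT" + "C" * 23 + "AGACGT" + "CCCC"
PLUS_ONE = 10  # 0-based index of the A in TCAGTT

print(matches_at(demo, PLUS_ONE, "TCAKTY", -2))     # True
print(matches_at(demo, PLUS_ONE, "RGWYVT", 28))     # True
print(matches_at(demo, PLUS_ONE, "RGWYVT", 27))     # False
print(matches_at(demo, PLUS_ONE, "TATAWAAR", -31))  # False (window out of range)
```

A strict consensus match of this kind is only a rough screen; as the introduction notes, the properties of core promoter motifs are not likely to be absolute, so a functional element may deviate from the consensus and a matching sequence may be non-functional.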

The XCPE1 (X core promoter element 1) motif is

Diversity in core promoter function

The existence of different core promoter elements results in diversity in core promoter function (reviewed in [38]). For instance, enhancers are functionally linked to core promoters (see, e.g. [39]), and some transcriptional enhancers have been found to exhibit specificity for TATA versus DPE core promoter motifs [40]. In addition, different factors mediate the basic transcription process from different types of core promoters. For example, a set of purified transcription factors (TFIIA,

TBP-related factors

Diversity in the function of the transcription machinery can be seen with TBP and TBP-related factors (TRFs) (for recent reviews, see [45•, 46•, 47•]). There are three TRFs: TRF1, TRF2 (also known as TLF, TLP, TRF, and TRP), and TRF3 (also known as TBP2). TRF1 is absent in vertebrates but is present in Drosophila, in which it binds to a TC-rich sequence and mediates transcription by RNA polymerases II and III [48, 49•]. TRF2 does not bind to TATA box sequences, and is involved in RNA polymerase

Conclusions and future prospects

The core promoter is diverse and complex. We still need to gain a better understanding of the DNA sequences that dictate core promoter function as well as the protein factors that function at the different types of core promoters. It will be particularly important to devote more effort to the study of the mechanisms of transcription at dispersed core promoters, because current evidence (see, e.g. [11, 34•, 59]) suggests that there may be fundamental differences in the strategies and mechanisms

References and recommended reading

Papers of particular interest, published within the annual period of review, have been highlighted as:

  • • of special interest

  • •• of outstanding interest

Acknowledgments

We are grateful to Uwe Ohler, Debra Urwin, Barbara Rattner, Timur Yusufzai, and Alexandra Lusser for critical reading of this manuscript. This work was supported by a grant from the National Institutes of Health (GM041249) to JTK.

References (70)

  • C. Molina et al.

    Genome wide analysis of Arabidopsis core promoters

    BMC Genomics

    (2005)
  • Y.Y. Yamamoto et al.

    Differentiation of core promoter architecture between plants and mammals revealed by LDSS analysis

    Nucleic Acids Res

    (2007)
  • A.P. Bird

    CpG-rich islands and the function of DNA methylation

    Nature

    (1986)
  • P. Carninci et al.

    Genome-wide analysis of mammalian promoter architecture and evolution

    Nat Genet

    (2006)
  • M.P. Lee et al.

    ATG deserts define a novel core promoter subclass

    Genome Res

    (2005)
  • S.T. Smale et al.

    The “initiator” as a transcription control element

    Cell

    (1989)
  • C. Yang et al.

    Prevalence of the initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters

    Gene

    (2007)
  • U. Ohler et al.

    Computational analysis of core promoters in the Drosophila genome

    Genome Biol...
  • P.C. FitzGerald et al.

    Comparative genomics of Drosophila and human core promoters

    Genome Biol

    (2006)
  • N.I. Gershenzon et al.

    The features of Drosophila core promoters revealed by statistical analysis

    BMC Genomics

    (2006)
  • B.A. Purnell et al.

    TFIID sequence recognition of the initiator and sequences farther downstream in Drosophila class II genes

    Genes Dev

    (1994)
  • T.H. Kim et al.

    A high-resolution map of active promoters in the human genome

    Nature

    (2005)
  • S.J. Cooper et al.

    Comprehensive analysis of transcriptional promoter structure and function in 1% of the human genome

    Genome Res

    (2006)
  • T. Lagrange et al.

    New core promoter element in RNA polymerase II-dependent transcription: sequence-specific DNA binding by transcription factor IIB

    Genes Dev

    (1998)
  • W. Deng et al.

    A core promoter element downstream of the TATA box that is recognized by TFIIB

    Genes Dev

    (2005)
  • W. Deng et al.

    TFIIB and the regulation of transcription by RNA polymerase II

    Chromosoma

    (2007)
  • T.W. Burke et al.

    Drosophila TFIID binds to a conserved downstream basal promoter element that is present in many TATA-box-deficient promoters

    Genes Dev

    (1996)
  • A.K. Kutach et al.

    The downstream promoter element DPE appears to be as widely used as the TATA box in Drosophila core promoters

    Mol Cell Biol

    (2000)
  • T.W. Burke et al.

    The downstream core promoter element, DPE, is conserved from Drosophila to humans and is recognized by TAFII60 of Drosophila

    Genes Dev

    (1997)
  • H. Shao et al.

    Core promoter binding by histone-like TAF complexes

    Mol Cell Biol

    (2005)
  • C.Y. Lim et al.

    The MTE, a new core promoter element for transcription by RNA polymerase II

    Genes Dev

    (2004)
  • T. Juven-Gershon et al.

    Rational design of a super core promoter that enhances gene expression

    Nat Methods

    (2006)
  • H. Xi et al.

    Analysis of overrepresented motifs in human core promoters reveals dual regulatory roles of YY1

    Genome Res

    (2007)
  • B.A. Lewis et al.

    A downstream element in the human beta-globin promoter: evidence of extended sequence-specific transcription factor IID contacts

    Proc Natl Acad Sci U S A

    (2000)
  • D.H. Lee et al.

    Functional characterization of core promoter elements: the downstream core element is recognized by TAF1

    Mol Cell Biol

    (2005)