Genome-wide detection and analysis of hippocampus core promoters using DeepCAGE

  1. Eivind Valen (1)
  2. Giovanni Pascarella (2)
  3. Alistair Chalk (3)
  4. Norihiro Maeda (4)
  5. Miki Kojima (4)
  6. Chika Kawazu (4)
  7. Mitsuyoshi Murata (4)
  8. Hiromi Nishiyori (4)
  9. Dejan Lazarevic (2, 8)
  10. Dario Motti (2)
  11. Troels Torben Marstrand (1)
  12. Man-Hung Eric Tang (1)
  13. Xiaobei Zhao (1)
  14. Anders Krogh (1)
  15. Ole Winther (1)
  16. Takahiro Arakawa (4)
  17. Jun Kawai (4)
  18. Christine Wells (3)
  19. Carsten Daub (5)
  20. Matthias Harbers (7)
  21. Yoshihide Hayashizaki (4)
  22. Stefano Gustincich (2)
  23. Albin Sandelin (1, 9)
  24. Piero Carninci (4, 6, 9)
  1. The Bioinformatics Centre, Department of Biology and Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes vej 5, DK-2200, Denmark;
  2. The Giovanni Armenise-Harvard Foundation Laboratory, Sector of Neurobiology, International School for Advanced Studies (SISSA), Basovizza, 34012 Trieste, Italy;
  3. The National Centre for Adult Stem Cell Research, The Eskitis Institute for Cell and Molecular Therapies, Griffith University, Nathan QLD 4111, Australia;
  4. LSA Technology Development Group, Omics Science Center, RIKEN Yokohama Institute, Yokohama, Kanagawa 230-0045, Japan;
  5. LSA Bioinformatics Team, Omics Science Center, RIKEN Yokohama Institute, Yokohama, Kanagawa 230-0045, Japan;
  6. Functional Genomics Technology Team, Omics Science Center, RIKEN Yokohama Institute, Yokohama, Kanagawa 230-0045, Japan;
  7. DNAFORM Inc., Yokohama, Kanagawa 230-0046, Japan;
  8. CBM Scrl–Consorzio per il Centro di Biomedicina Molecolare, Basovizza, 34012 Trieste, Italy.

    Abstract

    Finding and characterizing mRNAs, their transcription start sites (TSS), and their associated promoters is a major focus in post-genome biology. Mammalian cells have at least 5–10 magnitudes more TSS than previously believed, and deeper sequencing is necessary to detect all active promoters in a given tissue. Here, we present DeepCAGE, a new method for high-throughput sequencing of 5′ cDNA tags that merges the Cap Analysis of Gene Expression (CAGE) method with ultra-high-throughput sequencing technology. We apply DeepCAGE to characterize 1.4 million sequenced TSS from mouse hippocampus and reveal a wealth of novel core promoters that are preferentially used in hippocampus. This is the most comprehensive promoter data set for any tissue to date. Using these data, we present evidence indicating a key role for the Arnt2 transcription factor in hippocampus gene regulation. DeepCAGE can also detect promoters used only in a small subset of cells within the complex tissue.

    OPEN ACCESS ARTICLE
