  • Protocol

Defining transcribed regions using RNA-seq

Abstract

Next-generation sequencing technologies are revolutionizing genomics research. It is now possible to generate gigabase pairs of DNA sequence within a week without time-consuming cloning or massive infrastructure. This technology has recently been applied to the development of 'RNA-seq' techniques for sequencing cDNA from various organisms, with the goal of characterizing entire transcriptomes. These methods provide unprecedented resolution and depth of data, enabling simultaneous quantification of gene expression, discovery of novel transcripts and exons, and measurement of splicing efficiency. We present here a validated protocol for nonstrand-specific transcriptome sequencing via RNA-seq, describing the library preparation process and outlining the bioinformatic analysis procedure. While sample preparation and sequencing take a fairly short period of time (1–2 weeks), the downstream analysis is by far the most challenging and time-consuming aspect and can take weeks to months, depending on the experimental objectives.

Figure 1: Flowchart of experimental procedure.
Figure 2: Diagrammatic view of analysis steps in an RNA-seq experiment.
Figure 3: Example of acceptable Bioanalyzer trace for total RNA.
Figure 4: Example of annotated output from data analysis steps of the protocol.

Acknowledgements

We thank Dr. J.-R. Landry for critical reading of the manuscript. Research in the Bähler laboratory is funded by Cancer Research UK and by PhenOxiGEn, an EU FP7 research project.

Author information

Contributions

All authors contributed extensively to the work presented in this paper.

Corresponding author

Correspondence to Jürg Bähler.

About this article

Cite this article

Wilhelm, B., Marguerat, S., Goodhead, I. et al. Defining transcribed regions using RNA-seq. Nat Protoc 5, 255–266 (2010). https://doi.org/10.1038/nprot.2009.229

  • DOI: https://doi.org/10.1038/nprot.2009.229
