Elsevier

Methods

Volume 43, Issue 2, October 2007, Pages 110-117

Construction of small RNA cDNA libraries for deep sequencing

https://doi.org/10.1016/j.ymeth.2007.05.002

Abstract

Small RNAs (21–24 nucleotides) including microRNAs (miRNAs) and small interfering RNAs (siRNAs) are potent regulators of gene expression in both plants and animals. Several hundred genes encoding miRNAs and thousands of siRNAs have been experimentally identified by cloning approaches. New sequencing technologies facilitate the identification of these molecules and provide global quantitative expression data in a given biological sample. Here, we describe the methods used in our laboratory to construct small RNA cDNA libraries for high-throughput sequencing using technologies such as MPSS, 454 or SBS.

Introduction

Nearly all eukaryotes produce small RNAs (21–24 nucleotides) that silence genes by multiple mechanisms. miRNAs (generally 21–22 nt) are the most abundant type of small RNA in most organisms. They are produced from “hairpin” primary transcripts, which derive from one strand of distinct genomic loci, by two rounds of endoribonuclease cleavage by RNase III-like enzymes. Another type of small RNA, the siRNAs (generally 22–24 nt), is similar in structure and function to miRNAs. siRNAs are processed from longer double-stranded RNA molecules and represent both strands of the RNA. In many organisms, such as plants, siRNAs are believed to originate from longer transcripts derived from transposons, repetitive sequences and transgenes [1], [2], [3].

The first and still most common approach to the discovery of small RNAs has been to clone and sequence individual small RNAs using traditional molecular methods. The majority of currently known miRNAs were identified by this approach. It was first used to identify miRNAs and siRNAs in mammals, Caenorhabditis elegans, Drosophila and Arabidopsis [4], [5], [6], [7], [8]. Small RNAs generated by RNase III have 5′ phosphate and 3′ hydroxyl termini, in contrast to most RNA turnover products, which have a 5′ hydroxyl terminus [9]. Different cloning protocols have been developed independently. Most of them require a 5′ phosphate and a free 3′ hydroxyl group on the small RNAs for adapter ligation. After reverse transcription, the cDNA is PCR-amplified using primers corresponding to the adapter sequences. The PCR products are cloned and sequenced. Based on published data, about 30–50% of the clones represent RNA turnover products of the abundant rRNAs, tRNAs and snRNAs [7], [10]. The cloning frequency of an individual small RNA generally reflects its relative abundance in the sample, providing a quantitative expression measurement.
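As an illustration of how cloning frequency can serve as an expression measurement, the following minimal sketch counts identical sequences and scales each count to a fixed library size. The function name, the per-million scaling and the toy sequences are assumptions made for this example; they are not taken from the protocol.

```python
from collections import Counter


def cloning_frequencies(reads, scale=1_000_000):
    """Return {sequence: (raw_count, count per `scale` reads)}.

    `scale` is an illustrative normalization constant (per million reads),
    not a value prescribed by the protocol.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {seq: (n, n * scale / total) for seq, n in counts.items()}


# Toy input: invented sequences, not real small RNAs.
SEQ_A = "AAACCCGGGTTTAAACCCGGG"
SEQ_B = "TTTGGGCCCAAATTTGGGCCCA"
reads = [SEQ_A] * 3 + [SEQ_B]

freqs = cloning_frequencies(reads)
for seq, (raw, per_million) in freqs.items():
    print(seq, raw, per_million)
```

Scaling to a common library size is one simple way to compare abundances between libraries of different depth; other normalizations are possible.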

Despite the early success of this approach, it is unlikely that these efforts are saturating for rare or tissue-specific small RNAs. The identification and quantification of small RNAs using high-throughput sequencing methods was first accomplished in Arabidopsis by our lab [11]. More than 2 million small RNAs were sequenced by Massively Parallel Signature Sequencing (MPSS) [12] from Arabidopsis flowers and seedlings, yielding more than 70,000 distinct genome-matching sequences. This represented a significant advance over more traditional methods for small RNA identification. One limitation of MPSS is that it can sequence only the 17 nucleotides at the 5′ end of each small RNA. We also pioneered the use of an alternative approach for small RNA sequencing based on the “454” method of sequencing [13], a technology which produces longer sequence reads [12]. Recently, we reported the use of both MPSS and 454 to sequence small RNAs from different Arabidopsis mutant backgrounds [14], [15]. Combined with genetic approaches, deep sequencing provides a powerful tool for the dissection and characterization of diverse small RNA populations and the identification of low-abundance miRNAs.
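The consequence of reading only the 5′ 17 nucleotides can be sketched as follows: small RNAs that differ only after position 17 collapse to the same signature and can no longer be told apart. The helper names and the toy sequences below are hypothetical and serve only to illustrate the point.

```python
from collections import defaultdict

SIGNATURE_LENGTH = 17  # 5' nucleotides read by MPSS, as stated in the text


def collapse_to_signatures(sequences, length=SIGNATURE_LENGTH):
    """Group full-length sequences by their 5' signature of `length` nt."""
    groups = defaultdict(set)
    for seq in sequences:
        groups[seq[:length]].add(seq)
    return dict(groups)


def ambiguous_signatures(groups):
    """Keep only signatures shared by more than one distinct sequence."""
    return {sig: seqs for sig, seqs in groups.items() if len(seqs) > 1}


# Toy input: invented sequences. The first two share their first 17 nt.
shared_prefix = "ACGTACGTACGTACGTA"
small_rnas = [
    shared_prefix + "GGCT",
    shared_prefix + "TTCAG",
    "TTGACCGATAGGCATTCAGGA",
]

groups = collapse_to_signatures(small_rnas)
ambiguous = ambiguous_signatures(groups)
print(len(groups), "signatures;", len(ambiguous), "ambiguous")
```

Longer reads avoid this collapse because the full small RNA, including its 3′ end, is sequenced.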

This article describes the method used in our laboratory to make size-fractionated cDNA libraries that are used for high-throughput sequencing with parallel approaches. This method was originally developed for use with plants and MPSS. Substantial progress has been reported for other next-generation sequencing technologies. Solexa, Inc. has developed a four-color DNA sequencing-by-synthesis (SBS) approach as a replacement for MPSS based on a novel, reversible, dye-termination chemistry (http://www.solexa.com). This approach can potentially generate >10 million 25–30 nt sequence tags with high accuracy. A different sequencing approach named Supported Oligo Ligation Detection (SOLiD) is being developed by Agencourt Personal Genomics (now a part of Applied Biosystems, Inc.). This method uses an array of microbeads each coated with a single DNA or cDNA fragment; a pool of fluorescent oligos is used to “read” the sequences by complementary binding using a repeated process of ligation, detection, and cleavage. This determines up to 50 nucleotides of sequence per bead, for >10 million beads. These novel, highly parallel methods have the potential to dramatically reduce the cost of sequencing and offer a much richer source of sequence information. The method described here should be applicable to all of these forthcoming technologies.

Method

An overview of small RNA cloning and sequencing methods is schematically depicted in Fig. 1. First, low molecular weight (LMW) RNA is isolated from the tissue of interest. Next, small RNAs (20–30 nt) are purified from the LMW RNA fraction by polyacrylamide gel-based size fractionation and are ligated to a 5′ RNA adapter. To prevent self-ligation of small RNAs and self-ligation of the adapter, the 5′ terminus of the adapter has a hydroxyl group and an excess of adapter over small RNAs is used.
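Once a library has been sequenced, one simple sanity check on the gel-based size fractionation is to tally insert lengths and see what fraction falls in the intended 20–30 nt window. The sketch below assumes adapter sequences have already been removed from the reads; the function name and toy inserts are hypothetical.

```python
from collections import Counter

MIN_LEN, MAX_LEN = 20, 30  # size window of the gel-purified small RNAs


def length_profile(inserts, min_len=MIN_LEN, max_len=MAX_LEN):
    """Return (length -> count, fraction of inserts within the window)."""
    lengths = Counter(len(seq) for seq in inserts)
    total = sum(lengths.values())
    in_window = sum(
        n for length, n in lengths.items() if min_len <= length <= max_len
    )
    fraction = in_window / total if total else 0.0
    return dict(sorted(lengths.items())), fraction


# Toy input: invented inserts of 21, 21, 24 and 35 nt.
inserts = ["A" * 21, "C" * 21, "G" * 24, "T" * 35]

profile, fraction = length_profile(inserts)
print(profile, fraction)
```

A low in-window fraction would suggest that the size fractionation, or the adapter removal assumed here, deserves a closer look.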

Material and reagents

  1. RNA isolation: Trizol reagent (Invitrogen 15596), chloroform, isopropanol, 75% ethanol, DEPC-treated water.

  2. LMW and high molecular weight (HMW) RNA separation: 5 M NaCl, 50% PEG8000, 5 mg/ml glycogen (Ambion 9510).

  3. RNA purification: 10× TBE, 2× formamide loading buffer (90% formamide, 1× TBE, xylene cyanol, and bromophenol blue), 10 bp DNA ladder (1 μg/μl) (Invitrogen 10821-015), 10% ammonium persulfate, TEMED, 40% acrylamide stock (Ambion 9022), 0.3 M NaCl, ethanol (EtOH), ethidium bromide, Spin-X

Low molecular weight (LMW) RNA isolation

Harvest samples and immediately freeze in liquid nitrogen.

  (a) Grind to a fine powder. As an example, the use of 3 g of seedling tissue ground using a mortar and pestle under liquid nitrogen will yield about 500 μg of total RNA.

  (b) Isolate total RNA using Trizol reagent as indicated in the manufacturer’s protocol. For the example tissue in (a), we would use 40 ml Trizol. For some recalcitrant tissues, we add an extra chloroform extraction.

  (c) Dissolve total RNA in DEPC-treated water to a concentration of about

Troubleshooting

The integrity of the RNA and DNA oligos has a significant impact on the outcome of the experiment. HPLC- or PAGE-purified RNA oligos should be used. Regardless of their source, if there is any question about the purity of the oligos, they should be further PAGE-purified. The oligos can be assessed for intactness by running an aliquot on a polyacrylamide gel. The following discussion assumes that only very pure, high-quality RNA or DNA oligos were used in the protocol.

Positive

Conclusions

A major limitation of traditional sequencing for the discovery of small RNAs by cloning is that it is extremely challenging to identify small RNAs that are expressed at a low level, in restricted cell-types, or at very specific stages. In principle, this is no longer a limiting factor due to our ability to deeply sequence small RNA libraries from a broad range of samples. Using the method described here, we first analyzed the small RNA component of the transcriptome of Arabidopsis tissues [11].

Acknowledgments

We thank S. Luo and C.D. Haudenschild for technical advice and assistance; M. German and M. Accerbi for comments on the manuscript. This work was supported primarily by NSF Grants 0439186 and 0548569 (P.J.G. and B.C.M.), with additional support provided by DOE DE-FG02-04ER15541 (P.J.G.).

References (16)

  • X. Chen, FEBS Lett. (2005)
  • B.C. Meyers et al., Curr. Opin. Biotechnol. (2006)
  • W. Park et al., Curr. Biol. (2002)
  • P.D. Zamore et al., Cell (2000)
  • H. Vaucheret, Genes Dev. (2006)
  • M. Lagos-Quintana et al., Science (2001)
  • N.C. Lau et al., Science (2001)
  • R.C. Lee et al., Science (2001)
There are more references available in the full text version of this article.
