Function annotation of the rice transcriptome at single-nucleotide resolution by RNA-seq
- Tingting Lu1,4,
- Guojun Lu1,4,
- Danlin Fan1,
- Chuanrang Zhu1,
- Wei Li1,
- Qiang Zhao1,2,
- Qi Feng1,
- Yan Zhao1,
- Yunli Guo1,
- Wenjun Li1,
- Xuehui Huang1 and
- Bin Han1,3,5
- 1 National Center for Gene Research & Institute of Plant Physiology and Ecology, Shanghai Institutes of Biological Sciences, Chinese Academy of Sciences, Shanghai 200233, China;
- 2 College of Life Science & Biotechnology, Shanghai Jiaotong University, Shanghai 200240, China;
- 3 Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China
- 4 These authors contributed equally to this work.
Abstract
The functional complexity of the rice transcriptome is not yet fully elucidated, despite many studies based on DNA microarrays. RNA sequencing (RNA-seq), built on next-generation DNA sequencing technologies, provides a powerful approach for mapping and quantifying the transcriptome. In this study, we applied RNA-seq to globally sample transcripts of the cultivated rice Oryza sativa indica and japonica subspecies in order to resolve whole-genome transcription profiles. We identified 15,708 novel transcriptional active regions (nTARs), of which 51.7% have no homolog in public protein data and >63% are putative single-exon transcripts, a proportion far higher than that of protein-coding genes (<20%). We found that ∼48% of rice genes show alternative splicing patterns, a percentage considerably higher than previous estimates. Of the currently available rice gene models, 83.1% (46,472 genes) were validated by RNA-seq, and 6228 genes were found to extend at the 5′ and/or 3′ ends by at least 50 bp. Comparative transcriptome analysis demonstrated that 3464 genes exhibited differential expression patterns. The ratio of SNPs with nonsynonymous/synonymous mutations was nearly 1:1.06. In total, we interrogated and compared the transcriptomes of the two rice subspecies to reveal the overall transcriptional landscape at maximal resolution.
Footnotes
- 5 Corresponding author. E-mail bhan{at}ncgr.ac.cn; fax 86-21-64825775.
- [Supplemental material is available online at http://www.genome.org. The RNA-seq data from this study have been deposited in the EMBL Sequence Read Archive (SRA) under accession no. ERA000212 (http://www.ebi.ac.uk/ena/data/view/ERA000212) and are available in a genome browser at http://www.ncgr.ac.cn/rrs. The sequence data set of continuous transcribed fragments, the detailed list of identified splicing junctions, all identified SNP lists, the SPSS binary code, and Perl scripts are freely available at http://www.ncgr.ac.cn/english/edatabase.htm.]
- Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.106120.110.
- Received February 4, 2010.
- Accepted July 12, 2010.
- Copyright © 2010 by Cold Spring Harbor Laboratory Press
Freely available online through the Genome Research Open Access option.