Review
International Rice Genome Sequencing Project: the effort to completely sequence the rice genome

https://doi.org/10.1016/S1369-5266(99)00047-3

Abstract

The International Rice Genome Sequencing Project (IRGSP) involves researchers from ten countries who are working to completely and accurately sequence the rice genome within a short period. Sequencing uses a map-based clone-by-clone shotgun strategy; shared bacterial artificial chromosome/P1-derived artificial chromosome libraries have been constructed from Oryza sativa ssp. japonica variety ‘Nipponbare’. End-sequencing, fingerprinting and marker-aided PCR screening are being used to make sequence-ready contigs. Annotated sequences are immediately released for public use and are made available with supplemental information at each IRGSP member’s website. In addition to producing genome sequence data, the IRGSP works to promote the development of rice and cereal genomics.

Introduction

Rice is a wonderful plant. It feeds about one half of the world’s population, mainly in Asia, Africa, and South America. Cooking rice is simple and does not require fermentation by yeast. Rice contains all of the amino acids essential for humans except lysine. It has a long cultivation history and, like religion or tradition, its use is deeply ingrained in the daily lives of Asian people.

A huge number of rice varieties adapted to local climates, soils and cooking preferences have been produced throughout the history of its cultivation. Over the past 30 years, world rice production has doubled as the result of the introduction of new varieties and improved technology. Nevertheless, increases in annual rice production have slowed to the point where production is no longer keeping pace with the growth in the number of consumers. Rice production in the next fifty years faces even greater challenges. A larger and more affluent population means, on the one hand, demand for greater production and better quality rice, and on the other hand, the availability of less land, water and labor to produce the crop. In short, there will be great demands on biotechnology to improve rice production.

The sequencing of all of the rice genes alone provides insufficient information on which to base crop improvements such as greater yield. Map-based sequence information is required to exploit the full potential of the rice sequence. In recent years, plant breeding has been enhanced by molecular-marker technology that permits researchers to screen larger populations and necessitates less progeny testing. Knowledge of the location of all of the genes in a genome extends the usefulness of molecular-marker technology because it allows the identification of candidate genes that control specific traits. The genes themselves then become markers and the process becomes more accurate and efficient. For example, knowing the location and sequence of candidate genes makes it possible to design allele-specific markers; these markers readily lend themselves to processes in which the extraction of DNA from plant leaves and the subsequent PCR are automated.
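The idea of an allele-specific marker can be illustrated with a minimal sketch. It assumes the common allele-specific PCR design in which the 3′ terminal base of a forward primer sits on the polymorphic site, so that each primer matches only one allele. The function name, the example sequence, the polymorphism position and the primer length are all hypothetical and are not taken from the IRGSP data.

```python
def allele_specific_primers(reference, snp_index, alleles, length=20):
    """Return one forward primer per allele; each primer's 3' base is the allele.

    reference : str, genomic sequence around a candidate gene (hypothetical input)
    snp_index : int, 0-based position of the polymorphic base in `reference`
    alleles   : iterable of single-base strings found at that position
    length    : int, primer length in bases (an illustrative default)
    """
    if not 0 <= snp_index < len(reference):
        raise ValueError("snp_index lies outside the reference sequence")
    if snp_index < length - 1:
        raise ValueError("not enough upstream sequence for this primer length")
    # Bases shared by all primers: the (length - 1) bases just upstream of the site.
    upstream = reference[snp_index - (length - 1):snp_index]
    # Appending the allele places it at the 3' end of the primer.
    return {allele: upstream + allele for allele in alleles}


# Invented example: a 33-base fragment with a T/C polymorphism at index 24.
example_reference = "ATGGCTAGCTAGGATCCGATCGATTACGCTAGC"
example_primers = allele_specific_primers(example_reference, 24, ("T", "C"))
```

In practice primer design also has to consider melting temperature, secondary structure and the reverse primer; the sketch covers only the placement of the polymorphic base.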

Rice is a model species for the cereals and a good candidate for DNA sequencing. It has a genome size of 400–430 million base pairs (Mb), the smallest of the major cereals but three times that of Arabidopsis thaliana [1]. Rice also has a well-mapped genome: the rice molecular map, which has over 6000 markers, has already been useful in helping to align physical chromosome maps. Over 40,000 expressed sequence tags (ESTs) have been reported and many are mapped. A yeast artificial chromosome (YAC) library that has been fingerprinted and ordered with mapped markers currently covers 60% of the rice genome. Several bacterial artificial chromosome (BAC) libraries have also been described. Since the introduction of new methods for Agrobacterium tumefaciens transformation, rice has become the easiest of all cereal plants to transform genetically. This tool permits geneticists to complement mutations, or to confer dominant phenotypes to verify gene function.

Following on from the past decade’s progress in understanding the molecular genetics of rice, the effort to sequence the whole rice genome has become a reality, beginning in Japan in 1997. Other countries with an interest in rice genomics decided to cooperate in this laborious but meaningful task, which became the International Rice Genome Sequencing Project (IRGSP). Here, we briefly review the strategy that has been adopted for the sequencing of the rice genome.

Section snippets

Mapping: the link between genomics and genetics

The genetic map has maintained its central importance as the basic tool that links information in the nucleotide sequence to phenotypic traits throughout the rice genome-sequencing project. The first step in understanding rice at the DNA level is to make a linkage map based on polymorphisms within DNA sequences, such as restriction fragment length polymorphisms (RFLPs), simple sequence repeats (SSRs) and cleaved amplified polymorphic sequences (CAPSs). More than ten rice genetic maps have been

IRGSP sequencing strategy

Wide-ranging discussion within the IRGSP has encompassed many points including the optimal method of sequencing, the rice cultivar to be sequenced, the accuracy of sequences and the sequence release policy. A single variety of rice was chosen to be the source of DNA for sequencing because the cultivated varieties have diverse genetic backgrounds and so, if several varieties were used, allelic polymorphisms would probably impede the accurate compilation or integration of sequences. The japonica

Annotation and database

The sequences generated by the Rice Genome Research Program are annotated by searching the non-redundant protein database using BLASTX software [11], searching the rice EST database using BLASTN software [11], and scanning the sequence with GenScan [12] (trained for maize) to predict open reading frames and with Splice Predictor [13] to project exon/intron splice sites. These results are combined to make a final annotation of genes and elements, and their coordinates in a genome sequence. Other
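As a rough illustration of how results from several tools might be combined into annotated coordinates, the sketch below merges overlapping hits and records which sources support each region. This is not the Rice Genome Research Program's actual pipeline: the `Hit` record, the `merge_evidence` function and the example coordinates are assumptions made for illustration, and real annotation also weighs scores, strand and exon structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One piece of evidence on a genomic sequence (hypothetical record type)."""
    start: int   # 1-based inclusive start coordinate
    end: int     # 1-based inclusive end coordinate
    source: str  # label for the tool that produced the hit, e.g. "BLASTX"


def merge_evidence(hits):
    """Merge overlapping hits into regions, keeping the set of supporting sources."""
    regions = []
    for hit in sorted(hits, key=lambda h: (h.start, h.end)):
        if regions and hit.start <= regions[-1]["end"]:
            # The hit overlaps the current region: extend it and note the source.
            region = regions[-1]
            region["end"] = max(region["end"], hit.end)
            region["sources"].add(hit.source)
        else:
            # No overlap: start a new region.
            regions.append(
                {"start": hit.start, "end": hit.end, "sources": {hit.source}}
            )
    return regions


# Invented coordinates, for illustration only.
example_hits = [
    Hit(2000, 2300, "BLASTN"),
    Hit(100, 500, "BLASTX"),
    Hit(450, 900, "GenScan"),
]
example_regions = merge_evidence(example_hits)
```

A region supported by more than one source, such as a protein similarity hit that overlaps a predicted open reading frame, would be a stronger gene candidate than one supported by a single source.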

Conclusions

The IRGSP grew out of a workshop held in 1997 at the 4th International Plant Molecular Biology Conference in Singapore. At this workshop, Japan, the USA, the European Union, Korea, and China agreed to collaborate on rice genome sequencing. Specifically, they agreed to share materials and results and to sequence Nipponbare as the sole germplasm. Since then, meetings have been held twice a year to report on progress and to discuss strategies and technical issues. The ten countries now taking part

References and recommended reading

Papers of particular interest, published within the annual period of review, have been highlighted as:

  • • of special interest

  • •• of outstanding interest

References (24)

  • C. Burge et al. Prediction of complete gene structures in human genomic DNA. J Mol Biol (1997)
  • K. Arumuganathan et al. Nuclear DNA content of some important plant species. Plant Mol Biol Reporter (1991)
  • Y. Harushima et al. A high-density rice genetic linkage map with 2275 markers using a single F2 population. Genetics (1998)
  • N. Kurata et al. Physical mapping of the rice genome with YAC clones. Plant Mol Biol (1997)
  • Budiman MA, Tomkins JP, Wing RA. Construction and characterization of rice Nipponbare BAC library. URL...
  • T. Baba et al. Construction and characterization of rice genomic libraries, PAC library of japonica variety Nipponbare, and BAC library of indica variety Kasalath. Misc Publ Natl Inst Agrobiol Resour (2000)
  • Clemson University Genomics Institute: Rice BAC end sequencing project. URL...
  • Wu J, Shimokawa T, Maehara T, Yazaki J, Harada C, Yamamoto S, Takazaki Y, Fujii F, Ono N, Koike K et al.: Current...
  • K. Yamamoto et al. Large-scale EST sequencing in rice. Plant Mol Biol (1997)
  • Rice Genome Research Program: Genome Sequencing. URL http://www.staff.or.jp/genomicdata/GenomeFinished.html On this web...
  • The Wellcome Trust: Summary of the Report of the Second International Strategy Meeting on Human Genome Sequencing,...
  • S.F. Altschul et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res (1997)