Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and wheat

Plant Molecular Biology

Abstract

Plant genomics projects involving model species and many agriculturally important crops are producing a rapidly growing database of genomic and expressed DNA sequences. The publicly available collection of expressed sequence tags (ESTs) from several grass species can be used in the analysis of both structural and functional relationships in these genomes. We analyzed over 260 000 EST sequences from five different cereals for their potential use in developing simple sequence repeat (SSR) markers. The frequency of SSR-containing ESTs (SSR-ESTs) in this collection varied from 1.5% for maize to 4.7% for rice. In addition, we identified several ESTs that are related to the SSR-ESTs by BLAST analysis. The SSR-ESTs and the related sequences were clustered within each species in order to reduce the redundancy and to produce a longer consensus sequence. The consensus and singleton sequences from each species were pooled and clustered to identify cross-species matches. Overall, redundancy was reduced by 85% when the resulting consensus and singleton sequences (3569) were compared to the total number of SSR-EST and related sequences analyzed (24 606). This information can be useful for the development of SSR markers that can amplify across the grass genera for comparative mapping and genetics. Functional analysis may reveal their role in plant metabolism and gene evolution.
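The SSR screen and the redundancy figure described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the regex-based search, and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration, since the abstract does not state the search criteria used.

```python
import re

# Hypothetical thresholds (motif length -> minimum number of tandem copies).
# The abstract does not give the criteria used in the study.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # One motif of length k followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "AGAG"),
            # so a dinucleotide run is not also reported as a tetranucleotide.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def redundancy_reduction(total_sequences, clustered_sequences):
    """Fraction by which clustering reduced the number of sequences."""
    return 1 - clustered_sequences / total_sequences


# Made-up sequence for illustration, not a real EST.
example = "GGC" + "AG" * 10 + "TTC" + "CAG" * 7 + "T"
print(find_ssrs(example))                           # [(3, 'AG', 10), (26, 'CAG', 7)]
print(round(redundancy_reduction(24606, 3569), 2))  # 0.85
```

The second call reproduces the arithmetic behind the reported figure: 3569 consensus and singleton sequences out of 24 606 analyzed corresponds to a reduction of about 85%.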


Cite this article

Kantety, R.V., La Rota, M., Matthews, D.E. et al. Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and wheat. Plant Mol Biol 48, 501–510 (2002). https://doi.org/10.1023/A:1014875206165

  • DOI: https://doi.org/10.1023/A:1014875206165
