Abstract
Plant genomics projects involving model species and many agriculturally important crops are producing a rapidly growing database of genomic and expressed DNA sequences. The publicly available collection of expressed sequence tags (ESTs) from several grass species can be used to analyze both structural and functional relationships in these genomes. We analyzed over 260 000 EST sequences from five different cereals for their potential use in developing simple sequence repeat (SSR) markers. The frequency of SSR-containing ESTs (SSR-ESTs) in this collection varied from 1.5% for maize to 4.7% for rice. In addition, we identified several ESTs that are related to the SSR-ESTs by BLAST analysis. The SSR-ESTs and the related sequences were clustered within each species to reduce redundancy and to produce longer consensus sequences. The consensus and singleton sequences from each species were then pooled and clustered to identify cross-species matches. Overall, redundancy was reduced by 85% when the resulting consensus and singleton sequences (3569) were compared with the total number of SSR-EST and related sequences analyzed (24 606). This information can be useful for developing SSR markers that amplify across grass genera for comparative mapping and genetics. Functional analysis of these sequences may reveal their role in plant metabolism and gene evolution.
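As an illustration of the kind of search the abstract describes, the following is a minimal sketch of how perfect SSRs might be located in an EST sequence and how the reported reduction in redundancy follows from the two counts given above. The abstract does not state the search criteria used, so the minimum repeat counts in `MIN_REPEATS` are assumptions chosen only for illustration, and the function names `find_ssrs` and `redundancy_reduction` are hypothetical; this is not the authors' pipeline.

```python
import re

# ASSUMPTION: minimum tandem repeat counts per motif length (di-, tri-,
# tetranucleotide). These thresholds are illustrative, not from the paper.
MIN_REPEATS = {2: 9, 3: 6, 4: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. ATAT, AA)."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-base motif followed by at least n-1 further exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if not is_primitive(motif):
                continue  # already covered by a shorter motif length
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def redundancy_reduction(n_input, n_output):
    """Percentage reduction when n_input sequences collapse to n_output."""
    return 100.0 * (1 - n_output / n_input)


if __name__ == "__main__":
    # Synthetic example sequence, not real EST data.
    est = "CCTT" + "AG" * 10 + "TCTA" + "GAT" * 7 + "CC"
    for start, motif, count in find_ssrs(est):
        print(start, motif, count)
    # Counts taken from the abstract: 24 606 input sequences, 3569 after clustering.
    print(round(redundancy_reduction(24606, 3569), 1))
```

With the two counts from the abstract, the reduction works out to roughly 85%, consistent with the figure reported. A regular-expression scan like this finds only perfect repeats; imperfect or compound SSRs would need a different approach.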
Cite this article
Kantety, R.V., La Rota, M., Matthews, D.E. et al. Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and wheat. Plant Mol Biol 48, 501–510 (2002). https://doi.org/10.1023/A:1014875206165
DOI: https://doi.org/10.1023/A:1014875206165