SNP identification in crop plants

https://doi.org/10.1016/j.pbi.2008.12.009

In many plants, single nucleotide polymorphism (SNP) markers are increasingly becoming the marker system of choice. However, for many crop plants surprisingly few validated SNP markers are available, although they are needed in large numbers for studies of genetic variation, linkage mapping, population structure analysis, association genetics, map-based gene isolation, and plant breeding. This review summarizes the current status of SNP marker development technologies for major crop plants. It also provides an outlook on possible future SNP identification approaches in crop plants, based on current developments in model systems such as Arabidopsis; these approaches will become available with the full sequencing of more plant genomes, genome resequencing, and next-generation sequencing technologies.

Introduction

Molecular markers are widely employed in plant research and plant breeding. In plant breeding, markers are used to accelerate selection gains through marker-assisted selection (MAS), either on the basis of individual genes or at the genome level through the selection of chromosomal segments [1]. With molecular markers, genes of scientific and agronomic importance can be isolated solely on the basis of their position on the genetic map [2], and traits that are controlled by many different factors (quantitative traits) can be dissected into their individual components (quantitative trait loci, QTL), which can subsequently be identified at the molecular level [3]. In plant genetic research, molecular markers are also used for the analysis of population structure, the study of evolutionary relationships, and, in sequenced model systems such as Arabidopsis, for studies of the genetic structure of individuals at the whole-genome level [4].

In recent years, SNP markers have gained much interest in the scientific and breeding community [5]. They occur in virtually unlimited numbers as differences of individual nucleotides between individuals, and every SNP in single-copy DNA is a potentially useful marker. The potential of SNP markers is clearly demonstrated in human genome analysis. On the basis of massive research efforts and the full sequence of the human genome, several million SNP markers [6••] have been identified, and technologies to analyze large numbers of SNP markers simultaneously (currently up to 1 million) have been developed. With such large marker numbers it has become possible to scan the entire genome at extremely high marker densities for associations of individual markers with specific quantitatively inherited traits, an approach called whole-genome scanning (WGS), genome-wide association studies (GWAS), or association genetics [7].

Although thousands of SNP markers are widely used in animal and human genome analysis, their use in plants is still in its infancy. At present, essentially no studies have been published in major crop plants that involve the parallel analysis of more than 2500 SNPs in large numbers of individuals although there is a clear need for thousands of markers assayed in hundreds or thousands of lines for the use of association genetics approaches in plants [8].

SNP identification technologies

Several techniques are used to identify large numbers of SNPs in a given plant. In the following, the status of these approaches in model plants and crop plants is discussed with respect to their applicability, requirements, and limitations on the basis of currently published literature (Table 1).
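As a minimal illustration of what in silico SNP identification involves, the sketch below scans sequences that are assumed to be already aligned (for example, overlapping EST reads from several lines) and reports alignment columns in which more than one nucleotide occurs. It is not one of the specific approaches summarized in Table 1; the function name `find_snps`, the `min_minor_count` redundancy filter, and the toy sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter


def find_snps(aligned, min_minor_count=2):
    """Return (position, allele_counts) for candidate SNP columns.

    `aligned` maps a sample name to its aligned sequence; all sequences
    must have the same length. Gaps and ambiguous bases are ignored.
    A column is reported only if the second most frequent nucleotide is
    seen at least `min_minor_count` times (an assumed, simple guard
    against isolated sequencing errors).
    """
    names = list(aligned)
    length = len(aligned[names[0]])
    if any(len(seq) != length for seq in aligned.values()):
        raise ValueError("sequences must be aligned to equal length")

    snps = []
    for pos in range(length):
        column = (aligned[name][pos].upper() for name in names)
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) < 2:
            continue  # monomorphic column, or no usable bases
        minor = sorted(counts.values())[-2]  # count of the second allele
        if minor >= min_minor_count:
            snps.append((pos, dict(counts)))
    return snps


# Toy alignment (made-up data): five lines, ten aligned positions.
aligned = {
    "line_A": "ACGTACGTAC",
    "line_B": "ACGTACGTAC",
    "line_C": "ACCTACGTAC",
    "line_D": "ACCTACGAAC",
    "line_E": "ACGTAC-TAC",
}

candidates = find_snps(aligned)
```

With the default filter only position 2 (0-based) is reported, because the variant at position 7 is supported by a single sequence; lowering `min_minor_count` to 1 would report both. Real pipelines additionally have to deal with base quality, paralogous sequences, and alignment errors, none of which this sketch addresses.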

A special challenge: SNP identification in allopolyploid plant species

From a genetic point of view, many important crop species are not simple diploid genetic systems. Polyploidy is prevalent in many crop plants including, for example, oilseed rape (Brassica napus), cotton (Gossypium hirsutum), and tobacco (Nicotiana tabacum), which are allotetraploid species, and wheat (Triticum aestivum), which is an allohexaploid species. Other plants such as sugarcane and potato are highly heterozygous autopolyploids with four or more genome copies. The previously described

Additional issues

In contrast to the situation in humans or Arabidopsis thaliana, where basically all SNPs are of interest, in many crop plants the level of genetic variation within the germplasm used for plant breeding is very narrow and represents only a small part of the genetic variation of the entire species [59]. For example, only a small percentage of the genetic variation that is present in the crossing range of tomato is found within the breeding material [31]. This

The future of SNP identification in plants: prospects and limitations

Currently, large-scale identification of SNPs in crop plants is still a challenging endeavor, regardless of whether the entire genome or only the coding regions of genes are surveyed for SNPs. Fully sequenced genomes will become available only gradually, since it will still take time until the genomes of many of the main crop plant species are completed. For example, although a rough draft sequence of the maize genome is now available (URL: http://www.maizesequence.org), it can be

Conclusions

Although a variety of approaches for large-scale SNP identification are available and the speed with which SNPs can be identified in major crop plants will increase with the involvement of the next-generation sequencing techniques, a number of challenges will remain that need to be addressed before large-scale SNP genotyping will be used routinely in major crop plants for purposes such as association genetics and plant breeding. Without the full genomic sequence of major crop plants available

References and recommended reading

Papers of particular interest, published within the period of review, have been highlighted as:

  • • of special interest

  • •• of outstanding interest

References (63)

  • N. Pavy et al.

    Automated SNP detection from a large collection of white spruce expressed sequences: contributing factors and approaches for the categorization of SNPs

    BMC Genomics

    (2006)
  • J. Batley et al.

    Mining for single nucleotide polymorphisms and insertions/deletions in maize expressed sequence tag data

    Plant Physiol

    (2003)
  • R. Kota et al.

    Snipping polymorphisms from large EST collections in barley (Hordeum vulgare L.)

    Mol Genet Genomics

    (2003)
  • N. Yamamoto et al.

    Expressed sequence tags from the laboratory-grown miniature tomato (Lycopersicon esculentum) cultivar Micro-Tom and mining for single nucleotide polymorphisms and insertions/deletions in tomato cultivars

    Gene

    (2005)
  • J.O. Borevitz et al.

    Large-scale identification of single-feature polymorphisms in complex genomes

    Genome Res

    (2003)
  • J.O. Borevitz et al.

    Genome-wide patterns of single-feature polymorphism in Arabidopsis thaliana

    Proc Natl Acad Sci U S A

    (2007)
  • T. Singer et al.

    A high-resolution map of Arabidopsis recombinant inbred lines by whole-genome exon array hybridization

    PLoS Genet

    (2006)
  • R. Kumar et al.

    Single feature polymorphism discovery in rice

    PLoS ONE

    (2007)
  • X.P. Cui et al.

    Detecting single-feature polymorphisms using oligonucleotide arrays and robustified projection pursuit

    Bioinformatics

    (2005)
  • N. Rostoks et al.

    Single-feature polymorphism discovery in the barley transcriptome

    Genome Biol

    (2005)
  • M. Kirst et al.

    Genetic diversity contribution to errors in short oligonucleotide microarray analysis

    Plant Biotechnol J

    (2006)
  • S. Das et al.

    Detection and validation of single feature polymorphisms in cowpea (Vigna unguiculata L. Walp) using a soybean genome array

    BMC Genomics

    (2008)
  • M. Gore et al.

    Evaluation of target preparation methods for single-feature polymorphism detection in large complex plant genomes

    Crop Sci

    (2007)
  • S.I. Wright et al.

    The effects of artificial selection on the maize genome

    Science

    (2005)
  • M. Yamasaki et al.

    A large-scale screen for artificial selection in maize identifies candidate agronomic loci for domestication and crop improvement

    Plant Cell

    (2005)
  • A. Beló et al.

    Whole genome scan detects an allelic variant of fad2 associated with increased oleic acid levels in maize

    Mol Genet Genomics

    (2008)
  • I.Y. Choi et al.

    A soybean transcript map: gene distribution, haplotype and single-nucleotide polymorphism analysis

    Genetics

    (2007)
  • M. Nordborg et al.

    The pattern of polymorphism in Arabidopsis thaliana

    PLoS Biol

    (2005)
  • K. Schmid et al.

    A multilocus sequence survey in Arabidopsis thaliana reveals a genome-wide departure from a neutral model of DNA sequence polymorphism

    Genetics

    (2005)
  • S. Nasu et al.

    Search for and analysis of single nucleotide polymorphisms (SNPs) in rice (Oryza sativa, Oryza rufipogon) and establishment of SNP markers

    DNA Res

    (2002)
  • A. Van Deynze et al.

    Diversity in conserved genes in tomato

    BMC Genomics

    (2007)