Organization and variability of the maize genome

https://doi.org/10.1016/j.pbi.2006.01.009

With a size approximating that of the human genome, the maize genome is about to become the largest plant genome yet sequenced. Contributing to that size are a whole-genome duplication event and a retrotransposition explosion that produced a large amount of repetitive DNA. This DNA is greatly under-represented in cDNA collections, so analysis of the maize transcriptome has been an expedient way of assessing the gene content of maize. Over 2 million maize cDNA sequences are now available, making maize the third most widely studied organism, behind mouse and man. To date, the sequencing of large-sized DNA clones has been largely driven by the genetic interests of different investigators. The recent construction of a physical map that is anchored to the genetic map will aid immensely in the maize genome-sequencing effort. However, studies showing that the repetitive DNA component is highly polymorphic among maize inbred lines point to the need to sample vertically a few specific regions of the genome to evaluate the extent and importance of this variability.

Introduction

Maize is unusual among model genetic organisms. In addition to being, until the past decade, the most extensively studied plant species, it has a very high commercial value and serves as the main staple of the diet for millions of people in Africa and the Americas. It should, therefore, have ranked close to the top among species considered for early large-scale genome sequencing by the public sector. However, the high cost of sequencing its moderately large genome (>2.5 Gb) relegated maize to a position behind other plants with much smaller genomes, such as Arabidopsis, rice, poplar, and Medicago. That situation finally changed last year, when the same three US agencies that sponsored the Arabidopsis genome sequencing project (National Science Foundation [NSF], Department of Energy [DOE], and US Department of Agriculture [USDA]) announced a US$32 million program for sequencing the maize genome. Here, we review our knowledge of the organization and variability of the maize genome at the outset of this highly anticipated program.

The maize transcriptome

Given the maize genome's large size [1] and abundance of transposable elements [2], a quick way to assess its gene content is through the isolation and sequencing of cDNAs from different tissues. With a total of 482 892 sequences, maize currently ranks tenth overall in number of cDNA entries in GenBank and second, only behind wheat, among plants. In addition, an industry consortium has recently made a collection of 1 845 987 expressed sequence tags (ESTs) available via a user agreement (//www.maizeseq.org/

The early stages of genome sequencing: absence of a physical map

The initial sequence analysis of maize genomic regions, taken from different maize lines and scattered across the linkage map, was driven by the genetic interests of individual investigators (see Supplementary material). The first contiguous genomic regions to be sequenced were the z1C1 [15, 16] and adh1 loci [17] on chromosomes 4 and 1, respectively. The z1C1 locus of inbred BSSS53 was assembled first from overlapping cosmids and subsequently from overlapping bacterial artificial chromosome (BAC)

The present: sequencing genomic regions anchored to the physical map

As illustrated by the Arabidopsis and rice projects, the sequencing of large regions of the genome and the isolation of genes by map-based cloning benefit greatly from the availability of a physical map, which is generally constructed in two steps. First, the genome is broken down into small overlapping fragments of about 150 kb, and shared restriction patterns are used to establish contiguous regions, called fingerprinted contigs (FPCs) [29]. Second, FPCs are aligned to the genetic map through

Overall composition of the maize genome

A global view of the organization of repetitive DNA in the maize genome has been generated by fluorescence-based in situ hybridization, which facilitated the localization of centromeric, knob, and microsatellite repeat sequences for each of the ten chromosomes [37]. Recently, it has even been possible to position genetically mapped EST markers along the 10 maize pachytene chromosomes on the basis of the distribution of recombination nodules along synaptonemal complexes [38]. Most EST markers

Intraspecific and interspecific comparisons

The availability of BAC-sized contiguous sequences has made it possible to examine the relationship of components of the maize genome to each other and to those of other species.

Some gene families in maize, such as storage protein and disease resistance genes, are found in multiple genomic regions. The α-zein storage protein genes consist of 42 copies in six different chromosomal locations [11, 42]. The Rp1 genes are located in two regions of 250 and 300 kb on chromosome 10; and the MP3 genes

Conclusions

A large-scale sequencing project of the B73 genome is about to begin. The sequence information derived from this project will be extremely useful to geneticists and breeders because it will be largely anchored to a genetic map. However, the repetitive DNA component is highly polymorphic and there might even be exceptions to the conservation of gene order among inbreds. Therefore, the picture of the distribution of retrotransposons and genes captured by the new genome sequencing project will be

References and recommended reading

Papers of particular interest, published within the annual period of review, have been highlighted as:

  • • of special interest

  • •• of outstanding interest

Acknowledgements

We thank Galina Fuks for her help in assembling the tables. Part of the work presented here and the preparation of this article was supported by National Science Foundation (NSF) grants DBI 99-75618, MCB 99-04646, DBI 02-11851, and DBI 03-20683.

References (58)

  • P. SanMiguel et al.

    Evidence that a recent increase in maize genome size was caused by the massive amplification of intergene retrotransposons

    Ann Bot

    (1998)
  • L.K. Anderson et al.

    Uneven distribution of expressed sequence tag loci on maize pachytene chromosomes

    Genome Res

    (2006)
  • K. Arumuganathan et al.

    Nuclear DNA content of some important plant species

    Plant Mol Biol Reporter

    (1991)
  • J. Fernandes et al.

    Comparison of RNA expression profiles based on maize expressed sequence tag frequency analysis and micro-array hybridization

    Plant Physiol

    (2002)
  • B. Veit et al.

    Regulation of leaf initiation by the terminal ear 1 gene of maize

    Nature

    (1998)
  • J. Lai et al.

    Characterization of the maize endosperm transcriptome and its comparison to the rice genome

    Genome Res

    (2004)
  • International Rice Genome Sequencing Project

    The map-based sequence of the rice genome

    Nature

    (2005)
  • T. Dresselhaus et al.

    Representative cDNA libraries from few plant cells

    Plant J

    (1994)
  • M.L. Engel et al.

    Sperm cells of Zea mays have a complex complement of mRNAs

    Plant J

    (2003)
  • J. Messing et al.

    Sequence composition and genome organization of maize

    Proc Natl Acad Sci USA

    (2004)
  • H. Fu et al.

    Intraspecific violation of genetic colinearity and its implications in maize

    Proc Natl Acad Sci USA

    (2002)
  • R. Song et al.

    Gene expression of a gene family in maize based on noncollinear haplotypes

    Proc Natl Acad Sci USA

    (2003)
  • S. Brunner et al.

    Evolution of DNA sequence nonhomologies among maize inbreds

    Plant Cell

    (2005)
  • G. Haberer et al.

    Structure and architecture of the maize genome

    Plant Physiol

    (2005)
  • Y. Fu et al.

    Quality assessment of maize assembled genomic islands (MAGIs) and large-scale experimental verification of predicted genes

    Proc Natl Acad Sci USA

    (2005)
  • V. Llaca et al.

    Amplicons of maize zein genes are conserved within genic but expanded and constricted in intergenic regions

    Plant J

    (1998)
  • R. Song et al.

    Sequence, regulation, and evolution of the maize 22-kD alpha zein gene family

    Genome Res

    (2001)
  • A.P. Tikhonov et al.

    Colinearity and its exceptions in orthologous adh regions of maize and sorghum

    Proc Natl Acad Sci USA

    (1999)
  • P. SanMiguel et al.

    Nested retrotransposons in the intergenic regions of the maize genome

    Science

    (1996)
  • P. SanMiguel et al.

    The paleontology of intergene retrotransposons of maize

    Nat Genet

    (1998)
  • J. Ma et al.

    Rapid recent growth and divergence of rice nuclear genomes

    Proc Natl Acad Sci USA

    (2004)
  • Z. Swigonova et al.

    Structure and evolution of the r/b chromosomal regions in rice, maize and sorghum

    Genetics

    (2005)
  • H. Fu et al.

    The highly recombinogenic bz locus lies in an unusually gene-rich region of the maize genome

    Proc Natl Acad Sci USA

    (2001)
  • H. Fu et al.

    Recombination rates between adjacent genic and retrotransposon regions in maize vary by 2 orders of magnitude

    Proc Natl Acad Sci USA

    (2002)
  • K. Palaisa et al.

    Long-range patterns of diversity and linkage disequilibrium surrounding the maize Y1 gene are indicative of an asymmetric selective sweep

    Proc Natl Acad Sci USA

    (2004)
  • F. Zhang et al.

    Comparisons of maize pericarp color 1 alleles reveal paralogous gene recombination and an organ-specific enhancer region

    Plant Cell

    (2005)
  • D.C. Inada et al.

    Conserved noncoding sequences in the grasses

    Genome Res

    (2003)
  • C.B. Della-Vedova et al.

    The dominant inhibitory chalcone synthase allele C2-Idf (Inhibitor diffuse) from Zea mays (L.) acts via an endogenous RNA silencing mechanism

    Genetics

    (2005)
  • M. Stam et al.

    Differential chromatin structure within a tandem array 100 kb upstream of the maize b1 locus is associated with paramutation

    Genes Dev

    (2002)