The Search for Meaning in Noncoding DNA

Andrew G. Clark
Institute of Molecular Evolutionary Genetics, Pennsylvania State University, University Park, Pennsylvania 16802, USA

This extract was created in the absence of an abstract.

Only a small portion of the genome of higher organisms encodes information for amino acid sequences of proteins, and of the noncoding sequence an unknown fraction plays a vital role in regulating gene expression. It is widely appreciated that comparisons among genome sequences will provide a key opportunity to identify functional regions of noncoding DNA by virtue of the conservation of their primary sequences. Underlying this assumption (that sequence conservation implies functional constraint) is an old idea in the theory of molecular evolution: that substitution rates vary among sites depending on constraint. In this issue, Bergman and Kreitman (2001) study the properties of sequence divergence in noncoding regions by comparing 100 kb of sequence in promoter regions and introns of 40 genes of Drosophila melanogaster and Drosophila virilis. Using a heuristic filtered dotplot, they identify blocks >8 bp in length with >70% sequence conservation between the two species. Altogether, ∼22%–26% of the noncoding sequence fell into such conserved blocks. On average, there were 10.7 blocks per kilobase pair of D. melanogaster DNA, and the blocks varied widely in length with an average of 19 bp. Distributions of block lengths, distributions of lengths of insertion/deletion events, and patterns of nucleotide substitutions were all statistically indistinguishable in contrasts of intergenic and intronic sequences. This is a surprising result, as one would expect that transcriptional, …
