Mutation rate variation in the mammalian genome

https://doi.org/10.1016/j.gde.2003.10.008

Abstract

Recent advances in the large-scale sequencing of mammalian genomes have made it possible to study divergence not only in genic sequences but also in the non-coding bulk of DNA. There is evidence of significant variation in the levels of divergence between presumably neutral regions, pointing to an underlying variation in the rate of mutation across the genome. Such variation appears to occur on different scales, including sequence context effects (the influence of neighboring nucleotides on the rate of mutation at individual sites), variation within chromosomes (on the scales of kilobases as well as megabases), and between chromosomes (among autosomes as well as between autosomes and sex chromosomes). An important goal for further research in this area is to determine whether there is an ultimate evolutionary explanation for mutation rate variation within mammalian genomes.

Introduction

Mutation is a fundamental process without which evolution would not occur. Knowledge about mutation rates is therefore key to evolutionary and population genetics, but also to several other areas. For instance, proper evolutionary dating founded on molecular clocks requires knowledge of the mutation rate. Moreover, because mutation is a double-edged sword that also causes genetic disease, understanding the rate of mutation is important in medical genetics. Furthermore, if we are to infer selection from patterns of divergence, an important aspect of comparative and functional genomics, then we need realistic null models of neutral variation (i.e. knowledge of mutation patterns). Finally, knowledge about mutation rates can shed light on the mechanistic basis of germline mutation; there is, for instance, an ongoing debate concerning the relative importance of replication errors as a source of mutation.

There is an increasing body of evidence pointing to within-genome variation in the substitution rate at presumably neutral sites, a variation most easily explained by an underlying variation in the rate of mutation. The first such hints were offered by the observation that synonymous (silent) substitution rates vary between mammalian genes [1]. However, inferring patterns of mutation from patterns of substitution in silent sites can be problematic (Table 1). Fortunately, the recent burst of large-scale genomic sequence data has permitted the study of mutation rate variation at a new and much larger scale. Importantly, we are now starting to learn about mutation processes in the non-coding bulk of DNA, both in repetitive and unique sequences (Table 1). Here we review mammalian mutation rate variation from a genomics perspective, paying particular attention to recent data obtained in large-scale sequence comparisons within primates. We shall focus on the process of point mutation, as mutations involving insertions and deletions (including short indels, transpositions and length mutation in tandem repetitive DNA) are generally thought to have a mechanistic basis different from that of point mutation.

Section snippets

Methodological aspects

Given that spontaneous germline mutation rates for point substitutions in mammals are only ∼10⁻⁸ per bp per generation [2], we cannot hope to observe enough mutations directly to reliably infer mutation rates. As a consequence, mutation patterns are usually studied indirectly by comparing orthologous sequences from different species. Polymorphism data based on sequence variation within species also represent a useful and important source of information concerning mutation, although here we
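As a rough illustration of the indirect approach, the sketch below estimates divergence between two aligned orthologous sequences. It computes the raw proportion of differing sites and then applies the Jukes-Cantor (JC69) correction for multiple substitutions at the same site. The choice of JC69, the function names and the toy sequences are assumptions made for illustration; they are not taken from the studies reviewed here, which may use richer substitution models.

```python
import math

def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return differences / compared

def jc69_distance(p):
    """Jukes-Cantor corrected substitutions per site (undefined for p >= 0.75)."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the JC69 correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

# Hypothetical toy alignment: 20 comparable sites, 2 differences, 1 gap column.
species_1 = "ACGTACGTAC-GTACGTACGT"
species_2 = "ACGTACATAC-GTACGTACGA"

p = p_distance(species_1, species_2)
d = jc69_distance(p)
print(f"p-distance = {p:.3f}, JC69 distance = {d:.4f}")
```

The corrected distance is slightly larger than the raw proportion because some sites may have changed more than once. For closely related species the two values are nearly identical.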

Sequence context effects

The single nucleotide is the smallest scale at which mutation rate variation can occur. For example, at this level we see a G/C↔A/T bias in the rate of transversion, which means that G/C nucleotides are more mutable [8], and there is an extensive literature on modelling among-site variation in substitution rate [9] (Figure 1). Such rate variation is also affected by the sequence context, which refers to the influence of neighbouring nucleotides on mutation rates. The study of sequence context
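One simple way to look for sequence context effects is to tabulate divergence separately for each trinucleotide context, that is, for each combination of a site and its two immediate neighbours. The sketch below does this for a pairwise alignment. It is a minimal illustration under stated assumptions: the first sequence is treated as the reference that defines the context (a real analysis would infer the ancestral state, for example with an outgroup), and the function name and toy sequences are hypothetical.

```python
from collections import defaultdict

def context_divergence(ref, other):
    """Fraction of differing sites per trinucleotide context.

    The context of site i is ref[i-1:i+2], i.e. the 5' neighbour, the site
    itself and the 3' neighbour, all read from the reference sequence.
    """
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    ref, other = ref.upper(), other.upper()
    sites = defaultdict(int)
    diffs = defaultdict(int)
    for i in range(1, len(ref) - 1):
        context = ref[i - 1:i + 2]
        # Skip sites where the context or the compared base is a gap or ambiguous.
        if not set(context) <= valid or other[i] not in valid:
            continue
        sites[context] += 1
        if other[i] != ref[i]:
            diffs[context] += 1
    return {ctx: diffs[ctx] / n for ctx, n in sites.items()}

# Hypothetical toy alignment in which only the C of "ACG" contexts differs.
reference = "ACGACGACGT"
compared = "ATGACGATGT"

rates = context_divergence(reference, compared)
for ctx in sorted(rates):
    print(ctx, round(rates[ctx], 3))
```

With real data each context needs many sites before its estimated rate is meaningful, so contexts with small counts should be interpreted with caution.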

Mutation rate variation and isochores

A striking feature of mammalian genomes is the existence of a large variation in GC content, characterized by long (>300 kb) regions of relatively homogenous GC content termed ‘isochores’. The origin and evolution of isochores have been the focus of much debate [29]. One possibility is that they reflect regional variation in mutation patterns. Natural selection [30] or biased gene conversion [31], however — which both alter the probability of mutations becoming fixed in a population rather than
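Regional variation in GC content of the kind described above is typically visualised by computing the GC fraction in windows along a sequence. The sketch below is a minimal version of that calculation. The function name, the window size and the toy sequence are assumptions for illustration; isochore-scale analyses would use windows of many kilobases on real genomic sequence.

```python
def gc_content_windows(sequence, window, step=None):
    """Return (start, gc_fraction) for windows along the sequence.

    Bases other than A, C, G and T (for example N) are excluded from the
    denominator; windows with no unambiguous bases are skipped.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if step is None:
        step = window  # non-overlapping windows by default
    if step <= 0:
        raise ValueError("step must be positive")
    seq = sequence.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        unambiguous = sum(chunk.count(base) for base in "ACGT")
        if unambiguous == 0:
            continue
        gc = chunk.count("G") + chunk.count("C")
        results.append((start, gc / unambiguous))
    return results

# Hypothetical toy sequence: a GC-rich block followed by an AT-rich block.
toy = "GGCC" * 5 + "ATAT" * 5
windows = gc_content_windows(toy, window=20)
print(windows)
```

Passing a `step` smaller than `window` gives overlapping (sliding) windows, which smooths the profile at the cost of correlated neighbouring values.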

The role of recombination

A link between recombination and mutation has recently been suggested from observations of a positive correlation between divergence and recombination rate [33•,39]. A direct effect could be damage incurred during crossing over; recombination-associated double-stranded breaks in yeast are believed to be mutagenic but it is unknown whether they exist in humans [33]. It is also possible that recombination-associated mechanisms indirectly affect mutation rates. Recombination-associated mismatch

Conclusions

An important conclusion from recent work is that there is evidence for within-genome mutation rate variation in mammalian non-coding DNA at various scales — from sequence context effects, via regional and within-chromosome effects, to between-chromosome variation. There is some support for the notion that mutation patterns are associated with isochore structure and recombination but it is highly unlikely that there is a single factor that can explain mutation rate variation at all the scales

References and recommended reading

Papers of particular interest, published within the annual period of review, have been highlighted as:

  • of special interest

  •• of outstanding interest

Acknowledgements

Financial support from the Swedish Research Council is acknowledged. H Ellegren is a Royal Swedish Academy of Sciences Research Fellow supported by a grant from the Knut and Alice Wallenberg Foundation.

References (46)

  • R.H. Waterston et al. Initial sequencing and comparative analysis of the mouse genome. Nature (2002)
  • E.T. Dermitzakis et al. Numerous potentially functional but non-genic conserved sequences on human chromosome 21. Nature (2002)
  • A. Ureta-Vidal et al. Comparative genomics: genome-wide analysis in metazoan eukaryotes. Nat Rev Genet (2003)
  • N.G.C. Smith et al. Deterministic mutation rate variation in the human genome. Genome Res (2002)
  • Z. Yang. The among-site rate variation and its impact on phylogenetic analyses. Trends Ecol Evol (1996)
  • M. Zavolan et al. Statistical inference of sequence-dependent mutation rates. Curr Opin Genet Dev (2001)
  • J. Majewski et al. Distribution and characterization of regulatory elements in the human genome. Genome Res (2002)
  • I. Hellmann et al. Selection on human genes as revealed by comparisons to chimpanzee cDNA. Genome Res (2003)
  • S. Subramanian et al. Neutral substitutions occur at a faster rate in exons than in noncoding DNA in primate genomes. Genome Res (2003)
  • P.F. Arndt et al. DNA sequence evolution with neighbor-dependent mutation. J Comput Biol (2003)
  • G. Matassi et al. Chromosomal location effects on gene sequence evolution in mammals. Curr Biol (1999)
  • M.J. Lercher et al. Local similarity in evolutionary rates extends over whole chromosomes in human-rodent and mouse-rat comparisons: implications for understanding the mechanistic basis of the male mutation bias. Mol Biol Evol (2001)
  • E.J.B. Williams et al. The proteins of linked genes evolve at similar rates. Nature (2000)