Elsevier

Gene

Volume 443, Issues 1–2, 15 August 2009, Pages 64-69
Genetic robustness at the codon level as a measure of selection

https://doi.org/10.1016/j.gene.2009.05.009

Abstract

Selection at the DNA level is usually detected by analysing substitution rates from multiple-species comparisons. It has been suggested that measures of genetic robustness at the codon level, which can be computed from a single coding sequence, can be used to estimate selection, but the validity of these measures has been questioned. Here I test the efficiency of different measures of genetic robustness at the codon level to estimate the level of selection acting on a gene. I find that volatility and other measures of robustness are correlated with dN/dS, and that this is not simply the effect of a preference for translationally optimal codons. I discuss the possible implications and problems of these methods based on single-sequence codon usage analysis.

Introduction

Natural selection at the DNA sequence level is usually assessed by comparing the number of synonymous substitutions per synonymous site (dS) and the number of nonsynonymous substitutions per nonsynonymous site (dN) from an alignment of at least two homologous protein-coding DNA sequences (Li, 1997; Nei and Kumar, 2000).
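As a reminder of how the two rates are usually combined, the ratio can be written as below. The symbol \omega and the three-way reading are the conventional ones from the molecular evolution literature and are not notation used in this text, so treat them as an assumption of this sketch.

```latex
\omega = \frac{d_N}{d_S}, \qquad
\begin{cases}
\omega < 1 & \text{purifying (negative) selection} \\
\omega \approx 1 & \text{neutral evolution} \\
\omega > 1 & \text{positive selection}
\end{cases}
```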

Two methods have been proposed independently (“volatility” by Plotkin and Dushoff, 2003 and Plotkin et al., 2004; “degree of error minimization” by Archetti, 2004a, 2004b) that measure the genetic robustness of a coding sequence and could be used to detect selection by analysing a single sequence, with no need for comparative analysis. These methods rely on the fact that synonymous codons, though neutral at the protein level, have different mutant codons with different fitness. Synonymous codons may therefore be under selection because they differ in their rates of back mutation (Archetti, 2004b; Plotkin et al., 2004) or because they affect the fitness of competing individuals differently (Archetti, 2006). The main difference between the two methods is that volatility (Plotkin et al., 2004) is a relative measure (relative to the other genes of the genome), whereas the degree of error minimization (Archetti, 2004a, 2004b) is an absolute measure (depending only on the genetic code used).

Both methods rely on codon usage analysis of single coding sequences, with no need for comparative analysis. Since homologous sequences are not always available, a method to detect selection based on direct single-sequence analysis would be a useful tool in the study of molecular evolution. Both Archetti (2004b) and Plotkin et al. (2004) found a correlation between genetic robustness and substitution rates in a limited set of genes. The importance of these methods as a complement to dN/dS is therefore worth investigating.

Volatility (Plotkin et al., 2004) has been criticised on different grounds, and virtually all the subsequent analyses (Hahn et al., 2005; Nielsen and Hubisz, 2005; Sharp, 2005; Chen et al., 2005; Dagan and Graur, 2005; Friedman and Hughes, 2005; Stoletzki et al., 2005; Zhang, 2005; Pillai et al., 2005) seem to suggest that volatility should be rejected as a method to detect selection. The degree of error minimization (Archetti, 2004b) has received less attention, primarily in studies concerning the evolution of the genetic code (Goodarzi et al., 2005; Marquez et al., 2005).

The critiques of volatility fall into three broad categories: (i) theoretical or conceptual uncertainties about the method, (ii) the existence of confounding factors, and (iii) doubts about the validity of the results (correlation with dN/dS). Clearly, as Stoletzki et al. (2005) point out, critiques of the first and second kind are rather unimportant if the method does provide a valid alternative measure to dN/dS.

The critiques put forward by Hahn et al. (2005), Nielsen and Hubisz (2005), Sharp (2005), Chen et al. (2005), Zhang (2005), Pillai et al. (2005) and also Dagan and Graur (2005) belong to the first and second categories. Those of Stoletzki et al. (2005), Friedman and Hughes (2005) and, in part, Dagan and Graur (2005) belong to the third. Stoletzki et al. (2005) show that a correlation between dN/dS and volatility actually exists but suggest that it is a byproduct of a correlation with preference for codons that optimize translation efficiency, while Dagan and Graur (2005) and Friedman and Hughes (2005) question the very existence of a correlation.

I develop new measures of genetic robustness and I compare these and previous measures with dN/dS values derived from classical comparative analysis. I also re-analyse the data for which volatility has been questioned, to test whether the other measures described here can provide a valid alternative to dN/dS or whether they have the same problems as volatility. I then discuss whether the theoretical critiques that have been put forward about volatility can be applied to other measures as well.

Volatility

Volatility has been described and used by Plotkin and Dushoff (2003) and by Plotkin et al. (2004). The volatility v(i) of codon i is defined as

$$v(i) = \frac{1}{n} \sum_{i^*} D[A(i), A(i^*)]$$

where the sum is calculated over the n non-stop codons i* that can mutate into i by a single point mutation and D is the distance (dissimilarity) between the amino acid A(i) coded by codon i and its mutant A(i*) coded by codon i*. Plotkin et al. (2004) use the simplest possible measure for D, the Hamming metric, which equals zero if
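The definition of v(i) can be sketched in a few lines of Python. This is a minimal illustration, not the author's or Plotkin et al.'s implementation: it assumes the standard genetic code, takes the Hamming metric in its usual sense (0 for identical amino acids, 1 otherwise), and the names `GENETIC_CODE`, `neighbours`, `hamming` and `volatility` are hypothetical helpers introduced here.

```python
from itertools import product

# Standard genetic code (assumption: NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def neighbours(codon):
    """Yield the nine codons that differ from `codon` at exactly one position."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def hamming(aa1, aa2):
    """Hamming distance between amino acids: 0 if identical, 1 otherwise."""
    return 0 if aa1 == aa2 else 1


def volatility(codon, distance=hamming):
    """Mean distance between the amino acid of `codon` and those of its
    non-stop single-point-mutation neighbours, i.e. v(i) in the text."""
    amino_acid = GENETIC_CODE[codon]
    if amino_acid == "*":
        raise ValueError("volatility is not defined for stop codons")
    # Stop codons are excluded, so n can be smaller than nine.
    mutants = [m for m in neighbours(codon) if GENETIC_CODE[m] != "*"]
    return sum(distance(amino_acid, GENETIC_CODE[m]) for m in mutants) / len(mutants)


if __name__ == "__main__":
    # Two synonymous arginine codons with different volatility.
    print(volatility("CGA"), volatility("AGA"))
```

Passing a different `distance` function (for example one based on physicochemical differences between amino acids) changes D without touching the rest of the calculation. Under the Hamming metric the two arginine codons CGA and AGA give 0.5 and 0.75, which shows how synonymous codons can differ in this measure.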

Genetic robustness is correlated with dN/dS

I analysed the correlation between dN/dS and measures of genetic robustness for species that had been analysed previously and for which no evidence of a correlation between dN/dS and volatility had been found. The results are in Tables 1 and 2. In S. cerevisiae I found a correlation between dN/dS and all the measures of genetic robustness except PCA and the modified versions of volatility. Plotkin et al. (2004) mention that the correlation between dN/dS and volatility is higher (r =  
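The kind of per-gene comparison described here can be sketched as a correlation between two lists of values, one robustness score and one dN/dS estimate per gene. The snippet below is only an illustration: it assumes a Pearson coefficient (the text reports r but the statistic used is not visible in this excerpt), `pearson_r` is a hypothetical helper, and the numbers are made-up toy values, not data from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy, made-up values for four hypothetical genes (NOT results from the paper).
    robustness_scores = [0.61, 0.64, 0.66, 0.70]
    dn_ds_estimates = [0.05, 0.09, 0.08, 0.15]
    print(pearson_r(robustness_scores, dn_ds_estimates))
```

A rank-based coefficient could be substituted if the relationship is expected to be monotonic rather than linear.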

Discussion

I have shown that alternative measures based on the analysis of codon usage from a single sequence are correlated with dN/dS values derived from comparative analysis, at least in the species analysed here. Because the validity of one of these measures (volatility) has received many critiques, I think it is worth pointing out the differences between the idea on which volatility is based and the logic of other measures of genetic robustness.

The first main difference is that volatility is a

Acknowledgments

Thanks to Nina Stoletzki and to Robert Friedman and Austin Hughes for providing the data used in their analyses, to Joshua Plotkin for help with the calculation of volatility and to Alan Grafen for discussion and a clarification on the statistics. I am supported by a long-term postdoctoral fellowship of the Human Frontier Science Program Organization.

References (26)

  • Goodarzi, H. et al. On the coevolution of genes and genetic code. Gene (2005)
  • Pillai, S.K. et al. Codon volatility does not reflect selective pressure on the HIV-1 genome. Virology (2005)
  • Wright, F. The ‘effective number of codons’ used in a gene. Gene (1990)
  • Archetti, M. Codon usage bias and mutation constraints reduce the level of error minimization of the genetic code. J. Mol. Evol. (2004)
  • Archetti, M. Selection on codon usage for error minimization at the protein level. J. Mol. Evol. (2004)
  • Archetti, M. Genetic robustness and selection at the protein level for synonymous codons. J. Evol. Biol. (2006)
  • Archetti, M. Survival of the steepest: hypersensitivity to mutations as an adaptation to soft selection. J. Evol. Biol. (2009)
  • Chen, Y. et al. Codon volatility does not detect selection. Nature (2005)
  • Dagan, T. et al. The comparative method rules! Codon volatility cannot detect positive Darwinian selection using a single genome sequence. Mol. Biol. Evol. (2005)
  • Friedman, R. et al. Codon volatility as an indicator of positive selection, data from eukaryotic genome comparisons. Mol. Biol. Evol. (2005)
  • Hahn, M.W. et al. Codon bias and selection on single genomes. Nature (2005)
  • Kimura, M. The Neutral Theory of Molecular Evolution (1983)
  • Li, W.H. Rates and Patterns of Nucleotide Substitution
