Abstract
Selection pressures on proteins are usually measured by comparing homologous nucleotide sequences (Zuckerkandl and Pauling 1965). Recently we introduced a novel method, termed volatility, to estimate selection pressures on proteins on the basis of their synonymous codon usage (Plotkin and Dushoff 2003; Plotkin et al. 2004). Here we provide a theoretical foundation for this approach. Under the Fisher-Wright model, we derive the expected frequencies of synonymous codons as a function of the strength of selection on amino acids, the mutation rate, and the effective population size. We analyze the conditions under which we can expect to draw inferences from biased codon usage, and we estimate the time scales required to establish and maintain such a signal. We find that synonymous codon usage can reliably distinguish between negative selection and neutrality only for organisms, such as some microbes, that experience large effective population sizes or periods of elevated mutation rates. The power of volatility to detect positive selection is also modest—requiring approximately 100 selected sites—but it depends less strongly on population size. We show that phenomena such as transient hyper-mutators can improve the power of volatility to detect selection, even when the neutral site heterozygosity is low. We also discuss several confounding factors, neglected by the Fisher-Wright model, that may limit the applicability of volatility in practice.
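As a concrete illustration of the quantity the abstract refers to, the following is a minimal sketch of a per-codon volatility score. It assumes the definition commonly attributed to Plotkin et al. (2004): the fraction of a codon's non-stop single-nucleotide neighbors that encode a different amino acid. The function names (`point_neighbors`, `codon_volatility`, `mean_volatility`) are hypothetical, and the unweighted gene-level average is a simplification for illustration, not the statistical procedure used in the cited papers.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def point_neighbors(codon):
    """Yield the nine codons that differ from `codon` at exactly one position."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def codon_volatility(codon):
    """Fraction of non-stop point-mutation neighbors encoding a different amino acid.

    Assumed definition; stop-codon neighbors are excluded from the denominator.
    """
    aa = CODON_TABLE[codon]
    sense = [n for n in point_neighbors(codon) if CODON_TABLE[n] != "*"]
    changed = sum(1 for n in sense if CODON_TABLE[n] != aa)
    return changed / len(sense)


def mean_volatility(seq):
    """Unweighted mean volatility over the sense codons of an in-frame sequence.

    Hypothetical simplification: it does not control for amino acid composition.
    """
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    sense = [c for c in codons if CODON_TABLE[c] != "*"]
    return sum(codon_volatility(c) for c in sense) / len(sense)
```

Under this assumed definition the two arginine codons CGA and AGA differ in volatility: `codon_volatility("CGA")` is 0.5 and `codon_volatility("AGA")` is 0.75, which is the kind of difference among synonymous codons that the method exploits.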
References
Akashi H (2001) Gene expression and molecular evolution. Curr Opin Genet Dev 11:660–666
Akashi H, Schaeffer SW (1997) Natural selection and the frequency distribution of “silent” DNA polymorphism in Drosophila. Genetics 146:295–307
Anisimova M, Bielawski JP, Yang Z (2001) The accuracy and power of likelihood ratio tests to detect positive selection at amino acid sites. Mol Biol Evol 18:1585–1592
Berg O (1996) Selection intensity for codon bias and the effective population size of Escherichia coli. Genetics 142:1379–1382
Bjedov I, Tenaillon O, Gerard B, et al. (2003) Stress-induced mutagenesis in bacteria. Science 300:1404–1409
Bulmer M (1991) The selection-mutation-drift theory of synonymous codon usage. Genetics 129:897–907
Bustamante CD, Wakeley J, Sawyer S, Hartl DL (2001) Directional selection and the site-frequency spectrum. Genetics 159:1779–1788
Chamary JV, Parmley JL, Hurst LD (2006) Hearing silence: non-neutral evolution at synonymous sites in mammals. Nature Rev Genet 7:98–108
Charlesworth B, Morgan MT, Charlesworth D (1993) The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289–1303
Clark A, Glanowski S, Nielsen R, Thomas P, Kejariwal A, Todd MA, Tanenbaum D, Civello D, Lu F, Murphy B, Ferriera S, Wang G, Zheng X, White T, Sninsky J, Adams M, Cargill M (2003) Inferring nonneutral evolution from human-chimp-mouse orthologous gene trios. Science 302:1960–1963
Coghlan A, Wolfe KH (2000) Relationship of codon bias to mRNA concentration and protein length in Saccharomyces cerevisiae. Yeast 16:1131–1145
Crow JF, Kimura M (1970) An introduction to population genetics theory. Burgess, Minneapolis
Dagan T, Graur D (2004) The comparative method rules! codon volatility cannot detect positive Darwinian selection using a single genome sequence. Mol Biol Evol 22:496–500
DeBry R, Marzluff WF (1994) Selection on silent sites in the rodent H3 histone gene family. Genetics 138:191–202
Denamur E, Lecointre G, Darlu P, Tenaillon O, Acquaviva C, Sayada C, Sunjevaric I (2000) Evolutionary implications of the frequent horizontal transfer of mismatch repair genes. Cell 103:711–721
Drummond DA, Bloom JD, Adami C, Wilke CO, Arnold FH (2005) Why highly expressed proteins evolve slowly. Proc Natl Acad Sci USA 102:14338–14343
Drummond DA, Raval A, Wilke CO (2006) A single determinant dominates the rate of yeast protein evolution. Mol Biol Evol 23:327–337
Ewens W (2004) Mathematical population genetics I. Springer-Verlag, New York
Fleischmann RD, Alland D, Eisen JA, Carpenter L, White O, Peterson J, DeBoy R, Dodson R, Gwinn M, Haft D, Hickey E, Kolonay JF, Nelson WC, Umayam LA, Ermolaeva M, Salzberg SL, Delcher A, Utterback T, Weidman J, Khouri H, Gill J, Mikula A, Bishai W, Jacobs WR, Venter JC, Fraser CM (2002) Whole-genome comparison of Mycobacterium tuberculosis clinical and laboratory strains. J Bacteriol 184:5479–5490
Friedman R, Hughes AL (2005) Codon volatility as an indicator of positive selection: Data from eukaryotic genome comparisons. Mol Biol Evol 22:542–543
Gibbons RJ, Kapsimalis B (1967) Estimates of the overall rate of growth of the intestinal microflora for hamsters, guinea pigs, and mice. J Bacteriol 93:510–512
Gillespie J (2001) Is the population size of a species relevant to its evolution? Evolution 55:2161–2169
Giraud A, Radman M, Matic I, Taddei F (2001) The rise and fall of mutator bacteria. Curr Opin Microbiol 4:582–585
Golding GB, Strobeck C (1982) Expected frequencies of codon use as a function of mutation rates and codon fitnesses. J Mol Evol 18:379–386
Goldman N, Yang Z (1994) Codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol Biol Evol 11:725–736
Hahn MW, Mezey JG, Begun DJ, Gillespie JH, Kern AD, Langley CH, Moyle LC (2005) Codon bias and selection on single genomes. Nature 433:E1
Hartl DL, Sawyer SA (1994) Selection intensity for codon bias. Genetics 138:227–234
Higgs P (1994) Error thresholds and stationary mutant distributions in multi-locus diploid genetics models. Genet Res Cambr 63:63–78
Hirsh AE, Fraser HB, Wall DP (2005) Adjusting for selection on synonymous sites in estimates of evolutionary distance. Mol Biol Evol 22:174–177
Ikemura T (1981) Correlation between the abundance of Escherichia coli transfer-RNAs and the occurrence of the respective codons in its protein. J Mol Biol 146:1–21
Kellis M, Patterson N, Endrizzi M, Birren B, Lander E (2003) Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423:241–254
Kimura M, Crow JF (1964) The number of alleles that can be maintained in a finite population. Genetics 49:725–738
King JL, Jukes TH (1969) Non-Darwinian evolution. Science 164:788
Konopka AJ (1985) Theory of degenerate coding and informational parameters of protein coding genes. Biochimie 67:455–468
Kreitman M (2000) Methods to detect selection in populations with applications to the human. Annu Rev Genomics Hum Genet 1:539–559
LeClerc JE, Li B, Payne WL, Cebula TA (1996) High mutation frequencies among Escherichia coli and Salmonella pathogens. Science 274:1208–1211
Li WH (1993) Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J Mol Evol 36:96–99
Lynch M, Conery JS (2003) The origins of genome complexity. Science 302:1401–1404
Maynard Smith J, Haigh J (1974) The hitch-hiking effect of a favorable gene. Genet Res Cambr 23:23–35
McDonald JH, Kreitman M (1991) Adaptive protein evolution at the ADH locus in Drosophila. Nature 351:652–654
Miyata T, Miyazawa S, Yasunaga T (1979) Two types of amino acid substitutions in protein evolution. J Mol Evol 12:219–236
Meyers LA, Ancel FD, Lachmann M (2005) Evolution of genetic potential. PLoS Comput Biol 1:236–243
Nagylaki T (1992) Introduction to theoretical population genetics. Springer, Berlin
Nielsen R, Hubisz M (2005) Detecting selection needs comparative data. Nature 433:E6
Notley-McRobb L, Seeto S, Ferenci T (2001) Enrichment and elimination of mutY mutators in Escherichia coli populations. Genetics 162:1055–1062
Ochman H, Elwyn S, Moran NA (1999) Calibrating bacterial evolution. Proc Natl Acad Sci USA 96:12638–12643
Oliver A, Canton R, Campo P, Baquero F, Blazquez J (2000) High frequency of hypermutable Pseudomonas aeruginosa in cystic fibrosis lung infection. Science 288:1251–1254
Pal C, Papp B, Hurst LD (2001) Highly expressed genes in yeast evolve slowly. Genetics 158:927–931
Plotkin JB, Dushoff J (2003) Codon bias and frequency-dependent selection on the hemagglutinin epitopes of Influenza A virus. Proc Natl Acad Sci USA 100:7152–7157
Plotkin JB, Dushoff J, Fraser HB (2004) Detecting selection using a single genome sequence of M. tuberculosis and P. falciparum. Nature 428:942–946
Plotkin JB, Dushoff J, Fraser HB (2005) Codon bias and selection on single genomes: reply. Nature 433:E7–E8
Plotkin JB, Fraser HB, Dushoff J (2006) Natural selection on the genome of Saccharomyces cerevisiae (in preparation)
Sanjuan R, Moya A, Elena S (2004) The distribution of fitness effects caused by single-nucleotide substitutions in an RNA virus. Proc Natl Acad Sci USA 101:8396–8401
Sawyer SA, Hartl DL (1992) Population genetics of polymorphism and divergence. Genetics 132:1161–1176
Sharp PM (2005) Gene “volatility” is most unlikely to reveal adaptation. Mol Biol Evol 22:807–809
Sharp PM, Li WH (1987) The codon adaptation index: A measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res 15:1281–1295
Simonsen KL, Churchill GA, Aquadro CF (1995) Properties of statistical tests of neutrality for DNA polymorphism data. Genetics 141:413–429
Sorensen M, Kurland C, Pedersen S (1989) Codon usage determines translation rate in Escherichia coli. J Mol Biol 207:365–377
Stoletzki N, Welch J, Hermisson J, Eyre-Walker A (2005) A dissection of volatility in yeast. Mol Biol Evol 22:2022–2026
Tajima F (1996) The amount of DNA polymorphism maintained in a finite population when the neutral mutation rate varies among sites. Genetics 143:1457–1465
Tang H, Wyckoff GJ, Lu J, Wu C (2004) Universal evolutionary index for amino acid changes. Mol Biol Evol 21:1548–1556
Tenaillon O, Denamur E, Matic I (2004) Evolutionary significance of stress-induced mutagenesis in bacteria. Trends Microbiol 12:264–270
Thompson CJ, McBride JL (1973) On Eigen’s theory of the self-organization of matter and the evolution of biological macromolecules. Math Biosci 21:127–142
van Nimwegen E, Crutchfield J, Huynen M (1999) Neutral evolution of mutational robustness. Proc Natl Acad Sci USA 96:9716–9720
Wertman K, Drubin D, Botstein D (1992) Systematic mutational analysis of the yeast ACT1 gene. Genetics 132:337–350
Wilke C (2001) Adaptive evolution on neutral networks. Bull Math Biol 63:715–730
Winter G, Kawai S, Haeggstrom M, Kaneko O, vonEuler A, Kawazu S, Palm D, Fernandez V, Wahlgren M (2005) SURFIN is a polymorphic antigen expressed on Plasmodium falciparum merozoites and infected erythrocytes. J Exp Med 201:1853–1863
Wloch DM, Szafraniec K, Borts RH, Korona R (2001) Direct estimate of the mutation rate and the distribution of fitness effects in the yeast Saccharomyces cerevisiae. Genetics 159:441–452
Wright S (1931) Evolution in Mendelian populations. Genetics 16:97–159
Yampolsky LY, Stoltzfus A (2004) The exchangeability of amino acids in proteins. Genetics 170:1459–1472
Yang Z (2000) Maximum likelihood estimation on large phylogenies and analysis of adaptive evolution in human influenza virus A. J Mol Evol 51:423–432
Yang Z, Nielsen R, Goldman N, Pedersen AMK (2000) Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431–449
Zhang J (2004) On the evolution of codon volatility. Genetics 169:495–501
Zuckerkandl E, Pauling L (1965) Molecules as documents of evolutionary history. J Theor Biol 8:357–366
Zeyl C, DeVisser J (2001) Estimates of the rate and distribution of fitness effects of spontaneous mutation in Saccharomyces cerevisiae. Genetics 157:53–61
Acknowledgments
We thank Daniel Fisher, Andrew Murray, and Michael Turelli for their input during the preparation of the manuscript. We also thank an anonymous referee for substantial conceptual input. J.B.P. acknowledges support from the Harvard Society of Fellows, the Milton Fund, and the Burroughs Wellcome Fund. M.M.D. acknowledges support from a Merck Award for Genome-Related Research.
Additional information
[Reviewing Editor: Dr. Lauren Meyers]
About this article
Cite this article
Plotkin, J.B., Dushoff, J., Desai, M.M. et al. Codon Usage and Selection on Proteins. J Mol Evol 63, 635–653 (2006). https://doi.org/10.1007/s00239-005-0233-x
DOI: https://doi.org/10.1007/s00239-005-0233-x