Codon Usage and Selection on Proteins

Journal of Molecular Evolution

Abstract

Selection pressures on proteins are usually measured by comparing homologous nucleotide sequences (Zuckerkandl and Pauling 1965). Recently we introduced a novel method, termed volatility, to estimate selection pressures on proteins on the basis of their synonymous codon usage (Plotkin and Dushoff 2003; Plotkin et al. 2004). Here we provide a theoretical foundation for this approach. Under the Fisher-Wright model, we derive the expected frequencies of synonymous codons as a function of the strength of selection on amino acids, the mutation rate, and the effective population size. We analyze the conditions under which we can expect to draw inferences from biased codon usage, and we estimate the time scales required to establish and maintain such a signal. We find that synonymous codon usage can reliably distinguish between negative selection and neutrality only for organisms, such as some microbes, that experience large effective population sizes or periods of elevated mutation rates. The power of volatility to detect positive selection is also modest—requiring approximately 100 selected sites—but it depends less strongly on population size. We show that phenomena such as transient hyper-mutators can improve the power of volatility to detect selection, even when the neutral site heterozygosity is low. We also discuss several confounding factors, neglected by the Fisher-Wright model, that may limit the applicability of volatility in practice.
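
The abstract does not restate how volatility is computed. As a rough illustration only, the sketch below assumes the definition we recall from the earlier volatility papers cited above: the volatility of a codon is the fraction of its non-stop point-mutation neighbors that encode a different amino acid. The function names `volatility` and `mean_volatility` are hypothetical, and the standard genetic code is assumed; this is not the authors' implementation.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: nuclear code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def volatility(codon):
    """Fraction of non-stop single-nucleotide neighbors encoding a different amino acid."""
    aa = CODE[codon]
    if aa == "*":
        raise ValueError("volatility is not defined for stop codons")
    # All nine codons that differ from `codon` at exactly one position.
    neighbors = [
        codon[:i] + b + codon[i + 1:]
        for i in range(3)
        for b in BASES
        if b != codon[i]
    ]
    # Stop codons are excluded from the neighbor set.
    sense = [n for n in neighbors if CODE[n] != "*"]
    return sum(CODE[n] != aa for n in sense) / len(sense)


def mean_volatility(seq):
    """Average codon volatility over an in-frame coding sequence without stop codons."""
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    return sum(volatility(c) for c in codons) / len(codons)


# Two synonymous arginine codons with different volatility.
print(volatility("CGA"))  # 0.5
print(volatility("AGA"))  # 0.75
```

Under this definition, synonymous codons for the same amino acid (here CGA and AGA, both arginine) can differ in volatility, which is the property that lets synonymous codon usage carry information about selection on amino acids.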

References

  • Akashi H (2001) Gene expression and molecular evolution. Curr Opin Genet Dev 11:660–666

  • Akashi H, Schaeffer SW (1997) Natural selection and the frequency distribution of “silent” DNA polymorphism in Drosophila. Genetics 146:295–307

  • Anisimova M, Bielawski JP, Yang Z (2001) The accuracy and power of likelihood ratio tests to detect positive selection at amino acid sites. Mol Biol Evol 18:1585–1592

  • Berg O (1996) Selection intensity for codon bias and the effective population size of Escherichia coli. Genetics 142:1379–1382

  • Bjedov I, Tenaillon O, Gerard B, et al. (2003) Stress-induced mutagenesis in bacteria. Science 300:1404–1409

  • Bulmer M (1991) The selection-mutation-drift theory of synonymous codon usage. Genetics 129:897–907

  • Bustamante CD, Wakeley J, Sawyer S, Hartl DL (2001) Directional selection and the site-frequency spectrum. Genetics 159:1779–1788

  • Chamary JV, Parmley JL, Hurst LD (2006) Hearing silence: non-neutral evolution at synonymous sites in mammals. Nature Rev Genet 7:98–108

  • Charlesworth B, Morgan MT, Charlesworth D (1993) The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289–1303

  • Clark A, Glanowski S, Nielsen R, Thomas P, Kejariwal A, Todd MA, Tanenbaum D, Civello D, Lu F, Murphy B, Ferriera S, Wang G, Zheng X, White T, Sninsky J, Adams M, Cargill M (2003) Inferring nonneutral evolution from human-chimp-mouse orthologous gene trios. Science 302:1960–1963

  • Coghlan A, Wolfe KH (2000) Relationship of codon bias to mRNA concentration and protein length in Saccharomyces cerevisiae. Yeast 16:1131–1145

  • Crow JF, Kimura M (1970) An introduction to population genetics theory. Burgess, Minneapolis

  • Dagan T, Graur D (2004) The comparative method rules! codon volatility cannot detect positive Darwinian selection using a single genome sequence. Mol Biol Evol 22:496–500

  • Debry R, Marzluff WF (1994) Selection on silent sites in the rodent H3 histone gene family. Genetics 138:191–202

  • Denamur E, Lecointre G, Darlu P, Tenaillon O, Acquaviva C, Sayada C, Sunjevaric I (2000) Evolutionary implications of the frequent horizontal transfer of mismatch repair genes. Cell 103:711–721

  • Drummond DA, Bloom JD, Adami C, Wilke CO, Arnold FH (2005) Why highly expressed proteins evolve slowly. Proc Natl Acad Sci USA 102:14338–14343

  • Drummond DA, Raval A, Wilke CO (2006) A single determinant dominates the rate of yeast protein evolution. Mol Biol Evol 23:327–337

  • Ewens W (2004) Mathematical population genetics I. Springer-Verlag, New York

  • Fleischmann RD, Alland D, Eisen JA, Carpenter L, White O, Peterson J, DeBoy R, Dodson R, Gwinn M, Haft D, Hickey E, Kolonay JF, Nelson WC, Umayam LA, Ermolaeva M, Salzberg SL, Delcher A, Utterback T, Weidman J, Khouri H, Gill J, Mikula A, Bishai W, Jacobs WR, Venter JC, Fraser CM (2002) Whole-genome comparison of Mycobacterium tuberculosis clinical and laboratory strains. J Bacteriol 184:5479–5490

  • Friedman R, Hughes AL (2005) Codon volatility as an indicator of positive selection: Data from eukaryotic genome comparisons. Mol Biol Evol 22:542–543

  • Gibbons RJ, Kapsimalis B (1967) Estimates of the overall rate of growth of the intestinal microflora of hamsters, guinea pigs, and mice. J Bacteriol 93:510–512

  • Gillespie J (2001) Is the population size of a species relevant to its evolution? Evolution 55:2161–2169

  • Giraud A, Radman M, Matic I, Taddei F (2001) The rise and fall of mutator bacteria. Curr Opin Microbiol 4:582–585

  • Golding GB, Strobeck C (1982) Expected frequencies of codon use as a function of mutation rates and codon fitnesses. J Mol Evol 18:379–386

  • Goldman N, Yang Z (1994) Codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol Biol Evol 11:725–736

  • Hahn MW, Mezey JG, Begun DJ, Gillespie JH, Kern AD, Langley CH, Moyle LC (2005) Codon bias and selection on single genomes. Nature 433:E1

  • Hartl DL, Sawyer SA (1994) Selection intensity for codon bias. Genetics 138:227–234

  • Higgs P (1994) Error thresholds and stationary mutant distributions in multi-locus diploid genetics models. Genet Res Cambr 63:63–78

  • Hirsh AE, Fraser HB, Wall DP (2005) Adjusting for selection on synonymous sites in estimates of evolutionary distance. Mol Biol Evol 22:174–177

  • Ikemura T (1981) Correlation between the abundance of Escherichia coli transfer-RNAs and the occurrence of the respective codons in its protein. J Mol Biol 146:1–21

  • Kellis M, Patterson N, Endrizzi M, Birren B, Lander E (2003) Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423:241–254

  • Kimura M, Crow JF (1964) The number of alleles that can be maintained in a finite population. Genetics 49:725–738

  • King JL, Jukes TH (1969) Non-Darwinian evolution. Science 164:788

  • Konopka AJ (1985) Theory of degenerate coding and informational parameters of protein coding genes. Biochimie 67:455–468

  • Kreitman M (2000) Methods to detect selection in populations with applications to the human. Annu Rev Genomics Hum Genet 1:539–559

  • LeClerc JE, Li B, Payne WL, Cebula TA (1996) High mutation frequencies among Escherichia coli and Salmonella pathogens. Science 274:1208–1211

  • Li WH (1993) Unbiased estimation of the rates of synonymous and nonsynonymous substitution. J Mol Evol 36:96–99

  • Lynch M, Conery JS (2003) The origins of genome complexity. Science 302:1401–1404

  • Maynard Smith J, Haigh J (1974) The hitch-hiking effect of a favorable gene. Genet Res Cambr 23:23–35

  • McDonald JH, Kreitman M (1991) Adaptive protein evolution at the ADH locus in Drosophila. Nature 351:652–654

  • Miyata T, Miyazawa S, Yasunaga T (1979) Two types of amino acid substitutions in protein evolution. J Mol Evol 12:219–236

  • Meyers LA, Ancel FD, Lachmann M (2005) Evolution of genetic potential. PLoS Comput Biol 1:236–243

  • Nagylaki T (1992) Introduction to theoretical population genetics. Springer, Berlin

  • Nielsen R, Hubisz M (2005) Detecting selection needs comparative data. Nature 433:E6

  • Notley-McRobb L, Seeto S, Ferenci T (2001) Enrichment and elimination of mutY mutators in Escherichia coli populations. Genetics 162:1055–1062

  • Ochman H, Elwyn S, Moran NA (1999) Calibrating bacterial evolution. Proc Natl Acad Sci USA 96:12638–12643

  • Oliver A, Canton R, Campo P, Baquero F, Blazquez J (2000) High frequency of hypermutable Pseudomonas aeruginosa in cystic fibrosis lung infection. Science 288:1251–1254

  • Pal C, Papp B, Hurst LD (2001) Highly expressed genes in yeast evolve slowly. Genetics 158:927–931

  • Plotkin JB, Dushoff J (2003) Codon bias and frequency-dependent selection on the hemagglutinin epitopes of Influenza A virus. Proc Natl Acad Sci USA 100:7152–7157

  • Plotkin JB, Dushoff J, Fraser HB (2004) Detecting selection using a single genome sequence of M. tuberculosis and P. falciparum. Nature 428:942–946

  • Plotkin JB, Dushoff J, Fraser HB (2005) Codon bias and selection on single genomes: reply. Nature 433:E7–E8

  • Plotkin JB, Fraser HB, Dushoff J (2006) Natural selection on the genome of Saccharomyces cerevisiae (in preparation)

  • Sanjuan R, Moya A, Elena S (2004) The distribution of fitness effects caused by single-nucleotide substitutions in an RNA virus. Proc Natl Acad Sci USA 101:8396–8401

  • Sawyer SA, Hartl DL (1992) Population genetics of polymorphism and divergence. Genetics 132:1161–1176

  • Sharp PM (2005) Gene “volatility” is most unlikely to reveal adaptation. Mol Biol Evol 22:807–809

  • Sharp PM, Li WH (1987) The codon adaptation index: A measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res 15:1281–1295

  • Simonsen KL, Churchill GA, Aquadro CF (1995) Properties of statistical tests of neutrality for DNA polymorphism data. Genetics 141:413–429

  • Sorensen M, Kurland C, Pedersen S (1989) Codon usage determines translation rate in Escherichia coli. J Mol Biol 207:365–377

  • Stoletzki N, Welch J, Hermisson J, Eyre-Walker A (2005) A dissection of volatility in yeast. Mol Biol Evol 22:2022–2026

  • Tajima F (1996) The amount of DNA polymorphism maintained in a finite population when the neutral mutation rate varies among sites. Genetics 143:1457–1465

  • Tang H, Wyckoff GJ, Lu J, Wu C (2004) Universal evolutionary index for amino acid changes. Mol Biol Evol 21:1548–1556

  • Tenaillon O, Denamur E, Matic I (2004) Evolutionary significance of stress-induced mutagenesis in bacteria. Trends Microbiol 12:264–270

  • Thompson CJ, McBride JL (1973) On Eigen’s theory of the self-organization of matter and the evolution of biological macromolecules. Math Biosci 21:127–142

  • van Nimwegen E, Crutchfield J, Huynen M (1999) Neutral evolution of mutational robustness. Proc Natl Acad Sci USA 96:9716–9720

  • Wertman K, Drubin D, Botstein D (1992) Systematic mutational analysis of the yeast ACT1 gene. Genetics 132:337–350

  • Wilke C (2001) Adaptive evolution on neutral networks. Bull Math Biol 63:715–730

  • Winter G, Kawai S, Haeggstrom M, Kaneko O, von Euler A, Kawazu S, Palm D, Fernandez V, Wahlgren M (2005) SURFIN is a polymorphic antigen expressed on Plasmodium falciparum merozoites and infected erythrocytes. J Exp Med 201:1853–1863

  • Wloch DM, Szafraniec K, Borts RH, Korona R (2001) Direct estimate of the mutation rate and the distribution of fitness effects in the yeast Saccharomyces cerevisiae. Genetics 159:441–452

  • Wright S (1931) Evolution in Mendelian populations. Genetics 16:97–159

  • Yampolsky LY, Stoltzfus A (2004) The exchangeability of amino acids in proteins. Genetics 170:1459–1472

  • Yang Z (2000) Maximum likelihood estimation on large phylogenies and analysis of adaptive evolution in human influenza virus A. J Mol Evol 51:423–432

  • Yang Z, Nielsen R, Goldman N, Pedersen AMK (2000) Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431–449

  • Zhang J (2004) On the evolution of codon volatility. Genetics 169:495–501

  • Zuckerkandl E, Pauling L (1965) Molecules as documents of evolutionary history. J Theor Biol 8:357–366

  • Zeyl C, DeVisser J (2001) Estimates of the rate and distribution of fitness effects of spontaneous mutation in Saccharomyces cerevisiae. Genetics 157:53–61

Acknowledgments

We thank Daniel Fisher, Andrew Murray, and Michael Turelli for their input during the preparation of the manuscript. We also thank an anonymous referee for substantial conceptual input. J.B.P. acknowledges support from the Harvard Society of Fellows, the Milton Fund, and the Burroughs Wellcome Fund. M.M.D. acknowledges support from a Merck Award for Genome-Related Research.

Author information

Corresponding author

Correspondence to Joshua B. Plotkin.

Additional information

[Reviewing Editor: Dr. Lauren Meyers]

About this article

Cite this article

Plotkin, J.B., Dushoff, J., Desai, M.M. et al. Codon Usage and Selection on Proteins. J Mol Evol 63, 635–653 (2006). https://doi.org/10.1007/s00239-005-0233-x

  • DOI: https://doi.org/10.1007/s00239-005-0233-x
