Abstract
More introns lie between codons (phase 0) than between the first and second bases (phase 1) or between the second and third bases (phase 2) of a codon. Many explanations have been suggested for this excess of phase 0 introns. It has, for example, been argued to reflect an ancient utility for introns in separating exons that code for separate protein modules. There may, however, be a simple alternative explanation. For correct splicing, introns typically require particular nucleotides immediately 5′ in the preceding exon (typically a G) and immediately 3′ in the following exon (also often a G). Introns therefore tend to be found between particular nucleotide pairs (e.g., G|G pairs) in the coding sequence. If, owing to bias in the usage of different codons, these pairs are especially common at phase 0, then intron phase biases may have a trivial explanation. Here we take codon usage frequencies for a variety of eukaryotes and use them to generate random sequences. We then determine the phase of putative intron insertion sites. Importantly, in all simulated data sets the intron phase distribution is biased in favor of phase 0. In many cases the bias is of the magnitude observed in real data and can be attributed to codon usage bias. It is also known that exons may carry either the same phase (symmetric) or different phases (asymmetric) at their opposite ends. We simulated the distribution of exon types using the intron phase frequencies observed in real genes, assuming random combination of intron phases at the opposite ends of exons. Surprisingly, the simulated pattern was quite similar to that observed. In the simulants we typically observe a prevalence of symmetric exons carrying phase 0 at both ends, which is common for eukaryotic genes. However, at least in some species, the extent of the bias in favor of symmetric (0,0) exons is not as great in simulants as in real genes.
These results emphasize the need to construct a biologically relevant null model of successful intron insertion.
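The simulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, the codon usage table is an invented toy example rather than data from any species, and only G|G pairs are treated as putative insertion sites. The sketch draws codons independently according to their usage weights, records the phase of every G|G site, and then computes the exon class frequencies expected if the phases at the two ends of an exon combine at random.

```python
import random
from itertools import product

# Hypothetical toy codon usage table (relative weights), invented for
# illustration only; it is NOT taken from any real species or from the paper.
TOY_CODON_USAGE = {"GAG": 3, "AAG": 3, "CTG": 2, "GGC": 2,
                   "GCC": 2, "ACC": 1, "TTT": 1}


def simulate_cds(codon_usage, n_codons, rng):
    """Build a random coding sequence by drawing codons independently,
    each with probability proportional to its usage weight."""
    codons = list(codon_usage)
    weights = [codon_usage[c] for c in codons]
    return "".join(rng.choices(codons, weights=weights, k=n_codons))


def count_site_phases(cds, left="G", right="G"):
    """Count putative insertion sites (left|right pairs) by phase.

    A site between positions i-1 and i (0-based) has i bases upstream,
    so its phase is i % 3: 0 = between codons, 1 = after the first base
    of a codon, 2 = after the second base.
    """
    counts = {0: 0, 1: 0, 2: 0}
    for i in range(1, len(cds)):
        if cds[i - 1] == left and cds[i] == right:
            counts[i % 3] += 1
    return counts


def phase_fractions(counts):
    """Convert phase counts to fractions (all zeros if no site was found)."""
    total = sum(counts.values())
    return {p: (n / total if total else 0.0) for p, n in counts.items()}


def expected_exon_classes(phase_freqs):
    """Expected frequency of each (5' phase, 3' phase) exon class, assuming
    the phases at the two ends of an exon are independent."""
    phases = sorted(phase_freqs)
    return {(a, b): phase_freqs[a] * phase_freqs[b]
            for a, b in product(phases, repeat=2)}


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    cds = simulate_cds(TOY_CODON_USAGE, 100_000, rng)
    fractions = phase_fractions(count_site_phases(cds))
    print("G|G site phase fractions:", fractions)
    print("Expected (0,0) exon fraction:",
          expected_exon_classes(fractions)[(0, 0)])
```

Substituting a real codon usage table for the toy one, and comparing the resulting fractions with observed intron phase and exon class frequencies, would give the kind of null expectation the abstract argues for. The independence assumption between adjacent codons is a simplification of this sketch.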
Acknowledgments
We thank E. Koonin for discussion and the opportunity to read a relevant paper before publication. We are also grateful to both anonymous reviewers for critical comments and useful information.
Additional information
Reviewing Editor: Dr. Manyuan Long
Cite this article
Ruvinsky, A., Eskesen, S., Eskesen, F. et al. Can Codon Usage Bias Explain Intron Phase Distributions and Exon Symmetry? J Mol Evol 60, 99–104 (2005). https://doi.org/10.1007/s00239-004-0032-9