Estimating the pattern of nucleotide substitution

Journal of Molecular Evolution

Abstract

Knowledge of the pattern of nucleotide substitution is important both to our understanding of molecular sequence evolution and to reliable estimation of phylogenetic relationships. The method of parsimony analysis, which has been used to estimate substitution patterns in real sequences, has serious drawbacks and leads to results that are difficult to interpret. In this paper a model-based maximum likelihood approach is proposed for estimating substitution patterns in real sequences. Nucleotide substitution is assumed to follow a homogeneous Markov process, and the general reversible process model (REV) and the unrestricted model without the reversibility assumption are used. These models are also applied to examine the adequacy of the model of Hasegawa et al. (J. Mol. Evol. 1985;22:160–174) (HKY85). Two data sets are analyzed. For the ψη-globin pseudogenes of six primate species, the REV model fits the data much better than HKY85, while, for a segment of mtDNA sequences from nine primates, REV cannot provide a significantly better fit than HKY85 when rate variation over sites is taken into account in the models. It is concluded that the use of the REV model in phylogenetic analysis can be recommended, especially for large data sets or for sequences with extreme substitution patterns, while HKY85 may be expected to provide a good approximation. The use of the unrestricted model does not appear to be worthwhile.
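
To make the relationship between the two reversible models concrete, the following is a minimal sketch, not the paper's implementation. It assumes the common parameterization in which the off-diagonal rate from base i to base j is a symmetric exchangeability times the equilibrium frequency of j, with HKY85 obtained as the special case that uses a single transition/transversion ratio. The base ordering `TCAG`, the function names, and all parameter values are illustrative assumptions; none are estimates from the article. Transition probabilities P(t) = exp(Qt) are computed with a simple scaling-and-squaring Taylor series so that the example needs only the standard library.

```python
import math

BASES = "TCAG"  # assumed ordering for this sketch


def rev_rate_matrix(exch, freqs):
    """Reversible rate matrix with q_ij = s_ij * pi_j (i != j).

    exch maps unordered base pairs (e.g. "TC") to exchangeabilities s_ij;
    freqs maps each base to its equilibrium frequency pi.
    Rows sum to zero and the mean substitution rate is scaled to 1.
    """
    q = [[0.0] * 4 for _ in range(4)]
    for i, a in enumerate(BASES):
        for j, b in enumerate(BASES):
            if i != j:
                s = exch.get(a + b, exch.get(b + a))
                q[i][j] = s * freqs[b]
        q[i][i] = -sum(q[i])  # diagonal is still 0 here, so this is the off-diagonal sum
    mean_rate = -sum(freqs[a] * q[i][i] for i, a in enumerate(BASES))
    return [[x / mean_rate for x in row] for row in q]


def hky85_rate_matrix(kappa, freqs):
    """HKY85 as a special case of REV: transitions get kappa, transversions 1."""
    transitions = {"TC", "AG"}
    exch = {}
    for i, a in enumerate(BASES):
        for b in BASES[i + 1:]:
            exch[a + b] = kappa if a + b in transitions else 1.0
    return rev_rate_matrix(exch, freqs)


def _matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transition_probs(q, t, squarings=10, terms=12):
    """P(t) = exp(Q t) via a truncated Taylor series plus repeated squaring."""
    scale = t / 2 ** squarings
    m = [[x * scale for x in row] for row in q]
    identity = [[float(i == j) for j in range(4)] for i in range(4)]
    p, term = identity, identity
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in _matmul(term, m)]
        p = [[p[i][j] + term[i][j] for j in range(4)] for i in range(4)]
    for _ in range(squarings):
        p = _matmul(p, p)
    return p


def jc69_same_base_prob(t):
    """Closed-form P_ii(t) under JC69 with mean rate 1, used as a sanity check."""
    return 0.25 + 0.75 * math.exp(-4.0 * t / 3.0)


# Hypothetical parameter values, chosen only for illustration.
freqs = {"T": 0.3, "C": 0.2, "A": 0.3, "G": 0.2}
p_hky = transition_probs(hky85_rate_matrix(4.0, freqs), t=0.5)
```

With equal base frequencies and `kappa = 1`, `hky85_rate_matrix` reduces to the Jukes and Cantor (1969) model, so `transition_probs` can be checked against `jc69_same_base_prob`. Passing six distinct exchangeabilities to `rev_rate_matrix` gives the general reversible case.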

References

  • Barry D, Hartigan JA (1987) Statistical analysis of hominoid molecular evolution. Stat Sci 2:191–210

  • Brown WM, Prager EM, Wang A, Wilson AC (1982) Mitochondrial DNA sequences of primates: tempo and mode of evolution. J Mol Evol 18:225–239

  • Dayhoff MO (1978) Atlas of protein sequence and structure, vol 5, suppl 3. National Biomedical Research Foundation, Washington, DC, pp 347

  • Felsenstein J (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol 17:368–376

  • Gojobori T, Yokoyama S (1987) Molecular evolutionary rates of oncogenes. J Mol Evol 26:148–156

  • Gojobori T, Ishii K, Nei M (1982a) Estimation of average number of nucleotide substitutions when the rate of substitution varies with nucleotides. J Mol Evol 18:414–423

  • Gojobori T, Li W-H, Graur D (1982b) Patterns of nucleotide substitution in pseudogenes and functional genes. J Mol Evol 18:360–369

  • Goldman N (1990) Maximum likelihood inference of phylogenetic trees, with special reference to Poisson process models of DNA substitution and to parsimony analysis. Syst Zool 39:345–361

  • Goldman N (1993) Statistical tests of models of DNA substitution. J Mol Evol 36:182–198

  • Hasegawa M, Kishino H, Yano T (1985) Dating the human-ape splitting by a molecular clock of mitochondrial DNA. J Mol Evol 22:160–174

  • Imanishi T, Gojobori T (1992) Patterns of nucleotide substitutions inferred from the phylogenies of the class I major histocompatibility complex genes. J Mol Evol 35:196–204

  • Jukes TH, Cantor CR (1969) Evolution of protein molecules. In: Munro HN (ed) Mammalian protein metabolism. Academic Press, New York, pp 21–123

  • Kimura M (1980) A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol 16:111–120

  • Kimura M (1981) Estimation of evolutionary distances between homologous nucleotide sequences. Proc Natl Acad Sci USA 78:454–458

  • Kishino H, Miyata T, Hasegawa M (1990) Maximum likelihood inference of protein phylogeny and the origin of chloroplasts. J Mol Evol 31:151–160

  • Lanave C, Preparata G, Saccone C, Serio G (1984) A new method for calculating evolutionary substitution rates. J Mol Evol 20:86–93

  • Li W-H, Wu C-I, Luo C-C (1984) Nonrandomness of point mutation as reflected in nucleotide substitutions in pseudogenes and its evolutionary implications. J Mol Evol 21:58–71

  • Li W-H, Wu C-I, Luo C-C (1985) Evolution of DNA sequences. In: MacIntyre J (ed) Molecular evolutionary genetics. Plenum Press, New York, pp 1–94

  • Miyamoto MM, Slightom JL, Goodman M (1987) Phylogenetic relations of humans and African apes from DNA sequences in the ψη-globin region. Science 238:369–373

  • Moriyama EN, Ina Y, Ikeo K, Shimizu N, Gojobori T (1991) Mutation pattern of human immunodeficiency virus genes. J Mol Evol 32:360–363

  • Navidi WC, Churchill GA, von Haeseler A (1991) Methods for inferring phylogenies from nucleic acid sequence data by using maximum likelihood and linear invariants. Mol Biol Evol 8:128–143

  • Reeves JH (1992) Heterogeneity in the substitution process of amino acid sites of proteins coded for by mitochondrial DNA. J Mol Evol 35:17–31

  • Ritland K, Clegg MT (1987) Evolutionary analysis of plant DNA sequences. Am Nat 130:S74–S100

  • Rodriguez F, Oliver JF, Marin A, Medina JR (1990) The general stochastic model of nucleotide substitutions. J Theor Biol 142:485–501

  • Tamura K (1992) Estimation of the number of nucleotide substitutions when there are strong transition-transversion and G + C-content biases. Mol Biol Evol 9:678–687

  • Tamura K, Nei M (1993) Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol 10:512–526

  • Tavaré S (1986) Some probabilistic and statistical problems in the analysis of DNA sequences. In: Lectures on mathematics in the life sciences, vol 17. pp 57–86

  • Thompson E (1975) Human evolutionary trees. Cambridge University Press, Cambridge

  • Wilbur WJ (1985) On the PAM matrix model of protein evolution. Mol Biol Evol 2:434–447

  • Yang Z (1992) Variations of substitution rates and estimation of evolutionary distances of DNA sequences. PhD Thesis, Beijing Agricultural University, Beijing

  • Yang Z (1993) Maximum likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites. Mol Biol Evol 10:1396–1401

  • Yang Z, Goldman N (in press) Evaluation and extension of Markov process models of nucleotide substitution. Acta Genetica Sinica

  • Yang Z, Goldman N, Friday AE (in press) Comparison of models for nucleotide substitution used in maximum likelihood phylogenetic estimation. Mol Biol Evol

Cite this article

Yang, Z. Estimating the pattern of nucleotide substitution. J Mol Evol 39, 105–111 (1994). https://doi.org/10.1007/BF00178256

  • DOI: https://doi.org/10.1007/BF00178256
