Detecting Site-Specific Biochemical Constraints Through Substitution Mapping

Dutheil, Julien

doi:10.1007/s00239-008-9139-8

Published: 12 August 2008

Volume 67, pages 257–265, (2008)
Journal of Molecular Evolution

Julien Dutheil^1,2

Abstract

The neutral theory of molecular evolution states that most mutations are deleterious or neutral. As a result, the evolutionary rate of a given position in an alignment is a function of the level of constraint acting on that position. Inferring evolutionary rates from a set of aligned sequences is hence a powerful method to detect functionally and/or structurally important positions in a protein. Some positions, however, may be constrained while having a high substitution rate, provided these substitutions do not affect the biochemical property under constraint. Here, I introduce a new evolutionary rate measure accounting for the evolution of specific biochemical properties (e.g., volume, polarity, and charge). I then present a new statistical method based on the comparison of two rate measures: a site is said to be constrained for property X if it shows an unexpectedly high conservation of X given its total evolutionary rate. Compared to single-rate methods, the two-rate method offers several advantages: it (i) allows assessment of the significance of the constraint, (ii) provides information on the type of constraint acting on each position, and (iii) detects positions that are not proposed by previous methods. I apply this method to a 200-sequence data set of triosephosphate isomerase and report significant cases of positions constrained for polarity, volume, or charge. The three-dimensional localization of these positions shows that they are of potential interest to the molecular evolutionist and to the biochemist.
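The two-rate idea in the abstract can be illustrated with a small sketch. The abstract does not give the rate formulas or the statistical test, so everything below is an assumption made for illustration: substitutions are taken as already mapped onto a tree and supplied as (ancestral, derived) residue pairs, the total rate is a plain substitution count, the property rate is the summed absolute change in a coarse charge scale, and significance is approximated by resampling from substitutions pooled over all sites. The names `CHARGE`, `total_rate`, `property_rate`, and `conservation_p_value` are hypothetical, and the resampling null is a stand-in, not the test used in the article.

```python
import random

# Coarse side-chain charge at neutral pH (illustrative assumption;
# histidine is treated as neutral here).
CHARGE = {aa: 0 for aa in "ACDEFGHIKLMNPQRSTVWY"}
CHARGE.update({"D": -1, "E": -1, "K": 1, "R": 1})


def total_rate(subs):
    """Number of substitutions mapped at a site."""
    return len(subs)


def property_rate(subs, prop=CHARGE):
    """Summed absolute change in the property over the mapped substitutions."""
    return sum(abs(prop[b] - prop[a]) for a, b in subs)


def conservation_p_value(site_subs, pooled_subs, prop=CHARGE,
                         n_draws=10000, seed=0):
    """Estimate how often random substitutions conserve the property
    at least as well as the observed site.

    Each draw samples as many substitutions as the site carries from the
    pooled substitutions of all sites, so the total rate is held fixed.
    """
    rng = random.Random(seed)
    n = total_rate(site_subs)
    observed = property_rate(site_subs, prop)
    hits = 0
    for _ in range(n_draws):
        draw = [rng.choice(pooled_subs) for _ in range(n)]
        if property_rate(draw, prop) <= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_draws + 1)


# Hypothetical site: four substitutions, all between positively charged residues.
site = [("K", "R"), ("R", "K"), ("K", "R"), ("R", "K")]
# Hypothetical pool of substitutions observed across the whole alignment.
pool = site + [("A", "G"), ("K", "E"), ("D", "A"), ("E", "K")]
p = conservation_p_value(site, pool)
```

In this toy case the site has a nonzero total rate but zero charge change, which is the pattern the abstract describes as constraint on a property despite a high substitution rate. In a real analysis the p-values from all sites would also need a multiple-testing correction.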

Acknowledgments

This work was supported by Centre National de la Recherche Scientifique and Action Concertée Incitative “Informatique, Mathématiques et Physique pour la Biologie.” The author would like to thank Nicolas Galtier, Tal Pupko, Itay Mayrose, Adi Stern, Adi Doron, Eyal Privman, Nimrod Rubinstein, Ofir Cohen, Osnat Penn, David Burnstein, and Guillaume Achaz for helpful suggestions on this work, Nicolas Galtier for help with the writing of the manuscript, and Karine Jacquet for help with the ade4 package. This publication is contribution 2008-051 of the Institut des Sciences de l’Evolution de Montpellier (UMR 5554—CNRS).

Author information

Authors and Affiliations

Institut des Sciences de l’Évolution (UM2-CNRS), Université Montpellier 2, Place Eugène Bataillon, CC064, 34 095, Montpellier Cedex 5, France
Julien Dutheil
BiRC—Bioinformatics Research Center, University of Aarhus, Høegh-Guldbergs Gade 10, Building 1090, 8000, Aarhus C, Denmark
Julien Dutheil

Corresponding author

Correspondence to Julien Dutheil.

About this article

Cite this article

Dutheil, J. Detecting Site-Specific Biochemical Constraints Through Substitution Mapping. J Mol Evol 67, 257–265 (2008). https://doi.org/10.1007/s00239-008-9139-8

Received: 29 January 2008
Revised: 20 May 2008
Accepted: 09 June 2008
Published: 12 August 2008
Issue Date: September 2008
DOI: https://doi.org/10.1007/s00239-008-9139-8
