Abstract
The relationship between functional variables and sequence evolutionary rates is often assessed by partial correlation analysis. However, this strategy is impeded by the difficulty of conducting meaningful statistical analysis on noisy biological data. A recent study suggested that partial correlation analysis is misleading when data are noisy and that principal component regression analysis is a better tool for analyzing biological data. In this paper, we evaluate how these two statistical tools (partial correlation and principal component regression) perform when data are noisy. Contrary to the earlier conclusion, we found that the two tools perform comparably in most cases. Furthermore, when there is more than one ‘true’ independent variable, partial correlation analysis delivers a better representation of the data. Employing both tools may provide a more complete and complementary representation of real data. In this light, and with new analyses, we suggest that protein length and gene dispensability play significant, independent roles in yeast protein evolution.
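To make the two tools compared above concrete, the following is a minimal sketch on simulated data, not the analysis performed in the paper. The variable names (`true_x1`, `obs_x2`, `rate`), the effect sizes, and the noise levels are all assumptions chosen for illustration. Partial correlation is computed by correlating regression residuals, and principal component regression is taken here to mean regressing the response on each principal component of the standardized predictors.

```python
import numpy as np


def partial_correlation(x, y, z):
    """Correlation between x and y after regressing both on z (with intercept)."""
    design = np.column_stack([np.ones(len(z)), z])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


def principal_component_regression(predictors, response):
    """Return (R^2 of the response on each component, component loadings)."""
    # Standardize predictors so each contributes equally to the components.
    z = (predictors - predictors.mean(axis=0)) / predictors.std(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(z, rowvar=False))
    order = np.argsort(eigvals)[::-1]  # largest variance first
    loadings = eigvecs[:, order]
    scores = z @ loadings
    # Components are mutually uncorrelated, so per-component R^2 values add up
    # to the R^2 of the full regression on all predictors.
    r2 = np.array([
        np.corrcoef(scores[:, k], response)[0, 1] ** 2
        for k in range(scores.shape[1])
    ])
    return r2, loadings


# Simulated example (all values hypothetical): two correlated 'true'
# predictors both influence the rate, and both are observed with noise.
rng = np.random.default_rng(0)
n = 2000
true_x1 = rng.normal(size=n)
true_x2 = 0.5 * true_x1 + rng.normal(size=n)
rate = -1.0 * true_x1 - 0.5 * true_x2 + rng.normal(size=n)
obs_x1 = true_x1 + rng.normal(scale=0.5, size=n)
obs_x2 = true_x2 + rng.normal(scale=0.5, size=n)

# Association of the second predictor with the rate, controlling for the first.
partial_r = partial_correlation(obs_x2, rate, obs_x1)

# Variance in the rate explained by each principal component of the predictors.
r2, loadings = principal_component_regression(
    np.column_stack([obs_x1, obs_x2]), rate
)

print("partial r (x2, rate | x1):", round(partial_r, 3))
print("R^2 per principal component:", np.round(r2, 3))
print("loadings (rows: x1, x2):")
print(np.round(loadings, 3))
```

In this sketch the partial correlation gives one coefficient per predictor, while the principal component regression reports variance explained per component and leaves the attribution to individual predictors to be read off the loadings. That difference is one reason the two summaries can be complementary.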
Acknowledgements
We thank D. Allan Drummond and Claus Wilke for helpful personal communications, and Charles Warden for critical reading of the manuscript. SY is supported by funds from the Georgia Institute of Technology.
Cite this article
Kim, SH., Yi, S.V. Understanding relationship between sequence and functional evolution in yeast proteins. Genetica 131, 151–156 (2007). https://doi.org/10.1007/s10709-006-9125-2