Understanding relationship between sequence and functional evolution in yeast proteins

  • Original Paper
Genetica

Abstract

The relationship between functional variables and sequence evolutionary rates is often assessed by partial correlation analysis. However, this strategy is impeded by the difficulty of conducting meaningful statistical analysis on noisy biological data. A recent study suggested that partial correlation analysis is misleading when data are noisy and that principal component regression is a better tool for analyzing biological data. In this paper, we evaluate how these two statistical tools (partial correlation and principal component regression) perform when data are noisy. Contrary to the earlier conclusion, we found that the two tools perform comparably in most cases. Furthermore, when there is more than one ‘true’ independent variable, partial correlation analysis delivers a better representation of the data. Employing both tools may provide a more complete and complementary representation of real data. In this light, and with new analyses, we suggest that protein length and gene dispensability play significant, independent roles in yeast protein evolution.
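To make the two tools named in the abstract concrete, the following is a minimal sketch in Python with NumPy, run on simulated data. It is not the authors' analysis: the variable names (`a_obs`, `b_obs`, `rate`), the effect sizes, and the noise levels are all assumptions chosen for illustration. Two independent 'true' predictors drive a response, and both predictors are observed with measurement noise.

```python
import numpy as np


def residuals(v, Z):
    """Residuals of v after least-squares regression on Z plus an intercept."""
    A = np.column_stack([np.ones(len(v)), Z])
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ coef


def partial_corr(x, y, Z):
    """Pearson correlation of x and y after removing the linear effect of Z."""
    rx, ry = residuals(x, Z), residuals(y, Z)
    return float(np.corrcoef(rx, ry)[0, 1])


def pcr(X, y):
    """Principal component regression on standardized predictors.

    Returns (loadings, r2): loadings[k] holds the weights of component k on
    the columns of X, and r2[k] is the fraction of the variance of y that
    component k explains.
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    scores = U * s  # component scores, one column per component
    # Components are mutually orthogonal, so each component's share of
    # explained variance is its squared correlation with y.
    r = scores.T @ ys / (len(y) * scores.std(axis=0))
    return Vt, r ** 2


# Simulated data (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
n = 2000
a = rng.normal(size=n)  # first 'true' predictor
b = rng.normal(size=n)  # second 'true' predictor, independent of the first
rate = -0.6 * a - 0.3 * b + rng.normal(scale=0.5, size=n)

# Both predictors are observed with measurement noise.
a_obs = a + rng.normal(scale=0.5, size=n)
b_obs = b + rng.normal(scale=0.5, size=n)
X = np.column_stack([a_obs, b_obs])

pc_b = partial_corr(rate, b_obs, a_obs)
loadings, r2 = pcr(X, rate)

print("partial correlation of rate with b_obs, given a_obs:", round(pc_b, 3))
print("component loadings:\n", loadings)
print("variance in rate explained per component:", r2)
```

In this setup the partial correlation isolates the contribution of the second predictor directly, whereas principal component regression reports how much variance each composite component explains, and the loadings must then be inspected to see which observed variables contribute to it. This difference in what each tool reports is one way to read the abstract's suggestion that the two are complementary.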

Acknowledgements

We thank D. Allan Drummond and Claus Wilke for helpful personal communications, and Charles Warden for critical reading of the manuscript. SY is supported by funds from the Georgia Institute of Technology.

Author information

Corresponding author

Correspondence to Soojin V. Yi.

Electronic supplementary material

Below are the electronic supplementary materials.

ESM (PDF 186 kb)

About this article

Cite this article

Kim, SH., Yi, S.V. Understanding relationship between sequence and functional evolution in yeast proteins. Genetica 131, 151–156 (2007). https://doi.org/10.1007/s10709-006-9125-2

  • DOI: https://doi.org/10.1007/s10709-006-9125-2
