Abstract
Mass spectrometry (MS)-based shotgun proteomics can identify proteins even in complex biological samples. Protein abundances can then be estimated from the counts of tandem MS (MS/MS) spectra attributable to each protein, provided one accounts for the differential MS detectability of the contributing peptides. We developed a method, APEX, which calculates Absolute Protein EXpression levels based upon learned correction factors, MS/MS spectral counts and each protein's probability of correct identification. This protocol describes APEX-based calculations in three parts. (i) Using training data, peptide sequences and their sequence properties, a model is built to estimate MS detectability (Oi) for any given protein. (ii) Absolute protein abundances are calculated from spectral counts, identification probabilities and the learned Oi-values. (iii) Simple statistics allow calculation of differential expression between two distinct biological samples, i.e., measuring relative protein abundances. APEX-based protein abundances span 3–4 orders of magnitude, and the method is applicable to mixtures of hundreds to thousands of proteins.
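Parts (ii) and (iii) can be illustrated with a minimal Python sketch. It is not the authors' software. It assumes that each protein's spectral count is weighted by its identification probability, divided by its Oi-value, and normalized so that all proteins sum to an assumed total number of protein molecules; it also assumes a two-proportion Z-score as the "simple statistics" for differential expression. The function names, the input layout and the example numbers are hypothetical.

```python
from math import sqrt


def apex_abundances(proteins, total_molecules):
    """Estimate absolute abundances from spectral counts (assumed normalization).

    proteins: dict mapping protein name -> (n, p, o), where
        n = observed MS/MS spectral count,
        p = probability that the identification is correct,
        o = learned detectability correction factor (Oi).
    total_molecules: assumed total number of protein molecules per cell.
    """
    # Weight each count by identification confidence and correct for detectability.
    weighted = {name: n * p / o for name, (n, p, o) in proteins.items()}
    norm = sum(weighted.values())
    # Scale the fractions to the assumed total protein content.
    return {name: total_molecules * w / norm for name, w in weighted.items()}


def z_score(n1, total1, n2, total2):
    """Two-proportion Z-score for one protein observed in two samples.

    n1, n2: spectral counts of the protein in sample 1 and sample 2.
    total1, total2: total spectral counts in each sample.
    """
    f1 = n1 / total1
    f2 = n2 / total2
    f0 = (n1 + n2) / (total1 + total2)  # pooled proportion
    se = sqrt(f0 * (1 - f0) / total1 + f0 * (1 - f0) / total2)
    return (f1 - f2) / se


if __name__ == "__main__":
    # Hypothetical toy data: (spectral count, identification probability, Oi).
    toy = {
        "protA": (30, 1.0, 3.0),
        "protB": (10, 1.0, 1.0),
        "protC": (20, 0.5, 0.5),
    }
    print(apex_abundances(toy, total_molecules=4000))
    print(z_score(60, 1000, 40, 1000))
```

In the toy data, protA has three times the spectral count of protB but also three times the detectability, so the two receive the same abundance estimate. This is the kind of correction the Oi-values are meant to provide.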
Acknowledgements
C.V. acknowledges support by the International Human Frontier Science Program. We thank John Braisted and Srilatha Kuntumalla from JCVI for many useful discussions regarding the APEX calculations. This work was supported by grants from the Welch (F-1515) and Packard Foundations, the National Science Foundation and National Institutes of Health.
Cite this article
Vogel, C., Marcotte, E. Calculating absolute and relative protein abundance from mass spectrometry-based protein expression data. Nat Protoc 3, 1444–1451 (2008). https://doi.org/10.1038/nprot.2008.132
DOI: https://doi.org/10.1038/nprot.2008.132