Research
Experimental and Statistical Considerations to Avoid False Conclusions in Proteomics Studies Using Differential In-gel Electrophoresis*

https://doi.org/10.1074/mcp.M600274-MCP200
Under a Creative Commons license
open access

In quantitative proteomics, the false discovery rate (FDR) can be defined as the expected proportion of false positives among the changes in expression that are called statistically significant. False positives accumulate when expression changes across hundreds or thousands of protein or peptide species are tested simultaneously with univariate tests such as Student's t test. Currently most researchers rely solely on p values and a significance threshold, but this approach can produce many false positives because it does not account for multiple testing. For each species, a measure of significance in terms of the FDR can be calculated, producing individual q values. The q value maintains power by allowing the investigator to achieve an acceptable balance of true and false positives within the calls of significance. The q value approach relies on the use of the correct statistical test for the experimental design. With the correct test, the p values should follow a uniform frequency distribution when there are no differences in expression between two samples. Here we report a bias in the p value distribution in a three-dye DIGE experiment in which no changes in expression are occurring. The bias was shown to arise from correlation in the data caused by the use of a common internal standard. With a two-dye schema, in which each sample has its own internal standard, the bias was removed, enabling the application of the q value to two different proteomics studies. In the first study, we demonstrate that 80% of calls of significance by the more traditional method are false positives. In the second, we show that calculating the q value gives the user control over the FDR. These studies demonstrate the power and ease of use of the q value in correcting for multiple testing. This work also highlights the need for robust experimental design that includes the appropriate application of statistical procedures.
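To illustrate how an FDR-based measure differs from a raw p value threshold, the following minimal sketch computes Benjamini-Hochberg adjusted p values, which correspond to q values under the conservative assumption that every null hypothesis could be true. This is a simplified stand-in and not necessarily the procedure used in the study. The function name `bh_q_values` and the ten p values are hypothetical, chosen only for illustration; they are not data from the paper.

```python
import numpy as np

def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p values.

    Assumption: the proportion of true nulls is taken as 1, so these
    are conservative q values (a simplification, not the paper's method).
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                       # indices of p values, ascending
    scaled = p[order] * m / np.arange(1, m + 1)  # p * m / rank
    # Make q non-decreasing in rank: take the running minimum from the largest p down.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(scaled, 0.0, 1.0)        # restore the original order
    return q

# Hypothetical p values for ten protein species (illustrative only).
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
q = bh_q_values(p)

n_sig_p = int((np.asarray(p) < 0.05).sum())  # calls using p < 0.05 alone
n_sig_q = int((q < 0.05).sum())              # calls using q < 0.05
print(n_sig_p, n_sig_q)
```

In this toy input, five species pass a p value threshold of 0.05 but only two pass a q value threshold of 0.05, which shows in miniature how thresholding on p values alone can overstate the number of reliable calls.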


Published, MCP Papers in Press, May 17, 2007, DOI 10.1074/mcp.M600274-MCP200

1

The abbreviations used are: 2D, two-dimensional; FDR, false discovery rate; SA, standardized abundance; ECA, E. carotovora; PCER, per comparison error rate; FWER, familywise error rate; Q, quantile.

2

N. A. Karp, P. S. McCormick, M. R. Russell, and K. S. Lilley, unpublished observation.

*

This work was supported in part by Biotechnology and Biological Sciences Research Council (BBSRC) Grant BB/C50694/1. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The on-line version of this article (available at http://www.mcponline.org) contains supplemental material.

§

A BBSRC research associate supported by BBSRC Grant BB/C50694/1.

Supported by Unilever.

**

Supported by a BBSRC Strategic Studentship BBS/Q/Q/2004/05630.