Elsevier

Controlled Clinical Trials

Volume 19, Issue 6, December 1998, Pages 589-601

Power and Sample Size Calculations for Studies Involving Linear Regression

https://doi.org/10.1016/S0197-2456(98)00037-3

Abstract

This article presents methods for sample size and power calculations for studies involving linear regression. These approaches are applicable to clinical trials designed to detect a regression slope of a given magnitude or to studies that test whether the slopes or intercepts of two independent regression lines differ by a given amount. The investigator may either specify the values of the independent (x) variable(s) of the regression line(s) or determine them observationally when the study is performed. In the latter case, the investigator must estimate the standard deviation(s) of the independent variable(s). This study gives examples using this method for both experimental and observational study designs. Cohen’s method of power calculations for multiple linear regression models is also discussed and contrasted with the methods of this study. We have posted a computer program to perform these and other sample size calculations on the Internet (see http://www.mc.vanderbilt.edu/prevmed/psintro.htm). This program can determine the sample size needed to detect a specified alternative hypothesis with the required power, the power with which a specific alternative hypothesis can be detected with a given sample size, or the specific alternative hypotheses that can be detected with a given power and sample size. Context-specific help messages available on request make the use of this software largely self-explanatory.

Introduction

Clinical investigators sometimes wish to evaluate a continuous response measure in a cohort of patients randomized to one of several groups defined by increasing levels of some treatment. In performing sample size and power calculations for such studies, one reasonable approach models patient response as a linear function of dose, and poses power calculations in terms of detecting dose-response slopes of a given magnitude. Alternatively, we may wish to evaluate the dose-response curves of two different treatments and test whether the slopes of these curves differ. This article provides an easily used, accurate method for power and sample size calculations for such studies. We have posted an interactive self-documented program to perform these calculations on the Internet.

Other investigators have reviewed general methods for sample size and power calculations 1, 2, 3. Hintze [4] provided a method for designing studies to detect correlation coefficients of specified magnitudes that uses a computational algorithm of Guenther [5]. This method provides results that are perhaps less easily understood than those based on regression slope parameters, because many investigators can more readily interpret slopes than correlation coefficients. Kraemer and Thiemann [3] provide tables that permit exact sample size calculations for studies designed to detect correlation coefficients of a given magnitude. They also give formulas that permit using these tables for designs involving linear regression. Although accurate, these methods are less convenient than those that we have incorporated into an interactive computer program. Cohen [2] provided more complex methods for designs involving multiple linear regression and correlation analysis. Later in this study we describe these methods, which require expressing the alternative hypothesis in terms of its effect on the multiple correlation coefficient [6]. Hintze [4] has written software for deriving these calculations, but clinical investigators may find his methods somewhat difficult to use and interpret. Goldstein [7] and Iwane et al. [8] have reviewed other power and sample size software packages.

We study the effect of one variable on another by estimating the slope of the regression line between these variables. For example, we might compare the effects of a treatment at several dose levels. Suppose that we treat n patients, that the jth patient has response yj after receiving dose level xj, and that the expected value of yj given xj is γ + λxj. To test the null hypothesis that λ = 0 against a two-sided alternative hypothesis with type I error probability α, we must be able to answer the following three questions:

  1. How many patients must we study to detect a specific alternative hypothesis λ = λa with power 1 − β?

  2. With what power can we detect a specific alternative hypothesis λ = λa given observations on n study subjects?

  3. What alternative values of λa can we detect with power 1 − β if we study n patients?

Either observational or experimental studies may use this design. In the former, both {xj} and {yj} are attributes of the study subjects, and we intend to determine whether these two variables are correlated. In these studies, the investigator must also estimate σx, the predicted standard deviation of xj in the patients under study. In experiments, the investigator determines the values of {xj}. Typically, xj denotes a drug dose given at one of K distinct values w1, … , wK, with a proportion ck of the study subjects being assigned dose level wk.
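Question 1 above can be sketched numerically. The following is a minimal illustration using the large-sample normal approximation for the slope test (the exact t-based formulas are derived in the Appendix; the function name `n_for_slope` is ours, not the PS program's):

```python
import math
from statistics import NormalDist

def n_for_slope(lambda_a, sigma_x, sigma, alpha=0.05, power=0.80):
    """Approximate sample size to detect a slope lambda_a in simple
    linear regression, given sigma_x (SD of the x values) and sigma
    (SD of the regression errors).

    The slope estimate has SE(lambda_hat) ~ sigma / (sigma_x * sqrt(n));
    solving (z_{1-alpha/2} + z_{1-beta}) * SE = lambda_a for n gives
    the normal-approximation formula below.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil((z * sigma / (lambda_a * sigma_x)) ** 2)

# Example: detect a slope of 0.5 when sigma_x = sigma = 1
print(n_for_slope(0.5, 1.0, 1.0))  # -> 32
```

Because the approximation ignores the t-distribution's heavier tails, it slightly understates the required n for small samples; the exact method in the Appendix corrects this.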

The degree of dispersion of the response values about the regression line affects power and sample size calculations. A parameter that quantifies this dispersion is σ, the standard deviation of the regression errors. The regression error for the jth observation is the difference between the observed and expected response values for the jth subject. In other words, the regression error is the vertical distance between the observed response yj and the true regression line (see Fig. 1); σ is the standard deviation of these vertical distances. The values of σ, σx, σy, λ, and the correlation coefficient ρ are all interrelated. It is well known [6] that:

λ = ρσy/σx  (1)

and it is easily shown that:

σ = σy√(1 − ρ²) = λσx√(1/ρ² − 1) = √(σy² − λ²σx²)  (2)

Thus, when ρ = 1, the observations {xj} and {yj} are perfectly correlated and lie on a straight line with slope σy/σx; the regression errors are all zero (because the observed and expected responses are always equal) and hence σ = 0. When ρ = 0, xj and yj are uncorrelated, the expected regression line is flat (λ = 0), and the standard deviation of the regression errors equals the standard deviation of yj (i.e., σ = σy). Figure 2 illustrates the relationship between these parameters when 0 < ρ < 1. This figure shows simulated data for patients given treatments A and B under the assumption that the two treatments have identical means and standard deviations of the independent and response variables. They differ in that the correlation coefficient between response and independent variables is 0.9 for treatment A (black dots) and 0.6 for treatment B (open circles). Consequently, the responses to treatment A are more closely clustered around their (black) regression line than the responses to treatment B are around theirs (gray). Thus, the average regression error is less for treatment A than for treatment B and, hence, σ, the standard deviation of these errors, is less for treatment A than for treatment B. Power or sample size calculations require estimates of σx, λ, and σ. It is often difficult to estimate σ directly; however, we can obtain indirect estimates of σ using equation (2) whenever we are able to estimate ρ or σy. We derive power and sample size formulas for simple linear regression in the Appendix.
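The interrelationships among σ, σx, σy, λ, and ρ can be checked numerically. The snippet below verifies that the three expressions for σ in equation (2) agree, using illustrative parameter values of our own choosing:

```python
import math

# Illustrative values (not from the paper's examples)
rho, sigma_x, sigma_y = 0.9, 7.5, 4.0

# Equation (1): lambda = rho * sigma_y / sigma_x
lam = rho * sigma_y / sigma_x

# Equation (2): three equivalent expressions for sigma,
# the SD of the regression errors
s1 = sigma_y * math.sqrt(1 - rho**2)
s2 = lam * sigma_x * math.sqrt(1 / rho**2 - 1)
s3 = math.sqrt(sigma_y**2 - lam**2 * sigma_x**2)

assert math.isclose(s1, s2) and math.isclose(s1, s3)
print(lam, s1)
```

In practice this is how one obtains σ indirectly: estimate ρ (or σy) from pilot data or the literature, then apply equation (2).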

Suppose that we want to compare the slopes and intercepts of two independent regression lines. For example, we might wish to compare the effects of two different treatments at several dose levels. Suppose that treatments 1 and 2 are given to n1 and n2 patients, respectively, and that the jth subject who receives treatment i (i = 1 or 2) has response yij to treatment at dose level xij, where the expected value of yij is γi + λixij. We want to determine whether the responses to the treatments differ. Specifically, we intend to test the null hypotheses that γ1 = γ2 and λ1 = λ2. In this case, we must answer the three questions given earlier for alternative hypotheses concerning the magnitude of the differences in the y intercept and slope parameters for these two treatments. We derive power and sample size formulas for two treatment linear regression problems in the Appendix.
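The slope-comparison test can likewise be sketched with a normal approximation (the Appendix gives the exact t-based formulas; `power_slope_diff` is an illustrative name of ours):

```python
import math
from statistics import NormalDist

def power_slope_diff(delta, n1, n2, sx1, sx2, sigma, alpha=0.05):
    """Approximate power to detect a difference delta = lambda1 - lambda2
    between two independent regression slopes, assuming a common error
    SD sigma and x-value SDs sx1, sx2 in the two groups.

    SE(lambda1_hat - lambda2_hat) ~ sigma * sqrt(1/(n1*sx1^2) + 1/(n2*sx2^2)).
    """
    se = sigma * math.sqrt(1 / (n1 * sx1**2) + 1 / (n2 * sx2**2))
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    # One-tail approximation to two-sided power (adequate when power > 0.5)
    return NormalDist().cdf(delta / se - z_a)
```

As a sanity check, setting delta = 0 should return roughly α/2 (the probability of rejecting in one tail under the null), and power should grow with n1 and n2.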


Computer software

We have written a computer program to implement these and other sample size and power calculations [1] and have posted it, together with program documentation, on the Internet. The program runs under either Windows 95 or Windows NT operating systems. To obtain free copies open the http://www.mc.vanderbilt.edu/prevmed/psintro.htm page on the World Wide Web and follow instructions. The program, named PS, has a graphical user interface with hypertext help messages that make the use of the program largely self-explanatory.

Linear Regression in an Observational Study

A dieting program encourages patients to follow a specific diet and to exercise regularly. We want to determine whether the actual average time per day spent exercising is related to body mass index (BMI, in kilograms per square meter) after 6 months on this program. Previous experience suggests that the exercise time of participants has a standard deviation of σx = 7.5 minutes. Kuskowska-Wolk et al. [12] reported that the standard deviation of the BMI for their female study subjects was σy =

Linear regression using the pass program

One of the most popular commercially available power and sample size programs, PASS 6.0 [4, 8], provides a general approach to power calculations for multiple linear regression using the method of Cohen [2]. Let:

yj = γ + λ1x1j + λ2x2j + ⋯ + λkxkj + ϵj,  j = 1, … , J

denote a conventional multiple linear regression model in which the jth patient has a response variable yj and k covariates {x1j, x2j, … , xkj}. We intend to test the null hypothesis that λ1 = λ2 = ⋯ = λp = 0 for some p ≤ k
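Cohen's method expresses this alternative hypothesis as an effect size f², the increment in the squared multiple correlation coefficient due to the p covariates under test, scaled by the unexplained variance. A minimal sketch of that effect-size step (the function name is ours; R² values shown are arbitrary illustrations):

```python
import math

def cohen_f2(r2_full, r2_reduced=0.0):
    """Cohen's effect size f^2 for testing p of the k regression
    coefficients: the gain in R^2 attributable to those p covariates,
    divided by the variance the full model leaves unexplained."""
    return (r2_full - r2_reduced) / (1 - r2_full)

# Testing all k covariates at once (reduced model has R^2 = 0):
print(cohen_f2(0.20))        # -> 0.25
# Testing a subset whose addition raises R^2 from 0.10 to 0.30:
print(cohen_f2(0.30, 0.10))
```

Power is then read from a noncentral F distribution with df1 = p and df2 = J − k − 1 (in Cohen's convention the noncentrality parameter is f² × (df1 + df2 + 1)); PASS automates this final step.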

Discussion

The chief advantage of Cohen’s method of power calculations for multiple linear regression is its flexibility. It may be used to perform power calculations for a very wide range of linear regression problems and null hypotheses. This method, however, has three disadvantages that restrict its use:

  • 1.

    The pilot data needed for Cohen’s method is often unavailable. Suppose that the literature provides an estimate of the slope of a linear regression of weight loss against hours of exercise per week for

Software accuracy

We have written Excel spreadsheets that evaluate the formulas in the Appendix for the different cases considered in this study. These spreadsheets provide independent confirmation that the PS program has correctly implemented our formulas. The fact that the PS and PASS programs give very similar answers to the cadmium and body mass index examples using very different methods is evidence that both programs have been coded correctly.

Acknowledgements

This work was supported by NIH RO1 Grants CA50468, HL19153, and LM06226 and NCI Center Grant CA68485. We thank Drs. W.A. Ray, O.B. Crofford, G.W. Reed, M.D. Decker, G.R. Bernard, M.R. Griffin, and R.I. Shorr for their helpful suggestions.
