Sample size calculations in randomised trials: mandatory and mystical
Components of sample size calculations
Calculating sample sizes for trials with dichotomous outcomes (eg, sick vs well) requires four components: type I error (α), power, event rate in the control group, and a treatment effect of interest (or, analogously, an event rate in the treatment group). These basic components persist through calculations with other types of outcomes, although additional assumptions can be necessary. For example, with quantitative outcomes and a typical statistical test, investigators might assume a difference between means and a standard deviation of the outcome.
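As a rough illustration (ours, not the article's), the standard normal-approximation formula for comparing two independent proportions combines exactly these four components; the event rates below are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two proportions."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)        # critical value tied to the type I error
    z_beta = z(power)                 # quantile corresponding to the desired power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_control - p_treatment  # treatment effect worth detecting
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

# Hypothetical trial: 20% control event rate, 5-percentage-point effect of interest.
print(n_per_group(0.20, 0.15))  # about 903 per group
```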
Effect of selecting α error and power
The conventions of α=0·05 and power=0·80 usually suffice. However, other assumptions make sense for particular topics. For example, if a standard prophylactic antibiotic for hysterectomy is effective with few side-effects, in a trial of a new antibiotic we might set the α error lower (eg, 0·01) to reduce the chances of a false-positive conclusion. We might even consider lowering the power below 0·80 because of our reduced concern about missing an effective treatment: an effective, safe treatment already exists.
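Rerunning the sketch above under different α and power choices, still with the hypothetical 20% vs 15% rates, makes the trade-off concrete:

```python
# Reuses n_per_group from the sketch above; all numbers are illustrative.
for alpha, power in [(0.05, 0.80), (0.01, 0.80), (0.05, 0.90), (0.01, 0.90)]:
    print(f"alpha={alpha}, power={power}: n={n_per_group(0.20, 0.15, alpha, power)}")
# Tightening alpha to 0.01 or raising power to 0.90 inflates n substantially
# (roughly 903 -> 1344 and 903 -> 1209 here); lowering power shrinks it.
```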
Estimation of population parameters
For some investigators, estimation of population parameters (eg, event rates in the treatment and control groups) has mystical overtones. Some researchers scoff at the exercise, since estimating these parameters is the aim of the trial: having to guess them beforehand seems circular. The key point, however, is that investigators are not estimating the population parameters per se but specifying the treatment effect they deem worthy of detecting. That is a big difference.
Usually, investigators start by estimating the event rate in the control group, drawing on earlier trials, observational studies, or pilot data; they then choose the smallest treatment effect that would be clinically important to detect.
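Because the whole calculation leans on that guessed control rate, a sensitivity check is worth the few lines it takes; this sketch (hypothetical rates again) fixes the detectable effect at 5 percentage points and varies the assumed control rate:

```python
# Reuses n_per_group from the first sketch; rates are illustrative.
for p_control in (0.10, 0.15, 0.20, 0.25, 0.30):
    print(p_control, n_per_group(p_control, p_control - 0.05))
# The required size roughly triples (about 432 -> 1248 per group) across
# plausible control rates, underscoring how rough these inputs are.
```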
Low power with limited available participants
What happens when sample size software, fed an investigator's diligent estimates, yields a trial size that exceeds the number of available participants? Frequently, investigators then calculate backwards and find that they have low power (eg, 0·40) with the participants available. This practice may be more the rule than the exception.9
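That backward calculation is easy to sketch; the function below uses the usual normal approximation and assumed numbers, independent of the article:

```python
from math import sqrt
from statistics import NormalDist

def approx_power(p_control, p_treatment, n, alpha=0.05):
    """Approximate power of a two-sided two-proportion test with n per group."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    se = sqrt((p_control * (1 - p_control) + p_treatment * (1 - p_treatment)) / n)
    return nd.cdf(abs(p_control - p_treatment) / se - z_alpha)

# Only 300 participants per group are available for the hypothetical 20% vs 15% trial:
print(round(approx_power(0.20, 0.15, 300), 2))  # about 0.37, ie, low power
```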
Some methodologists advise clinicians to abandon such a low-power study. Many ethics review boards deem a low-power trial unethical.10, 11, 12 Chalmers'
Sample size samba
Investigators sometimes perform a “sample size samba” to achieve adequate power.27, 28 The dance involves retrofitting the parameter estimates (in particular, the treatment effect worthy of detection) to the available participants. This practice seems fairly common in our experience and in that of others.27 Moreover, funding agencies, protocol committees, and even ethics review boards might encourage this backward process. It represents an operational solution to a real problem. In view of the rough guesses that feed sample size calculations in any case, the approach is understandable, provided the retrofitted treatment effect remains clinically sensible.
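In code, the samba amounts to solving for the detectable effect instead of for n; this hypothetical search reuses n_per_group from the first sketch:

```python
def detectable_difference(p_control, n_available, alpha=0.05, power=0.80, step=0.001):
    """Smallest risk difference detectable with the stated power given n_available."""
    diff = step
    while n_per_group(p_control, p_control - diff, alpha, power) > n_available:
        diff += step
    return diff

# With only 300 per group and a 20% control rate, "adequate" power requires
# retrofitting the effect of interest to roughly 8-9 percentage points.
print(detectable_difference(0.20, 300))  # about 0.084
```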
Sample size modification
With additional available participants and resource flexibility, investigators could consider a sample size modification strategy, which would alleviate some of the difficulties with the rough guesses used in the initial sample size calculations. Usually, modifications lead to increased sample sizes,29 so investigators should have access to the participants and the funding to accommodate the modifications.
Approaches to modification rely on revision of the event rate or the variance of the endpoint in the light of accumulating trial data, with the sample size then recalculated under the updated estimates.
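A minimal sketch of one such revision, with assumed numbers and a deliberately simplified rule (formal re-estimation methods are more careful): the pooled interim event rate comes out higher than the design guess, so n is recomputed for the same 5-point effect.

```python
# All rates are illustrative; reuses n_per_group from the first sketch.
initial_n = n_per_group(0.20, 0.15)   # design guess: 20% control event rate
revised_n = n_per_group(0.26, 0.21)   # interim pooled data suggest about 26%
print(initial_n, revised_n)           # about 903 -> 1125: the size increases
```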
Futility of post hoc power calculations
A trial yields a treatment effect and a confidence interval for the results. The power of the trial is expressed in that confidence interval, so power is no longer a meaningful concern.7, 27, 34 Nevertheless, after trial completion, some investigators do power calculations on statistically non-significant trials using the observed results as the parameter estimates. This exercise has specious appeal, but tautologically yields an answer of low power.7, 27 In other words, this ill-advised exercise merely restates the non-significant result in the language of power.
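The tautology is easy to demonstrate with illustrative numbers: “observed power” is a direct function of the observed test statistic, and any result short of two-sided significance maps to power below 0·50.

```python
from statistics import NormalDist

nd = NormalDist()
z_crit = nd.inv_cdf(0.975)                       # two-sided 5% critical value
for z_observed in (1.96, 1.5, 1.0):              # hypothetical test statistics
    post_hoc_power = nd.cdf(z_observed - z_crit)
    print(z_observed, round(post_hoc_power, 2))  # 0.5, 0.32, 0.17
```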
What should readers look for in sample size calculations?
Readers should find the a-priori estimates of sample size. Admittedly, in trial reports, confidence intervals appropriately convey the information that power would. However, sample size calculations still provide important information. First, they specify the primary endpoint, which safeguards against changing outcomes and claiming a large effect on an outcome not planned as the primary outcome.35 Second, knowing the planned size alerts readers to potential problems. Did the trial encounter recruitment difficulties?
Conclusions
Statistical power is an important notion, but it should be stripped of its ethical bellwether status. We question the branding of trials as unethical based solely on an inherently subjective, imprecise sample size calculation process. We endorse planning for adequate power, and we salute large multicentre trials of the ISIS-2 ilk;43 indeed, more such studies should be undertaken. However, if the scientific world insisted solely on large trials, many important questions in medicine would remain unanswered.
References
- The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group trials. Lancet (2001)
- Why “underpowered” trials are not necessarily unethical. Lancet (1997)
- Failure of randomisation by “sealed” envelope. Lancet (1999)
- Atrial fibrillation and antithrombotic prophylaxis: a prospective meta-analysis. Lancet (1989)
- Publication bias and clinical trials. Control Clin Trials (1987)
- Under-reporting of clinical trials is unethical. Lancet (2003)
- Sample size and statistical power in reproductive research. Obstet Gynecol (1995)
- Sample size slippages in randomised trials: exclusions and the lost and wayward. Lancet (2002)
- Blinding in randomised trials: hiding who got what. Lancet (2002)
- Allocation concealment in randomised trials: defending against deciphering. Lancet (2002)
- Generation of allocation sequences in randomised trials: chance, not choice. Lancet
- Unequal group sizes in randomised trials: guarding against guessing. Lancet
- The revised CONSORT statement for reporting randomized trials: explanation and elaboration. Ann Intern Med
- The importance of beta, the type II error and sample size in the design and interpretation of the randomized control trial: survey of 71 “negative” trials. N Engl J Med
- Can we learn anything from small trials? Ann N Y Acad Sci
- Clinical trials: a practical approach
- Clinical trials: design, conduct, and analysis
- Clinical trials: a methodologic perspective
- Preventing IUCD-related pelvic infection: the efficacy of prophylactic doxycycline at insertion. Br J Obstet Gynaecol
- Small clinical trials: are they all bad? Stat Med
- The continuing unethical conduct of underpowered clinical trials. JAMA