Principles: The need for better experimental design

https://doi.org/10.1016/S0165-6147(03)00159-7

Abstract

Many experiments could be improved with better experimental design and statistical analysis. Badly designed experiments can lead to incorrect conclusions and wasted time and scientific resources. Such experiments are unethical if they involve animals or humans. Good experimental design requires clearly defined objectives and control of the major sources of variation. In this article, a small mouse experiment involving the response of a liver enzyme to the administration of an antioxidant is used to illustrate some important design concepts such as the control and partitioning of sources of variation using factorial and randomized block designs and the estimation of appropriate sample sizes. Scientists clearly need better training in experimental design with better access to consultant statisticians for more complex situations.

Section snippets

Common errors in experimental design

Experiments often have the potential for bias because subjects are not allocated at random and/or the treated and control groups are kept separately, for example, on different shelves in an animal room. Measurements taken from the treatment groups are sometimes performed at different times or even by different people from those of the control group. Some experiments even seem to be done in an ad hoc manner, with additional treatment groups being added during the course of the experiment. After
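One way to avoid the allocation bias described above is to assign subjects to groups with a seeded random shuffle. The following is a minimal sketch only: the helper name `allocate_at_random`, the mouse identifiers and the seed are illustrative assumptions and do not come from the article.

```python
import random

def allocate_at_random(subject_ids, groups=("control", "treated"), seed=2003):
    """Randomly allocate subjects to equal-sized groups (hypothetical helper)."""
    if len(subject_ids) % len(groups) != 0:
        raise ValueError("number of subjects must be a multiple of the number of groups")
    rng = random.Random(seed)      # fixed seed so the allocation can be reproduced
    shuffled = list(subject_ids)   # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    size = len(shuffled) // len(groups)
    # Consecutive slices of the shuffled list become the groups.
    return {g: sorted(shuffled[i * size:(i + 1) * size]) for i, g in enumerate(groups)}

# Hypothetical identifiers for 16 animals.
allocation = allocate_at_random([f"mouse_{i:02d}" for i in range(1, 17)])
for group, members in allocation.items():
    print(group, members)
```

Recording the seed alongside the allocation lets the assignment be checked later. The same shuffled order could also be used to decide cage positions and the order of measurement, so that treated and control animals are not handled separately.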

Designing better experiments

Designing experiments requires clear objectives and careful planning, and the design should ensure that comparisons between groups are unbiased [7]. Each experiment should be large enough to have sufficient power to detect clinically or scientifically important results but should not be so large that it wastes scientific resources. The repeatability of the experiment under different conditions needs to be considered, and the experiment should be simple, so as to avoid mistakes. An objective measure of the
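To make the idea of "sufficient power" concrete, the sketch below uses the standard normal approximation for comparing two group means, n = 2 (z_{1-alpha/2} + z_{power})^2 (sd / delta)^2 per group. This is a textbook approximation chosen for illustration, not necessarily the method used in the article; the function name and the example values of `delta` (smallest difference worth detecting) and `sd` (within-group standard deviation) are assumptions.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.90):
    """Approximate subjects per group for a two-sided, two-sample comparison.

    Normal approximation only; a calculation based on the t distribution
    gives slightly larger numbers when groups are small.
    """
    if delta <= 0 or sd <= 0:
        raise ValueError("delta and sd must be positive")
    z = NormalDist()                    # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)

# Hypothetical inputs: detect a difference of one standard deviation.
print(n_per_group(delta=1.0, sd=1.0, power=0.80))
print(n_per_group(delta=0.5, sd=1.0, power=0.80))
```

Because the required number scales with (sd / delta) squared, halving the detectable difference roughly quadruples the group size, whereas reducing variation (for example, with uniform animals or blocking) reduces it.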

The control of variation

An understanding of types of variation and how they are handled is of crucial importance. Variation, as a result of the species, sex, strain, age, bedding and diet of experimental animals, or the cell type, culture medium and culture conditions for in vitro studies, can be controlled directly by the scientist. These sources of variation, known as ‘fixed effects’, are either set at one level or deliberately varied as part of the design. If, for example, the sex of the animal is considered to be

Use of additional information

Experiments are commonly set up to test one or a few hypotheses but they often produce large volumes of data. There is a danger of attempting to use such data to answer additional questions that were not considered at the time the experiment was planned. The P values obtained in this way will be unreliable. However, such data can be used in several ways and, in particular, can be used to generate new hypotheses that can be tested in subsequent experiments [10]. In cases where several parameters

Sample size

Scientists sometimes express concern at the apparently small sample sizes often found with factorial designs and wonder whether the results will be accepted by a good journal. In the example shown in Box 1, there were only two mice of each strain in each treatment group. However, the main comparison of butylated hydroxyanisole (BHA)-treated versus control animals involved eight animals in each group, albeit of four different genotypes. Had the experiment been performed with eight outbred
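The arithmetic behind this point can be sketched by enumerating the factorial layout described above: four strains, two treatments and two mice per strain-by-treatment cell. The strain labels below are hypothetical placeholders, since the snippet does not name the strains.

```python
from collections import Counter
from itertools import product

strains = ["strain_A", "strain_B", "strain_C", "strain_D"]  # hypothetical labels
treatments = ["control", "BHA"]
mice_per_cell = 2  # two mice of each strain in each treatment group

# One row per animal: (strain, treatment, replicate number within the cell).
layout = [
    (strain, treatment, replicate)
    for strain, treatment in product(strains, treatments)
    for replicate in range(1, mice_per_cell + 1)
]

per_cell = Counter((strain, treatment) for strain, treatment, _ in layout)
per_treatment = Counter(treatment for _, treatment, _ in layout)

print("total animals:", len(layout))
print("animals per treatment:", dict(per_treatment))
print("animals per strain-by-treatment cell:", set(per_cell.values()))
```

Although each cell holds only two animals, the treatment comparison pools across strains, so each treatment mean rests on eight animals.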

Statistical analysis and presentation of results

Because this article is focused on experimental design, the methods of statistical analysis will not be considered in detail. However, the method of statistical analysis should always be decided at the time that the experiment is designed, although some modification might be necessary after the data have been collected. The ready availability of statistical software takes the hard work out of most calculations but choosing methods and interpreting output still need a good understanding of

Guidelines and textbooks

This article has only touched on two techniques, the use of factorial and randomized block designs, for improving experimental design. Many other designs and techniques are available. Statistical guidelines often give helpful hints and a gentle introduction to statistical methods for planning, designing, executing, analysing and presenting experimental results. These are available for contributors to medical journals [13], for animal experiments [14] and for in vitro experiments [15].

Concluding remarks

The key to designing good experiments is to have clear objectives and to understand and control the main sources of variation. Fixed effects such as the strain, species and sex of animals are either held at a single level or are varied deliberately using a factorial design. Random effects are controlled by choosing uniform material such as disease-free isogenic strains of laboratory rodents, minimizing measurement error and controlling for time and/or space variation using randomized block
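A randomized block layout of the kind mentioned here can be sketched as follows: every block (for example, a day or a shelf) contains each treatment once, and the order of treatments is randomized independently within each block. The function name, block labels and seed are illustrative assumptions.

```python
import random

def randomized_block_design(blocks, treatments, seed=2003):
    """Return a dict mapping each block to a random ordering of all treatments."""
    rng = random.Random(seed)  # fixed seed so the layout can be reproduced
    design = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)     # independent randomization within each block
        design[block] = order
    return design

# Hypothetical blocks: four working days, with one animal per treatment each day.
design = randomized_block_design(
    blocks=["day_1", "day_2", "day_3", "day_4"],
    treatments=["control", "BHA"],
)
for block, order in design.items():
    print(block, order)
```

Because each treatment appears in every block, differences between blocks (time or position effects) can be separated from the treatment comparison in the analysis.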

References (19)

  • M.F.W. Festing

    Strain differences in haematological response to chloramphenicol succinate in mice: implications for toxicological research

    Food Chem. Toxicol.

    (2001)
  • D.G. Altman

    Statistics in medical journals

    Stat. Med.

    (1982)
  • M.F.W. Festing

    Reduction of animal use: experimental design and quality of experiments

    Lab. Anim.

    (1994)
  • I. McCance

    Assessment of statistical procedures used in papers in the Australian Veterinary Journal

    Aust. Vet. J.

    (1995)
  • M.F.W. Festing et al.

    Reducing the use of laboratory animals in toxicological research and testing by better experimental design

    J. Royal Statistical Society Series B-Methodological

    (1996)
  • M.F.W. Festing

    Reduction of animal use and experimental design

  • I. Roberts

    Does animal experimentation inform human healthcare? Observations from a systematic review of international animal experiments on fluid resuscitation

    Br. Med. J.

    (2002)
  • D.R. Cox et al.

    The Theory of the Design of Experiments

    (2000)
  • M.F.W. Festing

    Warning: the use of genetically heterogeneous mice may seriously damage your research

    Neurobiol. Aging

    (1999)
There are more references available in the full text version of this article.
