R. McNamee, Optimal design and efficiency of two-phase case–control studies with error-prone and error-free exposure measures, Biostatistics, Volume 6, Issue 4, October 2005, Pages 590–603, https://doi.org/10.1093/biostatistics/kxi029
Abstract
This paper addresses optimal design and efficiency of two-phase (2P) case–control studies in which the first phase uses an error-prone exposure measure, Z, while the second phase measures true, dichotomous exposure, X, in a subset of subjects. Optimal design of a separate second phase, to be added to a preexisting study, is also investigated. Differential misclassification is assumed throughout. Results are also applicable to 2P cohort studies with error-prone and error-free measures of disease status but error-free exposure measures. While software based on the mean score method of Reilly and Pepe (1995, Biometrika 82, 299–314) can find optimal designs given pilot data, the lack of simple formulae makes it difficult to generalize about efficiency compared to one-phase (1P) studies based on X alone. Here, formulae for the optimal ratios of cases to controls and first- to second-phase sizes, and the optimal second-phase stratified sampling fractions, given a fixed budget, are given. The maximum efficiency of 2P designs compared to a 1P design is deduced and is shown to be bounded from above by a function of the sensitivities and specificities of Z. The efficiency of ‘balanced’ separate second-phase designs (Breslow and Cain, 1988, Biometrika 75, 11–20), in which equal numbers of subjects are chosen from each first-phase stratum, compared to optimal design is deduced, enabling situations where balanced designs are nearly optimal to be identified.
1 INTRODUCTION
Inaccurate exposure measurement can lead to biassed measures of disease–exposure relationships but accuracy often entails a greater cost per subject than error-prone approaches. Use of an error-free but expensive measure, X, instead of a cheap, error-prone measurement, Z, will remove the bias but, if the study budget is fixed, will also reduce statistical power and precision. A third possibility is to use both Z and X in a two-phase (2P) study where, for example, Z and disease status Y are measured in phase 1 while, in the second phase, X is measured on a sample of subjects from the first phase. This paper is concerned with the optimal design of such studies for efficiency in estimating β, the log odds ratio linking Y and dichotomous X, given a fixed budget.
In general 2P studies, the set of variables measured in phase 1, W say, is incomplete; this deficiency is made up in phase 2 but only on a sample of first-phase subjects. W may be a subset of the complete set of interest, V or, as considered here, an error-prone version of V. Data from both phases are combined for analysis. The latter, ‘errors in variables’, design may also be referred to as a study with internal exposure validation data. Various designs may be employed for the first phase, e.g. subjects may be chosen initially on the basis of disease status, as in case–control designs, or exposure status, or completely at random (Zhao and Lipsitz, 1992). This paper concentrates on first-phase case–control designs with an error-prone measure of exposure Z, followed by error-free measurement, X, in the second phase, but we show how the results apply when the first-phase design is that of a cohort study or clinical trial with error-free X but error-prone and error-free measures of disease status measured in the two phases.
Several methods exist for analyzing 2P case–control studies—see Thurigen et al. (2000) for a comprehensive review. In designing a 2P study, we need to distinguish situations where the two phases are planned together, and those where the second phase is planned separately after a first phase based on Y and Z is complete. In the latter case, here referred to as separate phase 2 (SP2) design, the second phase may be easily justified because it transforms a previous study with a biassed estimate of β into one which is unbiassed. The additional cost, compared to including all first-phase subjects in the second phase, may be low yet precision can, in some cases, be ‘almost as good’ (Reilly, 1996). On the other hand, to justify a 2Ps design, here defined as a design where the two phases are planned together, one should consider whether, for a fixed budget, there is any gain in efficiency compared to a simpler, ‘one-phase’ (1P) study based on Y and X alone. In a 2Ps design, the efficiency per unit cost may be increased if the balance between first- and second-phase sample sizes takes account of the relative costs per subject of Z and X, and if the first- and second-phase sampling designs are chosen optimally. In a separate second-phase design with a fixed budget, only the second-phase sampling design is open to choice; legitimate designs include simple random sampling and random sampling stratified by Y, Z, or Y and Z (Zhao and Lipsitz, 1992).
Palmgren (1987) and Greenland (1988a) addressed optimal 2Ps designs, but with restrictions which may have limited efficiency: second-phase sampling could only be stratified by Y, and the sampling fraction for cases and controls had to be equal. Palmgren proved that a yet more restricted design (equal numbers of first-phase cases and controls), but with an optimally chosen sampling fraction, could be less efficient than a one-phase (1P) study of the same cost and based on X alone. Greenland suggested that the 2Ps design might be less efficient unless the sensitivity and specificity of Z were ‘uniformly high’ but no formal proof was given. Reilly and Pepe (1995) developed an approach to solve general 2P optimality problems based on their ‘mean score’ analysis but, since information matrix inversion is required, software is generally needed for implementation. The software designed for this purpose (Reilly and Salim, 2000) requires first-phase pilot data on Y, Z, and X; thus, while it yields individual solutions, this empirical approach does not readily give insight into the general utility of the design.
The efficiency of a 2P case–control study depends, among other factors, on whether or not one assumes differential misclassification, i.e. that misclassification of exposure by Z varies by Y: the differential assumption increases the variance of β (Greenland, 1988b; Palmgren, 1987). Greenland (1988b) has argued that differential misclassification should be the presumption in epidemiological studies unless there is compelling evidence against it; case–control studies in particular may be more prone to differential misclassification (Thurigen et al., 2000; Dahm et al., 1995). The mean score method and related software make no assumptions about the relationship between X and Z and thus would appear to assume differential misclassification by default—although Thurigen et al. (2000) have stated the opposite. This software gives empirical solutions to many optimal 2Ps design problems and also to SP2 design problems, given pilot first-phase data. Alternative software (Holcroft and Spiegelman, 1999), applicable only to the latter class of designs, and based on maximum likelihood analysis, assumes nondifferential misclassification and requires parameter estimates for implementation.
Optimal solutions, however derived, are in practice only as good as the parameter estimates: estimation error could undo any theoretical advantage of a design. Breslow and Cain (1988), Cain and Breslow (1988), Breslow and Holubkov (1997), and Breslow and Chatterjee (1999) proposed instead ‘balanced’ phase 2 designs: for the problem here, balance is defined as equal numbers of subjects from the Y.Z strata. Using examples and simulations, these authors argued that, in general, balanced designs were ‘near optimal’, although only the last mentioned paper addressed in detail the ‘errors in variables’ problem. Holcroft and Spiegelman (1999) also found that the balanced designs were nearly optimal in the SP2 examples they considered.
This paper derives formulae for the efficiency of a fully optimized, 2Ps case–control design compared with a 1P design of the same cost, assuming dichotomous exposure and that error in Z is differential. The optimal first-phase control:case ratio, ratio of first- and second-phase sizes, and optimal sampling fractions for the strata of Y.Z are established. We prove that the maximum efficiency is greater than that implied by Palmgren's (1987) restricted design but has an upper bound which is a simple function of the sensitivities and specificities of Z. The advantage of stratification by Z, in addition to Y, is quantified theoretically. Formulae for the optimal design of an SP2 study are given and used to evaluate balanced designs, enabling us to identify situations where balance is near optimal or suboptimal. The basic theory is extended to studies for estimating the interaction of X with another covariate and to situations where the roles of Y and X, in terms of design and measurement error, are reversed.
To illustrate the formulae, we consider 2P studies of the relationship between cervical cancer (Y) and herpes simplex virus 2 (HSV-2), where HSV-2 can be measured by the cheaper western blot procedure (Z) or a more refined method (X). Data from such a study were analyzed by Carroll et al. (1993) assuming differential misclassification. Here, we derive the optimal 2Ps design and the optimal SP2 design of a new investigation and consider whether the 2Ps design is efficient compared to a 1P design of the same budget. A clinical trial of treatment for ovarian cancer with surrogate and true outcome measures is also considered.
2 NOTATION AND ESTIMATION
Consider the first phase of a 2P case–control study in which there are n subjects in total and the ratio of controls to cases is R0, giving n1=n/(R0+1) cases and n0=nR0/(R0+1) controls. Cases (Y=1) and controls (Y=0) are sampled independently of each other. All n subjects are classified using an error-prone surrogate for exposure, Z, which is assumed here to have two categories, 0 and 1. Corresponding results for the case where Z has more than two categories are available in the supplementary material (www.biostatistics.oupjournals.org); this could occur, for example, if Z was a score derived from a questionnaire. For the second phase, random sampling, possibly stratified by Y and Z, is used to choose subjects.
The sampling fraction in the stratum with Y=i, Z=j, is νij, and the number of subjects sampled is mij=niνij, i=0, 1, j=0, 1. The total second-phase size is m=∑i,j mij, where m<n. These subjects are classified as either X=0 or X=1 (exposed).
The subscripts i and j always refer to levels of Y and Z, respectively. All probabilities are conditional on Y. Let πi=Pr{X=1|Y=i}, i=0, 1. The objective of the study is to estimate β, the log odds ratio of the Y–X relationship which, in a case–control study, can be found from
Formulae (2.3) and (2.4) do not appear to have been published previously.
An alternative parameterization of the relationship between Z and X is sometimes preferable. Let 𝛉i=Pr{Z=1|Y=i, X=1} and λi=Pr{Z=1|Y=i, X=0}, i=0, 1. Then, 𝛉i and 1−λi are the sensitivity and specificity, respectively, of Z as a proxy for X in group i. Nondifferential misclassification implies 𝛉1=𝛉0 and λ1=λ0. We will assume that Z is such that 𝛉i−λi≥0, i=0, 1; this condition ensures that ρi≥0, i=0, 1.
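The two parameterizations can be related numerically. The sketch below is illustrative only: the function names are hypothetical, and it assumes that ρi denotes the correlation between X and Z within group i, a definition that is not reproduced in this text. Under that assumption, the sensitivities, specificities, and exposure probabilities quoted for Example 1 (Section 6.1) give values close to the ρ0=0.265 and ρ1=0.587 reported there.

```python
import math

def prob_z_positive(pi, theta, lam):
    """Pr{Z=1 | Y=i}, by the law of total probability over X."""
    return theta * pi + lam * (1.0 - pi)

def corr_x_z(pi, theta, lam):
    """Correlation between dichotomous X and Z within one Y group.

    ASSUMPTION: this is taken to be the rho_i of the text.
    """
    q = prob_z_positive(pi, theta, lam)
    cov = pi * (1.0 - pi) * (theta - lam)  # Cov(X, Z) = Pr(X=1, Z=1) - Pr(X=1) Pr(Z=1)
    return cov / math.sqrt(pi * (1.0 - pi) * q * (1.0 - q))

# Example 1 inputs: lambda is one minus the quoted specificity.
rho0 = corr_x_z(0.440, 0.576, 1.0 - 0.688)  # controls
rho1 = corr_x_z(0.591, 0.784, 1.0 - 0.811)  # cases
print(round(rho0, 3), round(rho1, 3))
```

The sign of the correlation is the sign of 𝛉i−λi, which is consistent with the statement that 𝛉i−λi≥0 ensures ρi≥0.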
3 OPTIMAL 2PS DESIGN
To maximize the precision, the values of all the quantities, R0, n, and {νij}, should be chosen to minimize (2.4) subject to the constraint (3.1). These optimal values, and the resulting minimum variance and efficiency compared with a 1P study of the same budget, are given in Section 3.2. We also give details of some constrained ‘optimal’ designs, where R0 is fixed a priori or where only stratification by Y is possible. For comparison, the variances for 1P studies based on X alone are given in Section 3.1. All proofs are given in the supplementary material.
3.1 Optimal 1P design
Equation (3.3) is equal to 1 under the null hypothesis that π1=π0.
3.2 Fully optimal 2Ps design
The optimal value of n is found by substituting (3.5) and (3.6) into (3.1). When 𝛉i−λi≥0, both
This characterization of optimal design—rather than in terms of n and {νij}—is convenient and insightful. The optimal fraction, m/n, decreases both as c2/c1 increases and as the validity of Z increases. The optimal allocation fractions (3.10) are discussed in detail in Section 4.
3.3 Optimal 2Ps design but with R0 fixed
Equation (3.11) yields an expression identical to (3.10) for the optimal allocation fractions E(mij)/E(m).
Some empirical comparisons were made between (3.10)–(3.12) and output from the optbud procedure, implemented in STATA by Reilly and Salim (2000), which also addresses the 2Ps design with a fixed budget. The comparison requires input of ‘pilot data’ constructed to yield π̂i=πi, 𝛉̂i=𝛉i, and λ̂i=λi, where πi, 𝛉i, and λi were the values used in (3.10)–(3.12). Identical results were obtained in each case. A similar comparison is not possible when R0 is a design parameter since the software is built on the assumption that the first-phase distribution of Y is already fixed.
3.4 Optimal 2Ps design but with stratification by Y only
It can also be shown that (3.13) is equal to equation (4) from Palmgren (1987, p. 692) which was derived using a likelihood approach and also equation (10) from Dahm et al. (1995, p. 2593) for a design in which the roles of X and Z were reversed. Formulae given by Greenland (1988b) do not appear to be equivalent. Palmgren then restricted the design further so that n1=n0 and νi=mi./n=ν, i=0, 1, and found the optimal value of ν.
The advantage of stratification by Z will be most marked when c2/c1 is large, and Z is accurate—without the latter condition, both gi and
4 SP2 DESIGN: n, R0, AND m FIXED IN ADVANCE
4.1 Optimal design
From (4.1), the optimal fraction of cases among phase 2 subjects is
π0 | eβ=1.5 | eβ=2 | eβ=3 | eβ=5 | eβ=7 | eβ=10
---|---|---|---|---|---|---
0.001 | 0.45 (1) | 0.41 (1) | 0.37 (4) | 0.31 (7) | 0.28 (10) | 0.24 (13)
0.01 | 0.45 (< 1) | 0.42 (1) | 0.37 (3) | 0.32 (6) | 0.29 (9) | 0.26 (11)
0.05 | 0.46 (< 1) | 0.43 (1) | 0.39 (2) | 0.35 (4) | 0.33 (6) | 0.31 (7)
0.1 | 0.46 (< 1) | 0.44 (1) | 0.41 (2) | 0.39 (3) | 0.38 (3) | 0.38 (3)
0.2 | 0.47 (< 1) | 0.46 (< 1) | 0.45 (1) | 0.45 (1) | 0.45 (< 1) | 0.47 (< 1)
0.3 | 0.48 (< 1) | 0.48 (< 1) | 0.48 (< 1) | 0.50 (< 1) | 0.51 (< 1) | 0.54 (< 1)
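The entries of the table above can be approximated with a short calculation. This is a sketch under stated assumptions, not the paper's equation, which is not reproduced in this text: it assumes that fi=1/√[πi(1−πi)], that Z is equally accurate in cases and controls so that the optimal fraction of cases is f1/(f0+f1), and that the figure in parentheses is the percentage increase in SE when cases and controls are instead sampled in equal numbers. All function names are hypothetical.

```python
import math

def f(pi):
    """ASSUMED form of f_i: 1 / sqrt(pi_i * (1 - pi_i))."""
    return 1.0 / math.sqrt(pi * (1.0 - pi))

def case_exposure_prob(pi0, odds_ratio):
    """pi1 implied by the control exposure probability pi0 and the odds ratio e^beta."""
    odds1 = odds_ratio * pi0 / (1.0 - pi0)
    return odds1 / (1.0 + odds1)

def optimal_case_fraction(pi0, odds_ratio):
    """ASSUMED optimal share of cases at phase 2 when Z is equally accurate in both groups."""
    f0, f1 = f(pi0), f(case_exposure_prob(pi0, odds_ratio))
    return f1 / (f0 + f1)

def se_increase_equal_split(pi0, odds_ratio):
    """Percent increase in SE for a 50:50 case/control split (leading variance term only)."""
    f0, f1 = f(pi0), f(case_exposure_prob(pi0, odds_ratio))
    return 100.0 * (math.sqrt(2.0 * (f0 ** 2 + f1 ** 2)) / (f0 + f1) - 1.0)

# Print a few rows for comparison with the table.
for pi0 in (0.001, 0.01, 0.3):
    print(pi0, [round(optimal_case_fraction(pi0, r), 2) for r in (1.5, 2, 3, 5, 7, 10)])
```

Under these assumptions the optimal fraction depends only on the ratio π1(1−π1)/[π0(1−π0)], which is why an equal split costs little unless exposure is rare and the odds ratio is large.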
Now consider the optimum fraction of subjects with Z=1 among second-phase cases (controls); from (4.1) this is
1−λ | θ=0.99 | θ=0.95 | θ=0.9 | θ=0.8 | θ=0.7 | θ=0.6 | θ=0.5 | θ=0.4
---|---|---|---|---|---|---|---|---
0.99 | 0.50 (0) | 0.30 (7) | 0.23 (13) | 0.17 (20) | 0.13 (24) | 0.11 (27) | 0.09 (29) | 0.08 (31)
0.95 | 0.70 | 0.50 (0) | 0.41 (2) | 0.31 (7) | 0.26 (11) | 0.22 (15) | 0.19 (18) | 0.16 (21)
0.9 | 0.77 | 0.59 | 0.50 (0) | 0.40 (2) | 0.34 (5) | 0.29 (8) | 0.25 (12) | 0.21 (15)
0.8 | 0.83 | 0.69 | 0.60 | 0.50 (0) | 0.43 (1) | 0.38 (3) | 0.33 (5) | 0.29 (8)
0.7 | 0.87 | 0.74 | 0.66 | 0.57 | 0.50 (0) | 0.44 (1) | 0.40 (2) | 0.35 (5)
0.6 | 0.89 | 0.78 | 0.71 | 0.62 | 0.56 | 0.50 (0) | 0.45 (1) | —
0.5 | 0.91 | 0.81 | 0.75 | 0.67 | 0.60 | 0.55 | — | —
0.4 | 0.92 | 0.84 | 0.79 | 0.71 | 0.65 | — | — | —
Assuming 𝛉−λ≥0.
SE results are symmetric about the diagonal.
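The pattern in this table can be explored with the sketch below. It rests on assumptions, since the governing equation is not reproduced in this text: the optimal fraction with Z=1 is taken to be √(θλ)/[√(θλ)+√((1−θ)(1−λ))], and the figure in parentheses is taken to be the percentage increase in SE when the two Z strata are sampled equally. The function names are hypothetical.

```python
import math

def z_positive_fraction(sensitivity, specificity):
    """ASSUMED optimal share of Z=1 subjects within one Y group at phase 2."""
    theta, lam = sensitivity, 1.0 - specificity
    w1 = math.sqrt(theta * lam)                  # weight for the Z=1 stratum
    w0 = math.sqrt((1.0 - theta) * (1.0 - lam))  # weight for the Z=0 stratum
    return w1 / (w0 + w1)

def se_increase_equal_z_split(sensitivity, specificity):
    """Percent increase in SE for a 50:50 split over Z (leading variance term only)."""
    theta, lam = sensitivity, 1.0 - specificity
    w1 = math.sqrt(theta * lam)
    w0 = math.sqrt((1.0 - theta) * (1.0 - lam))
    return 100.0 * (math.sqrt(2.0 * (w0 ** 2 + w1 ** 2)) / (w0 + w1) - 1.0)

# First row of the table: specificity 0.99, sensitivity from 0.99 down to 0.4.
for theta in (0.99, 0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4):
    print(theta, round(z_positive_fraction(theta, 0.99), 2),
          round(se_increase_equal_z_split(theta, 0.99)))
```

Swapping sensitivity and specificity exchanges the two stratum weights, so the SE increase is unchanged, in line with the note that SE results are symmetric about the diagonal.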
4.2 Balanced SP2 design
These considerations suggest that the balanced designs will be near optimal in a wide range of situations and also enable prediction of situations where there will be major inefficiency. For example, if π0=0.01, π1=0.1, 𝛉0=0.50, 1−λ0=0.99, 𝛉1=0.99, and 1−λ1=0.50, then (a00, a01, a10, a11)opt=(0.682, 0.069, 0.023, 0.226) and (4.4) leads to an SE ratio of 1.45. When variance terms of order 1/n are taken into account, the negative impact of a balanced design will be less than (4.4) predicts.
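The figures in this example can be reproduced with the following sketch. Equations (4.1) and (4.4) are not reproduced in this text, so the forms used here are assumptions chosen because they return the quoted numbers: the optimal allocation fraction aij is taken to be proportional to wij=fi√[Pr(Z=j|X=1, Y=i)Pr(Z=j|X=0, Y=i)] with fi=1/√[πi(1−πi)], and the SE ratio of a balanced design to the optimal design is taken to be 2√(∑wij²)/∑wij. The function name is hypothetical.

```python
import math

def allocation_weights(pi, theta, lam):
    """ASSUMED weights [w_i0, w_i1] for one Y group, for strata Z=0 and Z=1."""
    f = 1.0 / math.sqrt(pi * (1.0 - pi))
    return [f * math.sqrt((1.0 - theta) * (1.0 - lam)),  # Z=0 stratum
            f * math.sqrt(theta * lam)]                  # Z=1 stratum

# Parameter values from the example in the text (controls i=0, then cases i=1).
weights = (allocation_weights(0.01, 0.50, 1.0 - 0.99)
           + allocation_weights(0.10, 0.99, 1.0 - 0.50))
total = sum(weights)

a_opt = [w / total for w in weights]  # order: a00, a01, a10, a11
se_ratio_balanced = 2.0 * math.sqrt(sum(w * w for w in weights)) / total

print([round(a, 3) for a in a_opt], round(se_ratio_balanced, 2))
```

With four equal allocation fractions of 1/4, the leading variance term is 4∑wij², against (∑wij)² under the assumed optimal allocation, which gives the ratio used above.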
5 EXTENSIONS TO OTHER PROBLEMS
5.1 Optimal designs for estimation of interactions with exposure
Suppose now that the goal is to estimate the interaction between X and U as represented by the term βP in the model, log odds=α+βXX+βUU+βPXU. Besides Z, an accurately measured categorical risk factor U is ascertained in the first phase, while X is to be measured as before in the second phase on a sample of size m. The design which minimizes SE (βP) is required.
Breslow and Chatterjee (1999) advocated a balanced design here also, where balance now means equal numbers in the cross-classification by Y, Z, and U. The software from Reilly and Salim (2000) provides an empirical solution to this problem when R0 is fixed and given pilot data.
5.2 Cohort designs with surrogate and error-free outcome measures
Consider a 2P cohort study where, in the first phase, n0 and n1 subjects are chosen from unexposed (X=0) and exposed (X=1) populations; the classification by X is error free. All are measured using an error-prone surrogate of disease, Z. In a second phase, true disease status Y is measured accurately on a sample of subjects. A clinical trial where subjects are randomly allocated to treatments X=0 or X=1 and which employs a surrogate measure of the eventual outcome Y would also have the same structure. To use the previous results without introducing a new notation, we simply reverse the roles of Y and X. Now πi=Pr(Y = 1|X = i), with β defined as before, and 𝛉i and 1−λi are the sensitivity and specificity of Z for Y in group X=i. Assuming differential misclassification by the surrogate measure in the two X groups, and that the odds ratio is the parameter of interest, all the results derived earlier apply. In general, the differential misclassification assumption may be less valid in this setting but in the next section an example of this type, quoted by Pepe et al. (1994), is considered.
6 EXAMPLES
6.1 Example 1: 2P case–control study with error-prone exposures in phase 1
In the study of cervical cancer and HSV analyzed by Carroll et al. (1993), the western blot method (Z) exhibited differential misclassification, 𝛉 and 1−λ being 0.784 and 0.811 for cases but 0.576 and 0.688 for controls, respectively. π1 and π0 are estimated as 0.591 and 0.440, respectively. Their maximum likelihood analysis based on 2044 first-phase subjects classified by Z and a simple random sample of 115 subjects classified by the more refined X gave β̂=0.609, with an SE of 0.350. Application of (2.2) and (2.3) to data in their Table 1 gives identical results.
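As a quick consistency check on these figures, the log odds ratio implied by the quoted π1 and π0 can be computed from the standard definition β=log[π1(1−π0)/{π0(1−π1)}]. The function name below is hypothetical, and the inputs are the rounded estimates quoted above, so agreement with β̂ can only be expected to about three decimal places.

```python
import math

def log_odds_ratio(pi1, pi0):
    """Log odds ratio for exposure probabilities pi1 (cases) and pi0 (controls)."""
    return math.log(pi1 * (1.0 - pi0) / (pi0 * (1.0 - pi1)))

# Rounded estimates quoted for the cervical cancer and HSV-2 study.
beta_hat = log_odds_ratio(0.591, 0.440)
print(round(beta_hat, 3))
```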
In planning a new study to estimate β, we need estimates of fi, gi, g̃, and ρ̃. An expression for ρi in terms of 𝛉i, λi, and πi is given by McNamee (2003). From the original study, we can estimate f0=2.015, f1=2.034, g0=0.964, g1=0.803, g̃=0.883, ρ0=0.265, ρ1=0.587, and ρ̃=0.427. The maximum reduction in SE (β) from 2Ps design is 1−g̃ or 12%. For finite values of c2/c1, the reduction is less and depends on ρ̃. Consider two extremes: c2/c1=100 and c2/c1=5. From (3.8), there will be a reduction in SE of about 7% in the former case and an increase of 7% in the latter. In the latter case, we would conclude that a 1P study based on X is preferable. Here, we proceed to find the optimal 2Ps design when c1=1, c2=100, and B=13 600 monetary units.
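The planning quantities in this paragraph can be recomputed from the sensitivities, specificities, and exposure probabilities of the original study. The defining equations are not reproduced in this text, so every form below is an assumption, adopted because it reproduces the quoted values: fi=1/√[πi(1−πi)]; gi=√(𝛉iλi)+√[(1−𝛉i)(1−λi)]; ρi is the correlation between X and Z in group i; g̃ and ρ̃ are averages of gi and ρi weighted by fi; and the SE of the optimal 2Ps design relative to a 1P design of the same cost is g̃+ρ̃√(c1/c2). Function and variable names are hypothetical.

```python
import math

def f(pi):
    return 1.0 / math.sqrt(pi * (1.0 - pi))

def g(theta, lam):
    return math.sqrt(theta * lam) + math.sqrt((1.0 - theta) * (1.0 - lam))

def rho(pi, theta, lam):
    q = theta * pi + lam * (1.0 - pi)  # Pr{Z=1 | Y=i}
    return (theta - lam) * math.sqrt(pi * (1.0 - pi) / (q * (1.0 - q)))

# Example 1 inputs: lambda is one minus the quoted specificity.
controls = {"pi": 0.440, "theta": 0.576, "lam": 1.0 - 0.688}
cases = {"pi": 0.591, "theta": 0.784, "lam": 1.0 - 0.811}

f0, f1 = f(controls["pi"]), f(cases["pi"])
g0, g1 = g(controls["theta"], controls["lam"]), g(cases["theta"], cases["lam"])
rho0, rho1 = rho(**controls), rho(**cases)

# ASSUMED weighting: averages over the two groups with weights f0 and f1.
g_tilde = (f0 * g0 + f1 * g1) / (f0 + f1)
rho_tilde = (f0 * rho0 + f1 * rho1) / (f0 + f1)

def se_ratio_2p_vs_1p(cost_ratio):
    """ASSUMED SE(optimal 2Ps) / SE(optimal 1P) at equal budget; cost_ratio = c2/c1."""
    return g_tilde + rho_tilde / math.sqrt(cost_ratio)

for cost_ratio in (100, 5):
    print(cost_ratio, round(100.0 * (se_ratio_2p_vs_1p(cost_ratio) - 1.0), 1))
```

As c2/c1 grows, the assumed ratio tends to g̃, which matches the statement that the maximum reduction in SE is 1−g̃.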
Expressions (3.9) and (4.1) together with B=n[c1+c2m/n] are easiest for calculation. From (3.9), the optimal m/n is 0.207 and so n=627, m=130. From (3.5),
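The phase sizes quoted above follow from the budget identity B=n[c1+c2m/n] once the optimal ratio m/n is known. In the sketch below, the budget identity is taken from the text, while the form m/n=√(c1/c2)·g̃/ρ̃ is an assumption (equation (3.9) is not reproduced here) that returns the quoted 0.207. Variable names are hypothetical.

```python
import math

c1, c2, budget = 1.0, 100.0, 13600.0
g_tilde, rho_tilde = 0.883, 0.427  # values quoted in the text

# ASSUMED form of the optimal second- to first-phase size ratio.
ratio = math.sqrt(c1 / c2) * g_tilde / rho_tilde

# Budget identity from the text: B = n * (c1 + c2 * m / n).
n = budget / (c1 + c2 * ratio)
m = ratio * n

print(round(ratio, 3), round(n), round(m))
```

Under the assumed form, the ratio falls as c2/c1 rises and as Z becomes more accurate (smaller g̃, larger ρ̃), which agrees with the behaviour described in Section 3.2.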
6.2 Example 2: a clinical trial with surrogate and accurate outcome measures
Pepe et al. (1994) considered design of a 2P trial to compare chemotherapy (X = 0) and chemotherapy + radiotherapy (X = 1) in the treatment of ovarian cancer, where the outcome of interest was tumor eradication. X-ray determination was the surrogate outcome measure, Z, for all subjects, while a more accurate determination, Y, was to be made on a sample. They assumed π1=Pr(Y = 1|X = 1)=0.40, π0=Pr(Y = 1|X = 0)=0.20, where Y=1 denotes tumor eradication. Misclassification was assumed to be differential because of radiation-induced scarring: 𝛉0=0.95, 1−λ0=0.75, whereas 𝛉1=0.75=1−λ1. They found the optimal SP2 design, assuming n1=n0=100 and m=60, using their mean score method. Applying (4.1) and (4.2) instead gives (m00, m01, m10, m11)/m=(0.139, 0.351, 0.255, 0.255) and V(β)=0.231. These figures agree with the authors' results [which were expressed as
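The allocation fractions and variance quoted for this example can be approximated as follows. Equations (4.1) and (4.2) are not reproduced in this text, so the forms below are assumptions that happen to return the quoted figures: allocation fractions proportional to wij=fi√[Pr(Z=j|Y=1, X=i)Pr(Z=j|Y=0, X=i)], and V(β)=∑i fi²ρi²/ni+(∑wij)²/m, with fi=1/√[πi(1−πi)] and ρi the correlation between Y and Z in treatment group i. Names are hypothetical.

```python
import math

def group_terms(pi, theta, lam):
    """Return (f_i, rho_i squared, [w_i0, w_i1]) for one treatment group (ASSUMED forms)."""
    f = 1.0 / math.sqrt(pi * (1.0 - pi))
    q = theta * pi + lam * (1.0 - pi)  # Pr{Z=1 | X=i}
    rho_sq = pi * (1.0 - pi) * (theta - lam) ** 2 / (q * (1.0 - q))
    w = [f * math.sqrt((1.0 - theta) * (1.0 - lam)),  # Z=0 stratum
         f * math.sqrt(theta * lam)]                  # Z=1 stratum
    return f, rho_sq, w

n_per_arm, m = 100, 60
arms = [group_terms(0.20, 0.95, 1.0 - 0.75),  # X=0: chemotherapy
        group_terms(0.40, 0.75, 1.0 - 0.75)]  # X=1: chemotherapy + radiotherapy

weights = [w for _, _, ws in arms for w in ws]
total = sum(weights)
fractions = [w / total for w in weights]  # order: m00, m01, m10, m11 over m

first_phase = sum(f * f * rho_sq / n_per_arm for f, rho_sq, _ in arms)
variance = first_phase + total ** 2 / m

print([round(a, 3) for a in fractions], round(variance, 3))
```

The first fraction computes to about 0.1395, so it may round to 0.140 rather than the quoted 0.139; the other fractions and the variance agree to three decimal places.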
7 DISCUSSION
This paper has concentrated on the simplest 2P, ‘errors in variables’, design for etiological studies so as to provide general insights into efficiency. For dichotomous X, we have shown that the maximum benefit for efficiency can be easily predicted from a simple function, g̃, of sensitivity and specificity, and that it may be quite small. Given that the administration of 2P studies may be more difficult than 1P studies with possibly higher refusal rates (Deming, 1977), a small theoretical gain in efficiency may not be enough to justify their use.
The limitations of the results should be noted: X is dichotomous although Z could be a continuous variable categorized for sampling; differential misclassification is assumed; and the cost model ignores costs other than those associated with exposure determination. The theory can be adapted to situations where any of the initial calculations from (3.6) yield an ‘optimal’ sampling fraction greater than 1; this can occur when c2/c1 is small and the πi are small. This problem can be solved theoretically by setting the excessive fractions equal to 1 and finding new optimal values for the others subject to this constraint, but the software by Reilly and Salim (2000) provides convenient empirical solutions. Importantly, these additionally constrained optimal designs lead to greater variances than unconstrained problems, thus reducing efficiency even more compared to (3.8).
2P designs are likely to be substantially more efficient when nondifferential misclassification is correctly assumed. Empirical comparison (not shown) of minimum variances given in Table 1 of Palmgren (1987) for the nondifferential model with stratification by Y only, with the equivalent minimum derived here for the differential model, showed that the former assumption halved variance in some cases. The wide variation in efficiency between analysis methods found in comparative studies (e.g. Breslow and Chatterjee, 1999; Spiegelman and Casella, 1997; Sturmer et al., 2002) may be partly explained by this factor. The nondifferential/differential distinction has also been shown to have implications for testing H0: β=0; Palmgren (1987) found that a 2Ps design was never efficient for this task compared with 1P studies of X alone or Z alone. Accepting the differential assumption here, one might still question whether there is an estimation method more efficient than (2.1) and (2.4). There is evidence against this from the present study since (2.4) coincides with Carroll's maximum likelihood estimates in Example 1 and, in the case where there is stratification by Y only, (3.13) is equal to Palmgren's likelihood-based formula.
The addition of a second phase to a preexisting study based on Z need not be justified on the same cost grounds as a 2Ps study. The simple formulae (4.1) for the optimal allocation fractions, aij, give insight into the role played by parameters, such as sensitivity and specificity, as well as into the performance of balanced designs. They also show that optimal allocation at the second phase is independent of first-phase design. All these results are based on a differential misclassification assumption. An interesting, unanswered question is whether (4.1) would also be optimal under nondifferential misclassification.
Empirical comparisons suggest that, where there is an overlap between the optimality problems treated here and those covered by Reilly and Salim's software, the results are identical both in terms of design and standard error—provided that the pilot data supplied to the software is made to yield estimates of πi, 𝛉i, and λi identical to the correct values. It seems highly likely that their likelihood-based method, given appropriate distributional assumptions, is equivalent to the formulae here. The advantage of these formulae is their simplicity; also, by identifying the parameters which are the ‘drivers’ of optimal design, they show how one can easily evaluate the impact of uncertainty in estimates of these parameters. However, the software covers a much wider range of problems, including ‘missing data designs’, but with the restriction that the joint distribution of the first-phase variables is fixed in advance.
References
BRESLOW, N. E. AND CAIN, K. C. (
BRESLOW, N. E. AND CHATTERJEE, N. (
BRESLOW, N. E. AND HOLUBKOV, R. (
CAIN, K. AND BRESLOW, N. E. (
CARROLL, R. J., GAIL, M. H. AND LUBIN, J. H. (
DAHM, P. F., GAIL, M. H., ROSENBERG, P. S. AND PEE, D. (
DEMING, W. E. (
GREENLAND, S. (
GREENLAND, S. (
HOLCROFT, C. A. AND SPIEGELMAN, D. (
MCNAMEE, R. (
PALMGREN, J. (
PEPE, M. S., REILLY, M. AND FLEMING, T. R. (
REILLY, M. (
REILLY, M. AND PEPE, M. S. (
REILLY, M. AND SALIM, A. (
SPIEGELMAN, D. AND CASELLA, M. (
STURMER, T., THURIGEN, D., SPIEGELMAN, D., BLETTNER, M. AND BRENNER, H. (
THURIGEN, D., SPIEGELMAN, D., BLETTNER, M., HEUER, C. AND BRENNER, H. (