Estimating in Real Time the Efficacy of Measures to Control Emerging Communicable Diseases

Cauchemez, Simon; Boëlle, Pierre-Yves; Thomas, Guy; Valleron, Alain-Jacques

doi:10.1093/aje/kwj274

Abstract

Controlling an emerging communicable disease requires prompt adoption of measures such as quarantine. Assessment of the efficacy of these measures must be rapid as well. In this paper, the authors present a framework to monitor the efficacy of control measures in real time. Bayesian estimation of the reproduction number R (mean number of cases generated by a single infectious person) during an outbreak allows them to judge rapidly whether the epidemic is under control (R < 1). Only counts and time of onset of symptoms, plus tracing information from a subset of cases, are required. Markov chain Monte Carlo and Monte Carlo sampling are used to infer the temporal pattern of R up to the last observation. The operating characteristics of the method are investigated in a simulation study of severe acute respiratory syndrome–like outbreaks. In this particular setting, control measures lacking efficacy (R ≥ 1.1) could be detected after 2 weeks in at least 70% of the epidemics, with less than a 5% probability of a wrong conclusion. When control measures are efficacious (R = 0.5), this situation may be evidenced in 68% of the epidemics after 2 weeks and 92% of the epidemics after 3 weeks, with less than a 5% probability of a wrong conclusion.

communicable diseases, emerging, disease outbreaks, epidemiologic methods, population surveillance, SARS virus

SARS, severe acute respiratory syndrome

Outbreaks of severe communicable diseases usually prompt rapid intervention, as was the case with severe acute respiratory syndrome (SARS) in 2003 and for a number of epidemics reported to the Communicable Disease Surveillance and Response division of the World Health Organization (http://www.who.int/csr/en/). However, assessing the effectiveness of these interventions during the course of the epidemic remains a difficult task because of a lack of convenient tools.

The reproduction number R (the average number of cases generated by a single infectious person) is central to monitoring efficacy, since a natural threshold allows one to distinguish epidemics under control (R < 1) from others (R > 1) (1). The change in R with time could be obtained from perfect tracing of transmission but is seldom feasible because of biologic as well as logistic limitations in the field.

When the transmission rate is low and cases are clustered, the distribution of cluster sizes may be used to estimate R and detect increases in transmissibility (2–5).

In the epidemic context, it is possible to model transmission and derive R from the fit to available data, as was done with SARS, for instance (6). Doing so requires a detailed description of the natural history of the disease—likely to be missing at first for an emerging disease—and makes the approach very disease dependent. An alternative approach, which requires fewer assumptions, consists of reconstructing the chain of transmission. This approach is feasible, for example, when detailed data containing the spatial location of the cases are available (7, 8). In many instances, available data consist simply of the epidemic curve and partial tracing information. Generic estimation methods that could be used routinely with such data would be very useful for public health decision makers who have to determine in real time whether control measures should be reinforced.

It is remarkable that an estimate of R may be obtained from the epidemic curve alone, after marginalization over all possible chains of transmission, as described by Wallinga and Teunis (9). More precisely, these authors showed that, in the retrospective case, the distribution of the generation interval (the time separating onset of symptoms in an index case and that of a typical secondary case) enabled weighting of the chains of transmission and coming up with an estimate. In Cauchemez et al. (10), we extended their work to real time. However, in practice, the use of this last approach would be limited by two elements: 1) precise knowledge of the generation interval may not be available beforehand, especially for an emerging communicable disease; and 2) the use of daily estimates, often highly variable, makes appreciating the effect of control measures difficult.

In the following article, we describe a general estimation method that overcomes these problems. The approach requires no prior knowledge of the generation interval, but it uses partial tracing information that accrues during the outbreak. In a Bayesian framework, real-time estimates of R are therefore obtained from the epidemic curve and partial tracing information, simultaneously accounting for censored incubation times during the course of the outbreak and poor initial knowledge of the natural history of the disease. The new framework allows real-time estimation of R for any time period (as opposed to our previous study (10)), improving precision of the estimates and making it easier to judge the efficacy of control measures. In a simulation study, we use surrogate SARS-like outbreaks to investigate how the method performs.

MATERIALS AND METHODS

Assumptions and notations

Consider an epidemic outbreak observed from day 0 to day T. We make the following assumptions regarding the epidemic and collected data:

H1: No cases are imported during the course of the epidemic. In other words, all cases in the epidemic may ultimately be traced to those present at day 0.
H2: All cases are detected.
H3: Secondary cases are always reported after their index case.
H4: The characteristics of the generation interval do not change throughout the epidemic.
H5: Traced cases are a random sample of all cases.
H6: In the absence of an observed epidemic curve and of tracing information, all chains of transmission are equally likely.

We denote O_j the day of symptoms onset for case j and U_j the day of symptoms onset for his or her index case. At time T, a case j is in the “traced” set K_T when both O_j and U_j are known and linked together and is in the “untraced” set J_T when only O_j is known.

The generation interval of case j is the time O_j − U_j from symptoms onset of his or her index case to his or her own symptoms onset. A Weibull distribution with shape parameter η and scale parameter γ is assumed for the generation interval, with w(x; γ,η) = exp{−(x/γ)^η} − exp{−((x + 1)/γ)^η} the probability that the generation interval lasts x days.

For t ≤ T, we define n_t as the number of cases with symptoms onset at day t and X_t as the number of cases they infected. We will make use of the following decomposition:

\[X_{t}{=}X_{t}^{{-}}(T){+}X_{t}^{{+}}(T),\]

where the number of secondary cases from cases with onset at time t has been split into those with onset before or at

\(T(X_{t}^{{-}}(T),\)

“early” cases) and those with onset after

\(T(X_{t}^{{+}}(T),\)

“late” cases). Obviously, late secondary cases are censored at time T.

We define the reproduction number

\[R_{{\Delta}}{=}\frac{{\sum}_{t{\in}{\Delta}}X_{t}}{{\sum}_{t{\in}{\Delta}}n_{t}}\]

over a time period Δ as the average number of cases infected by a case with symptoms onset occurring during Δ. Therefore, the reproduction number is defined only when at least one case has been observed during the period Δ.

Statistical analysis

Bayesian hierarchical model.

A four-level Bayesian hierarchical model, described in detail in appendix 1, is built. The first level specifies the prior distribution of parameters {γ,η} characterizing the generation interval. The second level gives the likelihood of the observed sample of generation intervals. The third level allows reconstruction of complete case tracing data, where each untraced case is allocated to an observed index case (9). The last level characterizes the late number of secondary cases given the early number of secondary cases and allows correction for censorship of late secondary cases (10).

Predictive distribution of the temporal pattern of R.

Given observations and parameters {γ,η}, it is straightforward to sample from the distribution of the temporal pattern of (X_t)_t≤T considering the two last levels of the full Bayesian model. In appendix 2, we construct the predictive distribution of (X_t)_t≤T when parameters {γ,η} are imperfectly known and estimated from the subset of traced cases. In particular, we design a three-step procedure to sample from this distribution. A Markov chain Monte Carlo algorithm is set up to sample from the posterior distribution of parameters {γ,η} given available tracing data; the output of the Markov chain Monte Carlo procedure is then used in a Monte Carlo algorithm to reconstruct complete case tracing data, where each untraced case is allocated to an observed primary case; eventually, we correct for censorship of late secondary cases. Figure 1 summarizes the statistical framework.

FIGURE 1.

Open in new tab Download slide

Statistical framework used to estimate the reproduction number with data (epidemic curve and subset of traced cases) available up to time T. To handle uncertainty about the generation interval, a Markov chain Monte Carlo (MCMC) algorithm is set up to investigate the posterior distribution of the generation interval given case tracing data. The output of the MCMC is then used in a Monte Carlo algorithm to make an inference about complete case tracing up to time T. Eventually, a correction is applied for the censorship of late secondary cases during the course of the epidemic.

A decision rule to reinforce control measures.

Assume that control measures are implemented at time T_CM (<T). Here, we propose a decision rule to determine whether control measures are efficacious (R < 1), whether they lack efficacy (R > 1), or whether more data are required for decision.

If, in interval Δ⊂[T_CM,T], there is high predictive probability that R_Δ < 1, efficacy of control measures should be concluded, whereas lack of efficacy should be concluded if there is low predictive probability that R_Δ < 1. To make this rule operational, we choose two thresholds s₁ percent, s₂ percent (s₁ < s₂) and conclude lack of efficacy when the predictive probability P(R_Δ < 1) is <s₁ and efficacy when P(R_Δ < 1) is >s₂. If s₁ ≤ P(R_Δ < 1) is ≤s₂, the conclusion is delayed until more data are available.

We divided interval [T_CM,T] into three equal-sized intervals

\(({[}T_{\mathrm{CM}},t_{\mathrm{min}}{]},{[}t_{\mathrm{min}},t_{\mathrm{max}}{]},{[}t_{\mathrm{max}},T{]})\)

and applied the rule to interval Δ = [t_min,t_max]. We took s₁ = 5 percent and s₂ = 95 percent in the applications.

Simulation of SARS-like outbreaks

A simulation study of SARS-like outbreaks was used to investigate the operating characteristics of the proposed decision rule. Control measures were implemented on day T_CM = 20 days. The mean reproduction number before T_CM was 3. Four scenarios were simulated, with a mean reproduction number after T_CM equal to 0.5, 0.7, 1.1, or 1.3. Case tracing was simulated for 5 percent of the cases. For each scenario, 300 epidemics were simulated. Simulations are detailed in appendix 3.

In a first stage, the estimation method is used to analyze a single epidemic, simulated with R = 0.7 after control measures. The posterior distribution of the generation interval, the predictive mean, and the 95 percent credible interval of the temporal pattern of R are given according to time T of the last observation.

In a second stage, we give, for each scenario, the probability that control measures are said to be efficacious; that control measures are said to lack efficacy; or that more data are required, 2 and 3 weeks after control measures are implemented.

RESULTS

Real-time surveillance of an epidemic

Collected data.

In figure 2, parts (a) and (b) show the simulated data. Part (a) shows the daily rate of case onset. A total of 2,086 subjects were infected, with a peak incidence at day 28.

FIGURE 2.

Open in new tab Download slide

Monitoring of a severe acute respiratory syndrome–like outbreak using simulated data. Daily rate of case onset (part a) and cumulative number of traced cases (part b). In the simulation, the reproduction number was 3 before day 20 and 0.7 after; case tracing was available for 5% of the cases. Mean (part c) and standard deviation ((SD), part d) of the generation interval (GI) derived from case tracing data, according to time T of the last observation (solid line: posterior mean; dotted lines: 95% credible interval; dashed line: simulation value). Part (e): expectation (solid line) and 95% credible interval (dotted line) of the reproduction number R calculated for the last 10 days of observation (over the period [T-10, T-3]), according to time T of the last observation; dashed line, simulation value.

Part (b) presents the cumulative number of traced cases. Twelve cases were traced after 3 weeks, 36 after 4 weeks, and 61 after 5 weeks. In total, 114 cases were traced during the epidemic.

Posterior distribution of the generation interval.

Figure 2 also shows the posterior mean and 95 percent credible interval of the mean (part c) and standard deviation (part d) of the generation interval according to time T of the last observation. The Markov chain Monte Carlo algorithm did not converge for T < 14 days because there were ≤3 traced cases.

For both the mean and standard deviation of the generation interval, the 95 percent credible intervals shortened as the number of traced cases increased.

The mean of the generation interval was slightly biased for small values of T. Its posterior mean was 6.5 days (T = 21 days), 7.0 days (T = 28 days), 7.7 days (T = 35 days), and 8.4 days (T > 50 days) for a simulation value equal to 8.4 days.

Temporal pattern of the reproduction number.

In figure 2, part (e) shows the expectation and 95 percent credible interval of the reproduction number calculated for the last 10 days of observation, according to time T of the last observation.

Real-time estimates of R could be computed only once the posterior distribution of the generation interval was available, that is, for T ≥ 14. The real-time expectation of R remained close to its simulation value, irrespective of T.

Evaluating the efficiency of control measures

Table 1 gives the probability that control measures are said to lack efficacy; that control measures are said to be efficacious; or that more data are required, after 2 and 3 weeks, according to the mean reproduction number R.

TABLE 1.

Posterior probability (%) that control measures are said to lack efficacy, that control measures are said to be efficacious, or that more data are required, after 2 and 3 weeks, according to the mean reproduction number R*

	After 2 weeks					After 3 weeks
	Lack of efficacy	More data required	Efficacy	Lack of efficacy	More data required	Efficacy
R = 0.5	1.8	30.6	67.6	0.3	7.3	92.4
R = 0.7	7.2	71.0	21.8	1.8	14.7	83.5
R = 1.1	77.9	20.7	1.4	71.6	23.6	4.8
R = 1.3	89.7	8.9	1.4	89.7	8.5	1.8

	After 2 weeks					After 3 weeks
	Lack of efficacy	More data required	Efficacy	Lack of efficacy	More data required	Efficacy
R = 0.5	1.8	30.6	67.6	0.3	7.3	92.4
R = 0.7	7.2	71.0	21.8	1.8	14.7	83.5
R = 1.1	77.9	20.7	1.4	71.6	23.6	4.8
R = 1.3	89.7	8.9	1.4	89.7	8.5	1.8

*

The decision rule is based on the predictive probability p that R < 1: if p < 5%, control measures lack efficacy; if p > 95%, control measures are efficacious. Otherwise, more data are required.

Open in new tab

TABLE 1.

Posterior probability (%) that control measures are said to lack efficacy, that control measures are said to be efficacious, or that more data are required, after 2 and 3 weeks, according to the mean reproduction number R*

	After 2 weeks					After 3 weeks
	Lack of efficacy	More data required	Efficacy	Lack of efficacy	More data required	Efficacy
R = 0.5	1.8	30.6	67.6	0.3	7.3	92.4
R = 0.7	7.2	71.0	21.8	1.8	14.7	83.5
R = 1.1	77.9	20.7	1.4	71.6	23.6	4.8
R = 1.3	89.7	8.9	1.4	89.7	8.5	1.8

	After 2 weeks					After 3 weeks
	Lack of efficacy	More data required	Efficacy	Lack of efficacy	More data required	Efficacy
R = 0.5	1.8	30.6	67.6	0.3	7.3	92.4
R = 0.7	7.2	71.0	21.8	1.8	14.7	83.5
R = 1.1	77.9	20.7	1.4	71.6	23.6	4.8
R = 1.3	89.7	8.9	1.4	89.7	8.5	1.8

*

The decision rule is based on the predictive probability p that R < 1: if p < 5%, control measures lack efficacy; if p > 95%, control measures are efficacious. Otherwise, more data are required.

Open in new tab

When control measures lack efficacy (R ≥ 1.1), the probability of making the wrong judgment is less than 5 percent after 2 and 3 weeks. When control measures are efficacious (R ≤ 0.7), that probability is less than 2 percent after 3 weeks but may reach 7.2 percent after 2 weeks.

When control measures are efficacious, a correct conclusion is obtained in 67.6 percent (R = 0.5) and 21.8 percent (R = 0.7) of the simulations after 2 weeks and in 92.4 percent (R = 0.5) and 83.5 percent (R = 0.7) of the simulations after 3 weeks.

When control measures lack efficacy, a correct conclusion is obtained in 89.7 percent (R = 1.3) and 77.9 percent (R = 1.1) of the simulations after 2 weeks and in 89.7 percent (R = 1.3) and 71.6 percent (R = 1.1) of the simulations after 3 weeks.

DISCUSSION

In this paper, we have proposed a statistical framework for real-time monitoring of communicable disease outbreaks. In this context, requiring as few data as possible is desirable in order to limit exposure of data collectors and allow efficient use of resources. The method we proposed simply used the epidemic curve and a subset of traced cases. The approach required no prior knowledge of the disease and could therefore be used for emerging communicable diseases.

The existence of a deterministic threshold (R = 1) with a simple interpretation in terms of control of the epidemic led us to propose formal rules for assessing the efficacy of control depending on this threshold. However, it is possible to define alternative rules; for example, one could assess whether R is decreasing by monitoring the posterior probability of the ratio of R in one period to an earlier one. Defining and evaluating other rules requires further research. The method could also be used to estimate the basic reproduction number R₀, which is the mean number of cases generated by a typical infectious person when the entire population is susceptible. This information would naturally arise from the value estimated in the early part of the outbreak, before many people recovered or died.

A natural choice would be to implement the decision rule for the whole postintervention period. However, this choice was not found optimal because 1) daily estimates of R are biased for the first days of the interval (the method smoothes the temporal pattern of R; refer to Cauchemez et al. (10)); and 2) daily estimates of R have a large variance for the last days of the interval (10). Our approach of discarding the first and last days of the interval reduced both the bias and variance of the estimates.

There is a compromise to reach between 1) the capture of the temporal pattern of R and 2) the precision of the estimates for each time period. Since precision increases with the number of symptoms onsets over the time period (10), the second objective contradicts the first one. Consequently, at the peak of the epidemic, it might be possible to estimate R during short periods, whereas, during periods of low incidence, it might be optimal to consider longer time periods.

A number of assumptions were made regarding the epidemic and the data collected. First, we assumed that no cases were imported during the course of the epidemic (H1). If this assumption is violated, the estimate of R will be inflated for the days preceding the arrival of such a case, albeit to a small extent. Next, we assumed that all cases were detected (H2), therefore excluding asymptomatic carriage or underreporting. The effect of underreporting on the estimate of R is not clear-cut: downward bias is likely because undetected cases are discounted from the total offspring of earlier cases. However, at the same time, upward bias may follow because the offspring of an undetected case will be allocated to observed cases. We assumed that secondary cases were always reported after their index case (H3) was by considering w(.) null for negative values. This assumption is required for real-time analysis. It could be violated, especially if the incubation period of the disease is long and variable.

The next two assumptions concerned the generation interval, which was supposed to be unknown at the beginning of the epidemic. We assumed that the parameters defining the generation interval would not change throughout the epidemic (H4). This assumption could be relaxed, at the price of increased variability in the estimates, by restricting estimation to the subset of the most recent traced cases and provided that tracing is carried out throughout the epidemic. In the application, we found that, at the very beginning of the epidemic, it might not be possible to provide estimates because too few cases were traced. Using a more informative prior on the generation interval could be considered, until a few dozen cases have been traced. First estimates of the mean of the generation interval were underestimated, which was expected because the first-generation intervals to be detected are the shortest ones. However, for our simulation, the bias remained small and had little effect on the predictive distribution of the reproduction number. Hypothesis H5 was that a random sample of all cases would be traced, which would not be the case if, for example, long generation intervals were unlikely to be sampled. Adopting a parametric form for the generation interval may make the procedure more robust in this respect. Truncation of the generation interval could also be considered explicitly.

Last, we assumed that, in the absence of an observed epidemic curve and tracing information, all chains of transmission were equally likely (H6). This assumption was required to reconstruct complete case tracing data up to the last observation (9). The framework could be extended to integrate more detailed information regarding the social network when available.

An alternative approach for inference would be to sample from the full Bayesian model (refer to appendixes 1 and 2; in particular, compare equations A5 and A6). However, there is little reason to think that it would improve the results because case tracing data are expected to be the main source of information on the generation interval. The approach would be more difficult to implement (11, 12), with little anticipated benefit. The use of two separate steps (inference on the generation interval from the observed case tracing data, followed by reconstruction of the complete case tracing data) appeared to be more pragmatic. If the epidemic curve significantly improved our knowledge of the generation interval, exploration of the full Bayesian model would reduce credible intervals for the reproduction number, but our approach would remain valid.

In the simulation study, we assumed that dates of symptoms onset and case tracing information were available in real time. In practice, we may expect symptoms onset to be available nearly in real time, with a time lag required for case tracing information. The proposed framework makes it possible to retrospectively add case tracing information.

We have presented a statistical framework for real-time surveillance of emerging infectious diseases. It should be of benefit to public health decision makers who have to determine in real-time whether to reinforce control measures.

APPENDIX 1

Bayesian Hierarchical Model

The full Bayesian model follows:

\begin{eqnarray*}&&P\left({\{}X_{t}^{{+}}(T),X_{t}^{{-}}(T){\}}_{t{\leq}T},{\{}O_{j}{\}}_{j{\in}K_{T}},{\{}U_{j}{\}}_{j{\in}J_{T}},\right.\ \\&&\left.\ \mathrm{{\theta}},\mathrm{{\alpha}},\mathrm{{\beta}}{\vert}{\{}U_{j}{\}}_{j{\in}K_{T}},{\{}O_{j}{\}}_{j{\in}J_{T}}\right)\\&&{\propto}P(\mathrm{{\theta}})P\left({\{}O_{j}{\}}_{j{\in}K_{T}}{\vert}{\{}U_{j}{\}}_{j{\in}K_{T}},\mathrm{{\theta}}\right)\\&&{\times}P\left({\{}U_{j}{\}}_{j{\in}J_{T}}{\vert}{\{}O_{j}{\}}_{j{\in}J_{T}{\cup}K_{T}},\mathrm{{\theta}}\right)\\&&{\times}P\left({\{}X_{t}^{{-}}(T){\}}_{t{\leq}T}{\vert}{\{}U_{j}{\}}_{j{\in}J_{T}{\cup}K_{T}}\right)\\&&{\times}P\left({\{}X_{t}^{{+}}(T){\}}_{t{\leq}T}{\vert}{\{}X_{t}^{{-}}(T){\}}_{t{\leq}T},{\{}O_{j}{\}}_{j{\in}J_{T}{\cup}K_{T}},\mathrm{{\theta}},\mathrm{{\alpha}},\mathrm{{\beta}}\right)\\&&{\times}P(\mathrm{{\alpha}},\mathrm{{\beta}})\end{eqnarray*}

Level 1.

The first term of equation A1 gives the prior distributions of θ = {γ,η}. Here, we specify vague Exponential priors Exp(0.001) for these parameters.

Level 2.

The second term of equation A1 gives the likelihood of the observed sample of generation intervals:

\[P\left({\{}O_{j}{\}}_{j{\in}K_{T}}{\vert}{\{}U_{j}{\}}_{j{\in}K_{T}}\mathrm{{\gamma}},\mathrm{{\eta}}\right){=}{{\prod}_{j{\in}K_{T}}}w(O_{j}{-}U_{j};\mathrm{{\gamma}},\mathrm{{\eta}})\]

Level 3.

Following Wallinga and Teunis (9), the third term of equation A1 gives the probability of a completely reconstructed case tracing, where each untraced case is allocated to an observed case.

Now, given complete case tracing data, the early number of secondary cases is readily available:

\[X_{t}^{{-}}(T){=}{{\sum}_{j{\in}J_{t}{\cup}K_{t}}}1{\{}U_{j}{=}t{\}}\]

The fourth term of equation A1 is therefore the indicator function.

Level 4.

The last terms of equation A1 constitute the predictive distribution of the late number of secondary cases given the early number of secondary cases and θ. In a previous study (10), we showed that, under the assumption that X_t is Poisson distributed with mean n_tλ, where λ has a vague Gamma prior (α = 10⁻⁵, β = 10⁻⁵), the predictive distribution of

\(X_{t}^{{+}}(T)\)

is Negative Binomial with number

\(X_{t}^{{-}}(T){+}\mathrm{{\alpha}}\)

and probability

\((n_{t}W(T{-}t;\mathrm{{\gamma}},\mathrm{{\eta}}){+}\mathrm{{\beta}})/(n_{t}{+}\mathrm{{\beta}}),\)

where

\(W(x;\mathrm{{\gamma}},\mathrm{{\eta}}){=}{\sum}_{u{=}0}^{x}w(u;\mathrm{{\gamma}},\mathrm{{\eta}}).\)

APPENDIX 2

Predictive Distribution of the Temporal Pattern of R

We have to predict

\(Y{=}{\{}{\{}X_{t}^{{+}}(T){\}}_{t{\leq}T},{\{}U_{j}{\}}_{j{\in}J_{T}}{\}}\)

given observations

\(D{=}{\{}{\{}O_{j}{\}}_{j{\in}J_{T}},{\{}O_{j},U_{j}{\}}_{j{\in}K_{T}}{\}}\)

to determine the temporal pattern of R.

If θ = {γ,η} was known, it would be a straightforward process to sample from P(Y|θ, D) (refer to levels 3 and 4 of the full Bayesian model (appendix 1)). In practice, however, θ is a nuisance parameter. A variety of approaches have been developed to make inferences in the presence of nuisance parameters (13, 14). In Wallinga and Teunis (9) and Cauchemez et al. (10), inference was implicitly based on

\(\mathrm{{\bar{{\theta}}}}\)

that maximized the likelihood of the observed sample of generation intervals. An alternative approach is to integrate out the uncertainty on θ by using a predictive distribution of the form

\[P(Y{\vert}D){=}{\int}f(\mathrm{{\theta}})P(Y{\vert}\mathrm{{\theta}},D)d\mathrm{{\theta}}\]

with a suitable f(θ) representing uncertainty about its value. Several choices are possible. One could, for example, use the prior distribution of θ. A more sensible choice is the posterior distribution given all observed data P(θ|D) (15, 16), yielding the following predictive distribution:

\[P(Y{\vert}D){=}{\int}P\left(\mathrm{{\theta}}{\vert}{\{}O_{j}{\}}_{j{\in}J_{T}},{\{}O_{j},U_{j}{\}}_{j{\in}K_{T}}\right)P(Y{\vert}\mathrm{{\theta}},D)d\mathrm{{\theta}}\]

This quantity may be numerically approximated by Monte Carlo integration. In this case, samples from the full Bayesian model must be obtained (11, 12). However, in our problem, case tracing data are expected to be the main source of information on the generation interval. A reasonable choice for f(θ) is therefore to use the posterior distribution of θ given traced cases only, yielding the following predictive distribution:

\[P(Y{\vert}D){=}{\int}P\left(\mathrm{{\theta}}{\vert}{\{}O_{j},U_{j}{\}}_{j{\in}K_{T}}\right)P(Y{\vert}\mathrm{{\theta}},D)d\mathrm{{\theta}}\]

This formulation makes the calculation easier because the first term in the integrand is proportional to the product of levels 1 and 2 of the full Bayesian model, and the second term is the product of levels 3 and 4 of the full Bayesian model.

To draw a sample Y₁, …, Y_N from equation A6, we proceed as follows (15):

Step 1: draw a sample θ₁, …, θ_N from
\(P(\mathrm{{\theta}}{\vert}{\{}O_{j},U_{j}{\}}_{j{\in}K_{T}}),\)
defined as the product of levels 1 and 2 of the full Bayesian model.
Step 2: for i = 1, …, N, for each untraced case j in J_T, draw the day
\(U_{j}^{i}\)
of symptoms of his or her primary case given θ_i and the epidemic curve from level 3 of the full Bayesian model.
Step 3: for i = 1, …, N, for each t ≤ T, draw
\(X_{t}^{i{+}}(T)\)
given θ_i and
\(X_{t}^{i{-}}(T){=}{\sum}_{j{\in}J_{T}{\cup}K_{T}}1(U_{j}^{i}{=}t)\)
from level 4 of the full Bayesian model.

For step 1, we designed a Metropolis-Hastings Markov chain Monte Carlo algorithm (17). We performed 10 × N iterations for each run of this algorithm. The output was then recorded once every 10 iterations to constitute a sample from the posterior distribution. For step 2, we followed Wallinga and Teunis (9), who have shown that, given θ and the epidemic curve

\({\{}O_{j}{\}}_{j{\in}J_{T}{\cup}K_{T}},\)

\begin{eqnarray*}&&P\left(U_{j}{=}t{\vert}{\{}O_{j}{\}}_{j{\in}J_{T}{\cup}K_{T}},\mathrm{{\gamma}},\mathrm{{\eta}}\right)\\&&{=}\frac{(n_{t}{-}1{\{}O_{j}{=}t{\}})w(O_{j}{-}t;\mathrm{{\gamma}},\mathrm{{\eta}})}{{\sum}_{k{\leq}T}(n_{k}{-}1{\{}O_{j}{=}k{\}})w(O_{j}{-}k;\mathrm{{\gamma}},\mathrm{{\eta}})}\end{eqnarray*}

where n_t is the number of cases reported on day t. Equation A7 implies that the imputed date of symptoms onset for the index case must be a day in which at least one case was reported. In other words, imputation amounts to adding a “connection” between two existing cases, but never to create a nonexisting case. Step 3 was straightforward.

Eventually, we obtain a sample

\(({\{}x_{t}^{i}{\}}_{t{\leq}T})_{i{=}1,{\ldots},N}\)

of N realizations of the temporal pattern (X_t)_t≤T drawn from its predictive distribution. Then, partitioning the temporal period 0-T in M consecutive periods

\({\{}{\Delta}_{m}{\}}_{m{=}1,{\ldots},M},\)

equation 1 implies that

\[\left(\left\{r_{{\Delta}_{m}}^{i}{=}\left({{\sum}_{t{\in}{\Delta}_{m}}}x_{t}^{i}\right)\left/\left({{\sum}_{t{\in}{\Delta}_{m}}}n_{t}\right)\right\}_{m{=}1,{\ldots},M}\right)\right._{i{=}1,{\ldots},N}\]

constitutes a sample of size N drawn from the joint predictive distribution of the temporal pattern of the reproduction number

\({\{}R_{{\Delta}_{m}}{\}}_{m{=}1,{\ldots},M}.\)

In practice, we draw a sample of size N = 5,000.

APPENDIX 3

Simulation of SARS-like Outbreaks

Simulated epidemics were started with 10 index cases. The generation interval had a Weibull distribution with a mean of 8.4 days and a standard deviation of 3.8 days to reflect SARS (18). Control measures were implemented on day T_CM = 20 days. For each case detected before T_CM days, the number of secondary cases was drawn from a Negative Binomial with a mean of 3 and a shape parameter of 0.18 (9). Four scenarios were considered in which the mean reproduction number after T_CM was r = 0.5, 0.7, 1.1, and 1.3. Accordingly, the shape parameter k was calculated as k = −0.0236r² + 0.131r, which is consistent with Wallinga and Teunis (9). Case tracing was simulated for 5 percent of the cases. For each scenario, 300 epidemics were simulated.

We chose T_CM = 20 days so that, in the scenario consistent with the SARS outbreak in Hong Kong in 2003 (R = 0.7 after T_CM) (9), the expected number of cases (n = 1,781) was close to the number observed during the epidemic (n = 1,755) (19).

This study was funded by the EU Sixth Framework Programme for research for policy support (contract SP22-CT-2004-511066).

The authors thank Drs. Peter Dodd and Elizabeth Halloran for their useful comments on the manuscript.

Conflict of interest: none declared.

References

1.

Anderson RM, May RM. Infectious diseases of humans: dynamics and control. Oxford, United Kingdom: Oxford University Press,

1991

.

2.

De Serres G, Gay NJ, Farrington CP. Epidemiology of transmissible diseases after elimination.

Am J Epidemiol

2000

;

151

:

1039

–48; discussion 1049–52.

3.

Farrington CP, Kanaan MN, Gay NJ. Branching process models for surveillance of infectious diseases controlled by mass vaccination.

Biostatistics

2003

;

4

:

279

–95.

4.

Ferguson NM, Fraser C, Donnelly CA, et al. Public health. Public health risk from the avian H5N1 influenza epidemic.

Science

2004

;

304

:

968

–9.

5.

Jansen VA, Stollenwerk N, Jensen HJ, et al. Measles outbreaks in a population with declining vaccine uptake.

Science

2003

;

301

:

804

.

6.

Riley S, Fraser C, Donnelly CA, et al. Transmission dynamics of the etiological agent of SARS in Hong Kong: impact of public health interventions.

Science

2003

;

300

:

1961

–6.

7.

Ferguson NM, Donnelly CA, Anderson RM. Transmission intensity and impact of control policies on the foot and mouth epidemic in Great Britain.

Nature

2001

;

413

:

542

–8.

8.

Haydon DT, Chase-Topping M, Shaw DJ, et al. The construction and analysis of epidemic trees with reference to the 2001 UK foot-and-mouth outbreak.

Proc Biol Sci

2003

;

270

:

121

–7.

9.

Wallinga J, Teunis P. Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures.

Am J Epidemiol

2004

;

160

:

509

–16.

10.

Cauchemez S, Boëlle PY, Donnelly CA, et al. Real-time estimates in early detection of SARS.

Emerg Infect Dis

2006

;

12

:

110

–13.

11.

Cauchemez S, Carrat F, Viboud C, et al. A Bayesian MCMC approach to study transmission of influenza: application to household longitudinal data.

Stat Med

2004

;

23

:

3469

–87.

12.

Cauchemez S, Temime L, Guillemot D, et al. Investigating heterogeneity in pneumococcal transmission: a Bayesian MCMC approach applied to a follow-up of schools. J Am Stat Assoc (in press).

13.

Berger JO, Liseo B, Wolpert RL. Integrated likelihood methods for eliminating nuisance parameters.

Stat Sci

1999

;

14

:

1

–22.

14.

Bjornstad JF. Predictive likelihood: a review.

Stat Sci

1990

;

5

:

242

–54.

15.

Gelfand AE. Model determination using sampling-based methods. In: Gilks WR, Richardson S, Spiegelhalter DJ, eds. MCMC in practice. London, United Kingdom: Chapman and Hall,

1996

:145–61.

16.

Rubin DB. Bayesianly justifiable and relevant frequency calculations for the applied statistician.

Ann Stat

1984

;

12

:

1151

–72.

17.

Gilks WR, Richardson S, Spiegelhalter DJ. Markov chain Monte Carlo in practice. London, United Kingdom: Chapman and Hall,

1996

.

18.

Lipsitch M, Cohen T, Cooper B, et al. Transmission dynamics and control of severe acute respiratory syndrome.

Science

2003

;

300

:

1966

–70.

19.

Leung GM, Hedley AJ, Ho LM, et al. The epidemiology of severe acute respiratory syndrome in the 2003 Hong Kong epidemic: an analysis of all 1755 patients.

Ann Intern Med

2004

;

141

:

662

–73.

Download all slides

Month:	Total Views:
December 2016	3
January 2017	1
February 2017	5
March 2017	10
April 2017	7
May 2017	9
June 2017	2
July 2017	2
August 2017	7
September 2017	2
October 2017	5
November 2017	11
December 2017	16
January 2018	9
February 2018	21
March 2018	25
April 2018	36
May 2018	22
June 2018	17
July 2018	19
August 2018	14
September 2018	11
October 2018	16
November 2018	21
December 2018	20
January 2019	19
February 2019	22
March 2019	29
April 2019	19
May 2019	29
June 2019	6
July 2019	21
August 2019	23
September 2019	11
October 2019	32
November 2019	12
December 2019	28
January 2020	15
February 2020	54
March 2020	114
April 2020	199
May 2020	125
June 2020	94
July 2020	105
August 2020	114
September 2020	60
October 2020	39
November 2020	67
December 2020	42
January 2021	25
February 2021	47
March 2021	49
April 2021	37
May 2021	47
June 2021	35
July 2021	19
August 2021	30
September 2021	33
October 2021	56
November 2021	15
December 2021	25
January 2022	36
February 2022	36
March 2022	39
April 2022	34
May 2022	49
June 2022	13
July 2022	22
August 2022	24
September 2022	42
October 2022	63
November 2022	16
December 2022	28
January 2023	29
February 2023	41
March 2023	13
April 2023	14
May 2023	17
June 2023	33
July 2023	22
August 2023	27
September 2023	31
October 2023	30
November 2023	28
December 2023	26
January 2024	54
February 2024	29
March 2024	20
April 2024	25

Article Contents

Estimating in Real Time the Efficacy of Measures to Control Emerging Communicable Diseases

Abstract

MATERIALS AND METHODS

Assumptions and notations

Statistical analysis

Bayesian hierarchical model.

Predictive distribution of the temporal pattern of R.

A decision rule to reinforce control measures.

Simulation of SARS-like outbreaks

RESULTS

Real-time surveillance of an epidemic

Collected data.

Posterior distribution of the generation interval.

Temporal pattern of the reproduction number.

Evaluating the efficiency of control measures

DISCUSSION

APPENDIX 1

Bayesian Hierarchical Model

Level 1.

Level 2.

Level 3.

Level 4.

APPENDIX 2

Predictive Distribution of the Temporal Pattern of R

APPENDIX 3

Simulation of SARS-like Outbreaks

References

Citations

Views

Altmetric

Email alerts

Citing articles via

Latest

Most Read

Most Cited

Looking for your next opportunity?

Article Contents

Estimating in Real Time the Efficacy of Measures to Control Emerging Communicable Diseases

Abstract

MATERIALS AND METHODS

Assumptions and notations

Statistical analysis

Bayesian hierarchical model.

Predictive distribution of the temporal pattern of R.

A decision rule to reinforce control measures.

Simulation of SARS-like outbreaks

RESULTS

Real-time surveillance of an epidemic

Collected data.

Posterior distribution of the generation interval.

Temporal pattern of the reproduction number.

Evaluating the efficiency of control measures

DISCUSSION

APPENDIX 1

Bayesian Hierarchical Model

Level 1.

Level 2.

Level 3.

Level 4.

APPENDIX 2

Predictive Distribution of the Temporal Pattern of R

APPENDIX 3

Simulation of SARS-like Outbreaks

References

Citations

Views

Altmetric

Email alerts

Citing articles via

Latest

Most Read

Most Cited

Looking for your next opportunity?

This Feature Is Available To Subscribers Only