A SAS macro for estimation of direct adjusted survival curves based on a stratified Cox regression model
Introduction
The Cox [1] model or log rank test [2] is commonly used in medical studies to compare the survival of patients on different treatments. In randomized clinical trials comparisons between treatments are direct and summary survival curves produced by using a Kaplan–Meier [3] technique are used to represent the survival experience of a patient given a specific treatment. These unadjusted curves represent the typical patient since randomization assures that patients are equally mixed between treatment arms with respect to possible confounding factors which could affect outcome.
When retrospective or non-randomized trials are used to make comparisons between treatments, the distribution of covariates between the treatment arms is often not the same. To make comparisons between treatments an adjusted analysis, typically based on the Cox model, is needed. This analysis gives the relative risk of survival between treatments arms adjusted for covariates. When the confounding risk factors distributions differ between treatment arms the summary Kaplan–Meier curves for each treatment arm can be misleading and not representative of the “average” patient on a given treatment arm.
Several approaches to graphically representing the survival experience of an average patient on a given treatment in a retrospective trial are possible. When there are only a few categorical covariates it is possible to provide summary survival curves for each combination of levels of the covariates. This, however, becomes intractable as the number of covariates grows or for models with continuous covariates.
Two methods of estimating an average or adjusted survival curve for each treatment following a Cox regression analysis have been proposed. This curve is hoped to faithfully represent the survival experience of a typical patient in the population who was given a particular treatment. For each treatment we first fit the model:where is the vector of covariates to be adjusted for and is the baseline hazard rate for the ith treatment. The function can either follow a proportional hazard model for the treatment effect () with or follow a model with distinct rates for each treatment. Estimates of are obtained by a standard Cox analysis. Estimates of the baseline hazard rates are obtained by Breslow’s [4] estimator.
The first method, proposed by Neuberger et al. [5], replaces the covariate by the average value of in the entire sample, . This adjusted survival curve:where is the estimated cumulative hazard function, can be computed in SAS using the command. This curve represents the survival experience of a patient with a prognostic index equal to the average prognostic index of all patients. This method, while easy to implement in practice, has several drawbacks. First, the covariate value for the average patient may be quite meaningless for categorical variables. For example, if one of the covariates is gender coded as 0 for male and 1 for female, the meaning of patient with a sex covariate of 0.4 is hard to interpret. Second, as discussed in Thomsen et al. [6] this method does not account for the sample variability in the prognostic indicator from individual to individual.
The second method often called the “direct adjusted survival curve” by among others Chang et al. [7], Makuch [8] and Gail and Byar [9] averages the estimated survival curves for each patient. That is
This method averages survival curves for each patient in the sample rather than the covariates and produces a more representative survival curve. Lee et al. [10] and Ghali et al. [11] provided programs in SAS, STATA, and S-plus for deriving such curves.
Other methods have been proposed to make adjustments when there is a treatment group present. Nieto and Coresh [12] provided a comparison of these methods. Cole and Hernán [13] discussed a technique that uses weights from a logistic regression of the covariates on treatment indicator to make adjustments to the survival estimator. They provided a SAS macro to implement this method. The macro seems to require only two treatment groups and does not provide estimates of the precision of the survival estimates.
Existing programs to compute the direct adjusted survival curves are of limited utility since they do not provide estimates of the uncertainty in the estimators such as the standard errors or confidence intervals. These programs have focused on the case where the treatment hazards ratio, after adjustment for covariates, are constant. They are not applicable when the treatment hazards are not proportional. In this case it is best to represent survival with a Cox model stratified on treatment and use this model for making inference about the direct adjusted survival for each treatment. These stratified models allow for a representation of treatments whose efficacy relative to each other changes over time. They are less model dependent than the more restrictive proportional hazards rate models. In Zhang and Klein [14] a confidence band for the difference of treatment curves based on a stratified Cox model using a Monte Carlo approach was presented.
We have implemented a SAS macro that computes the directed adjusted survival function for treatment groups based on either an unstratified Cox model or a stratified Cox model. The macro also produces standard errors of the estimates of survival and standard errors of the difference in survival between pairs of treatment groups. The standard errors of the difference can be used to make pointwise comparisons of treatment groups.
In Section 2 we review the estimation techniques for the direct adjusted survival estimators and their standard errors based on a stratified Cox model. The variance estimations for the direct adjusted survival probabilities and the differences of the direct adjusted survival probabilities are given in Appendix A. We describe the SAS macro and its output in Section 3. In Section 4 we utilize the SAS macro to analyze stem cell transplant data and Ewing’s sarcoma data. Discussions are given in Section 5.
Section snippets
Estimating the direct adjusted survival curve
Let the observations on subject j of treatment group i be for and , where is the observed time, if the subject is censored, otherwise, and are the covariates. Gail and Byar [9] derived the variance for the direct adjusted survival probabilities based on an unstratified Cox proportional hazards model. Here, we consider a variance estimator for the direct adjusted survival curve based on a stratified Cox model (1). For time t, the survival
The SAS macro
We have written a SAS macro to compute the direct adjusted survival curves. The macro reports for and for .
The macro requires a SAS data set with the following variables: (1) a variable with the failure time; (2) an indicator variable that indicates if an event has occurred (coded as 1 for an event and 0 for censoring); (3) a variable that indexes the treatments (coded as ); and (4) variables for all risk factors.
Suppose the macro is saved as a SAS
Example 1
In this example we show that our proposed SAS macro can analyze the survival data from more than two treatment groups and can handle relative large data sets efficiently. Besien et al. [18] conducted a study to compare three types of stem cell transplantation (unpurged autologous, purged autologous and allogeneic) for treating follicular lymphoma patients. The study cohort consisted of 904 patients: 597 patients received unpurged autologous transplants, 131 patients received purged autologous
Discussions
We have presented a SAS macro to compute the direct adjusted survival curves and the variances based on a stratified or unstratified Cox model. These curves, as discussed in Thomsen et al. [6], provide more realistic estimates of the “average” survival probability for a treatment by presenting the average survival curve if each patient in the sample had received a given treatment.
The average survival curve presented here is equal to , where the expectation is taken over the empirical
Acknowledgments
Dr. Klein and Dr. Zhang’s research was supported by National Cancer Institute grant 2 R01 CA54706-10. We thank the CIBMTR for providing us the data.
References (18)
- et al.
Corrected group prognostic curves and summary statistics
J. Chron. Dis.
(1982) Adjusted survival curve estimation using covariates
J. Chron. Dis.
(1982)- et al.
Adjusted survival curves with inverse probability weights
Comput. Meth. Programs Biomed.
(2004) Regression models and life-tables (with discussion)
J. Roy. Stat. Soc.: Ser. B
(1972)- et al.
Survival Analysis: Techniques for Censored and Truncated Data
(2003) - et al.
Nonparametric estimation from incomplete observations
J. Am. Stat. Assoc.
(1958) Analysis of survival data under the proportional hazards model
Int. Stat. Rev.
(1975)- et al.
Use of a prognostic index in evaluation of liver transplantation for primary biliary cirrhosis
Transplantation
(1986) - et al.
A note on the calculation of expected survival, illustrated by the survival of liver transplant patients
Stat. Med.
(1991)
Cited by (250)
Effect of Autograft CD34<sup>+</sup> Dose on Outcome in Pediatric Patients Undergoing Autologous Hematopoietic Stem Cell Transplant for Central Nervous System Tumors
2023, Transplantation and Cellular TherapySchool performance and psychiatric comorbidity in childhood absence epilepsy: A Danish cohort study
2023, European Journal of Paediatric NeurologyEffect of previous coronary stenting on subsequent coronary artery bypass grafting outcomes
2022, Journal of Thoracic and Cardiovascular Surgery