A SAS macro for estimation of direct adjusted survival curves based on a stratified Cox regression model

https://doi.org/10.1016/j.cmpb.2007.07.010Get rights and content

Abstract

Often in biomedical research the aim of a study is to compare the outcomes of several treatment arms while adjusting for multiple clinical prognostic factors. In this paper we focus on computation of the direct adjusted survival curves for different treatment groups based on an unstratified or a stratified Cox model. The estimators are constructed by taking the average of the individual predicted survival curves. The method of direct adjustment controls for possible confounders due to an imbalance of patient characteristics between treatment groups. This adjustment is especially useful for non-randomized studies. We have written a SAS macro to estimate and compare the direct adjusted survival curves. We illustrate the SAS macro through the examples analyzing stem cell transplant data and Ewing’s sarcoma data.

Introduction

The Cox [1] model or log rank test [2] is commonly used in medical studies to compare the survival of patients on different treatments. In randomized clinical trials comparisons between treatments are direct and summary survival curves produced by using a Kaplan–Meier [3] technique are used to represent the survival experience of a patient given a specific treatment. These unadjusted curves represent the typical patient since randomization assures that patients are equally mixed between treatment arms with respect to possible confounding factors which could affect outcome.

When retrospective or non-randomized trials are used to make comparisons between treatments, the distribution of covariates between the treatment arms is often not the same. To make comparisons between treatments an adjusted analysis, typically based on the Cox model, is needed. This analysis gives the relative risk of survival between treatments arms adjusted for covariates. When the confounding risk factors distributions differ between treatment arms the summary Kaplan–Meier curves for each treatment arm can be misleading and not representative of the “average” patient on a given treatment arm.

Several approaches to graphically representing the survival experience of an average patient on a given treatment in a retrospective trial are possible. When there are only a few categorical covariates it is possible to provide summary survival curves for each combination of levels of the covariates. This, however, becomes intractable as the number of covariates grows or for models with continuous covariates.

Two methods of estimating an average or adjusted survival curve for each treatment following a Cox regression analysis have been proposed. This curve is hoped to faithfully represent the survival experience of a typical patient in the population who was given a particular treatment. For each treatment we first fit the model:λi(t|Z)=λ0i(t)exp{βTZ},where Z is the vector of covariates to be adjusted for and λ0i(t) is the baseline hazard rate for the ith treatment. The function λ0i(t) can either follow a proportional hazard model for the treatment effect (γi) with λ0i(t)=λ0(t)exp{γi} or follow a model with distinct rates for each treatment. Estimates of β are obtained by a standard Cox analysis. Estimates of the baseline hazard rates are obtained by Breslow’s [4] estimator.

The first method, proposed by Neuberger et al. [5], replaces the covariate Z by the average value of Z in the entire sample, Z¯. This adjusted survival curve:Ŝi(t)=exp{Λˆ0i(t)eβˆTZ¯},where Λˆ0i(t) is the estimated cumulative hazard function, can be computed in SAS using the BASELINE command. This curve represents the survival experience of a patient with a prognostic index βTZ¯ equal to the average prognostic index of all patients. This method, while easy to implement in practice, has several drawbacks. First, the covariate value for the average patient may be quite meaningless for categorical variables. For example, if one of the covariates is gender coded as 0 for male and 1 for female, the meaning of patient with a sex covariate of 0.4 is hard to interpret. Second, as discussed in Thomsen et al. [6] this method does not account for the sample variability in the prognostic indicator from individual to individual.

The second method often called the “direct adjusted survival curve” by among others Chang et al. [7], Makuch [8] and Gail and Byar [9] averages the estimated survival curves for each patient. That isŜi(t)=1nl=1nexp{Λˆ0i(t)eβˆTZl}.

This method averages survival curves for each patient in the sample rather than the covariates and produces a more representative survival curve. Lee et al. [10] and Ghali et al. [11] provided programs in SAS, STATA, and S-plus for deriving such curves.

Other methods have been proposed to make adjustments when there is a treatment group present. Nieto and Coresh [12] provided a comparison of these methods. Cole and Hernán [13] discussed a technique that uses weights from a logistic regression of the covariates on treatment indicator to make adjustments to the survival estimator. They provided a SAS macro to implement this method. The macro seems to require only two treatment groups and does not provide estimates of the precision of the survival estimates.

Existing programs to compute the direct adjusted survival curves are of limited utility since they do not provide estimates of the uncertainty in the estimators such as the standard errors or confidence intervals. These programs have focused on the case where the treatment hazards ratio, after adjustment for covariates, are constant. They are not applicable when the treatment hazards are not proportional. In this case it is best to represent survival with a Cox model stratified on treatment and use this model for making inference about the direct adjusted survival for each treatment. These stratified models allow for a representation of treatments whose efficacy relative to each other changes over time. They are less model dependent than the more restrictive proportional hazards rate models. In Zhang and Klein [14] a confidence band for the difference of treatment curves based on a stratified Cox model using a Monte Carlo approach was presented.

We have implemented a SAS macro that computes the directed adjusted survival function for treatment groups based on either an unstratified Cox model or a stratified Cox model. The macro also produces standard errors of the estimates of survival and standard errors of the difference in survival between pairs of treatment groups. The standard errors of the difference can be used to make pointwise comparisons of treatment groups.

In Section 2 we review the estimation techniques for the direct adjusted survival estimators and their standard errors based on a stratified Cox model. The variance estimations for the direct adjusted survival probabilities and the differences of the direct adjusted survival probabilities are given in Appendix A. We describe the SAS macro and its output in Section 3. In Section 4 we utilize the SAS macro to analyze stem cell transplant data and Ewing’s sarcoma data. Discussions are given in Section 5.

Section snippets

Estimating the direct adjusted survival curve

Let the observations on subject j of treatment group i be {Tij,Dij,Zij} for i=1,,K and j=1,,ni, where Tij is the observed time, Dij=0 if the subject is censored, Dij=1 otherwise, and Zij are the covariates. Gail and Byar [9] derived the variance for the direct adjusted survival probabilities based on an unstratified Cox proportional hazards model. Here, we consider a variance estimator for the direct adjusted survival curve based on a stratified Cox model (1). For time t, the survival

The SAS macro

We have written a SAS macro to compute the direct adjusted survival curves. The macro reports Ŝi(t),σˆi,i(t) for i=1,,K and σˆi,j(t) for 1i<jK.

The macro requires a SAS data set with the following variables: (1) a variable with the failure time; (2) an indicator variable that indicates if an event has occurred (coded as 1 for an event and 0 for censoring); (3) a variable that indexes the treatments (coded as 1,,K); and (4) variables for all risk factors.

Suppose the macro is saved as a SAS

Example 1

In this example we show that our proposed SAS macro can analyze the survival data from more than two treatment groups and can handle relative large data sets efficiently. Besien et al. [18] conducted a study to compare three types of stem cell transplantation (unpurged autologous, purged autologous and allogeneic) for treating follicular lymphoma patients. The study cohort consisted of 904 patients: 597 patients received unpurged autologous transplants, 131 patients received purged autologous

Discussions

We have presented a SAS macro to compute the direct adjusted survival curves and the variances based on a stratified or unstratified Cox model. These curves, as discussed in Thomsen et al. [6], provide more realistic estimates of the “average” survival probability for a treatment by presenting the average survival curve if each patient in the sample had received a given treatment.

The average survival curve presented here is equal to EZ{Ŝ(t|Z)}, where the expectation is taken over the empirical

Acknowledgments

Dr. Klein and Dr. Zhang’s research was supported by National Cancer Institute grant 2 R01 CA54706-10. We thank the CIBMTR for providing us the data.

References (18)

  • I.M. Chang et al.

    Corrected group prognostic curves and summary statistics

    J. Chron. Dis.

    (1982)
  • R.W. Makuch

    Adjusted survival curve estimation using covariates

    J. Chron. Dis.

    (1982)
  • S.R. Cole et al.

    Adjusted survival curves with inverse probability weights

    Comput. Meth. Programs Biomed.

    (2004)
  • D.R. Cox

    Regression models and life-tables (with discussion)

    J. Roy. Stat. Soc.: Ser. B

    (1972)
  • J.P. Klein et al.

    Survival Analysis: Techniques for Censored and Truncated Data

    (2003)
  • E.L. Kaplan et al.

    Nonparametric estimation from incomplete observations

    J. Am. Stat. Assoc.

    (1958)
  • N.E. Breslow

    Analysis of survival data under the proportional hazards model

    Int. Stat. Rev.

    (1975)
  • J. Neuberger et al.

    Use of a prognostic index in evaluation of liver transplantation for primary biliary cirrhosis

    Transplantation

    (1986)
  • B.L. Thomsen et al.

    A note on the calculation of expected survival, illustrated by the survival of liver transplant patients

    Stat. Med.

    (1991)
There are more references available in the full text version of this article.

Cited by (250)

View all citing articles on Scopus
View full text