A semi-parametric generalization of the Cox proportional hazards regression model: Inference and applications
Introduction
The modeling and analysis of data in which the principal endpoint is the time until an event occurs is often of prime interest in medical and engineering studies. Typically, such an event is the onset of a disease or death itself as seen in clinical trials or failure of an item or a system as seen in industrial life testing. The time to an event is normally referred to as survival or failure time.
The primary goal in analyzing censored survival data is to assess the dependence of survival time on covariates. The secondary goal is the estimation of the underlying distribution of survival time. The Cox Proportional Hazards (PH) model (Cox, 1972) is a standard tool for exploring the association of covariates with survival time. An interesting feature of this model is that it is semi-parametric in the sense that it can be factored into a parametric part consisting of a regression parameter vector associated with the covariates and a non-parametric part that can be left completely unspecified.
In the Cox PH model, given a vector of possibly time-dependent covariates $z$, the hazard function at time $t$ is assumed to be of the form

$$\lambda(t \mid z) = \lambda_0(t)\, r(z), \qquad (1.1)$$

where $\lambda_0(t)$ is the baseline hazard function, denoting the hazard under no covariate effect, and $r(z)$ is a non-negative function of the covariate vector $z$, referred to as the risk function, such that $r(0) = 1$. The most commonly used form of the Cox PH model is

$$\lambda(t \mid z) = \lambda_0(t) \exp(\beta' z), \qquad (1.2)$$

where $\beta$ is a vector of regression coefficients. The focus is on inference for $\beta$, with the baseline hazard function, $\lambda_0(t)$, the non-parametric part, left completely unspecified.
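As a minimal sketch of the exponential form of the Cox PH model, the following Python snippet evaluates the hazard for two covariate vectors and checks that their ratio does not change with time. The Weibull-type baseline hazard, the coefficient values, and the function names are illustrative assumptions and are not taken from the paper.

```python
import math

def baseline_hazard(t, shape=1.5, scale=2.0):
    """Hypothetical Weibull baseline hazard (an assumption for illustration)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def cox_hazard(t, z, beta):
    """Hazard at time t: baseline hazard times exp(beta' z)."""
    linear_predictor = sum(b * x for b, x in zip(beta, z))
    return baseline_hazard(t) * math.exp(linear_predictor)

beta = [0.7, -0.3]      # illustrative regression coefficients
z_treated = [1.0, 2.0]  # illustrative covariate vectors
z_control = [0.0, 2.0]

# The hazard ratio equals exp(beta'(z_treated - z_control)) at every time point,
# because the baseline hazard cancels.
ratios = [
    cox_hazard(t, z_treated, beta) / cox_hazard(t, z_control, beta)
    for t in (0.5, 1.0, 2.0, 5.0)
]
```

Whatever baseline hazard is substituted, the ratio stays the same, which is why the baseline can be left unspecified when the interest is in the regression coefficients.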
In spite of its semi-parametric feature, the Cox PH model implicitly assumes that the hazard and survival curves corresponding to two different values of the covariates do not cross. Although this assumption may be valid in many experimental settings, it has been found to be suspect in others. For example, if the treatment effect decreases with time, then one might expect the hazard curves corresponding to the treatment and control groups to converge. Other examples that indicate the presence of non-proportional hazards are given in Gore et al. (1984) and Tonak et al. (1979), among others.
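The non-crossing property can be seen from a short derivation. This is a sketch under the exponential form of the model with time-fixed covariates; the symbols $\lambda_0$, $S_0$, $\beta$, $z_1$ and $z_2$ are notation assumed here for illustration.

```latex
\frac{\lambda(t \mid z_1)}{\lambda(t \mid z_2)}
  = \frac{\lambda_0(t)\exp(\beta' z_1)}{\lambda_0(t)\exp(\beta' z_2)}
  = \exp\{\beta'(z_1 - z_2)\},
\qquad
S(t \mid z)
  = \exp\Big\{-\int_0^t \lambda_0(u)\exp(\beta' z)\,du\Big\}
  = S_0(t)^{\exp(\beta' z)} .
```

The hazard ratio is free of $t$, so one hazard curve lies above the other at all times. Likewise, if $\beta' z_1 > \beta' z_2$ then $S(t \mid z_1) < S(t \mid z_2)$ wherever $0 < S_0(t) < 1$, so the survival curves cannot cross either.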
In this paper, we describe a semi-parametric generalization of the Cox PH model which allows crossing of hazards as well as survival functions. In Section 2, we discuss its unique properties and place it within the context of censored survival data analysis. In Section 3, we describe an estimation procedure for this model using cubic B-spline approximations for the baseline hazard. We illustrate our method with real-life examples in Section 4 and provide some concluding remarks.
A semi-parametric generalization of the Cox PH model
We describe a semi-parametric generalization of the Cox PH model in which the hazard functions corresponding to different values of the covariates can cross. The special case of this model was originally introduced by Quantin et al. (1996) for the purpose of goodness of fit testing of the Cox PH model. Devarajan (2000) outlined the unique properties of this non-proportional hazards regression model as well as inference for this model using maximum penalized likelihood estimation, and provided a
Estimation for the non-proportional hazards model
The observed data consist of $n$ independent observations on the triple $(X, \delta, z)$, where $X = \min(T, C)$ is the minimum of a failure and censoring time pair, $\delta = I(T \le C)$ is the indicator of the event that a failure has been observed, and $z$ is a vector of covariates. The random variables $T$ and $C$ denote the survival and censoring times respectively, which are assumed to be independent.
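The structure of such right-censored data can be sketched with a small simulation. The exponential distributions, the rates, the single binary covariate, and the function name below are all assumptions made for illustration; in particular, the covariate is not given any effect on the failure time here.

```python
import random

def simulate_censored_sample(n, failure_rate=1.0, censoring_rate=0.5, seed=0):
    """Draw n triples (observed time, failure indicator, covariates).

    Failure and censoring times are drawn independently from exponential
    distributions (a hypothetical choice, not the paper's data).
    """
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        z = [rng.choice([0, 1])]             # one binary covariate (hypothetical)
        t = rng.expovariate(failure_rate)    # latent survival time
        c = rng.expovariate(censoring_rate)  # latent censoring time
        x = min(t, c)                        # only the smaller of the two is observed
        delta = 1 if t <= c else 0           # 1 = failure observed, 0 = censored
        sample.append((x, delta, z))
    return sample

data = simulate_censored_sample(200)
observed_failures = sum(delta for _, delta, _ in data)
```

Only the observed time and the indicator are available to the analyst; the latent survival and censoring times are never both seen for the same subject.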
The fundamental assumption of proportionality of hazards in the Cox PH model (1.2) requires that the hazards ratio
Illustration of our methods
We illustrate estimation in the non-proportional hazards model (2.2) using real-life examples. All figures presented were created using the R statistical language and environment (R Development Core Team, 2009; www.R-project.org).
Acknowledgements
The authors would like to thank the Associate Editor and referee for providing valuable comments that helped improve the presentation of this paper. The work of the first author was supported in part by NIH grant P30 CA 06927 and an appropriation from the Commonwealth of Pennsylvania.
References (24)
- et al. Sequential estimation for semiparametric models with application to the proportional hazards model. Journal of Statistical Planning and Inference (2006).
- et al. Heterogeneity and varying effect in hazards regression. Journal of Statistical Planning and Inference (2009).
- et al. Predicting survival probabilities with semiparametric transformation models. Journal of the American Statistical Association (1997).
- Regression models and life tables (with discussion). Journal of the Royal Statistical Society. Series B (1972).
- et al. Analysis of trials with treatment-individual interactions.
- A Practical Guide to Splines (2001).
- Devarajan, K., 2000. Inference for a non-proportional hazards regression model and applications. Ph.D. Dissertation....
- et al. Goodness-of-fit testing for the Cox proportional hazards model.
- et al. Testing for covariate effect in the Cox proportional hazards regression model. Communications in Statistics - Theory and Methods (2009).
- et al. Flexible regression models with cubic splines. Statistics in Medicine (1989).
- Regression models and non-proportional hazards in the analysis of breast cancer survival. Applied Statistics.
- Spline-based tests in survival analysis. Biometrics.