I am fitting a Cox proportional hazards model in R with coxph() from the survival package. Because I have time-varying covariates, my data are in counting-process format: there is a separate record for each (t1, t2] time interval, so any subject i can have multiple records, one per interval.
Does this mean that I must fit the model with a correlation structure, i.e. do I have to include a cluster(Id) term in the model formula? Some examples I found do not use cluster(Id), but I am in doubt: how else would the model know that multiple observations belong to the same subject? I would guess that if multiple observations belong to one subject, we should tell the model that the errors are not independent.
Best Answer
The short answer is no, you do not have to. The explanation is in the documentation of the survival package; see vignette("timedep", package = "survival").
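As a minimal sketch of what this looks like in practice, the code below fits the same counting-process model with and without a cluster(id) term. It assumes the `heart` dataset bundled with the survival package (one row per (start, stop] interval, several rows per `id`); the choice of covariates is purely illustrative and is not the asker's data.

```r
library(survival)

# `heart` is shipped with survival: each subject (id) may have several
# rows, one per (start, stop] interval, with a time-varying `transplant`.
fit_plain <- coxph(Surv(start, stop, event) ~ age + surgery + transplant,
                   data = heart)

# Same model with a cluster(id) term, which requests a robust
# (sandwich) variance grouped by subject.
fit_robust <- coxph(Surv(start, stop, event) ~ age + surgery + transplant +
                      cluster(id),
                    data = heart)

# The point estimates are the same in both fits; cluster() only
# changes how the standard errors are computed.
coefs_match <- isTRUE(all.equal(coef(fit_plain), coef(fit_robust)))
print(coefs_match)
```

The cluster(id) term does not change the estimated coefficients; it only swaps the model-based standard errors for robust ones.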
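The main reasoning is that an individual is at risk only in disjoint time intervals. The Cox partial likelihood is a product over event times (equivalently, the log partial likelihood is a sum over them), and at each event time an individual contributes at most one row of the data, namely the row whose (t1, t2] interval contains that time. The multiple rows for one individual therefore never enter the same risk set, and they are not treated as correlated repeated observations.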
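To make this concrete, here is the partial likelihood written for counting-process data, ignoring tied event times. The notation is an assumption introduced here for illustration: row k covers the interval (t1_k, t2_k] with covariate value x_k, and row i_j is the one whose event occurs at time t_j.

```latex
L(\beta) \;=\; \prod_{j \,:\, \text{event at } t_j}
  \frac{\exp\!\big(\beta^\top x_{i_j}\big)}
       {\sum_{k \in R(t_j)} \exp\!\big(\beta^\top x_k\big)},
\qquad
R(t_j) \;=\; \{\, k : t1_k < t_j \le t2_k \,\}.
```

Because the intervals belonging to one individual are disjoint, the risk set R(t_j) contains at most one row per individual, so splitting a subject into several rows does not count that subject more than once at any event time.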