I am trying to caculate survival function in a time dependent covariates Cox model from its baseline hazard function. However, my program gives a different result compared with survfit
. Based on the formula
$S(t)=exp(-int_{0}^{t}lambda_{0}(mu)exp[hat{beta}Z(mu)]dmu$
result should match. I just need this program for illustration but speed. The data I used can be download here and program have been attached. Figure 1 shown the graph generated by the program below.
library(survival) fit.cox <- coxph( Surv(t1,t2,event) ~ x, data = sim$data) lambda0 <- basehaz(fit.cox, centered = F) # Estimated Baseline Hazard Function # Caculate Survival Function t.cut <- sim$t.cut lambda0 <- rbind(lambda0, c(0,0) ) x <- sim$x[1, ] beta <- fit.cox$coefficients pred <- exp( x * beta ) #Baseline Hazard Function lambda0.fun <- function(t) { approx(x = lambda0$time, y = lambda0$hazard, xout = t)$y } #Hazard Function lambda.fun <- function(t){ which.t <- sum(t >= t.cut) lambda0.fun(t) * pred[which.t] } #Survival Function survival.fun <- function(t){ cum.hazard <- integrate(Vectorize(lambda.fun), 0, t, subdivisions = 1e3L) exp(- cum.hazard$value) } #Test Program test <-sim$data[1,] s.est <- Vectorize(survival.fun)(seq(0,2,0.1)) s.est2 <- survfit(fit.cox, newdata = test, id = id, se.fit = F, type = "efron") plot(s.est2, xlim = c(0,2)) points(seq(0,2,0.1), s.est)
Best Answer
After looking for the source code of basehaz.S
, I've got the reason why I am wrong here. First basehaz
simply compute cumulate hazard function $Lambda_0(t)$ by using survfit
instead of instantaneous hazard function $lambda_0(t)$
Second the main code for basehaz.S
is
sfit<-survfit(fit) H<- -log(sfit$surv)
Then it's wrong to use this function to estimate a time dependent covariates Cox model (which require id
option in survfit
).
I think we should avoid to use the basehaz
function becuase it exists only because Prof. Therneau try to comfort SAS programmers as he described in the document of basehaz
function.