Solved – p-value zero in hypothesis testing for survival curves

I have done a survival analysis using the Kaplan-Meier method.

Description of data:
My data set is large: the data table has close to 120,000 records of survival information belonging to 6 groups.

Sample:

   user_id time_in_days event total_likes total_para_length group
1:       2         4657     1       38867         431117212    AA
2:       2         3056     1       31392         948984460    BB
3:       2           49     1          15             67770    CC
4:       3         4181     1       15778         379211806    BB
5:       3           17     1           3             19032    CC
6:       3         2885     1       12001         106259666    EE

After fitting the survival curves and plotting them, I see they are similar, yet at any given point in time their survival proportions do not look identical.
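For reference, this is roughly the code I used for the fit and the plot (a sketch from memory, assuming the data is in a data frame called dt with the columns shown in the sample above):

library(survival)

# One Kaplan-Meier curve per group
km_fit <- survfit(Surv(time_in_days, event) ~ group, data = dt)

# All six curves on the same axes
plot(km_fit, col = 1:6, xlab = "Time (days)", ylab = "Survival proportion")
legend("bottomleft", legend = names(km_fit$strata), col = 1:6, lty = 1)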

Here is the plot:
[Plot: Kaplan-Meier survival curves for the six groups]

I ran a hypothesis test where my H0 is that there is no difference between the survival curves, and here are the results I got.

> survdiff(formula = Surv(time, event) ~ group, rho = 0)
Call:
survdiff(formula = Surv(time, event) ~ group, rho = 0)

             N Observed Expected (O-E)^2/E (O-E)^2/V
group=FF 28310    27993    28632      14.3      19.0
group=AA 64732    63984    67853     220.6     460.1
group=BB 19017    18690    16839     203.4     245.6
group=CC  9687     9536     8699      80.6      91.0
group=DD 13438    13187    11891     141.3     164.2
group=EE  3910     3847     3324      82.4      89.7

 Chisq= 788  on 5 degrees of freedom, p= 0

I am a little confused trying to figure out what this means, especially since I got a p-value of 0.

I am fairly new to survival analysis, but after reading and digging through I understand that this is a non-parametric method, which means it does not make any assumptions about the underlying distribution of the survival times.

After reading about the Cox proportional hazards model and going over the CRAN documentation for the survival package, I performed a Cox regression and here is what I got:

> cox_model <- coxph(Surv(time, event) ~ X)
> summary(cox_model)
Call:
coxph(formula = Surv(time, event) ~ X)

  n= 139094, number of events= 137237

         coef  exp(coef)   se(coef)       z Pr(>|z|)
X1 -7.655e-05  9.999e-01  1.504e-06 -50.897   <2e-16 ***
X2 -1.649e-10  1.000e+00  5.715e-11  -2.886   0.0039 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

   exp(coef) exp(-coef) lower .95 upper .95
X1    0.9999          1    0.9999    0.9999
X2    1.0000          1    1.0000    1.0000

Concordance= 0.847  (se = 0.001 )
Rsquare= 0.111   (max possible= 1 )
Likelihood ratio test= 16307  on 2 df,   p=0
Wald test            = 7379  on 2 df,   p=0
Score (logrank) test = 4628  on 2 df,   p=0

My big X is generated by doing rbind on total_likes and total_para_length. Looking at the Rsquare and p-values, I am not sure what is really going on here. If I can't reject the null hypothesis, I would expect a larger p-value.
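For reference, a sketch of how that setup looks (from memory, so take the exact calls with a grain of salt; the point is that X ends up as a two-column matrix with one row per record, which is why summary() labels the coefficients X1 and X2):

# n x 2 covariate matrix: one row per observation, one column per predictor
X <- cbind(dt$total_likes, dt$total_para_length)
cox_model <- coxph(Surv(time_in_days, event) ~ X, data = dt)
summary(cox_model)

# equivalent fit, but with readable coefficient names:
# coxph(Surv(time_in_days, event) ~ total_likes + total_para_length, data = dt)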

Your $p$-value is not actually zero; it's just very close to it. If you look at your test statistics ($\chi^{2}=788$ for the log-rank test of the Kaplan-Meier curves, and the Wald $\chi^{2}=7379$ in the Cox PH model), they are ginormous! Just really, really big. So the associated $p$-values are tiny.
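In fact you can ask R for the tail probability behind those statistics directly (a quick sketch using the chi-square values and degrees of freedom from your output):

# p-values implied by the reported chi-square statistics
pchisq(788, df = 5, lower.tail = FALSE)                  # log-rank test: roughly 5e-168, tiny but not zero
pchisq(7379, df = 2, lower.tail = FALSE)                 # Wald test: underflows to 0 in double precision
pchisq(7379, df = 2, lower.tail = FALSE, log.p = TRUE)   # but log(p) is about -3690, i.e. p is around 1e-1602

The printed "p= 0" just means the number is smaller than R bothers to display (or can even represent), not that it is exactly zero.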

But perhaps you wonder why, if your survival curves look so similar visually, you get such a significant difference with these tests? If so, consider: even tiny differences can yield minuscule $p$-values if the sample size is large enough. And how big is your sample? It's about 120,000 observations: big! So you should expect that even very small differences will come out "significant."
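If you want to see this in action, here is a small simulation (purely illustrative, nothing to do with your data): two groups whose true hazards differ by only 5 percent, which gives curves that look essentially identical when plotted, still get rejected decisively once n is around 120,000:

library(survival)
set.seed(42)

n     <- 120000                                          # roughly your sample size
group <- rep(c("A", "B"), each = n / 2)
time  <- rexp(n, rate = ifelse(group == "A", 1, 1.05))   # true hazard ratio of just 1.05
event <- rep(1, n)                                       # no censoring, to keep it simple

survdiff(Surv(time, event) ~ group)                      # chi-square around 70 on 1 df, p far below 0.05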

What can you do with this inference, given that it is probably just telling you that you have a giant sample? I'm not sure, because I don't know if there is an equivalence test available for your two quantities. If there is an equivalence test for your estimates, you might (1) decide a priori what a relevant difference is (i.e. how large a difference between two groups needs to be for you to care), (2) conduct a test of difference between your groups, (3) conduct a test of equivalence using your definition of relevant difference, and (4) combine the inferences from these two tests, which will tell you whether the significant difference you are finding is relevant (you rejected the test for difference but did not reject the test for equivalence) or trivial (you rejected the test for difference and also rejected the test for equivalence).
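For the Cox/hazard-ratio framing, one way to operationalize steps (1)-(3) is to compare a confidence interval for each group's log hazard ratio against a pre-specified margin. This is only a sketch: the ±0.1 margin is a placeholder you would have to justify for your problem, and with group entered as a factor each coefficient is a comparison against the reference group.

library(survival)

margin <- 0.1                                  # (1) placeholder: log-HRs within ±0.1 count as "no relevant difference"

fit <- coxph(Surv(time_in_days, event) ~ group, data = dt)   # (2) difference test: summary(fit) gives the usual p-values

ci <- confint(fit, level = 0.90)               # (3) 90% CI, in the spirit of two one-sided tests (TOST)
within_margin <- ci[, 1] > -margin & ci[, 2] < margin
cbind(round(ci, 3), within_margin)             # 1 = equivalent to the reference group at the chosen margin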
