# Solved – p-value zero in hypothesis testing for survival curves

I have done a survival analysis using the Kaplan-Meier estimator.

Description of data:
My data set is large: the data table has close to 120,000 records of survival information belonging to 6 groups.

Sample:

```
   user_id time_in_days event total_likes total_para_length group
1:       2         4657     1       38867         431117212    AA
2:       2         3056     1       31392         948984460    BB
3:       2           49     1          15             67770    CC
4:       3         4181     1       15778         379211806    BB
5:       3           17     1           3             19032    CC
6:       3         2885     1       12001         106259666    EE
```

After fitting the survival curves and plotting them, I see that they are similar, yet at any given point in time their survival proportions do not look identical.
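For reference, curves like these are typically fitted with `survfit` from the `survival` package. The following is a minimal sketch, assuming the column names from the sample above and using only the six sample rows as stand-in data (the real table is far larger, and `sample_rows` and `km_fit` are illustrative names):

```r
library(survival)

# Stand-in data: the six sample rows shown above, not the full data set.
sample_rows <- data.frame(
  time_in_days = c(4657, 3056, 49, 4181, 17, 2885),
  event        = c(1, 1, 1, 1, 1, 1),
  group        = c("AA", "BB", "CC", "BB", "CC", "EE")
)

# Kaplan-Meier estimate, one curve per group.
km_fit <- survfit(Surv(time_in_days, event) ~ group, data = sample_rows)

# One step curve per group; the colors are an arbitrary choice.
plot(km_fit, col = seq_along(km_fit$strata),
     xlab = "Time in days", ylab = "Survival proportion")
legend("topright", legend = names(km_fit$strata),
       col = seq_along(km_fit$strata), lty = 1)
```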

[Plot: Kaplan-Meier survival curves for the six groups]

I then ran a hypothesis test with H0: there is no difference between the survival curves. Here are the results I got:

```
> survdiff(formula= Surv(time, event) ~ group, rh=0)
Call:
survdiff(formula = Surv(time, event) ~ group, rho = 0)

             N Observed Expected (O-E)^2/E (O-E)^2/V
group=FF 28310    27993    28632      14.3      19.0
group=AA 64732    63984    67853     220.6     460.1
group=BB 19017    18690    16839     203.4     245.6
group=CC  9687     9536     8699      80.6      91.0
group=DD 13438    13187    11891     141.3     164.2
group=EE  3910     3847     3324      82.4      89.7

 Chisq= 788  on 5 degrees of freedom, p= 0
```

I am a little confused about what this means, especially since I got `p-value=0`.
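One thing worth checking is whether the printed `p= 0` is only display rounding. A minimal sketch, assuming the statistic is referred to a chi-square distribution with the printed degrees of freedom, recomputes the tail probability from the reported `Chisq= 788` in base R (`chisq_stat` and `p_value` are illustrative names):

```r
# Values copied from the survdiff output above.
chisq_stat <- 788
df <- 5

# Upper-tail probability of the chi-square distribution.
p_value <- pchisq(chisq_stat, df = df, lower.tail = FALSE)

# Scientific notation avoids rounding a very small value to 0.
print(format(p_value, scientific = TRUE))

# TRUE means the p-value is tiny but not exactly zero.
print(p_value > 0)
```

A tiny nonzero value here would mean the test rejects H0 very strongly, which is the opposite of failing to reject it.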

I am fairly new to survival analysis. After some reading and digging, I understand that this test is non-parametric, which means it makes no assumptions about the underlying distribution of the survival times.

After reading about the Cox proportional hazards model and going over the CRAN PDF, I fit a Cox regression, and here is what I got from that:

```
> cox_model <- coxph(Surv(time, event) ~ X)
> summary(cox_model)
Call:
coxph(formula = Surv(time, event) ~ X)

  n= 139094, number of events= 137237

         coef  exp(coef)   se(coef)       z Pr(>|z|)
X1 -7.655e-05  9.999e-01  1.504e-06 -50.897   <2e-16 ***
X2 -1.649e-10  1.000e+00  5.715e-11  -2.886   0.0039 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

   exp(coef) exp(-coef) lower .95 upper .95
X1    0.9999          1    0.9999    0.9999
X2    1.0000          1    1.0000    1.0000

Concordance= 0.847  (se = 0.001 )
Rsquare= 0.111   (max possible= 1 )
Likelihood ratio test= 16307  on 2 df,   p=0
Wald test            = 7379  on 2 df,   p=0
Score (logrank) test = 4628  on 2 df,   p=0
```
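The three `p=0` lines at the bottom of this output can be examined on the log scale. With statistics this large, the tail probability may be smaller than the smallest number a double can represent, so this sketch asks `pchisq` for the logarithm of the p-value via `log.p = TRUE`. The statistics are copied from the output above, the chi-square reference distribution is assumed from the printed `on 2 df`, and `test_stats`, `log_p` and `log10_p` are illustrative names:

```r
# Test statistics copied from the summary(cox_model) output above.
test_stats <- c(likelihood_ratio = 16307, wald = 7379, score = 4628)

# Natural log of the upper-tail chi-square probability on 2 df.
log_p <- pchisq(test_stats, df = 2, lower.tail = FALSE, log.p = TRUE)

# Convert to base 10 so each value reads as an order of magnitude.
log10_p <- log_p / log(10)
print(round(log10_p))
```

Large negative values of `log10_p` would indicate p-values far below anything R can print directly, not p-values that are literally zero.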

My `X` is generated by calling `rbind` on `total_likes` and `total_para_length`. Looking at the Rsquare and the p-values, I am not sure what is really going on here. If I cannot reject the null hypothesis, I should get a larger p-value.
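One detail that may be worth verifying is the shape of `X`: a model formula expects a covariate matrix with one row per record. In base R, `rbind` stacks vectors as rows, while `cbind` places them side by side as columns. A small sketch with three of the sample values (`X_rows` and `X_cols` are illustrative names, not objects from the original session):

```r
# Three values per covariate, taken from the sample rows above.
total_likes <- c(38867, 31392, 15)
total_para_length <- c(431117212, 948984460, 67770)

# rbind: each vector becomes a row.
X_rows <- rbind(total_likes, total_para_length)

# cbind: each vector becomes a column, giving one row per record.
X_cols <- cbind(total_likes, total_para_length)

print(dim(X_rows))  # 2 rows, 3 columns
print(dim(X_cols))  # 3 rows, 2 columns
```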
