I have data that I analyzed using a stratified PWP-GT extension of the Cox PH model and I am having trouble figuring out what I need to report in the upcome manuscript and how to interpret the stratified components. Briefly, the data contains information about how long it took individual insects to reach certain developmental points ("event": 1, 2, 3, 4) after being treated with a pesticide. Including the control group, there were four levels of pesticide ("tx": C, L, M, and H). After some preliminary data exploration, I found that when the insects arrived ("ship": 1, 2, 3, 4) and started the experiment had a significant effect on development time.
From all of this, I made the following model in R:
> coxph(formula = Surv(rep(0, dim(sub1)[1]), tstop - tstart, status) ~ tx * strata(event) + ship * strata(event) + cluster(id) + strata(event), data = sub1, method = "breslow")
Where "id" equals the individual insects and "tstop – tstart" indicates the gap time between events.
This outputs: (EDITED TO INCLUDE "ship" AS A FACTOR)
Call: coxph(formula = Surv(rep(0, dim(sub1)[1]), tstop - tstart, status) ~ tx * strata(event) + ship * strata(event) + cluster(id) + strata(event), data = sub1, method = "breslow") n= 492, number of events= 478 coef exp(coef) se(coef) robust se z Pr(>|z|) txh -0.26370516438318 0.76819999899147 0.26233252301991 0.22689202991138 -1.16225 0.2451341 txl -0.11987649941476 0.88702997867426 0.25672484677695 0.18618575422004 -0.64385 0.5196699 txm -0.51362257507084 0.59832417391156 0.27195280894235 0.23832152919253 -2.15517 0.0311488 * ship2 1.65646739004232 5.24076452832491 1.03399484661244 0.21696246528283 7.63481 2.2649e-14 *** ship3 1.33578200803071 3.80296873598721 0.23058142160552 0.22861869093861 5.84284 5.1319e-09 *** txh:strata(event)event=2 0.09842786823565 1.10343480882414 0.37065411069586 0.31825353892903 0.30928 0.7571123 txl:strata(event)event=2 -0.07427724599606 0.92841425902055 0.36269132158140 0.24381582455421 -0.30464 0.7606366 txm:strata(event)event=2 0.49030567879075 1.63281526067281 0.38153511309810 0.31897747833434 1.53712 0.1242647 txh:strata(event)event=3 0.08024387819237 1.08355128998456 0.38622480003047 0.34278826007442 0.23409 0.8149138 txl:strata(event)event=3 -0.05197138144960 0.94935603564767 0.37101662106664 0.32942786091746 -0.15776 0.8746439 txm:strata(event)event=3 0.48005016860266 1.61615548042121 0.38211881930279 0.35384318204038 1.35667 0.1748845 txh:strata(event)event=4 0.25828662043757 1.29470985575366 0.37678415307640 0.29040930717579 0.88939 0.3737945 txl:strata(event)event=4 0.01633743201781 1.01647161761367 0.37635750078046 0.23765893060307 0.06874 0.9451940 txm:strata(event)event=4 0.79156021665709 2.20683688529518 0.39001352229631 0.25824225217817 3.06518 0.0021754 ** strata(event)event=2:ship2 -2.87880676955391 0.05620178452044 1.45849731918900 0.28690571886754 -10.03398 < 2.22e-16 *** strata(event)event=3:ship2 0.59127090608651 1.80628257277547 1.46984114123398 0.34311762620912 1.72323 0.0848468 . strata(event)event=4:ship2 -1.80811450547487 0.16396299686714 1.45261480585070 0.21783939462209 -8.30022 < 2.22e-16 *** strata(event)event=2:ship3 -2.15379915896782 0.11604245552070 0.31838628524726 0.28307156268510 -7.60867 2.7645e-14 *** strata(event)event=3:ship3 0.22972491271199 1.25825383268272 0.34233273407819 0.36139360103116 0.63566 0.5249954 strata(event)event=4:ship3 -1.15889323548630 0.31383332833294 0.32161353666032 0.26768619263227 -4.32930 1.4959e-05 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 exp(coef) exp(-coef) lower .95 upper .95 txh 0.76819999899147 1.3017443391211 0.49242881278360 1.19840923831125 txl 0.88702997867426 1.1273576136565 0.61582412220287 1.27767353485976 txm 0.59832417391156 1.6713347773707 0.37503940106341 0.95454455204408 ship2 5.24076452832491 0.1908118547581 3.42543603154333 8.01813625723261 ship3 3.80296873598721 0.2629524640939 2.42952947665736 5.95282804586274 txh:strata(event)event=2 1.10343480882414 0.9062610604659 0.59135568542433 2.05894423159404 txl:strata(event)event=2 0.92841425902055 1.0771053872600 0.57571215103501 1.49719444830733 txm:strata(event)event=2 1.63281526067281 0.6124391559079 0.87382190804916 3.05106298082884 txh:strata(event)event=3 1.08355128998456 0.9228912458904 0.55343621886470 2.12144301006479 txl:strata(event)event=3 0.94935603564767 1.0533455968580 0.49775956270853 1.81066713719455 txm:strata(event)event=3 1.61615548042121 0.6187523490867 0.80777661572505 3.23351590779965 txh:strata(event)event=4 1.29470985575366 0.7723738222552 0.73278336798836 2.28754320009662 txl:strata(event)event=4 1.01647161761367 0.9837953000081 0.63796904694880 1.61953711446611 txm:strata(event)event=4 2.20683688529518 0.4531372511776 1.33031359891976 3.66088796074397 strata(event)event=2:ship2 0.05620178452044 17.7930293234080 0.03202841999485 0.09861993141683 strata(event)event=3:ship2 1.80628257277547 0.5536232343002 0.92198411562698 3.53873421180760 strata(event)event=4:ship2 0.16396299686714 6.0989370718218 0.10698444024686 0.25128761041906 strata(event)event=2:ship3 0.11604245552070 8.6175356727230 0.06662940869666 0.20210072018766 strata(event)event=3:ship3 1.25825383268272 0.7947521986624 0.61965430467327 2.55497734062470 strata(event)event=4:ship3 0.31383332833294 3.1864047241633 0.18571379401475 0.53033948552421 Concordance= 0.653 (se = 0.043 ) Rsquare= 0.164 (max possible= 1 ) Likelihood ratio test= 88.11 on 20 df, p=1.581164088549e-10 Wald test = 404.6 on 20 df, p=0 Score (logrank) test = 110.98 on 20 df, p=1.298960938811e-14, Robust = 52.59 p=9.340668759683e-05 (Note: the likelihood ratio and score tests assume independence of observations within a cluster, the Wald and robust score tests do not).
I understand the individual components of the output (coef, exp(coef), se(coef), z, Pr(>|z|), etc.) and that the numbers are compared to a baseline group (I think… txC, event=1, ship=1). However, I'm still unsure about two things:
What do I report in a manuscript? I have a couple other of these outputs and it would quickly get overwhelming for a reader to make their way through all of them. Would it be adequate just to explain the effects of the covariate of interest (tx)?
What about comparing the other treatments to each other? For example, I'd like to know the difference between L and H. My output doesn't contain that information. Would I just rerun the model without C to set a new baseline group? I'm hesitant because I don't want to inflate my type I error.
Best Answer
You need to go beyond the standard output from coxph()
and analyze the data in a way that accomplishes two things. For your first question, what you presumably want is a more general idea whether any of your treatments affect developmental times significantly, including both their direct effects and their interactions with shipments and types of events. For your second question, you need flexibility to examine particular combinations (contrasts) of factor levels other than the comparisons of each individually against the reference level (as you correctly take the coxph()
output to represent.)
Although it's possible to do this yourself by hand, the rms
package in R provides these capabilities directly. There is a bit of learning curve in terms of pre-processing the data (datadist
is important) and using the cph()
function with appropriate parameter values to allow useful downstream processing, but it's well worth it. Then the anova()
function applied to the cph()
output provides information about whether the treatments differ among themselves (as main effects or including interaction terms) and the contrast()
function allows for great flexibility in terms of examining pre-specified combinations of factor levels.
The rms
package also provides ways to calibrate and validate your model. You need to know that your data adequately meet the proportional hazards assumption before you report results that depend on the assumption, and you want to make sure that your results aren't too dependent on the particular data sample you have.
If you are publishing in a journal that provides for supplemental material outside of the main text, showing the anova tables and specific contrast results in the main text and putting more detailed results in the supplement might work. In clinical survival analysis, usual practice is to present hazard ratios (exp(coef)
in the output) along with their confidence intervals. As you might be more interested in whether the treatments delay rather than accelerate the developmental steps, you might consider using exp(-coef)
instead as a measure of delay and recalculate the confidence intervals from -coef
and robust se
.
In this particular application, you will of course have to deal with the biological problem of the differences among shipments being the greatest of all influences. That makes it very hard to go from your specific data to generalizations about these types of insects and treatments. A more than 5-fold difference in rates of developmental progression between 2 shipments of insects (ship2
versus ship1
) is pretty remarkable. My guess is that, in the background of those shipment differences and their striking interactions with specific developmental stages, any effects of the treatments themselves will be lost.
Similar Posts:
- Solved – How to interpret coxph() output for the stratified Prentice, Williams, Peterson extension of the Cox PH model with gap-times
- Solved – Competing risk survival analysis with time-dependent covariates
- Solved – How does clogit() (in R) handle incomplete strata
- Solved – How to interpret hazard ratios of Cox output
- Solved – Using strata instead of stratifying groups separately