# Solved – Comparing Cox Proportional Hazards Models (variable selection)

I am using a Cox proportional hazards model to run a survival analysis in R on a number of distinct, non-nested covariates such as Age, Blood Type, Cancer, etc.:

`A, B, C, D, E`

When I fit the model containing these covariates together and test the omnibus null hypothesis:

`surv ~ A + B + C + D`

None of the covariate effects is statistically significant, because relatively few subjects have measurements for every covariate. However, when I fit separate Cox models on single covariates or on other combinations of covariates:

`surv ~ A`, `surv ~ A + C`, `surv ~ B + D`

I see significant effects because the usable sample is larger (i.e. the model discards fewer observations with missing values).
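To make the sample-size effect concrete, here is a minimal sketch in plain Python (not R, and not a survival fit). The `records` list and the `complete_cases` helper are hypothetical, invented only to illustrate how dropping rows with any missing covariate leaves a different number of usable rows for each formula.

```python
# Hypothetical toy data: None marks a missing measurement.
records = [
    {"A": 61, "B": "O", "C": 1, "D": 2.3},
    {"A": 47, "B": None, "C": 0, "D": 1.9},
    {"A": 55, "B": "A", "C": None, "D": None},
    {"A": 70, "B": "AB", "C": 1, "D": None},
    {"A": 39, "B": None, "C": None, "D": 3.1},
    {"A": None, "B": "B", "C": 0, "D": 2.8},
]


def complete_cases(rows, covariates):
    """Count rows that have a non-missing value for every listed covariate."""
    return sum(all(row[c] is not None for c in covariates) for row in rows)


# Each formula keeps only the rows that are complete for its own covariates.
counts = {
    formula: complete_cases(records, formula.split(" + "))
    for formula in ["A", "A + C", "B + D", "A + B + C + D"]
}

for formula, n in counts.items():
    print(f"surv ~ {formula}: {n} usable rows")
```

Because each formula ends up fitted to a different set of rows, likelihood-based quantities from these models are not directly comparable. One option is to refit every candidate on the same rows (for example, the subjects complete for all covariates under consideration) before comparing them.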

What I'm having difficulty understanding is how to do the following:

• Comparing the different Cox models for the best fit, i.e. is `surv ~ A + B + D` a better model than `surv ~ A + C`? Should I be comparing the likelihood ratio, Wald, or log-rank test statistics?
• Is it possible to run every possible combination of covariates to determine the best model? I have about 15 covariates.
• More broadly, is this the best approach to optimizing for both significant covariates and overall model "cost"? I will be attaching a cost to each distinct Cox model, e.g. using covariates `A + B + C` in the model costs \$100, using covariates `A + B` costs \$75, and using only covariate `A` costs \$10. I'd like to look at the cost of each combination of covariates vs. the accuracy of each Cox model.
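The last two points can be sketched in plain Python (again, not R). Everything here is an assumption made for illustration: the 15 placeholder covariate names, the additive `unit_cost` table (chosen only so that it reproduces the \$10, \$75 and \$100 figures above), and the `placeholder_score` values, which stand in for whatever accuracy measure the fitted models would supply.

```python
from itertools import combinations

# 15 placeholder covariate names, "A" through "O".
covariates = [chr(ord("A") + i) for i in range(15)]


def all_subsets(names):
    """Yield every non-empty combination of covariate names."""
    for k in range(1, len(names) + 1):
        yield from combinations(names, k)


# Number of candidate models: 2**15 - 1.
n_models = sum(1 for _ in all_subsets(covariates))

# ASSUMPTION: costs are additive per covariate, with values picked so that
# A = $10, A + B = $75 and A + B + C = $100 as in the question.
unit_cost = {"A": 10, "B": 65, "C": 25}


def model_cost(subset):
    """Total cost of a model under the additive-cost assumption."""
    return sum(unit_cost[c] for c in subset)


def pareto_front(candidates):
    """Keep models that no other model beats on both cost and score."""
    front = []
    for name, cost, score in candidates:
        dominated = any(
            c2 <= cost and s2 >= score and (c2 < cost or s2 > score)
            for _, c2, s2 in candidates
        )
        if not dominated:
            front.append((name, cost, score))
    return sorted(front, key=lambda t: t[1])


# PLACEHOLDER scores, not real results: replace with a measure computed
# from fitted models (ideally on held-out data).
placeholder_score = {
    ("A",): 0.60,
    ("A", "B"): 0.58,
    ("A", "C"): 0.66,
    ("A", "B", "C"): 0.70,
}

candidates = [
    (" + ".join(subset), model_cost(subset), score)
    for subset, score in placeholder_score.items()
]
front = pareto_front(candidates)

print(f"{n_models} candidate models for 15 covariates")
for name, cost, score in front:
    print(f"surv ~ {name}: cost ${cost}, score {score}")
```

Enumerating the 32,767 subsets is cheap; fitting that many models is the expensive part, and picking a winner from so many candidates by significance alone risks overfitting. Keeping only the non-dominated models narrows the cost vs. accuracy comparison to those where extra spending actually buys a better score.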

Thanks very much for your help!
