In a Cox regression model where the variable of interest is continuous (e.g., a lab measurement), suppose we want something other than a per-unit hazard ratio for that variable: for example, the hazard ratio of mortality for people in Quartile 4 (> 75th percentile) versus Quartile 1 (< 25th percentile) of the lab measurement, based on the distribution of the measurements in our cohort. What would be the best way to include this variable in the model?
Would it be okay to create another column in our data which would indicate the quartile that person falls into and subsequently add that to the model?
Or is it better to keep the variable continuous in the model, and then use the resulting beta coefficient to calculate the hazard ratio by taking the difference between the median values of the two quartiles, multiplying that difference by beta, and exponentiating?
Thank you for your responses.
If your variable of interest is continuous, you almost always lose information by splitting it into categories. You lose information in three ways: the cutpoints are chosen from the data rather than from hypotheses, no information is borrowed across adjacent groups, and more degrees of freedom are spent. Adjusting for the continuous effect is better in all these respects. This is a general modeling principle that has been recommended for linear, logistic, Cox, and other models.
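To make the information loss concrete, here is a small sketch using made-up lab values (the data and the helper `quartile_of` are hypothetical, purely for illustration). Quartile coding treats everyone inside a bin as identical, while separating near-identical values that happen to straddle a cutpoint.

```python
import bisect
import statistics

# Made-up lab measurements for a tiny illustrative cohort (not real data).
lab_values = [3.1, 4.0, 4.4, 5.2, 5.9, 6.3, 7.0, 7.8, 8.5, 9.6, 10.2, 11.0]

# Three data-driven cutpoints split the cohort into four quartile groups.
cuts = statistics.quantiles(lab_values, n=4)

def quartile_of(value, cuts):
    """Return the quartile group (1 to 4) that `value` falls into."""
    return bisect.bisect_left(cuts, value) + 1

# 3.1 and 4.5 differ by 1.4 units but are coded as the same group,
# while 4.5 and 4.7 differ by only 0.2 units and are coded differently.
for v in (3.1, 4.5, 4.7):
    print(v, "-> Quartile", quartile_of(v, cuts))

# A quartile factor needs 3 indicator terms in the model;
# the continuous variable needs 1 coefficient.
n_terms_categorical = len(cuts)
n_terms_continuous = 1
```

The categorical coding spends three model terms to describe an effect that the continuous coding describes with one, which is the degrees-of-freedom cost mentioned above.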
The interpretation of a continuous variable in a Cox model is a hazard ratio comparing groups that differ by one unit in the variable. The interpretation you give in the second paragraph is the wrong interpretation unless that variable is scaled by its range * 1.5. I would not recommend doing this. Dividing by the range makes the effect a unitless quantity. Laboratory experts who interpret your analyses will be keenly interested to know the scale of the effect, because they can readily interpret the units, whether they are mmol/dL, mmHg, mg/kg, or some other contextualizing unit.
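As a minimal sketch of how the per-unit coefficient relates to other contrasts, assuming the effect is log-linear in the variable: the hazard ratio for two values that differ by d units is exp(beta * d), with d expressed in the original units. The coefficient and lab values below are made up, and `hazard_ratio_for_difference` is a hypothetical helper, not a function from any survival library.

```python
import math
import statistics

def hazard_ratio_for_difference(beta, delta):
    """Hazard ratio comparing two covariate values that differ by `delta`
    units, given the per-unit log-hazard coefficient `beta`."""
    return math.exp(beta * delta)

# Hypothetical inputs, for illustration only.
beta = 0.05  # assumed log hazard ratio per 1 unit of the lab measurement
lab_values = [3.1, 4.0, 4.4, 5.2, 5.9, 6.3, 7.0, 7.8, 8.5, 9.6, 10.2, 11.0]

# Quartile cutpoints from the cohort, then the median within the
# lowest and highest quartile groups.
q1, _, q3 = statistics.quantiles(lab_values, n=4)
median_low = statistics.median([v for v in lab_values if v <= q1])
median_high = statistics.median([v for v in lab_values if v > q3])

hr_per_unit = hazard_ratio_for_difference(beta, 1.0)
hr_between_medians = hazard_ratio_for_difference(beta, median_high - median_low)

print(f"HR per 1 unit: {hr_per_unit:.3f}")
print(f"HR for a {median_high - median_low:.1f}-unit difference: {hr_between_medians:.3f}")
```

Reporting the contrast together with the size of the difference in original units keeps the scale of the effect visible to readers who know the measurement.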