Solved – fuzzy distinction between discrete and categorical variables

Are there some variables where it can be difficult to assess whether they are purely categorical?

For example:

To assess the level of pain that a patient is in, you use a scale between 1 and 10. I wouldn't consider the pain score to be a categorical variable here, just discrete, because it is a measurement (even if it is subjective).

Conversely, if I have something such as tumour stage, it is still a discrete variable between, say, 1 and 4; however, I am more tempted to say these are categories. The thing is, an increasing tumour stage does actually have meaning: the disease is more likely to be aggressive. For categories such as hair colour or gender, there is no numerical relationship between the categories!

The only difference I can really see is that in discrete variables each step increase has the same meaning (thus an increase of 1-2, 5-6 or 9-10 is the same), whereas with tumour stage the step sizes don't mean the same thing: 1-2 is different from 2-3 and 3-4, because tumour aggression is probably not linear.

This has implications when running logistic regression models, where you could treat categorical variables differently (namely, you code them differently).
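To make the coding difference concrete, here is a minimal sketch in plain Python. The stage values and the helper names `as_numeric` and `as_dummies` are made up for illustration; they are not from any particular library. Treating stage as a number gives one column (and so one slope, which assumes equal step sizes), while treating it as a category gives one 0/1 indicator per non-reference level, so each stage gets its own effect.

```python
# Hypothetical tumour stages for six patients (illustrative values only).
stages = [1, 2, 2, 3, 4, 1]


def as_numeric(values):
    """Code stage as a number: a single column, so a model fits one slope
    and implicitly assumes every one-step increase has the same effect."""
    return [[float(v)] for v in values]


def as_dummies(values, reference=1):
    """Code stage as a category: one 0/1 indicator column per level other
    than the reference level, so each level gets its own coefficient."""
    levels = sorted(set(values) - {reference})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows


numeric_design = as_numeric(stages)
levels, dummy_design = as_dummies(stages)

print(numeric_design[3])        # [3.0]
print(levels, dummy_design[3])  # [2, 3, 4] [0, 1, 0]
```

With the dummy coding, a patient at the reference stage has all indicators equal to 0, and the fitted coefficients for stages 2, 3 and 4 are free to be unequally spaced.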

You can have ordered or unordered factors. The equivalents of the logistic model for these variable types are the ordered logit and the multinomial logit, respectively. For the ordered logit, you can have known or unknown cut points.
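As a rough illustration of what cut points do in an ordered logit, the sketch below assumes the common proportional-odds parameterisation, P(Y <= k) = logistic(c_k - eta), where eta is the linear predictor and c_1 < c_2 < ... are the cut points. The function name and the numeric values are hypothetical, and the cut points are treated as known rather than estimated.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def ordered_logit_probs(eta, cutpoints):
    """Return P(Y = k) for k = 1..K, given linear predictor eta and K-1
    increasing cut points, assuming P(Y <= k) = logistic(c_k - eta)."""
    # Cumulative probabilities; the last category always has P(Y <= K) = 1.
    cumulative = [logistic(c - eta) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for cum in cumulative:
        probs.append(cum - previous)  # difference of adjacent cumulatives
        previous = cum
    return probs


# Three illustrative cut points define four ordered categories.
cutpoints = [-1.0, 0.0, 1.5]
probs = ordered_logit_probs(eta=0.5, cutpoints=cutpoints)
print(probs)
```

Unequally spaced cut points are how this model lets the step from category 1 to 2 differ from the step from 2 to 3, while a larger eta shifts probability towards the higher categories.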
