Is it common practice (and appropriate) to recode two binary dependent variables into a single 4-level dependent variable in order to use multinomial regression? For instance, say we have information on two related conditions (outcomes), A and B. A new 4-category variable would be defined such that:
- category 1 = Neither condition A nor B
- category 2 = Condition A (only)
- category 3 = Condition B (only)
- category 4 = Both conditions A and B
This allows running a single multinomial regression instead of using two binary logistic models that include the same predictors.
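For concreteness, the recoding described above could be sketched as follows. The arrays `a` and `b` are made-up 0/1 indicators used purely for illustration, not data from the question.

```python
import numpy as np

# Hypothetical 0/1 indicators for conditions A and B (illustrative values only).
a = np.array([0, 1, 0, 1, 1])
b = np.array([0, 0, 1, 1, 0])

# Map each (A, B) pair to the categories defined above:
# 1 = neither, 2 = A only, 3 = B only, 4 = both.
category = 1 + a + 2 * b

print(category)  # [1 2 3 4 2]
```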
Best Answer
As @Riaz Rizvi suggests, this may not be a good idea.
Flattening to a multinomial this way imposes a particular (and rather unlikely) covariance structure on the problem. Since you suspect, or at least wish to allow for the possibility, that the presence of A is informative about B, you should be working with a bivariate probit. Two separate logistic models cannot represent this dependence. The bivariate probit is a regression model with an explicit pair of correlated latent variables generating the choice probabilities, as discussed briefly in the link and at greater length in good econometrics texts.
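To make the latent-variable idea concrete, here is a minimal sketch of the joint outcome probabilities under a bivariate probit. It is an illustration under stated assumptions, not code from any of the texts referred to above: the function name `joint_prob`, the linear predictor values `xb1` and `xb2`, and the correlation `rho` are all hypothetical.

```python
from scipy.stats import multivariate_normal, norm


def joint_prob(y1, y2, xb1, xb2, rho):
    """P(A = y1, B = y2) under a bivariate probit.

    Assumed latent model for this sketch:
        a* = xb1 + e1,  b* = xb2 + e2,
        (e1, e2) bivariate normal with unit variances and correlation rho,
        A = 1 if a* > 0 else 0, and likewise for B.
    """
    q1, q2 = 2 * y1 - 1, 2 * y2 - 1   # map outcomes {0, 1} to signs {-1, +1}
    r = q1 * q2 * rho                 # sign-adjusted correlation
    cov = [[1.0, r], [r, 1.0]]
    return multivariate_normal.cdf([q1 * xb1, q2 * xb2], mean=[0.0, 0.0], cov=cov)


# Hypothetical linear predictors for one observation, and a positive correlation.
xb1, xb2, rho = 0.3, -0.2, 0.5

# Probabilities of the four (A, B) cells; they should sum to 1.
cells = {(y1, y2): joint_prob(y1, y2, xb1, xb2, rho)
         for y1 in (0, 1) for y2 in (0, 1)}

p_a = norm.cdf(xb1)                  # marginal P(A = 1)
p_b = norm.cdf(xb2)                  # marginal P(B = 1)
p_b_given_a = cells[(1, 1)] / p_a    # P(B = 1 | A = 1)

# Same predictors with rho = 0: the joint probability factorises.
p_both_indep = joint_prob(1, 1, xb1, xb2, 0.0)

print(sum(cells.values()), p_b, p_b_given_a, p_both_indep, p_a * p_b)
```

With `rho` greater than zero, `p_b_given_a` exceeds `p_b`, so observing A raises the probability of B. With `rho = 0` the joint probability is just the product of the two marginals, which is what two separately fitted binary models implicitly assume. Fitting the model would mean maximising the sum of the logs of `joint_prob` over observations with respect to the coefficients and `rho`.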