I'm exploring the effects of removing the intercept in a logistic regression model.
Assume a model:
$$\operatorname{logit}\big(P(Y = 1)\big) = \beta_1 x + \beta_2 z$$
with $x$ and $z$ being categorical variables with 2 levels each and no intercept.
I understood that having no intercept with categorical predictors produces coefficients that compare $P(Y = 1)$ in each level of the two predictors against a null case where $P(Y = 1) = 0.5$, i.e. $\operatorname{logit}\big(P(Y = 1)\big) = 0$.
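To make this premise concrete, here is a quick check of my own (with made-up data): with a single two-level factor and no intercept, each coefficient is the empirical log-odds of the response within that level.

```r
# My own illustration (hypothetical data): with one factor and no intercept,
# each coefficient equals the within-level empirical log-odds.
set.seed(1)
y <- as.factor(sample(1:2, 30, TRUE))
x <- as.factor(sample(1:2, 30, TRUE))
fit <- glm(y ~ x - 1, family = binomial)
# glm() treats the second level of y ("2") as the "success"
empirical <- qlogis(tapply(y == "2", x, mean))
all.equal(unname(coef(fit)), unname(empirical), tolerance = 1e-6)
```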
I noticed a phenomenon that I cannot understand. Using the glm() function in R, if you change the order of the variables on the right-hand side of the formula, the coefficients change too. But even more oddly, the first coefficient is always the same.
```r
y <- as.factor(sample(rep(1:2), 30, T))
x <- as.factor(sample(rep(1:2), 30, T))
z <- as.factor(sample(rep(1:2), 30, T))
coef(glm(y ~ x + z - 1, binomial))
#        x1         x2         z2 
#-0.1764783  0.3260739 -0.1335192 
coef(glm(y ~ z + x - 1, binomial))
#        z1         z2         x2 
#-0.1764783 -0.3099976  0.5025523 
```
As you can see, the first coefficient is the same in both models, while the others differ.
Here is what I expected, and how the behaviour differs from my expectation:
- Since every level of the two predictors is compared to the same null case, I expected the same coefficients in both models, regardless of the order in which the predictors appear.
- I expected to see a coefficient for every level of every predictor; instead, the coefficient for level 1 of the second predictor is not shown.
I therefore assume that only the first variable is compared against the null case, while the second is compared against a reference level; but what is this level? Is it $P(Y = 1 \mid X = 1 \cap Z = 1)$? Refitting one of the models WITH the intercept we get:
```r
coef(glm(y ~ x + z - 1, binomial))
#        x1         x2         z2 
#-0.1764783  0.3260739 -0.1335192 
coef(glm(y ~ x + z, binomial))
#(Intercept)          x2          z2 
# -0.1764783   0.5025523  -0.1335192 
```
As expected, x1 becomes the intercept, and x2 is now relative to x1. z1 is missing in this case too, and z2 is the same as in the model without intercept.
Should I therefore assume that the comparison against the null case $P(Y = 1) = 0.5$ is made only for the first variable in the formula, while the others are compared against the usual reference level?
Is this behavior normal?
What about the fact that the first coefficient has the same value whatever the order of the predictors in the formula?
What if I want to compare all levels of each predictor against the null case and get a coefficient for every level?
Or is that theoretically impossible for some reason I don't get?
The issue is not specific to a GLM. It's an issue of treatment contrasts.
You should also look at the model with intercept:
```r
set.seed(42)
y <- as.factor(sample(rep(1:2), 30, T))
x <- as.factor(sample(rep(1:2), 30, T))
z <- as.factor(sample(rep(1:2), 30, T))
fit0 <- glm(y ~ z + x, binomial)
coef(fit0)
#(Intercept)          z2          x2 
# -0.1151303   0.3228803   1.0588217 
predict(fit0, newdata = data.frame(z = factor(2), x = factor(1)))
#      1 
#0.20775 
```
Here the intercept represents the group x1/z1, and the other group means (on the logit scale) are obtained by adding the corresponding coefficients.
```r
fit1 <- glm(y ~ z + x - 1, binomial)
coef(fit1)
#        z1         z2         x2 
#-0.1151303  0.2077500  1.0588217 
predict(fit1, newdata = data.frame(z = factor(2), x = factor(1)))
#      1 
#0.20775 
```
Here the coefficient of z1 represents the group x1/z1, which is the same as the intercept in fit0. However, the coefficient of z2 represents the group x1/z2 instead of the difference between the group means; note that 0.208 = -0.115 + 0.323. The x2/* group means are calculated by adding the x2 coefficient to the x1/* group means.
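These identities can be verified directly; a quick sketch of my own, regenerating the data as above (the printed numbers may differ across R versions because of changes to sample(), but the equalities hold in any case, since the two fits are reparameterizations of the same model):

```r
# Sketch verifying the reparameterization: fit1's z1 equals fit0's intercept,
# and fit1's z2 equals fit0's intercept plus fit0's z2 (all on the logit scale).
set.seed(42)
y <- as.factor(sample(rep(1:2), 30, T))
x <- as.factor(sample(rep(1:2), 30, T))
z <- as.factor(sample(rep(1:2), 30, T))
fit0 <- glm(y ~ z + x, family = binomial)
fit1 <- glm(y ~ z + x - 1, family = binomial)
all.equal(unname(coef(fit1)["z1"]),
          unname(coef(fit0)["(Intercept)"]), tolerance = 1e-6)
all.equal(unname(coef(fit1)["z2"]),
          unname(coef(fit0)["(Intercept)"] + coef(fit0)["z2"]), tolerance = 1e-6)
```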
It should now be easy to understand why order matters here.
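One way to see the mechanics (a sketch of my own) is to inspect the design matrices R builds: removing the intercept gives only the first factor in the formula full dummy coding, while later factors keep their treatment contrasts.

```r
# Sketch: with the intercept removed, only the FIRST factor in the formula
# gets one column per level; subsequent factors still drop a reference level.
x <- factor(c(1, 1, 2, 2))
z <- factor(c(1, 2, 1, 2))
colnames(model.matrix(~ x + z - 1))  # "x1" "x2" "z2"
colnames(model.matrix(~ z + x - 1))  # "z1" "z2" "x2"
```

This also answers the last question: a column for every level of both factors is not estimable, because x1 + x2 and z1 + z2 both equal the all-ones column, so such a design matrix would be rank-deficient.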