I'm working through Gelman and Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models (2007), using the arm package, and trying to relate multilevel models to the econometric framework I'm more familiar with. I expected a multilevel model with a varying intercept and a non-varying slope to give results identical to a fixed effects regression with no constant.
I expected the following R and Stata code to produce the same results. They do not – can you tell me why?
R code:
M1 <- lmer(y ~ x1 + x2 + (1 | county))
Stata code:
reg y x1 x2 i.county, noconstant
The coefficients produced by these two approaches are quite different.
The Stata code regresses y on x1, x2 and K additional indicator variables, one for each county. What is R doing that is different? Is there an OLS regression analog?
Best Answer
The key difference is that lmer() fits a random effects model, while xtreg with the fe option fits a fixed effects model. A random effects model assumes that the random intercept is uncorrelated with x1 and x2, while a fixed effects model allows the county effects to be correlated with them.
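To make the contrast concrete, the two models can be written side by side. This is only a sketch in notation I am assuming here (i indexes observations, j[i] is the county of observation i, and the symbols are not taken from the question); both models share the same regression equation and differ only in how the county intercepts are treated.

```latex
% Shared regression equation (notation assumed for illustration)
y_i = \alpha_{j[i]} + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma_y^2)

% Random effects (varying intercept, as in lmer): the intercepts are
% draws from a common distribution, assumed uncorrelated with x_1, x_2
\alpha_j \sim N(\mu_\alpha, \sigma_\alpha^2),
\qquad \operatorname{Cov}(\alpha_j, x_{1i}) = \operatorname{Cov}(\alpha_j, x_{2i}) = 0

% Fixed effects (county dummies or within estimator): each \alpha_j is a
% free parameter, with no restriction on its relation to x_1, x_2
\alpha_1, \dots, \alpha_K \ \text{unrestricted}
```

When the county effects really are correlated with x1 or x2, the two approaches can give noticeably different slope estimates, which is consistent with what the question reports.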
You do not need to estimate the individual county effects that your reg command produces. You can have Stata fit a fixed effects model that reports only the constant and the effects of x1 and x2 by typing in Stata:
xtset county
xtreg y x1 x2, fe
This fixed effects model is exactly the same as the one you estimated with reg. You can see that by running this code:
// here I import the data you linked to
import delimited C:temptest.csv
// fixed effects regression using xtreg
xtset county
xtreg y x1 x2, fe
// fixed effects regression using reg
reg y x1 x2 ibn.county, hascons
The coefficients and standard errors of x1 and x2 are exactly the same in these two models.
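The equivalence can also be checked outside Stata. The following is a minimal sketch in plain Python (standard library only, chosen to be language-neutral rather than R or Stata) on simulated data, not the data linked in the question. The data-generating values, the helper names solve, ols and demean, and the sample sizes are all assumptions made for illustration. It compares a regression with one dummy per county and no constant against a regression on county-demeaned variables (the within transformation), and the slopes on x1 and x2 should agree up to floating-point error.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ols(X, y):
    """OLS coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Simulated data (hypothetical): 5 counties, 20 observations each.
# x1 is deliberately correlated with the county effect alpha.
random.seed(0)
n_counties, n_per = 5, 20
county, x1, x2, y = [], [], [], []
for c in range(n_counties):
    alpha = random.gauss(0, 2)
    for _ in range(n_per):
        a = random.gauss(alpha, 1)
        b = random.gauss(0, 1)
        county.append(c)
        x1.append(a)
        x2.append(b)
        y.append(alpha + 1.5 * a - 0.5 * b + random.gauss(0, 1))

# Dummy-variable regression: x1, x2 and one indicator per county, no constant.
X_lsdv = [[a, b] + [1.0 if c == k else 0.0 for k in range(n_counties)]
          for a, b, c in zip(x1, x2, county)]
beta_lsdv = ols(X_lsdv, y)[:2]

def demean(v):
    """Subtract the county mean from each value (within transformation)."""
    sums, counts = [0.0] * n_counties, [0] * n_counties
    for val, c in zip(v, county):
        sums[c] += val
        counts[c] += 1
    return [val - sums[c] / counts[c] for val, c in zip(v, county)]

# Within regression: demeaned y on demeaned x1 and x2, no constant.
X_within = [[a, b] for a, b in zip(demean(x1), demean(x2))]
beta_within = ols(X_within, demean(y))

print("dummy-variable slopes:", beta_lsdv)
print("within slopes:        ", beta_within)
```

The sketch only compares point estimates. Matching the standard errors as well requires the degrees-of-freedom correction for the absorbed county means, which is not reproduced here.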