I'm working through Gelman and Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models (2007), using the arm package, and trying to relate multilevel models to the econometric framework I'm more familiar with. I expected a multilevel model with a non-varying slope coefficient and a varying intercept to give results identical to a fixed effects regression with no constant.
I expected the following R and Stata code to produce the same results. They do not – can you tell me why?
M1 <- lmer(y ~ x1 + x2 + (1 | county))
reg y x1 x2 i.county, noconstant
The coefficients produced by these two approaches are quite different.
The Stata code regresses y on x1, x2 and K additional indicator variables, one for each county. What is R doing that is different? Is there an OLS regression analog?
The key difference is that lmer() fits a random effects model, whereas xtreg with the fe option fits a fixed effects model. A random effects model forces the random constant to be independent of x1 and x2, while a fixed effects model allows it to be correlated with them.
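To make the distinction concrete, the two models can be written side by side. The notation below is my own and not from the question (i indexes observations, j[i] is the county of observation i); treat it as a sketch of the standard formulation rather than the exact parameterisation either program uses.

```latex
% Shared structure: observation i belongs to county j[i]
y_i = \alpha_{j[i]} + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i

% Random effects (lmer): county intercepts are draws from a common
% distribution and are assumed independent of the regressors
\alpha_j \sim N(\mu_\alpha, \sigma_\alpha^2), \qquad \alpha_j \perp (x_1, x_2)

% Fixed effects (xtreg, fe, or one indicator per county): each \alpha_j
% is a free parameter, so it may be correlated with x_1 and x_2
\alpha_1, \dots, \alpha_K \ \text{unrestricted}
```

When the county intercepts really are correlated with x1 or x2, the independence assumption in the random effects version is violated, which is one reason the two sets of slope estimates can differ.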
The effects for individual counties that you get with your reg command are not necessary. You can have Stata produce a fixed effects model with only the constant and the effects of x1 and x2 by typing in Stata:
xtset county
xtreg y x1 x2, fe
This fixed effects model is exactly the same as the one you estimated with reg. You can see that by running this code:
// here I import the data you linked to
import delimited C:temptest.csv
// fixed effects regression using xtreg
xtset county
xtreg y x1 x2, fe
// fixed effects regression using reg
reg y x1 x2 ibn.county, hascons
The coefficients and standard errors of x1 and x2 in these two models are exactly the same.
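As for an OLS analog on the R side, the same equivalence can be checked with base R alone. This is a minimal sketch on simulated data, not the data linked in the question: the data frame d, the helper demean(), the model names m_lsdv and m_within, and all data-generating values are assumptions made up for illustration.

```r
# Simulated data (hypothetical): 10 counties with 20 observations each.
set.seed(1)
n_county <- 10
n_per    <- 20
county   <- rep(seq_len(n_county), each = n_per)

# County intercepts, deliberately correlated with x1.
alpha <- rnorm(n_county)
x1 <- alpha[county] + rnorm(n_county * n_per)
x2 <- rnorm(n_county * n_per)
y  <- alpha[county] + 2 * x1 - 1 * x2 + rnorm(n_county * n_per)
d  <- data.frame(y, x1, x2, county = factor(county))

# (a) Dummy-variable OLS: one indicator per county and no overall constant,
#     the counterpart of `reg y x1 x2 ibn.county, hascons`.
m_lsdv <- lm(y ~ 0 + x1 + x2 + county, data = d)

# (b) Within estimator: subtract county means from y, x1 and x2,
#     then run OLS without a constant.
demean <- function(v, g) v - ave(v, g)
m_within <- lm(demean(y, county) ~ 0 + demean(x1, county) + demean(x2, county),
               data = d)

# Compare the two sets of slope estimates.
b_lsdv   <- unname(coef(m_lsdv)[c("x1", "x2")])
b_within <- unname(coef(m_within))
all.equal(b_lsdv, b_within)
```

The slopes agree because demeaning within county is algebraically the same as partialling out the county indicators. The standard errors printed by summary(m_within) should not be compared directly, since lm() does not count the absorbed county means as estimated parameters; the dummy-variable fit m_lsdv is the one whose standard errors correspond to the Stata output.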