I have a fundamental question regarding standardization. Say I have two predictor vectors, a and b (a is temperature and b is rainfall), so they are on different scales. I want to regress y on linear and quadratic functions of a and b. I can do this as follows:
Method 1:

    lm(y ~ a + I(a^2) + b + I(b^2))

Method 2:

    a2 <- a^2
    b2 <- b^2
    lm(y ~ a + a2 + b + b2)
However, before running my model, I want to standardise a and b so that their effect sizes are comparable. Which of the two methods below is correct?
Method 1 (square the standardized variables):

    z.a <- scale(a, scale = TRUE, center = TRUE)
    z.b <- scale(b, scale = TRUE, center = TRUE)
    lm(y ~ z.a + I(z.a^2) + z.b + I(z.b^2))
OR
Method 2 (standardize the squared variables separately):

    a2 <- a^2
    b2 <- b^2
    z.a <- scale(a, scale = TRUE, center = TRUE)
    z.a2 <- scale(a2, scale = TRUE, center = TRUE)
    z.b <- scale(b, scale = TRUE, center = TRUE)
    z.b2 <- scale(b2, scale = TRUE, center = TRUE)
    lm(y ~ z.a + z.a2 + z.b + z.b2)
Best Answer
I was looking at the same question and ran a simple simulation with a Poisson GLM to find out. Both methods make exactly the same predictions as the model with unstandardized variables. The difference is in the coefficient tests. Method 2 (standardize the squared variable; "mod1" in the code below) gives exactly the same z-scores and p-values for both the linear and the quadratic term as the unstandardized model ("mod" below). Method 1 (square the standardized variable; "mod2" below) gave, in my simulation, a non-significant linear term and a significant quadratic term.
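To see why this happens, it may help to write out the reparameterization. This is only a sketch, and the symbols are not from the question: x is the predictor, m and s are its mean and standard deviation, m_2 and s_2 are those of x^2, eta is the linear predictor, and beta, gamma and delta are the coefficients of the unstandardized model, method 1 and method 2 respectively. Substituting x = m + s z and x^2 = m_2 + s_2 w into the unstandardized model gives:

```latex
\begin{align*}
z &= \frac{x - m}{s}, \qquad w = \frac{x^2 - m_2}{s_2} \\
\text{unstandardized:}\quad \eta &= \beta_0 + \beta_1 x + \beta_2 x^2 \\
\text{method 1:}\quad \eta &= \gamma_0 + \gamma_1 z + \gamma_2 z^2,
  \qquad \gamma_1 = s\,(\beta_1 + 2\beta_2 m), \quad \gamma_2 = s^2 \beta_2 \\
\text{method 2:}\quad \eta &= \delta_0 + \delta_1 z + \delta_2 w,
  \qquad \delta_1 = s\,\beta_1, \quad \delta_2 = s_2\,\beta_2
\end{align*}
```

All three are the same model in different coordinates, which is why the predictions agree. In method 2 each slope is only multiplied by a constant, so its z-score and p-value do not change. In method 1 the quadratic coefficient is also only rescaled, but the linear coefficient becomes s times the slope of the curve at the mean of x, so its test answers a different question. In the simulation, beta1 = 2, beta2 = -2 and the mean of vege is about 0.5, so that slope is about 2 - 2 = 0, which is consistent with the linear term coming out non-significant.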
Here is the simulation code in R:
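    # ----------------------------------
    # Simulate counts with a quadratic effect of vege on the log scale
    n.site <- 200
    vege <- sort(runif(n.site, 0, 1))
    alpha.lam <- 2
    beta1.lam <- 2
    beta2.lam <- -2
    lam <- exp(alpha.lam + beta1.lam*vege + beta2.lam*(vege^2))
    N <- rpois(n.site, lam)
    plot(vege, lam)

    z.veg <- scale(vege)        # standardized linear term
    z.veg2 <- scale(vege^2)     # method 2: standardize the squared variable
    z.vege2.1 <- z.veg^2        # method 1: square the standardized variable

    # predict() has no `data` argument (it is `newdata`); with it omitted,
    # predictions are for the fitted data. type = "response" puts them on
    # the same scale as lam.
    mod <- glm(N ~ vege + I(vege^2), family = poisson)     # unstandardized
    pred <- predict(mod, type = "response")
    mod1 <- glm(N ~ z.veg + z.veg2, family = poisson)      # method 2
    pred1 <- predict(mod1, type = "response")
    mod2 <- glm(N ~ z.veg + z.vege2.1, family = poisson)   # method 1
    pred2 <- predict(mod2, type = "response")

    summary(mod)
    summary(mod1)
    summary(mod2)

    # True mean and the three sets of predictions, side by side
    par(mfrow=c(2, 2))
    plot(vege, lam)
    plot(vege, pred)
    plot(vege, pred1)
    plot(vege, pred2)
    # ----------------------------------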
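The same comparison can be run for the lm() setup in the question. The following is a minimal sketch on made-up data: the sample size, the means and standard deviations of a and b, and the coefficients used to generate y are all assumptions chosen for illustration, not values from the question.

```r
# Hypothetical data: a (temperature) and b (rainfall) on different scales
set.seed(42)
n <- 200
a <- rnorm(n, mean = 20, sd = 5)
b <- rnorm(n, mean = 100, sd = 30)
y <- 2 + 0.8 * a - 0.02 * a^2 + 0.05 * b - 0.0002 * b^2 + rnorm(n)

# as.numeric() drops the matrix attributes that scale() returns
z.a  <- as.numeric(scale(a))
z.b  <- as.numeric(scale(b))
z.a2 <- as.numeric(scale(a^2))
z.b2 <- as.numeric(scale(b^2))

fit_raw <- lm(y ~ a + I(a^2) + b + I(b^2))          # unstandardized
fit_m1  <- lm(y ~ z.a + I(z.a^2) + z.b + I(z.b^2))  # method 1
fit_m2  <- lm(y ~ z.a + z.a2 + z.b + z.b2)          # method 2

# Helper: t statistics of the four slopes, in model order (intercept dropped)
slope_t <- function(fit) unname(coef(summary(fit))[-1, "t value"])
t_raw <- slope_t(fit_raw)
t_m1  <- slope_t(fit_m1)
t_m2  <- slope_t(fit_m2)

# All three models give the same fitted values
same_fit_m1 <- isTRUE(all.equal(unname(fitted(fit_raw)), unname(fitted(fit_m1)),
                                tolerance = 1e-6))
same_fit_m2 <- isTRUE(all.equal(unname(fitted(fit_raw)), unname(fitted(fit_m2)),
                                tolerance = 1e-6))

# Method 2: all four slope t statistics match the unstandardized model
same_t_m2 <- isTRUE(all.equal(t_raw, t_m2, tolerance = 1e-6))

# Method 1: only the quadratic terms (positions 2 and 4) match
same_t_m1_quad <- isTRUE(all.equal(t_raw[c(2, 4)], t_m1[c(2, 4)],
                                   tolerance = 1e-6))

# Side-by-side view; the linear rows of method1 are the ones that differ
print(round(cbind(raw = t_raw, method1 = t_m1, method2 = t_m2), 3))
```

If the effect sizes are the goal, method 2 keeps the same tests as the unstandardized fit, while method 1 changes what the linear coefficient means (the slope at the mean rather than at zero). Which of those is preferable depends on the question being asked.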