I recently ran a regression of the following form:

`mod <- lm(log(y) ~ log(x))`

to examine how y scales as a function of x. I then examined the top 20 residuals (super-linear scaling relative to the fitted trendline) and the bottom 20 residuals (sub-linear scaling relative to the fitted trendline), and noticed that all of these observations share a startling regularity in a third variable, `z`: all the positive residuals have very large values of `z`, and all the negative residuals have very small values of `z`.

I want to be able to demonstrate that this is a meaningful, statistically significant pattern. My intuition was to run

`lm(z ~ resid(mod))`

but this strikes me as wrong. Is there a way to capture this pattern using residuals, or is this the wrong way of thinking about it altogether?

Re: Heteroskedastic Standard Errors

[Figure: QQ plot of the residuals]

[Figure: Residuals vs. Fitted Values]

The second graph in particular is somewhat alarming. Should I be worried about heteroskedastic standard errors with a huge dataset (300,000 obs) if I'm using robust SEs?


#### Best Answer

Well, you want to understand how the residuals could depend on `z`, so the model you should look at would be

`lm( resid(mod) ~ z ) `

but maybe first look at (and show us) the corresponding plot of the residuals against `z`. It may be that what you see is the spread of the residuals depending on `z`, not their mean (heteroskedasticity). Then try a model like

`lm( I(resid(mod)^2) ~ z )`

(or replace the square with the absolute value). If this confirms the pattern, maybe try a more complex model such as `gamlss`, which permits simultaneous estimation of mean and variance functions. You could look through gamlss for some examples.
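To make the two checks concrete, here is a minimal sketch in base R. The data are simulated, and the data-generating process (a `z` that shifts both the mean and the spread of `log(y)`), the sample size, and the names `r`, `mean_fit` and `var_fit` are all assumptions made up for illustration; nothing here comes from the asker's dataset.

```r
set.seed(1)

# Hypothetical data: z raises both the mean and the spread of log(y).
n <- 2000
x <- exp(rnorm(n))
z <- runif(n)
y <- exp(1 + 0.8 * log(x) + 0.5 * z + rnorm(n, sd = 0.2 + 0.5 * z))

# The original model, which ignores z.
mod <- lm(log(y) ~ log(x))
r <- resid(mod)

# Check 1: does the mean of the residuals depend on z?
mean_fit <- lm(r ~ z)

# Check 2: does the spread of the residuals depend on z?
# I() keeps ^ as arithmetic squaring inside the formula.
var_fit <- lm(I(r^2) ~ z)

# Coefficient tables: estimate, standard error, t value, p-value.
print(summary(mean_fit)$coefficients)
print(summary(var_fit)$coefficients)
```

In practice you would run `plot(z, r)` before either regression, as suggested above. A clearly nonzero slope in `mean_fit` points to a shift in the mean, while a nonzero slope in `var_fit` points to heteroskedasticity related to `z`. Swapping `I(r^2)` for `abs(r)` gives the absolute-value variant.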
