Solved – Any way the adjusted $R^2$ might decrease by adding predictors?

Let's consider a multiple linear regression formula:

$ \hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 $ (1)

which produces adjusted $R^2 = r_1$.

Now I want to add one predictor to (1), which turns it into:

$ \hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 $ (2)

which produces adjusted $R^2 = r_2$.

If the data fed into (1) and (2) are exactly the same, is there any way to explain $r_2 < r_1$ apart from a code bug?

Yes, it's definitely possible for adjusted $R^2$ to decrease when you add parameters.

Ordinary $R^2$ can't decrease when you add a predictor, but adjusted $R^2$ certainly can. We can write the relationship between the two like so:

$R_{adj}^2 = R^2-(1-R^2)\frac{p}{n-p-1}$

where $p$ is the number of predictors and $n$ is the number of observations.

Note that both factors in the product $(1-R^2)\cdot\frac{p}{n-p-1}$ are positive (unless $R^2=1$), so if $R^2<1$, then $R_{adj}^2 < R^2$.

If $R^2<\frac{p}{n-1}$, then adjusted $R^2$ will be negative.
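A minimal numerical check of both claims, sketched in pure Python (the values of $R^2$, $n$, and $p$ below are illustrative choices of my own, not from the post):

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 from ordinary R^2, n observations, and p predictors."""
    return r2 - (1 - r2) * p / (n - p - 1)

# Whenever R^2 < 1, the penalty term is positive, so adjusted R^2 < R^2:
print(adjusted_r2(0.5, 50, 3))  # about 0.467, below R^2 = 0.5

# And if R^2 < p/(n-1), adjusted R^2 comes out negative:
n, p = 50, 3
r2 = 0.9 * p / (n - 1)          # just under the p/(n-1) threshold
print(adjusted_r2(r2, n, p))    # slightly negative
```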

$R_{adj}^2$ will decrease when a term is added if the new model's $R^2$ didn't increase over the first model's by at least as much as would be expected for an unrelated variable.
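That break-even point can be made concrete by rearranging the adjusted-$R^2$ formula. A sketch in Python, under my own algebra and with illustrative numbers (not from the post):

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 from ordinary R^2, n observations, and p predictors."""
    return r2 - (1 - r2) * p / (n - p - 1)

def breakeven_r2(r2_old, n, p):
    """R^2 at which a (p+1)-predictor model's adjusted R^2 exactly matches
    that of a p-predictor model with R^2 = r2_old.  Derived by setting the
    two adjusted-R^2 expressions equal and solving for the new R^2."""
    a = (p + 1) / (n - p - 2)  # penalty ratio for the larger model
    b = p / (n - p - 1)        # penalty ratio for the smaller model
    return (r2_old * (1 + b) - b + a) / (1 + a)

# Illustrative example: n = 50 observations, current model has
# p = 2 predictors and R^2 = 0.30.
n, p, r2_old = 50, 2, 0.30
needed = breakeven_r2(r2_old, n, p)
print(f"The new R^2 must exceed {needed:.4f} "
      f"(an increase of {needed - r2_old:.4f}) for adjusted R^2 to rise")
```

Any added variable whose contribution to $R^2$ falls short of that increment lowers adjusted $R^2$, even though ordinary $R^2$ still goes up.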

We can see this happen quite easily: I just generated three unrelated variables in R.

  1. If we fit a linear regression with just the first $x$ (lm(y~x1)), the adjusted $R^2$ is smaller than with the null model (which is 0):
    Multiple R-squared: 0.0007048, Adjusted R-squared: -0.05481

  2. If we fit both independent variables (lm(y~x1+x2)), the adjusted $R^2$ goes down again (and the $R^2$ – necessarily – goes up):
    Multiple R-squared: 0.00199, Adjusted R-squared: -0.1154
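The same demonstration can be sketched in Python with NumPy instead of R (the seed and sample size are my own choices, so the specific numbers will differ from the R output above):

```python
import numpy as np

rng = np.random.default_rng(1)            # arbitrary seed of my own
n = 50
y, x1, x2 = rng.standard_normal((3, n))   # three mutually unrelated variables

def fit_r2(y, *xs):
    """Ordinary and adjusted R^2 for an OLS fit of y on the given
    predictors (with an intercept), via least squares."""
    X = np.column_stack([np.ones(len(y)), *xs])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    p = len(xs)
    return r2, r2 - (1 - r2) * p / (len(y) - p - 1)

r2_1, adj_1 = fit_r2(y, x1)       # analogue of lm(y ~ x1)
r2_2, adj_2 = fit_r2(y, x1, x2)   # analogue of lm(y ~ x1 + x2)
print(f"y ~ x1:      R^2 = {r2_1:.5f}, adjusted = {adj_1:.5f}")
print(f"y ~ x1 + x2: R^2 = {r2_2:.5f}, adjusted = {adj_2:.5f}")
```

Ordinary $R^2$ for the larger (nested) model can never be smaller, but since the predictors are unrelated to $y$, the adjusted value will typically sit below it, and usually below zero.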

For adjusted $R^2$ to increase, the added variable has to explain more additional variation in the data than would be expected from an unrelated variable; and it's entirely possible for a variable to add less than that expectation, just by chance.
