Let's consider a multiple linear regression formula:

$ \hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 $ (1)

which produces adjusted $R^2 = r_1$.

Now I add one predictor to (1), which turns it into:

$ \hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 $ (2)

which produces adjusted $R^2 = r_2$.

If the data fed into (1) and (2) are exactly the same, is there any way to explain $r_2 < r_1$ apart from a code bug?


#### Best Answer

Yes, it's definitely possible for adjusted $R^2$ to decrease when you add parameters.

Ordinary $R^2$ can't decrease, but adjusted-$R^2$ certainly can. We can write the relationship between the two like so:

$R_{adj}^2 = R^2 - (1-R^2)\frac{p}{n-p-1}$

Note that both factors in the product $(1-R^2)\cdot\frac{p}{n-p-1}$ are positive whenever $p \geq 1$ and $R^2 < 1$, so in that case $R_{adj}^2 < R^2$.

If $R^2 < \frac{p}{n-1}$, then adjusted $R^2$ will be negative.
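To make both facts concrete, here is a small Python sketch (a hypothetical helper, not part of the original answer) that evaluates the formula above for a few values of $R^2$, $n$, and $p$:

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 from ordinary R^2, sample size n, and p predictors."""
    return r2 - (1 - r2) * p / (n - p - 1)

# Adjusted R^2 sits below R^2 whenever R^2 < 1:
print(adjusted_r2(0.5, n=20, p=2))   # 0.5 - 0.5 * 2/17, about 0.441

# ...and is negative whenever R^2 < p / (n - 1):
print(adjusted_r2(0.05, n=20, p=2))  # 0.05 < 2/19, so the result is negative
```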

$R_{adj}^2$ will decrease when you add a term if the new model's $R^2$ doesn't increase over the old model's by at least as much as would be expected from adding an unrelated variable.

We can see this happen quite easily: I just generated three unrelated variables in R via

`x1=rnorm(20);x2=rnorm(20);y=rnorm(20)`

If we fit a linear regression with just the first $x$ (`lm(y~x1)`), the adjusted $R^2$ is smaller than that of the null model (which is 0):

`Multiple R-squared: 0.0007048, Adjusted R-squared: -0.05481`

If we fit both independent variables (`lm(y~x1+x2)`), the adjusted $R^2$ goes down again (and the $R^2$, necessarily, goes up):

`Multiple R-squared: 0.00199, Adjusted R-squared: -0.1154`
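The same experiment can be sketched in Python with NumPy (a rough equivalent of the R code above, not a reproduction of its exact numbers; with unrelated variables the adjusted $R^2$ usually, but not always, drops, so the values depend on the seed):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
x1, x2, y = rng.standard_normal((3, n))  # three unrelated variables

def r2_pair(X, y):
    """Ordinary and adjusted R^2 for OLS of y on the columns of X (plus intercept)."""
    n, p = X.shape
    X1 = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return r2, r2 - (1 - r2) * p / (n - p - 1)

r2_a, adj_a = r2_pair(x1[:, None], y)                # like lm(y ~ x1)
r2_b, adj_b = r2_pair(np.column_stack([x1, x2]), y)  # like lm(y ~ x1 + x2)

# Ordinary R^2 can only go up when a column is added; adjusted R^2 need not.
print(r2_a, adj_a)
print(r2_b, adj_b)
```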

For adjusted $R^2$ to increase, the added variable has to explain more additional variation in the data than would be expected from an unrelated variable; and by chance, even a truly unrelated variable can add less than it would be expected to.
