Someone claims that the adjusted $R^2$ will increase with the addition of an extra variable.

I wonder why this would be, since it is called *adjusted* (in contrast to the ordinary $R^2$).

The only condition that has to be satisfied (for the adjusted $R^2$ to increase) is that the F-value (by the way, is there a simple way to calculate it?) for the null hypothesis that the new variable's coefficient is zero is greater than 1.

Can someone give me a hint about the link between the adjusted $R^2$ and the F-statistic of that test?

And anyway, who would want to include a new variable in a multiple OLS regression model if its beta was tested to be 0? In that case the adjusted $R^2$ would always increase.


#### Best Answer

The assertion of the question is true. We usually show the inverse situation, i.e. the case of *dropping* one variable. In a linear multiple regression model $y_i = \mathbf x_i'\beta + u_i,\; i=1,\dots,n$, with $k$ regressors (including the constant term), if the *t-ratio* $t$ of a variable is *less* than 1 in absolute value, then dropping this *one* variable will increase the adjusted $R^2$, denoted $\bar R^2$. When dropping one variable, the corresponding F-statistic (reflecting just one linear restriction) is equal to $t^2$ (see this post). So both should be *smaller* than unity for $\bar R^2$ to *increase*. This result can be proven as follows: $\bar R^2$ is defined by

$$ 1- \bar R^2 = \frac{n-1}{n-k}\,(1-R^2) \qquad [1]$$

Denoting $S_{yy} = \sum_{i=1}^{n}(y_i-\bar y)^2$ and since $R^2 = 1 - \frac{\sum_{i=1}^{n}\hat u_i^2}{S_{yy}}$, we can write

$$ (1- \bar R^2) = \frac{n-1}{n-k}\left(1-1 + \frac{\sum_{i=1}^{n}\hat u_i^2}{S_{yy}}\right) = \frac{n-1}{S_{yy}}\,\frac{\sum_{i=1}^{n}\hat u_i^2}{n-k}$$

$$\Rightarrow\; (1- \bar R^2)\,\frac{S_{yy}}{n-1} = \hat\sigma^2 \qquad [2]$$

By dropping a regressor, $S_{yy}$ and $n$ remain unaffected. So, as a matter of mathematical necessity, the term $(1-\bar R^2)$ on the LHS of $[2]$ moves in the *same* direction as the RHS: as the OLS estimated variance of the regression decreases, so does $(1-\bar R^2)$, and hence $\bar R^2$ *increases* as $\hat\sigma^2$ decreases.
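A quick numerical check of identity $[2]$ may help. The following is a minimal sketch using plain numpy with simulated data; the variable names are illustrative only and are not taken from the original post.

```python
# Minimal numeric check of identity [2]: (1 - adj R^2) * S_yy / (n-1) = sigma^2_hat
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 4                               # k regressors, including the constant
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 0.5, -0.3, 0.2]) + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
rss = resid @ resid
S_yy = np.sum((y - y.mean()) ** 2)

R2 = 1 - rss / S_yy
adj_R2 = 1 - (n - 1) / (n - k) * (1 - R2)   # definition [1]
sigma2_hat = rss / (n - k)                  # OLS estimate of the error variance

print(np.isclose((1 - adj_R2) * S_yy / (n - 1), sigma2_hat))   # True
```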

Consider now dropping one regressor, and index the quantities related to this restricted regression with $r$. Denote by $RSS$ the residual sum of squares of the unrestricted regression.

The F-statistic to test whether the restricted regression with $k-1$ regressors is "better" than the regression with $k$ regressors is

$$F(1,n-k)= \frac{RSS_r - RSS}{RSS/(n-k)} = \frac{(n-k+1)\,\hat\sigma_r^2 - (n-k)\,\hat\sigma^2}{\hat\sigma^2}$$

$$=(n-k+1)\,\frac{\hat\sigma_r^2}{\hat\sigma^2} - (n-k) \;\Rightarrow\; \frac{\hat\sigma_r^2}{\hat\sigma^2} = \frac{F+(n-k)}{1+(n-k)} \qquad [3]$$

From $[3]$ it follows that if

$$F<1 \;\Rightarrow\; \hat\sigma_r^2 < \hat\sigma^2 \;\Rightarrow\; \bar R_r^2 > \bar R^2$$

And $F(1,n-k)=t^2 <1 \Rightarrow |t|<1$.
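The two facts can be checked together numerically: $F(1,n-k)=t^2$ for a single dropped regressor, and $|t|<1$ exactly when dropping that regressor raises $\bar R^2$. Here is a numpy-only sketch; the helper `ols_stats` and the simulated data are assumptions made for the demo, not part of the original answer.

```python
# Check F(1, n-k) = t^2 and the |t| < 1  <=>  adjusted R^2 rises equivalence.
import numpy as np

def ols_stats(X, y):
    """Return (RSS, adjusted R^2, t-ratios) for a plain OLS fit."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = resid @ resid
    sigma2 = rss / (n - k)
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    S_yy = np.sum((y - y.mean()) ** 2)
    adj_r2 = 1 - (n - 1) / (n - k) * (rss / S_yy)
    return rss, adj_r2, beta / se

rng = np.random.default_rng(1)
n, k = 60, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 0.5, -0.3, 0.0]) + rng.normal(size=n)   # last true beta is 0

rss, adj_r2, t = ols_stats(X, y)                 # unrestricted model
rss_r, adj_r2_r, _ = ols_stats(X[:, :-1], y)     # last regressor dropped

F = (rss_r - rss) / (rss / (n - k))              # one linear restriction
print(np.isclose(F, t[-1] ** 2))                 # True: F(1, n-k) = t^2
print((abs(t[-1]) < 1) == (adj_r2_r > adj_r2))   # True: |t| < 1 iff adj R^2 rises
```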

Beware that the above results hold only when considering dropping *just one* regressor. Assume that we run the initial regression with $k$ regressors and we observe that *two* of them have t-ratios smaller than unity. This does *not* imply necessarily that if we drop *both simultaneously*, we will end up with a higher $bar R^2$.

Now think in reverse: start from the "restricted" model and *add* one variable. The same algebra shows that adding the variable increases $\bar R^2$ exactly when its F-statistic (equivalently, the square of its t-ratio in the enlarged model) exceeds 1, which is the claim in the question.
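Since the question also asked how to compute the relevant statistic in practice, here is a compact, self-contained illustration of this reverse direction, assuming the statsmodels package; the data and names below (including the irrelevant regressor `z`) are made up for the demo.

```python
# Add one variable and compare adjusted R^2 of the two nested models.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 80
x1, x2, z = rng.normal(size=(3, n))
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(size=n)     # z is irrelevant by construction

small = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
big = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2, z]))).fit()

t_new = big.tvalues[-1]                                 # t-ratio of the added variable z
# The two booleans agree, as the derivation says: adj R^2 rises iff |t| > 1 (iff F > 1).
print(abs(t_new) > 1, big.rsquared_adj > small.rsquared_adj)
```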
