# Solved – How is adjusted coefficient of determination (\$R^2\$) linked to the F-values of a test against zero when adding a new variable

Someone claims that the adjusted \$R^2\$ will increase with the addition of an extra variable.
I wonder why, as it is called adjusted (in contrast to the normal \$R^2\$).

The only condition that has to be satisfied for the adjusted \$R^2\$ to increase is that the F-value of the test of the null hypothesis that the new variable's coefficient is zero is greater than 1 (by the way, is there a simple way to calculate it?).

Can someone give me a hint where the links between the adjusted \$R^2\$ and the F-Stat of that test are?

Then again, who would include a new variable in a multiple OLS regression model if its beta had been tested to be 0? So the adjusted \$R^2\$ always changes.


The assertion in the question is true. We usually show the inverse situation, i.e. the case of dropping one variable. In a linear multiple regression model \$y_i = x_i'\beta + u_i,\; i=1,\dots,n\$, with \$k\$ regressors (including the constant term), if the t-ratio \$t\$ of a variable is less than 1 in absolute value, then dropping this one variable will increase the adjusted R-squared, \$\bar R^2\$. When a single variable is dropped, the corresponding F-statistic (reflecting just one linear restriction) is equal to \$t^2\$ (see this post). So both \$|t|\$ and \$F\$ should be smaller than unity for \$\bar R^2\$ to increase. This result can be proven as follows: \$\bar R^2\$ is defined as

\$\$ 1- \bar R^2 = \frac {n-1}{n-k} (1-R^2) \qquad [1]\$\$

Denoting \$S_{yy} = \sum_{i=1}^{n}(y_i-\bar y)^2\$ and since \$R^2 = 1 - \frac{\sum_{i=1}^{n}\hat u_i^2}{S_{yy}}\$, we can write

\$\$ (1- \bar R^2) = \frac {n-1}{n-k} \left(1-1 + \frac{\sum_{i=1}^{n}\hat u_i^2}{S_{yy}}\right) = \frac {n-1}{S_{yy}} \frac{\sum_{i=1}^{n}\hat u_i^2}{n-k}\$\$

\$\$\Rightarrow (1- \bar R^2)\frac {S_{yy}}{n-1} = \hat \sigma^2 \qquad [2]\$\$

Dropping a regressor leaves \$S_{yy}\$ and \$n\$ unaffected. So, as a matter of mathematical necessity, the term \$(1- \bar R^2)\$ on the LHS of \$[2]\$ moves in the same direction as the RHS: as the OLS estimated variance of the regression, \$\hat \sigma^2\$, decreases, so does \$(1-\bar R^2)\$, and hence \$\bar R^2\$ increases.
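Identity \$[2]\$ can be checked numerically. The following is a minimal sketch, assuming NumPy and simulated data; the helper name `ols_summary`, the seed and the coefficient values are illustrative choices and not part of the original answer.

```python
import numpy as np

def ols_summary(X, y):
    """Fit y on X by OLS (X is assumed to already contain the constant column)."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = float(resid @ resid)                    # residual sum of squares
    syy = float(((y - y.mean()) ** 2).sum())      # total sum of squares S_yy
    sigma2 = rss / (n - k)                        # estimated error variance
    r2 = 1.0 - rss / syy
    adj_r2 = 1.0 - (n - 1) / (n - k) * (1.0 - r2)  # definition [1]
    return {"sigma2": sigma2, "adj_r2": adj_r2, "syy": syy}

# Simulated data: a constant plus three regressors (hypothetical coefficients).
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([1.0, 2.0, 0.0, -1.0]) + rng.normal(size=n)

fit = ols_summary(X, y)
lhs = (1.0 - fit["adj_r2"]) * fit["syy"] / (n - 1)   # LHS of [2]
identity_holds = bool(np.isclose(lhs, fit["sigma2"]))
print(identity_holds)
```

Because \$S_{yy}\$ and \$n\$ do not depend on which regressors are included, comparing \$\bar R^2\$ across models is the same as comparing \$\hat \sigma^2\$.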

Consider now dropping one regressor, and index the various quantities related to this restricted regression with \$r\$. Denote by \$RSS\$ the residual sum of squares, so that \$RSS = (n-k)\hat \sigma^2\$ and \$RSS_r = (n-k+1)\hat \sigma_r^2\$.

The F-statistic to test whether the restricted regression with \$k-1\$ regressors is "better" than the regression with \$k\$ regressors is

\$\$F = \frac {(RSS_r - RSS)/1}{RSS/(n-k)} = \frac {(n-k+1)\hat \sigma_r^2 - (n-k)\hat \sigma^2}{\hat \sigma^2}\$\$

\$\$=(n-k+1)\frac {\hat \sigma_r^2}{\hat \sigma^2} - (n-k) \Rightarrow \frac {\hat \sigma_r^2}{\hat \sigma^2} = \frac {F+(n-k)}{1+(n-k)} \qquad [3]\$\$

From \$[3]\$ it is obvious that if

\$\$F<1 \Rightarrow \hat \sigma_r^2 <\hat \sigma^2 \Rightarrow \bar R_r^2 > \bar R^2\$\$

And \$F(1,n-k)=t^2 <1 \Rightarrow |t|<1\$.
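The single-regressor rule can be illustrated on simulated data. This is a sketch under stated assumptions: NumPy, an arbitrary seed, and made-up coefficients; `fit_ols` is a hypothetical helper. For each non-constant regressor it checks that the F-statistic for dropping it equals \$t^2\$, and that the adjusted R-squared rises exactly when \$|t|<1\$.

```python
import numpy as np

def fit_ols(X, y):
    """Return RSS, t-ratios and adjusted R^2 for an OLS fit of y on X."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    rss = float(resid @ resid)
    sigma2 = rss / (n - k)
    t = beta / np.sqrt(sigma2 * np.diag(xtx_inv))   # coefficient / standard error
    syy = float(((y - y.mean()) ** 2).sum())
    adj_r2 = 1.0 - (n - 1) / (n - k) * rss / syy
    return rss, t, adj_r2

rng = np.random.default_rng(1)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])
# Hypothetical coefficients: some regressors matter, some barely do.
y = X @ np.array([1.0, 1.5, 0.0, 0.05, -0.8]) + rng.normal(size=n)

k = X.shape[1]
rss, t, adj_r2 = fit_ols(X, y)

f_equals_t2 = []
rule_holds = []
for j in range(1, k):                        # skip the constant in column 0
    X_r = np.delete(X, j, axis=1)            # restricted model without regressor j
    rss_r, _, adj_r2_r = fit_ols(X_r, y)
    F = (rss_r - rss) / (rss / (n - k))      # one linear restriction
    f_equals_t2.append(bool(np.isclose(F, t[j] ** 2)))
    rule_holds.append(bool((abs(t[j]) < 1) == (adj_r2_r > adj_r2)))

print(f_equals_t2, rule_holds)
```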

Beware that the above results hold only when considering dropping just one regressor. Assume that we run the initial regression with \$k\$ regressors and we observe that two of them have t-ratios smaller than unity in absolute value. This does not necessarily imply that if we drop both simultaneously, we will end up with a higher \$\bar R^2\$.
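Repeating the same algebra with \$q\$ dropped regressors suggests that what matters in that case is the joint F-statistic, \$F = \frac{(RSS_r - RSS)/q}{RSS/(n-k)}\$, rather than the individual t-ratios. The sketch below is an illustration under assumptions that are not in the original answer: NumPy, simulated data, and two nearly collinear regressors, a setup in which small individual t-ratios alongside a large joint F are common but not guaranteed for any particular draw. The helper `fit_ols` is hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Return RSS, t-ratios and adjusted R^2 for an OLS fit of y on X."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    rss = float(resid @ resid)
    t = beta / np.sqrt(rss / (n - k) * np.diag(xtx_inv))
    syy = float(((y - y.mean()) ** 2).sum())
    return rss, t, 1.0 - (n - 1) / (n - k) * rss / syy

rng = np.random.default_rng(2)
n = 40
z = rng.normal(size=n)
x1 = z + 0.05 * rng.normal(size=n)   # x1 and x2 are nearly collinear
x2 = z + 0.05 * rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 0.5 * x1 + 0.5 * x2 + rng.normal(size=n)

k, q = X.shape[1], 2
rss, t, adj_r2 = fit_ols(X, y)
rss_r, _, adj_r2_r = fit_ols(X[:, :1], y)   # restricted model: constant only

joint_F = ((rss_r - rss) / q) / (rss / (n - k))
drop_both_raises_adj_r2 = bool(adj_r2_r > adj_r2)

print("individual t-ratios:", t[1:])
print("joint F:", joint_F)
print("dropping both raises adjusted R^2:", drop_both_raises_adj_r2)
```

Whatever the individual t-ratios turn out to be, dropping both regressors raises the adjusted R-squared only if the joint F is below 1.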

Now think in reverse – start from the "restricted" model and add one variable.
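Stated in the "adding" direction, the same algebra gives the condition from the question directly. As a sketch in notation that is assumed here rather than taken from the original answer, let \$\hat \sigma_k^2\$ and \$RSS_k\$ refer to the model with \$k\$ regressors, \$\hat \sigma_{k+1}^2\$ and \$RSS_{k+1}\$ to the model after adding one variable, and let \$F\$ be the statistic for the null hypothesis that the new coefficient is zero. Using \$RSS_k = (n-k)\hat \sigma_k^2\$ and \$RSS_{k+1} = (n-k-1)\hat \sigma_{k+1}^2\$:

\$\$F = \frac {RSS_k - RSS_{k+1}}{RSS_{k+1}/(n-k-1)} \Rightarrow \frac {\hat \sigma_k^2}{\hat \sigma_{k+1}^2} = \frac {F+(n-k-1)}{n-k}\$\$

\$\$F>1 \iff \hat \sigma_{k+1}^2 < \hat \sigma_k^2 \iff \bar R_{k+1}^2 > \bar R_k^2\$\$

So adding a variable raises the adjusted R-squared exactly when its F-value (equivalently, its squared t-ratio) exceeds 1, which is a much weaker requirement than statistical significance at conventional levels.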
