# Solved – Expected value of \$R^2\$, the coefficient of determination, under the null hypothesis

The text states:

"The logic of the adjustment is the following: in ordinary multiple regression, a random predictor explains on average a proportion \$1/(n - 1)\$ of the response's variation, so that \$m\$ random predictors explain together, on average, \$m/(n - 1)\$ of the response's variation; in other words, the expected value of \$R^2\$ is \$\mathbb{E}(R^2) = m/(n - 1)\$. Applying the [\$R^2_\mathrm{adjusted}\$] formula to that value, where all predictors are random, gives \$R^2_\mathrm{adjusted} = 0\$."

This seems to be a very simple and interpretable motivation for \$R^2_\mathrm{adjusted}\$. However, I have not been able to work out that \$\mathbb{E}(R^2) = 1/(n - 1)\$ for a single random (i.e. uncorrelated) predictor.

Could someone point me in the right direction here?


The quoted claim is correct as a matter of mathematical statistics. See this post for the derivation of the distribution of \$R^2\$ under the hypothesis that all regressors (bar the constant term) are uncorrelated with the dependent variable ("random predictors").

This distribution is a Beta, with \$m\$ being the number of predictors without counting the constant term, and \$n\$ the sample size,

\$\$R^2 \sim \mathrm{Beta}\left(\frac{m}{2}, \frac{n-m-1}{2}\right)\$\$

and so

\$\$E(R^2) = \frac{m/2}{(m/2)+[(n-m-1)/2]} = \frac{m}{n-1}\$\$
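A quick Monte Carlo check can make the result tangible. The sketch below is not part of the original answer; it assumes independent standard normal predictors and response, and the function name `simulate_r2` and the values of `n`, `m`, and `reps` are arbitrary choices for illustration. Each replication regresses pure noise on \$m\$ noise predictors plus a constant and records \$R^2\$; the average should land close to \$m/(n-1)\$.

```python
import numpy as np


def simulate_r2(n, m, reps=4000, seed=0):
    """Return `reps` values of R^2 from regressing noise on m noise predictors.

    Assumption for illustration: predictors and response are independent
    standard normal draws, so no predictor is related to the response.
    """
    rng = np.random.default_rng(seed)
    r2 = np.empty(reps)
    for i in range(reps):
        # Design matrix: a constant column plus m random predictors.
        X = np.column_stack([np.ones(n), rng.standard_normal((n, m))])
        y = rng.standard_normal(n)
        # Ordinary least squares fit.
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        ss_res = resid @ resid
        ss_tot = np.sum((y - y.mean()) ** 2)
        r2[i] = 1.0 - ss_res / ss_tot
    return r2


n, m = 20, 3
r2 = simulate_r2(n, m)
# Usual adjusted R^2, applied to each simulated value.
r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - m - 1)
print("mean R^2:", r2.mean(), "theory m/(n-1):", m / (n - 1))
print("mean adjusted R^2:", r2_adj.mean())
```

With these settings the simulated mean of \$R^2\$ should sit near \$3/19\$ and the mean adjusted \$R^2\$ near zero, up to simulation noise.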

This appears to be a clever way to "justify" the logic behind the adjusted \$R^2\$: if all regressors are indeed uncorrelated with the dependent variable, then the adjusted \$R^2\$ is zero "on average".
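To spell out that last step, here is the substitution written out, assuming the usual definition of the adjusted \$R^2\$ (the quoted text refers to the formula but does not reproduce it):

\$\$R^2_\mathrm{adjusted} = 1 - (1 - R^2)\,\frac{n-1}{n-m-1}\$\$

Since this is linear in \$R^2\$, taking expectations only requires plugging in \$E(R^2) = m/(n-1)\$:

\$\$E(R^2_\mathrm{adjusted}) = 1 - \left(1 - \frac{m}{n-1}\right)\frac{n-1}{n-m-1} = 1 - \frac{n-m-1}{n-1}\cdot\frac{n-1}{n-m-1} = 0\$\$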
