# Solved – Can Adjusted R squared be equal to 1

I have a dataset with around 15 independent variables. I am using a multiple regression model to fit the dataset. For model selection, I am using a backward elimination procedure based on the p-values. The adjusted $$R^2$$ for the model with all predictors is exactly 1. At this point, I concluded that the model is perhaps also picking up noise. But even after removing 5 predictor variables through the model selection, the adjusted $$R^2$$ is still 1. I am not sure whether this is correct or whether I am just modeling noise. Can someone comment on this?


Dan and Michael point out the relevant issues. Just for completeness, the relationship between adjusted $$R^2$$ and $$R^2$$ is given by (see, e.g., here)

$$R^2_{adjusted}=1-(1-R^2)\frac{n-1}{n-K},$$ (with $$K$$ the number of regressors, including the constant). This shows that $$R^2_{adjusted}=1$$ if $$R^2=1$$, unless (see below) $$K=n$$.
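As a quick numerical check of this relationship (a minimal R sketch; the function name `adj_r2` is mine, not from the thread):

```r
# Adjusted R^2 from R^2, sample size n, and number of regressors K (incl. constant)
adj_r2 <- function(r2, n, K) 1 - (1 - r2) * (n - 1) / (n - K)

adj_r2(1,   n = 15, K = 10)  # R^2 = 1 forces adjusted R^2 = 1 whenever K < n
adj_r2(0.9, n = 15, K = 10)  # otherwise the adjustment penalizes: 0.72
```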

$$R^2=1$$ occurs when all residuals $$\hat u_i=y_i-\hat y_i$$ are zero, as $$R^2=1-\frac{\hat{u}'\hat{u}/n}{\tilde{y}'\tilde{y}/n}.$$ Here, $$\hat u$$ denotes the vector of residuals and $$\tilde y$$ the vector of demeaned observations on the dependent variable.
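One can verify this decomposition by recomputing $$R^2$$ from the residuals by hand (a small R check; the variable names are illustrative):

```r
set.seed(1)
n <- 15
x <- rnorm(n)
y <- rnorm(n)
fit <- lm(y ~ x)

u    <- residuals(fit)  # the residual vector hat u
ytil <- y - mean(y)     # demeaned dependent variable

r2_manual <- 1 - sum(u^2) / sum(ytil^2)
all.equal(r2_manual, summary(fit)$r.squared)  # TRUE
```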

Dan discusses one reason to get an $$R^2$$ of 1. Another is to have as many regressors as observations, i.e., $$K=n$$.

Technically, this is because the $$n\times K$$ regressor matrix $$X$$ then is square. The OLS estimator $$\hat\beta=(X'X)^{-1}X'y$$ can then be written as (assuming no exact multicollinearity) $$\hat\beta=(X'X)^{-1}X'y=X^{-1}{X'}^{-1}X'y=X^{-1}y,$$ so that the fitted values $$\hat y=X\hat\beta$$ are just $$\hat y=XX^{-1}y=y$$, so that all residuals are zero.
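This algebra can be checked directly in R for a small square design matrix (a sketch; the dimensions and seed are arbitrary):

```r
set.seed(2)
n <- 4
# square n x n design: a constant plus n-1 random regressors
X <- cbind(1, matrix(rnorm(n * (n - 1)), ncol = n - 1))
y <- rnorm(n)

beta_ols <- solve(t(X) %*% X, t(X) %*% y)  # (X'X)^{-1} X'y
beta_inv <- solve(X, y)                    # X^{-1} y

all.equal(c(beta_ols), beta_inv)   # the two coincide
all.equal(c(X %*% beta_inv), y)    # fitted values reproduce y, residuals all zero
```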

Here is an illustration using artificial data (code below), in which regressors are generated totally independently of $$y$$, and yet we achieve an $$R^2$$ of 1 once we have as many of them as we have observations.

Code:

```r
n <- 15
regressors <- n-1 # enough, as we'll also fit a constant
y <- rnorm(n)
X <- matrix(rnorm(regressors*n), ncol=regressors)

collectionR2s <- rep(NA, regressors)
for (i in 1:regressors){
  collectionR2s[i] <- summary(lm(y~X[,1:i]))$r.squared
}
plot(1:regressors, collectionR2s, col="purple", pch=19, type="b", lwd=2)
abline(h=1, lty=2)
```

When $$K=n$$, however, `R` correctly does not report an adjusted $$R^2$$:

```r
> summary(lm(y~X))

Call:
lm(formula = y ~ X)

Residuals:
ALL 15 residuals are 0: no residual degrees of freedom!

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.36296         NA      NA       NA
X1          -1.09003         NA      NA       NA
X2           0.39177         NA      NA       NA
X3           0.19273         NA      NA       NA
X4           0.51528         NA      NA       NA
X5          -0.04530         NA      NA       NA
X6          -1.28539         NA      NA       NA
X7          -0.72770         NA      NA       NA
X8          -0.14604         NA      NA       NA
X9           0.34385         NA      NA       NA
X10         -0.93811         NA      NA       NA
X11          2.23064         NA      NA       NA
X12          0.06744         NA      NA       NA
X13          0.21220         NA      NA       NA
X14         -2.29134         NA      NA       NA

Residual standard error: NaN on 0 degrees of freedom
Multiple R-squared:      1, Adjusted R-squared:    NaN
F-statistic:   NaN on 14 and 0 DF,  p-value: NA
```
