Spuriously high R-squared is one of the pitfalls of regression through the origin (i.e. zero-intercept models). If the observed predictor values do not include zero, is such a fit an extrapolation? What are the uses and other pitfalls of regression through the origin? Are there any peer-reviewed articles on the topic?
Best Answer
To me the main issue boils down to imposing a strong constraint on an unknown process.
Consider a specification $y_t=f(x_t)+\varepsilon_t$. If you don't know the exact form of the function $f(\cdot)$, you could try a linear approximation: $$f(x_t)\approx a+b x_t$$
Notice how this linear approximation is actually the first-order Maclaurin (Taylor) series of the function $f(\cdot)$ around $x_t=0$: $$a=f(0), \qquad b=\left.\frac{\partial f(z)}{\partial z}\right|_{z=0}$$
Hence, when you regress through the origin, you are, from the Maclaurin-series point of view, asserting that $a=f(0)=0$. This is a very strong constraint on the model.
There are situations where imposing such a constraint makes sense, and these are driven by theory or outside knowledge. I would argue that unless you have a reason to believe that $f(0)=0$, it is not a good idea to regress through the origin. As with any constraint that does not actually hold, it will lead to suboptimal (biased) parameter estimates.
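To see the pitfall from the question in action, here is a minimal sketch (not part of the original answer; it assumes only NumPy and made-up data) that simulates a process with a nonzero intercept and predictor values far from zero, then compares the fit with an intercept to the fit through the origin. The "spuriously high" R-squared of the zero-intercept fit comes from the usual convention of computing it against the uncentred total sum of squares.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(5.0, 10.0, size=200)            # predictor never near zero
y = 3.0 + 0.5 * x + rng.normal(0.0, 0.5, 200)   # true intercept 3, slope 0.5 (illustrative values)

# Fit with intercept: design matrix [1, x]
X1 = np.column_stack([np.ones_like(x), x])
b1, *_ = np.linalg.lstsq(X1, y, rcond=None)
resid1 = y - X1 @ b1
r2_with = 1 - resid1 @ resid1 / ((y - y.mean()) @ (y - y.mean()))

# Fit through the origin: design matrix [x] only
b0, *_ = np.linalg.lstsq(x[:, None], y, rcond=None)
resid0 = y - x * b0[0]
# R^2 for a zero-intercept model is conventionally computed against the
# UNCENTRED total sum of squares, which is what inflates it.
r2_origin = 1 - resid0 @ resid0 / (y @ y)

print(f"with intercept:  slope={b1[1]:.3f}, R^2={r2_with:.3f}")
print(f"through origin:  slope={b0[0]:.3f}, R^2={r2_origin:.3f}")
```

On data like this the zero-intercept slope is biased upward (it has to absorb the omitted intercept) while its reported R-squared is higher than that of the correct model, which is exactly the pitfall the question describes.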
EXAMPLE: CAPM in finance. Here we state that the excess return $r-r_f$ on a stock is determined by its beta on the excess market return $r_m-r_f$: $$r-r_f=\beta (r_m-r_f)$$
The theory tells us that the regression should be through the origin. Now, some practitioners believe that they can earn an additional return, alpha, on top of the CAPM relationship: $$r-r_f=\alpha+\beta (r_m-r_f)$$
Both regressions are used in academic research and in practice, for different reasons. This example shows when imposing a strong constraint such as regression through the origin can make sense.
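For concreteness, here is a small sketch of the two CAPM regressions on simulated excess returns (the data and parameter values are invented for illustration and are not from the original answer):

```python
import numpy as np

rng = np.random.default_rng(1)
mkt_excess = rng.normal(0.005, 0.04, size=500)                # r_m - r_f
stock_excess = 0.9 * mkt_excess + rng.normal(0, 0.02, 500)    # true alpha = 0, beta = 0.9

# Theory-driven regression through the origin: estimate beta only
beta_only, *_ = np.linalg.lstsq(mkt_excess[:, None], stock_excess, rcond=None)

# Unconstrained regression: estimate alpha and beta
X = np.column_stack([np.ones_like(mkt_excess), mkt_excess])
alpha_beta, *_ = np.linalg.lstsq(X, stock_excess, rcond=None)

print(f"beta (no intercept): {beta_only[0]:.3f}")
print(f"alpha, beta:         {alpha_beta[0]:.4f}, {alpha_beta[1]:.3f}")
```

When the theory holds (alpha truly zero), the constrained fit uses the data more efficiently; when it does not, the unconstrained fit reveals the alpha that the zero-intercept model would otherwise fold into a distorted beta.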