# What are the uses and pitfalls of regression through the origin?

Spuriously high R-squared is one of the pitfalls of regression through the origin (i.e. zero-intercept models). If the predictors do not contain zeroes, then is it an extrapolation? What are the uses and other pitfalls of regression through the origin? Are there any peer-reviewed articles?


To me, the main issue boils down to imposing a strong constraint on an unknown process.

Consider the specification $y_t = f(x_t) + \varepsilon_t$. If you don't know the exact form of the function $f(\cdot)$, you could try a linear approximation: $$f(x_t) \approx a + b x_t$$

Notice how this linear approximation is just the first-order Maclaurin (Taylor) series of $f(\cdot)$ around $x_t = 0$, with $$a = f(0), \qquad b = \left.\frac{\partial f(x)}{\partial x}\right|_{x=0}$$
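To make the Maclaurin view concrete, here is a minimal numeric sketch. The choice $f(x) = e^x$ is purely hypothetical; for it, $a = f(0) = 1$ and $b = f'(0) = 1$, so the linear approximation is $1 + x$:

```python
import math

# First-order Maclaurin (Taylor at 0) approximation: f(x) ≈ f(0) + f'(0) * x.
# Hypothetical choice f(x) = exp(x): then a = f(0) = 1 and b = f'(0) = 1.
def f(x):
    return math.exp(x)

a = f(0.0)   # intercept of the linear approximation
b = 1.0      # derivative of exp at 0

for x in (0.05, 0.1, 0.2):
    print(f"x={x}: f(x)={f(x):.4f}, linear approx={a + b * x:.4f}")
```

Near $x = 0$ the two agree closely; forcing $a = 0$ instead would make the approximation wrong everywhere unless $f(0)$ really is zero.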

Hence, when you regress through the origin, from the Maclaurin-series point of view you are asserting that $f(0) = 0$. This is a very strong constraint to place on a model.

There are situations where imposing such a constraint makes sense, and these are driven by theory or outside knowledge. I would argue that unless you have a reason to believe that $f(0) = 0$, it is not a good idea to regress through the origin. As with any constraint that does not actually hold, it will lead to biased parameter estimates.
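A small simulation illustrates both pitfalls at once: the slope bias when the true intercept is nonzero, and the spuriously high $R^2$ that the through-origin convention (uncentered total sum of squares) produces. All numbers below (true intercept 2, slope 3, seed, sample size) are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1.0, 5.0, n)                 # predictors away from zero
y = 2.0 + 3.0 * x + rng.normal(0, 1.0, n)    # true intercept is 2, not 0

# With intercept: design matrix [1, x]
X1 = np.column_stack([np.ones(n), x])
beta1, *_ = np.linalg.lstsq(X1, y, rcond=None)

# Through the origin: design matrix [x] only
beta0, *_ = np.linalg.lstsq(x[:, None], y, rcond=None)

# R^2 with an intercept uses the centered total sum of squares;
# the through-origin convention uses the uncentered sum, which inflates R^2.
resid1 = y - X1 @ beta1
r2_centered = 1 - (resid1 @ resid1) / ((y - y.mean()) @ (y - y.mean()))
resid0 = y - x * beta0[0]
r2_uncentered = 1 - (resid0 @ resid0) / (y @ y)

print("slope with intercept:", beta1[1])   # near the true slope of 3
print("slope through origin:", beta0[0])   # biased upward
print("R^2 (centered, with intercept):", r2_centered)
print("R^2 (uncentered, through origin):", r2_uncentered)
```

Even though the through-origin fit is worse, its conventionally reported $R^2$ comes out higher, which is exactly the spurious behavior the question mentions.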

EXAMPLE: the CAPM in finance. Here we state that the excess return $r - r_f$ on a stock is determined by its beta on the excess market return $r_m - r_f$: $$r - r_f = \beta\,(r_m - r_f)$$

The theory tells us that the regression should go through the origin. Now, some practitioners believe they can earn an additional return, alpha, on top of the CAPM relationship: $$r - r_f = \alpha + \beta\,(r_m - r_f)$$

Both regressions are used in academic research and in practice, for different reasons. This example shows when imposing a strong constraint such as regression through the origin can make sense.
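The two CAPM regressions can be sketched as follows. The return series are simulated, and the true alpha and beta (0.0002 and 1.2), the noise levels, and the sample size are all hypothetical values chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 250                                       # roughly one year of daily returns
mkt_excess = rng.normal(0.0004, 0.01, n)      # simulated r_m - r_f
true_alpha, true_beta = 0.0002, 1.2           # assumed values for illustration
stock_excess = true_alpha + true_beta * mkt_excess + rng.normal(0, 0.008, n)

# Strict CAPM: regression through the origin, beta = Σxy / Σx²
beta_capm = (mkt_excess @ stock_excess) / (mkt_excess @ mkt_excess)

# Practitioner version: allow an alpha (intercept)
X = np.column_stack([np.ones(n), mkt_excess])
alpha_hat, beta_hat = np.linalg.lstsq(X, stock_excess, rcond=None)[0]

print(f"beta (through origin): {beta_capm:.3f}")
print(f"alpha, beta (with intercept): {alpha_hat:.5f}, {beta_hat:.3f}")
```

Which fit is appropriate depends on the purpose: testing the strict theory argues for the constrained regression, while hunting for alpha requires the intercept.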
