# Solved – Do interaction terms in a linear regression model increase its predictive power?

Interaction terms are sometimes added to linear regression models when the effect of one variable depends on the value of another variable. But will the inclusion of such interaction terms increase the model's predictive power? Or is the only effect to yield a model that can be better interpreted?

Or put another way, if I only care about the model's performance, and otherwise treat it as a black box, do I need to think about interaction terms?


For linear regression, we have the hypothesis space $$\mathcal{F}_1$$, the set of all functions whose output is a linear combination of the input variables. Including interaction terms gives the hypothesis space $$\mathcal{F}_2$$, the set of all functions whose output is a linear combination of the input variables and their interaction terms. Note that $$\mathcal{F}_1$$ is a subset of $$\mathcal{F}_2$$. That is, every function in $$\mathcal{F}_1$$ is also in $$\mathcal{F}_2$$ (because we can always set the coefficients for the interaction terms to zero), but $$\mathcal{F}_2$$ contains functions that are not in $$\mathcal{F}_1$$. This means that including interaction terms gives us the possibility of fitting a wider variety of functions. In particular, $$\mathcal{F}_2$$ contains functions that are nonlinear with respect to the original input variables.
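The nesting of the two hypothesis spaces is easy to see in terms of design matrices. The following sketch (using NumPy on a small synthetic two-feature dataset, chosen here purely for illustration) builds the design matrix for plain linear regression and the enlarged one with an interaction column appended; any model in the first space is a model in the second with the interaction coefficient set to zero.

```python
import numpy as np

# Synthetic 2-feature dataset (random values, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))

# Design matrix for F1: intercept, x1, x2
F1 = np.column_stack([np.ones(len(X)), X])

# Design matrix for F2: F1's columns plus the interaction term x1 * x2.
# Setting the last coefficient to zero recovers any F1 model, so F1 ⊂ F2.
F2 = np.column_stack([F1, X[:, 0] * X[:, 1]])

print(F1.shape, F2.shape)  # → (100, 3) (100, 4)
```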
When fit to a particular dataset, a model that includes interaction terms must fit the training data at least as well as a model that does not. This follows from the fact that, if the model with lowest error on the training data is in $$\mathcal{F}_1$$, it's also in $$\mathcal{F}_2$$, as above. But whether this yields an increase in predictive power on unseen data depends on the problem. If the true relationship involves interactions, including interaction terms can improve predictions; if it does not, the extra flexibility can lead to overfitting, and predictive power may decrease.
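The guarantee on training error can be checked numerically. The sketch below (a least-squares fit via `numpy.linalg.lstsq` on a synthetic target that deliberately includes an interaction effect; all names and constants are illustrative assumptions) compares the training MSE of the two models: the model with the interaction column never fits the training data worse.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
# Synthetic target with a genuine interaction effect (for illustration)
y = 1.0 + 2.0 * X[:, 0] - X[:, 1] + 3.0 * X[:, 0] * X[:, 1] \
    + rng.normal(scale=0.5, size=200)

def train_mse(design, y):
    # Least-squares fit, then mean squared error on the training data
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return float(np.mean(residuals ** 2))

F1 = np.column_stack([np.ones(len(X)), X])        # without interactions
F2 = np.column_stack([F1, X[:, 0] * X[:, 1]])     # with the interaction term

mse1 = train_mse(F1, y)
mse2 = train_mse(F2, y)
print(mse1, mse2)
assert mse2 <= mse1 + 1e-12  # richer hypothesis space never fits worse
```

Note that this comparison is on the training data only; whether the interaction model also predicts better on held-out data is exactly the question the paragraph above leaves open.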