Is Gaussian Process Regression a linear model?

I had a discussion today with someone saying that Gaussian Processes are linear models. I don't see in which sense this may be correct. To be clear, here the definition of a linear model is the usual one, i.e., a model which is linear in the parameters. Thus,

$y=\beta_0+\beta_1x+\beta_2\sin(x)+\epsilon$

and

$y=\beta_0+\boldsymbol{\beta}^T\cdot\mathbf{x}+\epsilon$

are linear, and

$y=\beta_0+\beta_1x+\beta_2\exp(\beta_3x)+\epsilon$

is not.
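As a small illustration of "linear in the parameters", here is a minimal sketch that fits the first model by ordinary least squares. The coefficient values, noise level and grid of $x$ values are made up for the example; the point is only that the $\sin(x)$ term becomes one more column of the design matrix, so the fit is still a linear least-squares problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs and "true" coefficients, chosen only for illustration.
x = np.linspace(0.0, 5.0, 50)
beta_true = np.array([1.0, 0.5, -2.0])  # beta_0, beta_1, beta_2

# Design matrix with columns 1, x, sin(x): nonlinear in x, linear in beta.
Phi = np.column_stack([np.ones_like(x), x, np.sin(x)])
y = Phi @ beta_true + 0.1 * rng.standard_normal(x.size)

# Ordinary least squares recovers beta from a linear system.
beta_hat, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(beta_hat)
```

The third model cannot be written this way, because $\beta_3$ sits inside the exponential and no fixed design matrix can absorb it.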

For simplicity, let's consider a Squared Exponential covariance function, and assume that the correlation length, the signal variance and the noise variance are known. Given a design matrix $X$ and corresponding response vector $\mathbf{y}$, the GP prediction at a new prediction point $\mathbf{x}^*$ is

$$\hat{y}(\mathbf{x}^*)=\mathbf{k}_*^T(K+\sigma I)^{-1}\mathbf{y}$$

Now, this estimator is clearly a nonlinear function of $X$ and a linear function of $\mathbf{y}$. The other person insisted that $\mathbf{y}$ is the parameter vector of this model, and thus the model is linear. I don't think this makes any sense: it would mean that the number of parameters of the model depends on the sample size. I think we can at most say that the estimator is a linear function of $\mathbf{y}$, but surely not that the statistical model underlying Gaussian Process Regression is linear. Do you agree?
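To make the two claims concrete, here is a minimal NumPy sketch of the predictor above. The kernel hyperparameters, the noise variance, the toy data and the helper names (`se_kernel`, `smoother_weights`, `gp_mean`) are all assumptions made for the example, not part of any library. It checks that the prediction is a fixed weighted sum of the entries of $\mathbf{y}$, and that the predicted value is not a linear function of the input location.

```python
import numpy as np

def se_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared Exponential covariance between the rows of A and the rows of B."""
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

def smoother_weights(X, x_star, noise_var=0.1):
    """Weights w = (K + noise_var * I)^{-1} k_*; they depend on X and x_star only."""
    K = se_kernel(X, X)
    k_star = se_kernel(X, x_star[None, :])[:, 0]
    return np.linalg.solve(K + noise_var * np.eye(len(X)), k_star)

def gp_mean(X, y, x_star, noise_var=0.1):
    """GP predictive mean k_*^T (K + noise_var * I)^{-1} y."""
    return smoother_weights(X, x_star, noise_var) @ y

# Hypothetical toy data.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(20, 1))
y1 = np.sin(X[:, 0])
y2 = X[:, 0] ** 2
x_star = np.array([0.5])

# Linear in y: superposition holds for any two response vectors.
lhs = gp_mean(X, 2.0 * y1 + 3.0 * y2, x_star)
rhs = 2.0 * gp_mean(X, y1, x_star) + 3.0 * gp_mean(X, y2, x_star)
linear_in_y = bool(np.isclose(lhs, rhs))

# Not linear in the input: the prediction at the midpoint of two inputs
# differs from the average of the two predictions.
f0, f1, f2 = (gp_mean(X, y1, np.array([t])) for t in (0.0, 1.0, 2.0))
linear_in_x = bool(np.isclose(f1, 0.5 * (f0 + f2)))

print(linear_in_y, linear_in_x)
```

The weights returned by `smoother_weights` never look at $\mathbf{y}$, which is exactly why the estimator is linear in $\mathbf{y}$ while remaining nonlinear in $X$ and $\mathbf{x}^*$.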

I think the technically correct term to use here is that GP regression is a linear smoother, i.e., its predictions are a linearly weighted combination of the observed outputs. This does not make the model itself linear. For that to be true, the predictions would have to be a linear function of the inputs, and with GPs this is only the case if you use a linear covariance function.
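The linear covariance case can be checked numerically. The sketch below assumes the homogeneous linear kernel $k(\mathbf{x},\mathbf{x}')=\mathbf{x}^T\mathbf{x}'$ (no bias term), a made-up noise variance and toy data; `gp_mean_linear_kernel` is a hypothetical helper name. With this kernel the GP mean coincides with a ridge-regression prediction $\mathbf{x}^{*T}\mathbf{w}$, so it is linear in the input.

```python
import numpy as np

def gp_mean_linear_kernel(X, y, x_star, noise_var=0.1):
    """GP predictive mean with the linear kernel k(x, x') = x^T x'."""
    K = X @ X.T          # Gram matrix of the training inputs
    k_star = X @ x_star  # covariances between training inputs and x_star
    return k_star @ np.linalg.solve(K + noise_var * np.eye(len(X)), y)

# Hypothetical toy data with two input dimensions.
rng = np.random.default_rng(1)
noise_var = 0.1
X = rng.standard_normal((15, 2))
y = X @ np.array([1.5, -0.7]) + 0.1 * rng.standard_normal(15)

# Equivalent ridge-regression weights: (X^T X + noise_var * I)^{-1} X^T y.
w_ridge = np.linalg.solve(X.T @ X + noise_var * np.eye(2), X.T @ y)

x_a = np.array([0.3, -1.2])
x_b = np.array([2.0, 0.4])

# The GP mean equals x_star^T w_ridge ...
same_as_ridge = bool(np.isclose(
    gp_mean_linear_kernel(X, y, x_a, noise_var), x_a @ w_ridge))

# ... and is therefore linear in the input location.
combo = gp_mean_linear_kernel(X, y, 2.0 * x_a - 3.0 * x_b, noise_var)
parts = (2.0 * gp_mean_linear_kernel(X, y, x_a, noise_var)
         - 3.0 * gp_mean_linear_kernel(X, y, x_b, noise_var))
linear_in_input = bool(np.isclose(combo, parts))

print(same_as_ridge, linear_in_input)
```

With any other covariance function, such as the Squared Exponential, the prediction is still a weighted sum of the observed outputs, but the weights depend nonlinearly on the input location.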
