Is $\|Y - X\beta\|_2^2 + \lambda \beta^T K \beta$ the standard loss function in kernel ridge regression, or is it different? Also, is the Gaussian kernel a standard choice in practice? If not, which kernels are used more often? Finally, is $\lambda$ the only parameter to be tuned via cross-validation, or is a kernel parameter such as $\sigma$ in a Gaussian kernel also tuned via cross-validation in practice? Please confirm or correct my understanding of kernel ridge regression!

#### Best Answer

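The standard loss function for kernel ridge regression is $\|Y - K\beta\|_2^2 + \lambda \beta^T K \beta$, where $K$ is the kernel matrix with entries $K_{ij} = k(x_i, x_j)$. The expression in your question is missing the kernel matrix $K$ in the $L_2$ error term: it has $X\beta$ where it should have $K\beta$.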
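To make the loss concrete, here is a minimal NumPy sketch. Setting the gradient of the loss to zero gives $K\big((K + \lambda I)\beta - Y\big) = 0$, which is satisfied by $\beta = (K + \lambda I)^{-1} Y$. The toy data, the Gaussian kernel, and the helper names `gaussian_kernel`, `krr_loss`, and `krr_fit` are assumptions made for illustration, not part of the question.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))

def krr_loss(beta, K, Y, lam):
    """Loss from the answer: ||Y - K beta||^2 + lam * beta^T K beta."""
    resid = Y - K @ beta
    return resid @ resid + lam * beta @ K @ beta

def krr_fit(K, Y, lam):
    """Solve (K + lam * I) beta = Y, a stationary point of krr_loss."""
    n = K.shape[0]
    return np.linalg.solve(K + lam * np.eye(n), Y)

# Hypothetical toy data: a noisy sine curve.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(40, 1))
Y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=40)

lam = 0.1
K = gaussian_kernel(X, X, sigma=1.0)
beta = krr_fit(K, Y, lam)

# Gradient of the loss with respect to beta; it should vanish at the solution.
grad = -2.0 * K @ (Y - K @ beta) + 2.0 * lam * K @ beta
print("max |gradient| at the solution:", np.abs(grad).max())
```

Because the loss is convex in $\beta$ when $K$ is positive semidefinite, a point where the gradient vanishes is a global minimizer.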

In practice, the Gaussian (a.k.a. RBF) and polynomial kernels are popular choices and could be a good starting point. However, the choice of kernel generally depends on the problem at hand. Sometimes it may be helpful to think of the kernel as a similarity metric for the input data vectors. You may need to experiment with different kernels to make an appropriate choice for the specific dataset.
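As a rough illustration of the similarity view, the sketch below evaluates a Gaussian and a polynomial kernel on a few hand-picked vectors. The parameter names `sigma`, `degree`, and `c`, and the example points, are assumptions chosen for the example.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """exp(-||x - y||^2 / (2 sigma^2)): equals 1 for identical inputs, decays with distance."""
    return float(np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma**2)))

def polynomial_kernel(x, y, degree=2, c=1.0):
    """(x . y + c)^degree: grows with the inner product of the inputs."""
    return float((x @ y + c) ** degree)

a = np.array([1.0, 0.0])
near = np.array([1.1, 0.0])   # close to a
far = np.array([4.0, 0.0])    # far from a

for name, other in [("a", a), ("near", near), ("far", far)]:
    print(name, gaussian_kernel(a, other), polynomial_kernel(a, other))
```

The Gaussian kernel depends only on the distance between the two points, while the polynomial kernel depends on their inner product, so the two encode different notions of similarity.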

Yes, in addition to $\lambda$, you will need to determine the kernel parameters (such as $\sigma$ for the Gaussian kernel) through cross-validation.
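One way to tune both parameters is a joint grid search with k-fold cross-validation, sketched below in plain NumPy. The grids, the fold count, the toy data, and the helper `cv_mse` are assumptions for illustration. Predictions at new points use the usual kernel expansion $f(x) = \sum_i \beta_i k(x, x_i)$.

```python
import itertools
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def cv_mse(X, Y, lam, sigma, n_folds=5, seed=0):
    """Mean validation MSE of kernel ridge regression over n_folds folds."""
    idx = np.random.default_rng(seed).permutation(len(Y))
    folds = np.array_split(idx, n_folds)
    errors = []
    for k in range(n_folds):
        val = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Fit on the training folds: (K + lam * I) beta = Y.
        K_train = gaussian_kernel(X[train], X[train], sigma)
        beta = np.linalg.solve(K_train + lam * np.eye(len(train)), Y[train])
        # Predict on the held-out fold with the cross-kernel matrix.
        pred = gaussian_kernel(X[val], X[train], sigma) @ beta
        errors.append(np.mean((Y[val] - pred) ** 2))
    return float(np.mean(errors))

# Hypothetical toy data: a noisy sine curve.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(80, 1))
Y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=80)

# Assumed search grids; in practice they are usually log-spaced and problem-specific.
lambdas = [1e-3, 1e-2, 1e-1, 1.0]
sigmas = [0.1, 0.5, 1.0, 2.0]

scores = {
    (lam, sigma): cv_mse(X, Y, lam, sigma)
    for lam, sigma in itertools.product(lambdas, sigmas)
}
best_lam, best_sigma = min(scores, key=scores.get)
print("best lambda:", best_lam, "best sigma:", best_sigma)
```

The two parameters are searched jointly because they interact: a narrower kernel generally calls for a different amount of regularization than a wide one. If scikit-learn is available, `GridSearchCV` over `KernelRidge` does the same job, with `alpha` playing the role of $\lambda$ and `gamma` corresponding to $1/(2\sigma^2)$ for the RBF kernel.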