The VC dimension of SVM using the Gaussian kernel $K(x,x') = e^{-gamma |x-x'|_2^2}$ is infinite, but I have never seen a proof or a detailed explanation of this fact.
How do we explicitly choose points $x_1,cdots, x_n$ in, say, $mathbb R^2$ that can be scattered by SVM hypotheses using the Gaussian Kernel?
Intuitively, just from looking at pictures, the SVM with Gaussian kernel lets us place any number of bounded patches on the plane. Then we can choose points sufficiently far apart (depending on the parameter $gamma$, which also determines the size of the patches) and place a patch around precisely those points which we wish to classify positively. How do I make that rigorous?
Best Answer
The decision function of an SVM is $$ f(x) = mbox{sgn} left ( sum_i y_ialpha_i K(x,x_i) +b right ) $$
For a data point in your training set, $x_j$, this will look as follows: $$ f(x_j) = mbox{sgn} left ( sum_{i,ineq j} y_ialpha_i K(x_j,x_i) + y_jalpha_j K(x_j,x_j) +b right ) $$
Now, observe that
- $lim_{gamma rightarrow infty} K(x_j,x_j)=1$
- $lim_{gamma rightarrow infty} K(x_i,x_j)=0$ for all $x_ineq x_j$
Therefore, for sufficiently large $gamma$, the decision for a point in the data set will be made based on its own label only, as the summation over the rest of the points will become negligible.
The data layout does not really matter. It will just require larger values of $gamma$ if some points are very close to each other.
Similar Posts:
- Solved – Plotting the decision boundary of a kernel SVM (RBF)
- Solved – Is this a valid kernel
- Solved – How does the variance-covariance matrix change when I create a linear combination of two variables?
- Solved – Prove that this kernel is a valid kernel
- Solved – What are the various basic kernels available