In a previous thread, Computing the decision boundary of a linear SVM model, the following R code was given as a way to compute the formula of the hyperplane and its margins, given an input set of hyperparameters:
library(kernlab)
set.seed(101)
x <- rbind(matrix(rnorm(120),,2), matrix(rnorm(120,mean=3),,2))
y <- matrix(c(rep(1,60), rep(-1,60)))
svp <- ksvm(x, y, type="C-svc")
plot(svp, data=x)
alpha(svp)  # support vectors whose indices may be found with alphaindex(svp)
b(svp)      # (negative) intercept
plot(scale(x), col=y+2, pch=y+2, xlab="", ylab="")
w <- colSums(coef(svp)[[1]] * x[unlist(alphaindex(svp)),])
b <- b(svp)
abline(b/w[1], -w[2]/w[1])
abline((b+1)/w[1], -w[2]/w[1], lty=2)
abline((b-1)/w[1], -w[2]/w[1], lty=2)
However, in the resulting plot, none of the support vectors actually fall on either margin line.
If

abline(b/w[1], -w[2]/w[1])             # is supposed to be the maximum-margin hyperplane
abline((b+1)/w[1], -w[2]/w[1], lty=2)  # is supposed to be the yi = +1 margin
abline((b-1)/w[1], -w[2]/w[1], lty=2)  # is supposed to be the yi = -1 margin

then why don't the support vectors, alpha(svp), fall on either the +1 or the -1 margin?
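In other words, writing the fitted weight vector and intercept in the standard way (this is my reading of what the code above is meant to plot; recall that b(svp) returns the negative of the intercept),

$$w = \sum_i \alpha_i y_i x_i, \qquad w \cdot x + b = 0 \;\;\text{(decision boundary)}, \qquad w \cdot x + b = \pm 1 \;\;\text{(margins)},$$

so a support vector $x_i$ that sits exactly on its margin should satisfy $y_i\,(w \cdot x_i + b) = 1$.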
Now before you say it's because kernelf = "rbfkernel" by default and our hyperplane and margins have not been transformed according to the RBF kernel, I can explicitly call the linear kernel with kernelf = "vanilladot" (i.e. svp <- ksvm(x,y,type="C-svc",kernelf="vanilladot")), and the hyperplane and margins are even farther from the support vectors defined by alpha(svp) than when kernelf = "rbfkernel".
Note that I find similar results when I use the e1071 package (which is perhaps not surprising, as they are both based on the LIBSVM library).
If I am doing something wrong, or missing a crucial point (like perhaps ksvm() uses a soft margin by default and is hiding the slack variables from me), please let me know!
Here is my code for the linear kernel with the kernlab package:
library(kernlab)
set.seed(101)
x <- rbind(matrix(rnorm(120),,2), matrix(rnorm(120,mean=3),,2))
y <- matrix(c(rep(1,60), rep(-1,60)))
svp <- ksvm(x, y, type="C-svc", kernelf="vanilladot")
plot(svp, data=x)
plot(x, col=y+2, pch=y+2, xlab="", ylab="")
w <- colSums(coef(svp)[[1]] * x[unlist(alphaindex(svp)),])
b <- b(svp)
plot(svp, data=x)
abline(b/w[1], -w[2]/w[1])
abline((b+1)/w[1], -w[2]/w[1], lty=2)
abline((b-1)/w[1], -w[2]/w[1], lty=2)
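To make the discrepancy concrete, here is a rough numerical check I would expect to work (a sketch only: it relies on kernlab's predict(..., type = "decision") and passes the linear kernel through ksvm()'s kernel= argument). It prints y_i * f(x_i) for each support vector; a support vector lying exactly on its margin would give 1:

library(kernlab)
set.seed(101)
x <- rbind(matrix(rnorm(120),,2), matrix(rnorm(120,mean=3),,2))
y <- matrix(c(rep(1,60), rep(-1,60)))

# linear kernel passed via ksvm()'s 'kernel' argument
svp <- ksvm(x, y, type="C-svc", kernel="vanilladot")

sv <- unlist(alphaindex(svp))                                  # support-vector indices
f  <- as.vector(predict(svp, x[sv, , drop=FALSE], type="decision"))
round(cbind(index=sv, y=y[sv], y_times_f=y[sv]*f), 3)
# y_times_f == 1 would mean the support vector sits exactly on its margin;
# values below 1 mean the point lies inside (or beyond) the margin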
Best Answer
I am not an R user, but I suspect it is because you are using the soft-margin support vector machine (which is what I presume "C-svc" means). The support vectors will only lie exactly on the margins for the hard margin SVM (where C is infinite). Essentially the C parameter penalises the degree to which the support vectors are allowed to violate the margin constraint, so if C is less than infinity, the support vectors are allowed to drift away from the margins in the interests of making the margin broader, which often leads to better generalisation.
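A rough way to see this numerically in kernlab (a sketch under stated assumptions: margin_check below is just an illustrative helper I am introducing, C = 1e6 is an arbitrary large value used to approximate the hard margin, and the decision values come from predict(..., type = "decision")) is to count how many support vectors violate the margin, i.e. have y_i f(x_i) < 1, at the default C and at a very large C:

library(kernlab)
set.seed(101)
x <- rbind(matrix(rnorm(120),,2), matrix(rnorm(120,mean=3),,2))
y <- matrix(c(rep(1,60), rep(-1,60)))

# illustrative helper (not part of kernlab): fit a linear C-svc and count
# support vectors that violate the margin at a given C
margin_check <- function(C) {
  svp <- ksvm(x, y, type="C-svc", kernel="vanilladot", C=C)
  sv  <- unlist(alphaindex(svp))
  f   <- as.vector(predict(svp, x[sv, , drop=FALSE], type="decision"))
  # y_i * f(x_i) = 1 exactly on the margin; < 1 means the margin is violated
  c(C=C, n_sv=length(sv), n_violating=sum(y[sv] * f < 1 - 1e-6))
}

rbind(margin_check(1), margin_check(1e6))
# With the default C = 1 many support vectors sit inside the margin; with a very
# large C (approaching the hard-margin SVM) only points that cannot be separated
# at all, if any, remain in violation.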