Solved – Where does the definition of the hyperplane in a simple SVM come from

I'm trying to figure out support vector machines using this resource. On page 2 it is stated that for linearly separable data the SVM problem is to select a hyperplane such that $\vec{x}_i \cdot \vec{w} + b \geq 1$ for $y_i = 1$ and $\vec{x}_i \cdot \vec{w} + b \leq -1$ for $y_i = -1$. I'm having trouble understanding where the right-hand sides of these constraints come from.

P.S. The next question would be how to show that the SVM's margin is equal to $\frac{1}{\|\vec{w}\|}$.

Essentially, these two constraints require the training data to be correctly classified, and to lie at least a certain distance from the decision threshold $0$. The hyperplane that fulfils these constraints with the smallest norm of the weights will have the maximal margin. The value $\pm 1$ is essentially arbitrary: you could replace it with $\pm c$ for any $c > 0$ and it would merely rescale the coefficients of the hyperplane, without changing the decision boundary. A value of 1 is used just to keep the maths neat.
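A quick numerical sketch of that last point, using made-up weights and points (the values of `w`, `b`, and `X` below are arbitrary, purely for illustration): scaling both $\vec{w}$ and $b$ by a constant $c$ leaves every prediction unchanged, while the quantity $\frac{1}{\|\vec{w}\|}$ shrinks by the same factor $c$.

```python
import numpy as np

# Hypothetical separating hyperplane for 2-D data (illustrative values only).
w = np.array([2.0, -1.0])
b = 0.5

# A few sample points on either side of the decision boundary.
X = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [3.0, 1.0]])

# Decision rule: sign(x . w + b) gives the predicted class (+1 or -1).
pred = np.sign(X @ w + b)

# Rescale the constraint value from +/-1 to +/-c by scaling (w, b) by c.
c = 5.0
pred_scaled = np.sign(X @ (c * w) + c * b)

# The predictions (and hence the decision boundary) are unchanged...
assert np.array_equal(pred, pred_scaled)

# ...but 1/||w||, the margin under this scaling convention, shrinks by c.
print(1 / np.linalg.norm(w))      # margin for the original scaling
print(1 / np.linalg.norm(c * w))  # exactly c times smaller
```

So the choice of $\pm 1$ just fixes a scale for $(\vec{w}, b)$; any other constant would give the same classifier with rescaled coefficients.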
