I understand the derivation behind support vector machines, but I have a doubt about the constraint equation.

Why do we have the constraint $\geq 1$ if $y_i = 1$ and $\leq -1$ if $y_i = -1$?

Can we have any arbitrary constant instead of 1? If not, what is the rationale behind this particular value?

Any help is highly appreciated.

#### Best Answer

Yes, you can have any arbitrary, strictly positive constant instead of 1.

Why? First some background.

## Math and separating hyperplane:

Support vector machines attempt to find a separating hyperplane between sets $X$ and $Y$. Mathematically, for points $\boldsymbol{x}_i \in X$ and $\boldsymbol{y}_i \in Y$, the condition for a separating hyperplane is:

$$ \boldsymbol{w} \cdot \boldsymbol{x}_i - b < 0 \quad \quad \boldsymbol{w} \cdot \boldsymbol{y}_i - b > 0 $$

Observe that the inequalities are strict!

## Numerical issues and practical solution:

Numerically, this formulation has practical problems. Optimization routines generally work with non-strict inequalities, but if the inequalities aren't strict, $\boldsymbol{w} = \boldsymbol{0}, b = 0$ is a trivial solution. Numerical optimization routines may give bizarre answers to this problem; standard floating point math isn't infinitely precise, etc.

What to do? Let's replace the strict inequalities with non-strict inequalities plus some separation constant $t>0$:

$$ \boldsymbol{w} \cdot \boldsymbol{x}_i - b \leq -t \quad \quad \boldsymbol{w} \cdot \boldsymbol{y}_i - b \geq t $$

Yay! Numerical optimization can handle this. Also observe that since $\boldsymbol{w}$ and $b$ are choice variables, the scale of $t$ really doesn't matter: if $(\boldsymbol{w}, b)$ satisfies the constraints for some $t$, then $(\boldsymbol{w}/t, b/t)$ describes the same hyperplane and satisfies them with constant 1. It's totally arbitrary. So we can just make it simple for ourselves and choose 1. (You could even choose different positive values for both inequalities; it doesn't matter.)

$$ \boldsymbol{w} \cdot \boldsymbol{x}_i - b \leq -1 \quad \quad \boldsymbol{w} \cdot \boldsymbol{y}_i - b \geq 1 $$
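If it helps to see the rescaling argument numerically, here is a minimal sketch in plain Python. The toy points, the hand-picked hyperplane, and the helper names `dot`, `satisfies` and `rescale` are all made up for illustration; nothing here comes from a particular SVM library.

```python
# Toy 2-D data, invented for this example: X is one class, Y the other.
X = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0)]
Y = [(3.0, 3.0), (4.0, 2.5), (2.5, 4.0)]


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def satisfies(w, b, t, tol=1e-12):
    """Check w.x - b <= -t for every x in X and w.y - b >= t for every y in Y."""
    return (all(dot(w, x) - b <= -t + tol for x in X)
            and all(dot(w, y) - b >= t - tol for y in Y))


def rescale(w, b, t):
    """Turn a solution for separation constant t into one for constant 1."""
    return tuple(wi / t for wi in w), b / t


# A hand-picked (w, b) that meets the constraints with t = 5.
w5, b5 = (2.5, 2.5), 8.75
print(satisfies(w5, b5, 5.0))

# Dividing by t gives a solution for t = 1; the hyperplane w.x = b is unchanged.
w1, b1 = rescale(w5, b5, 5.0)
print(w1, b1, satisfies(w1, b1, 1.0))

# With t = 0 (non-strict, no separation constant) the trivial w = 0, b = 0 passes,
# which is exactly the degenerate solution a positive t rules out.
print(satisfies((0.0, 0.0), 0.0, 0.0), satisfies((0.0, 0.0), 0.0, 1.0))
```

Both `(w5, b5)` and `(w1, b1)` define the same line $x_1 + x_2 = 3.5$; only the scale of the numbers changes, which is why the choice of constant carries no information.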

## Other interpretation:

As your text explains, another interpretation of this is that you're fitting two parallel hyperplanes, one touching the $X$ set, one touching the $Y$ set, with some distance between them.
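To make "some distance" concrete, here is a sketch of the standard computation for a general separation constant $t > 0$. The points $\boldsymbol{x}_0$ and $\boldsymbol{x}_1$ are introduced only for this illustration. Take $\boldsymbol{x}_0$ on the first hyperplane and step along the normal direction until you reach the second:

$$ \boldsymbol{w} \cdot \boldsymbol{x}_0 - b = -t, \quad \quad \boldsymbol{x}_1 = \boldsymbol{x}_0 + d \, \frac{\boldsymbol{w}}{\|\boldsymbol{w}\|}, \quad \quad \boldsymbol{w} \cdot \boldsymbol{x}_1 - b = t $$

$$ -t + d \, \|\boldsymbol{w}\| = t \quad \Longrightarrow \quad d = \frac{2t}{\|\boldsymbol{w}\|} $$

With $t = 1$ the distance is $2 / \|\boldsymbol{w}\|$, so maximizing it amounts to minimizing $\|\boldsymbol{w}\|$. Picking a different $t$ just multiplies the optimal $\boldsymbol{w}$ and $b$ by $t$, leaving $d$ and the hyperplanes themselves unchanged.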
