Support Vector Machine¶
Goal is to obtain the hyperplane farthest from the nearest sample points of each class
\[ \begin{aligned} \text{Distance } & \text{between edge point and line} \\ &= \frac{|w^t x_i + w_0|}{||w||} \\ &=\frac{1}{||w||} \\ \implies m &= \frac{2}{||w||} \end{aligned} \]
Goal is to maximize "margin" \(m\) (distance between classes), subject to the following constraints
\[ \begin{cases} w^t x_i + w_0 \ge 1, & y_i = +1 \\ w^t x_i + w_0 \le -1, & y_i = -1 \end{cases} \]
In other words, we need to minimize the cost function
\[ J(\theta) = \frac{1}{2} ||w||^2 \]
Since the objective is quadratic with linear constraints, this can be solved through quadratic programming
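As a numerical sanity check of the margin formula above, the sketch below uses a hypothetical hyperplane and two points lying on the margin boundaries (not an example from these notes) to verify that each margin point sits at distance \(1/||w||\), giving \(m = 2/||w||\):

```python
import numpy as np

# Hypothetical separating hyperplane w^t x + w0 = 0
w = np.array([1.0, 1.0])
w0 = -3.0

# Points lying on the margin boundaries w^t x + w0 = +1 and -1
x_pos = np.array([2.0, 2.0])   # w @ x_pos + w0 = +1
x_neg = np.array([1.0, 1.0])   # w @ x_neg + w0 = -1

norm_w = np.linalg.norm(w)

# Distance of each margin point to the hyperplane: |w^t x + w0| / ||w||
d_pos = abs(w @ x_pos + w0) / norm_w
d_neg = abs(w @ x_neg + w0) / norm_w

# Both distances equal 1/||w||, so the margin is m = 2/||w||
margin = d_pos + d_neg
print(margin)  # 1.414... = 2 / sqrt(2)
```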
For Linearly-Separable¶
- Plot sample points
- Find support vectors (points of each class that lie closest to the other class)
- Form augmented vectors by appending bias = 1

\[ s_1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \implies \tilde{s_1} = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \]
- Find values of \(\alpha\), assuming that
  - \(+ve = +1\)
  - \(-ve = -1\)
\[ \begin{aligned} \alpha_1 \tilde{s_1} \cdot \tilde{s_1} + \alpha_2 \tilde{s_2} \cdot \tilde{s_1} + \alpha_3 \tilde{s_3} \cdot \tilde{s_1} &= -1 \\ \alpha_1 \tilde{s_1} \cdot \tilde{s_2} + \alpha_2 \tilde{s_2} \cdot \tilde{s_2} + \alpha_3 \tilde{s_3} \cdot \tilde{s_2} &= 1 \\ \alpha_1 \tilde{s_1} \cdot \tilde{s_3} + \alpha_2 \tilde{s_2} \cdot \tilde{s_3} + \alpha_3 \tilde{s_3} \cdot \tilde{s_3} &= 1 \end{aligned} \]
- Find \(\tilde{w}\)

\[ \tilde{w} = \sum_i \alpha_i \tilde{s_i} \]

- Split \(\tilde{w}\) into the weight vector \(w\) (first components) and the bias \(w_0\) (last component); the decision boundary is \(w^t x + w_0 = 0\)
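The steps above can be sketched numerically. The support vectors below are a hypothetical worked example (not the one from these notes): \(\tilde{s_1}\) belongs to the negative class and \(\tilde{s_2}, \tilde{s_3}\) to the positive class, matching the \(-1, +1, +1\) right-hand sides of the system:

```python
import numpy as np

# Hypothetical support vectors, each augmented with bias = 1
s = np.array([
    [1.0,  0.0, 1.0],  # s~1, negative class
    [3.0,  1.0, 1.0],  # s~2, positive class
    [3.0, -1.0, 1.0],  # s~3, positive class
])

# Gram matrix of dot products s~i . s~j
G = s @ s.T

# Right-hand side: -1 for the negative support vector, +1 for the positive ones
y = np.array([-1.0, 1.0, 1.0])

# Solve the linear system for alpha
alpha = np.linalg.solve(G, y)

# w~ = sum_i alpha_i s~i ; last component is the bias w0
w_tilde = alpha @ s
print(alpha)    # [-3.5  0.75  0.75]
print(w_tilde)  # [ 1.  0. -2.]  ->  w = (1, 0), w0 = -2
```

Here the recovered boundary is \(x_1 = 2\), the vertical line midway between the negative support vector at \(x_1 = 1\) and the positive ones at \(x_1 = 3\).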
Kernel function \(\phi(x)\)¶
Transformation function for non-linearly-separable data: it maps samples into a space where they become linearly separable
For eg, to increase the dimensionality, we can use \(\phi(x) = (x, x^2)\)
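A minimal sketch of that mapping, using hypothetical 1-D data where the positive class sits between the negative points, so no single threshold on \(x\) separates them:

```python
import numpy as np

# Hypothetical 1-D samples: positive class nested inside the negative class
x_pos = np.array([-1.0, 1.0])
x_neg = np.array([-2.0, 2.0])

def phi(x):
    # Feature map phi(x) = (x, x^2): lift 1-D samples into 2-D
    return np.stack([x, x ** 2], axis=-1)

# After mapping, the second coordinate x^2 separates the classes linearly
z_pos = phi(x_pos)[:, 1]
z_neg = phi(x_neg)[:, 1]
print(z_pos.max(), z_neg.min())  # 1.0 4.0 -> e.g. x^2 = 2.5 separates them
```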
| Kernel Function | \(\phi(x)\) | Notes |
|---|---|---|
| Linear | \(x\) | |
| Polynomial | \((kx+c)^n\) | |
| Gaussian | \(\exp \left( \dfrac{-\vert x-y \vert^2}{2 \sigma^2} \right)\) where \(\sigma^2 =\) variance of sample | |
| RBF (Radial Basis Function) | \(\exp( -\gamma \vert x_i - x_j \vert^2 )\) | Most powerful, but not necessary in most cases |
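The RBF kernel from the table can be computed directly; the sketch below (with a hypothetical \(\gamma = 0.5\), which corresponds to the Gaussian form with \(\sigma^2 = 1\)) checks two basic kernel properties, \(K(x, x) = 1\) and symmetry:

```python
import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    return np.exp(-gamma * np.sum((x - y) ** 2))

x = np.array([1.0, 2.0])
y = np.array([2.0, 0.0])

k_xy = rbf_kernel(x, y)
print(rbf_kernel(x, x))  # 1.0  (a point is maximally similar to itself)
print(np.isclose(k_xy, rbf_kernel(y, x)))  # True (kernel is symmetric)
```

With \(\gamma = 1 / (2\sigma^2)\), this is exactly the Gaussian kernel row of the table.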