I've consulted three textbooks on support vector machines, and it's still unclear to me how to solve such a problem without a QP solver (which is what my professor has asked us to do). Can anyone help me work through a toy example?
Given toy data:
$$X = \begin{bmatrix} 0 & 0\\ 2 & 3\end{bmatrix} \qquad\qquad y = \begin{bmatrix} -1\\ 1\end{bmatrix}$$
My understanding proceeds as follows:
$$\min_{w,b}\ \frac{1}{2}w^Tw \qquad \text{s.t.} \qquad y_i(w^Tx_i + b) \ge 1 \ \ \text{for all } i$$
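Written out for the toy data (just my own bookkeeping, not from the textbooks), the two constraints are:
$$(-1)(w^Tx_1 + b) \ge 1 \iff -b \ge 1 \qquad\qquad (1)(w^Tx_2 + b) \ge 1 \iff 2w_1 + 3w_2 + b \ge 1$$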
Introduce multipliers $\alpha_i \ge 0$ and express the constrained problem through the Lagrangian:
$$L(w,b,\alpha) = \frac{1}{2}w^Tw - \sum_i\alpha_i[y_i(w^Tx_i + b)-1]$$
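For the toy data this expands (again, my own expansion, writing $w = (w_1, w_2)^T$) to:
$$L(w,b,\alpha) = \frac{1}{2}(w_1^2 + w_2^2) + \alpha_1(b+1) - \alpha_2(2w_1 + 3w_2 + b - 1)$$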
The dual problem is equivalent to the primal (strong duality holds for this convex problem) but easier to solve:
$$\min_w \max_{\alpha \ge 0} \frac{1}{2}w^Tw - \sum_{i}\alpha_i[y_i(w^Tx_i+b)-1] = \max_{\alpha \ge 0} \min_w \frac{1}{2}w^Tw - \sum_{i}\alpha_i[y_i(w^Tx_i+b)-1]$$
Start with the inner minimization: minimize over $w$, holding $\alpha$ fixed:
$$\frac{\partial L}{\partial w} = w - \sum_i\alpha_iy_ix_i = 0 \implies w = \sum_i\alpha_iy_ix_i$$
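Concretely, since $x_1$ is the zero vector, the toy data gives:
$$w = \alpha_1(-1)\begin{bmatrix}0\\0\end{bmatrix} + \alpha_2(1)\begin{bmatrix}2\\3\end{bmatrix} = \alpha_2\begin{bmatrix}2\\3\end{bmatrix}$$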
Plug this expression for $w$ back into the Lagrangian:
$$L(w,b,\alpha) = \frac{1}{2}\left(\sum_i\alpha_iy_ix_i\right)^T\!\left(\sum_j\alpha_jy_jx_j\right) - \sum_j\alpha_j\left[y_j\left(\left(\sum_i\alpha_iy_ix_i\right)^T\!x_j + b\right)-1\right]$$
Simplify:
$$= \frac{1}{2}\sum_{i,j}\alpha_i\alpha_jy_iy_jx_i^Tx_j - \sum_{i,j}\alpha_i\alpha_jy_iy_jx_i^Tx_j - b\sum_i\alpha_iy_i + \sum_i\alpha_i$$
The third term vanishes because stationarity with respect to $b$ requires
$$\frac{\partial L}{\partial b} = -\sum_i\alpha_iy_i = 0,$$
which also carries over as the constraint $\sum_i\alpha_iy_i = 0$ on the dual.
Ergo:
$$L(\alpha) = \sum_i\alpha_i - \frac{1}{2}\sum_{i,j}\alpha_i\alpha_jy_iy_jx_i^Tx_j$$
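The inner products I need for the toy data are:
$$x_1^Tx_1 = 0 \qquad x_1^Tx_2 = x_2^Tx_1 = 0 \qquad x_2^Tx_2 = (2)(2) + (3)(3) = 13$$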
Plug in $X$ and $y$:
$$= \alpha_1 + \alpha_2 - \frac{1}{2}\left[ (-1)(-1)\alpha_1^2(0) + (-1)(1)\alpha_1\alpha_2(0) + (1)(-1)\alpha_2\alpha_1(0) + (1)(1)\alpha_2^2(13) \right]$$
$$= \alpha_1 + \alpha_2 - \frac{13}{2}\alpha_2^2$$
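To convince myself the algebra up to this point is right, I wrote a small numpy sanity check (my own script, not from any of the textbooks; it only verifies the simplification, it doesn't optimize anything):

```python
import numpy as np

# Toy data from the question.
X = np.array([[0.0, 0.0],
              [2.0, 3.0]])
y = np.array([-1.0, 1.0])

K = X @ X.T                          # Gram matrix: K[i, j] = x_i^T x_j
Q = np.outer(y, y) * K               # Q[i, j] = y_i y_j x_i^T x_j

rng = np.random.default_rng(0)
for _ in range(5):
    a = rng.random(2)                             # random alpha >= 0
    general = a.sum() - 0.5 * a @ Q @ a           # general dual objective
    by_hand = a[0] + a[1] - (13 / 2) * a[1] ** 2  # my simplified expression
    print(np.isclose(general, by_hand))           # expect: True every time
```

This should print `True` every time, since the two expressions are algebraically identical.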
I imagined that my goal now was to find the values of $\alpha_i$ that maximize the foregoing expression. I thought I would take partial derivatives with respect to each $\alpha_i$ and set them to zero, but that can't be right, because for $\alpha_1$ that gives $\frac{\partial L}{\partial \alpha_1} = 1$, which can never equal zero.