
I've consulted three textbooks on support vector machines, and it is still unclear to me how to solve such a problem without a QP solver (which is what my professor has asked us to do). Can anyone walk me through the workings on a toy example?

Given toy data:

$$X = \begin{bmatrix} 0 & 0\\ 2 & 3\end{bmatrix} \qquad\qquad y = \begin{bmatrix} -1\\ 1\end{bmatrix}$$

My understanding proceeds as follows:

$$\min_{w,b} \frac{1}{2}w^Tw \ \ \ \ \ s.t. \ \ y_i(w^Tx_i + b) \ge 1 \ \ \forall i$$
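Though the exercise is to solve this by hand, a numerical sanity check is useful for confirming the final numbers. A minimal sketch, assuming numpy and scipy are available (the packing of the variables as $z = (w_1, w_2, b)$ is my own choice):

```python
import numpy as np
from scipy.optimize import minimize

X = np.array([[0.0, 0.0], [2.0, 3.0]])
y = np.array([-1.0, 1.0])

# Variables packed as z = (w1, w2, b); objective is (1/2) ||w||^2.
def objective(z):
    w = z[:2]
    return 0.5 * w @ w

# One margin constraint per point: y_i (w^T x_i + b) - 1 >= 0.
constraints = [
    {"type": "ineq", "fun": (lambda z, i=i: y[i] * (X[i] @ z[:2] + z[2]) - 1.0)}
    for i in range(2)
]

res = minimize(objective, x0=np.zeros(3), constraints=constraints)
print(res.x)  # ≈ [0.3077, 0.4615, -1.0], i.e. w = (4/13, 6/13), b = -1
```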

Introduce multipliers $\alpha_i \ge 0$ and express the constrained problem via the Lagrangian:

$$L(w,b,\alpha) = \frac{1}{2}w^Tw - \sum_i\alpha_i[y_i(w^Tx_i + b)-1]$$

The dual problem is equivalent to, but easier than, the primal problem:

$$\min_{w,b} \max_{\alpha \ge 0} \frac{1}{2}w^Tw - \sum_{i}\alpha_i[y_i(w^Tx_i+b)-1] = \max_{\alpha \ge 0} \min_{w,b} \frac{1}{2}w^Tw - \sum_{i}\alpha_i[y_i(w^Tx_i+b)-1]$$

Start by minimizing over $w$, expressing $w$ in terms of $\alpha$:

$$\frac{\partial L}{\partial w} = w - \sum_i\alpha_iy_ix_i = 0 \implies w = \sum_i\alpha_iy_ix_i$$
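The stationarity condition is easy to evaluate numerically. A small sketch, assuming numpy; the $\alpha$ values below are arbitrary placeholders, not the dual optimum:

```python
import numpy as np

X = np.array([[0.0, 0.0], [2.0, 3.0]])
y = np.array([-1.0, 1.0])
alpha = np.array([1.0, 1.0])  # arbitrary placeholder multipliers, for illustration only

# Stationarity: w = sum_i alpha_i * y_i * x_i
w = (alpha * y) @ X
print(w)  # [2. 3.] for these placeholder alphas
```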

Plug this value for $w$ into the Lagrangian from earlier:

$$L(w,b,\alpha) = \frac{1}{2}\Big(\sum_i\alpha_iy_ix_i\Big)^T\Big(\sum_j\alpha_jy_jx_j\Big) - \sum_j\alpha_j\Big[y_j\Big(\Big(\sum_i\alpha_iy_ix_i\Big)^Tx_j + b\Big)-1\Big]$$

Simplify:

$$= \frac{1}{2}\sum_{i,j}\alpha_i\alpha_jy_iy_jx_i^Tx_j - \sum_{i,j}\alpha_i\alpha_jy_iy_jx_i^Tx_j - b\sum_j\alpha_jy_j+\sum_j\alpha_j$$

The third term equals zero because:

$$\frac{\partial L}{\partial b} = -\sum_i\alpha_iy_i = 0$$

Ergo:

$$L(w,b,\alpha) = \sum_i\alpha_i - \frac{1}{2}\sum_{i,j}\alpha_i\alpha_jy_iy_jx_i^Tx_j$$

Plug in $X$ and $y$:

$$= \alpha_1 + \alpha_2 - \frac{1}{2}\left[ (-1)(-1)\alpha_1^2(0) + (-1)(1)\alpha_1\alpha_2(0) + (-1)(1)\alpha_2\alpha_1(0) + (1)(1)\alpha_2^2(13) \right]$$

$$= \alpha_1 + \alpha_2 - \frac{13}{2}\alpha_2^2$$
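The coefficients $0$ and $13$ above are just the entries of the Gram matrix $y_i y_j\, x_i^T x_j$, which can be checked quickly (assuming numpy):

```python
import numpy as np

X = np.array([[0.0, 0.0], [2.0, 3.0]])
y = np.array([-1.0, 1.0])

# Gram matrix of the label-weighted points: Q[i, j] = y_i * y_j * x_i^T x_j
Q = (y[:, None] * X) @ (y[:, None] * X).T
print(Q)  # [[0. 0.] [0. 13.]]
```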

I imagined that my goal now was to find the values of $\alpha_i$ that maximize the foregoing expression. I thought I would take partial derivatives with respect to each $\alpha_i$ and set them to zero, but that can't be right, because I would end up with $\frac{\partial L}{\partial \alpha_1} = 1 \neq 0$.

  • Why not write the objective function in matrix form? What is $x_i$? – Commented Nov 19, 2016 at 19:28
  • @RodrigodeAzevedo: $x_i$ is a row of $X$, i.e. a single data point in 2-D space; $X$ is defined at the top of the post. Could you point out which function the "objective function" is? I'm not familiar with that term. I've tried to imagine what matrix expression would let me compute the $\alpha_i$, but I'm still in the dark. – Commented Nov 21, 2016 at 15:21

1 Answer


We have the quadratic program (QP) in $\mathrm w \in \mathbb R^2$ and $b \in \mathbb R$

$$\begin{array}{ll} \text{minimize} & w_1^2 + w_2^2\\ \text{subject to} & - (0 w_1 + 0 w_2 + b) \geq 1\\ & + (2 w_1 + 3 w_2 + b) \geq 1\end{array}$$

which can be written as follows

$$\begin{array}{ll} \text{minimize} & w_1^2 + w_2^2\\ \text{subject to} & b \leq -1\\ & 2 w_1 + 3 w_2 \geq 1 - b\end{array}$$

If $b \leq -1$, then $1 - b \geq 2$. Thus,

$$\begin{array}{ll} \text{minimize} & w_1^2 + w_2^2\\ \text{subject to} & b \leq -1\\ & 2 w_1 + 3 w_2 \geq 2\end{array}$$

Decoupling $\mathrm w$ and $b$, we solve the QP in $\mathrm w$

$$\begin{array}{ll} \text{minimize} & w_1^2 + w_2^2\\ \text{subject to} & 2 w_1 + 3 w_2 \geq 2\end{array}$$

and then choose a $b \leq -1$. The inequality constraint must be active at the minimum (otherwise $\mathrm w$ could be scaled down further), so rewriting it as an equality constraint gives the least-norm problem

$$\begin{array}{ll} \text{minimize} & w_1^2 + w_2^2\\ \text{subject to} & 2 w_1 + 3 w_2 = 2\end{array}$$

whose solution is

$$\bar{\mathrm w} := \dfrac{2}{13} \begin{bmatrix} 2\\ 3\end{bmatrix}$$

Choosing $b = -1$ makes both margin constraints active: $\bar{\mathrm w}^T \mathrm x_2 + b = 2 - 1 = 1$ and $-(\bar{\mathrm w}^T \mathrm x_1 + b) = 1$, so both points are support vectors.
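The least-norm solution can be verified numerically; a short sketch assuming numpy, using the closed form $\mathrm w = A^T (A A^T)^{-1} \mathrm b$ for the minimum-norm solution of an underdetermined system $A \mathrm w = \mathrm b$:

```python
import numpy as np

# Least-norm solution of 2 w1 + 3 w2 = 2: w = A^T (A A^T)^{-1} b
A = np.array([[2.0, 3.0]])
b = np.array([2.0])
w = A.T @ np.linalg.solve(A @ A.T, b)
print(w.ravel())  # [4/13, 6/13] ≈ [0.3077, 0.4615]
```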

