I've consulted three textbooks on support vector machines, and it's still unclear to me how to solve such a problem without a QP solver (which is what my professor has asked us to do). Can anyone help me work through a toy example?
Given toy data:
$$X = \begin{bmatrix} 0 & 0\\ 2 & 3\end{bmatrix} \qquad\qquad y = \begin{bmatrix} -1\\ 1\end{bmatrix}$$
My understanding proceeds as follows:
$$\min_{w,b}\ \frac{1}{2}w^Tw \qquad \text{s.t.} \qquad y_i(w^Tx_i + b) \ge 1 \ \ \text{for all } i$$
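Written out for the toy data (just my own bookkeeping, not from the textbooks), the two constraints are:
$$(-1)(w^Tx_1 + b) \ge 1 \iff -b \ge 1 \qquad\qquad (1)(w^Tx_2 + b) \ge 1 \iff 2w_1 + 3w_2 + b \ge 1$$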
Introduce multipliers $\alpha_i \ge 0$ and express the constrained problem through the Lagrangian:
$$L(w,b,\alpha) = \frac{1}{2}w^Tw - \sum_i\alpha_i[y_i(w^Tx_i + b)-1]$$
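For the toy data this expands (again, my own expansion, writing $w = (w_1, w_2)^T$) to:
$$L(w,b,\alpha) = \frac{1}{2}(w_1^2 + w_2^2) + \alpha_1(b+1) - \alpha_2(2w_1 + 3w_2 + b - 1)$$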
The dual problem is equivalent to the primal (strong duality holds for this convex problem) but easier to solve:
$$\min_w \max_{\alpha \ge 0} \frac{1}{2}w^Tw - \sum_{i}\alpha_i[y_i(w^Tx_i+b)-1] = \max_{\alpha \ge 0} \min_w \frac{1}{2}w^Tw - \sum_{i}\alpha_i[y_i(w^Tx_i+b)-1]$$
Start with the inner minimization: minimize over $w$, holding $\alpha$ fixed:
$$\frac{\partial L}{\partial w} = w - \sum_i\alpha_iy_ix_i = 0 \implies w = \sum_i\alpha_iy_ix_i$$
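Concretely, since $x_1$ is the zero vector, the toy data gives:
$$w = \alpha_1(-1)\begin{bmatrix}0\\0\end{bmatrix} + \alpha_2(1)\begin{bmatrix}2\\3\end{bmatrix} = \alpha_2\begin{bmatrix}2\\3\end{bmatrix}$$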
Plug this expression for $w$ back into the Lagrangian:
$$L(w,b,\alpha) = \frac{1}{2}\left(\sum_i\alpha_iy_ix_i\right)^T\!\left(\sum_j\alpha_jy_jx_j\right) - \sum_j\alpha_j\left[y_j\left(\left(\sum_i\alpha_iy_ix_i\right)^T\!x_j + b\right)-1\right]$$
Simplify:
$$= \frac{1}{2}\sum_{i,j}\alpha_i\alpha_jy_iy_jx_i^Tx_j - \sum_{i,j}\alpha_i\alpha_jy_iy_jx_i^Tx_j - b\sum_i\alpha_iy_i + \sum_i\alpha_i$$
The third term vanishes because stationarity with respect to $b$ requires
$$\frac{\partial L}{\partial b} = -\sum_i\alpha_iy_i = 0,$$
which also carries over as the constraint $\sum_i\alpha_iy_i = 0$ on the dual.
Ergo:
$$L(\alpha) = \sum_i\alpha_i - \frac{1}{2}\sum_{i,j}\alpha_i\alpha_jy_iy_jx_i^Tx_j$$
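The inner products I need for the toy data are:
$$x_1^Tx_1 = 0 \qquad x_1^Tx_2 = x_2^Tx_1 = 0 \qquad x_2^Tx_2 = (2)(2) + (3)(3) = 13$$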
Plug in $X$ and $y$:
$$= \alpha_1 + \alpha_2 - \frac{1}{2}\left[ (-1)(-1)\alpha_1^2(0) + (-1)(1)\alpha_1\alpha_2(0) + (1)(-1)\alpha_2\alpha_1(0) + (1)(1)\alpha_2^2(13) \right]$$
$$= \alpha_1 + \alpha_2 - \frac{13}{2}\alpha_2^2$$
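To convince myself the algebra up to this point is right, I wrote a small numpy sanity check (my own script, not from any of the textbooks; it only verifies the simplification, it doesn't optimize anything):

```python
import numpy as np

# Toy data from the question.
X = np.array([[0.0, 0.0],
              [2.0, 3.0]])
y = np.array([-1.0, 1.0])

K = X @ X.T                          # Gram matrix: K[i, j] = x_i^T x_j
Q = np.outer(y, y) * K               # Q[i, j] = y_i y_j x_i^T x_j

rng = np.random.default_rng(0)
for _ in range(5):
    a = rng.random(2)                             # random alpha >= 0
    general = a.sum() - 0.5 * a @ Q @ a           # general dual objective
    by_hand = a[0] + a[1] - (13 / 2) * a[1] ** 2  # my simplified expression
    print(np.isclose(general, by_hand))           # expect: True every time
```

This should print `True` every time, since the two expressions are algebraically identical.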
I imagined that my goal now was to find the values of $\alpha_i$ that maximize the foregoing expression. I thought I would take partial derivatives with respect to each $\alpha_i$ and set them to zero, but that can't be right, because for $\alpha_1$ that gives $\frac{\partial L}{\partial \alpha_1} = 1$, which can never equal zero.