$\begingroup$

If $f(x) = ax + b$ is a linear function and $(x_0, y_0), \ldots, (x_n, y_n)$ are observed values of $f$, then the values of $a, b$ that minimize the sum of squares can be found from the following system of equations:

\begin{equation*} \begin{bmatrix} n+1 & \sum x_i &\mid \sum y_i \\ \sum x_i & \sum x_i^2 &\mid \sum y_i x_i \end{bmatrix} \end{equation*}

(where all sums are in the range $i=0, \ldots, n$). In fact, this is a particular case of a general result for polynomials of degree $m$ fitted to $n+1$ observed values, whose coefficients $a_0, \ldots, a_m$ are given by the system:

\begin{equation*} \begin{bmatrix} S(0) & S(1) & \ldots & S(m) &\mid \sum y_ix_i^0 \\ S(1) & S(2) & \ldots & S(m+1) &\mid \sum y_i x_i^1 \\ \vdots & & \ddots & & \vdots \\ S(m) &S(m+1)& \ldots & S(2m) &\mid \sum y_ix_i^m \end{bmatrix} \end{equation*}

where $S(j) = \sum x_i^j$.
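As a rough illustration, the augmented system above could be assembled and solved numerically along these lines. This is only a sketch in plain Python: the names `normal_equations` and `solve_augmented` are made up for this example, and `degree` stands for the degree of the fitted polynomial.

```python
def normal_equations(xs, ys, degree):
    """Build the augmented matrix [S(j+k) | sum y_i x_i^j] for j, k = 0..degree."""
    def S(p):
        # S(p) = sum of x_i^p over all observations
        return sum(x ** p for x in xs)

    rows = []
    for j in range(degree + 1):
        row = [S(j + k) for k in range(degree + 1)]
        row.append(sum(y * x ** j for x, y in zip(xs, ys)))
        rows.append(row)
    return rows


def solve_augmented(aug):
    """Solve an augmented square system by Gauss-Jordan elimination with partial pivoting."""
    size = len(aug)
    a = [row[:] for row in aug]  # work on a copy
    for col in range(size):
        # pick the row with the largest entry in this column as pivot
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        if abs(a[col][col]) < 1e-12:
            raise ValueError("system is singular or nearly singular")
        for r in range(size):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    # the matrix is now diagonal; read off the coefficients a_0, a_1, ...
    return [a[i][size] / a[i][i] for i in range(size)]


# Example: four points lying exactly on y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
coefficients = solve_augmented(normal_equations(xs, ys, 1))
print(coefficients)  # constant term first, then the slope
```

The coefficients come back in the order $a_0, a_1, \ldots$, matching the column order of the matrix.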

I was given the model $F(l) = k(l - 5.3)$ with $k$ an unknown parameter, and the following observed values.

\begin{equation*} \begin{bmatrix} l \mid & 7 & 9.4 & 12.3 \\ F(l) \mid & 2 & 4 & 5 \end{bmatrix} \end{equation*}

I was asked to find the value of $k$ which minimizes the sum of squares, i.e. the value which minimizes

\begin{equation*} \mathcal{E} := \sum_{i=0}^2 \left( F(x_i) - y_i \right)^2 = \sum_{i=0}^2 \left( k(x_i - 5.3) - y_i \right)^2 \end{equation*}

This can be done by direct computation of its critical point with respect to $k$:

\begin{align*} &\frac{\partial \mathcal{E}}{\partial k} = \sum_{i=0}^2 \frac{\partial }{\partial k}\left[ \left( k(x_i - 5.3) - y_i \right)^2 \right]=0 \\ \iff&k\sum_{i=0}^2(x_i - 5.3)^2 = \sum_{i=0}^2(x_i - 5.3)y_i \\ \iff& k = \frac{\sum_{i=0}^2 (x_i-5.3)y_i}{\sum_{i=0}^2(x_i - 5.3)^2} \\ \iff&k = \frac{54.8}{68.7} \approx 0.798 \end{align*}

(It is easy to see the second derivative is always $>0$ so $k=0.798$ is indeed a minimum.)
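For what it's worth, the arithmetic can be double-checked with a few lines of plain Python. This is just a sketch of the closed-form expression derived above; the helper names `fit_k` and `sum_of_squares` are made up for the check.

```python
# Observed data from the table above
l_values = [7.0, 9.4, 12.3]
f_values = [2.0, 4.0, 5.0]
OFFSET = 5.3  # the fixed shift in F(l) = k (l - 5.3)


def fit_k(ls, fs, offset=OFFSET):
    """Closed-form least-squares estimate of k in F(l) = k (l - offset)."""
    shifted = [l - offset for l in ls]
    numerator = sum(s * y for s, y in zip(shifted, fs))
    denominator = sum(s * s for s in shifted)
    return numerator / denominator


def sum_of_squares(k, ls, fs, offset=OFFSET):
    """Sum of squared residuals for a given k."""
    return sum((k * (l - offset) - y) ** 2 for l, y in zip(ls, fs))


k_hat = fit_k(l_values, f_values)
print(round(k_hat, 3))  # 0.798
```

Evaluating `sum_of_squares` at values slightly above and below `k_hat` gives larger sums, in line with the second-derivative remark.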

However, I suspect the matrix form given at the beginning can be applied to this problem. I have not been able to do so myself because, in the linear function $F(l) = k(l-5.3) = kl -5.3k$, the coefficients $a = k$, $b = -5.3k$ are obviously not independent. In other words, we don't have a system of equations in two independent variables to deal with.

My question is: can the matrix approach given at the beginning be applied to this problem, instead of directly computing the critical point? If not, I'm still interested to know whether there is a simpler approach.

$\endgroup$

1 Answer

$\begingroup$

It is not clear to me how solving that system would help you find the optimal parameters (the parameters $a$ and $b$ don't even appear in the system of equations!).

You can take the following general approach for this sort of least-squares problem.

In the initial example you had $f(x) = a x + b$, meaning that you are minimizing the squared loss

$$ \min_{a,b} \sum_i (y_i - f(x_i))^2.$$

This can be rewritten in a vectorized form

$$\min_{(a,b)\in \mathbb R^2} \left\Vert \begin{pmatrix} y_0 \\ \vdots \\ y_n \end{pmatrix} - \begin{pmatrix} x_0 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} \right\Vert^2$$

or more compactly as

$$ \min_{\beta \in \mathbb R^2} \Vert Y - X\beta\Vert^2$$

You can solve this either through calculus or through linear algebra (the fitted vector $X\hat\beta$ is the orthogonal projection of $Y$ onto the span of the columns of $X$). In both cases you will find that the solution, provided the columns of $X$ are linearly independent, is unique and equal to

$$ \hat \beta = (X'X)^{-1}X'Y.$$ That is, $\hat \beta$ is exactly the value of $\beta$ for which $\Vert Y - X\beta\Vert^2$ is minimal.
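To make the formula concrete, here is a small sketch in plain Python for the two-column case $X = (x_i \;\; 1)$. It forms $X'X$ and $X'Y$ explicitly and inverts the $2 \times 2$ matrix by hand; the function name `fit_line_normal_equations` is just an illustrative choice, and in practice one would more likely call a linear algebra library.

```python
def fit_line_normal_equations(xs, ys):
    """Return (a, b) minimizing sum (y_i - a x_i - b)^2 via beta = (X'X)^{-1} X'Y."""
    # Design matrix: one row (x_i, 1) per observation
    X = [[x, 1.0] for x in xs]

    # X'X is 2x2 and X'Y has two entries
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(2)] for i in range(2)]
    xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(2)]

    det = xtx[0][0] * xtx[1][1] - xtx[0][1] * xtx[1][0]
    if abs(det) < 1e-12:
        # columns of X are (nearly) linearly dependent, so X'X is not invertible
        raise ValueError("X'X is singular; the minimizer is not unique")

    # Apply the explicit inverse of a 2x2 matrix to X'Y
    a = (xtx[1][1] * xty[0] - xtx[0][1] * xty[1]) / det
    b = (xtx[0][0] * xty[1] - xtx[1][0] * xty[0]) / det
    return a, b


# Example: three points lying exactly on y = 2x + 1
a, b = fit_line_normal_equations([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
print(a, b)
```

The singular case in the code corresponds to the condition under which the solution stops being unique, for instance when all $x_i$ are equal.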

When dealing with a function of the form $f(x) = k(x - 5.3) $ then the problem can be rewritten as

$$\min_{k \in \mathbb R} \left\Vert \begin{pmatrix} y_0 \\ \vdots \\ y_n \end{pmatrix} - \begin{pmatrix} x_0 - 5.3 \\ \vdots \\ x_n - 5.3 \end{pmatrix} \begin{pmatrix} k \end{pmatrix} \right\Vert^2$$

In this specific case $X$ is a column vector so $X'X = \Vert X \Vert^2 $ and we have the solution

$$ k = \frac{X'Y}{\Vert X \Vert^2} = \frac{\sum_i(x_i - 5.3)y_i}{\sum_i (x_i - 5.3)^2}$$
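
As a check with the data from the question, where $X = (1.7, 4.1, 7.0)'$ and $Y = (2, 4, 5)'$:

$$ X'Y = 1.7 \cdot 2 + 4.1 \cdot 4 + 7.0 \cdot 5 = 54.8, \qquad X'X = 1.7^2 + 4.1^2 + 7.0^2 = 68.7, \qquad k = \frac{54.8}{68.7} \approx 0.798, $$

which agrees with the value obtained in the question by direct differentiation.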

$\endgroup$
