Matricial approach to parameter estimation of a linear function

Question

If $f(x) = ax + b$ is a linear function, and $(x_0, y_0), \ldots, (x_n, y_n)$ are observed values of $f$, then we can estimate the values of $a, b$ which minimize the sum of squared errors by solving the following system, written as an augmented matrix whose first column multiplies $b$ and whose second column multiplies $a$:

\begin{equation*} \begin{bmatrix} n+1 & \sum x_i &\mid \sum y_i \\ \sum x_i & \sum x_i^2 &\mid \sum y_i x_i \end{bmatrix} \end{equation*}

(where all sums are in the range $i=0, \ldots, n$). In fact, this is a particular case of a general result for polynomials of degree $m$ fitted to $n+1$ observed values, whose coefficients $a_0, \ldots, a_m$ are given by the system:

\begin{equation*} \begin{bmatrix} S(0) & S(1) & \ldots & S(m) &\mid \sum y_ix_i^0 \\ S(1) & S(2) & \ldots & S(m+1) &\mid \sum y_i x_i^1 \\ \vdots & & \ddots & & \vdots \\ S(m) &S(m+1)& \ldots & S(2m) &\mid \sum y_ix_i^m \end{bmatrix} \end{equation*}

where $S(j) = \sum x_i^j$.

I was given the model $F(l) = k(l - 5.3)$ with $k$ an unknown parameter, and the following observed values.

\begin{equation*} \begin{bmatrix} l \mid & 7 & 9.4 & 12.3 \\ F(l) \mid & 2 & 4 & 5 \end{bmatrix} \end{equation*}

I was asked to find the value of $k$ which minimizes the sum of squares, i.e. the value which minimizes

\begin{equation*} \mathcal{E} := \sum_{i=0}^2 \left( F(x_i) - y_i \right)^2 = \sum_{i=0}^2 \left( k(x_i - 5.3) - y_i \right)^2 \end{equation*}

This can be done by direct computation of its critical point with respect to $k$:

\begin{align*} &\frac{\partial \mathcal{E}}{\partial k} = \sum_{i=0}^2 \frac{\partial }{\partial k}\left[ \left( k(x_i - 5.3) - y_i \right)^2 \right]=0 \\ \iff&k\sum_{i=0}^2(x_i - 5.3)^2 = \sum_{i=0}^2(x_i - 5.3)y_i \\ \iff& k = \frac{\sum_{i=0}^2 (x_i-5.3)y_i}{\sum_{i=0}^2(x_i - 5.3)^2} \\ \iff&k = \frac{54.8}{68.7} \approx 0.798 \end{align*}

(It is easy to see the second derivative is always $>0$ so $k=0.798$ is indeed a minimum.)
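As a quick numerical check of the computation above, here is a minimal Python sketch using only the three observations quoted in the question. The names `fit_k`, `sum_of_squares` and `OFFSET` are illustrative choices, not part of the original problem.

```python
# Observations from the question: l values and the measured F(l).
l_values = [7.0, 9.4, 12.3]
f_values = [2.0, 4.0, 5.0]
OFFSET = 5.3  # fixed constant in the model F(l) = k * (l - 5.3)


def fit_k(xs, ys, offset):
    """Closed-form least-squares estimate of k for the model y ~ k * (x - offset)."""
    shifted = [x - offset for x in xs]
    numerator = sum(s * y for s, y in zip(shifted, ys))  # sum (x_i - offset) * y_i
    denominator = sum(s * s for s in shifted)            # sum (x_i - offset)^2
    return numerator / denominator


def sum_of_squares(k, xs, ys, offset):
    """Sum of squared residuals for a given k."""
    return sum((k * (x - offset) - y) ** 2 for x, y in zip(xs, ys))


k_hat = fit_k(l_values, f_values, OFFSET)
print(round(k_hat, 3))  # 0.798
```

Evaluating `sum_of_squares` at values slightly above and below `k_hat` gives a larger error on both sides, which is consistent with the second-derivative remark.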

However, I suspect the matricial form given at the beginning can be applied to this problem. I have not been able to do so myself because, in the linear function $F(l) = k(l-5.3) = kl -5.3k$, the coefficients $a = k$, $b = -5.3k$ are obviously not independent. In other words, we don't have a system of equations in two free variables to deal with.

My question is: Can the matricial approach given at the beginning be applied to this problem, instead of directly computing the critical point? If not, I'm still interested to know whether there is a simpler approach.

Digitallis · Accepted Answer · 2025-02-05 11:36:39Z

It is not clear to me how solving that system would help you find the optimal parameters (the parameters $a$ and $b$ don't even appear in the system of equations!).

You can take the following general approach for this sort of least-squares problem.

In the initial example you had $f(x) = a x + b$, meaning that you are minimizing the squared loss

$$ \min_{a,b} \sum_i (y_i - f(x_i))^2.$$

This can be rewritten in a vectorized form

$$\min_{(a,b)\in \mathbb R^2} \left\Vert \begin{pmatrix} y_0 \\ ... \\ y_n \end{pmatrix} - \begin{pmatrix} x_0 & 1 \\ ... & ... \\ x_n & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} \right\Vert^2$$

or more compactly as

$$ \min_{\beta \in \mathbb R^2} \Vert Y - X\beta\Vert^2$$

You can solve this either through calculus or through linear algebra (the fitted vector $X\hat\beta$ is the orthogonal projection of $Y$ onto the span of the columns of $X$). In both cases you will find that the solution, provided the columns of $X$ are linearly independent so that $X'X$ is invertible, is unique and equal to

$$ \hat \beta = (X'X)^{-1}X'Y.$$ That is, $\hat \beta$ is exactly the value of $\beta$ for which $\Vert Y - X\beta\Vert^2$ is minimal.
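To connect this with the augmented matrix in the question: for the two-parameter model, $X'X$ and $X'Y$ contain exactly the sums that appear there, up to the ordering of the columns (the question puts $b$ first, the matrix $X$ above puts $a$ first). The following is a minimal Python sketch under that reading. The helper names are hypothetical, and the question's data are used for an unconstrained line fit purely as an illustration.

```python
def normal_equations_line(xs, ys):
    """Augmented matrix [X'X | X'Y] for y ~ a*x + b, with columns ordered (b, a)."""
    count = len(xs)                            # number of observations
    sx = sum(xs)                               # sum x_i
    sxx = sum(x * x for x in xs)               # sum x_i^2
    sy = sum(ys)                               # sum y_i
    sxy = sum(x * y for x, y in zip(xs, ys))   # sum x_i * y_i
    return [[count, sx, sy],
            [sx, sxx, sxy]]


def solve_line(augmented):
    """Solve the 2x2 augmented system by Cramer's rule and return (a, b)."""
    (p, q, r), (s, t, u) = augmented
    det = p * t - q * s
    if det == 0:
        raise ValueError("X'X is singular: the x values are all equal")
    b = (r * t - q * u) / det
    a = (p * u - r * s) / det
    return a, b


# Illustration only: unconstrained line fit to the question's observations.
xs = [7.0, 9.4, 12.3]
ys = [2.0, 4.0, 5.0]
a_hat, b_hat = solve_line(normal_equations_line(xs, ys))
residuals = [y - (a_hat * x + b_hat) for x, y in zip(xs, ys)]
print(a_hat, b_hat)
```

At the solution the residuals are orthogonal to both columns of $X$, so `sum(residuals)` and `sum(x * r for x, r in zip(xs, residuals))` should both be zero up to rounding.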

For a function of the form $f(x) = k(x - 5.3)$, the problem can be rewritten as

$$\min_{k \in \mathbb R} \left\Vert \begin{pmatrix} y_0 \\ ... \\ y_n \end{pmatrix} - \begin{pmatrix} x_0 - 5.3 \\ ... \\ x_n - 5.3 \end{pmatrix} \begin{pmatrix} k \end{pmatrix} \right\Vert^2$$

In this specific case $X$ is a column vector so $X'X = \Vert X \Vert^2 $ and we have the solution

$$ k = \frac{X'Y}{X'X} = \frac{X'Y}{\Vert X\Vert^2} = \frac{\sum_i(x_i - 5.3)y_i}{\sum_i (x_i - 5.3)^2}$$
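In the augmented-matrix notation of the question, this one-parameter problem appears to reduce to a $1 \times 1$ system, since $X'X$ and $X'Y$ are both scalars here. As a sketch, plugging in the observations quoted in the question, with $x_i - 5.3 \in \{1.7,\ 4.1,\ 7.0\}$ and $y_i \in \{2,\ 4,\ 5\}$:

$$ \begin{bmatrix} \sum_i (x_i - 5.3)^2 & \mid & \sum_i (x_i - 5.3)y_i \end{bmatrix} = \begin{bmatrix} 68.7 & \mid & 54.8 \end{bmatrix} \quad \Longrightarrow \quad k = \frac{54.8}{68.7} \approx 0.798 $$

So the matricial form does apply, but to the single regressor $x - 5.3$ rather than to the two columns $x$ and $1$.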
