
I'm trying to solve a homework question, but I got stuck.

Let $A$ be an $m \times n$ matrix with the SVD $A = U \Sigma V^*$, and let $A^+ = (A^* A)^{-1} A^*$ be its pseudoinverse.

I'm trying to get $A^+ = V \Sigma^{-1} U^*$, but I'm missing something.

Can anyone help me with this please?

Thanks!

  • If you are content with Sivaram Ambikasaran's or my answer you might want to accept one of them so that the question will be recognised as an answered one. Commented Feb 19, 2011 at 3:44
  • Saying "SVD decomposition" is not quite unlike saying "enter your PIN number into the ATM machine"... Commented Aug 3, 2011 at 8:31
  • Fair enough! Thanks for the fix! =) Commented Aug 24, 2011 at 21:06

4 Answers


$$ \begin{align} A^+ &= (A^*A)^{-1}A^* \\ &=(V\Sigma U^*U\Sigma V^*)^{-1} V\Sigma U^* \\ &=(V\Sigma^2 V^*)^{-1} V\Sigma U^* \\ &=(V^*)^{-1} \Sigma^{-2} V^{-1} V\Sigma U^* \\ &= V \Sigma^{-2}\Sigma U^* \\ &= V\Sigma^{-1}U^* \end{align} $$ using the properties of the matrices $U,V,\Sigma$ in the singular value decomposition: here the reduced SVD is used, so that $\Sigma$ is square and invertible, $U^*U = I$, and $V$ is unitary, whence $(V^*)^{-1} = V$.
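As a quick numerical sanity check of this derivation, here is a sketch using NumPy's reduced SVD (the random test matrix and variable names are just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))  # tall real matrix, full column rank almost surely

# Reduced SVD: U is 5x3 with orthonormal columns, s holds the singular
# values, and Vh is V^* (3x3 unitary).
U, s, Vh = np.linalg.svd(A, full_matrices=False)

pinv_normal = np.linalg.inv(A.T @ A) @ A.T   # (A^* A)^{-1} A^*
pinv_svd = Vh.T @ np.diag(1.0 / s) @ U.T     # V Sigma^{-1} U^*

assert np.allclose(pinv_normal, pinv_svd)
assert np.allclose(pinv_svd, np.linalg.pinv(A))  # matches NumPy's pseudoinverse
```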

  • Hi Rasmus! I could get there by myself until the 3rd line. The 4th one was my point of doubt: I forgot to invert the $\left( \cdot \right)^{-1}$ sequence! Thanks for pointing that out! =) Commented Feb 2, 2011 at 15:12
  • It could happen that $\Sigma^{-1}$ does not exist, if $\sigma_i=0$ for some $i$. Commented Aug 6, 2019 at 20:44
  • If $\Sigma$ is not square (and thus not invertible), the result still holds! Just show directly that $(V \Sigma' \Sigma V')^{-1} = V(\Sigma' \Sigma )^+V'$ and that $(\Sigma' \Sigma )^+ \Sigma' = \Sigma^+$. The result will follow by adjusting the derivation in the answer. I'm not super familiar with the pseudoinverse $+$ notation, but hopefully I'm using it correctly. Commented Jun 27, 2020 at 2:10

If $A$ is $m \times n$ with $m \neq n$, then there isn't any way of deriving $A^+ = V\Sigma^{-1}U^*$ directly. The reason is that $\Sigma$ has the same dimensions as $A$ and is therefore not invertible. Any source on the SVD writes the decomposition as $A = U_{m\times m} \Sigma_{m\times n} V^T_{n\times n}$. If $A$ is rectangular with full column rank (so that $\Sigma^T \Sigma$ is invertible), the derivation you're looking for is perhaps $$\begin{align} A^+ &=(A^TA)^{-1}A^T\\ &=(V\Sigma^TU^T U\Sigma V^T)^{-1}V\Sigma^TU^T\\ &=(V\Sigma^T\Sigma V^T)^{-1}V\Sigma^TU^T \\ &=(V^T)^{-1}(\Sigma^T \Sigma)^{-1}V^{-1}V\Sigma^TU^T \\ &=V(\Sigma^T \Sigma)^{-1}\Sigma^TU^T \end{align}$$
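A numerical check of this rectangular-case formula, sketched with NumPy's full SVD (the test matrix and sizes are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 4
A = rng.standard_normal((m, n))  # rectangular, full column rank almost surely

# Full SVD: U is m x m, Vh is n x n; rebuild the rectangular m x n Sigma.
U, s, Vh = np.linalg.svd(A, full_matrices=True)
Sigma = np.zeros((m, n))
Sigma[:n, :n] = np.diag(s)

V = Vh.T
pinv_rect = V @ np.linalg.inv(Sigma.T @ Sigma) @ Sigma.T @ U.T  # V (Sigma^T Sigma)^{-1} Sigma^T U^T

assert np.allclose(pinv_rect, np.linalg.pinv(A))
```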

  • The result still holds for non-square matrices (unless I made a mistake); see my comment above. Commented Jun 27, 2020 at 2:12

First you need to assume that the matrix $A^*A$ is invertible, for which you need $n \leq m$ and $\operatorname{rank}(A) = n$.

So when $n \leq m$ and $\operatorname{rank}(A) = n$, the reduced SVD of $A$ is $A = U \Sigma V^*$, where $U \in \mathbb{R}^{m \times n}$, $\Sigma \in \mathbb{R}^{n \times n}$ and $V \in \mathbb{R}^{n \times n}$ are such that $U^* U = I_{n \times n}$, $V^* V = I_{n \times n}$, $V V^* = I_{n \times n}$, and $\Sigma$ is a square diagonal matrix with only positive real entries.

Note that $V^{-1}=V^*$.

Also note that $A^* = V \Sigma^* U^* = V \Sigma U^*$ since $\Sigma^* = \Sigma$.

Further note that if $M_1$, $M_2$, and $M_3$ are invertible matrices, then $(M_1 M_2 M_3)^{-1} = M_3^{-1} M_2^{-1} M_1^{-1}$.

Use these to get the final answer.
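These facts can also be checked numerically; here is a small sketch in NumPy (the example matrix is an arbitrary random choice):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 5, 3
A = rng.standard_normal((m, n))  # n <= m and rank(A) = n almost surely

U, s, Vh = np.linalg.svd(A, full_matrices=False)  # reduced SVD
Sigma = np.diag(s)
V = Vh.conj().T

# U has orthonormal columns: U^* U = I (but U U^* != I, since U is not square).
assert np.allclose(U.conj().T @ U, np.eye(n))
# V is square unitary, so V^{-1} = V^*.
assert np.allclose(V.conj().T @ V, np.eye(n))
assert np.allclose(V @ V.conj().T, np.eye(n))
# A^* = V Sigma U^*, since Sigma^* = Sigma (real diagonal).
assert np.allclose(A.conj().T, V @ Sigma @ U.conj().T)
```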

  • Given that the question is homework - which I realise only now - it would have probably been better to restrict to hints like you did. Sorry. Commented Feb 1, 2011 at 22:37
  • Hey Sivaram, I was in a hurry when I posted, so it was a typo (a bad one). Thanks for fixing the title! Commented Feb 2, 2011 at 15:05
  • I knew all these hints, but I missed your "further note" and did not invert the sequence inside those parentheses. Thanks for your patience! Commented Feb 2, 2011 at 15:14

Let $\mathbb{K} \in \{\mathbb{R},\mathbb{C}\}$ be either the field of real numbers $\mathbb{R}$ or the field of complex numbers $\mathbb{C}$. There are multiple ways of defining the Moore-Penrose pseudo-inverse. We choose the following.

Definition (Singular value decomposition). A singular value decomposition of a matrix $\mathbf{A} \in \mathbb{K}^{m \times n}$ is a matrix decomposition \begin{equation} \mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*, \end{equation} for two unitary matrices $\mathbf{U} \in \mathbb{K}^{m \times m}$ and $\mathbf{V} \in \mathbb{K}^{n \times n}$, and a rectangular diagonal matrix \begin{equation} \boldsymbol{\Sigma} = \mathbf{diag}(\sigma_1,\sigma_2,\ldots,\sigma_p), \qquad p = \min\{m,n\}, \end{equation} such that \begin{equation} \sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_p \geq 0. \end{equation} We call diagonal entries $(\sigma_i)_{i=1}^p$ of $\boldsymbol{\Sigma}$ the singular values, the columns $(\mathbf{u}_i)_{i=1}^m$ of $\mathbf{U}$ the left-singular vectors and the columns $(\mathbf{v}_i)_{i=1}^n$ of $\mathbf{V}$ the right-singular vectors of the singular value decomposition.

Remark. The singular value decomposition always exists, but is not unique.

Definition. (Moore-Penrose pseudo-inverse). For a matrix $\mathbf{A} \in \mathbb{K}^{m \times n}$ with $\mathrm{rank}(\mathbf{A}) = r$ and a singular value decomposition $\mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*$, the Moore-Penrose pseudo-inverse of $\mathbf{A}$ is the matrix \begin{equation} \mathbf{A}^\dagger := \mathbf{V} \boldsymbol{\Sigma}^\dagger \mathbf{U}^* \in \mathbb{K}^{n \times m}, \end{equation} where \begin{equation} \boldsymbol{\Sigma}^\dagger = \mathbf{diag}(1/\sigma_1,1/\sigma_2,\ldots,1/\sigma_r,0,\ldots,0) \in \mathbb{K}^{n \times m}. \end{equation} Note that the pseudo-inverse of a tall matrix is wide, and of a wide matrix is tall.
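To make the definition concrete, here is a sketch of how $\mathbf{A}^\dagger$ can be assembled from an SVD in NumPy; the function name and tolerance are my own choices, not standard API:

```python
import numpy as np

def pinv_via_svd(A, tol=1e-12):
    """Moore-Penrose pseudo-inverse built directly from the definition:
    invert the non-zero singular values and transpose the diagonal shape."""
    U, s, Vh = np.linalg.svd(A, full_matrices=True)
    m, n = A.shape
    r = int(np.sum(s > tol * s.max()))  # numerical rank
    Sigma_dag = np.zeros((n, m))        # n x m: tall -> wide and vice versa
    Sigma_dag[:r, :r] = np.diag(1.0 / s[:r])
    return Vh.conj().T @ Sigma_dag @ U.conj().T  # V Sigma^dagger U^*

# Rank-deficient example: the second column is twice the first.
A = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
assert np.allclose(pinv_via_svd(A), np.linalg.pinv(A))
```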

Proposition. For any tall matrix $\mathbf{A} \in \mathbb{K}^{m \times n}$ ($m \geq n$) with full column rank ($\mathrm{rank}(\mathbf{A}) = r = n$), the matrix $\mathbf{A}^*\mathbf{A}$ is invertible.

Proof. Firstly, notice that \begin{equation} \mathbf{x} \in \ker(\mathbf{A}) \implies \mathbf{A} \mathbf{x} = \mathbf{0} \implies \mathbf{A}^*\mathbf{A} \mathbf{x} = \mathbf{0} \implies \mathbf{x} \in \ker(\mathbf{A}^* \mathbf{A}), \end{equation} and that \begin{align*} \mathbf{x} \in \ker(\mathbf{A}^* \mathbf{A}) &\implies \mathbf{A}^* \mathbf{A} \mathbf{x} = \mathbf{0} \implies \mathbf{x}^* \mathbf{A}^* \mathbf{A} \mathbf{x} = 0 \implies \Vert \mathbf{A} \mathbf{x} \Vert_2^2 = 0 \\ &\implies \mathbf{A} \mathbf{x} = \mathbf{0} \implies \mathbf{x} \in \ker(\mathbf{A}). \end{align*} Hence, $\ker(\mathbf{A}) = \ker(\mathbf{A}^* \mathbf{A})$, and by the rank-nullity theorem, \begin{equation} \mathrm{rank}(\mathbf{A}) = n - \dim(\ker(\mathbf{A})) = n - \dim(\ker(\mathbf{A}^* \mathbf{A})) = \mathrm{rank}(\mathbf{A}^*\mathbf{A}). \end{equation} In particular, as $\mathbf{A}^* \mathbf{A} \in \mathbb{K}^{n \times n}$ is a square full-rank matrix, it is invertible. $\qquad \square$
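A quick numerical illustration of this rank equality (an arbitrary random example):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 4))  # tall, full column rank almost surely

# rank(A) == rank(A^* A), so the 4x4 matrix A^* A is invertible here.
assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.conj().T @ A) == 4
```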

Proposition. For a tall matrix $\mathbf{A} \in \mathbb{K}^{m \times n}$ ($m \geq n$) with full column rank ($\mathrm{rank}(\mathbf{A}) = r = n$), the Moore-Penrose pseudo-inverse can be represented as \begin{equation} \mathbf{A}^\dagger = (\mathbf{A}^* \mathbf{A})^{-1} \mathbf{A}^*. \end{equation}

Proof. Consider a tall matrix $\mathbf{A} \in \mathbb{K}^{m \times n}$ ($m \geq n$) with full column rank ($\mathrm{rank}(\mathbf{A}) = r = n$), and let $\mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*$ be any singular value decomposition of $\mathbf{A}$. Notice that the rank is preserved under the application of unitary matrices, so $\mathrm{rank}(\mathbf{A}) = \mathrm{rank}(\mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*) = \mathrm{rank}(\boldsymbol{\Sigma})$. Hence, all singular values $(\sigma_i)_{i=1}^n$ on the diagonal of $\boldsymbol{\Sigma}$ are non-zero. Firstly, we show that the representation holds for $\mathbf{A} = \boldsymbol{\Sigma}$. We have \begin{equation} \boldsymbol{\Sigma}^* \boldsymbol{\Sigma} = \begin{bmatrix} \sigma_1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_n & \cdots & 0 \end{bmatrix} \begin{bmatrix} \sigma_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_n \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 0 \end{bmatrix} = \begin{bmatrix} \sigma_1^2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_n^2 \end{bmatrix}, \end{equation} so \begin{equation} (\boldsymbol{\Sigma}^* \boldsymbol{\Sigma}) \boldsymbol{\Sigma}^\dagger = \begin{bmatrix} \sigma_1^2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_n^2 \end{bmatrix} \begin{bmatrix} 1/\sigma_1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ 0 & \cdots & 1/\sigma_n & \cdots & 0 \end{bmatrix} = \begin{bmatrix} \sigma_1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_n & \cdots & 0 \end{bmatrix} = \boldsymbol{\Sigma}^*. \end{equation} Clearly, $\boldsymbol{\Sigma}^* \boldsymbol{\Sigma}$ is invertible, and \begin{equation} \boldsymbol{\Sigma}^\dagger = (\boldsymbol{\Sigma}^* \boldsymbol{\Sigma})^{-1}\boldsymbol{\Sigma}^*. \end{equation} Therefore, \begin{align*} (\mathbf{A}^* \mathbf{A}) \mathbf{A}^\dagger &= (\mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*)^* (\mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*) (\mathbf{V} \boldsymbol{\Sigma}^\dagger \mathbf{U}^*) \\ &= \mathbf{V} \boldsymbol{\Sigma}^* \underbrace{\mathbf{U}^* \mathbf{U}}_{=\mathbf{I}} \boldsymbol{\Sigma} \underbrace{\mathbf{V}^* \mathbf{V}}_{=\mathbf{I}} \boldsymbol{\Sigma}^\dagger \mathbf{U}^* \\ &= \mathbf{V} (\boldsymbol{\Sigma}^* \boldsymbol{\Sigma}) \boldsymbol{\Sigma}^\dagger \mathbf{U}^* \\ &= \mathbf{V} \boldsymbol{\Sigma}^* \mathbf{U}^* \\ &= (\mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^*)^* \\ &= \mathbf{A}^*. \end{align*} By the proposition above, $\mathbf{A}^* \mathbf{A}$ is also invertible, and we conclude that \begin{equation} \mathbf{A}^\dagger = (\mathbf{A}^* \mathbf{A})^{-1} \mathbf{A}^*. \qquad \square \end{equation}
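The proposition is easy to confirm numerically. A sketch with a random complex matrix (an arbitrary choice of mine, so that the conjugate transposes actually matter):

```python
import numpy as np

rng = np.random.default_rng(4)
# Tall complex matrix with full column rank almost surely.
A = rng.standard_normal((7, 3)) + 1j * rng.standard_normal((7, 3))

lhs = np.linalg.inv(A.conj().T @ A) @ A.conj().T  # (A^* A)^{-1} A^*
assert np.allclose(lhs, np.linalg.pinv(A))        # agrees with A^dagger
```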

