I'm having trouble with the proof of the following theorem from Commutative Matrices by D.A. Suprunenko and R.I. Tyshkevich.
Theorem 3, Chapter 1: Let $P$ be any field and $M\subseteq M_n(P)$ a set of pairwise commuting matrices. The space $P^n$ can be represented as a direct sum of $M$-invariant subspaces $Q_j$, $j=1,\ldots,k$, such that for each $j$ the irreducible parts of the restriction $M|_{Q_j} =\{ m|_{Q_j} ~:~ m\in M\}$ are equivalent, and for $i\neq j$ the irreducible parts of $M|_{Q_i}$ and $M|_{Q_j}$ are not equivalent.
They prove this theorem using a lemma, which I understand. However, at the beginning of the proof they make the following claim, which I do not follow:
(*) The matrices of $M$ can be simultaneously converted to the following form by a similarity transformation:
$$\begin{bmatrix} a_1(m) & &&\\ a_{21}(m) & a_2(m) & &\\ \vdots & & \ddots &\\ a_{s1}(m) & a_{s2}(m) & \cdots & a_s(m) \end{bmatrix}$$
where $m\mapsto a_i(m)$ is an irreducible representation of $M$ for $i=1,\ldots,s$, and the matrix has total size $n\times n$.
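For concreteness, here is a small instance of what I take the form (*) to mean. The field $P=\mathbb{R}$, the single-matrix set $M=\{m\}$, and the specific entries are my own illustrative choices, not taken from the book:
$$m = \begin{bmatrix} 0 & -1 & 0\\ 1 & 0 & 0\\ 1 & 0 & 2 \end{bmatrix}, \qquad a_1(m) = \begin{bmatrix} 0 & -1\\ 1 & 0 \end{bmatrix}, \quad a_{21}(m) = \begin{bmatrix} 1 & 0 \end{bmatrix}, \quad a_2(m) = \begin{bmatrix} 2 \end{bmatrix}.$$
Here $s=2$. The block $a_1(m)$ is irreducible over $\mathbb{R}$ because its characteristic polynomial $x^2+1$ has no real root, so it leaves no line in $\mathbb{R}^2$ invariant; over $\mathbb{C}$ it would split further into $1\times 1$ blocks.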
I know that over an algebraically closed field, commuting matrices can be simultaneously triangularized; in fact, the book proves in the same chapter that a sufficiently large finite extension of $P$ will do. However, it is not obvious to me that commuting matrices over an arbitrary field can be simultaneously converted to a block triangular form with irreducible diagonal blocks. Is the claim (*) true, and if so, why?