Consider the model $y_t = F_{S_t} x_t + \varepsilon_{S_t}$ and $x_t = A_{S_t} x_{t-1} + \nu_{S_t}$, where $\varepsilon_{S_t} \sim N(0, R_{S_t})$, $\nu_{S_t} \sim N(0, Q_{S_t})$, and $S_t$ is a Markov chain with $\mathbb{P}(S_t = i | S_{t-1} = j) = p_{i,j}$.
I want to estimate the parameters of this model using the EM algorithm. As usual for EM, we start from the expected log-likelihood:
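To make the setup concrete, here is a minimal simulation sketch of the model, assuming NumPy. The function name `simulate_switching_ssm`, the two-regime parameter values, and the initial conditions `x0` and `s0` are illustrative assumptions, not part of the problem statement. Note that with the convention $p_{i,j} = \mathbb{P}(S_t = i | S_{t-1} = j)$, each column of the transition matrix sums to one.

```python
import numpy as np

def simulate_switching_ssm(T, P, A, F, Q, R, x0, s0, seed=0):
    """Simulate y_t = F[s] x_t + eps_t, x_t = A[s] x_{t-1} + nu_t, with s = S_t.

    P[i, j] = P(S_t = i | S_{t-1} = j), so each COLUMN of P sums to 1.
    A, F, Q, R are arrays indexed by regime along the first axis.
    """
    rng = np.random.default_rng(seed)
    n_regimes = P.shape[0]
    dim_x, dim_y = A[0].shape[0], F[0].shape[0]
    states = np.empty(T, dtype=int)
    xs = np.empty((T, dim_x))
    ys = np.empty((T, dim_y))
    s_prev, x_prev = s0, np.asarray(x0, dtype=float)
    for t in range(T):
        # Draw the regime S_t from the column of P selected by S_{t-1}.
        s = rng.choice(n_regimes, p=P[:, s_prev])
        # State equation: noise nu ~ N(0, Q[s]).
        x = A[s] @ x_prev + rng.multivariate_normal(np.zeros(dim_x), Q[s])
        # Observation equation: noise eps ~ N(0, R[s]).
        y = F[s] @ x + rng.multivariate_normal(np.zeros(dim_y), R[s])
        states[t], xs[t], ys[t] = s, x, y
        s_prev, x_prev = s, x
    return states, xs, ys

# Hypothetical two-regime example with a scalar state and observation.
P = np.array([[0.95, 0.10],
              [0.05, 0.90]])          # columns sum to 1
A = np.array([[[0.9]], [[0.5]]])
F = np.array([[[1.0]], [[2.0]]])
Q = np.array([[[0.1]], [[0.5]]])
R = np.array([[[0.2]], [[0.2]]])
states, xs, ys = simulate_switching_ssm(200, P, A, F, Q, R, x0=[0.0], s0=0)
```

Simulated data like this is only meant as a sanity check for whatever estimator is eventually derived, since the true regimes `states` are known.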
$$ \mathbb{E}_{x \sim P(.|y, \theta)} \log[L] = \sum_{t} \mathbb{E}_{p(x_t|y_t, \theta_t)}[\log L] $$
The articles I've found either treat $S_t$ as fixed or do not use the EM algorithm (which seems to me the more useful route to the MLE). Since $S_t$ is random here, we need to adjust the posterior probabilities by summing over the regimes:
$$ \begin{aligned} p(x_t|y_t, \theta_t) &= \sum_{i, j} p(x_t, S_t = i, S_{t-1} = j|y_t, \theta_t) \\ &= \sum_{i,j} p(x_t|S_t = i, S_{t-1} = j, y_t, \theta_t) \, p(S_t = i, S_{t-1} = j|y_t, \theta_t) \\ &= \sum_{i, j} p(x_t|S_t = i, S_{t-1} = j, y_t, \theta_t) \, p_{i, j} \, p(S_{t-1} = j| y_t, \theta_t) \end{aligned} $$
This is where I'm stuck. I guess it's not hard to calculate $p(x_t|S_t = i, S_{t-1} = j, y_t, \theta_t)$ (using the model dynamics conditional on the regimes), but what can we do with $p(S_{t-1} = j|y_t, \theta_t)$? Any ideas or references?