13 March 2017
notes on (Krishnan et al., 2017).
basically a vae on state space models (SSMs): learn the parameters of an SSM and, at the same time, an inference network. the model is the standard SSM, except that both the transition and the emission are parametrized by neural nets; everything stays gaussian.
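to make this concrete, here is a minimal pytorch sketch of the generative model; the architecture (one-hidden-layer MLPs, the sizes, the zero initial state) is my own assumption for illustration, not taken from the paper:

```python
# sketch of the generative model: gaussian SSM with neural-net transition
# and emission. layer sizes and initial state are illustrative assumptions.
import torch
import torch.nn as nn

class GaussianSSM(nn.Module):
    def __init__(self, x_dim, y_dim, h_dim=64):
        super().__init__()
        self.x_dim = x_dim
        # transition p(x_t | x_{t-1}) = N(mu(x_{t-1}), diag(sigma(x_{t-1})^2))
        self.trans = nn.Sequential(nn.Linear(x_dim, h_dim), nn.Tanh(),
                                   nn.Linear(h_dim, 2 * x_dim))
        # emission p(y_t | x_t) = N(mu(x_t), diag(sigma(x_t)^2))
        self.emit = nn.Sequential(nn.Linear(x_dim, h_dim), nn.Tanh(),
                                  nn.Linear(h_dim, 2 * y_dim))

    @torch.no_grad()
    def sample(self, T):
        x, ys = torch.zeros(self.x_dim), []
        for _ in range(T):
            mu, logvar = self.trans(x).chunk(2)
            x = mu + (0.5 * logvar).exp() * torch.randn_like(mu)
            mu_y, logvar_y = self.emit(x).chunk(2)
            ys.append(mu_y + (0.5 * logvar_y).exp() * torch.randn_like(mu_y))
        return torch.stack(ys)  # (T, y_dim)
```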
the novelty lies in the inference network: \(q\) takes the form \begin{align} q_{\phi}(x_{1:T} \given y_{1:T}) = q_{\phi}(x_1 \given y_1, \dotsc, y_T) \prod_{t = 2}^T q_{\phi}(x_t \given x_{t - 1}, y_t, \dotsc, y_T), \end{align} i.e. each factor conditions only on the previous \(x\) and the current and future \(y_t\)s. this comes from the conditional independence structure of the posterior: in an SSM, given \(x_{t - 1}\), the state \(x_t\) is independent of \(y_1, \dotsc, y_{t - 1}\), so \(p(x_t \given x_{t - 1}, y_{1:T}) = p(x_t \given x_{t - 1}, y_t, \dotsc, y_T)\) and this factorization can match the exact posterior.
other forms of \(q_{\phi}\) are considered too, e.g. mean-field factorizations or conditioning on the whole sequence \(y_{1:T}\), but \(q_{\phi}(x_1 \given y_1, \dotsc, y_T) \prod_{t = 2}^T q_{\phi}(x_t \given x_{t - 1}, y_t, \dotsc, y_T)\) performs best.
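a sketch of an inference network with this structure: a right-to-left RNN whose state at time \(t\) summarizes \(y_t, \dotsc, y_T\), plus a small "combiner" MLP that merges that summary with the sampled \(x_{t - 1}\). the names and sizes are mine, not the paper's:

```python
# sketch of the structured inference network q(x_t | x_{t-1}, y_{t:T}):
# a backward GRU summarizes future observations, an MLP combines that
# summary with x_{t-1}. names and sizes are illustrative assumptions.
import torch
import torch.nn as nn

class InferenceNet(nn.Module):
    def __init__(self, x_dim, y_dim, h_dim=64):
        super().__init__()
        self.x_dim = x_dim
        self.rnn = nn.GRU(y_dim, h_dim)          # run right-to-left over y
        self.comb = nn.Sequential(nn.Linear(x_dim + h_dim, h_dim), nn.Tanh(),
                                  nn.Linear(h_dim, 2 * x_dim))

    def forward(self, y):                        # y: (T, y_dim)
        h, _ = self.rnn(y.flip(0).unsqueeze(1))  # read y_T, ..., y_1
        h = h.flip(0).squeeze(1)                 # h[t] now summarizes y_{t:T}
        x_prev = torch.zeros(self.x_dim)         # stand-in initial state
        xs, stats = [], []
        for t in range(y.size(0)):
            mu, logvar = self.comb(torch.cat([x_prev, h[t]])).chunk(2)
            # reparametrized sample of x_t, fed into the next factor
            x_prev = mu + (0.5 * logvar).exp() * torch.randn_like(mu)
            xs.append(x_prev)
            stats.append((mu, logvar))
        return torch.stack(xs), stats
```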
the elbo takes a convenient form because everything is gaussian: the KL terms between the variational posterior and the transition prior can be written analytically, so the reparametrization trick is not needed for them… check eq. 6 of the paper.
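for reference, each such KL term reduces to the analytic KL between two diagonal gaussians, comparing \(q_{\phi}(x_t \given x_{t - 1}, y_t, \dotsc, y_T)\) against the transition prior \(p_{\theta}(x_t \given x_{t - 1})\). a sketch (function name is mine):

```python
# analytic KL( N(mu_q, var_q) || N(mu_p, var_p) ) for diagonal gaussians,
# summed over dimensions -- the building block of the gaussian elbo terms.
import torch

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    var_q, var_p = logvar_q.exp(), logvar_p.exp()
    return 0.5 * (logvar_p - logvar_q
                  + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0).sum()
```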
experiments on synthetic linear and nonlinear SSMs, polyphonic music benchmarks, and electronic health records of diabetic patients.
@inproceedings{krishnan2016structured,
  title     = {Structured Inference Networks for Nonlinear State Space Models},
  author    = {Krishnan, Rahul G. and Shalit, Uri and Sontag, David},
  booktitle = {AAAI},
  year      = {2017}
}