Attend, Infer, Repeat
19 February 2018
Generative network
Here is pseudocode for the generative network \(p_\theta(x \given z)\), where the observation \(x \in \mathbb R^{D \times D}\) is an image and \(z\) collects all the latent variables in the execution trace; a code sketch follows the listing.
\(\theta\) contains parameters of various neural nets in the generative network.
- Initialize the image mean \(\mu = 0\),
- While \(\mathrm{sample}\left(\mathrm{Bernoulli}(\rho)\right)\):
    - \(z^{\text{where}} = \mathrm{sample}\left(\mathrm{Normal}(0, I)\right)\),
    - \(z^{\text{what}} = \mathrm{sample}\left(\mathrm{Normal}(0, I)\right)\),
    - \(\hat g = D_\theta(z^{\text{what}})\),
    - \(\mu = \mu + \mathrm{STN}^{-1}(\hat g, z^{\text{where}})\),
- \(\mathrm{observe}\left(x, \prod_{\text{pixel } i} \mathrm{Normal}(\mu_i, \sigma_x^2)\right)\).
where \(\mathrm{STN}^{-1}\) is an inverse Spatial Transformer Network and \(D_\theta\) is a parametric decoder function.
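
To make the steps concrete, here is a minimal PyTorch sketch of this generative process. Everything beyond the pseudocode is an assumption: the image and glimpse sizes, the latent dimensions, the decoder architecture standing in for \(D_\theta\), the cap on the number of loop iterations, and the use of `affine_grid`/`grid_sample` (with a clamped positive scale) as the inverse spatial transformer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributions import Bernoulli, Normal

# Assumed sizes (not taken from the paper).
D = 50           # the image is D x D
G = 20           # the decoded glimpse is G x G
Z_WHAT_DIM = 50
Z_WHERE_DIM = 3  # (scale, shift_x, shift_y)

class Decoder(nn.Module):
    """Stand-in for D_theta: maps z_what to a G x G glimpse."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(Z_WHAT_DIM, 200), nn.ReLU(),
            nn.Linear(200, G * G), nn.Sigmoid())

    def forward(self, z_what):
        return self.net(z_what).view(-1, 1, G, G)

def inverse_stn(glimpse, z_where):
    """Place a G x G glimpse onto a D x D canvas (inverse spatial transformer)."""
    # The pseudocode draws z_where from a standard normal; to keep this sketch
    # numerically sane we treat its first component as a positive scale.
    s = z_where[:, 0].abs().clamp(min=0.2)
    tx, ty = z_where[:, 1], z_where[:, 2]
    # Affine map from canvas coordinates to glimpse coordinates.
    theta = torch.zeros(z_where.size(0), 2, 3)
    theta[:, 0, 0] = 1.0 / s
    theta[:, 1, 1] = 1.0 / s
    theta[:, 0, 2] = -tx / s
    theta[:, 1, 2] = -ty / s
    grid = F.affine_grid(theta, (z_where.size(0), 1, D, D), align_corners=False)
    return F.grid_sample(glimpse, grid, align_corners=False)

def generate(decoder, rho=0.5, sigma_x=0.3, max_steps=3):
    """Sample one image: keep adding objects while the Bernoulli(rho) coin is 1."""
    mu = torch.zeros(1, 1, D, D)
    for _ in range(max_steps):          # cap on objects, for the sketch only
        if Bernoulli(rho).sample().item() == 0:
            break
        z_where = Normal(0., 1.).sample((1, Z_WHERE_DIM))
        z_what = Normal(0., 1.).sample((1, Z_WHAT_DIM))
        glimpse = decoder(z_what)                # \hat g = D_theta(z_what)
        mu = mu + inverse_stn(glimpse, z_where)  # paste the glimpse onto the canvas
    x = Normal(mu, sigma_x).sample()             # per-pixel Gaussian likelihood
    return x, mu
```

For example, `x, mu = generate(Decoder())` draws a single (untrained) image and its mean canvas.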
Inference network
Here is pseudocode for the inference network \(q_\phi(z \given x)\); a code sketch follows the listing.
\(\phi\) contains parameters of various neural nets in the inference network.
- Initialize the hidden state \(h = 0\) for the LSTM cell \(R_\phi\),
- \(w, h = R_\phi(\mathrm{concat}(x, 0, 0), h)\),
- While \(\mathrm{sample}\left(\mathrm{Bernoulli}(f_\phi(w))\right)\):
    - \(z^{\text{where}} = \mathrm{sample}\left(\mathrm{Normal}(\mu_\phi^{\text{where}}(w), \sigma_\phi^{\text{where}}(w)^2)\right)\),
    - \(g = \mathrm{STN}(x, z^{\text{where}})\),
    - \(z^{\text{what}} = \mathrm{sample}\left(\mathrm{Normal}(\mu_\phi^{\text{what}}(g), \sigma_\phi^{\text{what}}(g)^2)\right)\),
    - \(w, h = R_\phi(\mathrm{concat}(x, z^{\text{where}}, z^{\text{what}}), h)\),
where \(\mathrm{STN}\) is a Spatial Transformer Network, and the LSTM cell \(R_\phi\) takes in an (input, hidden state) pair and outputs an (output, next hidden state) pair.
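
And here is a matching sketch of the inference network, reusing the assumed sizes from the generative sketch. Again, the LSTM width, the linear read-out layers standing in for \(f_\phi\), \(\mu_\phi^{\text{where/what}}\), and \(\sigma_\phi^{\text{where/what}}\), the softplus used to keep the scales positive, the step cap, and the batch size of 1 are assumptions, not the paper's choices; the hidden state of the `LSTMCell` is used as the output \(w\).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributions import Bernoulli, Normal

D, G, Z_WHAT_DIM, Z_WHERE_DIM, HIDDEN = 50, 20, 50, 3, 256  # assumed sizes

def stn(image, z_where):
    """Attend: crop and rescale a G x G glimpse out of the D x D image."""
    s = z_where[:, 0].abs().clamp(min=0.2)   # keep the scale positive
    theta = torch.zeros(z_where.size(0), 2, 3)
    theta[:, 0, 0] = s
    theta[:, 1, 1] = s
    theta[:, 0, 2] = z_where[:, 1]
    theta[:, 1, 2] = z_where[:, 2]
    grid = F.affine_grid(theta, (z_where.size(0), 1, G, G), align_corners=False)
    return F.grid_sample(image, grid, align_corners=False)

class InferenceNetwork(nn.Module):
    """Sketch of q_phi(z | x) for a single image x of shape (1, 1, D, D)."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTMCell(D * D + Z_WHERE_DIM + Z_WHAT_DIM, HIDDEN)  # R_phi
        self.presence = nn.Linear(HIDDEN, 1)                 # f_phi
        self.where_loc = nn.Linear(HIDDEN, Z_WHERE_DIM)      # mu_phi^where
        self.where_scale = nn.Linear(HIDDEN, Z_WHERE_DIM)    # sigma_phi^where
        self.what_loc = nn.Linear(G * G, Z_WHAT_DIM)         # mu_phi^what
        self.what_scale = nn.Linear(G * G, Z_WHAT_DIM)       # sigma_phi^what

    def forward(self, x, max_steps=3):
        x_flat = x.view(1, -1)
        h, c = torch.zeros(1, HIDDEN), torch.zeros(1, HIDDEN)
        z_where = torch.zeros(1, Z_WHERE_DIM)
        z_what = torch.zeros(1, Z_WHAT_DIM)
        zs = []
        # First RNN step with zero latents, as in the pseudocode.
        h, c = self.rnn(torch.cat([x_flat, z_where, z_what], dim=1), (h, c))
        w = h  # use the hidden state as the LSTM output w
        for _ in range(max_steps):       # cap on objects, for the sketch only
            if Bernoulli(torch.sigmoid(self.presence(w))).sample().item() == 0:
                break
            z_where = Normal(self.where_loc(w),
                             F.softplus(self.where_scale(w))).sample()
            g = stn(x, z_where).view(1, -1)          # attended glimpse
            z_what = Normal(self.what_loc(g),
                            F.softplus(self.what_scale(g))).sample()
            zs.append((z_where, z_what))
            # Feed the new latents back in before the next presence decision.
            h, c = self.rnn(torch.cat([x_flat, z_where, z_what], dim=1), (h, c))
            w = h
        return zs
```

Calling `InferenceNetwork()(x)` on an image tensor `x` of shape `(1, 1, D, D)` returns the sampled \((z^{\text{where}}, z^{\text{what}})\) pairs for one trace.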