# importance weighted autoencoders

19 February 2017

## summary

understanding: 8/10
code: https://github.com/yburda/iwae

importance weighted autoencoders (iwae) improve on variational autoencoders. the main difference is the loss function. iwae maximizes: \begin{align} \mathcal L_K = \E_{x_1, \dots, x_K \sim q_{\phi}(x \given y)}\left[\log \frac{1}{K} \sum_{k = 1}^K w_k\right] \end{align} where $w_k = p_{\theta}(x_k, y) / q_{\phi}(x_k \given y)$ and the $x_k$ are sampled i.i.d. from the recognition model $q_{\phi}(x \given y)$. the vae objective is the special case $K = 1$. both objectives are lower bounds on $\log p_{\theta}(y)$.
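
to make the objective concrete, here is a minimal numpy sketch on a toy model where everything is tractable: prior $p(x) = \mathcal N(0, 1)$, likelihood $p(y \given x) = \mathcal N(x, 1)$, and a deliberately mismatched proposal $q(x \given y) = \mathcal N(y/2, 0.7^2)$. the model and constants are illustrative choices, not from the paper; the point is just to see the bound tighten toward $\log p(y)$ as $K$ grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_normal(z, mean, var):
    """log density of N(mean, var) evaluated at z."""
    return -0.5 * (np.log(2 * np.pi * var) + (z - mean) ** 2 / var)

def iwae_bound(y, K, n_mc=10_000):
    """Monte Carlo estimate of L_K for the toy model above."""
    q_mean, q_var = y / 2, 0.7 ** 2              # deliberately imperfect proposal
    x = q_mean + np.sqrt(q_var) * rng.standard_normal((n_mc, K))
    log_w = (log_normal(x, 0.0, 1.0)             # log p(x)      (prior)
             + log_normal(y, x, 1.0)             # log p(y | x)  (likelihood)
             - log_normal(x, q_mean, q_var))     # log q(x | y)  (proposal)
    m = log_w.max(axis=1, keepdims=True)         # stable log (1/K) sum_k w_k
    log_avg_w = m[:, 0] + np.log(np.exp(log_w - m).mean(axis=1))
    return log_avg_w.mean()                      # average over MC replicates

y = 1.5
for K in (1, 5, 50, 500):
    print(f"L_{K} = {iwae_bound(y, K):.4f}")
print(f"log p(y) = {log_normal(y, 0.0, 2.0):.4f}")  # exact: y ~ N(0, 1 + 1)
```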

iwae is better because:

• the lower bound tightens as $K$ increases, and converges to $\log p_{\theta}(y)$ as $K \to \infty$ (provided the weights $w_k$ are bounded).
• iwae appears to use the neural network's modelling capacity better: empirically it learns more active latent units.
• experimentally, iwae achieves better test log-likelihood estimates of $\log p_{\theta}(y)$ (obtained by importance sampling with 5000 particles) than vae; see the snippet after this list.
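
note that the paper's evaluation metric is the same estimator with many particles: $\mathcal L_K$ with $K = 5000$ is the importance-sampled estimate of $\log p_{\theta}(y)$. reusing the toy sketch above (my illustrative code, not the paper's):

```python
# L_K with K = 5000 doubles as the log-likelihood estimate used for evaluation
print(f"log p(y) estimate: {iwae_bound(y, K=5000):.4f}")
```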

## references

1. Burda, Y., Grosse, R., & Salakhutdinov, R. (2016). Importance Weighted Autoencoders. In International Conference on Learning Representations (ICLR).
@inproceedings{burda2016importance,
  title     = {Importance Weighted Autoencoders},
  author    = {Burda, Yuri and Grosse, Roger and Salakhutdinov, Ruslan},
  booktitle = {International Conference on Learning Representations (ICLR)},
  year      = {2016}
}

