variational lossy autoencoder

19 April 2017

notes on (Chen et al., 2017).

understanding: 6/10
code: ?

goal

the problem that this paper is tackling is often referred to as the optimization challenges of VAEs:

when the decoder \(p_{\theta}(y \given x)\) is too expressive, the encoder \(q_{\phi}(x \given y)\) just learns the prior \(p_{\theta}(x)\) instead of the posterior \(p_{\theta}(x \given y)\).

this is a problem because the VAE then doesn’t actually autoencode: the latents carry no information about \(y\) and are meaningless as a representation.
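
for reference, a sketch of why this happens, using the standard VAE bound (not anything specific to this paper) written in the notation of these notes, with observation \(y\) and latent \(x\):

\[
\log p_{\theta}(y) \geq \mathbb{E}_{q_{\phi}(x \given y)}\left[\log p_{\theta}(y \given x)\right] - \mathrm{KL}\left(q_{\phi}(x \given y) \,\|\, p_{\theta}(x)\right)
\]

if the decoder ignores \(x\), the first term no longer depends on \(q_{\phi}\), so the bound is maximized by driving the KL term to zero, i.e. by setting \(q_{\phi}(x \given y) = p_{\theta}(x)\).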

information theory

the paper argues via the length of a joint code for \((x, y)\). i don’t quite get it but the bottom line is:

if \(p_{\theta}(y \given x)\) can model \(p_{\theta^\ast}(y)\) without using information from \(x\), it will do so. in this case, the posterior \(p_{\theta}(x \given y)\) is the same as the prior \(p_{\theta}(x)\).
information that can be modeled locally by the decoding distribution \(p_{\theta}(y \given x)\) without access to the latents will be encoded locally, and only the remainder will be encoded in the latents.
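
the code-length argument is roughly the bits-back identity. a sketch in the notation above, assuming the joint code sends \(x\) under the prior, then \(y\) under the decoder, and gets back the bits that were used to pick \(x\) from \(q_{\phi}\):

\[
\mathcal{C}(y) = \mathbb{E}_{q_{\phi}(x \given y)}\left[\log q_{\phi}(x \given y) - \log p_{\theta}(x) - \log p_{\theta}(y \given x)\right] = -\log p_{\theta}(y) + \mathrm{KL}\left(q_{\phi}(x \given y) \,\|\, p_{\theta}(x \given y)\right)
\]

the KL term is an extra cost paid whenever the approximate posterior is imperfect. a decoder that ignores \(x\) has true posterior \(p_{\theta}(x \given y) = p_{\theta}(x)\), which \(q_{\phi}\) can match exactly, so its extra cost is zero. if such a decoder can already model the data, using the latents can only make the code longer.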

solution

we need a decoder \(p_{\theta}(y \given x)\) such that

it’s capable of modeling the information that we don’t want the lossy representation to capture
it’s incapable of modeling the information that we do want the lossy representation to capture

example: if we don’t want the latents to include info about texture, let the decoder model the texture itself (e.g. using a PixelCNN that can only see a local window). the encoder is then forced to carry the other things, like global shapes.

bottom line: if we want to encode something, make sure our decoder can’t possibly decode that something just by itself.
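
to make "can only see locally" concrete, here is a toy 1D stand-in for a masked convolution. it is only a sketch: the function names, the window size and the weights are made up for illustration, and it is not the paper's PixelCNN decoder. it shows that perturbing one input only moves the outputs inside a small window after it, so anything global has to come from somewhere else (the latents).

```python
def causal_local_conv(signal, weights):
    """Output t is a weighted sum of the len(weights) inputs strictly before t.

    Positions before the start of the signal are treated as zeros.
    """
    k = len(weights)
    out = []
    for t in range(len(signal)):
        total = 0.0
        for j in range(k):
            idx = t - k + j  # only inputs t-k .. t-1 are visible
            if idx >= 0:
                total += weights[j] * signal[idx]
        out.append(total)
    return out


def changed_positions(signal, weights, flip_at, delta=1.0):
    """Indices whose output changes when signal[flip_at] is perturbed."""
    base = causal_local_conv(signal, weights)
    perturbed = list(signal)
    perturbed[flip_at] += delta
    moved = causal_local_conv(perturbed, weights)
    return [t for t, (a, b) in enumerate(zip(base, moved)) if a != b]


signal = [0.0] * 12
weights = [0.5, 0.25, 0.25]  # hypothetical window of 3 previous positions
print(changed_positions(signal, weights, flip_at=2))  # -> [3, 4, 5]
```

with a window of 3, position 2 influences only positions 3 to 5; position 11 never hears about it.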

normalizing flows

make the prior \(p_{\theta}(x)\) more expressive by using normalizing flows (the paper uses an autoregressive flow). then use decoders that can only capture local variations.
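
a minimal sketch of what "prior via a flow" means, assuming a two-dimensional latent and a hand-picked affine autoregressive transform. the conditioner `shift_and_log_scale` and its constants are invented for illustration; in a real model it would be a learned network, and nothing here is the paper's architecture. the density comes from the change-of-variables formula: base log-density of the inverted noise minus the log-scale.

```python
import math


def standard_normal_logpdf(e):
    return -0.5 * (e * e + math.log(2.0 * math.pi))


def shift_and_log_scale(x1, a=0.8, b=0.3):
    """Hypothetical conditioner: shift and log-scale for x2 given x1."""
    return a * x1, b * math.tanh(x1)


def flow_forward(e1, e2):
    """Map standard normal noise (e1, e2) to a latent (x1, x2)."""
    x1 = e1
    shift, log_scale = shift_and_log_scale(x1)
    x2 = e2 * math.exp(log_scale) + shift
    return x1, x2


def flow_inverse(x1, x2):
    """Recover the noise, plus the log-scale needed for the Jacobian."""
    e1 = x1
    shift, log_scale = shift_and_log_scale(x1)
    e2 = (x2 - shift) * math.exp(-log_scale)
    return e1, e2, log_scale


def prior_logpdf(x1, x2):
    """log p(x) = log N(e1) + log N(e2) - log_scale (triangular Jacobian)."""
    e1, e2, log_scale = flow_inverse(x1, x2)
    return standard_normal_logpdf(e1) + standard_normal_logpdf(e2) - log_scale


x = flow_forward(0.4, -1.2)
print(x, prior_logpdf(*x))
```

because each coordinate only depends on earlier ones, the Jacobian is triangular and its log-determinant is just the sum of the log-scales, which is what keeps the density cheap to evaluate.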

experiments

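the paper reports state-of-the-art density estimation results on MNIST, OMNIGLOT and Caltech-101 Silhouettes, and competitive results on CIFAR-10.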

references

Chen, X., Kingma, D. P., Salimans, T., Duan, Y., Dhariwal, P., Schulman, J., Sutskever, I., & Abbeel, P. (2017). Variational Lossy Autoencoder. International Conference on Learning Representations (ICLR).

@inproceedings{chen2017variational,
  title = {Variational Lossy Autoencoder},
  author = {Chen, Xi and Kingma, Diederik P. and Salimans, Tim and Duan, Yan and Dhariwal, Prafulla and Schulman, John and Sutskever, Ilya and Abbeel, Pieter},
  year = {2017},
  booktitle = {International Conference on Learning Representations (ICLR)}
}
