# Variance Reduction Techniques

04 February 2017

## One Example of Why We Need Them: Gradient Estimators

Given a family of $$\mathcal X$$-valued random variables $$X_{\theta}$$ and a function $$f: \mathcal X \to \mathbb R$$, we are interested in estimating gradients of its expectation with respect to $$\theta$$: \begin{align} \frac{\partial}{\partial \theta} \E[f(X_{\theta})]. \end{align}

### The Reparametrization Trick Gradient Estimator

The reparametrization trick relies on finding a random variable $$Z$$ and function $$g_{\theta}$$ such that $$X_{\theta} = g_{\theta}(Z)$$. The gradient can then be estimated using a Monte Carlo estimator: \begin{align} \frac{\partial}{\partial \theta} \E[f(X_{\theta})] &= \frac{\partial}{\partial \theta} \E[f(g_{\theta}(Z))] \\
&= \E\left[\frac{\partial}{\partial \theta} f(g_{\theta}(Z)) \right] \\
&= \E\left[f’(g_{\theta}(Z)) \frac{\partial}{\partial \theta} g_{\theta}(Z) \right] \\
&\approx \frac{1}{N} \sum_{n = 1}^N f’(g_{\theta}(z_n)) \frac{\partial}{\partial \theta} g_{\theta}(z_n) \\
&=: I_{\text{reparam}}. \end{align}

This estimator has a standard Monte Carlo variance: \begin{align} \Var[I_{\text{reparam}}] = \frac{1}{N} \Var\left[f’(g_{\theta}(Z)) \frac{\partial}{\partial \theta} g_{\theta}(Z)\right]. \end{align}

The reinforce trick relies on knowing $$\frac{\partial}{\partial \theta} \log p_{\theta}(x)$$ where $$p_{\theta}(x)$$ is the density of $$X_{\theta}$$. The gradient is estimated as follows: \begin{align} \frac{\partial}{\partial \theta} \E[f(X_{\theta})] &= \frac{\partial}{\partial \theta} \int f(x)p_{\theta}(x) \,\mathrm dx \\
&= \int f(x) \frac{\partial}{\partial \theta} p_{\theta}(x) \,\mathrm dx \\
&= \int f(x) \left(\frac{\partial}{\partial \theta} \log p_{\theta}(x) \right) p_{\theta}(x) \,\mathrm dx \\
&= \E\left[f(X_{\theta}) \frac{\partial}{\partial \theta} \log p_{\theta}(X_{\theta})\right] \\
&\approx \frac{1}{N} \sum_{n = 1}^N f(x_n) \frac{\partial}{\partial \theta} \log p_{\theta}(x_n) \\
&=: I_{\text{reinforce}}. \end{align}

This estimator has a standard Monte Carlo variance: \begin{align} \Var[I_{\text{reinforce}}] = \frac{1}{N} \Var\left[f(X_{\theta}) \frac{\partial}{\partial \theta} \log p_{\theta}(X_{\theta})\right]. \end{align}

### Comparison of Reparameterization trick and REINFORCE Estimators

We want to compare $$\Var[I_{\text{reparam}}]$$ and $$\Var[I_{\text{reinforce}}]$$.

It turns out that we can’t make such comparison hold true for all $$f$$ and $$X_{\theta}$$. For details, check out the proposition 1 from section 3.1.2. of yarin gal’s thesis (Gal, 2016). Tables 3.1. and 3.2. given an example of different $$f$$s for which the variance comparisons are inconsistent.

## Variance Reduction Techniques

### Rao-Blackwellization

1. Gal, Y. (2016). Uncertainty in Deep Learning [PhD thesis]. University of Cambridge.
@phdthesis{gal2016uncertainty,
title = {Uncertainty in Deep Learning},
author = {Gal, Yarin},
year = {2016},
school = {University of Cambridge},