Tuan Anh Le

Variance Reduction Techniques

04 February 2017

One Example of Why We Need Them: Gradient Estimators

Given a family of \(\mathcal X\)-valued random variables \(X_{\theta}\) and a function \(f: \mathcal X \to \mathbb R\), we are interested in estimating gradients of its expectation with respect to \(\theta\): \begin{align} \frac{\partial}{\partial \theta} \E[f(X_{\theta})]. \end{align}

The Reparametrization Trick Gradient Estimator

The reparametrization trick relies on finding a random variable \(Z\) and function \(g_{\theta}\) such that \(X_{\theta} = g_{\theta}(Z)\). The gradient can then be estimated using a Monte Carlo estimator: \begin{align} \frac{\partial}{\partial \theta} \E[f(X_{\theta})] &= \frac{\partial}{\partial \theta} \E[f(g_{\theta}(Z))] \\
&= \E\left[\frac{\partial}{\partial \theta} f(g_{\theta}(Z)) \right] \\
&= \E\left[f’(g_{\theta}(Z)) \frac{\partial}{\partial \theta} g_{\theta}(Z) \right] \\
&\approx \frac{1}{N} \sum_{n = 1}^N f’(g_{\theta}(z_n)) \frac{\partial}{\partial \theta} g_{\theta}(z_n) \\
&=: I_{\text{reparam}}. \end{align}

This estimator has a standard Monte Carlo variance: \begin{align} \Var[I_{\text{reparam}}] = \frac{1}{N} \Var\left[f’(g_{\theta}(Z)) \frac{\partial}{\partial \theta} g_{\theta}(Z)\right]. \end{align}

The REINFORCE Gradient Estimator

The reinforce trick relies on knowing \(\frac{\partial}{\partial \theta} \log p_{\theta}(x)\) where \(p_{\theta}(x)\) is the density of \(X_{\theta}\). The gradient is estimated as follows: \begin{align} \frac{\partial}{\partial \theta} \E[f(X_{\theta})] &= \frac{\partial}{\partial \theta} \int f(x)p_{\theta}(x) \,\mathrm dx \\
&= \int f(x) \frac{\partial}{\partial \theta} p_{\theta}(x) \,\mathrm dx \\
&= \int f(x) \left(\frac{\partial}{\partial \theta} \log p_{\theta}(x) \right) p_{\theta}(x) \,\mathrm dx \\
&= \E\left[f(X_{\theta}) \frac{\partial}{\partial \theta} \log p_{\theta}(X_{\theta})\right] \\
&\approx \frac{1}{N} \sum_{n = 1}^N f(x_n) \frac{\partial}{\partial \theta} \log p_{\theta}(x_n) \\
&=: I_{\text{reinforce}}. \end{align}

This estimator has a standard Monte Carlo variance: \begin{align} \Var[I_{\text{reinforce}}] = \frac{1}{N} \Var\left[f(X_{\theta}) \frac{\partial}{\partial \theta} \log p_{\theta}(X_{\theta})\right]. \end{align}

Comparison of Reparameterization trick and REINFORCE Estimators

We want to compare \(\Var[I_{\text{reparam}}]\) and \(\Var[I_{\text{reinforce}}]\).

It turns out that we can’t make such comparison hold true for all \(f\) and \(X_{\theta}\). For details, check out the proposition 1 from section 3.1.2. of yarin gal’s thesis (Gal, 2016). Tables 3.1. and 3.2. given an example of different \(f\)s for which the variance comparisons are inconsistent.

Variance Reduction Techniques

Control Variates



  1. Gal, Y. (2016). Uncertainty in Deep Learning [PhD thesis]. University of Cambridge.
      title = {Uncertainty in Deep Learning},
      author = {Gal, Yarin},
      year = {2016},
      school = {University of Cambridge},
      link = {http://mlg.eng.cam.ac.uk/yarin/thesis/thesis.pdf}