Tuan Anh Le

Variance Reduction Techniques

04 February 2017

One Example of Why We Need Them: Gradient Estimators

Given a family of -valued random variables and a function , we are interested in estimating gradients of its expectation with respect to : \begin{align} \frac{\partial}{\partial \theta} \E[f(X_{\theta})]. \end{align}

The Reparametrization Trick Gradient Estimator

The reparametrization trick relies on finding a random variable and function such that . The gradient can then be estimated using a Monte Carlo estimator: \begin{align} \frac{\partial}{\partial \theta} \E[f(X_{\theta})] &= \frac{\partial}{\partial \theta} \E[f(g_{\theta}(Z))] \\ &= \E\left[\frac{\partial}{\partial \theta} f(g_{\theta}(Z)) \right] \\ &= \E\left[f’(g_{\theta}(Z)) \frac{\partial}{\partial \theta} g_{\theta}(Z) \right] \\ &\approx \frac{1}{N} \sum_{n = 1}^N f’(g_{\theta}(z_n)) \frac{\partial}{\partial \theta} g_{\theta}(z_n) \\ &=: I_{\text{reparam}}. \end{align}

This estimator has a standard Monte Carlo variance: \begin{align} \Var[I_{\text{reparam}}] = \frac{1}{N} \Var\left[f’(g_{\theta}(Z)) \frac{\partial}{\partial \theta} g_{\theta}(Z)\right]. \end{align}

The REINFORCE Gradient Estimator

The reinforce trick relies on knowing where is the density of . The gradient is estimated as follows: \begin{align} \frac{\partial}{\partial \theta} \E[f(X_{\theta})] &= \frac{\partial}{\partial \theta} \int f(x)p_{\theta}(x) \,\mathrm dx \\ &= \int f(x) \frac{\partial}{\partial \theta} p_{\theta}(x) \,\mathrm dx \\ &= \int f(x) \left(\frac{\partial}{\partial \theta} \log p_{\theta}(x) \right) p_{\theta}(x) \,\mathrm dx \\ &= \E\left[f(X_{\theta}) \frac{\partial}{\partial \theta} \log p_{\theta}(X_{\theta})\right] \\ &\approx \frac{1}{N} \sum_{n = 1}^N f(x_n) \frac{\partial}{\partial \theta} \log p_{\theta}(x_n) \\ &=: I_{\text{reinforce}}. \end{align}

This estimator has a standard Monte Carlo variance: \begin{align} \Var[I_{\text{reinforce}}] = \frac{1}{N} \Var\left[f(X_{\theta}) \frac{\partial}{\partial \theta} \log p_{\theta}(X_{\theta})\right]. \end{align}

Comparison of Reparameterization trick and REINFORCE Estimators

We want to compare and .

It turns out that we can’t make such comparison hold true for all and . For details, check out the proposition 1 from section 3.1.2. of yarin gal’s thesis (Gal, 2016). Tables 3.1. and 3.2. given an example of different s for which the variance comparisons are inconsistent.

Variance Reduction Techniques

Control Variates



  1. Gal, Y. (2016). Uncertainty in Deep Learning (PhD thesis). University of Cambridge.
      title = {Uncertainty in Deep Learning},
      author = {Gal, Yarin},
      year = {2016},
      school = {University of Cambridge},
      link = {http://mlg.eng.cam.ac.uk/yarin/thesis/thesis.pdf}