Variance Reduction Techniques

04 February 2017

One Example of Why We Need Them: Gradient Estimators

Given a family of \(\mathcal X\)-valued random variables \(X_{\theta}\) and a function \(f: \mathcal X \to \mathbb R\), we are interested in estimating gradients of its expectation with respect to \(\theta\): \begin{align} \frac{\partial}{\partial \theta} \E[f(X_{\theta})]. \end{align}

The Reparametrization Trick Gradient Estimator

The reparametrization trick relies on finding a random variable \(Z\) and function \(g_{\theta}\) such that \(X_{\theta} = g_{\theta}(Z)\). The gradient can then be estimated using a Monte Carlo estimator: \begin{align} \frac{\partial}{\partial \theta} \E[f(X_{\theta})] &= \frac{\partial}{\partial \theta} \E[f(g_{\theta}(Z))] \\
&= \E\left[\frac{\partial}{\partial \theta} f(g_{\theta}(Z)) \right] \\
&= \E\left[f’(g_{\theta}(Z)) \frac{\partial}{\partial \theta} g_{\theta}(Z) \right] \\
&\approx \frac{1}{N} \sum_{n = 1}^N f’(g_{\theta}(z_n)) \frac{\partial}{\partial \theta} g_{\theta}(z_n) \\
&=: I_{\text{reparam}}. \end{align}

This estimator has a standard Monte Carlo variance: \begin{align} \Var[I_{\text{reparam}}] = \frac{1}{N} \Var\left[f’(g_{\theta}(Z)) \frac{\partial}{\partial \theta} g_{\theta}(Z)\right]. \end{align}

The REINFORCE Gradient Estimator

The reinforce trick relies on knowing \(\frac{\partial}{\partial \theta} \log p_{\theta}(x)\) where \(p_{\theta}(x)\) is the density of \(X_{\theta}\). The gradient is estimated as follows: \begin{align} \frac{\partial}{\partial \theta} \E[f(X_{\theta})] &= \frac{\partial}{\partial \theta} \int f(x)p_{\theta}(x) \,\mathrm dx \\
&= \int f(x) \frac{\partial}{\partial \theta} p_{\theta}(x) \,\mathrm dx \\
&= \int f(x) \left(\frac{\partial}{\partial \theta} \log p_{\theta}(x) \right) p_{\theta}(x) \,\mathrm dx \\
&= \E\left[f(X_{\theta}) \frac{\partial}{\partial \theta} \log p_{\theta}(X_{\theta})\right] \\
&\approx \frac{1}{N} \sum_{n = 1}^N f(x_n) \frac{\partial}{\partial \theta} \log p_{\theta}(x_n) \\
&=: I_{\text{reinforce}}. \end{align}

This estimator has a standard Monte Carlo variance: \begin{align} \Var[I_{\text{reinforce}}] = \frac{1}{N} \Var\left[f(X_{\theta}) \frac{\partial}{\partial \theta} \log p_{\theta}(X_{\theta})\right]. \end{align}

Comparison of Reparameterization trick and REINFORCE Estimators

We want to compare \(\Var[I_{\text{reparam}}]\) and \(\Var[I_{\text{reinforce}}]\).

It turns out that we can’t make such comparison hold true for all \(f\) and \(X_{\theta}\). For details, check out the proposition 1 from section 3.1.2. of yarin gal’s thesis (Gal, 2016). Tables 3.1. and 3.2. given an example of different \(f\)s for which the variance comparisons are inconsistent.