Help protect the Great Barrier Reef with TensorFlow on Kaggle

Variational losses implemented by monte_carlo_variational_loss are defined in general as an expectation of some fn under the surrogate posterior,

loss = expectation(fn, surrogate_posterior)

where the expectation is estimated in practice using a finite sample_size number of samples:

zs = surrogate_posterior.sample(sample_size)
loss_estimate = 1 / sample_size * sum([fn(z) for z in z])

Gradient estimators define a stochastic estimate of the gradient of the above expectation with respect to the parameters of the surrogate posterior.

Members:

• SCORE_FUNCTION: Also known as REINFORCE  or the log-derivative gradient estimator . This estimator works with any surrogate posterior, but gradient estimates may be very noisy.
• REPARAMETERIZATION: Reparameterization gradients as introduced by Kingma and Welling . These require a continuous-valued surrogate that sets reparameterization_type=FULLY_REPARAMETERIZED (which must implement reparameterized sampling either directly or via implicit reparameterization ), and typically yield much lower-variance gradient estimates than the generic score function estimator.
• DOUBLY_REPARAMETERIZED: The doubly-reparameterized estimator presented by Tucker et al.  for importance-weighted bounds. Note that this includes the sticking-the-landing estimator developed by Roeder et al.  as a special case when importance_sample_size=1. Compared to 'vanilla' reparameterization, this can provide even lower-variance gradient estimates, but requires a copy of the surrogate posterior with no gradient to its parameters (passed to the loss as stopped_surrogate_posterior), and incurs an additional evaluation of the surrogate density at each step.
• VIMCO: An extension of the score-function estimator, introduced by Minh and Rezende , with reduced variance when importance_sample_size > 1.

 R. J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3-4), 229–256, 1992.

 Shakir Mohamed. Machine Learning Trick of the Day: Log Derivative Trick. 2015. https://blog.shakirm.com/2015/11/machine-learning-trick-of-the-day-5-log-derivative-trick/

 Diederik P. Kingma, and Max Welling. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013. https://arxiv.org/abs/1312.6114

 Michael Figurnov, Shakir Mohamed, and Andriy Mnih. Implicit reparameterization gradients. arXiv preprint arXiv:1805.08498, 2018. https://arxiv.org/abs/1805.08498

 George Tucker, Dieterich Lawson, Shixiang Gu, and Chris J. Maddison. Doubly reparameterized gradient estimators for Monte Carlo objectives. arXiv preprint arXiv:1810.04152, 2018. https://arxiv.org/abs/1810.04152

 Geoffrey Roeder, Yuhuai Wu, and David Duvenaud. Sticking the landing: Simple, lower-variance gradient estimators for variational inference. arXiv preprint arXiv:1703.09194, 2017. https://arxiv.org/abs/1703.09194

 Andriy Mnih and Danilo Rezende. Variational Inference for Monte Carlo objectives. In International Conference on Machine Learning, 2016. https://arxiv.org/abs/1602.06725