Lower bound on mutual information from [Barber and Agakov (2003)].
tfp.vi.mutual_information.lower_bound_barber_agakov(logu, entropy, name=None)
This method gives a lower bound on the mutual information I(X; Y) by replacing the unknown conditional p(x|y) with a variational decoder q(x|y). It requires the entropy of X, h(X), to be known. The lower bound was introduced in [Barber and Agakov (2003)].
I(X; Y) = E_p(x, y)[log( p(x|y) / p(x) )]
        = E_p(x, y)[log( q(x|y) / p(x) )] + E_p(y)[KL[ p(x|y) || q(x|y) ]]
        >= E_p(x, y)[log( q(x|y) )] + h(X)
        = I_[lower_bound_barber_agakov]
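The intermediate steps of the derivation above can be spelled out as follows. This is a sketch that assumes h(X) = -E_p(x)[log p(x)] and uses the non-negativity of the KL divergence. The second line multiplies and divides by q(x|y) inside the logarithm.

```latex
\begin{align*}
I(X; Y)
  &= \mathbb{E}_{p(x,y)}\left[\log \frac{p(x \mid y)}{p(x)}\right] \\
  &= \mathbb{E}_{p(x,y)}\left[\log \frac{q(x \mid y)}{p(x)}\right]
   + \mathbb{E}_{p(x,y)}\left[\log \frac{p(x \mid y)}{q(x \mid y)}\right] \\
  &= \mathbb{E}_{p(x,y)}\left[\log q(x \mid y)\right]
   - \mathbb{E}_{p(x)}\left[\log p(x)\right]
   + \mathbb{E}_{p(y)}\left[\mathrm{KL}\left[\, p(x \mid y) \,\|\, q(x \mid y) \,\right]\right] \\
  &\ge \mathbb{E}_{p(x,y)}\left[\log q(x \mid y)\right] + h(X)
\end{align*}
```

The bound is tight exactly when q(x|y) matches p(x|y) for almost every y, since that is when the KL term vanishes.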
Example: x and y are samples from a joint Gaussian distribution, with correlation 0.8 and both of dimension 1.
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

batch_size, rho, dim = 10000, 0.8, 1
y, eps = tf.split(
    value=tf.random.normal(shape=(2 * batch_size, dim), seed=7),
    num_or_size_splits=2, axis=0)
mean, conditional_stddev = rho * y, tf.sqrt(1. - tf.square(rho))
x = mean + conditional_stddev * eps

# Conditional distribution of p(x|y)
conditional_dist = tfd.MultivariateNormalDiag(
    mean, scale_diag=conditional_stddev * tf.ones((batch_size, dim)))

# Log-likelihoods log p(x[i] | y[i]) of pairs of joint samples `x[i], y[i]`
joint_scores = conditional_dist.log_prob(x)

# Differential entropy of `X` that is `1-D` Normal distributed.
entropy_x = 0.5 * np.log(2 * np.pi * np.e)

# Barber and Agakov lower bound on mutual information
tfp.vi.mutual_information.lower_bound_barber_agakov(
    logu=joint_scores, entropy=entropy_x)
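For this Gaussian setup the mutual information has a closed form, I(X; Y) = -0.5 * log(1 - rho^2), which is about 0.511 nats for rho = 0.8. This gives a reference value for the estimate. The following standard-library sketch re-creates the same setup without TensorFlow to illustrate the two regimes of the bound. It is not the library's implementation: `normal_log_prob` and `barber_agakov_bound` are hypothetical helpers written for this illustration, and the mis-specified decoder slope of 0.5 is an arbitrary choice.

```python
import math
import random


def normal_log_prob(x, mean, stddev):
    """Log-density of a 1-D Normal(mean, stddev) evaluated at x."""
    z = (x - mean) / stddev
    return -0.5 * z * z - math.log(stddev) - 0.5 * math.log(2.0 * math.pi)


def barber_agakov_bound(logu, entropy):
    """Monte Carlo estimate of E[log q(x|y)] + h(X) (hypothetical helper)."""
    return sum(logu) / len(logu) + entropy


rng = random.Random(7)
rho, num_samples = 0.8, 10000
cond_stddev = math.sqrt(1.0 - rho ** 2)

# Draw (x, y) pairs from a joint Gaussian with unit variances and correlation rho.
ys = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
xs = [rho * y + cond_stddev * rng.gauss(0.0, 1.0) for y in ys]

entropy_x = 0.5 * math.log(2.0 * math.pi * math.e)  # h(X) for X ~ Normal(0, 1)
true_mi = -0.5 * math.log(1.0 - rho ** 2)  # closed-form I(X; Y) for this model

# Exact decoder q(x|y) = p(x|y): the KL term vanishes, so the bound is tight.
tight = barber_agakov_bound(
    [normal_log_prob(x, rho * y, cond_stddev) for x, y in zip(xs, ys)],
    entropy_x)

# Mis-specified decoder (assumed slope 0.5 instead of rho): the KL gap loosens the bound.
loose = barber_agakov_bound(
    [normal_log_prob(x, 0.5 * y, cond_stddev) for x, y in zip(xs, ys)],
    entropy_x)

print(f"true MI={true_mi:.3f}  exact decoder={tight:.3f}  wrong decoder={loose:.3f}")
```

Because the estimate is a sample average, the exact-decoder value can land slightly above or below the closed-form mutual information. The inequality holds in expectation, not for every finite batch.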
[Barber and Agakov (2003)]: David Barber, Felix V. Agakov. The IM algorithm: a variational approach to Information Maximization. In Conference on Neural Information Processing Systems, 2003.