tf.contrib.distributions.TransformedDistribution

A Transformed Distribution.

Inherits From: `Distribution`

A `TransformedDistribution` models `p(y)` given a base distribution `p(x)`, and a deterministic, invertible, differentiable transform, `Y = g(X)`. The transform is typically an instance of the `Bijector` class and the base distribution is typically an instance of the `Distribution` class.

A `Bijector` is expected to implement the following functions:

• `forward`,
• `inverse`,
• `inverse_log_det_jacobian`.

The semantics of these functions are outlined in the `Bijector` documentation.

We now describe how a `TransformedDistribution` alters the inputs and outputs of a `Distribution` associated with a random variable (rv) `X`.

Write `cdf(Y=y)` for an absolutely continuous cumulative distribution function of random variable `Y`; write the probability density function `pdf(Y=y) := d^k / (dy_1,...,dy_k) cdf(Y=y)` for its derivative with respect to `Y` evaluated at `y`. Assume that `Y = g(X)` where `g` is a deterministic diffeomorphism, i.e., a non-random, continuous, differentiable, and invertible function. Write the inverse of `g` as `X = g^{-1}(Y)` and `(J o g)(x)` for the Jacobian of `g` evaluated at `x`.

A `TransformedDistribution` implements the following operations:

• `sample` Mathematically: `Y = g(X)` Programmatically: `bijector.forward(distribution.sample(...))`

• `log_prob` Mathematically: `(log o pdf)(Y=y) = (log o pdf o g^{-1})(y) + (log o abs o det o J o g^{-1})(y)` Programmatically: `distribution.log_prob(bijector.inverse(y)) + bijector.inverse_log_det_jacobian(y)`

• `log_cdf` Mathematically: `(log o cdf)(Y=y) = (log o cdf o g^{-1})(y)` Programmatically: `distribution.log_cdf(bijector.inverse(y))`

• and similarly for: `cdf`, `prob`, `log_survival_function`, `survival_function`.
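The `log_prob` rule above can be checked by hand for the simplest case, `g(x) = exp(x)` applied to a standard Normal. The sketch below is plain Python rather than the library itself; the functions `forward`, `inverse`, and `inverse_log_det_jacobian` are hypothetical stand-ins that only mirror the names of the `Bijector` methods, and the closed-form Log-Normal density is used purely as a reference value.

```python
import math


def normal_log_prob(x, loc=0.0, scale=1.0):
    """Log-density of a univariate Normal(loc, scale) at x."""
    z = (x - loc) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2.0 * math.pi)


# Hypothetical stand-ins for a bijector with g(x) = exp(x).
def forward(x):
    return math.exp(x)


def inverse(y):
    return math.log(y)


def inverse_log_det_jacobian(y):
    # d/dy log(y) = 1/y, so log|det J_{g^{-1}}(y)| = -log(y).
    return -math.log(y)


def transformed_log_prob(y):
    """log pdf(Y=y) = log pdf_X(g^{-1}(y)) + log|det J_{g^{-1}}(y)|."""
    return normal_log_prob(inverse(y)) + inverse_log_det_jacobian(y)


def log_normal_log_prob(y, mu=0.0, sigma=1.0):
    """Closed-form log-density of LogNormal(mu, sigma), for comparison."""
    return (-math.log(y * sigma * math.sqrt(2.0 * math.pi))
            - (math.log(y) - mu) ** 2 / (2.0 * sigma ** 2))


y = forward(0.3)  # A "sample": push a base value through g.
print(transformed_log_prob(y), log_normal_log_prob(y))
```

The two printed values should agree up to floating-point rounding, which is the change-of-variables identity in its scalar form.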

A simple example constructing a Log-Normal distribution from a Normal distribution:

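``````
import tensorflow_probability as tfp

ds = tfp.distributions
tfb = tfp.bijectors  # Bijectors live in `tfp.bijectors`.

# Y = exp(X) with X ~ Normal(0, 1) is Log-Normal.
log_normal = ds.TransformedDistribution(
    distribution=ds.Normal(loc=0., scale=1.),
    bijector=tfb.Exp(),
    name="LogNormalTransformedDistribution")
``````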

A `LogNormal` made from callables:

``````
import tensorflow as tf
import tensorflow_probability as tfp

ds = tfp.distributions
tfb = tfp.bijectors  # Bijectors live in `tfp.bijectors`.

log_normal = ds.TransformedDistribution(
    distribution=ds.Normal(loc=0., scale=1.),
    bijector=tfb.Inline(
        forward_fn=tf.exp,
        inverse_fn=tf.math.log,
        inverse_log_det_jacobian_fn=(
            lambda y: -tf.reduce_sum(tf.math.log(y), axis=-1))),
    name="LogNormalTransformedDistribution")
``````

Another example constructing a Normal from a StandardNormal:

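``````
import tensorflow_probability as tfp

ds = tfp.distributions
tfb = tfp.bijectors  # Bijectors live in `tfp.bijectors`.

# Y = 2 * X - 1 with X ~ Normal(0, 1).
normal = ds.TransformedDistribution(
    distribution=ds.Normal(loc=0., scale=1.),
    bijector=tfb.Affine(
        shift=-1.,
        scale_identity_multiplier=2.),
    name="NormalTransformedDistribution")
``````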

A `TransformedDistribution`'s batch- and event-shape are implied by the base distribution unless explicitly overridden by `batch_shape` or `event_shape` arguments. Specifying an overriding `batch_shape` (`event_shape`) is permitted only if the base distribution has scalar batch-shape (event-shape). The bijector is applied to the distribution as if the distribution possessed the overridden shape(s). The following example demonstrates how to construct a multivariate Normal as a `TransformedDistribution`.

``````
import tensorflow_probability as tfp

ds = tfp.distributions
tfb = tfp.bijectors  # Bijectors live in `tfp.bijectors`.

# We will create two MVNs with batch_shape = event_shape = 2.
mean = [[-1., 0],      # batch:0
        [0., 1]]       # batch:1
chol_cov = [[[1., 0],
             [0, 1]],  # batch:0
            [[1, 0],
             [2, 2]]]  # batch:1
mvn1 = ds.TransformedDistribution(
    distribution=ds.Normal(loc=0., scale=1.),
    bijector=tfb.Affine(shift=mean, scale_tril=chol_cov),
    batch_shape=[2],  # Valid because base_distribution.batch_shape == [].
    event_shape=[2])  # Valid because base_distribution.event_shape == [].
mvn2 = ds.MultivariateNormalTriL(loc=mean, scale_tril=chol_cov)
# mvn1.log_prob(x) == mvn2.log_prob(x)
``````
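The equality claimed in the last comment of the example above can be verified without the library for a single batch member. The following is a minimal plain-Python sketch restricted to the 2x2 case; the helper names `affine_inverse`, `transformed_log_prob`, and `mvn_log_prob` are hypothetical and are not part of any API. It treats the base distribution as two independent standard Normals (the overridden `event_shape=[2]`) and applies the inverse affine map plus its log-determinant.

```python
import math


def std_normal_log_prob(z):
    """Log-density of a scalar standard Normal at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def affine_inverse(y, shift, tril):
    """Solve tril @ z = y - shift by forward substitution (2x2 case)."""
    d0, d1 = y[0] - shift[0], y[1] - shift[1]
    z0 = d0 / tril[0][0]
    z1 = (d1 - tril[1][0] * z0) / tril[1][1]
    return [z0, z1]


def transformed_log_prob(y, shift, tril):
    """Base log-density at g^{-1}(y) plus the inverse log-det-Jacobian."""
    z = affine_inverse(y, shift, tril)
    # For y = tril @ z + shift, log|det J_{g^{-1}}| = -sum(log|diag(tril)|).
    ildj = -(math.log(abs(tril[0][0])) + math.log(abs(tril[1][1])))
    return sum(std_normal_log_prob(zi) for zi in z) + ildj


def mvn_log_prob(y, mean, tril):
    """Direct bivariate Normal log-density with covariance tril @ tril^T."""
    a, b, c = tril[0][0], tril[1][0], tril[1][1]
    s00, s01, s11 = a * a, a * b, b * b + c * c
    det = s00 * s11 - s01 * s01
    d0, d1 = y[0] - mean[0], y[1] - mean[1]
    quad = (s11 * d0 * d0 - 2.0 * s01 * d0 * d1 + s00 * d1 * d1) / det
    return -0.5 * quad - 0.5 * math.log((2.0 * math.pi) ** 2 * det)


# Parameters of "batch:1" from the example above.
mean_1 = [0.0, 1.0]
tril_1 = [[1.0, 0.0], [2.0, 2.0]]
point = [0.5, -1.0]
print(transformed_log_prob(point, mean_1, tril_1),
      mvn_log_prob(point, mean_1, tril_1))
```

Both numbers should match up to rounding, which is what makes the affine-transformed scalar Normal interchangeable with the multivariate Normal in this case.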

Args
`distribution` The base distribution instance to transform. Typically an instance of `Distribution`.
`bijector` The object responsible for calculating the transformation. Typically an instance of `Bijector`. `None` means `Identity()`.
`batch_shape` `integer` vector `Tensor` which overrides `distribution` `batch_shape`; valid only if `distribution.is_scalar_batch()`.
`event_shape` `integer` vector `Tensor` which overrides `distribution` `event_shape`; valid only if `distribution.is_scalar_event()`.
`validate_args` Python `bool`, default `False`. When `True` distribution parameters are checked for validity despite possibly degrading runtime performance. When `False` invalid inputs may silently render incorrect outputs.
`name` Python `str` name prefixed to Ops created by this class. Default: `bijector.name + distribution.name`.

Attributes
`allow_nan_stats` Python `bool` describing behavior when a stat is undefined.

Stats return +/- infinity when it makes sense. E.g., the variance of a Cauchy distribution is infinity. However, sometimes the statistic is undefined, e.g., if a distribution's pdf does not achieve a maximum within the support of the distribution, the mode is undefined. If the mean is undefined, then by definition the variance is undefined. E.g. the mean for Student's T for df = 1 is undefined (no clear way to say it is either + or - infinity), so the variance = E[(X - mean)**2] is also undefined.

`batch_shape` Shape of a single sample from a single event index as a `TensorShape`.

May be partially defined or unknown.

The batch dimensions are indexes into independent, non-identical parameterizations of this distribution.

`bijector` Function transforming x => y.
`distribution` Base distribution, p(x).
`dtype` The `DType` of `Tensor`s handled by this `Distribution`.
`event_shape` Shape of a single sample from a single batch as a `TensorShape`.

May be partially defined or unknown.

`name` Name prepended to all ops created by this `Distribution`.
`parameters` Dictionary of parameters used to instantiate this `Distribution`.
`reparameterization_type` Describes how samples from the distribution are reparameterized.

Currently this is one of the static instances `distributions.FULLY_REPARAMETERIZED` or `distributions.NOT_REPARAMETERIZED`.

`validate_args` Python `bool` indicating possibly expensive checks are enabled.

Methods

`batch_shape_tensor`

Shape of a single sample from a single event index as a 1-D `Tensor`.

The batch dimensions are indexes into independent, non-identical parameterizations of this distribution.

Args
`name` name to give to the op

Returns
`batch_shape` `Tensor`.

`cdf`

Cumulative distribution function.

Given random variable `X`, the cumulative distribution function `cdf` is:

``````
cdf(x) := P[X <= x]
``````

Args
`value` `float` or `double` `Tensor`.
`name` Python `str` prepended to names of ops created by this function.

Returns
`cdf` a `Tensor` of shape `sample_shape(x) + self.batch_shape` with values of type `self.dtype`.
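For a transformed distribution, the definition above reduces to evaluating the base `cdf` at the inverse image, provided the transform is increasing. The sketch below illustrates this in plain Python for the Log-Normal case `Y = exp(X)` with `X` standard Normal; `normal_cdf` and `transformed_cdf` are hypothetical helper names, not library functions.

```python
import math


def normal_cdf(x, loc=0.0, scale=1.0):
    """P[X <= x] for X ~ Normal(loc, scale), via the error function."""
    return 0.5 * (1.0 + math.erf((x - loc) / (scale * math.sqrt(2.0))))


def transformed_cdf(y):
    """P[Y <= y] for Y = exp(X); assumes y > 0 and an increasing transform."""
    # exp is increasing, so {exp(X) <= y} is the same event as {X <= log(y)}.
    return normal_cdf(math.log(y))


for y in (0.5, 1.0, 2.0):
    print(y, transformed_cdf(y))
```

For a decreasing transform the event flips to `{X >= g^{-1}(y)}`, so this shortcut would no longer apply as written.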

`copy`

Creates a deep copy of the distribution.

Args
`**override_parameters_kwargs` String/value dictionary of initialization arguments to override with new values.

Returns
`distribution` A new instance of `type(self)` initialized from the union of self.parameters and override_parameters_kwargs, i.e., `dict(self.parameters, **override_parameters_kwargs)`.
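The override rule in the `Returns` entry is ordinary dictionary merging. A minimal sketch, using a hypothetical `ToyDistribution` class that only records its constructor arguments (it is not a real `Distribution`):

```python
class ToyDistribution:
    """Hypothetical stand-in that stores its constructor arguments."""

    def __init__(self, loc=0.0, scale=1.0, name="Toy"):
        self.parameters = dict(loc=loc, scale=scale, name=name)

    def copy(self, **override_parameters_kwargs):
        # Overrides win over stored parameters when keys collide.
        parameters = dict(self.parameters, **override_parameters_kwargs)
        return type(self)(**parameters)


base = ToyDistribution(loc=1.0, scale=2.0)
wider = base.copy(scale=5.0)
print(base.parameters)
print(wider.parameters)
```

The original object is left unchanged; only the new instance sees the overridden `scale`.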

`covariance`

Covariance.

Covariance is (possibly) defined only for non-scalar-event distributions.

For example, for a length-`k`, vector-valued distribution, it is calculated as,

``````
Cov[i, j] = Covariance(X_i, X_j) = E[(X_i - E[X_i]) (X_j - E[X_j])]
``````

where `Cov` is a (batch of) `k x k` matrix, `0 <= (i, j) < k`, and `E` denotes expectation.

Alternatively, for non-vector, multivariate distributions (e.g., matrix-valued, Wishart), `Covariance` shall return a (batch of) matrices under some vectorization of the events, i.e.,

``````
Cov[i, j] = Covariance(Vec(X)_i, Vec(X)_j) = [as above]
``````

where `Cov` is a (batch of) `k' x k'` matrix, `0 <= (i, j) < k' = reduce_prod(event_shape)`, and `Vec` is some function mapping indices of this distribution's event dimensions to indices of a length-`k'` vector.

Args
`name` Python `str` prepended to names of ops created by this function.

Returns
`covariance` Floating-point `Tensor` with shape `[B1, ..., Bn, k', k']` where the first `n` dimensions are batch coordinates and `k' = reduce_prod(self.event_shape)`.
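To make the vectorized case concrete, the sketch below computes the exact covariance of a toy distribution that puts equal mass on two 2x2 matrices. Row-major flattening is assumed for `Vec`, which is only one possible choice; the helpers `vec` and `covariance` are hypothetical and written in plain Python.

```python
def vec(event):
    """Row-major flattening of a matrix-valued event into a length-k' list."""
    return [entry for row in event for entry in row]


def covariance(outcomes):
    """Exact covariance of a uniform distribution over the given vectors."""
    n, k = len(outcomes), len(outcomes[0])
    mean = [sum(x[i] for x in outcomes) / n for i in range(k)]
    return [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in outcomes) / n
             for j in range(k)]
            for i in range(k)]


# Two equally likely matrix-valued events with event_shape = [2, 2].
matrix_outcomes = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[3.0, 0.0], [2.0, 1.0]],
]
cov = covariance([vec(m) for m in matrix_outcomes])
for row in cov:
    print(row)
```

The result has shape `k' x k'` with `k' = 2 * 2 = 4`, and only the entries that vary across outcomes (flattened indices 0 and 2) contribute non-zero covariance.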

`cross_entropy`

Computes the (Shannon) cross entropy.

Denote this distribution (`self`) by `P` and the `other` distribution by `Q`. Assuming `P, Q` are absolutely continuous with respect to one another and permit densities `p(x) dr(x)` and `q(x) dr(x)`, (Shannon) cross entropy is defined as:

``````
H[P, Q] = E_p[-log q(X)] = -int_F p(x) log q(x) dr(x)
``````

where `F` denotes the support of the random variable `X ~ P`.

Args
`other` `tfp.distributions.Distribution` instance.
`name` Python `str` prepended to names of ops created by this function.

Returns
`cross_entropy` `self.dtype` `Tensor` with shape `[B1, ..., Bn]` representing `n` different calculations of (Shannon) cross entropy.
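For intuition, here is the same definition on a finite support, where the reference measure `r` is the counting measure and the integral becomes a sum. This is a plain-Python sketch with a hypothetical `cross_entropy` helper, not the library method; the probability vectors are made up for illustration.

```python
import math


def cross_entropy(p, q):
    """H[P, Q] = -sum_x p(x) log q(x), summed over the support of P."""
    return -sum(px * math.log(qx) for px, qx in zip(p, q) if px > 0.0)


p = [0.5, 0.25, 0.25]
q = [0.25, 0.25, 0.5]
h_pq = cross_entropy(p, q)  # Cross entropy of P relative to Q, in nats.
h_pp = cross_entropy(p, p)  # Reduces to the entropy of P.
print(h_pq, h_pp)
```

Because `q` differs from `p`, `h_pq` comes out larger than `h_pp`; the gap between them is the KL divergence.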

`entropy`

Shannon entropy in nats.

`event_shape_tensor`

Shape of a single sample from a single batch as a 1-D int32 `Tensor`.

Args
`name` name to give to the op

Returns
`event_shape` `Tensor`.

`is_scalar_batch`

Indicates that `batch_shape == []`.

Args
`name` Python `str` prepended to names of ops created by this function.

Returns
`is_scalar_batch` `bool` scalar `Tensor`.

`is_scalar_event`

Indicates that `event_shape == []`.

Args
`name` Python `str` prepended to names of ops created by this function.

Returns
`is_scalar_event` `bool` scalar `Tensor`.

`kl_divergence`

Computes the Kullback-Leibler divergence.

Denote this distribution (`self`) by `p` and the `other` distribution by `q`. Assuming `p, q` are absolutely continuous with respect to reference measure `r`, the KL divergence is defined as:

``````
KL[p, q] = E_p[log(p(X)/q(X))]
         = -int_F p(x) log q(x) dr(x) + int_F p(x) log p(x) dr(x)
         = H[p, q] - H[p]
``````

where `F` denotes the support of the random variable `X ~ p`, `H[., .]` denotes (Shannon) cross entropy, and `H[.]` denotes (Shannon) entropy.

Args
`other` `tfp.distributions.Distribution` instance.
`name` Python `str` prepended to names of ops created by this function.

Returns
`kl_divergence` `self.dtype` `Tensor` with shape `[B1, ..., Bn]` representing `n` different calculations of the Kullback-Leibler divergence.
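The decomposition `KL[p, q] = H[p, q] - H[p]` can be checked against the well-known closed forms for two univariate Normals. The sketch below uses plain Python and hypothetical helper names; it does not call the library's `kl_divergence`, and the parameter values are arbitrary.

```python
import math


def normal_entropy(scale):
    """H[p] for p = Normal(loc, scale); independent of loc."""
    return 0.5 * math.log(2.0 * math.pi * math.e * scale ** 2)


def normal_cross_entropy(loc_p, scale_p, loc_q, scale_q):
    """H[p, q] = E_p[-log q(X)] for two univariate Normals."""
    return (0.5 * math.log(2.0 * math.pi * scale_q ** 2)
            + (scale_p ** 2 + (loc_p - loc_q) ** 2) / (2.0 * scale_q ** 2))


def normal_kl(loc_p, scale_p, loc_q, scale_q):
    """Closed-form KL[p, q] for two univariate Normals."""
    return (math.log(scale_q / scale_p)
            + (scale_p ** 2 + (loc_p - loc_q) ** 2) / (2.0 * scale_q ** 2)
            - 0.5)


kl = normal_kl(0.0, 1.0, 1.0, 2.0)
h_pq = normal_cross_entropy(0.0, 1.0, 1.0, 2.0)
h_p = normal_entropy(1.0)
print(kl, h_pq - h_p)
```

The two printed values should coincide up to rounding, and the divergence is zero when both distributions share the same parameters.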