tfp.bijectors.RealNVP

RealNVP 'affine coupling layer' for vector-valued events.

Inherits From: Bijector

tfp.bijectors.RealNVP(
    num_masked=None,
    fraction_masked=None,
    shift_and_log_scale_fn=None,
    bijector_fn=None,
    is_constant_jacobian=False,
    validate_args=False,
    name=None
)

Real NVP models a normalizing flow on a D-dimensional distribution via a single D-d-dimensional conditional distribution [(Dinh et al., 2017)][1]:

y[d:D] = x[d:D] * tf.exp(log_scale_fn(x[0:d])) + shift_fn(x[0:d])
y[0:d] = x[0:d]

The last D-d units are scaled and shifted based on the first d units only, while the first d units are 'masked' and left unchanged. Real NVP's shift_and_log_scale_fn computes vector-valued quantities. For scale-and-shift transforms that do not depend on any masked units, i.e. d=0, use the tfb.Scale and tfb.Shift bijectors with learned parameters instead.
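
The coupling transform described above can be sketched in plain Python, without TensorFlow. This is a minimal illustration under stated assumptions, not the library's implementation: `toy_shift_and_log_scale` is a hypothetical stand-in for a learned `shift_and_log_scale_fn`, and all helper names are invented for this example.

```python
import math


def toy_shift_and_log_scale(x_masked, output_units):
    """Hypothetical stand-in for a learned shift_and_log_scale_fn."""
    s = sum(x_masked)
    shift = [0.5 * s + i for i in range(output_units)]
    log_scale = [0.1 * s - 0.2 * i for i in range(output_units)]
    return shift, log_scale


def coupling_forward(x, d, fn=toy_shift_and_log_scale):
    """y[0:d] = x[0:d]; y[d:D] = x[d:D] * exp(log_scale) + shift."""
    shift, log_scale = fn(x[:d], len(x) - d)
    tail = [xi * math.exp(ls) + sh
            for xi, ls, sh in zip(x[d:], log_scale, shift)]
    return x[:d] + tail


def coupling_inverse(y, d, fn=toy_shift_and_log_scale):
    """Inverts coupling_forward; the masked units satisfy y[0:d] == x[0:d]."""
    shift, log_scale = fn(y[:d], len(y) - d)
    tail = [(yi - sh) * math.exp(-ls)
            for yi, ls, sh in zip(y[d:], log_scale, shift)]
    return y[:d] + tail


def coupling_forward_log_det_jacobian(x, d, fn=toy_shift_and_log_scale):
    """The Jacobian is triangular, so its log-determinant is sum(log_scale)."""
    _, log_scale = fn(x[:d], len(x) - d)
    return sum(log_scale)


x = [0.3, -1.2, 2.0, 0.7]
y = coupling_forward(x, d=2)
x_roundtrip = coupling_inverse(y, d=2)
fldj = coupling_forward_log_det_jacobian(x, d=2)
```

Because the shift and log-scale depend only on the unchanged first `d` units, the inverse recomputes them from `y[0:d]` and undoes the affine map in a single pass.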

Masking is currently only supported for base distributions with event_ndims=1. For more sophisticated masking schemes like checkerboard or channel-wise masking [(Papamakarios et al., 2017)][4], use the tfb.Permute bijector to re-order the desired masked units into the first d units. For base distributions with event_ndims > 1, use the tfb.Reshape bijector to flatten the event shape.
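
The re-ordering idea can be pictured with a plain-Python sketch that moves an alternating set of units to the front, so that a "first d units" mask selects them. The `permutation` list and the helper names are assumptions made for illustration; in TFP the re-ordering would be delegated to tfb.Permute rather than done by hand.

```python
def permute(values, permutation):
    """Returns values reordered so that output[i] = values[permutation[i]]."""
    return [values[p] for p in permutation]


def invert_permutation(permutation):
    """Returns the permutation that undoes `permutation`."""
    inverse = [0] * len(permutation)
    for new_index, old_index in enumerate(permutation):
        inverse[old_index] = new_index
    return inverse


D = 6
# Hypothetical alternating mask: the even-indexed units should be masked.
masked_indices = [i for i in range(D) if i % 2 == 0]
free_indices = [i for i in range(D) if i % 2 == 1]
permutation = masked_indices + free_indices
d = len(masked_indices)

x = [10, 11, 12, 13, 14, 15]
x_permuted = permute(x, permutation)  # masked units now occupy x_permuted[0:d]
x_restored = permute(x_permuted, invert_permutation(permutation))
```

After the coupling layer has acted on the permuted event, applying the inverse permutation restores the original unit order.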

Recall that the MAF bijector [(Papamakarios et al., 2017)][4] implements a normalizing flow via an autoregressive transformation. MAF and IAF [5] have opposite computational tradeoffs: MAF can train all units in parallel but must sample units sequentially, while IAF must train units sequentially but can sample in parallel. In contrast, Real NVP can compute both the forward and inverse transformations in parallel. However, the lack of an autoregressive transformation makes it less expressive on a per-bijector basis.

A 'valid' shift_and_log_scale_fn must compute each shift (aka loc or 'mu' in [Papamakarios et al. (2017)][4]) and log(scale) (aka 'alpha' in [Papamakarios et al. (2017)][4]) such that each is broadcastable with the arguments to forward and inverse, i.e., such that the calculations in forward and inverse are possible. For convenience, real_nvp_default_template is offered as a possible shift_and_log_scale_fn function.

NICE [(Dinh et al., 2014)][2] is a special case of the Real NVP bijector which discards the scale transformation, resulting in a constant-time inverse-log-determinant-Jacobian. To use a NICE bijector instead of Real NVP, shift_and_log_scale_fn should return (shift, None), and is_constant_jacobian should be set to True in the RealNVP constructor. Calling real_nvp_default_template with shift_only=True returns one such NICE-compatible shift_and_log_scale_fn.
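
The NICE special case can be sketched in plain Python as a shift-only coupling. This is an illustration under stated assumptions rather than the library's code: `toy_shift_only_fn` is a hypothetical function that returns `(shift, None)` in the way described above, and the other helper names are likewise invented.

```python
def toy_shift_only_fn(x_masked, output_units):
    """Hypothetical NICE-style function: returns (shift, None)."""
    shift = [2.0 * sum(x_masked)] * output_units
    return shift, None


def nice_forward(x, d, fn=toy_shift_only_fn):
    """y[0:d] = x[0:d]; y[d:D] = x[d:D] + shift (no scaling)."""
    shift, log_scale = fn(x[:d], len(x) - d)
    assert log_scale is None  # shift-only coupling
    return x[:d] + [xi + sh for xi, sh in zip(x[d:], shift)]


def nice_inverse(y, d, fn=toy_shift_only_fn):
    """x[d:D] = y[d:D] - shift, with the shift recomputed from y[0:d]."""
    shift, _ = fn(y[:d], len(y) - d)
    return y[:d] + [yi - sh for yi, sh in zip(y[d:], shift)]


def nice_log_det_jacobian():
    """An additive coupling has unit Jacobian determinant for every input."""
    return 0.0


x = [1.0, 0.5, -2.0]
y = nice_forward(x, d=1)
x_roundtrip = nice_inverse(y, d=1)
```

Since the log-determinant does not depend on the input, it never needs to be recomputed, which is what `is_constant_jacobian=True` declares to the bijector.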

The bijector_fn argument allows specifying a more general coupling relation, such as the LSTM-inspired activation from [5], or Neural Spline Flow [6].

Caching: the scalar input depth D of the base distribution is not known at construction time. The first call to any of forward(x), inverse(x), inverse_log_det_jacobian(x), or forward_log_det_jacobian(x) memoizes D, which is re-used in subsequent calls. This shape must be known prior to graph execution (which is the case if using tf.layers).

Examples

import tensorflow_probability as tfp

tfd = tfp.distributions
tfb = tfp.bijectors

# A common choice for a normalizing flow is to use a Gaussian for the base
# distribution. (However, any continuous distribution would work.) E.g.,
nvp = tfd.TransformedDistribution(
    distribution=tfd.MultivariateNormalDiag(loc=[0., 0., 0.]),
    bijector=tfb.RealNVP(
        num_masked=2,
        shift_and_log_scale_fn=tfb.real_nvp_default_template(
            hidden_layers=[512, 512])))

x = nvp.sample()               # Draw one 3-dimensional sample.
nvp.log_prob(x)                # Log-density at the sample.
nvp.log_prob([0.0, 0.0, 0.0])  # Log-density at the origin.

For more examples, see [Jang (2018)][3].

References

[1]: Laurent Dinh, Jascha Sohl-Dickstein, and Samy Bengio. Density Estimation using Real NVP. In International Conference on Learning Representations, 2017. https://arxiv.org/abs/1605.08803

[2]: Laurent Dinh, David Krueger, and Yoshua Bengio. NICE: Non-linear Independent Components Estimation. arXiv preprint arXiv:1410.8516, 2014. https://arxiv.org/abs/1410.8516

[3]: Eric Jang. Normalizing Flows Tutorial, Part 2: Modern Normalizing Flows. Technical Report, 2018. http://blog.evjang.com/2018/01/nf2.html

[4]: George Papamakarios, Theo Pavlakou, and Iain Murray. Masked Autoregressive Flow for Density Estimation. In Neural Information Processing Systems, 2017. https://arxiv.org/abs/1705.07057

[5]: Diederik P. Kingma, Tim Salimans, and Max Welling. Improving Variational Inference with Inverse Autoregressive Flow. In Neural Information Processing Systems, 2016. https://arxiv.org/abs/1606.04934

[6]: Conor Durkan, Artur Bekasov, Iain Murray, and George Papamakarios. Neural Spline Flows. arXiv preprint arXiv:1906.04032, 2019. https://arxiv.org/abs/1906.04032

Args
`num_masked`	Python `int`, indicating the number of units of the event that should be masked. Must be in the closed interval `[0, D-1]`, where `D` is the event size of the base distribution. If the value is negative, then the last `d` units of the event are masked instead. Must be `None` if `fraction_masked` is defined.
`fraction_masked`	Python `float`, indicating the fraction of units of the event that should be masked. Must be in the closed interval `[-1, 1]`. The final number of values to be masked will be the input size times the fraction, rounded to the nearest integer towards zero. If negative, then the last fraction of units are masked instead. Must be `None` if `num_masked` is defined.
`shift_and_log_scale_fn`	Python `callable` which computes `shift` and `log_scale` from both the forward domain (`x`) and the inverse domain (`y`). The calculation must depend only on the masked units (see class docstring). Suggested default: `real_nvp_default_template(hidden_layers=...)`. Typically the function contains `tf.Variable`s and is wrapped using `tf.make_template`. Returning `None` for either or both of `shift` and `log_scale` is equivalent to (but more efficient than) returning zero.
`bijector_fn`	Python `callable` which returns a `tfb.Bijector` that transforms the last `D-d` units, with the signature `(masked_units_tensor, output_units, **condition_kwargs) -> bijector`. The bijector must operate on scalar or vector events and must not alter the rank of its input.
`is_constant_jacobian`	Python `bool`. Default: `False`. When `True` the implementation assumes `log_scale` does not depend on the forward domain (`x`) or inverse domain (`y`) values. (No validation is made; `is_constant_jacobian=False` is always safe but possibly computationally inefficient.)
`validate_args`	Python `bool` indicating whether arguments should be checked for correctness.
`name`	Python `str`, name given to ops managed by this object.
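
How `num_masked` and `fraction_masked` translate into a mask size and side can be sketched in plain Python. The helper `masked_layout` is hypothetical and mirrors only the argument descriptions above, not the library's internal code. Requiring exactly one of the two arguments is an assumption of this sketch, and the `[0, D-1]` range check is omitted for brevity.

```python
def masked_layout(event_size, num_masked=None, fraction_masked=None):
    """Returns (d, mask_last) for an event of `event_size` units.

    Hypothetical helper that follows the argument descriptions, not the
    library's implementation.
    """
    if (num_masked is None) == (fraction_masked is None):
        # Assumption: exactly one of the two arguments must be given.
        raise ValueError('Specify exactly one of num_masked, fraction_masked.')
    if fraction_masked is not None:
        if not -1.0 <= fraction_masked <= 1.0:
            raise ValueError('fraction_masked must be in [-1, 1].')
        mask_last = fraction_masked < 0
        # int() truncates, i.e. rounds toward zero.
        d = abs(int(event_size * fraction_masked))
    else:
        mask_last = num_masked < 0
        d = abs(num_masked)
    return d, mask_last


layouts = [
    masked_layout(5, num_masked=2),
    masked_layout(5, num_masked=-2),
    masked_layout(5, fraction_masked=0.5),
    masked_layout(5, fraction_masked=-0.5),
]
```

Each call masks two of the five units; the negative variants mask the last two instead of the first two.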

Raises
`ValueError`	If both or neither of `shift_and_log_scale_fn` and `bijector_fn` are specified.

Attributes
`dtype`
`forward_min_event_ndims`	Returns the minimal number of dimensions bijector.forward operates on. Multipart bijectors return structured `ndims`, which indicates the expected structure of their inputs. Some multipart bijectors, notably Composites, may return structures of `None`.
`graph_parents`	Returns this `Bijector`'s graph_parents as a Python list.
`inverse_min_event_ndims`	Returns the minimal number of dimensions bijector.inverse operates on. Multipart bijectors return structured `event_ndims`, which indicates the expected structure of their outputs. Some multipart bijectors, notably Composites, may return structures of `None`.
`is_constant_jacobian`	Returns true iff the Jacobian matrix is not a function of x. Note: Jacobian matrix is either constant for both forward and inverse or neither.
`name`	Returns the string name of this `Bijector`.
`name_scope`	Returns a `tf.name_scope` instance for this class.
`non_trainable_variables`	Sequence of non-trainable variables owned by this module and its submodules. Note: this method uses reflection to find variables on the current instance and submodules. For performance reasons you may wish to cache the result of calling this method if you don't expect the return value to change.
`parameters`	Dictionary of parameters used to instantiate this `Bijector`.
`submodules`	Sequence of all sub-modules. Submodules are modules which are properties of this module, or found as properties of modules which are properties of this module (and so on). Example: with `a = tf.Module()`, `b = tf.Module()`, `c = tf.Module()`, `a.b = b`, and `b.c = c`, the expressions `list(a.submodules) == [b, c]`, `list(b.submodules) == [c]`, and `list(c.submodules) == []` all evaluate to `True`.
`trainable_variables`	Sequence of trainable variables owned by this module and its submodules. Note: this method uses reflection to find variables on the current instance and submodules. For performance reasons you may wish to cache the result of calling this method if you don't expect the return value to change.
`validate_args`	Returns True if Tensor arguments will be validated.
`variables`	Sequence of variables owned by this module and its submodules. Note: this method uses reflection to find variables on the current instance and submodules. For performance reasons you may wish to cache the result of calling this method if you don't expect the return value to change.

Args
`x_event_ndims`	Optional Python `int` (structure) number of dimensions in a probabilistic event passed to `forward`; this must be greater than or equal to `self.forward_min_event_ndims`. If `None`, defaults to `self.forward_min_event_ndims`. Mutually exclusive with `y_event_ndims`. Default value: `None`.
`y_event_ndims`	Optional Python `int` (structure) number of dimensions in a probabilistic event passed to `inverse`; this must be greater than or equal to `self.inverse_min_event_ndims`. Mutually exclusive with `x_event_ndims`. Default value: `None`.

Args
`x`	`Tensor` (structure). The point at which to calculate the density.
`tangent_space`	`TangentSpace` or one of its subclasses. The tangent to the support manifold at `x`.
`backward_compat`	`bool` specifying whether to assume that the Bijector is dimension-preserving.
`**kwargs`	Optional keyword arguments forwarded to tangent space methods.

Args
`x`	`Tensor` (structure). The input to the 'forward' evaluation.
`name`	The name to give this op.
`**kwargs`	Named arguments forwarded to subclass implementation.

Raises
`TypeError`	if `self.dtype` is specified and `x.dtype` is not `self.dtype`.
`NotImplementedError`	if `_forward` is not implemented.

Args
`event_ndims`	Structure of Python and/or Tensor `int`s, and/or `None` values. The structure should match that of `self.forward_min_event_ndims`, and all non-`None` values must be greater than or equal to the corresponding value in `self.forward_min_event_ndims`.
`**kwargs`	Optional keyword arguments forwarded to nested bijectors.

Args
`input_shape`	`Tensor`, `int32` vector (structure) indicating event-portion shape passed into `forward` function.
`name`	name to give to the op

Args
`x`	`Tensor` (structure). The input to the 'forward' Jacobian determinant evaluation.
`event_ndims`	Optional number of dimensions in the probabilistic events being transformed; this must be greater than or equal to `self.forward_min_event_ndims`. If `event_ndims` is specified, the log Jacobian determinant is summed to produce a scalar log-determinant for each event. Otherwise (if `event_ndims` is `None`), no reduction is performed. Multipart bijectors require structured event_ndims, such that the batch rank `rank(y[i]) - event_ndims[i]` is the same for all elements `i` of the structured input. In most cases (with the exception of `tfb.JointMap`) they further require that `event_ndims[i] - self.inverse_min_event_ndims[i]` is the same for all elements `i` of the structured input. Default value: `None` (equivalent to `self.forward_min_event_ndims`).
`name`	The name to give this op.
`**kwargs`	Named arguments forwarded to subclass implementation.

Raises
`TypeError`	if `y`'s dtype is incompatible with the expected output dtype.
`NotImplementedError`	if neither `_forward_log_det_jacobian` nor {`_inverse`, `_inverse_log_det_jacobian`} are implemented, or this is a non-injective bijector.
`ValueError`	if the value of `event_ndims` is not valid for this bijector.

Raises
`TypeError`	if `y`'s structured dtype is incompatible with the expected output dtype.
`NotImplementedError`	if `_inverse` is not implemented.

Args
`output_shape`	`Tensor`, `int32` vector (structure) indicating event-portion shape passed into `inverse` function.
`name`	name to give to the op

Raises
`TypeError`	if `x`'s dtype is incompatible with the expected inverse-dtype.
`NotImplementedError`	if `_inverse_log_det_jacobian` is not implemented.
`ValueError`	if the value of `event_ndims` is not valid for this bijector.

Args
`value`	A `tfd.Distribution`, `tfb.Bijector`, or a (structure of) `Tensor`.
`name`	Python `str` name given to ops created by this function.
`**kwargs`	Additional keyword arguments passed into the created `tfd.TransformedDistribution`, `tfb.Bijector`, or `self.forward`.

Methods

`copy`

`experimental_batch_shape`

`experimental_batch_shape_tensor`

`experimental_compute_density_correction`

`forward`

`forward_dtype`

`forward_event_ndims`

`forward_event_shape`

`forward_event_shape_tensor`

`forward_log_det_jacobian`

`inverse`

`inverse_dtype`

`inverse_event_ndims`

`inverse_event_shape`

`inverse_event_shape_tensor`

`inverse_log_det_jacobian`

`parameter_properties`

`with_name_scope`

`__call__`

`__eq__`

`__getitem__`

`__iter__`