Class Convolution2DReparameterization
2D convolution layer (e.g. spatial convolution over images).
This layer creates a convolution kernel that is convolved (actually cross-correlated) with the layer input to produce a tensor of outputs. It may also include a bias addition and activation function on the outputs. It assumes the `kernel` and/or `bias` are drawn from distributions.
By default, the layer implements a stochastic forward pass via sampling from the kernel and bias posteriors,

```
outputs = f(inputs; kernel, bias), kernel, bias ~ posterior
```

where `f` denotes the layer's calculation. It uses the reparameterization estimator [(Kingma and Welling, 2014)][1], which performs a Monte Carlo approximation of the distribution integrating over the `kernel` and `bias`.
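Concretely (standard notation for the reparameterization trick, added here for clarity rather than quoted from the original page), a mean-field normal weight sample is rewritten as a deterministic transform of parameter-free noise, so gradients can flow to the variational parameters:

$$
w = \mu + \sigma \odot \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)
$$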
The arguments permit separate specification of the surrogate posterior (`q(W|x)`), prior (`p(W)`), and divergence for both the `kernel` and `bias` distributions.
Upon being built, this layer adds losses (accessible via the `losses` property) representing the divergences of the `kernel` and/or `bias` surrogate posteriors and their respective priors. When doing minibatch stochastic optimization, make sure to scale this loss such that it is applied just once per epoch (e.g. if `kl` is the sum of `losses` for each element of the batch, you should pass `kl / num_examples_per_epoch` to your optimizer).
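For instance (a minimal sketch using the `model` and `neg_log_likelihood` built in the example below; `num_examples_per_epoch` is a placeholder you must supply):

```python
num_examples_per_epoch = 50000  # placeholder: size of your training set
kl = sum(model.losses)          # one tensor per kernel/bias divergence
loss = neg_log_likelihood + kl / num_examples_per_epoch
```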
You can access the `kernel` and/or `bias` posterior and prior distributions after the layer is built via the `kernel_posterior`, `kernel_prior`, `bias_posterior`, and `bias_prior` properties.
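A short sketch of that access pattern (shapes assume a `[32, 32, 3]` input; `build` is called explicitly here only to force variable creation):

```python
import tensorflow_probability as tfp

layer = tfp.layers.Convolution2DReparameterization(64, kernel_size=5)
layer.build(input_shape=[None, 32, 32, 3])
layer.kernel_posterior.mean()  # variational means, shape [5, 5, 3, 64]
layer.kernel_prior             # standard normal prior by default
layer.bias_posterior           # tfd.Deterministic by default
```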
Examples
We illustrate a Bayesian neural network with variational inference, assuming a dataset of `features` and `labels`.
```python
import tensorflow as tf
import tensorflow_probability as tfp

model = tf.keras.Sequential([
    tf.keras.layers.Reshape([32, 32, 3]),
    tfp.layers.Convolution2DReparameterization(
        64, kernel_size=5, padding='SAME', activation=tf.nn.relu),
    tf.keras.layers.MaxPooling2D(pool_size=[2, 2],
                                 strides=[2, 2],
                                 padding='SAME'),
    tf.keras.layers.Flatten(),
    tfp.layers.DenseReparameterization(10),
])

logits = model(features)
# Reduce the per-example cross entropy to a scalar so it can be minimized.
neg_log_likelihood = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits))
kl = sum(model.losses)
loss = neg_log_likelihood + kl
train_op = tf.train.AdamOptimizer().minimize(loss)
```
It uses reparameterization gradients to minimize the Kullback-Leibler divergence up to a constant, also known as the negative Evidence Lower Bound (ELBO). The objective consists of the sum of two terms: the expected negative log-likelihood, which we approximate via Monte Carlo, and the KL divergence, which is added via regularizer terms that are arguments to the layer.
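In symbols (standard variational-inference notation, added here for clarity rather than quoted from the page):

$$
-\mathrm{ELBO}(q) = \mathbb{E}_{q(w)}\!\left[-\log p(y \mid x, w)\right] + \mathrm{KL}\!\left(q(w)\,\|\,p(w)\right)
$$

where the first term is estimated with a single Monte Carlo sample per forward pass and the second term is the sum collected in `model.losses`.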
References
[1]: Diederik Kingma and Max Welling. Auto-Encoding Variational Bayes. In International Conference on Learning Representations, 2014. https://arxiv.org/abs/1312.6114
__init__
```python
__init__(
    filters,
    kernel_size,
    strides=(1, 1),
    padding='valid',
    data_format='channels_last',
    dilation_rate=(1, 1),
    activation=None,
    activity_regularizer=None,
    kernel_posterior_fn=tfp_layers_util.default_mean_field_normal_fn(),
    kernel_posterior_tensor_fn=(lambda d: d.sample()),
    kernel_prior_fn=tfp.layers.default_multivariate_normal_fn,
    kernel_divergence_fn=(lambda q, p, ignore: tfd.kl_divergence(q, p)),
    bias_posterior_fn=tfp_layers_util.default_mean_field_normal_fn(is_singular=True),
    bias_posterior_tensor_fn=(lambda d: d.sample()),
    bias_prior_fn=None,
    bias_divergence_fn=(lambda q, p, ignore: tfd.kl_divergence(q, p)),
    **kwargs
)
```
Construct layer.
Args:

* `filters`: Integer, the dimensionality of the output space (i.e. the number of filters in the convolution).
* `kernel_size`: An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions.
* `strides`: An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any `dilation_rate` value != 1.
* `padding`: One of `"valid"` or `"same"` (case-insensitive).
* `data_format`: A string, one of `channels_last` (default) or `channels_first`. The ordering of the dimensions in the inputs. `channels_last` corresponds to inputs with shape `(batch, height, width, channels)` while `channels_first` corresponds to inputs with shape `(batch, channels, height, width)`.
* `dilation_rate`: An integer or tuple/list of 2 integers, specifying the dilation rate to use for dilated convolution. Can be a single integer to specify the same value for all spatial dimensions. Currently, specifying any `dilation_rate` value != 1 is incompatible with specifying any stride value != 1.
* `activation`: Activation function. Set it to `None` to maintain a linear activation.
* `activity_regularizer`: Regularizer function for the output.
* `kernel_posterior_fn`: Python `callable` which creates a `tfd.Distribution` instance representing the surrogate posterior of the `kernel` parameter. Default value: `default_mean_field_normal_fn()`.
* `kernel_posterior_tensor_fn`: Python `callable` which takes a `tfd.Distribution` instance and returns a representative value. Default value: `lambda d: d.sample()`.
* `kernel_prior_fn`: Python `callable` which creates a `tfd` instance. See the `default_mean_field_normal_fn` docstring for the required parameter signature. Default value: `tfd.Normal(loc=0., scale=1.)`.
* `kernel_divergence_fn`: Python `callable` which takes the surrogate posterior distribution, the prior distribution, and a random variate sample from the surrogate posterior, and computes or approximates the KL divergence. The distributions are `tfd.Distribution`-like instances and the sample is a `Tensor`.
* `bias_posterior_fn`: Python `callable` which creates a `tfd.Distribution` instance representing the surrogate posterior of the `bias` parameter. Default value: `default_mean_field_normal_fn(is_singular=True)` (which creates an instance of `tfd.Deterministic`).
* `bias_posterior_tensor_fn`: Python `callable` which takes a `tfd.Distribution` instance and returns a representative value. Default value: `lambda d: d.sample()`.
* `bias_prior_fn`: Python `callable` which creates a `tfd` instance. See the `default_mean_field_normal_fn` docstring for the required parameter signature. Default value: `None` (no prior, no variational inference).
* `bias_divergence_fn`: Python `callable` which takes the surrogate posterior distribution, the prior distribution, and a random variate sample from the surrogate posterior, and computes or approximates the KL divergence. The distributions are `tfd.Distribution`-like instances and the sample is a `Tensor`.
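As an illustration (a sketch, not from the original docs; `NUM_TRAIN_EXAMPLES` is a hypothetical constant), a custom scaled divergence can be supplied through these arguments:

```python
import tensorflow_probability as tfp

tfd = tfp.distributions
NUM_TRAIN_EXAMPLES = 50000  # hypothetical training-set size

def scaled_kl(q, p, _):
  # Amortize the divergence so one epoch of minibatches applies it once.
  return tfd.kl_divergence(q, p) / NUM_TRAIN_EXAMPLES

layer = tfp.layers.Convolution2DReparameterization(
    filters=64,
    kernel_size=5,
    padding='SAME',
    kernel_divergence_fn=scaled_kl,
    bias_divergence_fn=scaled_kl)
```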
Properties
activity_regularizer
Optional regularizer function for the output of this layer.
dtype
dynamic
input
Retrieves the input tensor(s) of a layer.
Only applicable if the layer has exactly one input, i.e. if it is connected to one incoming layer.
Returns:
Input tensor or list of input tensors.
Raises:
* `RuntimeError`: If called in Eager mode.
* `AttributeError`: If no inbound nodes are found.
input_mask
Retrieves the input mask tensor(s) of a layer.
Only applicable if the layer has exactly one inbound node, i.e. if it is connected to one incoming layer.
Returns:
Input mask tensor (potentially None) or list of input mask tensors.
Raises:
* `AttributeError`: if the layer is connected to more than one incoming layer.
input_shape
Retrieves the input shape(s) of a layer.
Only applicable if the layer has exactly one input, i.e. if it is connected to one incoming layer, or if all inputs have the same shape.
Returns:
Input shape, as an integer shape tuple (or list of shape tuples, one tuple per input tensor).
Raises:
* `AttributeError`: if the layer has no defined `input_shape`.
* `RuntimeError`: if called in Eager mode.
input_spec
losses
Losses which are associated with this `Layer`.

Variable regularization tensors are created when this property is accessed, so it is eager safe: accessing `losses` under a `tf.GradientTape` will propagate gradients back to the corresponding variables.
Returns:
A list of tensors.
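A sketch of that eager-safe pattern (`model`, `features`, and `labels` are assumed to already exist, as in the example above):

```python
with tf.GradientTape() as tape:
  logits = model(features)
  nll = tf.reduce_mean(
      tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits))
  loss = nll + sum(model.losses)  # divergence tensors created on access
grads = tape.gradient(loss, model.trainable_variables)
```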
metrics
name
Returns the name of this module as passed or determined in the ctor.

NOTE: This is not the same as `self.name_scope.name`, which includes parent module names.
name_scope
Returns a `tf.name_scope` instance for this class.
non_trainable_variables
non_trainable_weights
output
Retrieves the output tensor(s) of a layer.
Only applicable if the layer has exactly one output, i.e. if it is connected to one incoming layer.
Returns:
Output tensor or list of output tensors.
Raises:
* `AttributeError`: if the layer is connected to more than one incoming layer.
* `RuntimeError`: if called in Eager mode.
output_mask
Retrieves the output mask tensor(s) of a layer.
Only applicable if the layer has exactly one inbound node, i.e. if it is connected to one incoming layer.
Returns:
Output mask tensor (potentially None) or list of output mask tensors.
Raises:
* `AttributeError`: if the layer is connected to more than one incoming layer.
output_shape
Retrieves the output shape(s) of a layer.
Only applicable if the layer has one output, or if all outputs have the same shape.
Returns:
Output shape, as an integer shape tuple (or list of shape tuples, one tuple per output tensor).
Raises:
* `AttributeError`: if the layer has no defined output shape.
* `RuntimeError`: if called in Eager mode.
submodules
Sequence of all sub-modules.
Submodules are modules which are properties of this module, or found as properties of modules which are properties of this module (and so on).
```python
a = tf.Module()
b = tf.Module()
c = tf.Module()
a.b = b
b.c = c
assert list(a.submodules) == [b, c]
assert list(b.submodules) == [c]
assert list(c.submodules) == []
```
Returns:
A sequence of all submodules.
trainable
trainable_variables
Sequence of variables owned by this module and its submodules.
Returns:
A sequence of variables for the current module (sorted by attribute name) followed by variables from all submodules recursively (breadth first).
trainable_weights
updates
variables
Returns the list of all layer variables/weights.
Alias of `self.weights`.
Returns:
A list of variables.
weights
Returns the list of all layer variables/weights.
Returns:
A list of variables.
Methods
__call__
```python
__call__(
    inputs,
    *args,
    **kwargs
)
```
Wraps `call`, applying pre- and post-processing steps.
Arguments:

* `inputs`: input tensor(s).
* `*args`: additional positional arguments to be passed to `self.call`.
* `**kwargs`: additional keyword arguments to be passed to `self.call`.
Returns:
Output tensor(s).
Note:

- The following optional keyword arguments are reserved for specific uses:
  - `training`: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference.
  - `mask`: Boolean input mask.
- If the layer's `call` method takes a `mask` argument (as some Keras layers do), its default value will be set to the mask generated for `inputs` by the previous layer (if `inputs` did come from a layer that generated a corresponding mask, i.e. if it came from a Keras layer with masking support).
Raises:
* `ValueError`: if the layer's `call` method returns `None` (an invalid value).
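For example (a minimal sketch; `layer` and `inputs` are assumed to already exist), the reserved `training` argument is simply passed through:

```python
outputs = layer(inputs, training=True)  # forwarded to `layer.call`
```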
build
build(input_shape)
Creates the variables of the layer (optional, for subclass implementers).
This is a method that implementers of subclasses of `Layer` or `Model` can override if they need a state-creation step in between layer instantiation and layer call. It is typically used to create the weights of `Layer` subclasses.
Arguments:
* `input_shape`: Instance of `TensorShape`, or list of instances of `TensorShape` if the layer expects a list of inputs (one instance per input).
compute_mask
compute_mask(inputs, mask=None)
Computes an output mask tensor.
Arguments:

* `inputs`: Tensor or list of tensors.
* `mask`: Tensor or list of tensors.
Returns:
None or a tensor (or list of tensors, one per output tensor of the layer).
compute_output_shape
compute_output_shape(input_shape)
Computes the output shape of the layer.
Args:
* `input_shape`: Shape tuple (tuple of integers) or list of shape tuples (one per input tensor of the layer). Shape tuples can include `None` for free dimensions, instead of an integer.

Returns:

* `output_shape`: A tuple representing the output shape.
count_params
count_params()
Count the total number of scalars composing the weights.
Returns:
An integer count.
Raises:
* `ValueError`: if the layer isn't yet built (in which case its weights aren't yet defined).
from_config
from_config(cls, config)
Creates a layer from its config.
This method is the reverse of `get_config`, capable of instantiating the same layer from the config dictionary.
Args:
* `config`: A Python dictionary, typically the output of `get_config`.

Returns:

* `layer`: A layer instance.
get_config
get_config()
Returns the config of the layer.
A layer config is a Python dictionary (serializable) containing the configuration of a layer. The same layer can be reinstantiated later (without its trained weights) from this configuration.
Returns:
* `config`: A Python dictionary of class keyword arguments and their serialized values.
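A sketch of the round trip with `from_config` (assumes all constructor arguments of this particular layer are serializable):

```python
config = layer.get_config()
clone = tfp.layers.Convolution2DReparameterization.from_config(config)
```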
get_input_at
get_input_at(node_index)
Retrieves the input tensor(s) of a layer at a given node.
Arguments:
* `node_index`: Integer, index of the node from which to retrieve the attribute. E.g. `node_index=0` will correspond to the first time the layer was called.
Returns:
A tensor (or list of tensors if the layer has multiple inputs).
Raises:
* `RuntimeError`: If called in Eager mode.
get_input_mask_at
get_input_mask_at(node_index)
Retrieves the input mask tensor(s) of a layer at a given node.
Arguments:
* `node_index`: Integer, index of the node from which to retrieve the attribute. E.g. `node_index=0` will correspond to the first time the layer was called.
Returns:
A mask tensor (or list of tensors if the layer has multiple inputs).
get_input_shape_at
get_input_shape_at(node_index)
Retrieves the input shape(s) of a layer at a given node.
Arguments:
* `node_index`: Integer, index of the node from which to retrieve the attribute. E.g. `node_index=0` will correspond to the first time the layer was called.
Returns:
A shape tuple (or list of shape tuples if the layer has multiple inputs).
Raises:
* `RuntimeError`: If called in Eager mode.
get_losses_for
get_losses_for(inputs)
Retrieves losses relevant to a specific set of inputs.
Arguments:
* `inputs`: Input tensor or list/tuple of input tensors.

Returns:

List of loss tensors of the layer that depend on `inputs`.
get_output_at
get_output_at(node_index)
Retrieves the output tensor(s) of a layer at a given node.
Arguments:
* `node_index`: Integer, index of the node from which to retrieve the attribute. E.g. `node_index=0` will correspond to the first time the layer was called.
Returns:
A tensor (or list of tensors if the layer has multiple outputs).
Raises:
* `RuntimeError`: If called in Eager mode.
get_output_mask_at
get_output_mask_at(node_index)
Retrieves the output mask tensor(s) of a layer at a given node.
Arguments:
* `node_index`: Integer, index of the node from which to retrieve the attribute. E.g. `node_index=0` will correspond to the first time the layer was called.
Returns:
A mask tensor (or list of tensors if the layer has multiple outputs).
get_output_shape_at
get_output_shape_at(node_index)
Retrieves the output shape(s) of a layer at a given node.
Arguments:
* `node_index`: Integer, index of the node from which to retrieve the attribute. E.g. `node_index=0` will correspond to the first time the layer was called.
Returns:
A shape tuple (or list of shape tuples if the layer has multiple outputs).
Raises:
* `RuntimeError`: If called in Eager mode.
get_updates_for
get_updates_for(inputs)
Retrieves updates relevant to a specific set of inputs.
Arguments:
* `inputs`: Input tensor or list/tuple of input tensors.

Returns:

List of update ops of the layer that depend on `inputs`.
get_weights
get_weights()
Returns the current weights of the layer.
Returns:
Weights values as a list of numpy arrays.
set_weights
set_weights(weights)
Sets the weights of the layer, from Numpy arrays.

Arguments:

* `weights`: a list of Numpy arrays. The number of arrays and their shapes must match those of the layer's weights (i.e. it should match the output of `get_weights`).

Raises:

* `ValueError`: If the provided weights list does not match the layer's specifications.
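A sketch of the round trip (`source_layer` and `target_layer` are hypothetical, already-built layers with identical architectures):

```python
weights = source_layer.get_weights()  # list of numpy arrays
target_layer.set_weights(weights)     # shapes must match exactly
```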
with_name_scope
with_name_scope(cls, method)
Decorator to automatically enter the module name scope.
```python
class MyModule(tf.Module):
  @tf.Module.with_name_scope
  def __call__(self, x):
    if not hasattr(self, 'w'):
      self.w = tf.Variable(tf.random.normal([x.shape[1], 64]))
    return tf.matmul(x, self.w)
```
Using the above module would produce `tf.Variable`s and `tf.Tensor`s whose names include the module name:
```python
mod = MyModule()
mod(tf.ones([8, 32]))
# ==> <tf.Tensor: ...>
mod.w
# ==> <tf.Variable ...'my_module/w:0'>
```
Args:
* `method`: The method to wrap.
Returns:
The original method wrapped such that it enters the module's name scope.