class tf.contrib.rnn.IntersectionRNNCell
Defined in tensorflow/contrib/rnn/python/ops/rnn_cell.py.
Intersection Recurrent Neural Network (+RNN) cell.
An architecture with a coupled recurrent gate and a coupled depth
gate, designed to improve information flow through stacked RNNs. Because the
architecture uses depth gating, the dimensionality of the depth
output (y) must not change through depth (input size == output size).
To achieve this, the first layer of a stacked Intersection RNN projects
the inputs to N (num_units) dimensions. Therefore, when initializing an
IntersectionRNNCell, set num_in_proj = N for the first layer
and use the default settings for subsequent layers.
This implements the recurrent cell from the paper:
https://arxiv.org/abs/1611.09913
Jasmine Collins, Jascha Sohl-Dickstein, and David Sussillo. "Capacity and Trainability in Recurrent Neural Networks" Proc. ICLR 2017.
The Intersection RNN is built for use in deeply stacked RNNs so it may not achieve best performance with depth 1.
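The layering rule described above (num_in_proj = N for the first layer, defaults afterwards) can be sketched as a small helper that produces the constructor arguments for each layer. The helper name intersection_stack_kwargs is hypothetical and is not part of TensorFlow; the sketch is plain Python so it runs without TensorFlow installed.

```python
def intersection_stack_kwargs(num_units, num_layers):
    """Return one constructor-kwargs dict per layer of a stacked +RNN.

    Hypothetical helper, not part of TensorFlow: only the first layer gets
    num_in_proj=num_units; later layers keep the default (None).
    """
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    configs = []
    for layer in range(num_layers):
        configs.append({
            "num_units": num_units,
            "num_in_proj": num_units if layer == 0 else None,
        })
    return configs


configs = intersection_stack_kwargs(num_units=128, num_layers=3)

# With TensorFlow 1.x available, the dicts could be passed on like this
# (left as a comment so the sketch runs without TensorFlow):
#   cells = [tf.contrib.rnn.IntersectionRNNCell(**kw) for kw in configs]
#   stacked = tf.contrib.rnn.MultiRNNCell(cells)
```

Only the first dict carries a projection size, which matches the requirement that every layer after the first receives inputs that already have num_units dimensions.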
Properties
graph
losses
non_trainable_variables
non_trainable_weights
output_size
scope_name
state_size
trainable_variables
trainable_weights
updates
variables
Returns the list of all layer variables/weights.
Returns:
A list of variables.
weights
Returns the list of all layer variables/weights.
Returns:
A list of variables.
Methods
__init__
__init__(
num_units,
num_in_proj=None,
initializer=None,
forget_bias=1.0,
y_activation=tf.nn.relu,
reuse=None
)
Initialize the parameters for an +RNN cell.
Args:
num_units: int, The number of units in the +RNN cell.
num_in_proj: (optional) int, The input dimensionality for the RNN. If creating the first layer of an +RNN, this should be set to num_units. Otherwise, this should be set to None (default). If None, dimensionality of inputs should be equal to num_units, otherwise ValueError is thrown.
initializer: (optional) The initializer to use for the weight matrices.
forget_bias: (optional) float, default 1.0, The initial bias of the forget gates, used to reduce the scale of forgetting at the beginning of the training.
y_activation: (optional) Activation function of the states passed through depth. Default is tf.nn.relu.
reuse: (optional) Python boolean describing whether to reuse variables in an existing scope. If not True, and the existing scope already has the given variables, an error is raised.
__call__
__call__(
inputs,
state,
scope=None
)
Run this RNN cell on inputs, starting from the given state.
Args:
inputs: 2-D tensor with shape [batch_size x input_size].
state: if self.state_size is an integer, this should be a 2-D Tensor with shape [batch_size x self.state_size]. Otherwise, if self.state_size is a tuple of integers, this should be a tuple with shapes [batch_size x s] for s in self.state_size.
scope: VariableScope for the created subgraph; defaults to class name.
Returns:
A pair containing:
- Output: A 2-D tensor with shape [batch_size x self.output_size].
- New state: Either a single 2-D tensor, or a tuple of tensors matching the arity and shapes of state.
__deepcopy__
__deepcopy__(memo)
add_loss
add_loss(
losses,
inputs=None
)
Add loss tensor(s), potentially dependent on layer inputs.
Some losses (for instance, activity regularization losses) may be dependent
on the inputs passed when calling a layer. Hence, when reusing the same layer
on different inputs a and b, some entries in layer.losses may be
dependent on a and some on b. This method automatically keeps track
of dependencies.
The get_losses_for method retrieves the losses relevant to a
specific set of inputs.
Arguments:
losses: Loss tensor, or list/tuple of tensors.
inputs: Optional input tensor(s) that the loss(es) depend on. Must match the inputs argument passed to the __call__ method at the time the losses are created. If None is passed, the losses are assumed to be unconditional, and will apply across all dataflows of the layer (e.g. weight regularization losses).
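The dependency tracking described for add_loss and get_losses_for can be pictured with a toy bookkeeping class. This is a sketch only: LossTracker is a hypothetical name, it stores plain Python objects instead of tensors, and it matches inputs by object identity, which is an assumption and not a statement about how the TensorFlow Layer class does it.

```python
class LossTracker:
    """Toy bookkeeping for conditional and unconditional losses.

    Hypothetical class for illustration; it is not the TensorFlow Layer.
    """

    def __init__(self):
        self._entries = []  # list of (loss, inputs_key) pairs

    @property
    def losses(self):
        # All recorded losses, regardless of which inputs they depend on.
        return [loss for loss, _ in self._entries]

    def add_loss(self, losses, inputs=None):
        if not isinstance(losses, (list, tuple)):
            losses = [losses]
        # inputs=None marks the loss as unconditional.
        key = None if inputs is None else id(inputs)
        for loss in losses:
            self._entries.append((loss, key))

    def get_losses_for(self, inputs):
        key = None if inputs is None else id(inputs)
        return [loss for loss, k in self._entries if k == key]


a, b = object(), object()  # stand-ins for two different input tensors
tracker = LossTracker()
tracker.add_loss("l2_weight_penalty")              # unconditional
tracker.add_loss("activity_penalty_a", inputs=a)   # depends on a
tracker.add_loss(["activity_penalty_b"], inputs=b) # depends on b
```

Here tracker.losses holds all three entries, while get_losses_for(a) returns only the entry recorded for a and get_losses_for(None) returns only the unconditional one.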
add_update
add_update(
updates,
inputs=None
)
Add update op(s), potentially dependent on layer inputs.
Weight updates (for instance, the updates of the moving mean and variance
in a BatchNormalization layer) may be dependent on the inputs passed
when calling a layer. Hence, when reusing the same layer on
different inputs a and b, some entries in layer.updates may be
dependent on a and some on b. This method automatically keeps track
of dependencies.
The get_updates_for method retrieves the updates relevant to a
specific set of inputs.
Arguments:
updates: Update op, or list/tuple of update ops.
inputs: Optional input tensor(s) that the update(s) depend on. Must match the inputs argument passed to the __call__ method at the time the updates are created. If None is passed, the updates are assumed to be unconditional, and will apply across all dataflows of the layer.
add_variable
add_variable(
name,
shape,
dtype=None,
initializer=None,
regularizer=None,
trainable=True
)
Adds a new variable to the layer, or gets an existing one; returns it.
Arguments:
name: variable name.
shape: variable shape.
dtype: The type of the variable. Defaults to self.dtype.
initializer: initializer instance (callable).
regularizer: regularizer instance (callable).
trainable: whether the variable should be part of the layer's "trainable_variables" (e.g. variables, biases) or "non_trainable_variables" (e.g. BatchNorm mean, stddev).
Returns:
The created variable.
apply
apply(
inputs,
*args,
**kwargs
)
Apply the layer on an input.
This simply wraps self.__call__.
Arguments:
inputs: Input tensor(s).
*args: additional positional arguments to be passed to self.call.
**kwargs: additional keyword arguments to be passed to self.call.
Returns:
Output tensor(s).
build
build(_)
call
call(
inputs,
state
)
Run one step of the Intersection RNN.
Args:
inputs: input Tensor, 2D, batch x input size.
state: state Tensor, 2D, batch x num units.
Returns:
new_y: batch x num units, Tensor representing the output of the +RNN after reading inputs when previous state was state.
new_state: batch x num units, Tensor representing the state of the +RNN after reading inputs when previous state was state.
Raises:
ValueError: If input size cannot be inferred from inputs via static shape inference.
ValueError: If input size != output size (these must be equal when using the Intersection RNN).
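To see why the input size has to equal the output size, it may help to look at one coupled-gate step written out in NumPy. The sketch below is an illustration under stated assumptions, not the TensorFlow implementation: the function name intersection_step, the fused weight layout, the use of tanh for the state candidate, and adding forget_bias to both gates are all assumptions made for the example. The depth gate mixes inputs directly into new_y, so the two must have the same width.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def intersection_step(inputs, state, weights, bias, forget_bias=1.0):
    """One coupled-gate step; an illustration, not the TensorFlow kernel.

    inputs:  [batch, num_units]   (depth input, already projected if needed)
    state:   [batch, num_units]
    weights: [2 * num_units, 4 * num_units]  (assumed fused layout)
    bias:    [4 * num_units]
    """
    num_units = state.shape[1]
    if inputs.shape[1] != num_units:
        # Mirrors the documented ValueError: sizes must match for depth gating.
        raise ValueError("input size must equal num_units")

    pre = np.concatenate([inputs, state], axis=1) @ weights + bias
    gate_h, cand_h, gate_y, cand_y = np.split(pre, 4, axis=1)

    g_h = sigmoid(gate_h + forget_bias)  # recurrent gate
    g_y = sigmoid(gate_y + forget_bias)  # depth gate

    # Coupled gates: one gate value weights the carried signal, and its
    # complement weights the new candidate.
    new_state = g_h * state + (1.0 - g_h) * np.tanh(cand_h)
    new_y = g_y * inputs + (1.0 - g_y) * np.maximum(cand_y, 0.0)  # relu
    return new_y, new_state


rng = np.random.default_rng(0)
batch, n = 2, 4
x = rng.normal(size=(batch, n))
h = np.zeros((batch, n))
W = rng.normal(scale=0.1, size=(2 * n, 4 * n))
b = np.zeros(4 * n)
new_y, new_state = intersection_step(x, h, W, b)
```

Both returned arrays have shape [batch, num_units], matching the documented return values, and passing inputs of any other width raises ValueError before any arithmetic is attempted.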
get_losses_for
get_losses_for(inputs)
Retrieves losses relevant to a specific set of inputs.
Arguments:
inputs: Input tensor or list/tuple of input tensors. Must match the inputs argument passed to the __call__ method at the time the losses were created. If you pass inputs=None, unconditional losses are returned, such as weight regularization losses.
Returns:
List of loss tensors of the layer that depend on inputs.
get_updates_for
get_updates_for(inputs)
Retrieves updates relevant to a specific set of inputs.
Arguments:
inputs: Input tensor or list/tuple of input tensors. Must match the inputs argument passed to the __call__ method at the time the updates were created. If you pass inputs=None, unconditional updates are returned.
Returns:
List of update ops of the layer that depend on inputs.
zero_state
zero_state(
batch_size,
dtype
)
Return zero-filled state tensor(s).
Args:
batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.
Returns:
If state_size is an int or TensorShape, then the return value is an
N-D tensor of shape [batch_size x state_size] filled with zeros.
If state_size is a nested list or tuple, then the return value is
a nested list or tuple (of the same structure) of 2-D tensors with
the shapes [batch_size x s] for each s in state_size.
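The two return shapes described above can be sketched with NumPy arrays standing in for tensors. The function zero_state_like is a hypothetical stand-in, not the TensorFlow method; it handles an int or a nested list/tuple of ints and leaves out the TensorShape case.

```python
import numpy as np


def zero_state_like(batch_size, state_size, dtype=np.float32):
    """Build zero-filled state(s) for an int or nested list/tuple state_size.

    Hypothetical NumPy stand-in for illustration; not the TensorFlow method.
    """
    if isinstance(state_size, (list, tuple)):
        # Preserve the container type and recurse into each entry.
        return type(state_size)(
            zero_state_like(batch_size, s, dtype) for s in state_size
        )
    return np.zeros((batch_size, state_size), dtype=dtype)


single = zero_state_like(8, 16)        # one array of shape (8, 16)
nested = zero_state_like(8, (16, 32))  # tuple of arrays: (8, 16) and (8, 32)
```

The nested result keeps the structure of state_size, with one 2-D zero array per entry.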
