TensorFlow provides a number of methods for constructing Recurrent Neural Networks. Most accept an RNNCell-subclassed object (see the documentation for tf.nn.rnn_cell).
tf.nn.dynamic_rnn(cell, inputs, sequence_length=None, initial_state=None, dtype=None, parallel_iterations=None, swap_memory=False, time_major=False, scope=None)
Creates a recurrent neural network specified by RNNCell cell.
This function is functionally identical to the function rnn (documented below), but performs fully dynamic unrolling of inputs.
Unlike rnn, the input inputs is not a Python list of Tensors, one for each frame. Instead, inputs may be a single Tensor where the maximum time is either the first or second dimension (see the parameter time_major). Alternatively, it may be a (possibly nested) tuple of Tensors, each of them having matching batch and time dimensions. The corresponding output is either a single Tensor having the same number of time steps and batch size, or a (possibly nested) tuple of such tensors, matching the nested structure of cell.output_size.
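The way a nested input is presented to the cell one time step at a time can be sketched in plain Python. This is only an illustration under stated assumptions: nested lists stand in for batch-major Tensors, and slice_at_time is a hypothetical helper, not a TensorFlow function.

```python
# Hypothetical helper (not part of TensorFlow): take the time-t slice of a
# batch-major input that is either a "tensor" (nested lists shaped
# [batch][time][...]) or a possibly nested tuple of such tensors.
def slice_at_time(inputs, t):
    if isinstance(inputs, tuple):
        # Recurse so the slice keeps the same tuple structure as the input.
        return tuple(slice_at_time(item, t) for item in inputs)
    # Drop the time dimension: [batch][time][...] -> [batch][...]
    return [row[t] for row in inputs]


# Two inputs with matching batch (2) and time (3) dimensions but different
# trailing shapes: one has depth 2, the other has no depth dimension.
features = [[[1, 2], [3, 4], [5, 6]],
            [[7, 8], [9, 10], [11, 12]]]
flags = [[0, 1, 0],
         [1, 1, 0]]
nested_inputs = (features, (flags,))

# What a cell would see at time step 1: same nesting, no time dimension.
step_1 = slice_at_time(nested_inputs, 1)
print(step_1)
```

Only the batch and time dimensions have to agree across the tuple elements; the remaining shape components differ here on purpose.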
The parameter sequence_length is optional and is used to copy through state and zero out outputs once a batch element's sequence length has been passed. It is therefore needed more for correctness than for performance, unlike in rnn().
Args:
  cell: An instance of RNNCell.
  inputs: The RNN inputs.
    If time_major == False (default), this must be a Tensor of shape [batch_size, max_time, ...], or a nested tuple of such elements.
    If time_major == True, this must be a Tensor of shape [max_time, batch_size, ...], or a nested tuple of such elements.
    This may also be a (possibly nested) tuple of Tensors satisfying this property. The first two dimensions must match across all the inputs, but otherwise the ranks and other shape components may differ. In this case, the input to cell at each time step will replicate the structure of these tuples, except for the time dimension (from which the time is taken).
    The input to cell at each time step will be a Tensor or a (possibly nested) tuple of Tensors, each with dimensions [batch_size, ...].
  sequence_length: (optional) An int32/int64 vector of size [batch_size].
  initial_state: (optional) An initial state for the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.
  dtype: (optional) The data type for the initial state and expected output. Required if initial_state is not provided or the RNN state has a heterogeneous dtype.
  parallel_iterations: (Default: 32). The number of iterations to run in parallel. Operations that have no temporal dependency and can be run in parallel will be. This parameter trades off time for space. Values >> 1 use more memory but take less time, while smaller values use less memory but computations take longer.
  swap_memory: Transparently swap the tensors produced in forward inference but needed for back prop from GPU to CPU. This allows training RNNs which would typically not fit on a single GPU, with very minimal (or no) performance penalty.
  time_major: The shape format of the inputs and outputs Tensors. If true, these Tensors must be shaped [max_time, batch_size, depth]. If false, these Tensors must be shaped [batch_size, max_time, depth]. Using time_major = True is a bit more efficient because it avoids transposes at the beginning and end of the RNN calculation. However, most TensorFlow data is batch-major, so by default this function accepts input and emits output in batch-major form.
  scope: VariableScope for the created subgraph; defaults to "RNN".
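The difference between the two layouts selected by time_major comes down to swapping the first two dimensions. The following is a minimal sketch in plain Python, assuming nested lists in place of Tensors; swap_batch_and_time is a hypothetical helper used only for illustration.

```python
def swap_batch_and_time(tensor):
    """Swap the first two axes of a nested-list "tensor"."""
    # zip(*rows) regroups element i of every row, which transposes the first
    # two dimensions; deeper dimensions (depth) are left untouched.
    return [list(group) for group in zip(*tensor)]


# Batch-major data: [batch_size=2][max_time=3][depth=2]
batch_major = [[[1, 2], [3, 4], [5, 6]],      # batch element 0
               [[7, 8], [9, 10], [11, 12]]]   # batch element 1

# Time-major data: [max_time=3][batch_size=2][depth=2]
time_major = swap_batch_and_time(batch_major)
print(time_major)
```

Supplying data that is already time-major, together with time_major=True, is what lets the function skip this kind of transposition on input and output.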
Returns:
A pair (outputs, state) where:
  outputs: The RNN output Tensor.
    If time_major == False (default), this will be a Tensor shaped [batch_size, max_time, cell.output_size].
    If time_major == True, this will be a Tensor shaped [max_time, batch_size, cell.output_size].
    Note: if cell.output_size is a (possibly nested) tuple of integers or TensorShape objects, then outputs will be a tuple having the same structure as cell.output_size, containing Tensors having shapes corresponding to the shape data in cell.output_size.
  state: The final state. If cell.state_size is an int, this will be shaped [batch_size, cell.state_size]. If it is a TensorShape, this will be shaped [batch_size] + cell.state_size. If it is a (possibly nested) tuple of ints or TensorShape, this will be a tuple having the corresponding shapes.
Raises:
  TypeError: If cell is not an instance of RNNCell.
  ValueError: If inputs is None or an empty list.
tf.nn.rnn(cell, inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)
Creates a recurrent neural network specified by RNNCell cell.
The simplest form of RNN network generated is:
    state = cell.zero_state(...)
    outputs = []
    for input_ in inputs:
        output, state = cell(input_, state)
        outputs.append(output)
    return (outputs, state)
However, a few other options are available:
An initial state can be provided. If the sequence_length vector is provided, dynamic calculation is performed. This method of calculation does not compute the RNN steps past the maximum sequence length of the minibatch (thus saving computational time), and properly propagates the state at an example's sequence length to the final state output.
The dynamic calculation performed is, at time t for batch row b:

    (output, state)(b, t) =
        (t >= sequence_length(b))
            ? (zeros(cell.output_size), states(b, sequence_length(b) - 1))
            : cell(input(b, t), state(b, t - 1))
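The rule above can be traced with a small pure-Python sketch. It is an illustration only: toy_cell (a running sum) and masked_rnn are hypothetical names, scalars stand in for Tensors, and the sketch does not implement the early stop at the maximum sequence length of the minibatch.

```python
def toy_cell(x, state):
    """Hypothetical scalar cell: a running sum whose output is the new state."""
    new_state = state + x
    return new_state, new_state


def masked_rnn(inputs, sequence_length, initial_state=0):
    """inputs is a length-T list of time steps, each a list of batch_size scalars."""
    batch_size = len(inputs[0])
    state = [initial_state] * batch_size
    outputs = []
    for t, step in enumerate(inputs):
        step_outputs = []
        for b in range(batch_size):
            if t >= sequence_length[b]:
                # Past the end of row b: emit zeros and copy the state through,
                # so it stays at its value from step sequence_length[b] - 1.
                step_outputs.append(0)
            else:
                output, state[b] = toy_cell(step[b], state[b])
                step_outputs.append(output)
        outputs.append(step_outputs)
    return outputs, state


inputs = [[1, 10], [2, 20], [3, 30]]   # T = 3, batch_size = 2
outputs, final_state = masked_rnn(inputs, sequence_length=[3, 1])
print(outputs, final_state)
```

Row 1 has length 1, so its outputs at steps 1 and 2 are zero and its final state is the one it had after its only valid step.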
Args:
  cell: An instance of RNNCell.
  inputs: A length T list of inputs, each a Tensor of shape [batch_size, input_size], or a nested tuple of such elements.
  initial_state: (optional) An initial state for the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.
  dtype: (optional) The data type for the initial state and expected output. Required if initial_state is not provided or the RNN state has a heterogeneous dtype.
  sequence_length: Specifies the length of each sequence in inputs. An int32 or int64 vector (tensor) of size [batch_size], values in [0, T).
  scope: VariableScope for the created subgraph; defaults to "RNN".
Returns:
A pair (outputs, state) where:
  outputs is a length T list of outputs (one for each input), or a nested tuple of such elements.
  state is the final state.
Raises:
  TypeError: If cell is not an instance of RNNCell.
  ValueError: If inputs is None or an empty list, or if the input depth (column size) cannot be inferred from inputs via shape inference.
tf.nn.state_saving_rnn(cell, inputs, state_saver, state_name, sequence_length=None, scope=None)
RNN that accepts a state saver for time-truncated RNN calculation.
Args:
  cell: An instance of RNNCell.
  inputs: A length T list of inputs, each a Tensor of shape [batch_size, input_size].
  state_saver: A state saver object with methods state and save_state.
  state_name: Python string or tuple of strings. The name to use with the state_saver. If the cell returns tuples of states (i.e., cell.state_size is a tuple) then state_name should be a tuple of strings having the same length as cell.state_size. Otherwise it should be a single string.
  sequence_length: (optional) An int32/int64 vector of size [batch_size]. See the documentation for rnn() for more details about sequence_length.
  scope: VariableScope for the created subgraph; defaults to "RNN".
Returns:
A pair (outputs, state) where:
  outputs is a length T list of outputs (one for each input).
  state is the final state.
Raises:
  TypeError: If cell is not an instance of RNNCell.
  ValueError: If inputs is None or an empty list, or if the arity and type of state_name do not match those of cell.state_size.
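The idea behind a state saver in time-truncated calculation is that a long sequence is processed in segments, with the final state of one segment stored and reloaded as the initial state of the next. The sketch below is a plain-Python illustration under stated assumptions: ToyStateSaver, toy_cell and run_segment are hypothetical, and only the method names state and save_state come from the argument description above; their signatures here are assumed.

```python
class ToyStateSaver(object):
    """Hypothetical state saver exposing `state` and `save_state` methods."""

    def __init__(self, initial_states):
        self._states = dict(initial_states)

    def state(self, state_name):
        # Return the state stored under this name.
        return self._states[state_name]

    def save_state(self, state_name, value):
        # Store the state so the next segment can resume from it.
        self._states[state_name] = value


def toy_cell(x, state):
    """Hypothetical scalar cell: a running sum whose output is the new state."""
    new_state = state + x
    return new_state, new_state


def run_segment(inputs, state_saver, state_name):
    """Run one segment, resuming from and then updating the saved state."""
    state = state_saver.state(state_name)
    outputs = []
    for x in inputs:
        output, state = toy_cell(x, state)
        outputs.append(output)
    state_saver.save_state(state_name, state)
    return outputs, state


saver = ToyStateSaver({"rnn_state": 0})
long_sequence = [1, 2, 3, 4, 5, 6]
chunked_outputs = []
for start in range(0, len(long_sequence), 2):   # segments of length T = 2
    segment_outputs, _ = run_segment(long_sequence[start:start + 2],
                                     saver, "rnn_state")
    chunked_outputs.extend(segment_outputs)
print(chunked_outputs)
```

Because the state is carried across segment boundaries, the segmented run produces the same outputs as a single pass over the whole sequence would with this toy cell.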
tf.nn.bidirectional_rnn(cell_fw, cell_bw, inputs, initial_state_fw=None, initial_state_bw=None, dtype=None, sequence_length=None, scope=None)
Creates a bidirectional recurrent neural network.
Similar to the unidirectional case (rnn), but takes input and builds independent forward and backward RNNs with the final forward and backward outputs depth-concatenated, such that the output will have the format [time][batch][cell_fw.output_size + cell_bw.output_size]. The input_size of the forward and backward cells must match. The initial state for both directions is zero by default (but can be set optionally), and no intermediate states are ever returned. The network is fully unrolled for the given (passed in) length(s) of the sequence(s), or completely unrolled if length(s) is not given.
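The depth concatenation of forward and backward outputs can be sketched in plain Python. This is an illustration under stated assumptions: sum_cell, max_cell, run_rnn and toy_bidirectional_rnn are hypothetical names, each cell emits a depth-1 output, and the batch dimension is omitted for brevity.

```python
def run_rnn(cell, inputs, initial_state):
    """Unidirectional loop: feed each input to the cell, threading the state."""
    state = initial_state
    outputs = []
    for x in inputs:
        output, state = cell(x, state)
        outputs.append(output)
    return outputs, state


def sum_cell(x, state):
    """Hypothetical forward cell: running sum, output is a depth-1 list."""
    new_state = state + x
    return [new_state], new_state


def max_cell(x, state):
    """Hypothetical backward cell: running maximum, output is a depth-1 list."""
    new_state = max(state, x)
    return [new_state], new_state


def toy_bidirectional_rnn(cell_fw, cell_bw, inputs):
    outputs_fw, state_fw = run_rnn(cell_fw, inputs, 0)
    # The backward cell reads the sequence in reverse; its outputs are
    # reversed again so that index t lines up with input t.
    outputs_bw_reversed, state_bw = run_rnn(cell_bw, inputs[::-1], 0)
    outputs_bw = outputs_bw_reversed[::-1]
    # Depth-concatenate: each time step holds forward then backward output.
    outputs = [fw + bw for fw, bw in zip(outputs_fw, outputs_bw)]
    return outputs, state_fw, state_bw


outputs, state_fw, state_bw = toy_bidirectional_rnn(sum_cell, max_cell, [1, 5, 2])
print(outputs, state_fw, state_bw)
```

Each output has depth 2 here, the sum of the two cells' output sizes, and only the final state of each direction is returned.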
Args:
  cell_fw: An instance of RNNCell, to be used for the forward direction.
  cell_bw: An instance of RNNCell, to be used for the backward direction.
  inputs: A length T list of inputs, each a tensor of shape [batch_size, input_size], or a nested tuple of such elements.
  initial_state_fw: (optional) An initial state for the forward RNN. This must be a tensor of appropriate type and shape [batch_size, cell_fw.state_size]. If cell_fw.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell_fw.state_size.
  initial_state_bw: (optional) Same as for initial_state_fw, but using the corresponding properties of cell_bw.
  dtype: (optional) The data type for the initial state. Required if either of the initial states is not provided.
  sequence_length: (optional) An int32/int64 vector of size [batch_size], containing the actual lengths for each of the sequences.
  scope: VariableScope for the created subgraph; defaults to "BiRNN".
Returns:
A tuple (outputs, output_state_fw, output_state_bw) where:
  outputs is a length T list of outputs (one for each input), which are depth-concatenated forward and backward outputs.
  output_state_fw is the final state of the forward rnn.
  output_state_bw is the final state of the backward rnn.
Raises:
  TypeError: If cell_fw or cell_bw is not an instance of RNNCell.
  ValueError: If inputs is None or an empty list.