Recurrent Neural Networks

TensorFlow provides a number of methods for constructing Recurrent Neural Networks. Most accept an RNNCell-subclassed object (see the documentation for tf.nn.rnn_cell).

tf.nn.dynamic_rnn(cell, inputs, sequence_length=None, initial_state=None, dtype=None, parallel_iterations=None, swap_memory=False, time_major=False, scope=None)

Creates a recurrent neural network specified by RNNCell cell.

This function is functionally equivalent to rnn (documented below), but performs fully dynamic unrolling of inputs.

Unlike rnn, the input inputs is not a Python list of Tensors, one for each frame. Instead, inputs may be a single Tensor where the maximum time is either the first or second dimension (see the parameter time_major). Alternatively, it may be a (possibly nested) tuple of Tensors, each of them having matching batch and time dimensions. The corresponding output is either a single Tensor having the same number of time steps and batch size, or a (possibly nested) tuple of such tensors, matching the nested structure of cell.output_size.

The parameter sequence_length is optional and is used to copy through the state and zero out the outputs past each batch element's sequence length. It is therefore provided for correctness rather than performance, unlike in rnn().

Args:
  • cell: An instance of RNNCell.
  • inputs: The RNN inputs.

    If time_major == False (default), this must be a Tensor of shape: [batch_size, max_time, ...], or a nested tuple of such elements.

    If time_major == True, this must be a Tensor of shape: [max_time, batch_size, ...], or a nested tuple of such elements.

    This may also be a (possibly nested) tuple of Tensors satisfying this property. The first two dimensions must match across all the inputs, but otherwise the ranks and other shape components may differ. In this case, input to cell at each time-step will replicate the structure of these tuples, except for the time dimension (from which the time is taken).

    The input to cell at each time step will be a Tensor or (possibly nested) tuple of Tensors each with dimensions [batch_size, ...].

  • sequence_length: (optional) An int32/int64 vector sized [batch_size].

  • initial_state: (optional) An initial state for the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.
  • dtype: (optional) The data type for the initial state and expected output. Required if initial_state is not provided or RNN state has a heterogeneous dtype.
  • parallel_iterations: (Default: 32). The number of iterations to run in parallel. Those operations which do not have any temporal dependency and can be run in parallel, will be. This parameter trades off time for space. Values >> 1 use more memory but take less time, while smaller values use less memory but computations take longer.
  • swap_memory: Transparently swap the tensors produced in forward inference but needed for back prop from GPU to CPU. This allows training RNNs which would typically not fit on a single GPU, with very minimal (or no) performance penalty.
  • time_major: The shape format of the inputs and outputs Tensors. If true, these Tensors must be shaped [max_time, batch_size, depth]. If false, these Tensors must be shaped [batch_size, max_time, depth]. Using time_major = True is a bit more efficient because it avoids transposes at the beginning and end of the RNN calculation. However, most TensorFlow data is batch-major, so by default this function accepts input and emits output in batch-major form.
  • scope: VariableScope for the created subgraph; defaults to "RNN".
Returns:

A pair (outputs, state) where:

  • outputs: The RNN output Tensor.

    If time_major == False (default), this will be a Tensor shaped: [batch_size, max_time, cell.output_size].

    If time_major == True, this will be a Tensor shaped: [max_time, batch_size, cell.output_size].

    Note that if cell.output_size is a (possibly nested) tuple of integers or TensorShape objects, then outputs will be a tuple having the same structure as cell.output_size, containing Tensors having shapes corresponding to the shape data in cell.output_size.

  • state: The final state. If cell.state_size is an int, this will be shaped [batch_size, cell.state_size]. If it is a TensorShape, this will be shaped [batch_size] + cell.state_size. If it is a (possibly nested) tuple of ints or TensorShape, this will be a tuple having the corresponding shapes.

Raises:
  • TypeError: If cell is not an instance of RNNCell.
  • ValueError: If inputs is None or an empty list.
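
For illustration, here is a minimal usage sketch of tf.nn.dynamic_rnn in the default batch-major form; the cell class (tf.nn.rnn_cell.BasicLSTMCell), placeholder shapes, and unit counts are illustrative assumptions rather than part of this API:

  import tensorflow as tf

  # Batch-major input: 4 sequences, up to 20 steps, 8 features per step.
  batch_size, max_time, depth = 4, 20, 8
  inputs = tf.placeholder(tf.float32, [batch_size, max_time, depth])
  # Actual length of each sequence in the batch (values <= max_time).
  seq_len = tf.placeholder(tf.int32, [batch_size])

  cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=16)

  # outputs has shape [batch_size, max_time, 16]; state is the cell's
  # final state. Outputs past each sequence's length are zeroed out.
  outputs, state = tf.nn.dynamic_rnn(cell, inputs,
                                     sequence_length=seq_len,
                                     dtype=tf.float32)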

tf.nn.rnn(cell, inputs, initial_state=None, dtype=None, sequence_length=None, scope=None)

Creates a recurrent neural network specified by RNNCell cell.

The simplest form of RNN network generated is:

  state = cell.zero_state(...)
  outputs = []
  for input_ in inputs:
    output, state = cell(input_, state)
    outputs.append(output)
  return (outputs, state)

However, a few other options are available:

An initial state can be provided. If the sequence_length vector is provided, dynamic calculation is performed. This method of calculation does not compute the RNN steps past the maximum sequence length of the minibatch (thus saving computational time), and properly propagates the state at an example's sequence length to the final state output.

The dynamic calculation performed is, at time t for batch row b:

  (output, state)(b, t) =
    (t >= sequence_length(b))
      ? (zeros(cell.output_size), states(b, sequence_length(b) - 1))
      : cell(input(b, t), state(b, t - 1))

Args:
  • cell: An instance of RNNCell.
  • inputs: A length T list of inputs, each a Tensor of shape [batch_size, input_size], or a nested tuple of such elements.
  • initial_state: (optional) An initial state for the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.
  • dtype: (optional) The data type for the initial state and expected output. Required if initial_state is not provided or RNN state has a heterogeneous dtype.
  • sequence_length: Specifies the length of each sequence in inputs. An int32 or int64 vector (tensor) of size [batch_size], with values in [0, T).
  • scope: VariableScope for the created subgraph; defaults to "RNN".
Returns:

A pair (outputs, state) where:

  • outputs: A length T list of outputs (one for each input), or a nested tuple of such elements.
  • state: The final state.

Raises:
  • TypeError: If cell is not an instance of RNNCell.
  • ValueError: If inputs is None or an empty list, or if the input depth (column size) cannot be inferred from inputs via shape inference.
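
As a sketch of the list-of-Tensors input form described above (the placeholder shapes and the tf.nn.rnn_cell.GRUCell cell are illustrative assumptions):

  import tensorflow as tf

  # A length T list of inputs, one [batch_size, input_size] Tensor per step.
  T, batch_size, input_size = 10, 4, 8
  inputs = [tf.placeholder(tf.float32, [batch_size, input_size])
            for _ in range(T)]

  cell = tf.nn.rnn_cell.GRUCell(num_units=16)

  # outputs is a length T list of [batch_size, 16] Tensors;
  # state is the final [batch_size, 16] GRU state.
  outputs, state = tf.nn.rnn(cell, inputs, dtype=tf.float32)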

tf.nn.state_saving_rnn(cell, inputs, state_saver, state_name, sequence_length=None, scope=None)

RNN that accepts a state saver for time-truncated RNN calculation.

Args:
  • cell: An instance of RNNCell.
  • inputs: A length T list of inputs, each a Tensor of shape [batch_size, input_size].
  • state_saver: A state saver object with methods state and save_state.
  • state_name: Python string or tuple of strings. The name to use with the state_saver. If the cell returns tuples of states (i.e., cell.state_size is a tuple) then state_name should be a tuple of strings having the same length as cell.state_size. Otherwise it should be a single string.
  • sequence_length: (optional) An int32/int64 vector size [batch_size]. See the documentation for rnn() for more details about sequence_length.
  • scope: VariableScope for the created subgraph; defaults to "RNN".
Returns:

A pair (outputs, state) where:

  • outputs: A length T list of outputs (one for each input).
  • state: The final state.

Raises:
  • TypeError: If cell is not an instance of RNNCell.
  • ValueError: If inputs is None or an empty list, or if the arity and type of state_name does not match that of cell.state_size.
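
The sketch below uses a hypothetical in-graph state saver, a minimal stand-in assuming that state(name) returns the stored state Tensor and save_state(name, value) returns an op that writes it back; a real input pipeline would typically use a queue-backed state saver instead:

  import tensorflow as tf

  class ToyStateSaver(object):
    # Hypothetical minimal state saver: a single non-trainable variable,
    # exposing the state/save_state methods this function expects.

    def __init__(self, batch_size, state_size):
      self._var = tf.Variable(tf.zeros([batch_size, state_size]),
                              trainable=False, name="saved_state")

    def state(self, name):
      return self._var

    def save_state(self, name, value):
      return tf.assign(self._var, value)

  T, batch_size, input_size, num_units = 5, 4, 8, 16
  inputs = [tf.placeholder(tf.float32, [batch_size, input_size])
            for _ in range(T)]
  cell = tf.nn.rnn_cell.GRUCell(num_units)
  saver = ToyStateSaver(batch_size, num_units)

  # The final state of this truncated segment is written back via
  # save_state, so the next segment can resume from it.
  outputs, state = tf.nn.state_saving_rnn(cell, inputs,
                                          state_saver=saver,
                                          state_name="gru_state")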

tf.nn.bidirectional_rnn(cell_fw, cell_bw, inputs, initial_state_fw=None, initial_state_bw=None, dtype=None, sequence_length=None, scope=None)

Creates a bidirectional recurrent neural network.

Similar to the unidirectional case above (rnn), but builds independent forward and backward RNNs whose final outputs are depth-concatenated, so that the output has the format [time][batch][cell_fw.output_size + cell_bw.output_size]. The input_size of the forward and backward cells must match. The initial state for both directions is zero by default (but can be set optionally), and no intermediate states are ever returned: the network is fully unrolled for the given (passed-in) sequence length(s), or completely unrolled if no lengths are given.

Args:
  • cell_fw: An instance of RNNCell, to be used for forward direction.
  • cell_bw: An instance of RNNCell, to be used for backward direction.
  • inputs: A length T list of inputs, each a tensor of shape [batch_size, input_size], or a nested tuple of such elements.
  • initial_state_fw: (optional) An initial state for the forward RNN. This must be a tensor of appropriate type and shape [batch_size, cell_fw.state_size]. If cell_fw.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell_fw.state_size.
  • initial_state_bw: (optional) Same as for initial_state_fw, but using the corresponding properties of cell_bw.
  • dtype: (optional) The data type for the initial state. Required if either of the initial states are not provided.
  • sequence_length: (optional) An int32/int64 vector, size [batch_size], containing the actual lengths for each of the sequences.
  • scope: VariableScope for the created subgraph; defaults to "BiRNN".
Returns:

A tuple (outputs, output_state_fw, output_state_bw) where:

  • outputs: A length T list of outputs (one for each input), which are depth-concatenated forward and backward outputs.
  • output_state_fw: The final state of the forward RNN.
  • output_state_bw: The final state of the backward RNN.

Raises:
  • TypeError: If cell_fw or cell_bw is not an instance of RNNCell.
  • ValueError: If inputs is None or an empty list.
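
A minimal usage sketch (the GRU cells, shapes, and unit counts are illustrative assumptions):

  import tensorflow as tf

  # Independent forward and backward cells over a length T list of inputs.
  T, batch_size, input_size = 10, 4, 8
  inputs = [tf.placeholder(tf.float32, [batch_size, input_size])
            for _ in range(T)]

  cell_fw = tf.nn.rnn_cell.GRUCell(num_units=16)
  cell_bw = tf.nn.rnn_cell.GRUCell(num_units=16)

  # Each element of outputs has shape [batch_size, 32]: the forward and
  # backward outputs depth-concatenated (16 + 16).
  outputs, output_state_fw, output_state_bw = tf.nn.bidirectional_rnn(
      cell_fw, cell_bw, inputs, dtype=tf.float32)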