Google I/O returns May 18-20! Reserve space and build your schedule Register now


Long short-term memory unit (LSTM) recurrent network cell.

Inherits From: RNNCell, Layer, Layer, Module

The default non-peephole implementation is based on (Gers et al., 1999). The peephole implementation is based on (Sak et al., 2014).

The class uses optional peep-hole connections, optional cell clipping, and an optional projection layer.

Note that this cell is not optimized for performance. Please use tf.contrib.cudnn_rnn.CudnnLSTM for better performance on GPU, or tf.contrib.rnn.LSTMBlockCell and tf.contrib.rnn.LSTMBlockFusedCell for better performance on CPU. References: Long short-term memory recurrent neural network architectures for large scale acoustic modeling: Sak et al., 2014 (pdf) Learning to forget: Gers et al., 1999 (pdf) Long Short-Term Memory: Hochreiter et al., 1997 (pdf)

num_units int, The number of units in the LSTM cell.
use_peepholes bool, set True to enable diagonal/peephole connections.
cell_clip (optional) A float value, if provided the cell state is clipped by this value prior to the cell output activation.
initializer (optional) The initializer to use for the weight and projection matrices.
num_proj (optional) int, The output dimensionality for the projection matrices. If None, no projection is performed.
proj_clip (optional) A float value. If num_proj > 0 and proj_clip is provided, then the projected values are clipped elementwise to within [-proj_clip, proj_clip].
num_unit_shards Deprecated, will be removed by Jan. 2017. Use a variable_scope partitioner instead.
num_proj_shards Deprecated, will be removed by Jan. 2017. Use a variable_scope partitioner instead.
forget_bias Biases of the forget gate are initialized by default to 1 in order to reduce the scale of forgetting at the beginning of the training. Must set it manually to 0.0 when restoring from CudnnLSTM trained checkpoints.
state_is_tuple If True, accepted and returned states are 2-tuples of the c_state and m_state. If False, they are concatenated along the column axis. This latter behavior will soon be deprecated.
activation Activation function of the inner states. Default: tanh. It could also be string that is within Keras activation function names.
reuse (optional) Python boolean describing whether to reuse variables in an existing scope. If not True, and the existing scope already has the given variables, an error is raised.
name String, the name of the layer. Layers with the same name will share weights, but to avoid mistakes we require reuse=True in such cases.
dtype Default dtype of the layer (default of None means use the type of the first input). Required when build is called before call.
**kwargs Dict, keyword named properties for common layer attributes, like trainable etc when constructing the cell from configs of get_config(). When restoring from CudnnLSTM-trained checkpoints, use CudnnCompatibleLSTMCell instead.


output_size Integer or TensorShape: size of outputs produced by this cell.

state_size size(s) of state(s) used by this cell.

It can be represented by an Integer, a TensorShape or a tuple of Integers or TensorShapes.



View source


View source

Return zero-filled state tensor(s).

batch_size int, float, or unit Tensor representing the batch size.
dtype the data type to use for the state.

If state_size is an int or TensorShape, then the return value is a N-D tensor of shape [batch_size, state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size, s] for each s in state_size.