
LSTM unit with layer normalization and recurrent dropout.

Inherits From: RNNCell

This class adds layer normalization and recurrent dropout to a basic LSTM unit. Layer normalization implementation is based on:

"Layer Normalization" Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

and is applied before the internal nonlinearities. Recurrent dropout is based on:

"Recurrent Dropout without Memory Loss" Stanislau Semeniuta, Aliaksei Severyn, Erhardt Barth.
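To make the composition concrete, here is a minimal NumPy sketch of one step of such a cell: layer normalization is applied to each gate's pre-activation before its nonlinearity, and recurrent dropout is applied only to the candidate values, as in Semeniuta et al. The helper names (`layer_norm`, `lstm_step`) and the single combined weight matrix `W` are illustrative, not the library's actual internals.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_norm(x, gain=1.0, shift=0.0, eps=1e-6):
    # Normalize across the feature axis, then rescale and re-center.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gain * (x - mean) / np.sqrt(var + eps) + shift

def lstm_step(x, c, h, W, forget_bias=1.0, keep_prob=1.0, training=True):
    """One layer-normalized LSTM step with recurrent dropout on the
    candidate values. W maps the concatenated [x; h] to the four gates."""
    z = np.concatenate([x, h], axis=-1) @ W
    i, j, f, o = np.split(z, 4, axis=-1)
    # Layer norm goes before each internal nonlinearity.
    i, j, f, o = [layer_norm(t) for t in (i, j, f, o)]
    g = np.tanh(j)                        # candidate values
    if training and keep_prob < 1.0:      # recurrent dropout on g only
        mask = rng.binomial(1, keep_prob, g.shape) / keep_prob
        g = g * mask
    new_c = c * sigmoid(f + forget_bias) + sigmoid(i) * g
    new_h = np.tanh(layer_norm(new_c)) * sigmoid(o)
    return new_c, new_h

num_units, batch = 3, 2
x = rng.standard_normal((batch, num_units))
c = np.zeros((batch, num_units))
h = np.zeros((batch, num_units))
W = rng.standard_normal((2 * num_units, 4 * num_units))
c1, h1 = lstm_step(x, c, h, W, keep_prob=0.9)
```

Note that the new cell state is also layer-normalized before the final tanh that produces the output, consistent with "applied before the internal nonlinearities" above.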

Args:
num_units: int, the number of units in the LSTM cell.
forget_bias: float, the bias added to forget gates (see above).
input_size: Deprecated and unused.
activation: Activation function of the inner states.
layer_norm: If True, layer normalization will be applied.
norm_gain: float, the layer normalization gain initial value. If layer_norm has been set to False, this argument will be ignored.
norm_shift: float, the layer normalization shift initial value. If layer_norm has been set to False, this argument will be ignored.
dropout_keep_prob: unit Tensor or float between 0 and 1 representing the recurrent dropout probability value. If float and 1.0, no dropout will be applied.
dropout_prob_seed: (optional) integer, the randomness seed.
reuse: (optional) Python boolean describing whether to reuse variables in an existing scope. If not True, and the existing scope already has the given variables, an error is raised.
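norm_gain and norm_shift initialize the scale and offset of each layer normalization, so after normalization the features have roughly mean norm_shift and standard deviation norm_gain. A small sketch of that effect, using an illustrative scalar-parameter `layer_norm` helper (not the library's own function):

```python
import numpy as np

def layer_norm(x, gain=1.0, shift=0.0, eps=1e-6):
    # Normalize across the feature axis, then rescale and re-center.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gain * (x - mean) / np.sqrt(var + eps) + shift

x = np.array([[10.0, 20.0, 30.0, 40.0]])
y = layer_norm(x, gain=0.5, shift=2.0)
# The normalized row has mean ~= shift and standard deviation ~= gain.
```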


Attributes:
output_size: Integer or TensorShape: size of outputs produced by this cell.
state_size: size(s) of state(s) used by this cell. It can be represented by an Integer, a TensorShape, or a tuple of Integers or TensorShapes.




zero_state(batch_size, dtype): Returns zero-filled state tensor(s).

batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.

If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size, state_size] filled with zeros.

If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size, s] for each s in state_size.
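The two cases above can be mirrored by a small recursive NumPy sketch (the function name `zero_state` here is illustrative, and an LSTM state is a (c, h) pair, so its state_size is a pair of num_units values):

```python
import numpy as np

def zero_state(batch_size, state_size, dtype=np.float32):
    """An int state_size yields one [batch_size, state_size] zero tensor;
    a nested tuple yields a tuple of the same structure with a
    [batch_size, s] zero tensor for each component s."""
    if isinstance(state_size, int):
        return np.zeros((batch_size, state_size), dtype=dtype)
    return tuple(zero_state(batch_size, s, dtype) for s in state_size)

# For an LSTM with num_units=3, state_size is (3, 3): cell state and output.
c, h = zero_state(batch_size=4, state_size=(3, 3))
# c and h are both zero-filled arrays of shape (4, 3).
```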