Creates a network layer that adds a sinusoidal positional encoding.

The positional encoding increments across frames and is added to the input. The encoding is initially weighted at zero so that the network can learn to ignore it. This implements:

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin. Attention Is All You Need. (https://arxiv.org/abs/1706.03762)
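As a rough illustration, the sinusoidal encoding from the paper above assigns each frame index a distinct sin/cos pattern. This is a minimal NumPy sketch of that formula, not the layer's actual implementation; `sinusoidal_encoding` is a hypothetical helper name, and channels are assumed even:

```python
import numpy as np

def sinusoidal_encoding(num_frames: int, channels: int) -> np.ndarray:
    """Standard sinusoidal positional encoding (Vaswani et al., 2017).

    Returns an array of shape (num_frames, channels). Even channels get
    sin(pos / 10000^(2i/d)), odd channels get the matching cos.
    Assumes `channels` is even.
    """
    positions = np.arange(num_frames)[:, None]      # (frames, 1)
    dims = np.arange(0, channels, 2)[None, :]       # (1, channels // 2)
    angles = positions / np.power(10000.0, dims / channels)
    enc = np.zeros((num_frames, channels))
    enc[:, 0::2] = np.sin(angles)
    enc[:, 1::2] = np.cos(angles)
    return enc
```

Each row of the result is what gets added to the corresponding input frame.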

Args:

initializer: A str. The name of the initializer for the weight that scales the positional encoding.
cache_encoding: A bool. If True, cache the positional encoding tensor after it is built; otherwise, rebuild the tensor on every call. Setting this to False is useful when the number of input frames varies, since the positional encoding tensor then needs to change shape.
state_prefix: A prefix string used to identify states.
**kwargs: Additional keyword arguments to be passed to this layer.




Calls the layer with the given inputs.

Args:

inputs: An input tf.Tensor.
states: A dict of states; any keys matching this layer's state names overwrite the contents of the corresponding buffer(s). Expected keys include state_prefix + '_pos_enc_frame_count'.
output_states: A bool. If True, returns both the output tensor and the output states; otherwise returns just the output tensor.

Returns:

An output tf.Tensor (and optionally the states if output_states=True).
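The frame-count state is what lets consecutive clips continue one positional sequence. This is a hedged sketch of that contract in NumPy, not the layer's code; `call_with_states` is a hypothetical function, the learned weight is omitted, and channels are assumed even:

```python
import numpy as np

def call_with_states(inputs, states, state_prefix="stream"):
    """Sketch of the streaming-call contract: the stored frame count
    offsets the positional encoding so a later clip picks up where
    the previous clip left off."""
    key = state_prefix + "_pos_enc_frame_count"
    start = int(states.get(key, 0))
    frames, channels = inputs.shape[1], inputs.shape[-1]
    positions = (start + np.arange(frames))[:, None]  # absolute frame indices
    dims = np.arange(0, channels, 2)[None, :]
    angles = positions / np.power(10000.0, dims / channels)
    enc = np.zeros((frames, channels))
    enc[:, 0::2] = np.sin(angles)
    enc[:, 1::2] = np.cos(angles)
    outputs = inputs + enc                  # learned weight omitted for brevity
    new_states = dict(states)
    new_states[key] = start + frames        # advance the frame counter
    return outputs, new_states
```

Because the encoding depends only on the absolute frame index, feeding two 2-frame clips through the returned states produces the same result as one 4-frame clip.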

Raises:

ValueError: If using 'channels_first' data format.