

Attention layer with a cache, used for auto-regressive decoding.

Inherits From: MultiHeadAttention

Arguments are the same as MultiHeadAttention layer.




The call() method is where the layer's logic lives.

Note that the call() method in tf.keras differs slightly from the Keras API: in the Keras API you can pass masking support to layers as additional arguments, whereas tf.keras provides a compute_mask() method to support masking.

Args
inputs: Input tensor, or list/tuple of input tensors.
*args: Additional positional arguments. Currently unused.
**kwargs: Additional keyword arguments. Currently unused.

Returns
A tensor or list/tuple of tensors.
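
The idea behind the cache can be sketched without TensorFlow: during auto-regressive decoding, each step contributes one new key/value pair, which is appended to the cached keys and values so that earlier projections are not recomputed. The following numpy sketch is purely illustrative; the function name, cache layout, and shapes are assumptions for this example, not part of the layer's API.

```python
import numpy as np

def cached_attention_step(q, new_k, new_v, cache):
    """One decoding step of single-head scaled dot-product attention.

    q, new_k, new_v: arrays of shape (batch, 1, d) for the current step.
    cache: dict with "k" and "v" arrays of shape (batch, seen_so_far, d).
    """
    # Append this step's key/value to the cache along the sequence axis.
    cache["k"] = np.concatenate([cache["k"], new_k], axis=1)
    cache["v"] = np.concatenate([cache["v"], new_v], axis=1)

    # Scaled dot-product attention over all cached positions.
    scores = q @ cache["k"].transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ cache["v"], cache

# Example: 2 sequences, 3 tokens already cached, head size 4.
rng = np.random.default_rng(0)
cache = {"k": rng.standard_normal((2, 3, 4)),
         "v": rng.standard_normal((2, 3, 4))}
q = rng.standard_normal((2, 1, 4))
out, cache = cached_attention_step(
    q, rng.standard_normal((2, 1, 4)), rng.standard_normal((2, 1, 4)), cache)
# After the step, the cache holds 4 positions and the output is one token.
```

The actual layer generalizes this to multiple heads and handles the projection weights it inherits from MultiHeadAttention, but the cache bookkeeping follows the same append-then-attend pattern.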