Have a question? Connect with the community at the TensorFlow Forum Visit Forum

tfnlp.layers.CachedAttention

Attention layer with cache used for auto-agressive decoding.

Inherits From: MultiHeadAttention

Arguments are the same as MultiHeadAttention layer.

Methods

call

View source

This is where the layer's logic lives.

Note here that call() method in tf.keras is little bit different from keras API. In keras API, you can pass support masking for layers as additional arguments. Whereas tf.keras has compute_mask() method to support masking.

Args
inputs Input tensor, or list/tuple of input tensors.
*args Additional positional arguments. Currently unused.
**kwargs Additional keyword arguments. Currently unused.

Returns
A tensor or list/tuple of tensors.