![]() |
TransformerScaffold for packing optimization to stride over inputs.
Inherits From: TransformerScaffold
tfm.nlp.layers.StridedTransformerScaffold(
num_attention_heads,
inner_dim=768,
inner_activation=tfm.utils.activations.gelu
,
attention_cls=attention.MultiHeadAttention,
attention_cfg=None,
feedforward_cls=None,
feedforward_cfg=None,
dropout_rate=0.0,
attention_dropout_rate=0.0,
norm_first=False,
norm_epsilon=1e-12,
kernel_initializer='glorot_uniform',
bias_initializer='zeros',
kernel_regularizer=None,
bias_regularizer=None,
activity_regularizer=None,
kernel_constraint=None,
bias_constraint=None,
**kwargs
)
Methods
call
call(
inputs, stride: tf.Tensor, training=None
)
This is where the layer's logic lives.
The call()
method may not create state (except in its first
invocation, wrapping the creation of variables or other resources in
tf.init_scope()
). It is recommended to create state, including
tf.Variable
instances and nested Layer
instances,
in __init__()
, or in the build()
method that is
called automatically before call()
executes for the first time.
Args | |
---|---|
inputs
|
Input tensor, or dict/list/tuple of input tensors.
The first positional inputs argument is subject to special rules:
|
*args
|
Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above. |
**kwargs
|
Additional keyword arguments. May contain tensors, although
this is not recommended, for the reasons above.
The following optional keyword arguments are reserved:
training : Boolean scalar tensor of Python boolean indicating
whether the call is meant for training or inference.mask : Boolean input mask. If the layer's call() method takes a
mask argument, its default value will be set to the mask
generated for inputs by the previous layer (if input did come
from a layer that generated a corresponding mask, i.e. if it came
from a Keras layer with masking support).
|
Returns | |
---|---|
A tensor or list/tuple of tensors. |