Block diagonal feedforward layer.
tfm.nlp.layers.BlockDiagFeedforward(
intermediate_size: int,
intermediate_activation: str,
dropout: float,
num_blocks: int = 1,
apply_mixing: bool = True,
kernel_initializer: str = 'glorot_uniform',
bias_initializer: str = 'zeros',
kernel_regularizer: Optional[tf.keras.regularizers.Regularizer] = None,
bias_regularizer: Optional[tf.keras.regularizers.Regularizer] = None,
activity_regularizer: Optional[tf.keras.regularizers.Regularizer] = None,
kernel_constraint: Optional[tf.keras.constraints.Constraint] = None,
bias_constraint: Optional[tf.keras.constraints.Constraint] = None,
**kwargs
)
This layer replaces the weight matrix of the output_dense layer with a block diagonal matrix to reduce the layer's parameter count and FLOPs. A linear mixing layer can optionally be added to improve the layer's expressiveness.
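To make the parameter saving concrete, here is a minimal NumPy sketch of a block diagonal projection. It is an illustration under stated assumptions, not the library's implementation: the helper names `block_diag_project` and `to_dense` are hypothetical, and the `mixing` argument shows only one plausible form of a linear mixing step (a `num_blocks x num_blocks` matrix that combines block outputs).

```python
import numpy as np


def block_diag_project(x, blocks, mixing=None):
    """Project x with a block diagonal weight matrix stored as per-block kernels.

    x:      (batch, num_blocks * in_per_block)
    blocks: (num_blocks, in_per_block, out_per_block)
    mixing: optional (num_blocks, num_blocks) matrix. This is an ASSUMED form
            of the linear mixing step, not taken from the library source.
    """
    num_blocks, in_per_block, out_per_block = blocks.shape
    # Split the feature axis into one chunk per block.
    x_blocks = x.reshape(x.shape[0], num_blocks, in_per_block)
    # Each chunk is multiplied only by its own small kernel.
    y_blocks = np.einsum("bki,kio->bko", x_blocks, blocks)
    if mixing is not None:
        # Assumed mixing: every output block becomes a linear combination
        # of all block outputs.
        y_blocks = np.einsum("bko,kj->bjo", y_blocks, mixing)
    return y_blocks.reshape(x.shape[0], num_blocks * out_per_block)


def to_dense(blocks):
    """Build the equivalent dense matrix, zero outside the diagonal blocks."""
    num_blocks, in_per_block, out_per_block = blocks.shape
    dense = np.zeros((num_blocks * in_per_block, num_blocks * out_per_block))
    for k in range(num_blocks):
        rows = slice(k * in_per_block, (k + 1) * in_per_block)
        cols = slice(k * out_per_block, (k + 1) * out_per_block)
        dense[rows, cols] = blocks[k]
    return dense


rng = np.random.default_rng(0)
intermediate_size, hidden_size, num_blocks = 8, 4, 2
blocks = rng.normal(
    size=(num_blocks, intermediate_size // num_blocks, hidden_size // num_blocks)
)
x = rng.normal(size=(3, intermediate_size))

# The per-block product matches a dense matmul with the block diagonal matrix.
y_block = block_diag_project(x, blocks)
y_dense = x @ to_dense(blocks)
assert np.allclose(y_block, y_dense)

# Kernel parameters: block diagonal vs. full dense projection.
print(blocks.size, intermediate_size * hidden_size)  # 16 32

# Without mixing, shifting the first input chunk leaves the second
# output block unchanged.
x_shifted = x.copy()
x_shifted[:, : intermediate_size // num_blocks] += 1.0
y_shifted = block_diag_project(x_shifted, blocks)
half = hidden_size // num_blocks
assert np.allclose(y_block[:, half:], y_shifted[:, half:])

# With the assumed mixing step, the same shift reaches every output block.
mixing = rng.normal(size=(num_blocks, num_blocks))
y_mixed = block_diag_project(x, blocks, mixing)
y_mixed_shifted = block_diag_project(x_shifted, blocks, mixing)
```

In this sketch the kernel shrinks by a factor of `num_blocks` relative to a dense projection of the same input and output sizes, at the cost that each output block sees only its own input chunk unless a mixing step is applied. The sketch assumes both sizes are divisible by `num_blocks`.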
Args | |
---|---|
intermediate_size | Size of the intermediate layer. |
intermediate_activation | Activation for the intermediate layer. |
dropout | Dropout probability for the output dropout. |
num_blocks | The number of blocks for the block diagonal matrix of the output_dense layer. |
apply_mixing | Apply linear mixing if True. |
kernel_initializer | Initializer for dense layer kernels. |
bias_initializer | Initializer for dense layer biases. |
kernel_regularizer | Regularizer for dense layer kernels. |
bias_regularizer | Regularizer for dense layer biases. |
activity_regularizer | Regularizer for dense layer activity. |
kernel_constraint | Constraint for dense layer kernels. |
bias_constraint | Constraint for dense layer biases. |
Methods
call
call(
inputs
)
This is where the layer's logic lives.
The call() method may not create state (except in its first invocation, wrapping the creation of variables or other resources in tf.init_scope()). It is recommended to create state in __init__(), or the build() method that is called automatically before call() executes the first time.
Args | |
---|---|
inputs | Input tensor, or dict/list/tuple of input tensors. The first positional inputs argument is subject to special rules: |
*args | Additional positional arguments. May contain tensors, although this is not recommended, for the reasons above. |
**kwargs | Additional keyword arguments. May contain tensors, although this is not recommended, for the reasons above. The following optional keyword arguments are reserved: training: Boolean scalar tensor or Python boolean indicating whether the call is meant for training or inference. mask: Boolean input mask. If the layer's call() method takes a mask argument, its default value will be set to the mask generated for inputs by the previous layer (if the input came from a layer that generated a corresponding mask, i.e. a Keras layer with masking support). |
Returns | |
---|---|
A tensor or list/tuple of tensors. |