Sparse MoE layer plus a FeedForward layer evaluated for all tokens.
tfm.nlp.layers.MoeLayerWithBackbone(
moe: tfm.nlp.layers.MoeLayer,
backbone_d_ff: int,
*,
inner_dropout: float = 0.0,
output_dropout: float = 0.0,
activation: Callable[[tf.Tensor], tf.Tensor] = tf.keras.activations.gelu,
kernel_initializer: _InitializerType = _DEFAULT_KERNEL_INITIALIZER,
bias_initializer: _InitializerType = _DEFAULT_BIAS_INITIALIZER,
name: str = 'moe_with_backbone',
**kwargs
)
Uses Keras add_loss() and add_metric() APIs.
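The description above (a sparse MoE path plus a dense feed-forward path evaluated for every token) can be illustrated with a small NumPy sketch. This is not the library's implementation: the top-1 router, the `[batch, seq_len, d_model]` input layout, and the choice to sum the two paths are assumptions made for illustration, and all function names (`feed_forward`, `top1_moe`, `moe_with_backbone`) are hypothetical.

```python
import numpy as np


def gelu(x):
    # Tanh approximation of GELU.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def feed_forward(x, w_in, w_out):
    # Dense two-layer MLP applied to every token: d_model -> d_ff -> d_model.
    return gelu(x @ w_in) @ w_out


def top1_moe(x, router_w, experts):
    # Toy sparse path (assumption): each token goes to the single expert
    # with the highest router probability, scaled by that probability.
    tokens = x.reshape(-1, x.shape[-1])
    logits = tokens @ router_w
    logits = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
    choice = probs.argmax(axis=-1)
    out = np.zeros_like(tokens)
    for e, (w_in, w_out) in enumerate(experts):
        mask = choice == e
        if mask.any():
            out[mask] = probs[mask, e:e + 1] * feed_forward(tokens[mask], w_in, w_out)
    return out.reshape(x.shape)


def moe_with_backbone(x, router_w, experts, backbone):
    # Assumption: the sparse and dense outputs are combined by addition.
    return top1_moe(x, router_w, experts) + feed_forward(x, *backbone)


rng = np.random.default_rng(0)
batch, seq_len, d_model, d_ff, backbone_d_ff, num_experts = 2, 4, 8, 16, 32, 3
x = rng.normal(size=(batch, seq_len, d_model))
router_w = rng.normal(size=(d_model, num_experts))
experts = [
    (rng.normal(size=(d_model, d_ff)), rng.normal(size=(d_ff, d_model)))
    for _ in range(num_experts)
]
backbone = (
    rng.normal(size=(d_model, backbone_d_ff)),
    rng.normal(size=(backbone_d_ff, d_model)),
)

y = moe_with_backbone(x, router_w, experts, backbone)
print(y.shape)  # (2, 4, 8)
```

In this sketch, `backbone_d_ff` plays the role of the hidden width of the dense path, while each expert has its own, independent hidden width.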
Methods
call
call(
inputs: tf.Tensor, *, training: Optional[bool] = None
) -> tf.Tensor
Applies the MoeLayerWithBackbone layer.
Args | |
---|---|
inputs | Batch of input embeddings of shape |
training | Only apply dropout and jitter noise during training. If not provided, the value is taken from tf.keras.backend. |
Returns | |
---|---|
Transformed inputs with the same shape as inputs: |
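The effect of the `training` flag on dropout, and the shape-preserving behavior of `call`, can be sketched in NumPy as follows. This is a hedged illustration only: where `inner_dropout` and `output_dropout` are applied inside the dense path is an assumption inferred from the parameter names, ReLU stands in for the configurable activation, and the helper names (`dropout`, `backbone_ffn`) are hypothetical. Router jitter noise is not modeled here.

```python
import numpy as np


def dropout(x, rate, training, rng):
    # Inverted dropout: identity at inference; random zeroing with
    # rescaling by 1 / (1 - rate) when training.
    if not training or rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)


def backbone_ffn(x, w_in, w_out, inner_dropout, output_dropout, *, training, rng):
    # Assumed placement: inner_dropout after the activation,
    # output_dropout after the second projection.
    h = np.maximum(x @ w_in, 0.0)  # ReLU stands in for the activation
    h = dropout(h, inner_dropout, training, rng)
    y = h @ w_out
    return dropout(y, output_dropout, training, rng)


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 8))
w_in = rng.normal(size=(8, 32))
w_out = rng.normal(size=(32, 8))

eval_a = backbone_ffn(x, w_in, w_out, 0.1, 0.1, training=False, rng=rng)
eval_b = backbone_ffn(x, w_in, w_out, 0.1, 0.1, training=False, rng=rng)
train_out = backbone_ffn(x, w_in, w_out, 0.1, 0.1, training=True, rng=rng)

print(eval_a.shape == x.shape)      # True
print(np.allclose(eval_a, eval_b))  # True
```

With `training=False` the sketch is deterministic, so repeated calls agree; with `training=True` the output is perturbed by dropout but keeps the input shape.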