Transformer block for MobileBERT.

An implementation of one layer (block) of the Transformer with a bottleneck and inverted bottleneck for MobileBERT.

Original paper for MobileBERT: "MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices" (https://arxiv.org/abs/2004.02984).

hidden_size: Hidden size for the Transformer input and output tensor.
num_attention_heads: Number of attention heads in the Transformer.
intermediate_size: The size of the "intermediate" (a.k.a. feed-forward) layer.
intermediate_act_fn: The non-linear activation function to apply to the output of the intermediate/feed-forward layer.
hidden_dropout_prob: Dropout probability for the hidden layers.
attention_probs_dropout_prob: Dropout probability of the attention probabilities.
intra_bottleneck_size: Size of the bottleneck.
use_bottleneck_attention: Whether to use attention inputs from the bottleneck transformation. If True, key_query_shared_bottleneck is ignored.
key_query_shared_bottleneck: Whether to share the linear transformation for keys and queries.
num_feedforward_networks: Number of stacked feed-forward networks.
normalization_type: The type of normalization; only no_norm and layer_norm are supported. no_norm represents the element-wise linear transformation for the student model, as suggested by the original MobileBERT paper. layer_norm is used for the teacher model.
initializer: The initializer to use for the embedding weights and linear projection weights.
**kwargs: Additional keyword arguments.

Raises ValueError if a Tensor shape or parameter is invalid.
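The bottleneck/inverted-bottleneck data flow the arguments above describe can be sketched in plain numpy. This is a hypothetical, shapes-only illustration (random stand-in weights, single-head attention, no dropout or normalization), not the layer's actual implementation:

```python
import numpy as np

batch_size, seq_length = 2, 16
hidden_size = 512            # Transformer input/output width
intra_bottleneck_size = 128  # narrow width used inside the block
intermediate_size = 512      # feed-forward width
num_feedforward_networks = 4

rng = np.random.default_rng(0)

def dense(x, out_dim):
    """Stand-in for a learned linear projection."""
    w = rng.standard_normal((x.shape[-1], out_dim)) * 0.02
    return x @ w

x = rng.standard_normal((batch_size, seq_length, hidden_size))

# Bottleneck: project the input down to intra_bottleneck_size.
h = dense(x, intra_bottleneck_size)

# Self-attention runs on the narrow tensor; its shape is preserved.
scores = h @ h.transpose(0, 2, 1) / np.sqrt(intra_bottleneck_size)
probs = np.exp(scores - scores.max(-1, keepdims=True))
probs /= probs.sum(-1, keepdims=True)
h = probs @ h

# Stacked feed-forward networks inside the bottleneck.
for _ in range(num_feedforward_networks):
    ff = np.maximum(dense(h, intermediate_size), 0.0)  # ReLU intermediate
    h = dense(ff, intra_bottleneck_size)

# Inverted bottleneck: project back up to hidden_size.
layer_output = dense(h, hidden_size)
assert layer_output.shape == (batch_size, seq_length, hidden_size)
```

The point of the narrow width is that attention and the stacked feed-forward networks operate on `intra_bottleneck_size` channels rather than `hidden_size`, which is where MobileBERT saves compute.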




Implements the forward pass.

input_tensor: Float tensor of shape (batch_size, seq_length, hidden_size).
attention_mask: (Optional) int32 tensor of shape (batch_size, seq_length, seq_length), with 1 for positions that can be attended to and 0 for positions that should not be.
return_attention_scores: Whether to return the attention scores.

layer_output: Float tensor of shape (batch_size, seq_length, hidden_size).
attention_scores: Only returned when return_attention_scores is True.

Raises ValueError if a Tensor shape or parameter is invalid.
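The attention_mask argument above is typically derived from a per-token padding mask. A minimal sketch (the padding_mask values here are made up for illustration), assuming 1 marks real tokens and 0 marks padding:

```python
import numpy as np

# Per-token padding mask of shape (batch_size, seq_length):
# 1 = real token, 0 = padding.
padding_mask = np.array([[1, 1, 1, 0],
                         [1, 1, 0, 0]], dtype=np.int32)

# Broadcast to (batch_size, seq_length, seq_length) so that every query
# position may attend to each non-pad key position.
attention_mask = padding_mask[:, None, :] * np.ones(
    (1, padding_mask.shape[1], 1), dtype=np.int32)
```

Each row of `attention_mask[b]` repeats the key-side padding pattern, matching the shape and 0/1 semantics described for attention_mask above.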