BERT pretraining model V2.

Adds a masked language model (MLM) head and optional classification heads on top of the transformer encoder.

encoder_network A transformer network. This network should output a sequence output and a classification output.
mlm_activation The activation (if any) to use in the masked LM network. If None, no activation will be used.
mlm_initializer The initializer (if any) to use in the masked LM. Defaults to a Glorot uniform initializer.
classification_heads An optional list of head layers that transform the encoder sequence outputs.
customized_masked_lm A customized masked_lm layer. If None, a standard layer is created from layers.MaskedLM. If not None, the specified layer is used, and the mlm_activation and mlm_initializer arguments above are ignored.
name The name of the model.
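
The precedence between customized_masked_lm and the two mlm_* arguments can be pictured with the plain-Python sketch below. It does not call the library: resolve_masked_lm and make_default_masked_lm are hypothetical names that stand in for the constructor logic and for layers.MaskedLM, and the "glorot_uniform" string is only a label for the documented default.

```python
def make_default_masked_lm(activation, initializer):
    """Hypothetical stand-in for building a standard layers.MaskedLM."""
    return {"kind": "default",
            "activation": activation,
            "initializer": initializer}


def resolve_masked_lm(mlm_activation=None,
                      mlm_initializer="glorot_uniform",
                      customized_masked_lm=None):
    """Pick the masked LM layer the way the argument list describes it."""
    if customized_masked_lm is not None:
        # A custom layer wins; mlm_activation and mlm_initializer are ignored.
        return customized_masked_lm
    return make_default_masked_lm(mlm_activation, mlm_initializer)


# No custom layer: the default is built from the mlm_* arguments.
default_layer = resolve_masked_lm(mlm_activation="gelu")

# Custom layer supplied: the mlm_* arguments have no effect.
custom_layer = resolve_masked_lm(mlm_activation="gelu",
                                 customized_masked_lm={"kind": "custom"})

print(default_layer["kind"], custom_layer["kind"])
```

The practical consequence is that any activation or initializer choice has to be configured on the custom layer itself when customized_masked_lm is passed.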

Inputs: The inputs defined by the encoder network, plus masked_lm_positions, passed together as a dictionary.

Outputs: A dictionary containing lm_output, the classification head outputs keyed by head name, and the outputs from encoder_network, keyed by sequence_output and encoder_outputs (if any).
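
As a rough illustration of how such an output dictionary could be put together, the sketch below uses plain Python lists in place of tensors. Everything in it is an assumption made for illustration: assemble_outputs, gather_positions, and first_token_head are hypothetical helpers, not library functions, and the key names simply follow the description above. Inspect the keys of the dictionary returned by your installed version rather than relying on this sketch.

```python
def gather_positions(sequence_output, masked_lm_positions):
    """Toy masked LM: keep only the hidden states at the masked positions."""
    return [[seq[p] for p in positions]
            for seq, positions in zip(sequence_output, masked_lm_positions)]


def first_token_head(sequence_output):
    """Toy classification head: read the first token of each sequence."""
    return [seq[0] for seq in sequence_output]


def assemble_outputs(encoder_outputs, masked_lm_positions, masked_lm, heads):
    """Build the output dictionary in the layout described above."""
    sequence_output = encoder_outputs["sequence_output"]
    outputs = {"lm_output": masked_lm(sequence_output, masked_lm_positions)}
    # One entry per classification head, keyed by the head's name.
    for name, head in heads.items():
        outputs[name] = head(sequence_output)
    # Encoder outputs are passed through under their own keys.
    outputs["sequence_output"] = sequence_output
    if "encoder_outputs" in encoder_outputs:  # present only for some encoders
        outputs["encoder_outputs"] = encoder_outputs["encoder_outputs"]
    return outputs


# A batch of one sequence with four token states, masked at positions 1 and 3.
encoder_outputs = {"sequence_output": [["h0", "h1", "h2", "h3"]]}
outputs = assemble_outputs(
    encoder_outputs,
    masked_lm_positions=[[1, 3]],
    masked_lm=gather_positions,
    heads={"next_sentence": first_token_head},
)
print(sorted(outputs))
```

Here the masked LM entry covers only the two masked positions, while each head sees the full sequence output.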

checkpoint_items Returns a dictionary of items to be additionally checkpointed.




Calls the model on new inputs and returns the outputs as tensors.

In this case call() just reapplies all ops in the graph to the new inputs (that is, it builds a new computational graph from the provided inputs).

inputs Input tensor, or dict/list/tuple of input tensors.
training Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask A mask or list of masks. A mask can be either a boolean tensor or None (no mask). For more details, see the Keras guide on masking and padding.

A tensor if there is a single output, or a list of tensors if there is more than one output.