Classification network head for BERT modeling.

This network implements a simple classifier head based on a single dense layer. If num_classes is 1, the head produces one value per example and the task is treated as a regression problem.

input_width The innermost dimension of the input tensor to this network.
num_classes The number of classes that this network should classify to. If equal to 1, a regression problem is assumed.
activation The activation, if any, for the dense layer in this network.
initializer The initializer for the dense layer in this network. Defaults to a Glorot uniform initializer.
output The output style for this network. Can be either 'logits' or 'predictions'.
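
As a rough illustration of what the arguments above control, the following framework-free sketch applies one dense layer to a batch of vectors of length input_width and returns either raw logits or normalized predictions. It is not the library's implementation: the helper names (glorot_uniform, softmax, classification_head) are hypothetical, the optional activation is omitted, the use of a plain softmax for 'predictions' is an assumption, and so is returning the raw value when num_classes is 1.

```python
import math
import random


def glorot_uniform(fan_in, fan_out, rng):
    """Sample a fan_in x fan_out kernel from U(-limit, limit)."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    shift = max(row)
    exps = [math.exp(v - shift) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def classification_head(inputs, kernel, bias, output="logits"):
    """Apply one dense layer to a batch of vectors of length input_width."""
    if output not in ("logits", "predictions"):
        raise ValueError(f"Unknown output style: {output!r}")
    num_classes = len(bias)
    logits = [
        [sum(x * kernel[i][j] for i, x in enumerate(row)) + bias[j]
         for j in range(num_classes)]
        for row in inputs
    ]
    # Assumption: with a single output unit (regression) there is nothing
    # to normalize, so the raw value is returned in both output styles.
    if output == "logits" or num_classes == 1:
        return logits
    return [softmax(row) for row in logits]


rng = random.Random(0)
input_width, num_classes = 4, 3
kernel = glorot_uniform(input_width, num_classes, rng)
bias = [0.0] * num_classes
batch = [[0.1, -0.2, 0.3, 0.4], [1.0, 0.0, -1.0, 0.5]]

logits = classification_head(batch, kernel, bias, output="logits")
predictions = classification_head(batch, kernel, bias, output="predictions")

# Regression case: one output unit per example.
reg_kernel = glorot_uniform(input_width, 1, rng)
reg_values = classification_head(batch, reg_kernel, [0.0], output="predictions")
```

Each row of logits has num_classes entries, and each row of predictions sums to 1 under the softmax assumption. In the regression case the result holds a single value per example.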



Calls the model on new inputs and returns the outputs as tensors.

In this case call() simply reapplies all ops in the graph to the new inputs (i.e. it builds a new computational graph from the provided inputs).

inputs Input tensor, or dict/list/tuple of input tensors.
training Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask A mask or list of masks. A mask can be either a boolean tensor or None (no mask). For more details, see the Keras guide on masking and padding.

A tensor if there is a single output, or a list of tensors if there is more than one output.
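
The calling convention described above can be sketched without any framework. In the toy class below, the same fixed ops are reapplied to whatever inputs are passed in, and the result is a single value when there is one output or a list when there are several. TinyNetwork, double, and total are hypothetical names for illustration only. The training and mask arguments are accepted to mirror the documented signature but are ignored, because this sketch has no mode-dependent or mask-aware ops.

```python
class TinyNetwork:
    """Hypothetical stand-in for a network whose graph is a fixed set of ops."""

    def __init__(self, output_fns):
        # One function per network output; each maps the inputs to one output.
        self.output_fns = list(output_fns)

    def call(self, inputs, training=False, mask=None):
        # Reapply every stored op to the new inputs.
        outputs = [fn(inputs) for fn in self.output_fns]
        # Single output: return it directly. Several outputs: return a list.
        return outputs[0] if len(outputs) == 1 else outputs


def double(values):
    return [2 * v for v in values]


def total(values):
    return sum(values)


single = TinyNetwork([double])
multi = TinyNetwork([double, total])

first = single.call([1, 2, 3])
second = single.call([10, 20])  # same ops, new inputs
both = multi.call([1, 2, 3])
```

Here first and second come from the same ops applied to different inputs, while both is a two-element list because multi has two outputs.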