tfa.losses.SigmoidFocalCrossEntropy

Implements the focal loss function.

Focal loss was first introduced in the RetinaNet paper (https://arxiv.org/pdf/1708.02002.pdf). It is useful for classification with highly imbalanced classes: it down-weights well-classified examples and focuses on hard ones, so a misclassified sample yields a much higher loss than a well-classified one. A common use case is object detection, where the imbalance between the background class and the other classes is extremely high.
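
The weighting described above can be made concrete with a small NumPy sketch. This is not the library's implementation: it assumes y_pred already holds probabilities rather than logits, uses the default alpha = 0.25 and gamma = 2.0, and sums over the last axis. The helper name focal_loss_sketch is hypothetical.

```python
import numpy as np


def focal_loss_sketch(y_true, y_pred, alpha=0.25, gamma=2.0):
    """Hypothetical helper: per-sample sigmoid focal loss on probabilities."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)

    # Clip predictions so the logarithms stay finite (epsilon is an assumption).
    eps = 1e-7
    y_pred = np.clip(y_pred, eps, 1.0 - eps)

    # Ordinary binary cross-entropy, element by element.
    ce = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))

    # p_t is the probability the model assigned to the true class.
    p_t = y_true * y_pred + (1.0 - y_true) * (1.0 - y_pred)

    # alpha weights positives, (1 - alpha) weights negatives.
    alpha_t = y_true * alpha + (1.0 - y_true) * (1.0 - alpha)

    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    modulating = (1.0 - p_t) ** gamma

    return np.sum(alpha_t * modulating * ce, axis=-1)


losses = focal_loss_sketch([[1.0], [1.0], [0.0]], [[0.97], [0.91], [0.03]])
print(losses)
```

For these inputs the sketch agrees, to within floating-point tolerance, with the values shown in the Usage example on this page.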

Usage:

import tensorflow_addons as tfa

fl = tfa.losses.SigmoidFocalCrossEntropy()
loss = fl(
    y_true=[[1.0], [1.0], [0.0]],
    y_pred=[[0.97], [0.91], [0.03]])
loss
<tf.Tensor: shape=(3,), dtype=float32, numpy=array([6.8532745e-06, 1.9097870e-04, 2.0559824e-05],
dtype=float32)>

Usage with tf.keras API:

model = tf.keras.Model()
model.compile('sgd', loss=tfa.losses.SigmoidFocalCrossEntropy())

Args
alpha balancing factor, default value is 0.25.
gamma modulating factor, default value is 2.0.

Returns
Weighted loss float Tensor. If reduction is NONE, this has the same shape as y_true; otherwise, it is scalar.

Raises
ValueError If the shape of sample_weight is invalid or value of gamma is less than zero.
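
To illustrate what the gamma modulating factor does, the sketch below tabulates the term (1 - p_t) ** gamma for an easy, an ambiguous, and a hard example. It assumes the formulation from the RetinaNet paper, where p_t is the probability assigned to the true class; the p_t values are made up for illustration.

```python
import numpy as np

# Made-up probabilities assigned to the true class: easy, ambiguous, hard.
p_t = np.array([0.95, 0.5, 0.1])

# Print the modulating term for a few gamma values, including the default 2.0.
for gamma in (0.0, 1.0, 2.0, 5.0):
    modulating = (1.0 - p_t) ** gamma
    print(f"gamma={gamma}: {modulating}")

# Modulating term at the default gamma.
weights_default = (1.0 - p_t) ** 2.0
```

With gamma = 0 the term equals 1 for every example, so only the alpha weighting remains. As gamma grows, the contribution of easy examples shrinks much faster than that of hard ones.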

Methods

from_config

Instantiates a Loss from its config (output of get_config()).

Args
config Output of get_config().

Returns
A Loss instance.

get_config

Returns the config dictionary for a Loss instance.

__call__

Invokes the Loss instance.

Args
y_true Ground truth values. shape = [batch_size, d0, .. dN], except for sparse loss functions such as sparse categorical crossentropy, where shape = [batch_size, d0, .. dN-1].
y_pred The predicted values. shape = [batch_size, d0, .. dN]
sample_weight Optional. Acts as a coefficient for the loss. If a scalar is provided, the loss is simply scaled by the given value. If sample_weight is a tensor of size [batch_size], the total loss for each sample of the batch is rescaled by the corresponding element in the sample_weight vector. If the shape of sample_weight is [batch_size, d0, .. dN-1] (or can be broadcast to this shape), each loss element of y_pred is scaled by the corresponding value of sample_weight. (Note on dN-1: all loss functions reduce by 1 dimension, usually axis=-1.)

Returns
Weighted loss float Tensor. If reduction is NONE, this has shape [batch_size, d0, .. dN-1]; otherwise, it is scalar. (Note dN-1 because all loss functions reduce by 1 dimension, usually axis=-1.)

Raises
ValueError If the shape of sample_weight is invalid.
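
The sample_weight scaling described in the Args above can be sketched in NumPy as follows. This only mirrors the scalar and [batch_size] cases from the description; it is not the library's code, it does not model the reduction step, and the helper name apply_sample_weight and the loss values are hypothetical.

```python
import numpy as np


def apply_sample_weight(per_sample_loss, sample_weight=None):
    """Hypothetical helper: scale unreduced losses by a scalar or per-sample weights."""
    per_sample_loss = np.asarray(per_sample_loss, dtype=np.float64)
    if sample_weight is None:
        return per_sample_loss
    # Broadcasting handles both a scalar and a [batch_size] vector.
    return per_sample_loss * np.asarray(sample_weight, dtype=np.float64)


# Made-up unreduced losses for a batch of three samples.
per_sample = np.array([0.2, 0.4, 0.6])

scaled_scalar = apply_sample_weight(per_sample, 2.0)              # every loss doubled
scaled_vector = apply_sample_weight(per_sample, [1.0, 0.0, 0.5])  # second sample masked out
```

A weight of 0 removes a sample's contribution entirely, which is one way to mask padded or ignored entries before any reduction is applied.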