Base class for all TF1 loss scales.

This is an abstract base class, so you cannot instantiate it directly. Instead, use one of its concrete subclasses, such as FixedLossScale or DynamicLossScale.

Loss scaling is a process that multiplies the loss by a multiplier called the loss scale, and divides each gradient by the same multiplier. The pseudocode for this process is:

loss = ...
loss *= loss_scale
grads = gradients(loss, vars)
grads /= loss_scale

Mathematically, loss scaling has no effect, but it can help avoid numerical underflow in intermediate gradients when float16 tensors are used for mixed precision training. Because the loss is multiplied before the gradients are computed, every intermediate gradient is multiplied by the same loss scale.
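The underflow that loss scaling avoids can be shown without TensorFlow. The following is a minimal sketch that uses Python's struct module to round values to float16; the helper to_float16, the gradient value 1e-8, and the loss scale 1024 are illustrative assumptions, not part of the LossScale API.

```python
import struct


def to_float16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Hypothetical values chosen for illustration only.
tiny_grad = 1e-8     # smaller than float16's smallest positive value (about 5.96e-8)
loss_scale = 1024.0

# Without scaling, the gradient rounds to zero when stored as float16.
unscaled = to_float16(tiny_grad)

# With scaling, the gradient is large enough to be represented in float16.
scaled = to_float16(tiny_grad * loss_scale)

# Dividing by the loss scale in higher precision recovers roughly the original value.
recovered = scaled / loss_scale

print(unscaled, scaled, recovered)
```

In this sketch the unscaled gradient is lost entirely, while the scaled one keeps its information until it is divided by the loss scale in a wider format.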

Instances of this class represent a loss scale. Calling instances of this class returns the loss scale as a scalar float32 tensor, while method update() updates the loss scale depending on the values of the gradients. Optimizers use instances of this class to scale loss and gradients.
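The call-and-update interface described above can be sketched in plain Python. This is a hypothetical stand-in, not TensorFlow's implementation: the class name ToyDynamicLossScale, its parameters, the halve-on-overflow and grow-after-N-finite-steps policy, and the boolean return value of update() are all assumptions made for illustration, and it returns plain floats rather than float32 tensors.

```python
import math


class ToyDynamicLossScale:
    """Hypothetical stand-in for a dynamic loss scale (illustration only)."""

    def __init__(self, initial=2.0 ** 15, growth_steps=3, multiplier=2.0):
        self.scale = float(initial)
        self.growth_steps = growth_steps  # finite steps required before growing
        self.multiplier = multiplier
        self.good_steps = 0

    def __call__(self):
        # Calling the instance returns the current loss scale.
        return self.scale

    def update(self, grads):
        """Adjust the scale from the gradients; return True if they are usable."""
        if all(math.isfinite(g) for g in grads):
            self.good_steps += 1
            if self.good_steps >= self.growth_steps:
                # Enough consecutive finite steps: try a larger scale.
                self.scale *= self.multiplier
                self.good_steps = 0
            return True
        # A non-finite gradient suggests overflow: shrink the scale, never below 1.
        self.scale = max(1.0, self.scale / self.multiplier)
        self.good_steps = 0
        return False


ls = ToyDynamicLossScale(initial=8.0, growth_steps=2)
history = [ls()]
for grads in ([0.1, float("inf")], [0.1], [0.2]):
    ls.update(grads)
    history.append(ls())
print(history)
```

An optimizer would multiply the loss by ls(), compute gradients, divide them by ls(), and then call update() to decide whether the step should be applied.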

In most functions that accept a LossScale, you can also pass an int (such as 8) to create a FixedLossScale or the string "dynamic" to create a dynamic loss scale.




Creates the LossScale from its config.

