Adds a KL-divergence to the training procedure.
nsl.lib.kl_divergence( labels, predictions, axis=None, weights=1.0, scope=None, loss_collection=tf.compat.v1.GraphKeys.LOSSES, reduction=tf.compat.v1.losses.Reduction.SUM_BY_NONZERO_WEIGHTS )
For brevity, let P = labels and Q = predictions. The per-element losses are computed as

  losses = P * log(P) - P * log(Q)
Note, the function assumes that labels and predictions are the values of a multinomial distribution, i.e., each value is the probability of the corresponding class.
For the usage of reduction, please refer to tf.losses.
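As a quick illustration of how the pieces above fit together, here is a minimal usage sketch. It assumes TF2 eager execution, that the neural_structured_learning package is imported as nsl, and that the last dimension holds the class probabilities; the tensor values are made up for illustration.

```python
import tensorflow as tf
import neural_structured_learning as nsl

# Two batches of class probabilities; each row sums to 1 along axis 1.
labels = tf.constant([[0.2, 0.3, 0.5],
                      [0.1, 0.6, 0.3]])        # P: target distribution
predictions = tf.constant([[0.3, 0.3, 0.4],
                           [0.2, 0.5, 0.3]])   # Q: predicted distribution

# KL divergence computed along axis 1, reduced with the default
# SUM_BY_NONZERO_WEIGHTS reduction into a single loss value.
loss = nsl.lib.kl_divergence(labels, predictions, axis=1)

# The per-element quantity spelled out from the formula above; the function
# additionally applies `weights` and `reduction` as in tf.losses.
per_element = labels * tf.math.log(labels) - labels * tf.math.log(predictions)
```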
Args:
  labels: Tensor of type float32 or float64, with shape [d1, ..., dN, num_classes], representing the target distribution.
  predictions: Tensor of the same type and shape as labels, representing the predicted distribution.
  axis: The dimension along which the KL divergence is computed. Note, the values along axis should meet the condition of a multinomial distribution.
  weights: (optional) Tensor whose rank is either 0, or the same rank as labels, and must be broadcastable to labels (i.e., all dimensions must be either 1, or the same as the corresponding dimension).
  scope: The scope for the operations performed in computing the loss.
  loss_collection: The collection to which the loss will be added.
  reduction: Type of reduction to apply to the loss.
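The sketch below illustrates the weights argument described above. It assumes, per the shape note in the Returns section, that the unreduced losses have the same shape as labels, so a weight tensor with the same rank and broadcastable dimensions (here [2, 1]) scales each example; the names and values are illustrative only.

```python
import tensorflow as tf
import neural_structured_learning as nsl

labels = tf.constant([[0.2, 0.3, 0.5],
                      [0.1, 0.6, 0.3]])
predictions = tf.constant([[0.3, 0.3, 0.4],
                           [0.2, 0.5, 0.3]])

# Per-example weights: same rank as `labels`, broadcastable to its shape.
# The second example contributes half as much to the reduced loss.
example_weights = tf.constant([[1.0],
                               [0.5]])

weighted_loss = nsl.lib.kl_divergence(
    labels, predictions, axis=1, weights=example_weights)
```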
Returns:
  Weighted loss float Tensor. If reduction is NONE, this has the same shape as labels; otherwise, it is a scalar.
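To make the shape behavior concrete, here is a sketch contrasting the default reduction with Reduction.NONE; it uses the same illustrative tensors as the earlier examples.

```python
import tensorflow as tf
import neural_structured_learning as nsl

labels = tf.constant([[0.2, 0.3, 0.5],
                      [0.1, 0.6, 0.3]])
predictions = tf.constant([[0.3, 0.3, 0.4],
                           [0.2, 0.5, 0.3]])

# Default reduction (SUM_BY_NONZERO_WEIGHTS): a scalar loss.
scalar_loss = nsl.lib.kl_divergence(labels, predictions, axis=1)

# Reduction.NONE: unreduced losses with the same shape as `labels`.
unreduced = nsl.lib.kl_divergence(
    labels, predictions, axis=1,
    reduction=tf.compat.v1.losses.Reduction.NONE)
```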
Raises:
  An error if labels or predictions doesn't meet the condition of a multinomial distribution, if axis is None, if the shape of predictions doesn't match that of labels, or if the shape of weights is invalid.
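As a hedged illustration of the error conditions above, the call below passes predictions that do not sum to 1 along axis, which is expected to fail the multinomial check; the exact exception class raised is not shown here.

```python
import tensorflow as tf
import neural_structured_learning as nsl

labels = tf.constant([[0.2, 0.3, 0.5]])
# Rows do not sum to 1, violating the multinomial condition on `predictions`.
bad_predictions = tf.constant([[0.9, 0.9, 0.9]])

try:
    nsl.lib.kl_divergence(labels, bad_predictions, axis=1)
except Exception as err:  # exact exception class depends on the check that fires
    print(f"kl_divergence rejected the inputs: {err}")
```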