SigmoidCrossEntropyWithLogits

public class SigmoidCrossEntropyWithLogits

Public Constructors

Public Methods

static <T extends TNumber> Operand<T>
sigmoidCrossEntropyWithLogits(Scope scope, Operand<T> labels, Operand<T> logits)
Computes sigmoid cross entropy given logits.

Inherited Methods

Public Constructors

public SigmoidCrossEntropyWithLogits()

Public Methods

public static <T extends TNumber> Operand<T> sigmoidCrossEntropyWithLogits(Scope scope, Operand<T> labels, Operand<T> logits)

Computes sigmoid cross entropy given logits.

Measures the probability error in discrete classification tasks in which each class is independent and not mutually exclusive. For instance, one could perform multilabel classification where a picture can contain both an elephant and a dog at the same time.

For brevity, let x = logits and z = labels. The logistic loss, in pseudo-code, is

 z * -log(sigmoid(x)) + (1 - z) * -log(1 - sigmoid(x))
  = z * -log(1 / (1 + exp(-x))) + (1 - z) * -log(exp(-x) / (1 + exp(-x)))
  = z * log(1 + exp(-x)) + (1 - z) * (-log(exp(-x)) + log(1 + exp(-x)))
  = z * log(1 + exp(-x)) + (1 - z) * (x + log(1 + exp(-x)))
  = (1 - z) * x + log(1 + exp(-x))
  = x - x * z + log(1 + exp(-x))
 
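The algebra above can be checked numerically. The following sketch is plain Java, independent of the TensorFlow API (class and method names here are illustrative only): it evaluates both the original logistic loss and the simplified final line for a moderate x where exp(-x) does not overflow, and the two agree.

```java
public class LogisticLossCheck {

    // Original logistic loss: z * -log(sigmoid(x)) + (1 - z) * -log(1 - sigmoid(x))
    static double original(double x, double z) {
        double s = 1.0 / (1.0 + Math.exp(-x)); // sigmoid(x)
        return z * -Math.log(s) + (1 - z) * -Math.log(1 - s);
    }

    // Simplified form from the last line of the derivation:
    // x - x * z + log(1 + exp(-x))
    static double simplified(double x, double z) {
        return x - x * z + Math.log(1 + Math.exp(-x));
    }

    public static void main(String[] args) {
        // For moderate logits the two forms agree to floating-point precision.
        System.out.println(original(2.5, 1.0));
        System.out.println(simplified(2.5, 1.0));
    }
}
```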

For x < 0, to avoid overflow in exp(-x), we reformulate the above

 x - x * z + log(1 + exp(-x))
  = log(exp(x)) - x * z + log(1 + exp(-x))
  = - x * z + log(exp(x) * (1 + exp(-x)))
  = - x * z + log(1 + exp(x))
 
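Why this matters is easy to demonstrate: for a large negative logit, exp(-x) overflows to infinity in the first form, while in the reformulated form exp(x) merely underflows to zero. A plain-Java sketch (names are illustrative, not part of the TensorFlow API):

```java
public class OverflowDemo {

    // Form derived for general x: exp(-x) overflows for x << 0.
    static double unstable(double x, double z) {
        return x - x * z + Math.log(1 + Math.exp(-x));
    }

    // Reformulation for x < 0: exp(x) underflows harmlessly to 0 instead.
    static double stableNegative(double x, double z) {
        return -x * z + Math.log(1 + Math.exp(x));
    }

    public static void main(String[] args) {
        double x = -800.0, z = 1.0;
        System.out.println(unstable(x, z));       // Infinity: exp(800) overflows
        System.out.println(stableNegative(x, z)); // 800.0: the correct loss
    }
}
```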

Hence, to ensure stability and avoid overflow, the implementation uses this equivalent formulation:

   max(x, 0) - x * z + log(1 + exp(-abs(x)))
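This single branch-free expression covers both signs of x. As a minimal sketch in plain Java (illustrative names, not the TensorFlow implementation), it stays finite for extreme logits in either direction:

```java
public class StableSigmoidCrossEntropy {

    // Stable, branch-free formulation used for all x:
    // max(x, 0) - x * z + log(1 + exp(-|x|))
    static double loss(double x, double z) {
        return Math.max(x, 0) - x * z + Math.log(1 + Math.exp(-Math.abs(x)));
    }

    public static void main(String[] args) {
        // Finite for extreme logits of either sign; exp(-|x|) can only underflow.
        System.out.println(loss(800.0, 0.0));  // 800.0
        System.out.println(loss(-800.0, 1.0)); // 800.0
        System.out.println(loss(0.0, 1.0));    // log(2), about 0.6931
    }
}
```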