Sparsemax activation function [1].

For each batch i and class j we have

$$\mathrm{sparsemax}[i, j] = \max\bigl(\mathrm{logits}[i, j] - \tau(\mathrm{logits}[i, :]),\ 0\bigr)$$
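
The formula leaves the threshold `tau` undefined. The sketch below assumes the usual sort-based closed form from the sparsemax literature: sort the row in descending order, find the largest `k` for which `1 + k * z_(k)` exceeds the sum of the top `k` entries, and set `tau = (sum of top k - 1) / k`. The helper names `sparsemax_row` and `sparsemax_2d` are hypothetical, and the code uses plain Python lists on a 2-D batch rather than the library's tensor implementation, so it only illustrates the arithmetic.

```python
def sparsemax_row(z):
    """Sparsemax of a single row of logits (hypothetical helper, plain lists)."""
    z_sorted = sorted(z, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, z_k in enumerate(z_sorted, start=1):
        cumsum += z_k
        # Assumed support condition: entry k stays in the support while
        # 1 + k * z_(k) > sum of the k largest entries.
        if 1.0 + k * z_k > cumsum:
            # Keep the threshold for the largest k that satisfies the condition.
            tau = (cumsum - 1.0) / k
    # Shift by tau and clip at zero, as in the formula above.
    return [max(v - tau, 0.0) for v in z]


def sparsemax_2d(logits):
    """Apply sparsemax independently to each row i of a [batch, classes] list."""
    return [sparsemax_row(row) for row in logits]


print(sparsemax_2d([[1.0, 2.0, 3.0], [0.5, 0.5]]))
# → [[0.0, 0.0, 1.0], [0.5, 0.5]]
```

Each output row is non-negative and sums to 1. Unlike softmax, entries far enough below the largest logit come out as exact zeros.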

logits: Input tensor.
axis: Integer, the axis along which the sparsemax operation is applied.

Tensor, output of sparsemax transformation. Has the same type and shape as logits.

ValueError: Raised if logits is one-dimensional, i.e. dim(logits) == 1.