Normalization

Normalization helps prevent neurons from saturating when inputs have varying scales, and aids generalization.

tf.nn.l2_normalize(x, dim, epsilon=1e-12, name=None)

Normalizes along dimension dim using an L2 norm.

For a 1-D tensor with dim = 0, computes

output = x / sqrt(max(sum(x**2), epsilon))

For x with more dimensions, independently normalizes each 1-D slice along dimension dim.

Args:
  • x: A Tensor.
  • dim: Dimension along which to normalize.
  • epsilon: A lower bound value for the norm. Will use sqrt(epsilon) as the divisor if norm < sqrt(epsilon).
  • name: A name for this operation (optional).
Returns:

A Tensor with the same shape as x.
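The computation above is easy to check outside of TensorFlow. The following is a minimal NumPy sketch of the same math (not the TensorFlow implementation itself): square, sum along the chosen dimension, clamp the sum by epsilon, and divide.

```python
import numpy as np

def l2_normalize(x, dim, epsilon=1e-12):
    # Sum of squares along `dim`, kept so it broadcasts back against x.
    sq_sum = np.sum(np.square(x), axis=dim, keepdims=True)
    # Clamp by epsilon before the square root, matching the documented formula.
    return x / np.sqrt(np.maximum(sq_sum, epsilon))

v = np.array([3.0, 4.0])
print(l2_normalize(v, 0))  # → [0.6 0.8], a unit vector
```

For a 2-D input, each row (with dim=1) or column (with dim=0) comes out with unit L2 norm, illustrating the "independently normalizes each 1-D slice" behavior.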


tf.nn.local_response_normalization(input, depth_radius=None, bias=None, alpha=None, beta=None, name=None)

Local Response Normalization.

The 4-D input tensor is treated as a 3-D array of 1-D vectors (along the last dimension), and each vector is normalized independently. Within a given vector, each component is divided by the biased, weighted sum of squared inputs within depth_radius, raised to the power beta. In detail,

sqr_sum[a, b, c, d] =
    sum(input[a, b, c, d - depth_radius : d + depth_radius + 1] ** 2)
output = input / (bias + alpha * sqr_sum) ** beta

For details, see [Krizhevsky et al., ImageNet classification with deep convolutional neural networks (NIPS 2012)](http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks).

Args:
  • input: A Tensor. Must be one of the following types: float32, half. 4-D.
  • depth_radius: An optional int. Defaults to 5. 0-D. Half-width of the 1-D normalization window.
  • bias: An optional float. Defaults to 1. An offset (usually positive to avoid dividing by 0).
  • alpha: An optional float. Defaults to 1. A scale factor, usually positive.
  • beta: An optional float. Defaults to 0.5. An exponent.
  • name: A name for the operation (optional).
Returns:

A Tensor. Has the same type as input.
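To make the window arithmetic concrete, here is a NumPy sketch of the formula above (a simple loop over the depth dimension, not the optimized TensorFlow kernel; the window is assumed to be clipped at the depth boundaries):

```python
import numpy as np

def lrn(x, depth_radius=5, bias=1.0, alpha=1.0, beta=0.5):
    """x: 4-D array [batch, height, width, depth]; normalize along depth."""
    depth = x.shape[-1]
    sqr = np.square(x)
    out = np.empty_like(x)
    for d in range(depth):
        # Window [d - depth_radius, d + depth_radius], clipped to valid indices.
        lo, hi = max(0, d - depth_radius), min(depth, d + depth_radius + 1)
        sqr_sum = sqr[..., lo:hi].sum(axis=-1)
        out[..., d] = x[..., d] / (bias + alpha * sqr_sum) ** beta
    return out

x = np.ones((1, 1, 1, 4))
print(lrn(x))  # each entry = 1 / sqrt(1 + 4) ≈ 0.4472
```

With the default depth_radius of 5 and only 4 channels, every window covers the whole vector, so each component of an all-ones input is divided by sqrt(1 + 4).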


tf.nn.sufficient_statistics(x, axes, shift=None, keep_dims=False, name=None)

Calculate the sufficient statistics for the mean and variance of x.

These sufficient statistics are computed using a one-pass algorithm on an input that's optionally shifted. See: https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Computing_shifted_data

Args:
  • x: A Tensor.
  • axes: Array of ints. Axes along which to compute mean and variance.
  • shift: A Tensor containing the value by which to shift the data for numerical stability, or None if no shift is to be performed. A shift close to the true mean provides the most numerically stable results.
  • keep_dims: produce statistics with the same dimensionality as the input.
  • name: Name used to scope the operations that compute the sufficient stats.
Returns:

Four Tensor objects of the same type as x:
  • the count (number of elements to average over).
  • the (possibly shifted) sum of the elements in the array.
  • the (possibly shifted) sum of squares of the elements in the array.
  • the shift by which the mean must be corrected, or None if shift is None.
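A NumPy sketch of these four return values (an illustration of the one-pass statistics, not the TensorFlow implementation):

```python
import numpy as np

def sufficient_statistics(x, axes, shift=None, keep_dims=False):
    # Number of elements reduced over `axes`.
    counts = int(np.prod([x.shape[a] for a in axes]))
    # Optionally shift the data for numerical stability.
    y = x if shift is None else x - shift
    ax = tuple(axes)
    mean_ss = y.sum(axis=ax, keepdims=keep_dims)      # (shifted) sum
    var_ss = np.square(y).sum(axis=ax, keepdims=keep_dims)  # (shifted) sum of squares
    return counts, mean_ss, var_ss, shift

print(sufficient_statistics(np.array([1.0, 2.0, 3.0, 4.0]), [0]))
# → (4, 10.0, 30.0, None)
```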


tf.nn.normalize_moments(counts, mean_ss, variance_ss, shift, name=None)

Calculate the mean and variance based on the sufficient statistics.

Args:
  • counts: A Tensor containing the total count of the data (one value).
  • mean_ss: A Tensor containing the mean sufficient statistics: the (possibly shifted) sum of the elements to average over.
  • variance_ss: A Tensor containing the variance sufficient statistics: the (possibly shifted) squared sum of the data to compute the variance over.
  • shift: A Tensor containing the value by which the data is shifted for numerical stability, or None if no shift was performed.
  • name: Name used to scope the operations that compute the moments.
Returns:

Two Tensor objects: mean and variance.
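The correction for the shift can be sketched in NumPy as follows (assumed behavior based on the shifted-data variance formula; the shift cancels out of the variance, so only the mean needs correcting):

```python
import numpy as np

def normalize_moments(counts, mean_ss, variance_ss, shift):
    divisor = 1.0 / counts
    # Mean of the shifted data; add the shift back to recover the true mean.
    shifted_mean = mean_ss * divisor
    mean = shifted_mean if shift is None else shifted_mean + shift
    # Var(x) = E[y^2] - E[y]^2 where y = x - shift; variance is shift-invariant.
    variance = variance_ss * divisor - np.square(shifted_mean)
    return mean, variance

print(normalize_moments(4, 10.0, 30.0, None))  # → (2.5, 1.25)
```

Feeding in the statistics of [1, 2, 3, 4] with or without a shift yields the same moments, which is the point of the shifted one-pass algorithm.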


tf.nn.moments(x, axes, shift=None, name=None, keep_dims=False)

Calculate the mean and variance of x.

The mean and variance are calculated by aggregating the contents of x across axes. If x is 1-D and axes = [0] this is just the mean and variance of a vector.

When using these moments for batch normalization (see tf.nn.batch_normalization):
  • for so-called "global normalization", used with convolutional filters with shape [batch, height, width, depth], pass axes=[0, 1, 2].
  • for simple batch normalization pass axes=[0] (batch only).

Args:
  • x: A Tensor.
  • axes: array of ints. Axes along which to compute mean and variance.
  • shift: A Tensor containing the value by which to shift the data for numerical stability, or None if no shift is to be performed. A shift close to the true mean provides the most numerically stable results.
  • name: Name used to scope the operations that compute the moments.
  • keep_dims: produce moments with the same dimensionality as the input.
Returns:

Two Tensor objects: mean and variance.
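Conceptually, this op composes the two previous ones: compute sufficient statistics, then normalize them. A NumPy sketch of that composition (not the TensorFlow kernel):

```python
import numpy as np

def moments(x, axes, shift=None, keep_dims=False):
    # Count of elements reduced over `axes`.
    counts = int(np.prod([x.shape[a] for a in axes]))
    y = x if shift is None else x - shift
    ax = tuple(axes)
    # Sufficient statistics, then normalization, as in the two ops above.
    shifted_mean = y.sum(axis=ax, keepdims=keep_dims) / counts
    mean = shifted_mean if shift is None else shifted_mean + shift
    variance = np.square(y).sum(axis=ax, keepdims=keep_dims) / counts \
               - np.square(shifted_mean)
    return mean, variance

x = np.arange(12.0).reshape(3, 4)
mean, var = moments(x, [0])  # per-column mean and variance over the batch axis
```

Note the variance here is the population variance (divided by counts, not counts - 1), matching the batch-normalization use case.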