Normalization is useful to prevent neurons from saturating when inputs may have varying scale, and to aid generalization.

`tf.nn.l2_normalize(x, dim, epsilon=1e-12, name=None)`

Normalizes along dimension `dim` using an L2 norm.

For a 1-D tensor with `dim = 0`, computes

```
output = x / sqrt(max(sum(x**2), epsilon))
```

For `x` with more dimensions, independently normalizes each 1-D slice along dimension `dim`.

##### Args:

- `x`: A `Tensor`.
- `dim`: Dimension along which to normalize. A scalar or a vector of integers.
- `epsilon`: A lower bound value for the norm. Will use `sqrt(epsilon)` as the divisor if `norm < sqrt(epsilon)`.
- `name`: A name for this operation (optional).

##### Returns:

A `Tensor` with the same shape as `x`.
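The formula above can be mirrored in plain NumPy for reference — a hypothetical `l2_normalize` helper sketching the documented behavior, not the TensorFlow implementation:

```python
import numpy as np

def l2_normalize(x, dim=0, epsilon=1e-12):
    # Divide each 1-D slice along `dim` by sqrt(max(sum(x**2), epsilon)).
    sq_sum = np.sum(np.square(x), axis=dim, keepdims=True)
    return x / np.sqrt(np.maximum(sq_sum, epsilon))

v = np.array([3.0, 4.0])
print(l2_normalize(v))  # [0.6 0.8] -- a unit vector in the direction of v
```

The `epsilon` floor keeps the divisor well-defined when `x` is (close to) all zeros.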

`tf.nn.local_response_normalization(input, depth_radius=None, bias=None, alpha=None, beta=None, name=None)`

Local Response Normalization.

The 4-D `input` tensor is treated as a 3-D array of 1-D vectors (along the last dimension), and each vector is normalized independently. Within a given vector, each component is divided by the weighted, squared sum of inputs within `depth_radius`. In detail,

```
sqr_sum[a, b, c, d] =
sum(input[a, b, c, d - depth_radius : d + depth_radius + 1] ** 2)
output = input / (bias + alpha * sqr_sum) ** beta
```

For details, see Krizhevsky et al., ImageNet classification with deep convolutional neural networks (NIPS 2012).

##### Args:

- `input`: A `Tensor`. Must be one of the following types: `float32`, `half`. 4-D.
- `depth_radius`: An optional `int`. Defaults to `5`. 0-D. Half-width of the 1-D normalization window.
- `bias`: An optional `float`. Defaults to `1`. An offset (usually positive to avoid dividing by 0).
- `alpha`: An optional `float`. Defaults to `1`. A scale factor, usually positive.
- `beta`: An optional `float`. Defaults to `0.5`. An exponent.
- `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `input`.
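The formula can be sketched in NumPy as follows (a hypothetical `local_response_normalization` helper over a `[batch, height, width, depth]` array; the squared-sum window is clipped to the valid depth range, which is assumed to match the op's edge handling):

```python
import numpy as np

def local_response_normalization(x, depth_radius=5, bias=1.0, alpha=1.0, beta=0.5):
    # Each depth vector is normalized independently; the window around
    # component d is clipped to the valid range [0, depth).
    depth = x.shape[-1]
    out = np.empty_like(x)
    for d in range(depth):
        lo, hi = max(0, d - depth_radius), min(depth, d + depth_radius + 1)
        sqr_sum = np.sum(x[..., lo:hi] ** 2, axis=-1)
        out[..., d] = x[..., d] / (bias + alpha * sqr_sum) ** beta
    return out

x = np.ones((1, 2, 2, 8), dtype=np.float32)
y = local_response_normalization(x, depth_radius=2)
print(y.shape)  # same shape as the input: (1, 2, 2, 8)
```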

`tf.nn.sufficient_statistics(x, axes, shift=None, keep_dims=False, name=None)`

Calculate the sufficient statistics for the mean and variance of `x`.

These sufficient statistics are computed using the one pass algorithm on an input that's optionally shifted. See: https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Computing_shifted_data

##### Args:

- `x`: A `Tensor`.
- `axes`: Array of ints. Axes along which to compute mean and variance.
- `shift`: A `Tensor` containing the value by which to shift the data for numerical stability, or `None` if no shift is to be performed. A shift close to the true mean provides the most numerically stable results.
- `keep_dims`: Produce statistics with the same dimensionality as the input.
- `name`: Name used to scope the operations that compute the sufficient stats.

##### Returns:

Four `Tensor` objects of the same type as `x`:

- the count (number of elements to average over).
- the (possibly shifted) sum of the elements in the array.
- the (possibly shifted) sum of squares of the elements in the array.
- the shift by which the mean must be corrected, or `None` if `shift` is `None`.
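A minimal NumPy sketch of the same one-pass statistics (a hypothetical helper, not the TensorFlow kernel):

```python
import numpy as np

def sufficient_statistics(x, axes, shift=None):
    # One pass over the data: count, (shifted) sum, and (shifted) sum of squares.
    count = int(np.prod([x.shape[a] for a in axes]))
    x_shifted = x - shift if shift is not None else x
    m_ss = np.sum(x_shifted, axis=tuple(axes))
    v_ss = np.sum(np.square(x_shifted), axis=tuple(axes))
    return count, m_ss, v_ss, shift

x = np.array([1.0, 2.0, 3.0, 4.0])
count, m_ss, v_ss, shift = sufficient_statistics(x, axes=[0], shift=2.5)
print(count, m_ss, v_ss)  # 4 0.0 5.0
```

Shifting by a value near the true mean (here 2.5) keeps the sums small, which is what makes the one-pass variance formula numerically stable.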

`tf.nn.normalize_moments(counts, mean_ss, variance_ss, shift, name=None)`

Calculate the mean and variance based on the sufficient statistics.

##### Args:

- `counts`: A `Tensor` containing the total count of the data (one value).
- `mean_ss`: A `Tensor` containing the mean sufficient statistics: the (possibly shifted) sum of the elements to average over.
- `variance_ss`: A `Tensor` containing the variance sufficient statistics: the (possibly shifted) squared sum of the data to compute the variance over.
- `shift`: A `Tensor` containing the value by which the data is shifted for numerical stability, or `None` if no shift was performed.
- `name`: Name used to scope the operations that compute the moments.

##### Returns:

Two `Tensor` objects: `mean` and `variance`.
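The recovery of moments from shifted sufficient statistics can be sketched in NumPy (a hypothetical helper; the inputs here are the statistics of `[1, 2, 3, 4]` shifted by 2.5):

```python
import numpy as np

def normalize_moments(counts, mean_ss, variance_ss, shift):
    # Recover mean and variance from (possibly shifted) sufficient statistics:
    # mean = mean_ss / counts + shift, var = variance_ss / counts - shifted_mean^2.
    shifted_mean = mean_ss / counts
    mean = shifted_mean + shift if shift is not None else shifted_mean
    variance = variance_ss / counts - shifted_mean ** 2
    return mean, variance

# Stats of [1, 2, 3, 4] with shift 2.5: count 4, shifted sum 0.0, shifted squared sum 5.0.
mean, variance = normalize_moments(4, 0.0, 5.0, 2.5)
print(mean, variance)  # 2.5 1.25
```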

`tf.nn.moments(x, axes, shift=None, name=None, keep_dims=False)`

Calculate the mean and variance of `x`.

The mean and variance are calculated by aggregating the contents of `x` across `axes`. If `x` is 1-D and `axes = [0]` this is just the mean and variance of a vector.

When using these moments for batch normalization (see `tf.nn.batch_normalization`):

- for so-called "global normalization", used with convolutional filters with shape `[batch, height, width, depth]`, pass `axes=[0, 1, 2]`.
- for simple batch normalization pass `axes=[0]` (batch only).

##### Args:

- `x`: A `Tensor`.
- `axes`: Array of ints. Axes along which to compute mean and variance.
- `shift`: A `Tensor` containing the value by which to shift the data for numerical stability, or `None` in which case the true mean of the data is used as shift. A shift close to the true mean provides the most numerically stable results.
- `name`: Name used to scope the operations that compute the moments.
- `keep_dims`: Produce moments with the same dimensionality as the input.

##### Returns:

Two `Tensor` objects: `mean` and `variance`.
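The aggregation over `axes` can be sketched in NumPy (a hypothetical helper that omits the op's `shift` machinery):

```python
import numpy as np

def moments(x, axes, keep_dims=False):
    # Population mean and variance aggregated over the given axes.
    mean = np.mean(x, axis=tuple(axes), keepdims=keep_dims)
    mean_sq = np.mean(np.square(x), axis=tuple(axes), keepdims=keep_dims)
    return mean, mean_sq - np.square(mean)

# "Global normalization" moments for a [batch, height, width, depth] input:
x = np.random.default_rng(0).normal(size=(2, 4, 4, 3))
mean, variance = moments(x, axes=[0, 1, 2])
print(mean.shape, variance.shape)  # one moment per depth channel: (3,) (3,)
```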

`tf.nn.weighted_moments(x, axes, frequency_weights, name=None, keep_dims=False)`

Returns the frequency-weighted mean and variance of `x`.

##### Args:

- `x`: A tensor.
- `axes`: 1-D tensor of int32 values; these are the axes along which to compute mean and variance.
- `frequency_weights`: A tensor of positive weights which can be broadcast with `x`.
- `name`: Name used to scope the operation.
- `keep_dims`: Produce moments with the same dimensionality as the input.

##### Returns:

Two tensors: `weighted_mean` and `weighted_variance`.
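In NumPy terms, frequency weights act like repeat counts for each element. A simplified sketch (hypothetical helper; always keeps the reduced dimensions, unlike the op's `keep_dims` switch):

```python
import numpy as np

def weighted_moments(x, axes, frequency_weights):
    # Weighted mean and variance; weights broadcast against x like repeat counts.
    w = np.broadcast_to(frequency_weights, x.shape)
    w_sum = np.sum(w, axis=tuple(axes), keepdims=True)
    w_mean = np.sum(w * x, axis=tuple(axes), keepdims=True) / w_sum
    w_var = np.sum(w * np.square(x - w_mean), axis=tuple(axes), keepdims=True) / w_sum
    return w_mean, w_var

x = np.array([1.0, 2.0, 3.0])
w = np.array([1.0, 1.0, 2.0])       # weight 2 means 3.0 counts twice
mean, var = weighted_moments(x, [0], w)
print(mean, var)  # matches the plain moments of [1, 2, 3, 3]
```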

`tf.nn.fused_batch_norm(x, scale, offset, mean=None, variance=None, epsilon=0.001, data_format='NHWC', is_training=True, name=None)`

Batch normalization.

As described in http://arxiv.org/abs/1502.03167.

##### Args:

- `x`: Input `Tensor` of 4 dimensions.
- `scale`: A `Tensor` of 1 dimension for scaling.
- `offset`: A `Tensor` of 1 dimension for bias.
- `mean`: A `Tensor` of 1 dimension for population mean, used for inference.
- `variance`: A `Tensor` of 1 dimension for population variance, used for inference.
- `epsilon`: A small float number added to the variance of x.
- `data_format`: The data format for x. Either "NHWC" (default) or "NCHW".
- `is_training`: A bool value to specify if the operation is used for training or inference.
- `name`: A name for this operation (optional).

##### Returns:

- `y`: A 4-D Tensor for the normalized, scaled, offset x.
- `batch_mean`: A 1-D Tensor for the mean of x.
- `batch_var`: A 1-D Tensor for the variance of x.

##### Raises:

- `ValueError`: If `mean` or `variance` is not `None` when `is_training` is `True`.

`tf.nn.batch_normalization(x, mean, variance, offset, scale, variance_epsilon, name=None)`

Batch normalization.

As described in http://arxiv.org/abs/1502.03167.
Normalizes a tensor by `mean` and `variance`, and applies (optionally) a `scale` \(\gamma\) to it, as well as an `offset` \(\beta\):

\(\frac{\gamma(x-\mu)}{\sigma}+\beta\)

`mean`, `variance`, `offset` and `scale` are all expected to be of one of two shapes:

- In all generality, they can have the same number of dimensions as the input `x`, with identical sizes as `x` for the dimensions that are not normalized over (the 'depth' dimension(s)), and dimension 1 for the others which are being normalized over. `mean` and `variance` in this case would typically be the outputs of `tf.nn.moments(..., keep_dims=True)` during training, or running averages thereof during inference.
- In the common case where the 'depth' dimension is the last dimension in the input tensor `x`, they may be one-dimensional tensors of the same size as the 'depth' dimension. This is the case, for example, for the common `[batch, depth]` layout of fully-connected layers, and `[batch, height, width, depth]` for convolutions. `mean` and `variance` in this case would typically be the outputs of `tf.nn.moments(..., keep_dims=False)` during training, or running averages thereof during inference.

##### Args:

- `x`: Input `Tensor` of arbitrary dimensionality.
- `mean`: A mean `Tensor`.
- `variance`: A variance `Tensor`.
- `offset`: An offset `Tensor`, often denoted \(\beta\) in equations, or `None`. If present, will be added to the normalized tensor.
- `scale`: A scale `Tensor`, often denoted \(\gamma\) in equations, or `None`. If present, the scale is applied to the normalized tensor.
- `variance_epsilon`: A small float number to avoid dividing by 0.
- `name`: A name for this operation (optional).

##### Returns:

The normalized, scaled, offset tensor.
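The formula reduces to a few NumPy lines (a hypothetical sketch of the math, with `variance_epsilon` added inside the square root for stability):

```python
import numpy as np

def batch_normalization(x, mean, variance, offset, scale, variance_epsilon):
    # gamma * (x - mu) / sqrt(sigma^2 + eps) + beta; offset and scale may be None.
    inv = 1.0 / np.sqrt(variance + variance_epsilon)
    if scale is not None:
        inv = inv * scale
    y = (x - mean) * inv
    return y + offset if offset is not None else y

# Normalize a [batch, depth] activation with per-depth batch moments
# (the second shape convention described above).
x = np.random.default_rng(1).normal(loc=3.0, scale=2.0, size=(64, 4))
y = batch_normalization(x, x.mean(axis=0), x.var(axis=0),
                        offset=None, scale=None, variance_epsilon=1e-3)
print(np.round(y.mean(axis=0), 6))  # approximately zero per depth column
```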

`tf.nn.batch_norm_with_global_normalization(t, m, v, beta, gamma, variance_epsilon, scale_after_normalization, name=None)`

Batch normalization.

This op is deprecated. See `tf.nn.batch_normalization`.

##### Args:

- `t`: A 4-D input Tensor.
- `m`: A 1-D mean Tensor with size matching the last dimension of `t`. This is the first output from `tf.nn.moments`, or a saved moving average thereof.
- `v`: A 1-D variance Tensor with size matching the last dimension of `t`. This is the second output from `tf.nn.moments`, or a saved moving average thereof.
- `beta`: A 1-D beta Tensor with size matching the last dimension of `t`. An offset to be added to the normalized tensor.
- `gamma`: A 1-D gamma Tensor with size matching the last dimension of `t`. If `scale_after_normalization` is true, this tensor will be multiplied with the normalized tensor.
- `variance_epsilon`: A small float number to avoid dividing by 0.
- `scale_after_normalization`: A bool indicating whether the resulting tensor needs to be multiplied with `gamma`.
- `name`: A name for this operation (optional).

##### Returns:

A batch-normalized `t`.