
Module: tf.nn

Primitive Neural Net (NN) Operations.

Notes on padding

Several neural network operations, such as tf.nn.conv2d and tf.nn.max_pool2d, take a padding parameter, which controls how the input is padded before running the operation. The input is padded by inserting values (typically zeros) before and after the tensor in each spatial dimension. The padding parameter can either be the string 'VALID', which means use no padding, or 'SAME' which adds padding according to a formula which is described below. Certain ops also allow the amount of padding per dimension to be explicitly specified by passing a list to padding.

In the case of convolutions, the input is padded with zeros. In the case of pools, the padded input values are ignored. For example, in a max pool, the sliding window skips padded values, which is equivalent to treating the padded values as -infinity.
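The difference between the two semantics can be sketched in plain NumPy (a 1-D illustration, not the TensorFlow implementation): summing a zero-padded input models a convolution with an all-ones filter, while taking the max over a -infinity-padded input reproduces a pool that ignores padding.

```python
import numpy as np

x = np.array([4.0, -2.0, 3.0])

# Pad one value on each side; window size 3, stride 1.
conv_padded = np.pad(x, 1, constant_values=0.0)      # zeros, as in a convolution
pool_padded = np.pad(x, 1, constant_values=-np.inf)  # -inf, as in a max pool

window = 3
conv_out = [conv_padded[i:i + window].sum() for i in range(len(x))]
pool_out = [pool_padded[i:i + window].max() for i in range(len(x))]

print(conv_out)  # [2.0, 5.0, 1.0]  -- sums include the padded zeros
print(pool_out)  # [4.0, 4.0, 3.0]  -- maxima never come from the padding
```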

'VALID' padding

Passing padding='VALID' to an op causes no padding to be used. As a result, the output size is typically smaller than the input size, even when the stride is one. In the 2D case, the output size is computed as:

out_height = ceil((in_height - filter_height + 1) / stride_height)
out_width  = ceil((in_width - filter_width + 1) / stride_width)

The 1D and 3D cases are similar. Note filter_height and filter_width refer to the filter size after dilations (if any) for convolutions, and refer to the window size for pools.
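This formula can be checked with a small helper (`valid_output_size` is a hypothetical function written here for illustration, not part of tf.nn):

```python
import math

def valid_output_size(in_size, filter_size, stride):
    # 'VALID' padding: no padding is added, so the window must fit
    # entirely inside the input.
    return math.ceil((in_size - filter_size + 1) / stride)

# in_height=5, filter_height=3, stride_height=2:
# ceil((5 - 3 + 1) / 2) = ceil(3 / 2) = 2
print(valid_output_size(5, 3, 2))  # 2
```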

'SAME' padding

With 'SAME' padding, padding is applied to each spatial dimension. When the strides are 1, the input is padded such that the output size is the same as the input size. In the 2D case, the output size is computed as:

out_height = ceil(in_height / stride_height)
out_width  = ceil(in_width / stride_width)

The amount of padding used is the smallest amount that results in the output size. The formula for the total amount of padding per dimension is:

if (in_height % stride_height == 0):
  pad_along_height = max(filter_height - stride_height, 0)
else:
  pad_along_height = max(filter_height - (in_height % stride_height), 0)
if (in_width % stride_width == 0):
  pad_along_width = max(filter_width - stride_width, 0)
else:
  pad_along_width = max(filter_width - (in_width % stride_width), 0)

Finally, the padding on the top, bottom, left and right are:

pad_top = pad_along_height // 2
pad_bottom = pad_along_height - pad_top
pad_left = pad_along_width // 2
pad_right = pad_along_width - pad_left
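As a sketch, the formulas above can be collected into one small Python helper (hypothetical, written here for illustration; it is not part of tf.nn):

```python
def same_padding(in_size, filter_size, stride):
    # Total padding needed along one spatial dimension for 'SAME' padding.
    if in_size % stride == 0:
        pad_along = max(filter_size - stride, 0)
    else:
        pad_along = max(filter_size - (in_size % stride), 0)
    pad_before = pad_along // 2          # top / left
    pad_after = pad_along - pad_before   # bottom / right gets any extra pixel
    return pad_before, pad_after

# Height: in=5, filter=3, stride=2 -> total padding 2, split evenly.
print(same_padding(5, 3, 2))   # (1, 1)
# Width: in=2, filter=2, stride=1 -> total padding 1; the right side gets it.
print(same_padding(2, 2, 1))   # (0, 1)
# An odd total of 5 splits as 2 before and 3 after.
print(same_padding(4, 6, 1))   # (2, 3)
```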

Note that the division by 2 means the padding amounts on the two sides of a dimension (top vs. bottom, left vs. right) may differ by one. In that case, the bottom and right sides always get the one additional padded pixel. For example, when pad_along_height is 5, we pad 2 pixels at the top and 3 pixels at the bottom. Note that this differs from libraries such as PyTorch and Caffe, which require the number of padded pixels to be explicitly specified and always pad the same number of pixels on both sides.

Here is an example of 'SAME' padding:

in_height = 5
filter_height = 3
stride_height = 2

in_width = 2
filter_width = 2
stride_width = 1

inp = tf.ones((2, in_height, in_width, 2))
filter = tf.ones((filter_height, filter_width, 2, 2))
strides = [stride_height, stride_width]
output = tf.nn.conv2d(inp, filter, strides, padding='SAME')
output.shape[1]  # output_height: ceil(5 / 2) == 3
output.shape[2]  # output_width:  ceil(2 / 1) == 2

Explicit padding

Certain ops, like tf.nn.conv2d, also allow a list of explicit padding amounts to be passed to the padding parameter. This list is in the same format as what is passed to tf.pad, except the padding must be a nested list, not a tensor. For example, in the 2D case, the list is in the format [[0, 0], [pad_top, pad_bottom], [pad_left, pad_right], [0, 0]] when data_format is its default value of 'NHWC'. The two [0, 0] pairs indicate the batch and channel dimensions have no padding, which is required, as only spatial dimensions can have padding.

For example:

inp = tf.ones((1, 3, 3, 1))
filter = tf.ones((2, 2, 1, 1))
strides = [1, 1]
padding = [[0, 0], [1, 2], [0, 1], [0, 0]]
output = tf.nn.conv2d(inp, filter, strides, padding=padding)
tuple(output.shape)  # (1, 5, 3, 1)
# Equivalently, tf.pad can be used, since convolutions pad with zeros.
inp = tf.pad(inp, padding)
# 'VALID' means to use no padding in conv2d (we already padded inp)
output2 = tf.nn.conv2d(inp, filter, strides, padding='VALID')
tf.debugging.assert_equal(output, output2)


Modules

experimental module: Public API for tf.nn.experimental namespace.


Classes

class RNNCellDeviceWrapper: Operator that ensures an RNNCell runs on a particular device.

class RNNCellDropoutWrapper: Operator adding dropout to inputs and outputs of the given cell.

class RNNCellResidualWrapper: RNNCell wrapper that ensures cell inputs are added to the outputs.


Functions

all_candidate_sampler(...): Generate the set of all classes.

atrous_conv2d(...): Atrous convolution (a.k.a. convolution with holes or dilated convolution).

atrous_conv2d_transpose(...): The transpose of atrous_conv2d.

avg_pool(...): Performs the avg pooling on the input.

avg_pool1d(...): Performs the average pooling on the input.

avg_pool2d(...): Performs the average pooling on the input.