The pooling ops sweep a rectangular window over the input tensor, computing a
reduction operation for each window (average, max, or max with argmax). Each
pooling op uses rectangular windows of size `ksize` separated by offset
`strides`. For example, if `strides` is all ones every window is used; if
`strides` is all twos, every other window is used in each dimension; and so on.

In detail, the output is

    output[i] = reduce(value[strides * i:strides * i + ksize])

where the indices also take into consideration the padding values. Please refer
to the Convolution section for details about the padding calculation.
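The formula above can be sketched directly in NumPy for the 1-D case (an illustration of the indexing, with VALID-style behavior so only windows that fit entirely inside the input are used — not the library kernel):

```python
import numpy as np

def pool_1d(value, ksize, stride, reduce_fn):
    """Apply reduce_fn to each window of length ksize, offset by stride.

    Only windows lying fully inside `value` are used (VALID-style).
    """
    n_windows = (len(value) - ksize) // stride + 1
    return np.array([reduce_fn(value[stride * i : stride * i + ksize])
                     for i in range(n_windows)])

x = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0])
print(pool_1d(x, ksize=2, stride=2, reduce_fn=np.max))   # [3. 5. 6.]
print(pool_1d(x, ksize=2, stride=2, reduce_fn=np.mean))  # [2.  3.5 5. ]
```

With `stride=1` every window would be used; with `stride=2`, every other one, matching the description above.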
tf.nn.avg_pool(value, ksize, strides, padding, data_format='NHWC', name=None)

Performs average pooling on the input.

Each entry in `output` is the mean of the corresponding size-`ksize` window in
`value`.

Args:

*  `value`: A 4-D `Tensor` of shape `[batch, height, width, channels]` and
   type `float32`, `float64`, `qint8`, `quint8`, or `qint32`.
*  `ksize`: A list of ints that has length >= 4. The size of the window for
   each dimension of the input tensor.
*  `strides`: A list of ints that has length >= 4. The stride of the sliding
   window for each dimension of the input tensor.
*  `padding`: A string, either `'VALID'` or `'SAME'`. The padding algorithm.
   See the Convolution section for details.
*  `data_format`: A string. 'NHWC' and 'NCHW' are supported.
*  `name`: Optional name for the operation.

Returns:

A `Tensor` with the same type as `value`. The average pooled output tensor.
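The effect of average pooling with a 2x2 window and stride 2 (i.e. `ksize=[1, 2, 2, 1]`, `strides=[1, 2, 2, 1]`, VALID padding) on an NHWC tensor can be reproduced with plain NumPy — a sketch of the semantics, not the library kernel, with example values chosen arbitrarily:

```python
import numpy as np

def avg_pool_nhwc(value, k, s):
    """Average-pool an NHWC tensor with a k x k window and stride s (VALID)."""
    n, h, w, c = value.shape
    oh, ow = (h - k) // s + 1, (w - k) // s + 1
    out = np.empty((n, oh, ow, c), value.dtype)
    for i in range(oh):
        for j in range(ow):
            # Mean over the height and width axes of each window.
            window = value[:, i * s : i * s + k, j * s : j * s + k, :]
            out[:, i, j, :] = window.mean(axis=(1, 2))
    return out

x = np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1)
print(avg_pool_nhwc(x, k=2, s=2)[0, :, :, 0])
# [[ 2.5  4.5]
#  [10.5 12.5]]
```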
tf.nn.max_pool(value, ksize, strides, padding, data_format='NHWC', name=None)

Performs max pooling on the input.

Args:

*  `value`: A 4-D `Tensor` with shape `[batch, height, width, channels]` and
   type `tf.float32`.
*  `ksize`: A list of ints that has length >= 4. The size of the window for
   each dimension of the input tensor.
*  `strides`: A list of ints that has length >= 4. The stride of the sliding
   window for each dimension of the input tensor.
*  `padding`: A string, either `'VALID'` or `'SAME'`. The padding algorithm.
   See the Convolution section for details.
*  `data_format`: A string. 'NHWC' and 'NCHW' are supported.
*  `name`: Optional name for the operation.

Returns:

A `Tensor` with type `tf.float32`. The max pooled output tensor.
tf.nn.max_pool_with_argmax(input, ksize, strides, padding, Targmax=None, name=None)

Performs max pooling on the input and outputs both max values and indices.

The indices in `argmax` are flattened, so that a maximum value at position
`[b, y, x, c]` becomes flattened index
`((b * height + y) * width + x) * channels + c`.

Args:

*  `input`: A `Tensor`. Must be one of the following types: `float32`,
   `half`. 4-D with shape `[batch, height, width, channels]`. Input to pool
   over.
*  `ksize`: A list of `ints` that has length >= 4. The size of the window
   for each dimension of the input tensor.
*  `strides`: A list of `ints` that has length >= 4. The stride of the
   sliding window for each dimension of the input tensor.
*  `padding`: A `string` from: `"SAME"`, `"VALID"`. The type of padding
   algorithm to use.
*  `Targmax`: An optional `tf.DType` from: `tf.int32, tf.int64`. Defaults to
   `tf.int64`.
*  `name`: A name for the operation (optional).

Returns:

A tuple of `Tensor` objects (output, argmax).

*  `output`: A `Tensor`. Has the same type as `input`. The max pooled output
   tensor.
*  `argmax`: A `Tensor` of type `Targmax`. 4-D. The flattened indices of the
   max values chosen for each output.
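The flattening formula above is simply row-major (C-order) indexing of the NHWC shape, as a quick NumPy check shows (the shapes here are arbitrary examples):

```python
import numpy as np

batch, height, width, channels = 2, 3, 4, 5

def flatten_index(b, y, x, c):
    # The formula from the docs: ((b * height + y) * width + x) * channels + c
    return ((b * height + y) * width + x) * channels + c

# It matches NumPy's row-major flattening of a [batch, height, width, channels]
# index tuple.
b, y, x, c = 1, 2, 3, 4
assert flatten_index(b, y, x, c) == np.ravel_multi_index(
    (b, y, x, c), (batch, height, width, channels))
print(flatten_index(b, y, x, c))  # 119
```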
tf.nn.avg_pool3d(input, ksize, strides, padding, name=None)

Performs 3D average pooling on the input.

Args:

*  `input`: A `Tensor`. Must be one of the following types: `float32`,
   `float64`, `int64`, `int32`, `uint8`, `uint16`, `int16`, `int8`,
   `complex64`, `complex128`, `qint8`, `quint8`, `qint32`, `half`. Shape
   `[batch, depth, rows, cols, channels]` tensor to pool over.
*  `ksize`: A list of `ints` that has length >= 5. 1-D tensor of length 5.
   The size of the window for each dimension of the input tensor. Must have
   `ksize[0] = ksize[4] = 1`.
*  `strides`: A list of `ints` that has length >= 5. 1-D tensor of length 5.
   The stride of the sliding window for each dimension of `input`. Must have
   `strides[0] = strides[4] = 1`.
*  `padding`: A `string` from: `"SAME"`, `"VALID"`. The type of padding
   algorithm to use.
*  `name`: A name for the operation (optional).

Returns:

A `Tensor`. Has the same type as `input`. The average pooled output tensor.
tf.nn.max_pool3d(input, ksize, strides, padding, name=None)

Performs 3D max pooling on the input.

Args:

*  `input`: A `Tensor`. Must be one of the following types: `float32`,
   `float64`, `int64`, `int32`, `uint8`, `uint16`, `int16`, `int8`,
   `complex64`, `complex128`, `qint8`, `quint8`, `qint32`, `half`. Shape
   `[batch, depth, rows, cols, channels]` tensor to pool over.
*  `ksize`: A list of `ints` that has length >= 5. 1-D tensor of length 5.
   The size of the window for each dimension of the input tensor. Must have
   `ksize[0] = ksize[4] = 1`.
*  `strides`: A list of `ints` that has length >= 5. 1-D tensor of length 5.
   The stride of the sliding window for each dimension of `input`. Must have
   `strides[0] = strides[4] = 1`.
*  `padding`: A `string` from: `"SAME"`, `"VALID"`. The type of padding
   algorithm to use.
*  `name`: A name for the operation (optional).

Returns:

A `Tensor`. Has the same type as `input`. The max pooled output tensor.
tf.nn.fractional_avg_pool(value, pooling_ratio, pseudo_random=None, overlapping=None, deterministic=None, seed=None, seed2=None, name=None)

Performs fractional average pooling on the input.

Fractional average pooling is similar to fractional max pooling in the pooling
region generation step. The only difference is that after pooling regions are
generated, a mean operation is performed instead of a max operation in each
pooling region.

Args:

*  `value`: A `Tensor`. Must be one of the following types: `float32`,
   `float64`, `int32`, `int64`. 4-D with shape
   `[batch, height, width, channels]`.
*  `pooling_ratio`: A list of `floats` that has length >= 4. Pooling ratio
   for each dimension of `value`; currently only supports the row and col
   dimensions, and should be >= 1.0. For example, a valid pooling ratio looks
   like `[1.0, 1.44, 1.73, 1.0]`. The first and last elements must be 1.0
   because we don't allow pooling on the batch and channels dimensions; 1.44
   and 1.73 are the pooling ratios on the height and width dimensions,
   respectively.
*  `pseudo_random`: An optional `bool`. Defaults to `False`. When set to
   True, generates the pooling sequence in a pseudorandom fashion; otherwise,
   in a random fashion. See the paper by Benjamin Graham, Fractional
   Max-Pooling, for the difference between pseudorandom and random.
*  `overlapping`: An optional `bool`. Defaults to `False`. When set to True,
   the values at the boundary of adjacent pooling cells are used by both
   cells. For example:

       index  0   1   2   3   4
       value  20  5   16  3   7

   If the pooling sequence is [0, 2, 4], then 16, at index 2, will be used
   twice. The result would be [41/3, 26/3] for fractional avg pooling.
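The overlapping example above can be verified in a few lines of Python (cell boundaries inclusive on both ends when `overlapping=True`):

```python
value = [20, 5, 16, 3, 7]
seq = [0, 2, 4]  # the pooling sequence from the example

# With overlapping=True each cell spans [seq[i], seq[i+1]] inclusive, so the
# boundary element (index 2, value 16) belongs to both cells.
cells = [value[seq[i] : seq[i + 1] + 1] for i in range(len(seq) - 1)]
means = [sum(c) / len(c) for c in cells]
print(means)  # [13.666..., 8.666...], i.e. [41/3, 26/3]
```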
*  `deterministic`: An optional `bool`. Defaults to `False`. When set to
   True, a fixed pooling region will be used when iterating over a
   FractionalAvgPool node in the computation graph. Mainly used in unit tests
   to make FractionalAvgPool deterministic.
*  `seed`: An optional `int`. Defaults to `0`. If either seed or seed2 is set
   to be nonzero, the random number generator is seeded by the given seed.
   Otherwise, it is seeded by a random seed.
*  `seed2`: An optional `int`. Defaults to `0`. A second seed to avoid seed
   collision.
*  `name`: A name for the operation (optional).

Returns:

A tuple of `Tensor` objects (output, row_pooling_sequence, col_pooling_sequence).

*  `output`: A `Tensor`. Has the same type as `value`. Output tensor after
   fractional avg pooling.
*  `row_pooling_sequence`: A `Tensor` of type `int64`. Row pooling sequence,
   needed to calculate the gradient.
*  `col_pooling_sequence`: A `Tensor` of type `int64`. Column pooling
   sequence, needed to calculate the gradient.
tf.nn.fractional_max_pool(value, pooling_ratio, pseudo_random=None, overlapping=None, deterministic=None, seed=None, seed2=None, name=None)

Performs fractional max pooling on the input.

Fractional max pooling is slightly different from regular max pooling. In
regular max pooling, you downsize an input set by taking the maximum value of
smaller N x N subsections of the set (often 2x2), reducing the set by a factor
of N, where N is an integer. Fractional max pooling, as you might expect from
the word "fractional", means that the overall reduction ratio N does not have
to be an integer.

The sizes of the pooling regions are generated randomly but are fairly
uniform. For example, let's look at the height dimension, and the constraints
on the list of rows that will be pool boundaries.

First we define the following:

    input_row_length  : the number of rows from the input set
    output_row_length : which will be smaller than the input
    alpha = input_row_length / output_row_length : our reduction ratio
    K = floor(alpha)
    row_pooling_sequence : this is the result list of pool boundary rows

Then, row_pooling_sequence should satisfy:

    a[0] = 0                    : the first value of the sequence is 0
    a[end] = input_row_length   : the last value of the sequence is the size
    K <= (a[i+1] - a[i]) <= K+1 : all intervals are of size K or K+1
    length(row_pooling_sequence) = output_row_length + 1
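As an illustration of these constraints only (this is not the fairly-uniform random scheme the op actually uses), a sequence satisfying them can be built by shuffling the right mix of K-sized and (K+1)-sized intervals:

```python
import math
import random

def random_pooling_sequence(input_row_length, output_row_length, rng=random):
    """Illustrative sketch: build a boundary sequence satisfying the
    constraints above, not TensorFlow's actual generation algorithm."""
    alpha = input_row_length / output_row_length
    K = math.floor(alpha)
    # Number of (K+1)-sized intervals needed so the sizes sum to
    # input_row_length exactly.
    n_big = input_row_length - K * output_row_length
    sizes = [K + 1] * n_big + [K] * (output_row_length - n_big)
    rng.shuffle(sizes)
    seq = [0]
    for s in sizes:
        seq.append(seq[-1] + s)
    return seq

seq = random_pooling_sequence(10, 7)  # alpha = 10/7, so K = 1
assert seq[0] == 0 and seq[-1] == 10 and len(seq) == 7 + 1
assert all(1 <= b - a <= 2 for a, b in zip(seq, seq[1:]))  # K or K+1
```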
For more details on fractional max pooling, see the paper: Benjamin Graham,
Fractional Max-Pooling.

Args:

*  `value`: A `Tensor`. Must be one of the following types: `float32`,
   `float64`, `int32`, `int64`. 4-D with shape
   `[batch, height, width, channels]`.
*  `pooling_ratio`: A list of `floats` that has length >= 4. Pooling ratio
   for each dimension of `value`; currently only supports the row and col
   dimensions, and should be >= 1.0. For example, a valid pooling ratio looks
   like `[1.0, 1.44, 1.73, 1.0]`. The first and last elements must be 1.0
   because we don't allow pooling on the batch and channels dimensions; 1.44
   and 1.73 are the pooling ratios on the height and width dimensions,
   respectively.
*  `pseudo_random`: An optional `bool`. Defaults to `False`. When set to
   True, generates the pooling sequence in a pseudorandom fashion; otherwise,
   in a random fashion. See the paper by Benjamin Graham, Fractional
   Max-Pooling, for the difference between pseudorandom and random.
*  `overlapping`: An optional `bool`. Defaults to `False`. When set to True,
   the values at the boundary of adjacent pooling cells are used by both
   cells. For example:

       index  0   1   2   3   4
       value  20  5   16  3   7

   If the pooling sequence is [0, 2, 4], then 16, at index 2, will be used
   twice. The result would be [20, 16] for fractional max pooling.
*  `deterministic`: An optional `bool`. Defaults to `False`. When set to
   True, a fixed pooling region will be used when iterating over a
   FractionalMaxPool node in the computation graph. Mainly used in unit tests
   to make FractionalMaxPool deterministic.
*  `seed`: An optional `int`. Defaults to `0`. If either seed or seed2 is set
   to be nonzero, the random number generator is seeded by the given seed.
   Otherwise, it is seeded by a random seed.
*  `seed2`: An optional `int`. Defaults to `0`. A second seed to avoid seed
   collision.
*  `name`: A name for the operation (optional).

Returns:

A tuple of `Tensor` objects (output, row_pooling_sequence, col_pooling_sequence).

*  `output`: A `Tensor`. Has the same type as `value`. Output tensor after
   fractional max pooling.
*  `row_pooling_sequence`: A `Tensor` of type `int64`. Row pooling sequence,
   needed to calculate the gradient.
*  `col_pooling_sequence`: A `Tensor` of type `int64`. Column pooling
   sequence, needed to calculate the gradient.
tf.nn.pool(input, window_shape, pooling_type, padding, dilation_rate=None, strides=None, name=None, data_format=None)

Performs an N-D pooling operation.

In the case that `data_format` does not start with "NC", computes for
0 <= b < batch_size, 0 <= x[i] < output_spatial_shape[i],
0 <= c < num_channels:

    output[b, x[0], ..., x[N-1], c] =
      REDUCE_{z[0], ..., z[N-1]}
        input[b,
              x[0] * strides[0] - pad_before[0] + dilation_rate[0] * z[0],
              ...
              x[N-1] * strides[N-1] - pad_before[N-1] + dilation_rate[N-1] * z[N-1],
              c],

where the reduction function REDUCE depends on the value of `pooling_type`,
and pad_before is defined based on the value of `padding` as described in the
Convolution section. The reduction never includes out-of-bounds positions.

In the case that `data_format` starts with `"NC"`, the `input` and output are
simply transposed as follows:

    pool(input, data_format, **kwargs) =
      tf.transpose(pool(tf.transpose(input, [0] + range(2, N+2) + [1]),
                        **kwargs),
                   [0, N+1] + range(1, N+1))
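The two permutations in that identity are inverses of each other, which is why the transpose round-trip leaves the layout unchanged; a quick check for N = 2 (e.g. NCHW input):

```python
N = 2  # number of spatial dimensions, e.g. NCHW input

# Move channels to the last axis before pooling...
to_channels_last = [0] + list(range(2, N + 2)) + [1]    # [0, 2, 3, 1]
# ...and back to the second axis afterwards.
to_channels_first = [0, N + 1] + list(range(1, N + 1))  # [0, 3, 1, 2]

# Composing the two transposes (outer perm indexes into the inner one)
# yields the identity permutation.
composed = [to_channels_last[i] for i in to_channels_first]
print(composed)  # [0, 1, 2, 3]
```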
Args:

*  `input`: Tensor of rank N+2, of shape
   `[batch_size] + input_spatial_shape + [num_channels]` if data_format does
   not start with "NC" (default), or
   `[batch_size, num_channels] + input_spatial_shape` if data_format starts
   with "NC". Pooling happens over the spatial dimensions only.
*  `window_shape`: Sequence of N ints >= 1.
*  `pooling_type`: Specifies pooling operation, must be "AVG" or "MAX".
*  `padding`: The padding algorithm, must be "SAME" or "VALID". See the
   Convolution section for details.
*  `dilation_rate`: Optional. Dilation rate. List of N ints >= 1. Defaults
   to [1]*N. If any value of dilation_rate is > 1, then all values of strides
   must be 1.
*  `strides`: Optional. Sequence of N ints >= 1. Defaults to [1]*N. If any
   value of strides is > 1, then all values of dilation_rate must be 1.
*  `name`: Optional. Name of the op.
*  `data_format`: A string or None. Specifies whether the channel dimension
   of the `input` and output is the last dimension (default, or if
   `data_format` does not start with "NC"), or the second dimension (if
   `data_format` starts with "NC"). For N=1, the valid values are "NWC"
   (default) and "NCW". For N=2, the valid values are "NHWC" (default) and
   "NCHW". For N=3, the valid value is "NDHWC".

Returns:

Tensor of rank N+2, of shape
`[batch_size] + output_spatial_shape + [num_channels]` if data_format is None
or does not start with "NC", or
`[batch_size, num_channels] + output_spatial_shape` if data_format starts
with "NC", where `output_spatial_shape` depends on the value of padding:

If padding = "SAME":
  output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides[i])

If padding = "VALID":
  output_spatial_shape[i] =
    ceil((input_spatial_shape[i] - (window_shape[i] - 1) * dilation_rate[i])
         / strides[i])
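These shape formulas can be written down directly (a small sketch mirroring the rules above, with example sizes chosen arbitrarily):

```python
import math

def output_spatial_shape(input_shape, window_shape, strides, dilation_rate,
                         padding):
    """Output spatial sizes per the SAME/VALID formulas above."""
    if padding == "SAME":
        return [math.ceil(i / s) for i, s in zip(input_shape, strides)]
    if padding == "VALID":
        return [math.ceil((i - (w - 1) * d) / s)
                for i, w, d, s in zip(input_shape, window_shape,
                                      dilation_rate, strides)]
    raise ValueError("padding must be 'SAME' or 'VALID'")

# A 7x7 input, 3x3 window, stride 2, no dilation:
print(output_spatial_shape([7, 7], [3, 3], [2, 2], [1, 1], "SAME"))   # [4, 4]
print(output_spatial_shape([7, 7], [3, 3], [2, 2], [1, 1], "VALID"))  # [3, 3]
```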
Raises:

*  `ValueError`: if arguments are invalid.