Help protect the Great Barrier Reef with TensorFlow on Kaggle Join Challenge


A preprocessing layer which buckets continuous features by ranges.

Inherits From: PreprocessingLayer, Layer, Module

Used in the notebooks

Used in the guide Used in the tutorials

This layer will place each element of its input data into one of several contiguous ranges and output an integer index indicating which range each element was placed in.

For an overview and full list of preprocessing layers, see the preprocessing guide.

Input shape:

Any tf.Tensor or tf.RaggedTensor of dimension 2 or higher.

Output shape:

Same as input shape.


Bucketize float values based on provided buckets.

>>> input = np.array([[-1.5, 1.0, 3.4, .5], [0.0, 3.0, 1.3, 0.0]])
>>> layer = tf.keras.layers.Discretization(bin_boundaries=[0., 1., 2.])
>>> layer(input)
<tf.Tensor: shape=(2, 4), dtype=int64, numpy=
array([[0, 2, 3, 1],
       [1, 3, 2, 1]], dtype=int64)>

Bucketize float values based on a number of buckets to compute.

>>> input = np.array([[-1.5, 1.0, 3.4, .5], [0.0, 3.0, 1.3, 0.0]])
>>> layer = tf.keras.layers.Discretization(num_bins=4, epsilon=0.01)
>>> layer.adapt(input)
>>> layer(input)
<tf.Tensor: shape=(2, 4), dtype=int64, numpy=
array([[0, 2, 3, 2],
       [1, 3, 3, 1]], dtype=int64)>

bin_boundaries A list of bin boundaries. The leftmost and rightmost bins will always extend to -inf and inf, so bin_boundaries=[0., 1., 2.] generates bins (-inf, 0.), [0., 1.), [1., 2.), and [2., +inf). If this option is set, adapt should not be called.
num_bins The integer number of bins to compute. If this option is set, adapt should be called to learn the bin boundaries.
epsilon Error tolerance, typically a small fraction close to zero (e.g. 0.01). Higher values of epsilon increase the quantile approximation, and hence result in more unequal buckets, but could improve performance and resource consumption.
is_adapted Whether the layer has been fit to data already.



View source

Fits the state of the preprocessing layer to the data being passed.

After calling adapt on a layer, a preprocessing layer's state will not update during training. In order to make preprocessing layers efficient in any distribution context, they are kept constant with respect to any compiled tf.Graphs that call the layer. This does not affect the layer use when adapting each layer only once, but if you adapt a layer multiple times you will need to take care to re-compile any compiled functions as follows:

  • If you are adding a preprocessing layer to a keras.Model, you need to call model.compile after each subsequent call to adapt.
  • If you are calling a preprocessing layer inside, you should call map again on the input after each adapt.
  • If you are using a tf.function directly which calls a preprocessing layer, you need to call tf.function again on your callable after each subsequent call to adapt.

tf.keras.Model example with multiple adapts:

layer = tf.keras.layers.experimental.preprocessing.Normalization(
layer.adapt([0, 2])
model = tf.keras.Sequential(layer)
model.predict([0, 1, 2])
array([-1.,  0.,  1.], dtype=float32)
layer.adapt([-1, 1])
model.compile() # This is needed to re-compile model.predict!
model.predict([0, 1, 2])
array([0., 1., 2.], dtype=float32)