tf.feature_column.bucketized_column

TensorFlow 1 version View source on GitHub

Represents discretized dense input bucketed by boundaries.

Used in the notebooks

Used in the tutorials

Buckets include the left boundary, and exclude the right boundary. Namely, boundaries=[0., 1., 2.] generates buckets (-inf, 0.), [0., 1.), [1., 2.), and [2., +inf).

For example, if the inputs are

boundaries = [0, 10, 100]
input tensor = [[-5, 10000]
                [150,   10]
                [5,    100]]

then the output will be

output = [[0, 3]
          [3, 2]
          [1, 3]]

Example:

price = tf.feature_column.numeric_column('price')
bucketized_price = tf.feature_column.bucketized_column(
    price, boundaries=[...])
columns = [bucketized_price, ...]
features = tf.io.parse_example(
    ..., features=tf.feature_column.make_parse_example_spec(columns))
dense_tensor = tf.keras.layers.DenseFeatures(columns)(features)

bucketized_column can also be crossed with another categorical column using crossed_column:

price = tf.feature_column.numeric_column('price')
# bucketized_column converts numerical feature to a categorical one.
bucketized_price = tf.feature_column.bucketized_column(
    price, boundaries=[...])
# 'keywords' is a string feature.
price_x_keywords = tf.feature_column.crossed_column(
    [bucketized_price, 'keywords'], 50K)
columns = [price_x_keywords, ...]
features = tf.io.parse_example(
    ..., features=tf.feature_column.make_parse_example_spec(columns))
dense_tensor = tf.keras.layers.DenseFeatures(columns)(features)
linear_model = tf.keras.experimental.LinearModel(units=...)(dense_tensor)

source_column A one-dimensional dense column which is generated with numeric_column.
boundaries A sorted list or tuple of floats specifying the boundaries.

A BucketizedColumn.

ValueError If source_column is not a numeric column, or if it is not one-dimensional.
ValueError If boundaries is not a sorted list or tuple.