tft.bucketize

Returns a bucketized column, with a bucket index assigned to each input.

Args:
x: A numeric input Tensor, SparseTensor, or RaggedTensor whose values should be mapped to buckets. For a CompositeTensor, only non-missing values are included in the quantiles computation, and the result of bucketize is a CompositeTensor with non-missing values mapped to buckets. If elementwise=True, x must be dense.
num_buckets: The number of buckets. Values in the input x are divided into approximately equal-sized buckets.
epsilon: (Optional) Error tolerance, typically a small fraction close to zero. If the caller does not specify a value, a suitable one is chosen based on experimental results. For num_buckets less than 100, epsilon is 0.01, which handles a dataset of up to ~1 trillion input values. For larger num_buckets, epsilon is set to 1/num_buckets to enforce a stricter error tolerance: more buckets mean a smaller range for each bucket, so the boundaries need to be less fuzzy. See analyzers.quantiles() for details.
weights: (Optional) Weights tensor for the quantiles. Must have the same shape as x.
elementwise: (Optional) If true, bucketize each element of the tensor independently.
name: (Optional) A name for this operation.
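
The default epsilon rule described above can be sketched in a few lines. This is a minimal illustration derived only from the description of epsilon, not the library's code; the helper name default_epsilon is hypothetical.

```python
def default_epsilon(num_buckets: int) -> float:
    """Hypothetical helper mirroring the documented default for epsilon.

    Below 100 buckets the tolerance is 0.01; with more buckets it
    tightens to 1 / num_buckets. At exactly 100 both rules agree.
    """
    return min(1.0 / num_buckets, 0.01)


for n in (10, 100, 200, 1000):
    print(n, default_epsilon(n))
```

Passing epsilon explicitly bypasses this default, which may be useful when a coarser tolerance is acceptable for a large num_buckets.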

Returns:
A Tensor of the same shape as x, with each element in the returned tensor representing the bucketized value. The bucketized value is in the range [0, actual_num_buckets). The actual number of buckets can differ from the num_buckets hint, for example when the number of distinct values is smaller than num_buckets, or when the input values are not uniformly distributed. NaN values are mapped to the last bucket. Values with NaN weights are ignored when bucket boundaries are computed.
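
To make the return value concrete, the following pure-Python sketch assigns bucket indices from quantile boundaries. It rests on assumptions that are not taken from the library: it uses exact quantiles from the standard library instead of the approximate, epsilon-controlled quantiles, it ignores weights and NaN handling, and its treatment of values that fall exactly on a boundary is a choice made for the sketch. The name bucketize_sketch is hypothetical.

```python
import bisect
import statistics


def bucketize_sketch(values, num_buckets):
    """Illustrative quantile bucketing; not the tft implementation."""
    if not isinstance(num_buckets, int):
        raise TypeError("num_buckets must be an int")
    if num_buckets <= 1:
        raise ValueError("num_buckets must be > 1")
    # num_buckets - 1 cut points split the data into roughly equal-sized groups.
    cuts = statistics.quantiles(values, n=num_buckets, method="inclusive")
    # Duplicate cut points collapse, so fewer buckets than requested may remain.
    boundaries = sorted(set(cuts))
    # The bucket index is the number of boundaries at or below the value
    # (an assumption of this sketch for values equal to a boundary).
    indices = [bisect.bisect_right(boundaries, v) for v in values]
    return indices, boundaries


indices, boundaries = bucketize_sketch([1, 2, 3, 4, 5, 6, 7, 8], num_buckets=4)
print(boundaries)
print(indices)

# With a single distinct value, the cut points collapse to one boundary.
few_indices, few_boundaries = bucketize_sketch([5, 5, 5, 5], num_buckets=4)
print(len(few_boundaries) + 1)  # number of buckets actually available
```

In the second call, four buckets are requested but only one boundary survives, which mirrors how the actual number of buckets can be smaller than the num_buckets hint when the input has few distinct values.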

Raises:
TypeError: If num_buckets is not an int.
ValueError: If num_buckets is not greater than 1.
ValueError: If elementwise=True and x is a CompositeTensor.