tft.bucketize

Returns a bucketized column, with a bucket index assigned to each input.

tft.bucketize(
    x: common_types.ConsistentTensorType,
    num_buckets: int,
    epsilon: Optional[float] = None,
    weights: Optional[tf.Tensor] = None,
    elementwise: bool = False,
    name: Optional[str] = None
) -> common_types.ConsistentTensorType

Used in the tutorials
TFX Estimator Component Tutorial, TFX Keras Component Tutorial

Args
`x`	A numeric input `Tensor`, `SparseTensor`, or `RaggedTensor` whose values should be mapped to buckets. For a `CompositeTensor`, only non-missing values are included in the quantiles computation, and the result of `bucketize` is a `CompositeTensor` with the non-missing values mapped to buckets. If `elementwise=True`, then `x` must be dense.
`num_buckets`	Values in the input `x` are divided into approximately equal-sized buckets, where the number of buckets is `num_buckets`.
`epsilon`	(Optional) Error tolerance, typically a small fraction close to zero. If the caller does not specify a value, a suitable value is computed based on experimental results. For `num_buckets` less than 100, the value 0.01 is chosen, which handles a dataset of up to ~1 trillion input values. If `num_buckets` is larger, `epsilon` is set to (1/`num_buckets`) to enforce a stricter error tolerance: more buckets mean a smaller range for each bucket, so the boundaries should be less fuzzy. See `analyzers.quantiles()` for details.
`weights`	(Optional) Weights tensor for the quantiles. The tensor must have the same shape as `x`.
`elementwise`	(Optional) If true, bucketize each element of the tensor independently.
`name`	(Optional) A name for this operation.
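
The default `epsilon` rule described above can be sketched in a few lines of plain Python. This is only an illustration of the documented behavior, not the library's code: `default_epsilon` is a hypothetical helper name, and the `min(...)` expression is an assumption that reproduces the two documented cases (0.01 below 100 buckets, 1/`num_buckets` above).

```python
from typing import Optional


def default_epsilon(num_buckets: int, epsilon: Optional[float] = None) -> float:
    """Hypothetical helper: pick an error tolerance as the Args text describes."""
    if epsilon is not None:
        # A caller-supplied tolerance is used unchanged.
        return epsilon
    # Assumption: below 100 buckets this yields 0.01; above, 1 / num_buckets.
    return min(1.0 / num_buckets, 0.01)


print(default_epsilon(10))    # 0.01
print(default_epsilon(1000))  # 0.001
```

With more buckets each bucket covers a narrower range, so the sketch tightens the tolerance in step with `num_buckets`.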

Returns
A `Tensor` of the same shape as `x`, with each element in the returned tensor representing the bucketized value. Each bucketized value is in the range [0, actual_num_buckets). The actual number of buckets can differ from the `num_buckets` hint, for example when the number of distinct values is smaller than `num_buckets`, or when the input values are not uniformly distributed. NaN values are mapped to the last bucket. Values with NaN weights are ignored when computing bucket boundaries.
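
To make the return semantics concrete, here is a small pure-Python sketch. It is not how `tft.bucketize` is implemented: the real analyzer computes approximate, optionally weighted quantiles over the whole dataset, whereas this sketch uses exact quantiles over an in-memory list. The names `sketch_boundaries` and `sketch_bucketize` are hypothetical, and both the deduplication of boundaries and the rule that a value equal to a boundary falls into the upper bucket are assumptions made for illustration.

```python
import bisect
import math
from typing import List, Sequence


def sketch_boundaries(values: Sequence[float], num_buckets: int) -> List[float]:
    """Exact quantile boundaries, with repeated boundaries dropped.

    Assumes at least one non-NaN value is present.
    """
    finite = sorted(v for v in values if not math.isnan(v))
    boundaries: List[float] = []
    for i in range(1, num_buckets):
        candidate = finite[(i * len(finite)) // num_buckets]
        # Skip duplicates, so few distinct values yield fewer buckets.
        if not boundaries or candidate != boundaries[-1]:
            boundaries.append(candidate)
    return boundaries


def sketch_bucketize(values: Sequence[float], num_buckets: int) -> List[int]:
    """Map each value to a bucket index in [0, len(boundaries) + 1)."""
    boundaries = sketch_boundaries(values, num_buckets)
    last_bucket = len(boundaries)  # NaN inputs go to the last bucket
    return [
        last_bucket if math.isnan(v) else bisect.bisect_right(boundaries, v)
        for v in values
    ]


print(sketch_bucketize([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], 4))
# [0, 0, 1, 1, 2, 2, 3, 3]
print(sketch_bucketize([1.0, float("nan"), 3.0], 2))
# [0, 1, 1]
```

In the sketch, an input such as `[1.0, 1.0, 1.0, 2.0]` with `num_buckets=4` produces only two distinct boundaries, so the indices span three buckets rather than four, which mirrors the note that the actual bucket count can differ from the hint.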

Raises
`TypeError`	If `num_buckets` is not an int.
`ValueError`	If `num_buckets` is not greater than 1.
`ValueError`	If `elementwise=True` and `x` is a `CompositeTensor`.
