Module: tft

Init module for TF.Transform.

Modules

coders module: Module level imports for tensorflow_transform.coders.

experimental module: Module level imports for tensorflow_transform.experimental.

Classes

class DatasetMetadata: Metadata about a dataset used for the "instance dict" format.

class TFTransformOutput: A wrapper around the output of tf.Transform.

class TransformFeaturesLayer: A Keras layer for applying a tf.Transform output to input layers.

Functions

annotate_asset(...): Creates mapping between user-defined keys and SavedModel assets.

apply_buckets(...): Returns a bucketized column, with a bucket index assigned to each input.

apply_buckets_with_interpolation(...): Interpolates within the provided buckets and then normalizes to 0 to 1.

apply_pyfunc(...): Applies a python function to some Tensors.

apply_vocabulary(...): Maps x to a vocabulary specified by the deferred tensor.

bag_of_words(...): Computes a bag of "words" based on the specified ngram configuration.

bucketize(...): Returns a bucketized column, with a bucket index assigned to each input.

bucketize_per_key(...): Returns a bucketized column, with a bucket index assigned to each input, using buckets computed separately for each key.

compute_and_apply_vocabulary(...): Generates a vocabulary for x and maps it to an integer with this vocab.

count_per_key(...): Computes the count of each element of a Tensor.

covariance(...): Computes the covariance matrix over the whole dataset.

deduplicate_tensor_per_row(...): Deduplicates each row (0-th dimension) of the provided tensor.

estimated_probability_density(...): Computes an approximate probability density at each x, given the bins.

get_analyze_input_columns(...): Returns the columns that are required inputs of AnalyzeDataset.

get_num_buckets_for_transformed_feature(...): Provides the number of buckets for a transformed feature if annotated.

get_transform_input_columns(...): Returns the columns that are required inputs of TransformDataset.

hash_strings(...): Hashes strings into buckets.

histogram(...): Computes a histogram over x, given the bin boundaries or bin count.

make_and_track_object(...): Keeps track of the object created by invoking trackable_factory_callable.

max(...): Computes the maximum of the values of x over the whole dataset.

mean(...): Computes the mean of the values of a Tensor over the whole dataset.

min(...): Computes the minimum of the values of x over the whole dataset.

ngrams(...): Creates a SparseTensor of n-grams.

pca(...): Computes PCA on the dataset using biased covariance.

quantiles(...): Computes the quantile boundaries of a Tensor over the whole dataset.

scale_by_min_max(...): Scales a numerical column into the range [output_min, output_max].

scale_by_min_max_per_key(...): Scales a numerical column into a predefined range on a per-key basis.

scale_to_0_1(...): Returns a column which is the input column scaled to have range [0,1].

scale_to_0_1_per_key(...): Returns a column which is the input column scaled to have range [0,1] on a per-key basis.

scale_to_gaussian(...): Returns an (approximately) normal column with mean 0 and variance 1.

scale_to_z_score(...): Returns a standardized column with mean 0 and variance 1.

scale_to_z_score_per_key(...): Returns a standardized column with mean 0 and variance 1, grouped per key.

segment_indices(...): Returns a Tensor of indices within each segment.

size(...): Computes the total size of instances in a Tensor over the whole dataset.

sparse_tensor_left_align(...): Re-arranges a tf.SparseTensor and returns a left-aligned version of it.

sparse_tensor_to_dense_with_shape(...): Converts a SparseTensor into a dense tensor and sets its shape.

sum(...): Computes the sum of the values of a Tensor over the whole dataset.

tfidf(...): Maps the terms in x to their term frequency * inverse document frequency.

tukey_h_params(...): Computes the h parameters of the values of a Tensor over the dataset.

tukey_location(...): Computes the location of the values of a Tensor over the whole dataset.

tukey_scale(...): Computes the scale of the values of a Tensor over the whole dataset.

var(...): Computes the variance of the values of a Tensor over the whole dataset.

vocabulary(...): Computes the unique values of x over the whole dataset.

word_count(...): Finds the token count of each document/row.
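
Many of the functions above are described as computing something "over the whole dataset" and then mapping each input. The following is a minimal plain-Python sketch of that two-pass idea (an analysis pass, then a per-value transform pass), loosely modelled on the summaries of scale_to_z_score, scale_to_0_1, compute_and_apply_vocabulary and apply_buckets. It does not use or reproduce the tft API: every function name below is hypothetical, and the use of population variance, the handling of a constant column, the frequency ordering of the vocabulary, the -1 out-of-vocabulary index and the treatment of values that fall exactly on a bucket boundary are all assumptions of the sketch.

```python
import bisect
import statistics
from collections import Counter


def analyze_z_score(values):
    """Analysis pass: mean and population standard deviation of all values."""
    return statistics.fmean(values), statistics.pstdev(values)


def apply_z_score(value, mean, std):
    """Transform pass: standardize one value (a constant column maps to 0)."""
    return 0.0 if std == 0 else (value - mean) / std


def analyze_min_max(values):
    """Analysis pass: minimum and maximum of all values."""
    return min(values), max(values)


def apply_scale_0_1(value, lo, hi):
    """Transform pass: rescale one value into [0, 1]."""
    # A constant column has no range; 0.5 is an arbitrary choice of this sketch.
    return 0.5 if hi == lo else (value - lo) / (hi - lo)


def analyze_vocabulary(tokens):
    """Analysis pass: unique tokens, most frequent first, ties broken by token."""
    counts = Counter(tokens)
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    return {token: index for index, token in enumerate(ordered)}


def apply_vocabulary_index(token, vocab, default_value=-1):
    """Transform pass: map one token to its index, or to default_value."""
    return vocab.get(token, default_value)


def apply_bucket_index(value, boundaries):
    """Transform pass: index of the bucket that a value falls into.

    boundaries must be sorted; a value equal to a boundary goes to the
    upper bucket in this sketch.
    """
    return bisect.bisect_right(boundaries, value)


# Usage: analyze once over the full dataset, then transform value by value.
ages = [20.0, 30.0, 40.0]
mean, std = analyze_z_score(ages)
z_scores = [apply_z_score(a, mean, std) for a in ages]

lo, hi = analyze_min_max(ages)
unit_scaled = [apply_scale_0_1(a, lo, hi) for a in ages]

vocab = analyze_vocabulary(["b", "a", "b", "c"])
token_ids = [apply_vocabulary_index(t, vocab) for t in ["a", "b", "z"]]

bucket_ids = [apply_bucket_index(v, [10, 20]) for v in [5, 10, 25]]
```

The point of the split is that the analysis results (mean, min and max, vocabulary, bucket boundaries) are fixed once and then reused unchanged for every value, including values that were not part of the analyzed dataset.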

Other Members

__version__: '1.15.0'