tft_beam.Context

Context manager for tensorflow-transform.

All attributes in this context are kept in thread-local state. The temp dir must be accessible to worker jobs. For example, when running with the Cloud Dataflow runner, the temp dir should be on GCS and should have permissions that allow both the launcher and the workers to access it.
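To make "thread-local state" concrete, here is a minimal pure-Python sketch of a context manager that keeps its options on a per-thread stack. It is not the tensorflow-transform implementation: `SketchContext` and `read_batch_size_in_other_thread` are hypothetical names, and the rule that the innermost explicitly set value wins is an assumption of this sketch.

```python
import threading


class SketchContext:
    """Hypothetical stand-in: options live on a per-thread stack."""

    _local = threading.local()  # each thread sees its own attributes

    def __init__(self, temp_dir=None, desired_batch_size=None):
        self._options = {
            "temp_dir": temp_dir,
            "desired_batch_size": desired_batch_size,
        }

    @classmethod
    def _stack(cls):
        # Lazily create the stack the first time a thread touches it.
        if not hasattr(cls._local, "stack"):
            cls._local.stack = []
        return cls._local.stack

    def __enter__(self):
        self._stack().append(self._options)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._stack().pop()
        return False  # do not swallow exceptions

    @classmethod
    def get_desired_batch_size(cls):
        # Assumption of this sketch: the innermost context that set a
        # value wins; None is returned if no active context set one.
        for options in reversed(cls._stack()):
            if options["desired_batch_size"] is not None:
                return options["desired_batch_size"]
        return None


def read_batch_size_in_other_thread():
    """Read the option from a fresh thread, which has its own empty stack."""
    result = []
    worker = threading.Thread(
        target=lambda: result.append(SketchContext.get_desired_batch_size()))
    worker.start()
    worker.join()
    return result[0]


with SketchContext(desired_batch_size=100):
    inside = SketchContext.get_desired_batch_size()
    other_thread = read_batch_size_in_other_thread()
after = SketchContext.get_desired_batch_size()
print(inside, other_thread, after)  # 100 None None
```

The value is visible only inside the `with` block and only on the thread that entered it, which is why options set in a launcher thread are not automatically seen by other threads.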

temp_dir (Optional) The temporary directory used within this block.
desired_batch_size (Optional) A batch size to batch elements by. If not provided, a batch size will be computed automatically.
passthrough_keys (Optional) A set of strings that are keys to instances that should pass through the pipeline and be hidden from the preprocessing_fn. Use this only when additional information must be attached to instances in the pipeline but should not be part of the transformation graph. Instance keys are one such example.
use_deep_copy_optimization (Optional) If True, makes deep copies of PCollections that are used in multiple TFT phases.
force_tf_compat_v1 (Optional) If True, TFT's public APIs (e.g. AnalyzeDataset) will use TensorFlow in compat.v1 mode irrespective of the installed version of TensorFlow. Defaults to False.
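The idea behind passthrough_keys can be illustrated with a small pure-Python sketch: values under the pass-through keys are set aside before the transformation runs and are reattached to the output afterwards. This only illustrates the concept and is not how tensorflow-transform implements it. `split_passthrough`, `toy_preprocessing_fn`, and `transform_instance` are hypothetical helpers invented for this example.

```python
def split_passthrough(instance, passthrough_keys):
    """Separate pass-through values from the features to be transformed."""
    passthrough = {k: v for k, v in instance.items() if k in passthrough_keys}
    features = {k: v for k, v in instance.items() if k not in passthrough_keys}
    return features, passthrough


def toy_preprocessing_fn(features):
    # Stand-in transformation (hypothetical): double the "x" feature.
    # It never sees the pass-through keys.
    return {"x_scaled": features["x"] * 2.0}


def transform_instance(instance, passthrough_keys):
    """Transform the features, then reattach the pass-through values."""
    features, passthrough = split_passthrough(instance, passthrough_keys)
    transformed = toy_preprocessing_fn(features)
    transformed.update(passthrough)
    return transformed


result = transform_instance({"id": "row-7", "x": 1.5}, {"id"})
print(result)  # {'x_scaled': 3.0, 'id': 'row-7'}
```

The "id" value reaches the output unchanged, so each transformed instance can still be matched to its source record even though the key never entered the transformation.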

Methods

create_base_temp_dir

Generate a temporary location.

get_desired_batch_size

Retrieves the user-set fixed batch size, or None if not set.

get_passthrough_keys

Retrieves the user-set passthrough_keys, or None if not set.

get_use_deep_copy_optimization

Retrieves the user-set use_deep_copy_optimization, or None if not set.

get_use_tf_compat_v1

Computes use_tf_compat_v1 from the TF environment and force_tf_compat_v1.

__enter__

__exit__
