tft_beam.Context

Class Context

Context manager for tensorflow-transform.

All attributes of this context are kept in thread-local state.
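To illustrate what keeping attributes in thread-local state means, here is a minimal sketch in plain Python. `SketchContext` and `read_from_other_thread` are hypothetical names invented for this illustration; this is a simplified stand-in, not the tensorflow-transform implementation. It shows the general pattern: settings pushed on entering the block are visible only to the thread that entered it, and they disappear on exit.

```python
import threading


class SketchContext:
    """Simplified stand-in for a thread-local context manager (not tft_beam code)."""

    # threading.local() gives every thread its own independent attributes.
    _thread_local = threading.local()

    def __init__(self, desired_batch_size=None):
        self._desired_batch_size = desired_batch_size

    @classmethod
    def _frames(cls):
        # Lazily create the per-thread stack of active settings.
        if not hasattr(cls._thread_local, "frames"):
            cls._thread_local.frames = []
        return cls._thread_local.frames

    def __enter__(self):
        # Push this context's setting for the current thread only.
        self._frames().append(self._desired_batch_size)
        return self

    def __exit__(self, *exn_info):
        # Pop the setting again, even if the block raised.
        self._frames().pop()

    @classmethod
    def get_desired_batch_size(cls):
        frames = cls._frames()
        return frames[-1] if frames else None


def read_from_other_thread():
    """Reads the setting from a new thread, which starts with empty state."""
    seen = []
    worker = threading.Thread(
        target=lambda: seen.append(SketchContext.get_desired_batch_size()))
    worker.start()
    worker.join()
    return seen[0]


with SketchContext(desired_batch_size=64):
    inside = SketchContext.get_desired_batch_size()  # set by this block
    other_thread = read_from_other_thread()          # separate thread state
outside = SketchContext.get_desired_batch_size()     # block has exited

print(inside, other_thread, outside)  # prints: 64 None None
```

The practical consequence of this pattern is that code running in a different thread does not see values set by a block entered elsewhere, so each thread that needs the settings has to enter its own context.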

Args:

  • temp_dir: (Optional) The temporary directory used within this block.
  • desired_batch_size: (Optional) A batch size to batch elements by. If not provided, a batch size will be computed automatically.
  • passthrough_keys: (Optional) A set of strings that are keys to instances that should pass through the pipeline and be hidden from the preprocessing_fn. Use this only when additional information must be attached to instances in the pipeline but should not be part of the transformation graph. Instance keys are one such example.

Note that the temp dir should be accessible to worker jobs. For example, when running with the Cloud Dataflow runner, the temp dir should be on GCS and should have permissions that allow both the launcher and the workers to access it.
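The following is a hedged usage sketch showing how the arguments above might be passed and then read back through the class-method getters documented on this page. The helper `read_context_settings` and the key name `"example_id"` are hypothetical, chosen only for illustration. The import is guarded so the snippet still runs, and simply returns None, where tensorflow-transform is not installed.

```python
import tempfile

try:
    import tensorflow_transform.beam as tft_beam
except ImportError:
    tft_beam = None  # tensorflow-transform is not installed


def read_context_settings(temp_dir):
    """Enters a Context and returns the settings visible inside the block."""
    if tft_beam is None:
        return None
    with tft_beam.Context(
        temp_dir=temp_dir,
        desired_batch_size=100,
        passthrough_keys={"example_id"},  # hypothetical key name
    ):
        return {
            "desired_batch_size": tft_beam.Context.get_desired_batch_size(),
            "passthrough_keys": set(tft_beam.Context.get_passthrough_keys()),
            "base_temp_dir": tft_beam.Context.create_base_temp_dir(),
        }


# A local temporary directory is enough for a local run only.
settings = read_context_settings(tempfile.mkdtemp())
print(settings)
```

A local directory from `tempfile.mkdtemp()` works only when the pipeline runs on the same machine. As noted above, a distributed runner needs a location that the workers can also reach.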

__init__

__init__(
    temp_dir=None,
    desired_batch_size=None,
    passthrough_keys=None,
    use_deep_copy_optimization=None
)

Methods

__enter__

__enter__()

__exit__

__exit__(*exn_info)

create_base_temp_dir

@classmethod
create_base_temp_dir(cls)

Generates a temporary location.

get_desired_batch_size

@classmethod
get_desired_batch_size(cls)

Retrieves the user-set fixed batch size, or None if not set.

get_passthrough_keys

@classmethod
get_passthrough_keys(cls)

Retrieves the user-set passthrough_keys, or None if not set.

get_use_deep_copy_optimization

@classmethod
get_use_deep_copy_optimization(cls)

Retrieves the user-set use_deep_copy_optimization, or None if not set.