tfx.components.statistics_gen.executor.Executor

Class Executor

Computes statistics over input training data for example validation.

Inherits From: BaseExecutor

The StatisticsGen component generates feature statistics and random samples over training data, which can be used for visualization and validation. StatisticsGen uses Beam and appropriate algorithms to scale to large datasets.
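To make "feature statistics" concrete, here is a minimal pure-Python sketch of the kind of per-feature summary such a component produces. It is not the TFDV or Beam implementation: the function name `feature_statistics`, the chosen summary fields, and the sample values are all assumptions made for illustration.

```python
from statistics import mean


def feature_statistics(examples):
    """Return per-feature summary statistics for a list of dict examples.

    Hypothetical helper for illustration only; the real component
    delegates to tensorflow_data_validation running on Beam.
    """
    feature_names = sorted({name for ex in examples for name in ex})
    stats = {}
    for name in feature_names:
        # Treat absent keys and explicit None as missing values.
        values = [ex[name] for ex in examples if ex.get(name) is not None]
        entry = {
            "num_present": len(values),
            "num_missing": len(examples) - len(values),
        }
        numeric = [v for v in values if isinstance(v, (int, float))]
        if numeric and len(numeric) == len(values):
            # Numeric feature: report range and mean.
            entry.update(min=min(numeric), max=max(numeric), mean=mean(numeric))
        else:
            # Non-numeric feature: report the number of distinct values.
            entry["num_unique"] = len(set(values))
        stats[name] = entry
    return stats


# Illustrative sample data (assumed, not taken from a real dataset).
train_examples = [
    {"trip_miles": 1.0, "payment_type": "Cash"},
    {"trip_miles": 3.0, "payment_type": "Credit Card"},
    {"trip_miles": None, "payment_type": "Cash"},
]
train_stats = feature_statistics(train_examples)
```

Summaries like these are what downstream validation compares between splits, for example to spot a feature that is frequently missing in one split but not in another.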

To include StatisticsGen in a TFX pipeline, configure your pipeline similarly to the example at https://github.com/tensorflow/tfx/blob/master/tfx/examples/chicago_taxi_pipeline/taxi_pipeline_simple.py#L75.

__init__

__init__(context=None)

Constructs a Beam-based executor.

Child Classes

class Context

Methods

Do

Do(
    input_dict,
    output_dict,
    exec_properties
)

Computes statistics for each split of the input using tensorflow_data_validation.

Args:

  • input_dict: Input dict from input key to a list of Artifacts.
    • input_data: A list of 'ExamplesPath' type. This should contain both the 'train' and 'eval' splits.
  • output_dict: Output dict from output key to a list of Artifacts.
    • output: A list of 'ExampleStatisticsPath' type. This should contain both the 'train' and 'eval' splits.
  • exec_properties: A dict of execution properties. Not used yet.

Returns:

None
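The shape of the arguments described above can be sketched as follows. This is an illustration only: `StubArtifact` is a hypothetical stand-in for the real TFX artifact class, the `/tmp/...` paths are invented, and `pair_splits` is an assumed helper showing how each input split lines up with the output artifact of the same name. It does not call the executor.

```python
from collections import namedtuple

# Hypothetical stand-in for a TFX artifact; the real class carries more metadata.
StubArtifact = namedtuple("StubArtifact", ["type_name", "split", "uri"])

# One 'ExamplesPath' artifact per split, keyed by 'input_data'.
input_dict = {
    "input_data": [
        StubArtifact("ExamplesPath", "train", "/tmp/examples/train/"),
        StubArtifact("ExamplesPath", "eval", "/tmp/examples/eval/"),
    ]
}

# One 'ExampleStatisticsPath' artifact per split, keyed by 'output'.
output_dict = {
    "output": [
        StubArtifact("ExampleStatisticsPath", "train", "/tmp/stats/train/"),
        StubArtifact("ExampleStatisticsPath", "eval", "/tmp/stats/eval/"),
    ]
}

# Documented as not used yet, so an empty dict is passed here.
exec_properties = {}


def pair_splits(input_dict, output_dict):
    """Map each split name to its (input uri, output uri) pair."""
    outputs = {a.split: a.uri for a in output_dict["output"]}
    pairs = {}
    for artifact in input_dict["input_data"]:
        if artifact.split not in outputs:
            raise ValueError("no output artifact for split %r" % artifact.split)
        pairs[artifact.split] = (artifact.uri, outputs[artifact.split])
    return pairs


split_pairs = pair_splits(input_dict, output_dict)
```

With real artifacts, the same three arguments would be passed to `Do(input_dict, output_dict, exec_properties)`, which writes statistics for each split and returns None.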