tfx.v1.components.ImportExampleGen

Official TFX ImportExampleGen component.

The ImportExampleGen component takes TFRecord files in TF Example data format and generates train and eval examples for downstream components. It provides consistent, configurable partitioning and also shuffles the dataset, following ML best practice.
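To illustrate what consistent partitioning means, here is a minimal sketch of hash-bucket splitting in plain Python. It is not TFX's implementation: the function name assign_split, the choice of SHA-256, and the 2:1 bucket weights are assumptions made for illustration. The idea shown is that the split depends only on the record's bytes, so a given record lands in the same split on every run.

```python
import hashlib
from bisect import bisect_right
from itertools import accumulate
from typing import Mapping


def assign_split(record: bytes, splits: Mapping[str, int]) -> str:
    """Map a serialized record to a split name using hash buckets.

    `splits` maps split names to bucket counts, e.g. {"train": 2, "eval": 1}.
    Hypothetical helper for illustration only, not part of TFX.
    """
    names = list(splits)
    # Cumulative bucket boundaries, e.g. [2, 3] for a 2:1 split.
    boundaries = list(accumulate(splits.values()))
    # A stable hash of the record bytes selects one of the buckets.
    bucket = int(hashlib.sha256(record).hexdigest(), 16) % boundaries[-1]
    # Find the first split whose upper boundary is above the bucket index.
    return names[bisect_right(boundaries, bucket)]
```

Calling assign_split(b"some serialized example", {"train": 2, "eval": 1}) returns the same split name for the same bytes every time, and over many distinct records roughly two thirds are assigned to "train".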

Component outputs contain:

examples Channel of type standard_artifacts.Examples for the output train and eval examples.

Args

input_base An external directory containing the TFRecord files.
input_config An example_gen_pb2.Input instance providing the input configuration. If unset, the files under input_base are treated as a single split. If any field is provided as a RuntimeParameter, input_config should be constructed as a dict with the same field names as the Input proto message.
output_config An example_gen_pb2.Output instance providing the output configuration. If unset, the default splits are 'train' and 'eval' with a size ratio of 2:1. If any field is provided as a RuntimeParameter, output_config should be constructed as a dict with the same field names as the Output proto message.
range_config An optional range_config_pb2.RangeConfig instance specifying the range of span values to consider. If unset, the driver defaults to searching for the latest span with no restrictions.
payload_format Payload format of the input data. Should be one of the example_gen_pb2.PayloadFormat enum values. Note that the payload format of the output data is the same as that of the input.
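As a sketch of how these arguments fit together, the helper below builds an ImportExampleGen with an explicit input and output configuration. The helper name build_import_example_gen, the split name "all", the "*.tfrecord" file pattern, and the 4:1 bucket ratio are hypothetical choices for illustration. TFX is imported inside the function so that the snippet can be loaded without TFX installed; calling it does require TFX.

```python
def build_import_example_gen(input_base: str):
    """Build an ImportExampleGen with explicit split settings (illustrative)."""
    # Imported lazily: only needed when the component is actually built.
    from tfx import v1 as tfx

    # Treat every matching file under input_base as one input split.
    # The split name and the file pattern are assumptions.
    input_config = tfx.proto.Input(splits=[
        tfx.proto.Input.Split(name="all", pattern="*.tfrecord"),
    ])

    # Override the default 2:1 train/eval ratio with 4:1 (assumed values).
    output_config = tfx.proto.Output(
        split_config=tfx.proto.SplitConfig(splits=[
            tfx.proto.SplitConfig.Split(name="train", hash_buckets=4),
            tfx.proto.SplitConfig.Split(name="eval", hash_buckets=1),
        ]))

    return tfx.components.ImportExampleGen(
        input_base=input_base,
        input_config=input_config,
        output_config=output_config,
        payload_format=tfx.proto.PayloadFormat.FORMAT_TF_EXAMPLE,
    )
```

Leaving out input_config and output_config falls back to the defaults described above: a single input split, divided into 'train' and 'eval' at 2:1.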

Attributes

outputs The component's output channel dict.

Methods

with_beam_pipeline_args

Add per-component Beam pipeline args.

Args
beam_pipeline_args List of Beam pipeline args to be added to the Beam executor spec.

Returns
The same component itself.
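Because the method returns the component itself, it can be chained directly onto the constructor call. The sketch below wraps the call in a hypothetical helper, add_direct_runner_args; the specific Beam flags are an example choice for local runs, not a recommendation from this page.

```python
def add_direct_runner_args(example_gen):
    """Attach Beam args for a local run to a component (illustrative)."""
    # with_beam_pipeline_args returns the same component, so the result
    # can be returned or chained further.
    return example_gen.with_beam_pipeline_args([
        "--runner=DirectRunner",     # run Beam locally
        "--direct_num_workers=1",    # single local worker (assumed value)
    ])
```

For instance, add_direct_runner_args(tfx.components.ImportExampleGen(input_base=...)) would return the same ImportExampleGen instance with the extra args recorded on its Beam executor spec.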