tfx.components.BigQueryExampleGen

Official TFX BigQueryExampleGen component.

The BigQuery ExampleGen component takes a query and generates train and eval examples for downstream components.

Args
query BigQuery SQL string. The query result is treated as a single split. Can be overridden by input_config.
input_config An example_gen_pb2.Input instance with Split.pattern as a BigQuery SQL string. If set, it overrides the 'query' arg and allows a different query per split. If any field is provided as a RuntimeParameter, input_config should be constructed as a dict with the same field names as the Input proto message.
output_config An example_gen_pb2.Output instance providing the output configuration. If unset, the default splits are 'train' and 'eval' with size 2:1. If any field is provided as a RuntimeParameter, output_config should be constructed as a dict with the same field names as the Output proto message.
example_artifacts Optional channel of 'ExamplesPath' for output train and eval examples.
instance_name Optional unique instance name. Necessary if multiple BigQueryExampleGen components are declared in the same pipeline.
enable_cache Optional boolean indicating whether caching is enabled for the BigQueryExampleGen component. If not specified, defaults to the value of the pipeline's enable_cache parameter.
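
The default 'train' and 'eval' splits with size 2:1 can be pictured as hash-bucket partitioning: three buckets in total, two assigned to train and one to eval. The following is a minimal sketch of that idea in plain Python, not the component's implementation. The helper assign_split, the use of MD5 as the hash, and the dict layout (modeled on the dict form of output_config mentioned above) are assumptions made for illustration.

```python
import hashlib

# Dict form of a 2:1 train/eval output configuration. The field names
# (split_config, splits, name, hash_buckets) are assumed to mirror the
# Output proto message.
DEFAULT_OUTPUT_CONFIG = {
    "split_config": {
        "splits": [
            {"name": "train", "hash_buckets": 2},
            {"name": "eval", "hash_buckets": 1},
        ]
    }
}


def assign_split(record: bytes, output_config: dict = DEFAULT_OUTPUT_CONFIG) -> str:
    """Hypothetical helper: map a serialized record to a split name."""
    splits = output_config["split_config"]["splits"]
    total_buckets = sum(s["hash_buckets"] for s in splits)
    if total_buckets <= 0:
        raise ValueError("output_config must define at least one hash bucket")
    # MD5 is a stand-in for whatever stable hash the real component uses.
    bucket = int.from_bytes(hashlib.md5(record).digest(), "big") % total_buckets
    upper = 0
    for split in splits:
        upper += split["hash_buckets"]
        if bucket < upper:
            return split["name"]
    raise AssertionError("unreachable: bucket is always below total_buckets")


# Made-up records standing in for rows returned by a query.
records = [f"row-{i}".encode() for i in range(3000)]
counts = {"train": 0, "eval": 0}
for r in records:
    counts[assign_split(r)] += 1
print(counts)  # roughly two thirds of the rows should land in 'train'
```

Because the split depends only on the record's hash, the same record lands in the same split on every run of this sketch.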

Raises
RuntimeError Only one of query and input_config should be set.
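
The relationship between query and input_config can be sketched as follows. This is an illustration in plain Python, not the component's source. The function resolve_input_config, the split name 'single_split', the BigQuery table names, and the choice to raise when neither argument is given are all assumptions; the dict layout follows the dict form of input_config described above.

```python
from typing import Optional


def resolve_input_config(query: Optional[str] = None,
                         input_config: Optional[dict] = None) -> dict:
    """Hypothetical helper: return an Input-style dict from either argument."""
    # Exactly one of the two arguments may be supplied (an assumed reading
    # of the RuntimeError described above).
    if bool(query) == bool(input_config):
        raise RuntimeError("Only one of query and input_config should be set.")
    if input_config:
        return input_config
    # A bare query becomes one split whose pattern is the SQL string.
    return {"splits": [{"name": "single_split", "pattern": query}]}


# One query, treated as a single split (table name is made up).
single = resolve_input_config(
    query="SELECT * FROM `my_project.my_dataset.my_table`")

# A different query per split (table names are made up).
per_split = resolve_input_config(input_config={
    "splits": [
        {"name": "train",
         "pattern": "SELECT * FROM `my_project.my_dataset.train_table`"},
        {"name": "eval",
         "pattern": "SELECT * FROM `my_project.my_dataset.eval_table`"},
    ]
})
```

Passing both arguments to this sketch raises RuntimeError, which matches the documented constraint.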

Attributes

component_id DEPRECATED FUNCTION

component_type DEPRECATED FUNCTION

downstream_nodes

enable_cache

exec_properties

id Node id, unique across all TFX nodes in a pipeline.

If an instance name is available, node_id will be <node_class_name>.<instance_name>; otherwise, node_id will be <node_class_name>.

inputs

outputs

type

upstream_nodes
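
The naming rule for the id attribute can be illustrated with a small sketch. It assumes the id is the node class name, followed by a dot and the instance name when one is set. make_node_id is a hypothetical helper, not part of TFX, and the instance names are made up.

```python
from typing import Optional


def make_node_id(class_name: str, instance_name: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the id rule described above."""
    if instance_name:
        return f"{class_name}.{instance_name}"
    return class_name


# Two BigQueryExampleGen components in one pipeline need distinct instance
# names so that their ids do not collide.
ids = [
    make_node_id("BigQueryExampleGen", "train_source"),
    make_node_id("BigQueryExampleGen", "eval_source"),
]
assert len(set(ids)) == len(ids)
```

Without instance names, both components in this sketch would share the id of the bare class name, which is why instance_name is required when the component is declared more than once.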

Child Classes

class DRIVER_CLASS

class SPEC_CLASS

Methods

add_downstream_node

add_upstream_node

from_json_dict

Convert from dictionary data to an object.

get_id

Gets the id of a node.

This can be used during pipeline authoring time. For example:

    from tfx.components import Trainer

    resolver = ResolverNode(
        ...,
        model=Channel(
            type=Model,
            producer_component_id=Trainer.get_id('my_trainer')))

Args
instance_name (Optional) instance name of a node. If given, the instance name will be taken into consideration when generating the id.

Returns
an id for the node.

to_json_dict

Convert from an object to a JSON serializable dictionary.

Class Variables

  • EXECUTOR_SPEC