tfx.extensions.google_cloud_big_query.experimental.elwc_example_gen.component.component.BigQueryToElwcExampleGen

Official TFX BigQueryToElwcExampleGen component.

Inherits From: QueryBasedExampleGen, BaseComponent, BaseNode

The BigQueryToElwcExampleGen component takes a query and generates train and eval ExampleListWithContext (ELWC) examples for downstream components.

Args
query BigQuery SQL string. The query result is treated as a single split. Can be overridden by input_config.
elwc_config The ELWC config, which contains a list of context feature fields. These fields are used to build the context feature. Examples with the same context feature are converted into a single ELWC (ExampleListWithContext) instance. For example, two examples with the same context field values are integrated into one ELWC instance.
input_config An example_gen_pb2.Input instance with Split.pattern as a BigQuery SQL string. If set, it overrides the 'query' arg and allows different queries per split. If any field is provided as a RuntimeParameter, input_config should be constructed as a dict with the same field names as the Input proto message.
output_config An example_gen_pb2.Output instance, providing output configuration. If unset, the default splits are 'train' and 'eval' with a size ratio of 2:1. If any field is provided as a RuntimeParameter, output_config should be constructed as a dict with the same field names as the Output proto message.
example_artifacts Optional channel of 'ExamplesPath' for output train and eval examples.
instance_name Optional unique instance name. Necessary if multiple BigQueryExampleGen components are declared in the same pipeline.
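
The grouping described for elwc_config can be pictured with a small pure-Python sketch. It does not use TFX or BigQuery: the function name group_rows_into_elwc, the row dicts, and the query_id field are all hypothetical, and the real component emits ExampleListWithContext protos rather than dicts.

```python
def group_rows_into_elwc(rows, context_feature_fields):
    """Group row dicts that share the same context feature values.

    Illustration only: each returned dict stands in for one ELWC instance,
    with the shared context features and the list of per-example features.
    """
    groups = {}
    for row in rows:
        # The tuple of context values identifies which group the row joins.
        key = tuple(row[field] for field in context_feature_fields)
        context = {field: row[field] for field in context_feature_fields}
        example = {
            name: value
            for name, value in row.items()
            if name not in context_feature_fields
        }
        if key not in groups:
            groups[key] = {"context": context, "examples": []}
        groups[key]["examples"].append(example)
    # Dicts keep insertion order, so groups appear in first-seen order.
    return list(groups.values())


# Hypothetical query result: three rows, two of which share query_id 1.
rows = [
    {"query_id": 1, "doc": "a", "label": 0},
    {"query_id": 1, "doc": "b", "label": 1},
    {"query_id": 2, "doc": "c", "label": 0},
]
elwc_like = group_rows_into_elwc(rows, ["query_id"])
```

Here the two rows with query_id 1 end up in one group and the remaining row forms a group of its own, which mirrors the "two examples with the same context field" case above.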

Raises
RuntimeError Only one of query and input_config should be set, and elwc_config is required.
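
The constraint above can be sketched as a stand-alone check. This is not the component's actual code: check_example_gen_args and is_valid are hypothetical helpers, the error messages are invented, and treating "neither query nor input_config set" as an error is an assumption, since the page only says that both should not be set together.

```python
def check_example_gen_args(query=None, input_config=None, elwc_config=None):
    """Hypothetical validation mirroring the documented RuntimeError rule."""
    # Assumption: exactly one of the two query sources must be supplied.
    if bool(query) == bool(input_config):
        raise RuntimeError("Exactly one of query and input_config should be set.")
    # elwc_config is documented as required.
    if not elwc_config:
        raise RuntimeError("elwc_config is required.")


def is_valid(**kwargs):
    """Return True if the arguments pass the check, False otherwise."""
    try:
        check_example_gen_args(**kwargs)
        return True
    except RuntimeError:
        return False


# Placeholder values; a real call would pass proto or dict configs.
ok = is_valid(query="SELECT 1", elwc_config={"context_feature_fields": ["query_id"]})
both_set = is_valid(query="SELECT 1", input_config={"splits": []} or {"x": 1},
                    elwc_config={"context_feature_fields": ["query_id"]})
missing_elwc = is_valid(query="SELECT 1")
```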

Attributes

component_id

component_type

downstream_nodes

exec_properties

id Node id, unique across all TFX nodes in a pipeline.

If id is set by the user, it is returned directly. Otherwise, if an instance name (deprecated) is available, the node id is derived from the instance name; otherwise, a default node id is generated.

inputs

outputs

type

upstream_nodes

Child Classes

class DRIVER_CLASS

class SPEC_CLASS

Methods

add_downstream_node

Experimental: Add another component that must run after this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_upstream_node.

Args
downstream_node a component that must run after this node.

add_upstream_node

Experimental: Add another component that must run before this one.

This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.

Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.

It is symmetric with add_downstream_node.

Args
upstream_node a component that must run before this node.
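
The symmetry between add_downstream_node and add_upstream_node can be illustrated with a toy node class. ToyNode is hypothetical and is not how TFX implements BaseNode. It only shows that either call records the same ordering edge on both nodes.

```python
class ToyNode:
    """Minimal stand-in for a pipeline node (illustration only)."""

    def __init__(self, node_id):
        self.id = node_id
        self.upstream_nodes = set()
        self.downstream_nodes = set()

    def add_upstream_node(self, upstream_node):
        # Record the edge on both ends so the two views stay consistent.
        self.upstream_nodes.add(upstream_node)
        upstream_node.downstream_nodes.add(self)

    def add_downstream_node(self, downstream_node):
        # Mirror image of add_upstream_node.
        self.downstream_nodes.add(downstream_node)
        downstream_node.upstream_nodes.add(self)


# Hypothetical nodes: 'notify' must run after 'example_gen'.
example_gen = ToyNode("example_gen")
notify = ToyNode("notify")
example_gen.add_downstream_node(notify)

# The same edge expressed from the other side, on a second pair of nodes.
first = ToyNode("first")
second = ToyNode("second")
second.add_upstream_node(first)
```

In both pairs the earlier node lists the later one in downstream_nodes, and the later node lists the earlier one in upstream_nodes.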

from_json_dict

Convert from dictionary data to an object.

get_class_type

get_id

Gets the id of a node.

This can be used during pipeline authoring time. For example:

    from tfx.components import Trainer

    resolver = ResolverNode(
        ...,
        model=Channel(
            type=Model,
            producer_component_id=Trainer.get_id('my_trainer')))

Args
instance_name (Optional) instance name of a node. If given, the instance name will be taken into consideration when generating the id.

Returns
an id for the node.

to_json_dict

Convert from an object to a JSON serializable dictionary.

with_id

with_platform_config

Attaches a proto-form platform config to a component.

The config will be a per-node platform-specific config.

Args
config platform config to attach to the component.

Returns
the same component itself.

EXECUTOR_SPEC Instance of /home/kbuilder/.local/lib/python3.7/site-packages/tfx/components/base/executor_spec.py:33._NewDeprecatedClass