tfx.components.CsvExampleGen

Official TFX CsvExampleGen component.

Inherits From: FileBasedExampleGen

tfx.components.CsvExampleGen(
    input=None, input_config=None, output_config=None, example_artifacts=None,
    input_base=None, instance_name=None
)

The CsvExampleGen component takes CSV data and generates train and eval examples for downstream components.

Args:

  • input: A Channel of type standard_artifacts.ExternalArtifact, which includes one artifact whose uri is an external directory containing csv files (required).
  • input_config: An example_gen_pb2.Input instance, providing input configuration. If unset, the files under input_base will be treated as a single split. If any field is provided as a RuntimeParameter, input_config should be constructed as a dict with the same field names as the Input proto message.
  • output_config: An example_gen_pb2.Output instance, providing output configuration. If unset, the default splits will be 'train' and 'eval' with a size ratio of 2:1. If any field is provided as a RuntimeParameter, output_config should be constructed as a dict with the same field names as the Output proto message.
  • example_artifacts: Optional channel of 'ExamplesPath' for output train and eval examples.
  • input_base: Backwards compatibility alias for the 'input' argument.
  • instance_name: Optional unique instance name. Necessary if multiple CsvExampleGen components are declared in the same pipeline.
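
To make the default 2:1 split concrete, here is a minimal sketch in plain Python. It is not TFX's own implementation. It writes the default output configuration as a dict using the Output proto field names (split_config, splits, name, hash_buckets), then assigns records to splits by hash bucket. The helper assign_split and the choice of MD5 as the hash are assumptions made purely for illustration.

```python
import hashlib

# Dict form of the default output configuration, using the field names of
# the example_gen_pb2.Output proto message: 'train' gets 2 buckets, 'eval' 1.
DEFAULT_OUTPUT_CONFIG = {
    "split_config": {
        "splits": [
            {"name": "train", "hash_buckets": 2},
            {"name": "eval", "hash_buckets": 1},
        ]
    }
}


def assign_split(record: str, output_config: dict) -> str:
    """Hypothetical helper: map a record to a split name by hash bucket."""
    splits = output_config["split_config"]["splits"]
    total_buckets = sum(s["hash_buckets"] for s in splits)
    # MD5 is an illustrative, deterministic stand-in for the real hash.
    digest = hashlib.md5(record.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % total_buckets
    upper = 0
    for split in splits:
        upper += split["hash_buckets"]
        if bucket < upper:
            return split["name"]
    raise AssertionError("unreachable: bucket is always below total_buckets")


if __name__ == "__main__":
    rows = [f"row-{i},value-{i}" for i in range(3000)]
    counts = {"train": 0, "eval": 0}
    for row in rows:
        counts[assign_split(row, DEFAULT_OUTPUT_CONFIG)] += 1
    print(counts)  # roughly two thirds 'train', one third 'eval'
```

Because the assignment depends only on the record's hash, the same record always lands in the same split across runs.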

Attributes:

  • component_id: DEPRECATED FUNCTION

  • component_type: DEPRECATED FUNCTION

  • downstream_nodes

  • exec_properties

  • id: Node id, unique across all TFX nodes in a pipeline.

    If an instance name is available, node_id will be <node_class_name>.<instance_name>; otherwise, node_id will be <node_class_name>.

  • inputs

  • outputs

  • type

  • upstream_nodes

Child Classes

class DRIVER_CLASS

class SPEC_CLASS

Methods

add_downstream_node

add_downstream_node(
    downstream_node
)

add_upstream_node

add_upstream_node(
    upstream_node
)

from_json_dict

@classmethod
from_json_dict(
    cls, dict_data
)

Convert from dictionary data to an object.

get_id

@classmethod
get_id(
    cls, instance_name=None
)

Gets the id of a node.

This can be used during pipeline authoring time. For example:

    from tfx.components import Trainer

    resolver = ResolverNode(
        ...,
        model=Channel(
            type=Model,
            producer_component_id=Trainer.get_id('my_trainer')))

Args:

  • instance_name: (Optional) instance name of a node. If given, the instance name will be taken into consideration when generating the id.

Returns:

an id for the node.
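
As a rough illustration of how a class-level id can be derived before any component object exists, the sketch below assumes the id is the class name, with a dot and the instance name appended when one is given. SketchNode and its Trainer subclass are hypothetical stand-ins, not the TFX classes.

```python
from typing import Optional


class SketchNode:
    """Hypothetical stand-in for a TFX node class; not the real base class."""

    @classmethod
    def get_id(cls, instance_name: Optional[str] = None) -> str:
        # Assumption: the id is the class name, with ".<instance_name>"
        # appended when an instance name is supplied.
        node_class_name = cls.__name__
        if instance_name:
            return f"{node_class_name}.{instance_name}"
        return node_class_name


class Trainer(SketchNode):
    """Hypothetical subclass used only to show the derived id."""


if __name__ == "__main__":
    print(Trainer.get_id())              # Trainer
    print(Trainer.get_id("my_trainer"))  # Trainer.my_trainer
```

Since get_id is a classmethod, the id can be computed from the class alone, which is what makes it usable while a pipeline is still being authored.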

to_json_dict

to_json_dict()

Convert from an object to a JSON serializable dictionary.
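
The to_json_dict and from_json_dict pair can be pictured as a round trip through a plain dictionary. The sketch below shows that general pattern only. The class SketchNode and the keys it stores are hypothetical and do not reflect the dictionary layout TFX actually produces.

```python
import json


class SketchNode:
    """Hypothetical node used to illustrate the pattern; not TFX code."""

    def __init__(self, instance_name=None):
        self.instance_name = instance_name

    def to_json_dict(self):
        # Only plain JSON types are stored, so json.dumps accepts the result.
        return {
            "class": type(self).__name__,
            "instance_name": self.instance_name,
        }

    @classmethod
    def from_json_dict(cls, dict_data):
        # Rebuild an equivalent object from the dictionary form.
        return cls(instance_name=dict_data["instance_name"])


if __name__ == "__main__":
    original = SketchNode(instance_name="my_example_gen")
    serialized = json.dumps(original.to_json_dict())
    restored = SketchNode.from_json_dict(json.loads(serialized))
    print(restored.instance_name)  # my_example_gen
```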

Class Variables

  • EXECUTOR_SPEC