tfx.components.ExampleValidator

Class ExampleValidator

A TFX component to validate input examples.

Inherits From: BaseComponent

Aliases: tfx.components.example_validator.component.ExampleValidator

The ExampleValidator component uses TensorFlow Data Validation to validate the statistics of some splits of the input examples against a schema.

The ExampleValidator component identifies anomalies in training and serving data. The component can be configured to detect different classes of anomalies in the data. It can:

  • perform validity checks by comparing data statistics against a schema that codifies expectations of the user.
  • detect data drift by looking at a series of data.
  • detect changes in dataset-wide data (i.e., num_examples) across spans or versions.

Schema Based Example Validation

The ExampleValidator component identifies any anomalies in the example data by comparing data statistics computed by the StatisticsGen component against a schema. The schema codifies properties which the input data is expected to satisfy, and is provided and maintained by the user.

Please see https://www.tensorflow.org/tfx/data_validation for more details.
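
To make the idea of schema based validation concrete, the following is a toy sketch in plain Python. It is not how TensorFlow Data Validation is implemented and does not use its API: the dictionary layout, the `find_anomalies` helper, and the feature names are all hypothetical and chosen only to show statistics being compared against user-maintained expectations.

```python
# Toy illustration only; not the TensorFlow Data Validation implementation.
from typing import Dict, List

# Hypothetical schema: expected type and minimum presence fraction per feature.
schema = {
    "age": {"type": "INT", "min_presence": 1.0},
    "country": {"type": "BYTES", "min_presence": 0.9},
}

# Hypothetical summary statistics, standing in for StatisticsGen output.
statistics = {
    "age": {"type": "INT", "presence": 0.8},
    "country": {"type": "BYTES", "presence": 0.95},
    "zip_code": {"type": "BYTES", "presence": 1.0},
}


def find_anomalies(
    statistics: Dict[str, dict], schema: Dict[str, dict]
) -> Dict[str, List[str]]:
    """Maps each offending feature name to a list of anomaly descriptions."""
    anomalies: Dict[str, List[str]] = {}
    for name, stats in statistics.items():
        if name not in schema:
            anomalies.setdefault(name, []).append("feature not in schema")
            continue
        expected = schema[name]
        if stats["type"] != expected["type"]:
            anomalies.setdefault(name, []).append("unexpected type")
        if stats["presence"] < expected["min_presence"]:
            anomalies.setdefault(name, []).append(
                "feature missing in too many examples"
            )
    # Features the schema expects but the data never contained.
    for name in schema:
        if name not in statistics:
            anomalies.setdefault(name, []).append("feature absent from data")
    return anomalies


anomalies = find_anomalies(statistics, schema)
for feature, problems in anomalies.items():
    print(feature, problems)
```

In this sketch, `age` is flagged because it is present in fewer examples than the schema requires, and `zip_code` is flagged because the schema does not mention it. The real component reports a richer set of anomaly types.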

Example

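from tfx.components import ExampleValidator

# Performs anomaly detection based on statistics and data schema.
# statistics_gen and infer_schema are assumed to be upstream component
# instances defined earlier in the pipeline.
validate_stats = ExampleValidator(
    statistics=statistics_gen.outputs['statistics'],
    schema=infer_schema.outputs['schema'])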

__init__

__init__(
    statistics=None,
    schema=None,
    output=None,
    stats=None,
    instance_name=None
)

Construct an ExampleValidator component.

Args:

  • statistics: A Channel of 'ExampleStatisticsPath' type. This should contain at least an 'eval' split; other splits are currently ignored.
  • schema: A Channel of 'SchemaPath' type. Required.
  • output: Output channel of 'ExampleValidationPath' type.
  • stats: Backwards compatibility alias for the 'statistics' argument.
  • instance_name: Optional name assigned to this specific instance of ExampleValidator. Required only if multiple ExampleValidator components are declared in the same pipeline.

Either stats or statistics must be present in the arguments.
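
The "either stats or statistics" rule can be pictured with a small sketch. The `resolve_statistics` helper below is hypothetical and is not taken from the TFX source. It also assumes that `statistics` takes precedence when both arguments are supplied, which the documentation above does not state.

```python
def resolve_statistics(statistics=None, stats=None):
    """Hypothetical helper: returns whichever of the two arguments was given.

    Assumption: `statistics` wins if both are supplied.
    """
    resolved = statistics if statistics is not None else stats
    if resolved is None:
        raise ValueError("Either 'statistics' or 'stats' must be provided.")
    return resolved


# The legacy alias alone is accepted.
print(resolve_statistics(stats="legacy-channel"))

# Supplying neither argument is an error.
try:
    resolve_statistics()
except ValueError as err:
    print(err)
```

New code should prefer the `statistics` argument, since `stats` is described above as a backwards compatibility alias.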

Child Classes

class DRIVER_CLASS

class SPEC_CLASS

Properties

component_id

DEPRECATED FUNCTION

component_type

DEPRECATED FUNCTION

downstream_nodes

exec_properties

id

Node id, unique across all TFX nodes in a pipeline.

If instance name is available, node_id will be: . otherwise, node_id will be:

Returns:

node id.

inputs

outputs

type

upstream_nodes

Methods

add_downstream_node

add_downstream_node(downstream_node)

add_upstream_node

add_upstream_node(upstream_node)

from_json_dict

from_json_dict(
    cls,
    dict_data
)

Convert from dictionary data to an object.

to_json_dict

to_json_dict()

Convert from an object to a JSON serializable dictionary.
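
The pairing of to_json_dict and from_json_dict suggests a serialization round trip. The sketch below shows that general pattern with a hypothetical `ToyNode` class. It is not the TFX implementation, and the fields it stores are invented for illustration. The `cls` parameter in the from_json_dict signature above suggests a class method, which is how the sketch models it.

```python
import json


class ToyNode:
    """Hypothetical stand-in for an object with JSON dict conversion."""

    def __init__(self, name, params):
        self.name = name
        self.params = params

    def to_json_dict(self):
        # Return only JSON serializable values (str, dict, list, numbers).
        return {"name": self.name, "params": dict(self.params)}

    @classmethod
    def from_json_dict(cls, dict_data):
        # Rebuild an instance from the dictionary produced by to_json_dict.
        return cls(name=dict_data["name"], params=dict_data["params"])


original = ToyNode("validator", {"split": "eval"})
encoded = json.dumps(original.to_json_dict())
restored = ToyNode.from_json_dict(json.loads(encoded))
print(restored.name, restored.params)
```

Passing the dictionary through `json.dumps` and `json.loads` in the sketch checks that the dictionary really is JSON serializable, which is the property to_json_dict promises.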

Class Members

  • EXECUTOR_SPEC