Official TFX StatisticsGen component.
Inherits From: BaseBeamComponent, BaseComponent, BaseNode
tfx.v1.components.StatisticsGen(
    examples: tfx.v1.types.BaseChannel,
    schema: Optional[tfx.v1.types.BaseChannel] = None,
    stats_options: Optional[tfdv.StatsOptions] = None,
    exclude_splits: Optional[List[str]] = None
)
The StatisticsGen component generates feature statistics and random samples over training data, which can be used for visualization and validation. StatisticsGen uses Apache Beam and approximate algorithms to scale to large datasets.
Example
# Computes statistics over data for visualization and example validation.
statistics_gen = StatisticsGen(examples=example_gen.outputs['examples'])
The component's outputs contain:
statistics: Channel of type standard_artifacts.ExampleStatistics for statistics of each split provided in the input examples.
Please see the StatisticsGen guide for more details.
Args | |
---|---|
examples | A BaseChannel of ExamplesPath type, likely generated by the ExampleGen component. This needs to contain two splits labeled train and eval. Required. |
schema | A Schema channel to use for automatically configuring the value of stats options passed to TFDV. |
stats_options | The StatsOptions instance to configure optional TFDV behavior. When stats_options.schema is set, it will be used instead of the schema channel input. Because stats_options must be serialized, slicer functions and custom stats generators are not usable, and an error will be raised if either is specified. |
exclude_splits | Names of splits for which statistics and samples should not be generated. The default behavior (when exclude_splits is set to None) is to exclude no splits. |
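The two rules described in the table (the exclude_splits default, and stats_options.schema taking precedence over the schema channel) can be illustrated with a small pure-Python sketch. Both helper names below are hypothetical and exist only for illustration; this is not the TFX implementation, just a model of the documented behavior.

```python
from typing import Any, List, Optional


def splits_to_process(all_splits: List[str],
                      exclude_splits: Optional[List[str]] = None) -> List[str]:
    """Hypothetical helper: returns the splits that would receive statistics."""
    # None is treated like an empty exclusion list, so no split is skipped.
    excluded = set(exclude_splits or [])
    return [split for split in all_splits if split not in excluded]


def resolve_schema(stats_options_schema: Optional[Any],
                   channel_schema: Optional[Any]) -> Optional[Any]:
    """Hypothetical helper: a schema set on stats_options wins over the channel."""
    if stats_options_schema is not None:
        return stats_options_schema
    return channel_schema


default_splits = splits_to_process(["train", "eval"])
without_eval = splits_to_process(["train", "eval"], exclude_splits=["eval"])
```

Under these assumptions, leaving exclude_splits unset keeps both train and eval, while passing exclude_splits=["eval"] leaves only train.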
Attributes | |
---|---|
outputs | Component's output channel dict. |
Methods
with_beam_pipeline_args
with_beam_pipeline_args(
beam_pipeline_args: Iterable[Union[str, placeholder.Placeholder]]
) -> 'BaseBeamComponent'
Add per-component Beam pipeline args.
Args | |
---|---|
beam_pipeline_args | List of Beam pipeline args to be added to the Beam executor spec. |
Returns | |
---|---|
The same component itself. |
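As a minimal sketch of what a beam_pipeline_args list might look like, the snippet below builds flag strings for Beam's DirectRunner. The helper build_direct_runner_args is hypothetical and not part of TFX; the commented lines show how the list could be passed to the component, assuming TFX is installed and an example_gen component already exists.

```python
from typing import List


def build_direct_runner_args(num_workers: int = 1,
                             running_mode: str = "in_memory") -> List[str]:
    """Hypothetical helper that formats Beam DirectRunner flags as strings."""
    return [
        "--runner=DirectRunner",
        f"--direct_num_workers={num_workers}",
        f"--direct_running_mode={running_mode}",
    ]


beam_args = build_direct_runner_args(num_workers=2,
                                     running_mode="multi_processing")

# With TFX installed, the list could be attached to the component like this
# (example_gen is assumed to be defined elsewhere in the pipeline):
# statistics_gen = StatisticsGen(
#     examples=example_gen.outputs['examples']
# ).with_beam_pipeline_args(beam_args)
```

Because the method returns the same component, the call can be chained directly onto the constructor, as in the commented usage.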
with_node_execution_options
with_node_execution_options(
node_execution_options: utils.NodeExecutionOptions
) -> typing_extensions.Self
Class Variables | |
---|---|
POST_EXECUTABLE_SPEC | None |
PRE_EXECUTABLE_SPEC | None |