tfdv.CombinerStatsGenerator

Class CombinerStatsGenerator

Generates statistics using a combiner function.

This object mirrors a beam.CombineFn.
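The combiner lifecycle implied by the four methods documented on this page can be sketched in plain Python. This is a minimal illustration, not the TFDV implementation: `ExampleCountGenerator` and `run_locally` are hypothetical names, the class does not subclass the real `CombinerStatsGenerator`, and a plain dict stands in for the proto that a real generator would return.

```python
class ExampleCountGenerator:
    """Hypothetical stand-in exposing the same four combiner methods."""

    def __init__(self, name, schema=None):
        self.name = name
        self.schema = schema

    def create_accumulator(self):
        # Start from an empty count.
        return 0

    def add_input(self, accumulator, input_batch):
        # input_batch maps feature names to lists of values, one per example.
        # Assumption: the longest list gives the number of examples in the batch.
        num_examples = max((len(v) for v in input_batch.values()), default=0)
        return accumulator + num_examples

    def merge_accumulators(self, accumulators):
        # Partial counts from different shards simply add up.
        return sum(accumulators)

    def extract_output(self, accumulator):
        # A real generator returns a proto; a dict is used here for illustration.
        return {"num_examples": accumulator}


def run_locally(generator, shards):
    """Imitates how a runner might drive the combiner over sharded batches."""
    partials = []
    for batches in shards:
        accumulator = generator.create_accumulator()
        for batch in batches:
            accumulator = generator.add_input(accumulator, batch)
        partials.append(accumulator)
    merged = generator.merge_accumulators(partials)
    return generator.extract_output(merged)


shards = [
    [{"age": [1, 2, 3]}, {"age": [4]}],  # first shard: two batches
    [{"age": [5, 6]}],                   # second shard: one batch
]
result = run_locally(ExampleCountGenerator("example_count"), shards)
print(result)
```

Each shard gets its own accumulator, so `merge_accumulators` is the only place where partial results meet. That is why the accumulator must carry everything needed to combine results later.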

__init__

__init__(
    name,
    schema=None
)

Initializes a statistics generator.

Args:

  • name: A unique name associated with the statistics generator.
  • schema: An optional schema for the dataset.

Properties

name

schema

Methods

add_input

add_input(
    accumulator,
    input_batch
)

Returns the result of folding a batch of inputs into the accumulator.

Args:

  • accumulator: The current accumulator.
  • input_batch: A Python dict whose keys are strings denoting feature names and whose values are lists representing a batch of examples. The batch is folded into the accumulator.

Returns:

The accumulator after updating the statistics for the batch of inputs.
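As a rough sketch of how a batch of the shape described above might be folded into an accumulator, the standalone functions below count non-missing values per feature. The `Counter` accumulator and the use of `None` to mark a missing value are assumptions made for illustration only; they are not taken from the TFDV source.

```python
import collections


def create_accumulator():
    # Hypothetical accumulator: feature name -> count of non-missing values.
    return collections.Counter()


def add_input(accumulator, input_batch):
    # Fold one batch (feature name -> list of values) into the accumulator.
    for feature_name, values in input_batch.items():
        # Assumption: None marks a missing value for that example.
        accumulator[feature_name] += sum(1 for v in values if v is not None)
    return accumulator


acc = create_accumulator()
acc = add_input(acc, {"age": [31, None, 47], "city": ["Oslo", "Lima", None]})
acc = add_input(acc, {"age": [25]})
print(dict(acc))
```

The function returns the accumulator it was given after updating it, which matches the documented contract that the return value is the accumulator after the batch has been applied.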

create_accumulator

create_accumulator()

Returns a fresh, empty accumulator.

Returns:

An empty accumulator.

extract_output

extract_output(accumulator)

Returns the result of converting the accumulator into the output value.

Args:

  • accumulator: The final accumulator value.

Returns:

A proto representing the result of this stats generator.

merge_accumulators

merge_accumulators(accumulators)

Merges several accumulators into a single accumulator value.

Args:

  • accumulators: The accumulators to merge.

Returns:

The merged accumulator.
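To illustrate why merging works on accumulators rather than on final outputs, here is a small sketch for a mean statistic. The `(total, count)` tuple accumulator is a hypothetical choice, and the `extract_output` shown returns a float rather than the proto a real generator would produce.

```python
def merge_accumulators(accumulators):
    # Each accumulator is a hypothetical (total, count) pair from one shard.
    total, count = 0.0, 0
    for partial_total, partial_count in accumulators:
        total += partial_total
        count += partial_count
    return (total, count)


def extract_output(accumulator):
    # Convert the final accumulator into the statistic (stand-in for a proto).
    total, count = accumulator
    return total / count if count else None


partials = [(10.0, 4), (6.0, 2), (0.0, 0)]  # the last shard saw no values
merged = merge_accumulators(partials)
mean = extract_output(merged)
print(merged, mean)
```

Keeping the total and the count separate until `extract_output` means the merge does not depend on how the data was sharded. Averaging per-shard means directly would give a different, incorrect result whenever shards differ in size.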