Module: tfdv

Init module for TensorFlow Data Validation.

Classes

class CombinerStatsGenerator: Generates statistics using a combiner function.

class DecodeCSV: Decodes CSV records into Arrow RecordBatches.

class FeaturePath: Represents the path to a feature in an input example.

class GenerateStatistics: API for generating data statistics.

class LiftStatsGenerator: A transform stats generator for computing lift between two features.

class NonStreamingCustomStatsGenerator: Estimates custom statistics in a non-streaming fashion.

class StatsOptions: Options for generating statistics.

class TransformStatsGenerator: Generates statistics using a Beam PTransform.

class WriteStatisticsToTFRecord: API for writing serialized data statistics to a TFRecord file.

class WriteStatisticsToText: API for writing serialized data statistics to a text file.
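
As an illustration of how StatsOptions is typically combined with a statistics generator, here is a minimal sketch. It assumes a pandas DataFrame as input and that the `sample_rate` and `num_top_values` options are available in the installed version. The helper name `stats_with_options` is hypothetical, and the import is deferred into the function only so that the snippet can be loaded without the library present.

```python
def stats_with_options(dataframe, sample_rate=None, num_top_values=20):
    """Compute statistics for a DataFrame with a few explicit options.

    Hypothetical helper, not part of tfdv. The option names are assumed
    to match the StatsOptions constructor of the installed version.
    """
    # Deferred import: the heavy dependency is only needed when called.
    import tensorflow_data_validation as tfdv

    # Bundle the tuning knobs into a single options object.
    options = tfdv.StatsOptions(
        sample_rate=sample_rate,        # None means "use every example"
        num_top_values=num_top_values,  # top-k values kept for string features
    )

    # Returns a DatasetFeatureStatisticsList proto.
    return tfdv.generate_statistics_from_dataframe(
        dataframe, stats_options=options
    )
```

The result can be passed to visualize_statistics in a notebook, or to infer_schema as a starting point for a schema.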

Functions

DecodeTFExample(...): Decodes serialized TF examples into Arrow RecordBatches.

compare_slices(...): Compares statistics of two slices using Facets.

display_anomalies(...): Displays the input anomalies.

display_schema(...): Displays the input schema.

generate_statistics_from_csv(...): Computes data statistics from CSV files.

generate_statistics_from_dataframe(...): Computes data statistics for the input pandas DataFrame.

generate_statistics_from_tfrecord(...): Computes data statistics from TFRecord files containing TFExamples.

get_domain(...): Gets the domain associated with the input feature from the schema.

get_feature(...): Gets a feature from the schema.

get_feature_value_slicer(...): Returns a function that generates sliced record batches from a given record batch.

get_slice_stats(...): Gets statistics associated with a specific slice.

infer_schema(...): Infers schema from the input statistics.

load_anomalies_text(...): Loads the Anomalies proto stored in text format in the input path.

load_schema_text(...): Loads the schema stored in text format in the input path.

load_statistics(...): Loads a data statistics proto from a file.

load_stats_text(...): Loads the specified DatasetFeatureStatisticsList proto stored in text format.

set_domain(...): Sets the domain for the input feature in the schema.

update_schema(...): Updates the input schema to conform to the input statistics.

validate_examples_in_csv(...): Validates examples in CSV files.

validate_examples_in_tfrecord(...): Validates TFExamples in TFRecord files.

validate_instance(...): Validates a batch of examples against the schema provided in options.

validate_statistics(...): Validates the input statistics against the provided input schema.

visualize_statistics(...): Visualizes the input statistics using Facets.

write_anomalies_text(...): Writes the Anomalies proto to a file in text format.

write_schema_text(...): Writes the input schema to a file in text format.

write_stats_text(...): Writes a DatasetFeatureStatisticsList proto to a file in text format.
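
Several of the functions above are commonly chained: compute statistics, infer a schema from them, then validate new data against that schema. The following is a minimal sketch of that flow, assuming both datasets are pandas DataFrames. The helper name `check_eval_against_train` is hypothetical, and the import is deferred into the function only so that the snippet can be loaded without the library present.

```python
def check_eval_against_train(train_df, eval_df):
    """Infer a schema from training data and validate evaluation data.

    Hypothetical helper, not part of tfdv. Returns a (schema, anomalies)
    pair of protos.
    """
    # Deferred import: the heavy dependency is only needed when called.
    import tensorflow_data_validation as tfdv

    # Step 1: summarize the training data.
    train_stats = tfdv.generate_statistics_from_dataframe(train_df)

    # Step 2: derive an initial schema (types, domains, presence)
    # from those statistics.
    schema = tfdv.infer_schema(statistics=train_stats)

    # Step 3: summarize the evaluation data the same way.
    eval_stats = tfdv.generate_statistics_from_dataframe(eval_df)

    # Step 4: compare the evaluation statistics against the schema.
    anomalies = tfdv.validate_statistics(statistics=eval_stats, schema=schema)

    return schema, anomalies
```

In a notebook, the returned values can be inspected with display_schema and display_anomalies, and the schema can be saved with write_schema_text for later runs.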

Other Members

  • __version__ = '0.22.1'