Infers schema from the input statistics.
tfdv.infer_schema(
    statistics,
    infer_feature_shape=True,
    max_string_domain_size=100,
    schema_transformations=None
)
statistics: A DatasetFeatureStatisticsList protocol buffer. Schema inference is currently supported only for lists with a single DatasetFeatureStatistics proto or lists with multiple DatasetFeatureStatistics protos corresponding to data slices that include the default slice (i.e., the slice with all examples). If a list with multiple DatasetFeatureStatistics protos is used, this function will infer the schema from the statistics corresponding to the default slice.
infer_feature_shape: A boolean indicating whether the shapes of the features should be inferred from the statistics.
max_string_domain_size: Maximum size of a string feature's domain for the feature to be interpreted as a categorical feature.
schema_transformations: List of transformation functions to apply to the auto-inferred schema. Each transformation function should take the schema and statistics as input and should return the transformed schema. The transformations are applied in the order provided in the list.
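As a minimal sketch of the schema_transformations contract described above, the following defines a transformation that takes the schema and statistics and returns the modified schema. The names make_require_feature and infer_schema_requiring are hypothetical, and the presence.min_fraction field is assumed from the Schema proto's feature presence settings; treat this as an illustration under those assumptions rather than a recommended recipe.

```python
def make_require_feature(feature_name):
    """Build a transformation that marks one feature as always present.

    Hypothetical helper: the returned function has the (schema, statistics)
    signature expected of each entry in schema_transformations.
    """
    def require_feature(schema, statistics):
        # statistics is unused here, but every transformation receives it.
        for feature in schema.feature:
            if feature.name == feature_name:
                # Assumed Schema proto field: the minimum fraction of
                # examples that must contain the feature.
                feature.presence.min_fraction = 1.0
        return schema
    return require_feature


def infer_schema_requiring(statistics, feature_names):
    """Infer a schema, then require each named feature, in list order."""
    # Imported lazily so the transformation above can be used and tested
    # without TFDV installed.
    import tensorflow_data_validation as tfdv

    transformations = [make_require_feature(name) for name in feature_names]
    return tfdv.infer_schema(
        statistics, schema_transformations=transformations)
```

Because the transformations are applied in list order, a later transformation sees the changes made by earlier ones, so order matters when two of them touch the same feature.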
Returns a Schema protocol buffer.
TypeError: If the input argument is not of the expected type.
ValueError: If the input statistics proto contains multiple datasets, none of which corresponds to the default slice.
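The ValueError condition above follows from how the default slice is chosen when several datasets are present. The sketch below illustrates that selection rule in plain Python; it is not the library's implementation. The function name select_default_dataset is hypothetical, and the default slice name "All Examples" is an assumption that should be checked against the installed TFDV version.

```python
# Assumed name of the default slice (the slice containing all examples).
DEFAULT_SLICE_NAME = "All Examples"


def select_default_dataset(statistics):
    """Pick the dataset that schema inference would be based on.

    Hypothetical sketch: `statistics` is any object with a `datasets`
    sequence whose items have a `name` attribute, mirroring the shape of
    a DatasetFeatureStatisticsList.
    """
    datasets = list(statistics.datasets)
    # A single dataset is used as-is, whatever its name.
    if len(datasets) == 1:
        return datasets[0]
    # With multiple datasets, only the default slice is acceptable.
    for dataset in datasets:
        if dataset.name == DEFAULT_SLICE_NAME:
            return dataset
    raise ValueError(
        "Multiple datasets were given, but none is the default slice.")
```

A caller holding sliced statistics could run a check like this before calling tfdv.infer_schema to produce a clearer error message for its own users.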