
tfdv.infer_schema

tfdv.infer_schema(
    statistics,
    infer_feature_shape=True,
    max_string_domain_size=100
)

Infers a schema from the input statistics.

Args:

  • statistics: A DatasetFeatureStatisticsList protocol buffer. Schema inference is currently only supported for lists with a single DatasetFeatureStatistics proto.
  • infer_feature_shape: A boolean indicating whether the shapes of the features should be inferred from the statistics.
  • max_string_domain_size: Maximum number of distinct values a string feature may have and still be interpreted as a categorical feature.

Returns:

A Schema protocol buffer.

Raises:

  • TypeError: If the input argument is not of the expected type.
  • ValueError: If the input statistics proto does not contain exactly one dataset.