tfdv.generate_statistics_from_dataframe

Compute data statistics for the input pandas DataFrame.

This is a utility function for users with in-memory data represented as a pandas DataFrame.

This function supports only DataFrames whose columns are of primitive string or numeric types. DataFrames with multivalent features (multiple values per row for a single column) or with object columns that hold non-string values are not supported.
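The restriction above can be checked before statistics are computed. The following is a minimal sketch, not TFDV's own validation: the helper name `unsupported_columns` is hypothetical, and the rule it applies (a column is acceptable if its dtype is numeric or if every non-null value is a Python `str`) is an assumed approximation of the documented constraint.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def unsupported_columns(df: pd.DataFrame) -> list:
    """Return names of columns that look neither numeric nor string-only.

    Hypothetical pre-flight check; it approximates the documented
    restriction and is not part of the TFDV API.
    """
    flagged = []
    for name in df.columns:
        column = df[name]
        # Numeric dtypes pass. Note that pandas also treats bool as numeric;
        # whether TFDV accepts bool columns is not stated in this page.
        if is_numeric_dtype(column.dtype):
            continue
        # Otherwise every non-null value must be a plain Python string.
        if all(isinstance(value, str) for value in column.dropna()):
            continue
        flagged.append(name)
    return flagged


sample = pd.DataFrame({
    "name": ["a", "b"],          # strings: accepted
    "score": [1.0, 2.5],         # numeric: accepted
    "tags": [["x"], ["y", "z"]],  # lists per row (multivalent): flagged
    "mixed": ["a", 1],           # object column with a non-string: flagged
})
print(unsupported_columns(sample))
```

Flagged columns could be dropped or converted to strings before the DataFrame is passed on; which of the two is appropriate depends on the data.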

Args:
  dataframe: Input pandas DataFrame.
  stats_options: tfdv.StatsOptions for generating data statistics.
  n_jobs: Number of processes to run (defaults to 1). If -1 is provided, uses the same number of processes as the number of CPU cores.

Returns:
  A DatasetFeatureStatisticsList proto.
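A call might look like the sketch below. It assumes the `tensorflow_data_validation` package is installed and importable as `tfdv`; the wrapper name `dataframe_statistics` and the sample data are hypothetical and serve only as illustration.

```python
import pandas as pd


def dataframe_statistics(df: pd.DataFrame, n_jobs: int = 1):
    """Compute TFDV statistics for an in-memory DataFrame.

    Hypothetical convenience wrapper around the documented function.
    """
    # Imported inside the function so this module still loads where
    # TFDV is not installed.
    import tensorflow_data_validation as tfdv

    options = tfdv.StatsOptions()  # default statistics options
    return tfdv.generate_statistics_from_dataframe(
        dataframe=df,
        stats_options=options,
        n_jobs=n_jobs,  # -1 would use one process per CPU core
    )


# Sample data: only primitive string and numeric columns.
example_df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Pune"],
    "temperature": [4.5, 19.0, 31.2],
    "visits": [10, 3, 7],
})
```

Calling `dataframe_statistics(example_df)` should return a `DatasetFeatureStatisticsList` proto; its `datasets` field holds one entry per dataset, and `datasets[0].num_examples` reports the number of rows processed.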