tff.analytics.data_processing.get_capped_elements_with_counts

Gets the capped elements with counts from the input dataset.

The input dataset must yield batched rank-1 tensors. This function treats each coordinate of a tensor as an individual element and caps the total number of elements to return. Capping is applied per batch: either all of the elements in a batch are added to the returned result, or none of them are. As a consequence, when the dataset is capped, the number of elements included in the result can be less than max_user_contribution.

dataset A tf.data.Dataset.
max_user_contribution The maximum number of elements to return.
batch_size The number of elements in each batch of dataset.
string_max_bytes The maximum length (in bytes) of strings in the dataset. Strings longer than string_max_bytes are truncated. Defaults to None, which means there is no limit on string length.

elements A rank-1 Tensor containing the unique elements of the input dataset after being capped. If the total number of elements is less than or equal to max_user_contribution, returns all the elements in dataset.
counts A rank-1 Tensor containing the counts for each of the elements in elements.

ValueError If the shape of elements in dataset is not rank 1, if max_user_contribution is less than 1, if batch_size is less than 1, or if string_max_bytes is not None and is less than 1.
TypeError If dataset.element_spec.dtype is not tf.string.
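
The all-or-nothing batch capping described above can be illustrated with a small pure-Python model. This is only a sketch of the documented behavior, not the library's implementation: the helper name capped_elements_with_counts_sketch is hypothetical, batches are modeled as plain lists of bytes rather than a tf.data.Dataset, and it assumes the cap is realized by keeping the first max_user_contribution // batch_size whole batches and that unique elements are reported in first-seen order.

```python
from collections import Counter
from typing import Iterable, List, Optional, Sequence, Tuple


def capped_elements_with_counts_sketch(
    batches: Iterable[Sequence[bytes]],
    max_user_contribution: int,
    batch_size: int,
    string_max_bytes: Optional[int] = None,
) -> Tuple[List[bytes], List[int]]:
    """Hypothetical pure-Python model of the documented capping behavior."""
    # Mirror the documented argument checks.
    if max_user_contribution < 1:
        raise ValueError("max_user_contribution must be at least 1.")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1.")
    if string_max_bytes is not None and string_max_bytes < 1:
        raise ValueError("string_max_bytes must be at least 1 when set.")

    # Assumption: only whole batches are kept, so the cap becomes a batch count.
    max_batches = max_user_contribution // batch_size

    kept: List[bytes] = []
    for index, batch in enumerate(batches):
        if index >= max_batches:
            break  # This batch and all later ones are dropped entirely.
        kept.extend(batch)

    # Truncate by bytes, not characters.
    if string_max_bytes is not None:
        kept = [value[:string_max_bytes] for value in kept]

    # Counter preserves first-seen order (an assumption of this sketch).
    tally = Counter(kept)
    elements = list(tally)
    counts = [tally[element] for element in elements]
    return elements, counts


batches = [[b"a", b"b"], [b"a", b"c"], [b"d", b"e"]]
elements, counts = capped_elements_with_counts_sketch(
    batches, max_user_contribution=5, batch_size=2
)
print(elements, counts)  # [b'a', b'b', b'c'] [2, 1, 1]
```

With max_user_contribution=5 and batch_size=2, only two whole batches fit, so four elements are kept rather than five. The third batch is dropped as a unit, which is why b"d" and b"e" do not appear in the result.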