Gets unique elements and their counts from the input dataset.
tff.analytics.data_processing.get_unique_elements_with_counts(
dataset: tf.data.Dataset, max_string_length: Optional[int] = None
) -> Tuple[tf.Tensor, tf.Tensor]
This method returns a tuple of elements and counts, where elements are the
unique elements in the dataset, and counts is the number of times each one
appears.
The input dataset must yield batched rank-1 tensors. This function reads
each coordinate of the tensor as an individual element and caps the total
number of elements to return.
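The counting semantics described above can be sketched in plain Python, without TensorFlow. This is an illustration only, not the library's implementation: count_unique is a hypothetical helper, and each inner list stands in for one batched rank-1 tensor of byte strings. Returning elements in first-appearance order is an assumption of this sketch; the description above does not state an ordering.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple


def count_unique(
    batches: Iterable[Sequence[bytes]],
) -> Tuple[List[bytes], List[int]]:
    """Hypothetical stand-in: treats every coordinate of every batch as one element."""
    counter: Counter = Counter()
    for batch in batches:
        # Each batch plays the role of one rank-1 tensor yielded by the dataset.
        counter.update(batch)
    # Counter keeps insertion order, so elements appear in first-seen order
    # (an assumption of this sketch, not a documented guarantee).
    elements = list(counter)
    counts = [counter[e] for e in elements]
    return elements, counts


# Two "batches" of strings; b"a" appears three times across them.
batches = [[b"a", b"b", b"a"], [b"c", b"a"]]
elements, counts = count_unique(batches)
```

Here elements and counts are parallel sequences: counts[i] is the number of times elements[i] appeared across all batches.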
Args |
dataset
|
A tf.data.Dataset to read elements from. Element type must be
tf.string.
|
max_string_length
|
The maximum length (in bytes) of strings in the dataset.
Strings longer than max_string_length will be truncated. Defaults to
None, which means there is no limit on the string length.
|
Returns |
elements
|
A rank-1 Tensor containing all the unique elements of the input
dataset.
|
counts
|
A rank-1 Tensor containing the counts for each of the elements in
elements.
|
Raises |
ValueError
|
-- If the shape of elements in dataset is not rank 1.
-- If max_string_length is not None and is less than 1.
|
TypeError
|
If dataset.element_spec.dtype is not tf.string.
|
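The effect of max_string_length and the error conditions listed above can be illustrated with a plain-Python sketch. truncate_and_count is a hypothetical helper, not part of TFF: it models elements as Python bytes, truncates by byte length, and mirrors the documented ValueError and TypeError cases. Exactly when the real function raises is an assumption here, and the rank check is omitted because plain lists carry no tensor rank.

```python
from collections import Counter
from typing import Iterable, List, Optional, Sequence, Tuple


def truncate_and_count(
    batches: Iterable[Sequence[bytes]],
    max_string_length: Optional[int] = None,
) -> Tuple[List[bytes], List[int]]:
    """Hypothetical stand-in: truncates each string, then counts unique values."""
    # Mirrors the documented ValueError for a limit below 1.
    if max_string_length is not None and max_string_length < 1:
        raise ValueError("max_string_length must be None or at least 1.")
    counter: Counter = Counter()
    for batch in batches:
        for item in batch:
            # Mirrors the documented TypeError for non-string elements.
            if not isinstance(item, bytes):
                raise TypeError("Elements must be byte strings.")
            # Slicing with None keeps the whole string (no limit).
            counter[item[:max_string_length]] += 1
    elements = list(counter)
    counts = [counter[e] for e in elements]
    return elements, counts


batches = [[b"apple", b"apply"], [b"bat"]]
# With a 4-byte limit, b"apple" and b"apply" both become b"appl".
elements, counts = truncate_and_count(batches, max_string_length=4)
```

Note that truncation happens before counting in this sketch, so strings that differ only past the limit are merged into one element.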