The implementation is based on tf.data.Dataset.from_tensor_slices. This
class is intended only for constructing toy federated datasets, especially
to support simulation tests. Using this for large datasets is not
recommended, as it requires putting all client data into the underlying
TensorFlow graph (which is memory intensive).
tensor_slices_dict: A dictionary keyed by client_id, where values are
lists, tuples, or dicts for passing to
tf.data.Dataset.from_tensor_slices. Note that namedtuples and attrs
classes are not explicitly supported, but a user can convert their data
from those formats to a dict, and then use this class.
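
As a minimal sketch of the conversion mentioned above, assuming hypothetical client records stored as namedtuples: the `Example` type, the client ids, and the `to_plain_dict` helper are illustrative and not part of the library. Each client's records are regrouped into a plain dict of equal-length lists, which is one of the structures tf.data.Dataset.from_tensor_slices accepts.

```python
import collections

# Hypothetical record type used only for illustration.
Example = collections.namedtuple("Example", ["x", "y"])

raw_data = {
    "client_a": [Example(x=1.0, y=0), Example(x=2.0, y=1)],
    "client_b": [Example(x=3.0, y=1)],
}


def to_plain_dict(records):
    """Regroups a non-empty list of namedtuples into a dict of per-field lists."""
    fields = records[0]._fields
    return {name: [getattr(record, name) for record in records] for name in fields}


# One plain dict per client, keyed by client id.
tensor_slices_dict = {cid: to_plain_dict(recs) for cid, recs in raw_data.items()}
print(tensor_slices_dict["client_a"])  # {'x': [1.0, 2.0], 'y': [0, 1]}
```

The resulting values are plain dicts, so they avoid the namedtuple restriction described above.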
If a client with no data is found.
If tensor_slices_dict is not a dictionary, or its value
structures are namedtuples, or its value structures are not either
strictly lists, strictly (standard, non-named) tuples, or strictly
dicts.
If flattened values in tensor_slices_dict convert to different
TensorFlow data types.
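
The structural checks described above can be approximated in plain Python. This is a sketch under stated assumptions, not the library's implementation: `check_tensor_slices_dict` is a hypothetical helper, the choice of TypeError and ValueError is an assumption, "no data" is simplified to an empty container, and the dtype-consistency check is omitted because it requires TensorFlow.

```python
def _is_namedtuple(value):
    # namedtuples are tuple subclasses that carry a _fields attribute.
    return isinstance(value, tuple) and hasattr(value, "_fields")


def check_tensor_slices_dict(tensor_slices_dict):
    """Hypothetical validator mirroring the documented structural checks."""
    if not isinstance(tensor_slices_dict, dict):
        raise TypeError("tensor_slices_dict must be a dictionary.")
    kinds = set()
    for client_id, value in tensor_slices_dict.items():
        if _is_namedtuple(value):
            raise TypeError(f"Client {client_id!r}: namedtuples are not supported.")
        # Classify the value as a list, tuple, or dict; anything else is rejected.
        kind = next((t for t in (list, tuple, dict) if isinstance(value, t)), None)
        if kind is None:
            raise TypeError(f"Client {client_id!r}: unsupported structure.")
        # Simplification: an empty container counts as a client with no data.
        if len(value) == 0:
            raise ValueError(f"Client {client_id!r} has no data.")
        kinds.add(kind)
    # All clients must use the same kind of structure.
    if len(kinds) > 1:
        raise TypeError("Values must be all lists, all tuples, or all dicts.")
```

Running such a check before constructing the dataset surfaces structural problems early, with a message that names the offending client.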
A list of string identifiers for clients in this dataset.
This method partitions the clients of client_data into two ClientData
objects with disjoint sets of ClientData.client_ids. All clients in the
test ClientData are guaranteed to have non-empty datasets, but the
training ClientData may have clients with no data.
client_data: The base ClientData to split.
num_test_clients: How many clients to hold out for testing. This can be at
most len(client_data.client_ids) - 1, since we don't want to produce
an empty ClientData.
Optional seed to fix shuffling of clients before splitting. The seed
can be any nonnegative 32-bit integer, an array of such integers, or
A pair (train_client_data, test_client_data), where test_client_data
has num_test_clients selected at random, subject to the constraint that
they each have at least one batch in their dataset.
If num_test_clients cannot be satisfied by client_data,
or too many clients have empty datasets.
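
The splitting behavior described above can be sketched in plain Python. This is an illustration, not the library's code: `split_client_ids` is a hypothetical helper, each client's dataset is stood in for by a list of batches, the standard-library `random` module replaces whatever shuffling the library uses, and the lower bound of 1 on `num_test_clients` is an assumption.

```python
import random


def split_client_ids(client_datasets, num_test_clients, seed=None):
    """Hypothetical helper returning (train_ids, test_ids).

    client_datasets maps each client id to a list of batches.
    """
    client_ids = list(client_datasets)
    if not 0 < num_test_clients < len(client_ids):
        raise ValueError("num_test_clients must be in [1, len(client_ids) - 1].")
    # Shuffle a copy so the seed fixes the order in which clients are considered.
    random.Random(seed).shuffle(client_ids)
    test_ids = []
    for client_id in client_ids:
        # Only clients with at least one batch may be held out for testing.
        if client_datasets[client_id]:
            test_ids.append(client_id)
        if len(test_ids) == num_test_clients:
            break
    if len(test_ids) < num_test_clients:
        raise ValueError("Too many clients have empty datasets.")
    held_out = set(test_ids)
    # Remaining clients, possibly including empty ones, go to training.
    train_ids = [cid for cid in client_ids if cid not in held_out]
    return train_ids, test_ids


data = {"a": [[1, 2]], "b": [], "c": [[3]], "d": [[4], [5]]}
train_ids, test_ids = split_client_ids(data, num_test_clients=2, seed=0)
```

In this sketch the empty client "b" can only ever land in the training split, which matches the guarantee that every test client has data while training clients may not.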