Transforms client data, potentially expanding by adding pseudo-clients.
Each client of the raw_client_data is "expanded" into some number of pseudo-clients. Each pseudo-client ID is a string consisting of the original client ID plus a concatenated integer index. For example, the raw client ID "client_a" might be expanded into pseudo-client IDs "client_a_0", "client_a_1" and "client_a_2". A function fn(x) maps datapoint x to a new datapoint, where the constructor of fn is parameterized by the (raw) client_id and index i. For example, if x is an image, then make_transform_fn("client_a", 0)(x) might be the identity, while make_transform_fn("client_a", 1)(x) could be a random rotation of the image with the angle determined by a hash of "client_a" and "1". By convention, the index 0 corresponds to the identity function if the identity is supported.
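The following is a minimal sketch of what a make_transform_fn could look like, not the library's own implementation. It assumes, purely for illustration, that a datapoint x is a Python list (standing in for an image) and that the "rotation" is a cyclic shift whose amount is derived from a hash of the raw client ID and the index.

```python
import hashlib


def make_transform_fn(raw_client_id, index):
    """Returns a function fn(x) that maps a datapoint x to a new datapoint.

    Hypothetical example: x is a list, and the transform is a cyclic shift.
    """
    if index == 0:
        # By convention, index 0 corresponds to the identity.
        return lambda x: x

    # Derive a deterministic shift amount from the client id and the index.
    key = f"{raw_client_id}_{index}".encode("utf-8")
    shift = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")

    def transform(x):
        if not x:
            return x
        k = shift % len(x)
        # Cyclic shift: the same elements, in a rotated order.
        return x[k:] + x[:k]

    return transform
```

Because the shift depends only on the client ID and the index, calling make_transform_fn twice with the same arguments yields functions that transform a datapoint identically.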
__init__( raw_client_data, make_transform_fn, num_transformed_clients )
Initializes the TransformingClientData.
raw_client_data: A ClientData to expand.
make_transform_fn: A function that returns a callable that maps datapoint x to a new datapoint x'. make_transform_fn will be called as make_transform_fn(raw_client_id, i), where i is an integer index, and should return a function fn(x) -> x'. For example, if x is an image, then make_transform_fn("client_a", 0)(x) might be the identity, while make_transform_fn("client_a", 1)(x) could be a random rotation of the image with the angle determined by a hash of "client_a" and "1". If make_transform_fn returns None, no transformation is performed. By convention, the index 0 corresponds to the identity function if the identity is supported.
num_transformed_clients: The total number of transformed clients to produce. If it is an integer multiple k of the number of real clients, there will be exactly k pseudo-clients per real client, with indices 0...k-1. Any remainder g will be generated from the first g real clients and will be given index k.
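The allocation rule for num_transformed_clients can be sketched in plain Python. The helper name expand_client_ids, the "_" separator (taken from the "client_a_0" example), and the ordering of the returned list are assumptions for illustration. "The first g real clients" is interpreted here as the first g entries of the list passed in.

```python
def expand_client_ids(raw_client_ids, num_transformed_clients):
    """Lists pseudo-client ids for the given raw client ids (hypothetical helper)."""
    num_raw = len(raw_client_ids)
    # k full rounds of pseudo-clients, plus a remainder of g extra pseudo-clients.
    k, g = divmod(num_transformed_clients, num_raw)

    expanded = []
    for position, raw_id in enumerate(raw_client_ids):
        # Every real client gets indices 0..k-1; the first g also get index k.
        num_indices = k + 1 if position < g else k
        for i in range(num_indices):
            expanded.append(f"{raw_id}_{i}")
    return expanded
```

For three real clients and num_transformed_clients=7, this gives k=2 and g=1, so only the first client receives an extra pseudo-client with index 2.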
The list of string identifiers for clients in this dataset.
Returns the shape of each component of an element of the client datasets.
A nested structure of tf.TensorShape objects corresponding to each component of an element of the client datasets.
Returns the type of each component of an element of the client datasets.
A nested structure of tf.DType objects corresponding to each component of an element of the client datasets.
Creates a new tf.data.Dataset containing the client training examples.
client_id: The string client_id for the desired client.
Creates a new tf.data.Dataset containing all client examples.
NOTE: the returned tf.data.Dataset is not serializable and runnable on other devices, as it uses
Currently, the implementation produces a dataset that contains all examples from a single client in order, and so generally additional shuffling should be performed.
seed: Optional, a seed to determine the order in which clients are processed in the joined dataset.
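To illustrate why additional shuffling is generally needed, here is a pure-Python sketch of the behavior described above. It is not the library's implementation: the helper names and the dict-of-lists input are assumptions, and the standard random module stands in for whatever ordering mechanism the library uses. With a real tf.data.Dataset, the Dataset.shuffle method would play the role of shuffle_examples.

```python
import random


def join_all_clients(examples_by_client, seed=None):
    """Concatenates per-client examples, visiting clients in a seeded order."""
    client_ids = sorted(examples_by_client)
    # The seed only determines the order in which clients are processed.
    random.Random(seed).shuffle(client_ids)

    joined = []
    for client_id in client_ids:
        # All examples of one client stay together, in their original order.
        joined.extend((client_id, ex) for ex in examples_by_client[client_id])
    return joined


def shuffle_examples(joined, seed=None):
    """Mixes examples across clients; returns a new list."""
    shuffled = list(joined)
    random.Random(seed).shuffle(shuffled)
    return shuffled
```

The joined list holds each client's examples as one contiguous run, which is the situation in which a training loop would see long stretches of data from a single client unless the examples are shuffled afterwards.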
from_clients_and_fn( cls, client_ids, create_tf_dataset_for_client_fn )
Constructs a ClientData based on the given function.
client_ids: A non-empty list of client_ids which are valid inputs to the create_tf_dataset_for_client_fn.
create_tf_dataset_for_client_fn: A function that takes a client_id from the above list, and returns a tf.data.Dataset.
preprocess_fn to each client's data.
train_test_client_split( cls, client_data, num_test_clients )
Returns a pair of (train, test) ClientData.
This method partitions the clients of client_data into two ClientData objects with disjoint sets of ClientData.client_ids. All clients in the test ClientData are guaranteed to have non-empty datasets, but the training ClientData may have clients with no data.
client_data: The base ClientData to split.
num_test_clients: How many clients to hold out for testing. This can be at most len(client_data.client_ids) - 1, since we don't want to produce empty ClientData.
A pair (train_client_data, test_client_data), where test_client_data has num_test_clients selected at random, subject to the constraint they each have at least 1 batch in their dataset.
Raises an error if num_test_clients cannot be satisfied by client_data, or if too many clients have empty datasets.
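The split semantics described above can be sketched in plain Python. This is an illustration under stated assumptions rather than the library's code: clients are represented as a dict mapping client ID to a list of examples, the helper name is hypothetical, and ValueError is this sketch's own choice of exception.

```python
import random


def split_train_test_client_ids(examples_by_client, num_test_clients, seed=None):
    """Returns (train_ids, test_ids) with disjoint client ids (hypothetical helper)."""
    client_ids = sorted(examples_by_client)
    if num_test_clients > len(client_ids) - 1:
        # Holding out every client would leave an empty training set.
        raise ValueError("num_test_clients cannot be satisfied by the given clients")

    candidates = list(client_ids)
    random.Random(seed).shuffle(candidates)

    test_ids = []
    for client_id in candidates:
        if len(test_ids) == num_test_clients:
            break
        # Only clients with at least one example may be held out for testing.
        if examples_by_client[client_id]:
            test_ids.append(client_id)

    if len(test_ids) < num_test_clients:
        raise ValueError("too many clients have empty datasets")

    held_out = set(test_ids)
    # Everything not held out is a training client, including empty ones.
    train_ids = [c for c in client_ids if c not in held_out]
    return train_ids, sorted(test_ids)
```

Clients with empty datasets are skipped when choosing test clients, so they end up on the training side, which matches the statement that only the test clients are guaranteed to be non-empty.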