

Represents a dataset distributed among devices and machines.

A tf.distribute.DistributedDataset can be thought of as a "distributed" dataset. When you use the tf.distribute API to scale training to multiple devices or machines, you also need to distribute the input data, which leads to a tf.distribute.DistributedDataset instance instead of the tf.data.Dataset instance used in the non-distributed case. In TF 2.x, tf.distribute.DistributedDataset objects are Python iterables.

There are two APIs to create a tf.distribute.DistributedDataset object: tf.distribute.Strategy.experimental_distribute_dataset(dataset) and tf.distribute.Strategy.distribute_datasets_from_function(dataset_fn). When to use which? When you have a tf.data.Dataset instance, and the regular batch splitting (i.e. re-batching the input tf.data.Dataset instance with a new batch size equal to the global batch size divided by the number of replicas in sync) and autosharding (i.e. the tf.data.experimental.AutoShardPolicy options) work for you, use the former API. Otherwise, if you are not using a canonical tf.data.Dataset instance, or you would like to customize the batch splitting or sharding, you can wrap this logic in a dataset_fn and use the latter API. Both APIs handle prefetching to the device for the user. For more details and examples, see the documentation for each API.
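The choice between the two APIs can be sketched as follows. This is a minimal illustration, not a recommended setup: it assumes a machine with no GPUs and splits the CPU into two logical devices so that two replicas exist, and the names GLOBAL_BATCH_SIZE, make_examples and per_replica_shapes are hypothetical helpers introduced only for this example.

```python
import tensorflow as tf

# Assumption for this sketch: split the physical CPU into two logical
# devices so that two replicas exist even without GPUs.
cpu = tf.config.list_physical_devices("CPU")[0]
tf.config.set_logical_device_configuration(
    cpu,
    [tf.config.LogicalDeviceConfiguration(), tf.config.LogicalDeviceConfiguration()],
)
strategy = tf.distribute.MirroredStrategy(["CPU:0", "CPU:1"])

GLOBAL_BATCH_SIZE = 4  # illustrative value


def make_examples():
    # Eight identical (features, labels) pairs.
    return tf.data.Dataset.from_tensors(([1.0], [1.0])).repeat(8)


# Former API: batch with the global batch size and let the strategy
# handle batch splitting and autosharding.
dist_from_dataset = strategy.experimental_distribute_dataset(
    make_examples().batch(GLOBAL_BATCH_SIZE)
)


def dataset_fn(input_context):
    # Latter API: batch splitting and sharding are written out by hand.
    per_replica_batch_size = input_context.get_per_replica_batch_size(GLOBAL_BATCH_SIZE)
    dataset = make_examples().shard(
        input_context.num_input_pipelines, input_context.input_pipeline_id
    )
    return dataset.batch(per_replica_batch_size)


dist_from_fn = strategy.distribute_datasets_from_function(dataset_fn)


def per_replica_shapes(dist_dataset):
    # Collect, for every step, the feature shape seen by each replica.
    shapes = []
    for features, _ in dist_dataset:
        local = strategy.experimental_local_results(features)
        shapes.append([tuple(t.shape) for t in local])
    return shapes


shapes_from_dataset = per_replica_shapes(dist_from_dataset)
shapes_from_fn = per_replica_shapes(dist_from_fn)
print(shapes_from_dataset)
print(shapes_from_fn)
```

In this single-worker setup both objects should hand each replica batches of GLOBAL_BATCH_SIZE divided by the number of replicas. The dataset_fn route becomes useful once the splitting or sharding needs to differ from that default.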

There are two main usages of a DistributedDataset object:

  1. Iterate over it to generate the input for a single device or multiple devices, which is a tf.distribute.DistributedValues instance. To do this, you can:

    • use a Pythonic for-loop construct:
    import tensorflow as tf

    global_batch_size = 4
    strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"])
    # One global batch of 4 (features, labels) pairs, each equal to ([1.], [1.]).
    dataset = tf.data.Dataset.from_tensors(([1.],[1.])).repeat(4).batch(global_batch_size)
    dist_dataset = strategy.experimental_distribute_dataset(dataset)
    def train_step(input):
      features, labels = input
      return labels - 0.3 * features
    for x in dist_dataset:
      # train_step trains the model using the dataset elements
      loss = strategy.run(train_step, args=(x,))
      print("Loss is", loss)
        Loss is PerReplica:{
          0: tf.Tensor(
        [[0.7]
         [0.7]], shape=(2, 1), dtype=float32),
          1: tf.Tensor(
        [[0.7]
         [0.7]], shape=(2, 1), dtype=float32)
        }
Placing the loop inside a tf.function will give a performance boost. However, break and return are currently not supported if the loop is placed inside a tf.function. We also don't support placing the loop inside a tf.function when using tf.distribute.experimental.MultiWorkerMirroredStrategy or tf.distribute.experimental.TPUStrategy with multiple workers.
    global_batch_size = 4
    strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"])
    train_dataset = tf.data.Dataset.from_tensors(([1.],[1.])).repeat(50).batch(global_batch_size)
    train_dist_dataset = strategy.experimental_distribute_dataset(train_dataset)
    def distributed_train_step(dataset_inputs):