tf.distribute.Strategy

A state & compute distribution policy on a list of devices.

See the guide for overview and examples. See tf.distribute.StrategyExtended and tf.distribute for a glossory of concepts mentioned on this page such as "per-replica", replica, and reduce.

In short:

A custom training loop can be as simple as:

with my_strategy.scope():
  @tf.function
  def distribute_train_epoch(dataset):
    def replica_fn(input):
      # process input and return result
      return result

    total_result = 0
    for x in dataset:
      per_replica_result = my_strategy.run(replica_fn, args=(x,))
      total_result += my_strategy.reduce(tf.distribute.ReduceOp.SUM,
                                         per_replica_result, axis=None)
    return total_result

  dist_da