Synchronous training on TPUs and TPU Pods.
tf.distribute.experimental.TPUStrategy( tpu_cluster_resolver=None, device_assignment=None )
To construct a TPUStrategy object, you need to run the initialization code as below:
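resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='')
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.experimental.TPUStrategy(resolver)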
While using distribution strategies, the variables created within the strategy's scope will be replicated across all the replicas and can be kept in sync using all-reduce algorithms.
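As a minimal sketch of what "created within the strategy's scope" looks like in code: the helper name `create_replicated_variable` is hypothetical, and the default single-device strategy is used as a stand-in so the snippet runs without TPU hardware. With a real TPUStrategy the same `scope()` call would apply.

```python
import tensorflow as tf


def create_replicated_variable(strategy):
    """Create a variable under the strategy's scope (hypothetical helper)."""
    # Variables created inside strategy.scope() are managed by the strategy,
    # which is what allows them to be replicated and kept in sync.
    with strategy.scope():
        return tf.Variable(1.0, name="weight")


# Assumption: the default strategy stands in for a TPUStrategy here.
weight = create_replicated_variable(tf.distribute.get_strategy())
```

On a TPU, you would pass the `strategy` object built from the cluster resolver instead of `tf.distribute.get_strategy()`.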
To run TF2 programs on TPUs, you can either use the .fit APIs in tf.keras with TPUStrategy, or write your own customized training loop by calling strategy.run directly. Note that TPUStrategy doesn't support pure eager execution, so make sure that the function passed into strategy.run is a tf.function, or that strategy.run is called inside a tf.function, if eager behavior is enabled.
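The tf.function requirement can be illustrated with a small sketch. This is not a full training loop: `make_replica_sum` and `replica_fn` are hypothetical names, the per-replica computation is a placeholder, and the default strategy is used only so the snippet runs without a TPU.

```python
import tensorflow as tf


def make_replica_sum(strategy):
    """Build a tf.function that wraps strategy.run (hypothetical helper)."""

    @tf.function  # strategy.run is called inside a tf.function, not eagerly
    def replica_sum(x):
        def replica_fn(v):
            # Placeholder per-replica computation.
            return tf.reduce_sum(tf.square(v))

        per_replica = strategy.run(replica_fn, args=(x,))
        # Combine the per-replica results into a single value.
        return strategy.reduce(
            tf.distribute.ReduceOp.SUM, per_replica, axis=None)

    return replica_sum


# Assumption: the default strategy stands in for a TPUStrategy here.
replica_sum = make_replica_sum(tf.distribute.get_strategy())
result = replica_sum(tf.constant([1.0, 2.0]))
```

With several replicas, each replica receives the same non-distributed input in this sketch, so the SUM reduction adds up one result per replica.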
|tpu_cluster_resolver|A tf.distribute.cluster_resolver.TPUClusterResolver, which provides information about the TPU cluster.|
|cluster_resolver|Returns the cluster resolver associated with this strategy.|
|num_replicas_in_sync|Returns number of replicas over which gradients are aggregated.|
distribute_datasets_from_function( dataset_fn, options=None )
Distributes tf.data.Dataset instances created by calls to dataset_fn.
The argument dataset_fn that users pass in is an input function that has a tf.distribute.InputContext argument and returns a tf.data.Dataset instance. It is expected that the returned dataset from dataset_fn is already batched by per-replica batch size (i.e. global batch size divided by the number of replicas in sync) and sharded. distribute_datasets_from_function does not batch or shard the tf.data.Dataset instance returned from the input function.
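A dataset_fn that does its own batching and sharding might look like the sketch below. The global batch size of 64 and the `Dataset.range` source are illustrative assumptions. The `InputContext` is constructed by hand only to exercise the function locally; in real use the strategy supplies it.

```python
import tensorflow as tf

GLOBAL_BATCH_SIZE = 64  # assumed value, for illustration only


def dataset_fn(input_context):
    # Per-replica batch size = global batch size / replicas in sync.
    batch_size = input_context.get_per_replica_batch_size(GLOBAL_BATCH_SIZE)
    dataset = tf.data.Dataset.range(1024)  # placeholder data source
    # Shard so each input pipeline reads a distinct slice of the data.
    dataset = dataset.shard(
        input_context.num_input_pipelines, input_context.input_pipeline_id)
    # drop_remainder=True keeps every batch the same static shape.
    return dataset.batch(batch_size, drop_remainder=True).prefetch(2)


# Local check with a hand-built context: 2 input pipelines, 8 replicas.
context = tf.distribute.InputContext(
    num_input_pipelines=2, input_pipeline_id=1, num_replicas_in_sync=8)
first_batch = next(iter(dataset_fn(context)))
```

With a strategy in hand, the function would be passed as `strategy.distribute_datasets_from_function(dataset_fn)` rather than called directly.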
dataset_fn will be called on the CPU
device of each of the workers and each generates a dataset whe