
tf.distribute.ReplicaContext

A class with a collection of APIs that can be called in a replica context.

You can use tf.distribute.get_replica_context to get an instance of ReplicaContext, which can only be called inside the function passed to tf.distribute.Strategy.run.

strategy = tf.distribute.MirroredStrategy(['GPU:0', 'GPU:1'])
def func():
  replica_context = tf.distribute.get_replica_context()
  return replica_context.replica_id_in_sync_group
strategy.run(func)
PerReplica:{
  0: <tf.Tensor: shape=(), dtype=int32, numpy=0>,
  1: <tf.Tensor: shape=(), dtype=int32, numpy=1>
}

Args

strategy A tf.distribute.Strategy.
replica_id_in_sync_group An integer, a Tensor or None. Prefer an integer whenever possible to avoid issues with nested tf.function. It accepts a Tensor only to be compatible with tpu.replicate.

Attributes

devices Returns the devices this replica is to be executed on, as a tuple of strings. (deprecated)

num_replicas_in_sync Returns number of replicas that are kept in sync.
replica_id_in_sync_group Returns the id of the replica.

This identifies the replica among all replicas that are kept in sync. The value of the replica id can range from 0 to tf.distribute.ReplicaContext.num_replicas_in_sync - 1. A typical use is shown in the sketch after this table.

strategy The current tf.distribute.Strategy object.
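
The attributes above are often used together to split per-replica work. The following is a minimal sketch, not taken from the original docstring; the function name shard_for_replica is hypothetical, and it simply slices a mirrored batch by replica id:

strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"])
@tf.function
def shard_for_replica(batch):
  ctx = tf.distribute.get_replica_context()
  # replica_id_in_sync_group ranges from 0 to num_replicas_in_sync - 1,
  # so it can be used directly to pick this replica's slice of the batch.
  num_replicas = ctx.num_replicas_in_sync
  replica_id = ctx.replica_id_in_sync_group
  shard_size = tf.shape(batch)[0] // num_replicas
  return batch[replica_id * shard_size:(replica_id + 1) * shard_size]
batch = tf.reshape(tf.range(8), (8, 1))
strategy.run(shard_for_replica, args=(batch,))
# Replica 0 gets rows 0-3, replica 1 gets rows 4-7.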

Methods

all_gather


All-gathers value across all replicas along axis.

For all strategies except tf.distribute.TPUStrategy, the input value on different replicas must have the same rank, and their shapes must be the same in all dimensions except the axis-th dimension. In other words, their shapes cannot differ in any dimension d where d is not equal to the axis argument. For example, given a tf.distribute.DistributedValues with component tensors of shape (1, 2, 3) and (1, 3, 3) on two replicas, you can call all_gather(..., axis=1, ...) on it, but not all_gather(..., axis=0, ...) or all_gather(..., axis=2, ...). With tf.distribute.TPUStrategy, however, all tensors must have exactly the same rank and shape.

You can pass in a single tensor to all-gather:

strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"])
@tf.function
def gather_value():
  ctx = tf.distribute.get_replica_context()
  local_value = tf.constant([1, 2, 3])
  return ctx.all_gather(local_value, axis=0)
result = strategy.run(gather_value)
result
PerReplica:{
  0: <tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 2, 3, 1, 2, 3], dtype=int32)>,
  1: <tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 2, 3, 1, 2, 3], dtype=int32)>
}
strategy.experimental_local_results(result)
(<tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 2, 3, 1, 2, 3],
dtype=int32)>,
<tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 2, 3, 1, 2, 3],
dtype=int32)>)
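
As described above, on strategies other than tf.distribute.TPUStrategy the per-replica components may also differ in size along axis. The following is a minimal, untested sketch assuming two replicas; experimental_distribute_values_from_function is used here only to build per-replica inputs of different lengths:

strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"])
# Per-replica inputs of shape (1, 2) and (2, 2): they differ only in the
# 0-th dimension, so gathering along axis=0 is allowed on non-TPU strategies.
def value_fn(value_ctx):
  return tf.ones((value_ctx.replica_id_in_sync_group + 1, 2))
per_replica_value = strategy.experimental_distribute_values_from_function(value_fn)
@tf.function
def gather_uneven(value):
  ctx = tf.distribute.get_replica_context()
  return ctx.all_gather(value, axis=0)
strategy.run(gather_uneven, args=(per_replica_value,))
# Each replica receives the same (3, 2) tensor containing rows from both replicas.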

You can also pass in a nested structure of tensors to all-gather, say, a list:

strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"])
@tf.function
def gather_nest():
  ctx = tf.distribute.get_replica_context()
  value_1 = tf.constant([1, 2, 3])
  value_2 = tf.constant([[1, 2], [3, 4]])
  # all_gather a nest of `tf.distribute.DistributedValues`
  return ctx.all_gather([value_1, value_2], axis=0)
result = strategy.run(gather_nest)
result
[PerReplica:{
  0: <tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 2, 3, 1, 2, 3], dtype=int32)>,
  1: <tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 2, 3, 1, 2, 3], dtype=int32)>
}, PerReplica:{
  0: <tf.Tensor: shape=(4, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4],
       [1, 2],
       [3, 4]], dtype=int32)>,
  1: <tf.Tensor: shape=(4, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4],
       [1, 2],
       [3, 4]], dtype=int32)>
}]
strategy.experimental_local_results(result)
([<tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 2, 3, 1, 2, 3], dtype=int32)>,
<tf.Tensor: shape=(4, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4],
       [1, 2],
       [3, 4]], dtype=int32)>],
[<tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 2, 3, 1, 2, 3], dtype=int32)>,
<tf.Tensor: shape=(4, 2), dtype=int32, numpy=
array([[1, 2],
       [3, 4],
       [1, 2],
       [3, 4]], dtype=int32)>])