tf.contrib.all_reduce.build_ring_all_reduce

tf.contrib.all_reduce.build_ring_all_reduce(
    input_tensors,
    num_workers,
    num_subchunks,
    gpu_perm,
    red_op,
    un_op=None
)

Defined in tensorflow/contrib/all_reduce/python/all_reduce.py.

Construct a subgraph performing a ring-style all-reduce of input_tensors.

Args:

  • input_tensors: a list of T tf.Tensor objects, which must all have the same shape and type.
  • num_workers: number of worker tasks spanned by input_tensors.
  • num_subchunks: number of subchunks each device should process in one tick.
  • gpu_perm: a list of ints giving a ring-wise rank ordering of GPUs at each worker. All workers must have the same number of GPUs with the same rank ordering. If NVLink is available, this should be a ring order supported by NVLink edges.
  • red_op: a binary operator for elementwise reduction.
  • un_op: an optional unary operator to apply to fully reduced values.
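To make the role of gpu_perm concrete, here is a minimal pure-Python sketch of how a per-worker GPU ordering could be expanded into one global ring. It is an illustration only, not the library's internal code: the helpers ring_order and ring_neighbors are hypothetical, and the sketch assumes devices are numbered worker-major (device index = worker * num_gpus + gpu).

```python
def ring_order(num_workers, gpu_perm):
    """Hypothetical helper: expand a per-worker GPU order into a global ring.

    Assumes worker-major device numbering, which is an assumption of this
    sketch and not something the documentation above specifies.
    """
    num_gpus = len(gpu_perm)
    if sorted(gpu_perm) != list(range(num_gpus)):
        raise ValueError("gpu_perm must be a permutation of 0..num_gpus-1")
    # Visit every worker in turn, walking its GPUs in gpu_perm order.
    return [w * num_gpus + g for w in range(num_workers) for g in gpu_perm]


def ring_neighbors(order):
    """Map each device to its (predecessor, successor) on the ring."""
    n = len(order)
    return {order[i]: (order[i - 1], order[(i + 1) % n]) for i in range(n)}


# Two workers with four GPUs each, using the same ring order on both.
order = ring_order(num_workers=2, gpu_perm=[0, 1, 3, 2])
neighbors = ring_neighbors(order)
```

Because every worker reuses the same gpu_perm, the ring crosses a worker boundary only once per worker, and all remaining hops stay between GPUs of the same worker.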

Raises:

  • ValueError: if input_tensors is empty or its tensors do not all have the same size.

Returns:

A list of T tf.Tensor objects, each holding the identical reduction of input_tensors (a sum-reduction when red_op is addition).
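The general ring all-reduce technique can be sketched without TensorFlow. The simulation below is a simplified, single-process illustration and not the code in all_reduce.py: the function ring_all_reduce is hypothetical, it uses plain Python lists in place of tensors, it ignores num_subchunks and gpu_perm, and it assumes the usual two phases (scatter-reduce, then all-gather) with un_op applied to each fully reduced chunk before it is redistributed.

```python
from operator import add


def ring_all_reduce(inputs, red_op, un_op=None):
    """Simulate a ring all-reduce over len(inputs) devices (hypothetical sketch)."""
    n = len(inputs)
    if n == 0:
        raise ValueError("inputs must be non-empty")
    size = len(inputs[0])
    if any(len(x) != size for x in inputs):
        raise ValueError("all inputs must have the same size")

    # Chunk c covers indices bounds[c]:bounds[c + 1]; there is one chunk per device.
    bounds = [(c * size) // n for c in range(n + 1)]
    bufs = [list(x) for x in inputs]  # each device's private working copy

    # Phase 1, scatter-reduce: at step s, device d sends chunk (d - s) % n
    # to device (d + 1) % n, which reduces it into its own copy.
    for s in range(n - 1):
        sends = []
        for d in range(n):
            c = (d - s) % n
            sends.append((c, bufs[d][bounds[c]:bounds[c + 1]]))
        for d, (c, payload) in enumerate(sends):
            dst = (d + 1) % n
            for i, v in enumerate(payload):
                bufs[dst][bounds[c] + i] = red_op(bufs[dst][bounds[c] + i], v)

    # Device d now holds the fully reduced chunk (d + 1) % n.
    if un_op is not None:
        for d in range(n):
            c = (d + 1) % n
            for i in range(bounds[c], bounds[c + 1]):
                bufs[d][i] = un_op(bufs[d][i])

    # Phase 2, all-gather: at step s, device d forwards chunk (d + 1 - s) % n
    # to device (d + 1) % n, which overwrites its copy of that chunk.
    for s in range(n - 1):
        sends = []
        for d in range(n):
            c = (d + 1 - s) % n
            sends.append((c, bufs[d][bounds[c]:bounds[c + 1]]))
        for d, (c, payload) in enumerate(sends):
            dst = (d + 1) % n
            bufs[dst][bounds[c]:bounds[c + 1]] = payload

    return bufs


data = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50], [100, 200, 300, 400, 500]]
summed = ring_all_reduce(data, red_op=add)
averaged = ring_all_reduce(data, red_op=add, un_op=lambda v: v / len(data))
```

With red_op set to addition, every device ends up with the same elementwise sum; supplying a un_op that divides by the number of inputs turns the result into an elementwise mean.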