Defined in tensorflow/contrib/all_reduce/python/

Construct a subgraph for shuffle all-reduce.

Shuffle reduce is essentially the algorithm implemented when using parameter servers. Suppose tensor length is n, there are d devices and g gather shards. Each device sends a n/g length sub-tensor to each gather shard. The gather shards perform a reduction across d fragments, then broadcast the result back to each device. The devices then join the g fully reduced fragments they receive from the shards. The gather shards could perform d-1 pairwise reductions, or one d-way reduction. The first is better where reduction Op time is low compared to transmission time, the second better in the other case.


  • input_tensors: list of T @(tf.Tensor} values to be reduced.
  • gather_devices: list of names of devices on which reduction shards should be placed.
  • red_op: an n-array elementwise reduction Op
  • un_op: optional elementwise unary Op to be applied to fully-reduced values.


list of T tf.Tensor which are the fully reduced tensors.