Returns a list of tensors with the all-reduce sum across the input tensors.
The computation is done with an all-reduce operation, which is a collective in which every participating device must take part. If only some of the returned tensors are evaluated, the computation will hang.
tensors: The input tensors across which to sum; must be assigned to GPU devices.
List of tensors, each with the sum of the input tensors, where tensor i has
the same device as the i-th input tensor.
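
The semantics described above can be illustrated with a minimal CPU-only sketch. It is not the real operation: `simulated_all_sum` is a hypothetical helper, plain Python lists stand in for per-device tensors, and no GPUs or collective communication are involved. It only shows that every output holds the same elementwise sum and that there is one output per input.

```python
from typing import List, Sequence


def simulated_all_sum(tensors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Hypothetical stand-in for an all-reduce sum (illustration only).

    Each inner sequence plays the role of one device's input tensor.
    """
    if not tensors:
        return []
    length = len(tensors[0])
    if any(len(t) != length for t in tensors):
        raise ValueError("all inputs must have the same shape")
    # Elementwise sum across all inputs.
    total = [sum(values) for values in zip(*tensors)]
    # One independent copy of the sum per input; output i corresponds to input i.
    return [list(total) for _ in tensors]


# Three "devices", each contributing a two-element tensor.
inputs = [[1.0, 2.0], [10.0, 20.0], [100.0, 200.0]]
outputs = simulated_all_sum(inputs)
```

In the real operation the outputs are produced jointly by all devices, which is why evaluating only a subset of them blocks; this sketch computes everything in one process and therefore cannot reproduce that hang.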