Returns a distributed TPU mesh optimized for AllReduce ring reductions.
tf.experimental.dtensor.create_tpu_mesh(
mesh_dim_names: List[str],
mesh_shape: List[int],
mesh_name: str,
ring_dims: Optional[int] = None,
ring_axes: Optional[List[str]] = None,
ring_bounds: Optional[List[int]] = None,
can_split_host_across_rings: bool = True,
build_ring_across_rings: bool = False,
rotate_ring_across_rings: bool = False,
use_xla_spmd: bool = layout_lib.USE_XLA_SPMD
) -> tf.experimental.dtensor.Mesh
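For orientation, here is a minimal usage sketch. It assumes a client already connected to an initialized slice of 8 TPU cores; the mesh name and dimension names are placeholders, not requirements.

from tensorflow.experimental import dtensor

# Assumes the TPU system has already been brought up, e.g. with
# dtensor.initialize_accelerator_system("TPU"), and that 8 TPU cores
# are visible to this client.
mesh = dtensor.create_tpu_mesh(
    mesh_dim_names=["batch", "model"],  # placeholder dimension names
    mesh_shape=[4, 2],                  # 4 * 2 = 8 cores
    mesh_name="example_mesh")
print(mesh)  # a tf.experimental.dtensor.Mesh spanning all 8 TPU devices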
Only as many of the leading axes specified by ring_axes as necessary are used to
build rings: as soon as the subslice formed by those axes has enough cores to
contain a ring of the required size, the leftover axes in ring_axes won't affect
the result.
This function always uses all TPU devices, and offers more customization than
tf.experimental.dtensor.create_distributed_mesh.
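As a hedged sketch of the ring customization described above (the 8-core slice, mesh shape, and names are assumptions, not requirements):

from tensorflow.experimental import dtensor

# Build rings only for the trailing "model" dimension (ring_dims=-1),
# preferring the "core" and "x" topology axes when laying rings out.
# Assumes an already-initialized 8-core TPU slice.
mesh = dtensor.create_tpu_mesh(
    mesh_dim_names=["batch", "model"],
    mesh_shape=[2, 4],
    mesh_name="ring_mesh",
    ring_dims=-1,
    ring_axes=["core", "x", "y", "z"])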
Args
  mesh_dim_names: List of mesh dimension names.
  mesh_shape: Shape of the mesh.
  mesh_name: A unique name for the mesh. If empty, one is generated internally.
  ring_dims: Optional; the number of leading (ring_dims > 0) or trailing
    (ring_dims < 0) mesh dimensions to build rings for. If unspecified, rings
    are built for all but the first dimension.
  ring_axes: Optional; a permutation of ["x", "y", "z", "core"], specifying the
    order of TPU topology axes to build rings in. If unspecified, defaults to
    ["core", "x", "y", "z"].
  ring_bounds: Optional; the maximum number of devices on each axis, in x, y, z,
    core order. If unspecified, defaults to the physical topology limits.
  can_split_host_across_rings: Optional; if True, devices attached to the same
    host (i.e., DTensor client) may be assigned to different rings. Setting it
    to False may make some combinations of arguments infeasible; see
    DeviceAssignmentTest.testCreateMesh[No]SplittingHosts* for examples.
  build_ring_across_rings: Optional; if True, also build a data-parallel ring
    across the model-parallel rings. This ring could be strided.
  rotate_ring_across_rings: Optional; if True, build the data-parallel ring in
    column-major instead of row-major order.
  use_xla_spmd: Boolean; if True, use XLA SPMD instead of DTensor SPMD.
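Putting several of these flags together, a hedged sketch of a data-parallel-over-model-parallel configuration (again assuming an 8-core slice; the names and shape are placeholders, and the host constraint may be infeasible on some topologies):

from tensorflow.experimental import dtensor

# Rings are built for the trailing "model" dimension; an extra
# data-parallel ring is built across those rings, and each host's
# devices are kept within a single ring (which may be infeasible on
# some topologies, as noted above).
mesh = dtensor.create_tpu_mesh(
    mesh_dim_names=["batch", "model"],
    mesh_shape=[2, 4],
    mesh_name="dp_over_mp_mesh",
    ring_dims=-1,
    build_ring_across_rings=True,
    can_split_host_across_rings=False)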