Prototype of a distributed computation library for TensorFlow.


class AllReduceCrossTowerOps: Reduction using all reduce.

class CrossTowerOps: Base class for cross-tower reduction and broadcasting algorithms.

class DistributionStrategy: A list of devices with a state & compute distribution policy.

class MirroredStrategy: Mirrors variables across multiple devices on a single machine.

class Monitor: Executes training steps and handles recovery and checkpointing.

class OneDeviceStrategy: A distribution strategy for running on a single device.

class ReductionToOneDeviceCrossTowerOps: Always reduces to one device first, then broadcasts the result.

class StandardInputStep: Step with a standard implementation of input handling.

class StandardSingleLossStep: A step function that implements a training step for a feed-forward network.

class Step: Interface for performing each step of a training algorithm.

class TowerContext: DistributionStrategy API inside a call_for_each_tower() call.
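
The two cross-tower reduction classes listed above differ in where the reduction happens. The following is a minimal pure-Python sketch of that difference only. It does not use TensorFlow, the function names `reduce_to_one_device_then_broadcast` and `all_reduce_sum` are hypothetical and not part of this library, and the library's actual implementations are not shown in this listing. Per-device values are modeled as a dict from a device name to a number.

```python
def reduce_to_one_device_then_broadcast(per_device, reduce_device):
    """Sum every value on one device, then copy the result to all devices."""
    if reduce_device not in per_device:
        raise ValueError("reduce_device must be one of the participating devices")
    # Step 1: reduction. Conceptually, every value is sent to `reduce_device`
    # and summed there (the sum below stands in for that transfer).
    total = sum(per_device.values())
    # Step 2: broadcast. Every device receives a copy of the reduced value.
    return {device: total for device in per_device}


def all_reduce_sum(per_device):
    """Ring-style all-reduce: no single device collects everything first."""
    devices = list(per_device)
    n = len(devices)
    acc = [per_device[d] for d in devices]  # running sum held by each device
    msg = list(acc)                         # value each device forwards next
    for _ in range(n - 1):
        # Each device i receives the message held by its left neighbour ...
        msg = [msg[(i - 1) % n] for i in range(n)]
        # ... and adds it to its own running sum.
        acc = [a + m for a, m in zip(acc, msg)]
    return dict(zip(devices, acc))
```

Both functions leave every device holding the same sum. The first concentrates traffic on one device, while the second spreads it across neighbours.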


get_cross_tower_context(...): Returns the current DistributionStrategy if in a cross-tower context.

get_distribution_strategy(...): Returns the current DistributionStrategy object.

get_loss_reduction(...): Returns the reduce method_string corresponding to the last loss reduction.

get_tower_context(...): Returns the current TowerContext or None if in a cross-tower context.

has_distribution_strategy(...): Returns whether there is a current non-default DistributionStrategy.

require_tower_context(...): Verifies that the caller is in the tower_ctx tower context.

