tf.distribute.cluster_resolver.GCEClusterResolver

ClusterResolver for Google Compute Engine.

Inherits From: ClusterResolver

This is a ClusterResolver implementation for Google Compute Engine (GCE) instance groups. Given a project, zone, and instance group, it retrieves the IP addresses of all instances in the group and returns a ClusterResolver object suitable for use with distributed TensorFlow.
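As a rough illustration of what the resolver produces, the sketch below turns a list of instance IP addresses into the job-to-address mapping that a cluster spec holds. The helper build_cluster_spec and the sample addresses are hypothetical and never call the GCE API; sorting the addresses is an assumption made here so that every VM would derive the same task ordering.

```python
def build_cluster_spec(instance_ips, port=8470, task_type="worker"):
    """Map one job name to a sorted list of "ip:port" task addresses.

    Hypothetical helper for illustration only; not part of TensorFlow.
    """
    # Sort (lexicographically) so that every VM computes an identical ordering.
    addresses = sorted(f"{ip}:{port}" for ip in instance_ips)
    return {task_type: addresses}


# Sample private addresses, made up for this example.
cluster = build_cluster_spec(["10.128.0.3", "10.128.0.2"])
print(cluster)
# {'worker': ['10.128.0.2:8470', '10.128.0.3:8470']}
```

In this sketch, a task's index is simply its position in the list for its job, which is why each VM needs a distinct task_id.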

Usage example with tf.distribute.Strategy:

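import tensorflow as tf

GCEClusterResolver = tf.distribute.cluster_resolver.GCEClusterResolver

# On worker 0
# The zone must be a full zone name (for example "us-west1-a"), not a region.
# The port is passed explicitly; 8470 is the documented default.
cluster_resolver = GCEClusterResolver("my-project", "us-west1-a",
                                      "my-instance-group", port=8470,
                                      task_type="worker", task_id=0)
strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy(
    cluster_resolver=cluster_resolver)

# On worker 1 (same arguments, but with this VM's own task_id)
cluster_resolver = GCEClusterResolver("my-project", "us-west1-a",
                                      "my-instance-group", port=8470,
                                      task_type="worker", task_id=1)
strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy(
    cluster_resolver=cluster_resolver)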

Args:

project: Name of the GCE project.
zone: Zone of the GCE instance group.
instance_group: Name of the GCE instance group.
port: Port of the listening TensorFlow server (default: 8470).
task_type: Name of the TensorFlow job that the VM instances in this GCE instance group belong to.
task_id: The task index for this particular VM within the GCE instance group. Each instance must be manually assigned a unique ordinal index within its instance group so that the instances can be distinguished from one another.
rpc_layer: The RPC layer TensorFlow should use to communicate across instances.
credentials: GCE credentials. If nothing is specified, this defaults to GoogleCredentials.get_application_default().
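Because task_id values are assigned by hand, a quick sanity check before launching can catch duplicates or gaps. The sketch below is one possible check, not something TensorFlow provides: validate_task_ids and the instance names are hypothetical, and it assumes that indices are expected to run from 0 to n-1 for a group of n instances.

```python
def validate_task_ids(assignments):
    """Check that manually assigned task ids are exactly 0..n-1, each used once.

    `assignments` maps a (hypothetical) instance name to its chosen task_id.
    Raises ValueError if an index is duplicated, missing, or out of range.
    """
    ids = sorted(assignments.values())
    expected = list(range(len(assignments)))
    if ids != expected:
        raise ValueError(f"task ids {ids} must be exactly {expected}")
    return True


# Made-up instance names for a two-VM instance group.
validate_task_ids({"my-instance-group-a": 0, "my-instance-group-b": 1})
```

A failing assignment such as {"my-instance-group-a": 0, "my-instance-group-b": 0} raises ValueError in this sketch, since two VMs would claim the same task.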