TPU-related configuration required by TPUEstimator.
iterations_per_loop: The number of training steps that run on the TPU system before control returns to the CPU host for each Session.run call. The global step therefore increases by iterations_per_loop in a single Session.run. It is recommended to set this to the number of global steps until the next checkpoint.
num_shards: (Deprecated, ignored by TPUEstimator). The number of model replicas in the system. Without model parallelism, this number equals the total number of TPU cores. With model parallelism, the total number of TPU cores equals product(computation_shape) * num_shards.
num_cores_per_replica: Defaults to None, which disables model parallelism. An integer giving the number of TPU cores per model replica. It is required for model parallelism, which partitions the model across multiple cores. Currently num_cores_per_replica must be 1, 2, 4, or 8.
per_host_input_for_training: If True, PER_HOST_V1, or PER_HOST_V2, input_fn is invoked once on each host. With the per-core input pipeline configuration, it is invoked once for each core. With a global batch size train_batch_size in the TPUEstimator constructor, the batch size for each shard is train_batch_size // #hosts in the True or PER_HOST_V1 mode. In PER_HOST_V2 mode, it is train_batch_size // #cores. In BROADCAST mode, input_fn is invoked only once, on host 0, and the tensors are broadcast to all other replicas; the batch size equals train_batch_size. With the per-core input pipeline configuration, the shard batch size is also train_batch_size // #cores.
Note: per_host_input_for_training==PER_SHARD_V1 only supports mode.TRAIN.
tpu_job_name: The name of the TPU job. Typically, this name is auto-inferred within TPUEstimator; however, when using ClusterSpec propagation in more esoteric cluster configurations, you may need to specify the job name as a string.
initial_infeed_sleep_secs: The number of seconds the infeed thread should wait before enqueueing the first batch. This helps avoid timeouts for models that require a long compilation time.
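The arithmetic described for iterations_per_loop and per_host_input_for_training can be sketched in plain Python. This is only an illustration under stated assumptions, not code from TPUEstimator: the helper names are hypothetical, and the input modes are represented by string labels (or True/False) rather than by the library's own enum values.

```python
def global_step_after(session_runs, iterations_per_loop):
    """Global step reached after a number of Session.run calls.

    Each Session.run executes iterations_per_loop training steps on the TPU,
    so the global step advances by that amount per call.
    """
    return session_runs * iterations_per_loop


def shard_batch_size(train_batch_size, mode, num_hosts, num_cores):
    """Per-shard batch size for a given input mode, following the text above.

    `mode` is a hypothetical label: True or "PER_HOST_V1" (per host),
    "PER_HOST_V2", "BROADCAST", or False / "PER_SHARD_V1" (per core).
    """
    if mode is True or mode == "PER_HOST_V1":
        # input_fn runs once per host; the global batch is split across hosts.
        return train_batch_size // num_hosts
    if mode == "PER_HOST_V2" or mode is False or mode == "PER_SHARD_V1":
        # The shard batch size is the global batch divided across cores.
        return train_batch_size // num_cores
    if mode == "BROADCAST":
        # input_fn runs once on host 0 and sees the full global batch.
        return train_batch_size
    raise ValueError("unknown input mode: %r" % (mode,))


if __name__ == "__main__":
    # Assumed topology for illustration: 4 hosts with 8 cores each.
    for mode in (True, "PER_HOST_V2", "BROADCAST", False):
        print(mode, shard_batch_size(1024, mode, num_hosts=4, num_cores=32))
```

With iterations_per_loop equal to the checkpoint interval, a single Session.run call advances the global step exactly to the next checkpoint, which is the setting the description recommends.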
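initial_infeed_sleep_secs: Alias for field number 5
iterations_per_loop: Alias for field number 0
num_cores_per_replica: Alias for field number 2
num_shards: Alias for field number 1
per_host_input_for_training: Alias for field number 3
tpu_job_name: Alias for field number 4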
@staticmethod
__new__(cls, iterations_per_loop=2, num_shards=None, num_cores_per_replica=None, per_host_input_for_training=True, tpu_job_name=None, initial_infeed_sleep_secs=None)
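The "Alias for field number" entries suggest a namedtuple-style layout, where each property is a named view of a positional field. The sketch below mirrors the signature above with a stand-in type so that it runs without TensorFlow. TPUConfigSketch is a hypothetical name, not the library class, and the real constructor may perform validation that this stand-in does not.

```python
import collections

# Field order taken from the __new__ signature above (after cls).
_FIELDS = [
    "iterations_per_loop",          # field 0
    "num_shards",                   # field 1
    "num_cores_per_replica",        # field 2
    "per_host_input_for_training",  # field 3
    "tpu_job_name",                 # field 4
    "initial_infeed_sleep_secs",    # field 5
]

# Hypothetical stand-in with the same defaults as the documented signature.
TPUConfigSketch = collections.namedtuple(
    "TPUConfigSketch",
    _FIELDS,
    defaults=(2, None, None, True, None, None),
)

# Override only what differs from the defaults.
config = TPUConfigSketch(iterations_per_loop=100, initial_infeed_sleep_secs=30)

# Attribute access and positional access refer to the same value.
assert config.iterations_per_loop == config[0]
assert config.initial_infeed_sleep_secs == config[5]
```

Because the stand-in is an immutable tuple, changing a setting means building a new value, for example with config._replace(iterations_per_loop=500).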