Class TPUConfig
Defined in tensorflow/contrib/tpu/python/tpu/tpu_config.py.

TPU-related configuration required by `TPUEstimator`; a usage sketch follows the argument list below.
Args:

- `iterations_per_loop`: The number of train steps running in the TPU system
  before returning to the CPU host for each `Session.run`. This means the
  global step is increased `iterations_per_loop` times in one `Session.run`.
  It is recommended to set this to the number of global steps until the next
  checkpoint.
- `num_shards`: (Deprecated, ignored by `TPUEstimator`.) The number of model
  replicas in the system. For the non-model-parallelism case, this number
  equals the total number of TPU cores. For model parallelism, the total
  number of TPU cores equals `num_cores_per_replica` * `num_shards`.
- `num_cores_per_replica`: Defaults to `None`, which disables model
  parallelism. An integer describing the number of TPU cores per model
  replica. This is required for model parallelism, which enables partitioning
  the model across multiple cores. Currently, `num_cores_per_replica` must be
  1, 2, 4, or 8.
- `per_host_input_for_training`: If `True`, `PER_HOST_V1`, or `PER_HOST_V2`,
  `input_fn` is invoked once on each host. With the per-core input pipeline
  configuration, it is invoked once for each core. With a global batch size
  `train_batch_size` in the `TPUEstimator` constructor, the batch size for
  each shard is `train_batch_size` // #hosts in the `True` or `PER_HOST_V1`
  mode, and `train_batch_size` // #cores in `PER_HOST_V2` mode. In
  `BROADCAST` mode, `input_fn` is invoked only once on host 0 and the tensors
  are broadcast to all other replicas; the batch size equals
  `train_batch_size`. With the per-core input pipeline configuration, the
  shard batch size is also `train_batch_size` // #cores.
  Note: `per_host_input_for_training == PER_SHARD_V1` only supports
  `mode.TRAIN`.
- `tpu_job_name`: The name of the TPU job. Typically, this name is
  auto-inferred within `TPUEstimator`; however, when using ClusterSpec
  propagation in more esoteric cluster configurations, you may need to
  specify the job name as a string.
- `initial_infeed_sleep_secs`: The number of seconds the infeed thread should
  wait before enqueueing the first batch. This helps avoid timeouts for
  models that require a long compilation time.

Raises:
- `ValueError`: If `num_cores_per_replica` is not 1, 2, 4, or 8.
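For orientation, here is a minimal usage sketch showing how a `TPUConfig` is typically passed to `TPUEstimator` through `tf.contrib.tpu.RunConfig`. It assumes a TF 1.x environment with `tf.contrib.tpu` available; `my_model_fn`, `my_input_fn`, the GCS path, and the batch/step values are hypothetical placeholders, not part of this API reference.

```python
import tensorflow as tf

def my_model_fn(features, labels, mode, params):
    # Hypothetical placeholder: build the model here and return a
    # tf.contrib.tpu.TPUEstimatorSpec.
    raise NotImplementedError

def my_input_fn(params):
    # Hypothetical placeholder: build a tf.data pipeline that batches with
    # params['batch_size'] (the per-shard batch size computed by TPUEstimator).
    raise NotImplementedError

run_config = tf.contrib.tpu.RunConfig(
    # In a real setup, also pass cluster=tf.contrib.cluster_resolver.TPUClusterResolver(...).
    model_dir='gs://my-bucket/model',  # hypothetical checkpoint location
    save_checkpoints_steps=1000,
    tpu_config=tf.contrib.tpu.TPUConfig(
        iterations_per_loop=1000,  # match the checkpoint interval, as recommended above
        per_host_input_for_training=tf.contrib.tpu.InputPipelineConfig.PER_HOST_V2,
    ),
)

estimator = tf.contrib.tpu.TPUEstimator(
    model_fn=my_model_fn,
    config=run_config,
    use_tpu=True,
    train_batch_size=1024,  # global batch size; each shard sees 1024 // #cores in PER_HOST_V2
)
# estimator.train(input_fn=my_input_fn, max_steps=100000) would then run training
# in chunks of iterations_per_loop steps per Session.run.
```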
Properties
- `initial_infeed_sleep_secs`: Alias for field number 5
- `iterations_per_loop`: Alias for field number 0
- `num_cores_per_replica`: Alias for field number 2
- `num_shards`: Alias for field number 1
- `per_host_input_for_training`: Alias for field number 3
- `tpu_job_name`: Alias for field number 4
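The "Alias for field number N" wording reflects that `TPUConfig` behaves like a `collections.namedtuple`, so each property can also be read by positional index. A small illustration (a sketch assuming TF 1.x with `tf.contrib.tpu` importable; the values are arbitrary):

```python
import tensorflow as tf

config = tf.contrib.tpu.TPUConfig(iterations_per_loop=100, num_shards=8)

# Attribute access and positional (field-number) access name the same values.
assert config.iterations_per_loop == config[0] == 100
assert config.num_shards == config[1] == 8
assert config.tpu_job_name is None and config[4] is None
```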
Methods
__new__
```python
@staticmethod
__new__(
    cls,
    iterations_per_loop=2,
    num_shards=None,
    num_cores_per_replica=None,
    per_host_input_for_training=True,
    tpu_job_name=None,
    initial_infeed_sleep_secs=None
)
```
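All constructor arguments are optional, so `TPUConfig()` is valid and yields the defaults shown in the signature above; and because the class is namedtuple-based, the standard `_replace` method can derive a modified copy. A brief sketch (the `tpu_job_name` value below is a hypothetical example):

```python
import tensorflow as tf

default_config = tf.contrib.tpu.TPUConfig()
print(default_config.iterations_per_loop)    # 2 (default from the signature above)
print(default_config.num_cores_per_replica)  # None, i.e. model parallelism disabled

# namedtuple's _replace returns a new TPUConfig with only the given fields changed.
tuned = default_config._replace(iterations_per_loop=500, tpu_job_name='tpu_worker')
print(tuned.iterations_per_loop)             # 500
```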