Optimization parameters for Adagrad with TPU embeddings.

    learning_rate, initial_accumulator=0.1, use_gradient_accumulation=True,
    clip_weight_min=None, clip_weight_max=None, weight_decay_factor=None,

Pass this to tf.estimator.tpu.experimental.EmbeddingConfigSpec via the optimization_parameters argument to set the optimizer and its parameters. See the documentation for tf.estimator.tpu.experimental.EmbeddingConfigSpec for more details.

estimator = tf.estimator.tpu.TPUEstimator(


  • learning_rate: the learning rate used when updating the embedding table.
  • initial_accumulator: the initial value of the Adagrad accumulator.
  • use_gradient_accumulation: setting this to False makes the embedding gradient calculation faster but less accurate. See optimization_parameters.proto for details.
  • clip_weight_min: the minimum value to clip by; None means -infinity.
  • clip_weight_max: the maximum value to clip by; None means +infinity.
  • weight_decay_factor: amount of weight decay to apply; None means that the weights are not decayed.
  • multiply_weight_decay_factor_by_learning_rate: if True, weight_decay_factor is multiplied by the current learning rate before it is applied.
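
To make the roles of these arguments concrete, the following is a minimal pure-Python sketch of a single scalar Adagrad step. It is an illustration under stated assumptions, not the TPU embedding implementation: the function name adagrad_step is hypothetical, and the order in which the gradient update, weight decay, and clipping are applied here is assumed for illustration. optimization_parameters.proto remains the authoritative description.

```python
import math


def adagrad_step(weight, grad, accumulator, learning_rate,
                 clip_weight_min=None, clip_weight_max=None,
                 weight_decay_factor=None,
                 multiply_weight_decay_factor_by_learning_rate=False):
    """One scalar Adagrad step (hypothetical helper, illustrative only).

    `accumulator` starts at `initial_accumulator` and grows by the squared
    gradient on every step. The ordering of update, decay, and clipping is
    an assumption of this sketch, not a statement about the TPU kernel.
    """
    # Standard Adagrad: accumulate squared gradients, then scale the step.
    accumulator = accumulator + grad * grad
    weight = weight - learning_rate * grad / math.sqrt(accumulator)

    # Optional weight decay; None means the weight is not decayed.
    if weight_decay_factor is not None:
        decay = weight_decay_factor
        if multiply_weight_decay_factor_by_learning_rate:
            decay = decay * learning_rate
        weight = weight * (1.0 - decay)

    # Optional clipping; None means no bound on that side.
    if clip_weight_min is not None:
        weight = max(weight, clip_weight_min)
    if clip_weight_max is not None:
        weight = min(weight, clip_weight_max)

    return weight, accumulator


# Usage: one step starting from the default initial_accumulator of 0.1.
new_weight, new_accumulator = adagrad_step(
    weight=1.0, grad=0.3, accumulator=0.1, learning_rate=0.1)
```

Because the accumulator only grows, the effective step size shrinks over time; a larger initial_accumulator damps the earliest updates.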