tf.compat.v1.train.AdagradOptimizer

Optimizer that implements the Adagrad algorithm.

Inherits From: Optimizer

Migrate to TF2

tf.compat.v1.train.AdagradOptimizer is compatible with eager mode and tf.function. When eager execution is enabled, learning_rate, initial_accumulator_value, and epsilon can each be a callable that takes no arguments and returns the actual value to use. This can be useful for changing these values across different invocations of optimizer functions.

To switch to native TF2 style, use tf.keras.optimizers.Adagrad instead. Because the two implementations differ, tf.keras.optimizers.Adagrad and tf.compat.v1.train.AdagradOptimizer may produce slightly different floating-point results, even though the formula used for the variable updates is the same.

Structural mapping to native TF2

Before:

optimizer = tf.compat.v1.train.AdagradOptimizer(
  learning_rate=learning_rate,
  initial_accumulator_value=initial_accumulator_value)

After:

optimizer = tf.keras.optimizers.Adagrad(
  learning_rate=learning_rate,
  initial_accumulator_value=initial_accumulator_value,
  epsilon=1e-07)

How to map arguments

TF1 Arg Name TF2 Arg Name Note
learning_rate learning_rate Be careful when setting learning_rate to a tensor value computed from the global step. In TF1 this usually implied a dynamic learning rate that was recomputed at each step. In TF2 (eager + function) it is treated as a scalar value that is computed only once, not as a symbolic placeholder recomputed each time.
initial_accumulator_value initial_accumulator_value The argument can be zero in TF2, which is not accepted in TF1.
- epsilon epsilon becomes configurable in TF2. The default value changes from 1e-8 to 1e-7.
use_locking - Not applicable in TF2.
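
The learning_rate note in the table can be illustrated in plain Python, without TensorFlow. The sketch below is only an analogy: decayed_lr and resolve are hypothetical helpers, not TensorFlow APIs, and the halving schedule is arbitrary. It contrasts a value computed once from a step counter with a zero-argument callable that is re-evaluated on every read.

```python
# Plain-Python analogy; no TensorFlow involved.
# `decayed_lr` and `resolve` are hypothetical helpers for illustration only.

step = 0  # stands in for a global step counter


def decayed_lr():
    """Learning rate that halves every 10 steps (arbitrary schedule)."""
    return 0.1 * 0.5 ** (step // 10)


def resolve(value):
    """Return value() if it is callable, otherwise the value itself."""
    return value() if callable(value) else value


frozen = decayed_lr()  # evaluated once, like a scalar computed eagerly
dynamic = decayed_lr   # zero-argument callable, evaluated on each read

frozen_history, dynamic_history = [], []
for step in (0, 10, 20):  # rebinds the module-level `step`
    frozen_history.append(resolve(frozen))
    dynamic_history.append(resolve(dynamic))
```

Here frozen_history stays at the initial rate for every step, while dynamic_history follows the schedule. When migrating, passing a callable or a learning rate schedule object instead of a precomputed value is the usual way to keep the rate dynamic.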

Before & after usage example

Before:

x = tf.Variable([1,2,3], dtype=tf.float32)
grad = tf.constant([0.1, 0.2, 0.3])
optimizer = tf.compat.v1.train.AdagradOptimizer(learning_rate=0.001)
optimizer.apply_gradients(zip([grad], [x]))

After:

x = tf.Variable([1,2,3], dtype=tf.float32)
grad = tf.constant([0.1, 0.2, 0.3])
optimizer = tf.keras.optimizers.Adagrad(learning_rate=0.001)
optimizer.apply_gradients(zip([grad], [x]))
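
To make the update concrete, here is a minimal pure-Python sketch of a single Adagrad step on the same values as the usage example. It is not TensorFlow's implementation. It assumes the update rule var -= learning_rate * grad / (sqrt(accum) + epsilon), an accumulator that starts at 0.1, and epsilon of 1e-7; adagrad_step is a hypothetical helper name.

```python
import math

# Minimal pure-Python sketch of one Adagrad step (not TensorFlow's code).
# Assumption: var -= lr * g / (sqrt(accum) + epsilon), where accum is the
# running sum of squared gradients, seeded with an initial accumulator value.


def adagrad_step(var, grad, accum, learning_rate, epsilon=1e-7):
    """Return the updated (var, accum) lists after one Adagrad step."""
    new_accum = [a + g * g for a, g in zip(accum, grad)]
    new_var = [
        v - learning_rate * g / (math.sqrt(a) + epsilon)
        for v, g, a in zip(var, grad, new_accum)
    ]
    return new_var, new_accum


x = [1.0, 2.0, 3.0]
grad = [0.1, 0.2, 0.3]
accum = [0.1, 0.1, 0.1]  # assumed initial accumulator value of 0.1

x, accum = adagrad_step(x, grad, accum, learning_rate=0.001)
```

Each element of x moves against its gradient, and larger accumulated squared gradients shrink the effective step size for that element. Exact values from the TensorFlow optimizers may differ slightly in floating point.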

Description

References:

Adaptive Subgradient Methods for Online Learning and Stochastic Optimization: Duchi et al., 2011

Args
learning_rate A Tensor or a floating point value. The learning rate.
initial_accumulator_value A floating point value. Starting value for the accumulators, must be positive.
use_locking If True, use locks for update operations.
name Optional name prefix for the operations created when applying gradients. Defaults to "Adagrad".

Raises
ValueError If the initial_accumulator_value is invalid.

Methods

apply_gradients

Apply gradients to variables.

This is the second part of minimize(). It returns an Operation that applies gradients.

Args
grads_and_vars List of (gradient, variable) pairs as returned by compute_gradients().
global_step Optional Variable to increment by one after the variables have been updated.
name Optional name for the returned operation. Defaults to the name passed to the Optimizer constructor.

Returns
An Operation that applies the specified gradients. If global_step was not None, that operation also increments global_step.