Optimizer that implements the Adagrad algorithm.

Inherits From: Optimizer


Adagrad is an optimizer with parameter-specific learning rates, which are adapted according to how frequently each parameter is updated during training. The more updates a parameter receives, the smaller its subsequent updates become.
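The behavior described above can be sketched in plain Python. This is a minimal illustration of one common formulation of the Adagrad update (accumulate squared gradients, then divide the step by the square root of the accumulator), not the library's implementation. The helper `adagrad_step` and the toy gradients are hypothetical; the argument names simply mirror the ones documented on this page, and the values used are chosen for illustration rather than taken from the library's defaults.

```python
import math

def adagrad_step(params, grads, accumulators, learning_rate=0.1, epsilon=1e-7):
    """Apply one Adagrad-style update in place; return the step taken per parameter.

    Hypothetical helper for illustration only, not part of any library API.
    """
    steps = []
    for i, g in enumerate(grads):
        # Each parameter keeps its own running sum of squared gradients.
        accumulators[i] += g * g
        # A larger accumulator shrinks the effective learning rate.
        step = learning_rate * g / (math.sqrt(accumulators[i]) + epsilon)
        params[i] -= step
        steps.append(step)
    return steps

params = [1.0, 1.0]
accumulators = [0.1, 0.1]  # plays the role of initial_accumulator_value

# Parameter 0 receives a gradient on every step; parameter 1 only on the last.
history = []
for grads in ([1.0, 0.0], [1.0, 0.0], [1.0, 1.0]):
    history.append(adagrad_step(params, grads, accumulators))

for step_number, steps in enumerate(history, start=1):
    print(step_number, steps)
```

With identical gradients on the final step, the frequently updated parameter takes a smaller step than the rarely updated one, because its accumulator has already grown.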

Args:
learning_rate: A Tensor, floating point value, or a schedule that is a tf.keras.optimizers.schedules.LearningRateSchedule. The learning rate.
initial_accumulator_value: A floating point value. Starting value for the accumulators; must be non-negative.
epsilon: A small floating point value used to avoid a zero denominator.
name: Optional name prefix for the operations created when applying gradients. Defaults to "Adagrad".
**kwargs: Keyword arguments. Allowed to be one of "clipnorm" or "clipvalue". "clipnorm" (float) clips gradients by norm; "clipvalue" (float) clips gradients by value.
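To make the difference between the two clipping options concrete, here is a small plain-Python sketch. The functions `clip_by_value` and `clip_by_norm` are hypothetical stand-ins written for this example, not library calls, and the sketch assumes the whole list is treated as a single gradient tensor.

```python
import math

def clip_by_value(grad, clipvalue):
    """Clamp every element independently into [-clipvalue, clipvalue]."""
    return [max(-clipvalue, min(clipvalue, g)) for g in grad]

def clip_by_norm(grad, clipnorm):
    """Rescale the whole gradient so its L2 norm is at most clipnorm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= clipnorm:
        return list(grad)  # already within the limit; leave unchanged
    scale = clipnorm / norm
    return [g * scale for g in grad]

grad = [3.0, -4.0]  # L2 norm is 5.0

by_value = clip_by_value(grad, 1.0)  # each element clamped on its own
by_norm = clip_by_norm(grad, 1.0)    # direction kept, length reduced to 1.0

print(by_value)
print(by_norm)
```

Clipping by value can change the direction of the gradient, since each element is clamped separately, whereas clipping by norm preserves the direction and only shortens the vector.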


Raises:
ValueError: in case of any invalid argument.