Optimizer that implements the Momentum algorithm.

Inherits From: Optimizer

Computes (if use_nesterov = False):

accumulation = momentum * accumulation + gradient
variable -= learning_rate * accumulation
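
The two update lines above can be traced by hand with a minimal sketch in plain Python. This is an illustration of the arithmetic only, not the optimizer's actual implementation; the function name `momentum_step` and the use of lists of floats in place of tensors are assumptions made for the example.

```python
def momentum_step(variable, accumulation, gradient, learning_rate, momentum):
    """One dense momentum update (use_nesterov = False) on lists of floats.

    Hypothetical helper for illustration; not part of the TensorFlow API.
    """
    # accumulation = momentum * accumulation + gradient
    new_accumulation = [momentum * a + g for a, g in zip(accumulation, gradient)]
    # variable -= learning_rate * accumulation
    new_variable = [v - learning_rate * a for v, a in zip(variable, new_accumulation)]
    return new_variable, new_accumulation


# Apply the same gradient twice to see the accumulation build up.
variable, accumulation = [1.0, 2.0], [0.0, 0.0]
gradient = [0.5, -1.0]
for _ in range(2):
    variable, accumulation = momentum_step(
        variable, accumulation, gradient, learning_rate=0.1, momentum=0.9
    )

print(variable, accumulation)
```

After the first step the accumulation equals the gradient; after the second it is 1.9 times the gradient, so the second step moves the variable further than the first even though the gradient has not changed.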

Note that the dense version of this algorithm updates and applies accumulation regardless of the gradient's value, so a variable keeps moving even where its gradient is zero. The sparse version (used when the gradient is an IndexedSlices, typically because of tf.gather or an embedding) updates only the variable slices, and their corresponding accumulation terms, that were used in the forward pass.
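
The difference between the dense and sparse behavior can be sketched in plain Python. The helpers `dense_step` and `sparse_step` are hypothetical names, the `indices`/`values` pair only loosely mimics an IndexedSlices gradient, and the sketch assumes each index appears at most once; it is not how the optimizer is implemented internally.

```python
def dense_step(variable, accumulation, gradient, learning_rate, momentum):
    """Dense update: every entry is touched, even where the gradient is 0."""
    for i in range(len(variable)):
        accumulation[i] = momentum * accumulation[i] + gradient[i]
        variable[i] -= learning_rate * accumulation[i]


def sparse_step(variable, accumulation, indices, values, learning_rate, momentum):
    """Sparse update: only the listed indices (assumed unique) are touched."""
    for i, g in zip(indices, values):
        accumulation[i] = momentum * accumulation[i] + g
        variable[i] -= learning_rate * accumulation[i]


# Both runs start from the same state, with nonzero accumulation left over
# from earlier steps. Only entry 0 receives a gradient in this step.
dense_var, dense_acc = [1.0, 1.0], [1.0, 1.0]
sparse_var, sparse_acc = [1.0, 1.0], [1.0, 1.0]

dense_step(dense_var, dense_acc, [0.5, 0.0], learning_rate=0.1, momentum=0.9)
sparse_step(sparse_var, sparse_acc, [0], [0.5], learning_rate=0.1, momentum=0.9)

print(dense_var, dense_acc)
print(sparse_var, sparse_acc)
```

In the dense run, entry 1 still moves and its accumulation decays, because the leftover momentum is applied with a zero gradient. In the sparse run, entry 1 and its accumulation are left exactly as they were.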

learning_rate: A Tensor or a floating point value. The learning rate.
momentum: A Tensor or a floating point value. The momentum.
use_locking: If True, use locks for update operations.
name: Optional name prefix for the operations created when applying gradients. Defaults to "Momentum".
use_nesterov: If True u