Optimizer that implements the Adam algorithm.

Inherits From: Optimizer

Adam optimization is a stochastic gradient descent method based on adaptive estimates of the first-order and second-order moments of the gradients.
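
To make the moment estimates concrete, the following is a minimal NumPy sketch of a single Adam update, written after the algorithm described by Kingma et al. rather than taken from this class's implementation, which may differ in details such as where `epsilon` is applied. The function name `adam_step` and the toy objective are hypothetical, and the default hyperparameter values shown are assumed for illustration.

```python
import numpy as np


def adam_step(param, grad, m, v, t,
              lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-7):
    """One Adam update (illustrative sketch, not the library's code).

    m and v are running estimates of the first moment (mean) and the
    second moment (uncentered variance) of the gradient; t is the
    1-based step count.
    """
    # Exponential moving averages of the gradient and its square.
    m = beta_1 * m + (1.0 - beta_1) * grad
    v = beta_2 * v + (1.0 - beta_2) * grad ** 2

    # Bias correction: m and v start at zero, so early estimates are
    # biased toward zero and are rescaled here.
    m_hat = m / (1.0 - beta_1 ** t)
    v_hat = v / (1.0 - beta_2 ** t)

    # Per-parameter step, scaled by the second-moment estimate.
    param = param - lr * m_hat / (np.sqrt(v_hat) + epsilon)
    return param, m, v


# Toy usage (assumed objective): minimize f(x) = (x - 3)^2.
x = np.array([0.0])
m = np.zeros_like(x)
v = np.zeros_like(x)

# After the first step the update magnitude is roughly lr, whatever
# the gradient's scale, because m_hat / sqrt(v_hat) is about +/-1.
first_x, _, _ = adam_step(x, 2.0 * (x - 3.0), m, v, t=1, lr=0.05)

for t in range(1, 2001):
    grad = 2.0 * (x - 3.0)  # gradient of (x - 3)^2
    x, m, v = adam_step(x, grad, m, v, t, lr=0.05)
```

Because the step is divided by the square root of the second-moment estimate, rescaling a gradient component by a constant leaves the size of its update nearly unchanged.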

According to Kingma et al., 2014, the method is "computationally efficient, has little memory requirement, invariant to diagonal rescaling of gradients, and is well suited for problems that are large in terms of data/parameters".