Optimizer that implements the Adam algorithm.
tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=False, name='Adam', **kwargs)
Adam optimization is a stochastic gradient descent method based on adaptive estimation of the first-order and second-order moments of the gradients.
According to Kingma et al., 2014, the method is "computationally efficient, has little memory requirement, invariant to diagonal rescaling of gradients, and is well suited for problems that are large in terms of data/parameters".
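To make the moment estimates concrete, here is a minimal pure-Python sketch of one Adam update for a single scalar parameter. It follows the formulation in the Kingma et al. paper and reuses the constructor's parameter names and default values. The helper names `adam_step` and `minimize_quadratic` are hypothetical and are not part of the TensorFlow API; the library's own implementation may differ in details such as where `epsilon` is applied, and the `amsgrad` variant is not covered.

```python
import math


def adam_step(param, grad, m, v, t,
              learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-7):
    """Apply one Adam update to a scalar parameter (illustrative sketch only).

    m and v are the running first- and second-moment estimates;
    t is the 1-based step count used for bias correction.
    """
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta_1 * m + (1.0 - beta_1) * grad
    v = beta_2 * v + (1.0 - beta_2) * grad * grad

    # Bias correction: both averages start at zero and are biased toward it.
    m_hat = m / (1.0 - beta_1 ** t)
    v_hat = v / (1.0 - beta_2 ** t)

    # The step is scaled by the inverse root of the second-moment estimate.
    param = param - learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return param, m, v


def minimize_quadratic(steps=2000, learning_rate=0.01):
    """Minimize the toy objective f(x) = (x - 3)^2, starting from x = 0."""
    x, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * (x - 3.0)  # derivative of (x - 3)^2
        x, m, v = adam_step(x, grad, m, v, t, learning_rate=learning_rate)
    return x


if __name__ == "__main__":
    print(minimize_quadratic())
```

On this toy objective the returned value should approach the minimizer at 3. Because the update divides by the root of the second-moment estimate, rescaling the gradient by a constant leaves the step size nearly unchanged, which is the "invariant to diagonal rescaling" property quoted above.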