The following classes are available globally.
A mutable, shareable, owning reference to a tensor.
public final class Parameter<Scalar> where Scalar : TensorFlowScalar
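As an illustration of the reference semantics involved, here is a minimal plain-Swift sketch; the `Box` name and the use of `Double` in place of `Tensor` are placeholders, not this library's API:

```swift
// A minimal sketch of a Parameter-like reference type. Because it is
// a class, two bindings share one underlying value, so an optimizer
// can mutate weights that the model also sees.
final class Box<Scalar> {
    var value: Scalar
    init(_ value: Scalar) { self.value = value }
}

let weight = Box(1.0)
let alias = weight        // shares the same storage
alias.value -= 0.1        // an update through one reference…
print(weight.value)       // …is visible through the other: 0.9
```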
Class wrapping a C pointer to a TensorHandle. This class owns the TensorHandle and is responsible for destroying it.
public class TFETensorHandle : _AnyTensorHandle
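The ownership pattern described above can be sketched in plain Swift; the `OwnedHandle` type and its raw allocation are placeholders, not the TensorFlow eager C API:

```swift
// A sketch of single-ownership of a C pointer: the class owns the
// allocation and releases it in deinit, exactly once, when the last
// reference to the instance goes away.
final class OwnedHandle {
    let pointer: UnsafeMutableRawPointer
    init() {
        pointer = UnsafeMutableRawPointer.allocate(byteCount: 16, alignment: 8)
    }
    deinit {
        pointer.deallocate()  // stand-in for the real destroy call
    }
}
```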
RMSProp optimizer.

It is recommended to leave the parameters of this optimizer at their default values (except for the learning rate, which can be freely tuned). This optimizer is usually a good choice for recurrent neural networks.
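A minimal sketch of the RMSProp update rule in scalar form; the state layout, parameter names, and defaults are illustrative assumptions, not this class's exact API:

```swift
import Foundation

// RMSProp keeps a decaying average of squared gradients and divides
// the step by its square root.
struct RMSPropState {
    var meanSquare = 0.0  // running average of squared gradients
}

func rmspropStep(_ theta: inout Double, _ state: inout RMSPropState,
                 gradient g: Double,
                 learningRate: Double = 0.001, rho: Double = 0.9,
                 epsilon: Double = 1e-8) {
    state.meanSquare = rho * state.meanSquare + (1 - rho) * g * g
    theta -= learningRate * g / (sqrt(state.meanSquare) + epsilon)
}
```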
AdaGrad optimizer.

Individually adapts the learning rates of all model parameters by scaling them in inverse proportion to the square root of the sum of all past squared gradient values.
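A scalar sketch of the rule just described; the accumulator name and the placement of epsilon are assumptions:

```swift
import Foundation

// AdaGrad accumulates all past squared gradients, so the effective
// step for each parameter shrinks over time.
struct AdaGradState {
    var accumulator = 0.0  // sum of all past squared gradients
}

func adagradStep(_ theta: inout Double, _ state: inout AdaGradState,
                 gradient g: Double,
                 learningRate: Double = 0.01, epsilon: Double = 1e-8) {
    state.accumulator += g * g
    theta -= learningRate * g / (sqrt(state.accumulator) + epsilon)
}
```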
AdaDelta optimizer.

ADADELTA is a more robust extension of AdaGrad that adapts learning rates based on a moving window of gradient updates rather than by accumulating all past squared gradients. It can thus adapt faster to changing dynamics of the optimization problem.
Reference: "ADADELTA: An Adaptive Learning Rate Method"
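A scalar sketch of the moving-window rule from the paper above; exact epsilon placement varies between implementations:

```swift
import Foundation

// AdaDelta tracks decaying averages of both squared gradients and
// squared updates, so steps stay well-scaled without a global rate.
struct AdaDeltaState {
    var avgSquaredGrad = 0.0    // moving window over squared gradients
    var avgSquaredDelta = 0.0   // moving window over squared updates
}

func adadeltaStep(_ theta: inout Double, _ state: inout AdaDeltaState,
                  gradient g: Double,
                  rho: Double = 0.95, epsilon: Double = 1e-6) {
    state.avgSquaredGrad = rho * state.avgSquaredGrad + (1 - rho) * g * g
    let delta = -sqrt(state.avgSquaredDelta + epsilon)
              / sqrt(state.avgSquaredGrad + epsilon) * g
    state.avgSquaredDelta = rho * state.avgSquaredDelta + (1 - rho) * delta * delta
    theta += delta
}
```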
AdaMax optimizer.

A variant of Adam based on the infinity-norm.
Reference: Section 7 of "Adam - A Method for Stochastic Optimization"
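A scalar sketch of the infinity-norm update from Section 7 of the paper; the small epsilon guard is an implementation convenience I've added, not part of the paper's formula:

```swift
import Foundation

// AdaMax replaces Adam's second-moment estimate with an exponentially
// weighted infinity norm of the gradients.
struct AdaMaxState {
    var m = 0.0     // first-moment estimate
    var u = 0.0     // exponentially weighted infinity norm
    var step = 0
}

func adamaxStep(_ theta: inout Double, _ state: inout AdaMaxState,
                gradient g: Double,
                learningRate: Double = 0.002,
                beta1: Double = 0.9, beta2: Double = 0.999,
                epsilon: Double = 1e-8) {
    state.step += 1
    state.m = beta1 * state.m + (1 - beta1) * g
    state.u = max(beta2 * state.u, abs(g))   // infinity-norm update
    let biasCorrection = 1 - pow(beta1, Double(state.step))
    theta -= learningRate / biasCorrection * state.m / (state.u + epsilon)
}
```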
AMSGrad optimizer.

This algorithm is a modification of Adam with better convergence properties when close to local optima.
Reference: "On the Convergence of Adam and Beyond"
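A scalar sketch of the modification: unlike Adam, the second-moment estimate is never allowed to shrink. Names and defaults are illustrative, not this class's exact API:

```swift
import Foundation

// AMSGrad keeps the running maximum of Adam's second-moment estimate
// and uses it for normalization, which restores convergence guarantees.
struct AMSGradState {
    var m = 0.0      // first moment
    var v = 0.0      // second moment
    var vMax = 0.0   // running maximum of v
}

func amsgradStep(_ theta: inout Double, _ state: inout AMSGradState,
                 gradient g: Double,
                 learningRate: Double = 0.001,
                 beta1: Double = 0.9, beta2: Double = 0.999,
                 epsilon: Double = 1e-8) {
    state.m = beta1 * state.m + (1 - beta1) * g
    state.v = beta2 * state.v + (1 - beta2) * g * g
    state.vMax = max(state.vMax, state.v)   // the key change vs. Adam
    theta -= learningRate * state.m / (sqrt(state.vMax) + epsilon)
}
```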
Stochastic gradient descent (SGD) optimizer.
An optimizer that implements stochastic gradient descent, with support for momentum, learning rate decay, and Nesterov momentum.
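A scalar sketch covering the listed features; the decay schedule shown (1 / (1 + decay · step)) is one common choice and an assumption, not necessarily this class's exact schedule:

```swift
// SGD with momentum, learning rate decay, and optional Nesterov momentum.
struct SGDState {
    var velocity = 0.0
    var step = 0
}

func sgdStep(_ theta: inout Double, _ state: inout SGDState,
             gradient g: Double,
             learningRate: Double = 0.01, momentum: Double = 0.9,
             decay: Double = 0.0, nesterov: Bool = false) {
    state.step += 1
    let lr = learningRate / (1 + decay * Double(state.step))
    state.velocity = momentum * state.velocity - lr * g
    // Nesterov momentum looks ahead along the velocity direction.
    theta += nesterov ? momentum * state.velocity - lr * g : state.velocity
}
```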