tf.train provides a set of classes and functions that help train models.


The Optimizer base class provides methods to compute gradients for a loss and apply gradients to variables. A collection of subclasses implements classic optimization algorithms such as GradientDescent and Adagrad.

You never instantiate the Optimizer class itself; instead, you instantiate one of its subclasses.

See tf.contrib.opt for more optimizers.
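
The split between computing gradients and applying them can be sketched in plain Python. This is only an illustration of the pattern, not the TensorFlow implementation: SketchOptimizer, SketchGradientDescent, and loss_gradient are hypothetical names, and the toy loss (w - 3)^2 is an assumption made for the example.

```python
class SketchOptimizer:
    """Hypothetical base class, not the TensorFlow Optimizer."""

    def compute_gradients(self, grad_fn, variables):
        # Pair each gradient with the name of the variable it belongs to.
        grads = grad_fn(variables)
        return [(grads[name], name) for name in variables]

    def apply_gradients(self, grads_and_vars, variables):
        # Delegate the actual update rule to the subclass.
        for grad, name in grads_and_vars:
            variables[name] = self._update(variables[name], grad)

    def _update(self, value, grad):
        raise NotImplementedError("instantiate a subclass instead")


class SketchGradientDescent(SketchOptimizer):
    """Plain gradient descent: value <- value - learning_rate * grad."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate

    def _update(self, value, grad):
        return value - self.learning_rate * grad


def loss_gradient(variables):
    # Gradient of the toy loss (w - 3)^2 with respect to w.
    return {"w": 2.0 * (variables["w"] - 3.0)}


variables = {"w": 0.0}
optimizer = SketchGradientDescent(learning_rate=0.1)
for _ in range(100):
    grads_and_vars = optimizer.compute_gradients(loss_gradient, variables)
    optimizer.apply_gradients(grads_and_vars, variables)
print(variables["w"])  # approaches the minimizer w = 3
```

Keeping the two steps separate leaves room to inspect or modify the gradients between computing and applying them.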

Gradient Computation

TensorFlow provides functions that compute the derivatives for a given computation graph by adding operations to that graph. The optimizer classes compute derivatives on your graph automatically, but creators of new Optimizers and expert users can call the lower-level functions below.
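
To illustrate what it means for a derivative to be expressed as additional graph operations, here is a minimal plain-Python sketch. It is a toy symbolic differentiator, not TensorFlow's algorithm. Node, Const, Variable, Add, and Mul are hypothetical classes invented for this example.

```python
class Node:
    """Hypothetical graph node; not a TensorFlow class."""

    def __add__(self, other):
        return Add(self, other)

    def __mul__(self, other):
        return Mul(self, other)


class Const(Node):
    def __init__(self, value):
        self.value = value

    def evaluate(self, feed):
        return self.value

    def gradient(self, wrt):
        return Const(0.0)


class Variable(Node):
    def __init__(self, name):
        self.name = name

    def evaluate(self, feed):
        return feed[self.name]

    def gradient(self, wrt):
        return Const(1.0 if wrt is self else 0.0)


class Add(Node):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, feed):
        return self.left.evaluate(feed) + self.right.evaluate(feed)

    def gradient(self, wrt):
        # Sum rule: the result is a new Add node in the graph.
        return self.left.gradient(wrt) + self.right.gradient(wrt)


class Mul(Node):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, feed):
        return self.left.evaluate(feed) * self.right.evaluate(feed)

    def gradient(self, wrt):
        # Product rule: the result is built from new Mul and Add nodes.
        return self.left.gradient(wrt) * self.right + self.left * self.right.gradient(wrt)


x = Variable("x")
y = x * x + Const(3.0) * x   # y = x^2 + 3x
dy_dx = y.gradient(x)        # a new graph, nothing is evaluated yet
print(dy_dx.evaluate({"x": 2.0}))  # 7.0, since dy/dx = 2x + 3
```

The point of the sketch is that calling gradient only builds more nodes; numbers appear only when the resulting graph is evaluated.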

Gradient Clipping

TensorFlow provides several operations that you can use to add clipping to your graph. You can use them for general data clipping, but they're particularly useful for handling exploding gradients.
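
The arithmetic behind two common kinds of clipping can be sketched in plain Python. The helpers clip_values and clip_global_norm are hypothetical stand-ins written for this example, and representing gradients as lists of floats is an assumption. They show the math only, not the TensorFlow operations themselves.

```python
import math


def clip_values(values, low, high):
    """Clamp every element into the interval [low, high]."""
    return [min(max(v, low), high) for v in values]


def clip_global_norm(gradients, clip_norm):
    """Rescale a list of gradient vectors so their joint L2 norm is at most clip_norm.

    Returns the rescaled gradients and the norm measured before clipping.
    """
    global_norm = math.sqrt(sum(g * g for grad in gradients for g in grad))
    # The scale is 1.0 when the norm is already within the limit.
    scale = clip_norm / max(global_norm, clip_norm)
    return [[g * scale for g in grad] for grad in gradients], global_norm


print(clip_values([-5.0, 0.5, 9.0], -1.0, 1.0))  # [-1.0, 0.5, 1.0]

clipped, norm_before = clip_global_norm([[3.0, 4.0], [0.0]], clip_norm=1.0)
print(norm_before)  # 5.0
print(clipped)      # same direction, joint norm rescaled to about 1.0
```

Scaling all gradients by one shared factor preserves the direction of the overall update, which element-wise clamping does not.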

Decaying the learning rate

Moving Averages

Some training algorithms, such as GradientDescent and Momentum, often benefit from maintaining a moving average of variables during optimization. Using the moving averages for evaluation often improves results significantly.
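
One way to picture this is a shadow copy of each variable that is nudged toward the current value after every training step. The sketch below assumes an exponential moving average with a fixed decay. SketchMovingAverage is a hypothetical class, and initializing the shadow value to the first observed value is a choice made for this example.

```python
class SketchMovingAverage:
    """Hypothetical helper that tracks an exponential moving average per variable."""

    def __init__(self, decay):
        self.decay = decay
        self.shadow = {}

    def update(self, variables):
        for name, value in variables.items():
            if name not in self.shadow:
                # Assumption: start the average at the first observed value.
                self.shadow[name] = value
            else:
                self.shadow[name] = (
                    self.decay * self.shadow[name] + (1.0 - self.decay) * value
                )

    def average(self, name):
        return self.shadow[name]


ema = SketchMovingAverage(decay=0.5)
for w in [0.0, 4.0, 4.0]:          # values of a variable after successive steps
    ema.update({"w": w})
    print(ema.average("w"))        # 0.0, then 2.0, then 3.0
```

At evaluation time, the averaged values would be used in place of the raw variables. A decay close to 1.0 gives a smoother, slower-moving average.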

Coordinator and QueueRunner

See Threading and Queues for how to use threads and queues. For documentation on the Queue API, see Queues.

Distributed execution

See Distributed TensorFlow for more information about how to configure a distributed TensorFlow program.

Reading Summaries from Event Files

See Summaries and TensorBoard for an overview of summaries, event files, and visualization in TensorBoard.

Training Hooks

Hooks are objects whose callbacks run at defined points during training or evaluation of the model.
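
The general hook pattern can be sketched as a training loop that calls back into user-supplied objects at fixed points. Everything here is hypothetical and written for illustration: SketchHook, LossLoggerHook, run_training, and the callback names are not the TensorFlow hook API.

```python
class SketchHook:
    """Hypothetical hook interface; subclasses override the callbacks they need."""

    def begin(self):
        pass

    def before_step(self, step):
        pass

    def after_step(self, step, loss):
        pass

    def end(self):
        pass


class LossLoggerHook(SketchHook):
    """Records the loss every `every_n_steps` steps."""

    def __init__(self, every_n_steps):
        self.every_n_steps = every_n_steps
        self.records = []

    def after_step(self, step, loss):
        if step % self.every_n_steps == 0:
            self.records.append((step, loss))


def run_training(train_step, num_steps, hooks):
    """Run `train_step` repeatedly, invoking each hook around every step."""
    for hook in hooks:
        hook.begin()
    for step in range(num_steps):
        for hook in hooks:
            hook.before_step(step)
        loss = train_step(step)
        for hook in hooks:
            hook.after_step(step, loss)
    for hook in hooks:
        hook.end()


logger = LossLoggerHook(every_n_steps=2)
# Stand-in for a real training step: returns a loss that shrinks over time.
run_training(lambda step: 1.0 / (step + 1), num_steps=5, hooks=[logger])
print([step for step, _ in logger.records])  # [0, 2, 4]
```

Because the loop only knows the hook interface, logging, checkpointing, or early stopping can be added without changing the loop itself.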

Training Utilities