TensorFlow provides functions to compute the derivatives for a given TensorFlow computation graph, adding operations to the graph. The optimizer classes automatically compute derivatives on your graph, but creators of new Optimizers or expert users can call the lower-level functions below.
tf.gradients(ys, xs, grad_ys=None, name='gradients', colocate_gradients_with_ops=False, gate_gradients=False, aggregation_method=None)
Constructs symbolic partial derivatives of sum of ys w.r.t. x in xs.
ys and xs are each a Tensor or a list of tensors. grad_ys is a list of Tensor, holding the gradients received by the ys. The list must be the same length as ys.
gradients() adds ops to the graph to output the partial derivatives of ys with respect to xs. It returns a list of Tensor of length len(xs) where each tensor is the sum(dy/dx) for y in ys.
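As a minimal sketch of the behaviour described above, the following builds a small graph with two outputs and asks for gradients with respect to two inputs. It assumes a TensorFlow build that exposes the graph-mode API under tf.compat.v1 (older releases expose the same calls directly under tf); the tensor names x, w, y1 and y2 are illustrative only.

```python
import tensorflow as tf

# Assumption: graph-mode API is reachable through tf.compat.v1.
tf1 = tf.compat.v1

graph = tf.Graph()
with graph.as_default():
    x = tf1.placeholder(tf.float32, shape=[])
    w = tf1.placeholder(tf.float32, shape=[])
    y1 = x * x * w   # dy1/dx = 2*x*w, dy1/dw = x*x
    y2 = 3.0 * x     # dy2/dx = 3, no dependence on w
    # One gradient tensor per entry of xs, each summed over all ys.
    grads = tf1.gradients([y1, y2], [x, w])

with tf1.Session(graph=graph) as sess:
    dx, dw = sess.run(grads, feed_dict={x: 2.0, w: 5.0})

print(len(grads), dx, dw)
```

With x = 2 and w = 5, the gradient for x combines both outputs (2*2*5 + 3), while the gradient for w comes from y1 alone (2*2).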
grad_ys is a list of tensors of the same length as ys that holds the initial gradients for each y in ys. When grad_ys is None, we fill in a tensor of '1's of the shape of y for each y in ys. A user can provide their own initial grad_ys to compute the derivatives using a different initial gradient for each y (e.g., if one wanted to weight the gradient differently for each value in each y).
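The effect of grad_ys can be illustrated with a small sketch that compares the default initial gradient (all ones) with a user-supplied weighting. It assumes the graph-mode API is available under tf.compat.v1; the weights tensor and its values are hypothetical and chosen only for illustration.

```python
import tensorflow as tf

# Assumption: graph-mode API is reachable through tf.compat.v1.
tf1 = tf.compat.v1

graph = tf.Graph()
with graph.as_default():
    x = tf1.constant([1.0, 2.0, 3.0])
    y = x * x  # elementwise, so dy_i/dx_i = 2 * x_i

    # grad_ys omitted: the initial gradient is a tensor of ones shaped like y.
    default_grad = tf1.gradients(y, x)[0]

    # Hypothetical per-element weights used as the initial gradient for y.
    weights = tf1.constant([1.0, 0.0, 0.5])
    weighted_grad = tf1.gradients(y, x, grad_ys=[weights])[0]

with tf1.Session(graph=graph) as sess:
    g_default, g_weighted = sess.run([default_grad, weighted_grad])

print(g_default, g_weighted)
```

Each element of the weighted result is the default gradient scaled by the matching weight, so an entry with weight 0 contributes nothing.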
Args:
ys: A Tensor or list of tensors to be differentiated.
xs: A Tensor or list of tensors to be used for differentiation.
grad_ys: Optional. A Tensor or list of tensors the same size as ys and holding the gradients computed for each y in ys.
name: Optional name to use for grouping all the gradient ops together. Defaults to 'gradients'.
colocate_gradients_with_ops: If True, try colocating gradients with the corresponding op.
gate_gradients: If True, add a tuple around the gradients returned for an operation. This avoids some race conditions.
aggregation_method: Specifies the method used to combine gradient terms. Accepted values are constants defined in the class AggregationMethod.
Returns:
A list of sum(dy/dx) for each x in xs.
Raises:
LookupError: if one of the operations between x and y does not have a registered gradient function.
ValueError: if the arguments are invalid.
class tf.AggregationMethod
A class listing aggregation methods used to combine gradients.
Computing partial derivatives can require aggregating gradient contributions. This class lists the various methods that can be used to combine gradients in the graph:
ADD_N: All of the gradient terms are summed as part of one operation using the "AddN" op. It has the property that all gradients must be ready before any aggregation is performed.
DEFAULT: The system-chosen default aggregation method.
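A short sketch of passing an aggregation method explicitly: here x reaches y through more than one path, so its gradient contributions have to be combined. This assumes the graph-mode API and the AggregationMethod class are available under tf.compat.v1; the expression for y is illustrative only.

```python
import tensorflow as tf

# Assumption: graph-mode API is reachable through tf.compat.v1.
tf1 = tf.compat.v1

graph = tf.Graph()
with graph.as_default():
    x = tf1.constant(3.0)
    # x feeds y along several paths, so several gradient terms are aggregated.
    y = x * x + 4.0 * x  # dy/dx = 2*x + 4
    grad = tf1.gradients(
        y, x, aggregation_method=tf1.AggregationMethod.ADD_N)[0]

with tf1.Session(graph=graph) as sess:
    grad_value = sess.run(grad)

print(grad_value)
```

The numerical result does not depend on the aggregation method; the choice only affects how and when the terms are summed in the graph.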
tf.stop_gradient(input, name=None)
Stops gradient computation.
When executed in a graph, this op outputs its input tensor as-is.
When building ops to compute gradients, this op prevents the contribution of its inputs from being taken into account. Normally, the gradient generator adds ops to a graph to compute the derivatives of a specified 'loss' by recursively finding the inputs that contributed to its computation. If you insert this op in the graph, its inputs are masked from the gradient generator and are not taken into account when computing gradients.
This is useful any time you want to compute a value with TensorFlow but need to pretend that the value was a constant. Some examples include:
- The EM algorithm where the M-step should not involve backpropagation through the output of the E-step.
- Contrastive divergence training of Boltzmann machines where, when differentiating the energy function, the training must not backpropagate through the graph that generated the samples from the model.
- Adversarial training, where no backprop should happen through the adversarial example generation process.
Args:
input: A Tensor.
name: A name for the operation (optional).
Returns:
A Tensor. Has the same type as input.
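To illustrate treating a value as a constant, the sketch below differentiates x * x twice: once normally, and once with one factor wrapped in tf.stop_gradient. It assumes the graph-mode API is available under tf.compat.v1; the names y_full and y_stopped are illustrative only.

```python
import tensorflow as tf

# Assumption: graph-mode API is reachable through tf.compat.v1.
tf1 = tf.compat.v1

graph = tf.Graph()
with graph.as_default():
    x = tf1.constant(3.0)

    y_full = x * x                       # both factors depend on x
    y_stopped = x * tf.stop_gradient(x)  # second factor is treated as a constant

    grad_full = tf1.gradients(y_full, x)[0]
    grad_stopped = tf1.gradients(y_stopped, x)[0]

with tf1.Session(graph=graph) as sess:
    y_a, y_b, g_full, g_stopped = sess.run(
        [y_full, y_stopped, grad_full, grad_stopped])

print(y_a, y_b, g_full, g_stopped)
```

Both expressions evaluate to the same forward value, because the op passes its input through unchanged, but the gradient of the second expression only sees the unwrapped factor.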