tf.keras.optimizers.SGD

Stochastic gradient descent and momentum optimizer.

Inherits From: `Optimizer`

Computes:

```
theta(t+1) = theta(t) - learning_rate * gradient
```

The gradient is evaluated at theta(t).
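The plain update rule above can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions, not the library's implementation: `sgd_step` is a hypothetical helper, parameters are plain lists of floats, and the loss is assumed to be f(theta) = 0.5 * theta^2 so that the gradient equals theta.

```python
def sgd_step(theta, gradient, learning_rate):
    """Return theta(t+1) = theta(t) - learning_rate * gradient, elementwise."""
    return [t - learning_rate * g for t, g in zip(theta, gradient)]


# Assumed toy loss: f(theta) = 0.5 * theta^2, whose gradient at theta is theta.
theta = [1.0, -2.0]
for _ in range(3):
    gradient = list(theta)  # gradient evaluated at the current theta(t)
    theta = sgd_step(theta, gradient, learning_rate=0.5)
```

With a learning rate of 0.5 on this loss, each step halves every parameter.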

or, when `momentum` is greater than 0, computes:

```
v(t+1) = momentum * v(t) - learning_rate * gradient
theta(t+1) = theta(t) + v(t+1)
```

If `nesterov` is False, the gradient is evaluated at theta(t). If `nesterov` is True, the gradient is evaluated at theta(t) + momentum * v(t), and the variables always store theta + momentum * v instead of theta.
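A minimal scalar sketch of the momentum update may help make the two modes concrete. `momentum_step` is a hypothetical helper, not part of TensorFlow. For the Nesterov branch it assumes the stored variable already holds theta + momentum * v, as stated above, so the update is rewritten in terms of that stored value.

```python
def momentum_step(theta, v, gradient, learning_rate, momentum, nesterov=False):
    """One scalar update; returns (new_theta, new_v).

    `gradient` is assumed to be evaluated at the stored variable value.
    """
    v_next = momentum * v - learning_rate * gradient
    if nesterov:
        # The stored variable plays the role of theta + momentum * v, so the
        # step combines the new velocity with a fresh gradient correction.
        theta_next = theta + momentum * v_next - learning_rate * gradient
    else:
        theta_next = theta + v_next
    return theta_next, v_next


# Assumed toy loss f(theta) = 0.5 * theta^2, so the gradient equals theta.
theta, v = 1.0, 0.0
theta, v = momentum_step(theta, v, gradient=theta, learning_rate=0.5, momentum=0.5)
theta, v = momentum_step(theta, v, gradient=theta, learning_rate=0.5, momentum=0.5)
```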

Some of the args below are hyperparameters, where a hyperparameter is defined as a scalar Tensor, a regular Python value, or a callable (which will be evaluated when `apply_gradients` is called) returning a scalar Tensor or a Python value.
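The "value or callable" convention described above can be sketched as follows. `resolve_hyperparameter` and `halving_learning_rate` are hypothetical names used only for illustration. The sketch uses plain Python floats rather than Tensors, and the halving schedule is an assumption, not a library default.

```python
def resolve_hyperparameter(value):
    """Return the concrete value of a hyperparameter given as a value or a callable."""
    return value() if callable(value) else value


step = {"count": 0}


def halving_learning_rate():
    # Hypothetical schedule: halve the rate with each completed step.
    return 0.5 ** step["count"]


fixed = resolve_hyperparameter(0.01)                     # plain Python value
first = resolve_hyperparameter(halving_learning_rate)    # callable, evaluated now
step["count"] += 1
second = resolve_hyperparameter(halving_learning_rate)   # re-evaluated later
```

Because the callable is evaluated each time it is resolved, the same optimizer argument can yield a different value on each call.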

References

For `nesterov = True`, see [Sutskever et al., 2013](http://jmlr.org/proceedings/papers/v28/sutskever13.pdf).

Args
`learning_rate` Float hyperparameter >= 0. Learning rate.
`momentum` Float hyperparameter >= 0 that accelerates SGD in the relevant direction and dampens oscillations.
`nesterov` Boolean. Whether to apply Nesterov momentum.
`name` Optional name prefix for the operations created when applying gradients. Defaults to 'SGD'.
`**kwargs` Keyword arguments. Allowed keys are `clipnorm`, `clipvalue`, `lr`, and `decay`. `clipnorm` clips gradients by norm; `clipvalue` clips gradients by value; `decay` is included for backward compatibility to allow inverse time decay of the learning rate; `lr` is included for backward compatibility, and `learning_rate` is recommended instead.
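To illustrate what the `clipnorm`, `clipvalue`, and `decay` keyword arguments mean, here is a rough pure-Python sketch. All three helper functions are hypothetical. The inverse time decay formula `learning_rate / (1 + decay * iterations)` is an assumption about the legacy behavior, and gradients are modeled as plain lists of floats.

```python
import math


def clip_by_norm(gradient, clipnorm):
    """Rescale `gradient` so that its L2 norm does not exceed `clipnorm`."""
    norm = math.sqrt(sum(g * g for g in gradient))
    if norm <= clipnorm:
        return list(gradient)
    return [g * clipnorm / norm for g in gradient]


def clip_by_value(gradient, clipvalue):
    """Clamp every component of `gradient` to [-clipvalue, clipvalue]."""
    return [max(-clipvalue, min(clipvalue, g)) for g in gradient]


def inverse_time_decay(learning_rate, decay, iterations):
    # Assumed formula for the legacy `decay` argument.
    return learning_rate / (1.0 + decay * iterations)


clipped_norm = clip_by_norm([3.0, 4.0], clipnorm=1.0)
clipped_value = clip_by_value([-2.0, 0.5, 3.0], clipvalue=1.0)
decayed = inverse_time_decay(0.1, decay=0.5, iterations=2)
```

Clipping by norm preserves the direction of the gradient, whereas clipping by value can change it.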

Eager Compatibility

When eager execution is enabled, `learning_rate` can be a callable that takes no arguments and returns the actual value to use. This can be useful for changing the value across different invocations of optimizer functions.

Attributes
`iterations` Variable. The number of training steps this Optimizer has run.
`weights` The variables of this Optimizer, in the order they were created.

Methods

`add_slot`

Add a new slot variable for `var`.

`apply_gradients`

Apply gradients to variables.

This is the second part of `minimize()`. It returns an `Operation` that applies gradients.

Args
`grads_and_vars` List of (gradient, variable) pairs.
`name` Optional name for the returned operation. Defaults to the name passed to the `Optimizer` constructor.

Returns
An `Operation` that applies the specified gradients. The `iterations` will be automatically increased by 1.

Raises
`TypeError` If `grads_and_vars` is malformed.
`ValueError` If none of the variables have gradients.
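The contract described above (apply each gradient, increment `iterations`, raise on bad input) can be mimicked with a small stand-in class. `ToySGD` is hypothetical and is not the TensorFlow optimizer. Variables are modeled as dicts with a `"value"` key, and skipping pairs whose gradient is `None` is an assumption made for this sketch.

```python
class ToySGD:
    """Hypothetical stand-in for illustration; not the TensorFlow class."""

    def __init__(self, learning_rate=0.01):
        self.learning_rate = learning_rate
        self.iterations = 0

    def apply_gradients(self, grads_and_vars):
        try:
            pairs = [(grad, var) for grad, var in grads_and_vars]
        except (TypeError, ValueError) as err:
            raise TypeError("grads_and_vars must be (gradient, variable) pairs") from err
        # Assumption: variables without a gradient are skipped.
        pairs = [(grad, var) for grad, var in pairs if grad is not None]
        if not pairs:
            raise ValueError("No gradients provided for any variable.")
        for grad, var in pairs:
            var["value"] -= self.learning_rate * grad
        self.iterations += 1  # one training step completed


w = {"value": 1.0}
b = {"value": 0.0}
opt = ToySGD(learning_rate=0.5)
opt.apply_gradients([(1.0, w), (None, b)])
```

After the call, `w` has moved by one step, `b` is untouched, and `opt.iterations` is 1.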

`from_config`

Creates an optimizer from its config.

This method is the reverse of `get_config`, capable of instantiating the same optimizer from the config dictionary.

Arguments
`config` A Python dictionary, typically the output of `get_config`.
`custom_objects` A Python dictionary mapping names to additional Python objects used to create this optimizer, such as a function used for a hyperparameter.

Returns
An optimizer instance.

`get_config`

Returns the config of the optimizer.

An optimizer config is a Python dictionary (serializable) containing the configuration of an optimizer. The same optimizer can be reinstantiated later (without any saved state) from this configuration.

Returns
Python dictionary.
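The `get_config` / `from_config` round trip described above can be sketched with a toy class. `ConfigurableSGD` is hypothetical and only mirrors the constructor arguments documented on this page. The exact keys of the real config dictionary are not shown here and are assumed for illustration.

```python
import json


class ConfigurableSGD:
    """Hypothetical stand-in for illustration; not the TensorFlow class."""

    def __init__(self, learning_rate=0.01, momentum=0.0, nesterov=False, name="SGD"):
        self.learning_rate = learning_rate
        self.momentum = momentum
        self.nesterov = nesterov
        self.name = name
        self.iterations = 0  # training state, deliberately not part of the config

    def get_config(self):
        return {
            "name": self.name,
            "learning_rate": self.learning_rate,
            "momentum": self.momentum,
            "nesterov": self.nesterov,
        }

    @classmethod
    def from_config(cls, config):
        return cls(**config)


original = ConfigurableSGD(learning_rate=0.1, momentum=0.9, nesterov=True)
original.iterations = 42  # pretend some training happened
config = original.get_config()
serialized = json.dumps(config)  # the config is plain, serializable data
restored = ConfigurableSGD.from_config(json.loads(serialized))
```

The restored object has the same hyperparameters but none of the saved state, so its step counter starts again at zero.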

`get_gradients`

Returns gradients of `loss` with respect to `params`.

Arguments
`loss` Loss tensor.
`params` List of variables.

Returns
List of gradient tensors.

Raises
`ValueError` If any gradient cannot be computed (e.g. if the gradient function is not implemented).

`get_slot_names`

A list of names for this optimizer's slots.

`minimize`

Minimize `loss` by updating `var_list`.

This method simply computes the gradients using `tf.GradientTape` and calls `apply_gradients()`. If you want to process the gradients before applying them, call `tf.GradientTape` and `apply_gradients()` explicitly instead of using this function.

Args
`loss` A callable taking no arguments which returns the value to minimize.
`var_list` List or tuple of `Variable` objects to update to minimize `loss`, or a callable returning such a list or tuple. Use a callable when the variable list would otherwise be incomplete before `minimize` is called, because the variables are created the first time `loss` is called.
`grad_loss` Optional. A `Tensor` holding the gradient computed for `loss`.
`name` Optional name for the returned operation.

Returns
An `Operation` that updates the variables in `var_list`. The `iterations` will be automatically increased by 1.

Raises
`ValueError` If some of the variables are not `Variable` objects.
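The two-phase structure of `minimize` (compute gradients, then apply them) can be illustrated without TensorFlow. In this sketch a central-difference estimate stands in for `tf.GradientTape`. All function names are hypothetical, and variables are modeled as dicts with a `"value"` key. It also shows `var_list` being accepted either as a list or as a callable.

```python
def numerical_gradient(loss, variables, eps=1e-6):
    """Central-difference stand-in for automatic differentiation."""
    grads = []
    for var in variables:
        original = var["value"]
        var["value"] = original + eps
        up = loss()
        var["value"] = original - eps
        down = loss()
        var["value"] = original  # restore before moving on
        grads.append((up - down) / (2 * eps))
    return grads


def apply_gradients(grads_and_vars, learning_rate):
    for grad, var in grads_and_vars:
        var["value"] -= learning_rate * grad


def minimize(loss, var_list, learning_rate):
    # `var_list` may be the variables themselves or a callable returning them.
    variables = var_list() if callable(var_list) else var_list
    grads = numerical_gradient(loss, variables)
    apply_gradients(zip(grads, variables), learning_rate)


w = {"value": 3.0}


def loss():
    # Callable taking no arguments; its minimum is at w = 1.
    return (w["value"] - 1.0) ** 2


for _ in range(50):
    minimize(loss, lambda: [w], learning_rate=0.1)
```

To process the gradients before they are applied (for example, to clip them), call the two phases separately instead of going through `minimize`.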

`variables`

Returns the variables of this Optimizer in the order they were created.
