# tfp.optimizer.lbfgs_minimize

Applies the L-BFGS algorithm to minimize a differentiable function.

### Used in the notebooks

Used in the tutorials

Performs unconstrained minimization of a differentiable function using the L-BFGS scheme. See [Nocedal and Wright(2006)][1] for details of the algorithm.

### Usage:

The following example demonstrates the L-BFGS optimizer attempting to find the minimum for a simple high-dimensional quadratic objective function.

``````  # A high-dimensional quadratic bowl.
ndims = 60
minimum = np.ones([ndims], dtype='float64')
scales = np.arange(ndims, dtype='float64') + 1.0

# The objective function and the gradient.
lambda x: tf.reduce_sum(
scales * tf.math.squared_difference(x, minimum), axis=-1),
x)
start = np.arange(ndims, 0, -1, dtype='float64')
optim_results = tfp.optimizer.lbfgs_minimize(
initial_position=start,
num_correction_pairs=10,
tolerance=1e-8)

# Check that the search converged
assert(optim_results.converged)
# Check that the argmin is close to the actual value.
np.testing.assert_allclose(optim_results.position, minimum)
``````

### References:

[1] Jorge Nocedal, Stephen Wright. Numerical Optimization. Springer Series in Operations Research. pp 176-180. 2006

http://pages.mtu.edu/~struther/Courses/OLD/Sp2013/5630/Jorge_Nocedal_Numerical_optimization_267490.pdf

`value_and_gradients_function` A Python callable that accepts a point as a real `Tensor` and returns a tuple of `Tensor`s of real dtype containing the value of the function and its gradient at that point. The function to be minimized. The input is of shape `[..., n]`, where `n` is the size of the domain of input points, and all others are batching dimensions. The first component of the return value is a real `Tensor` of matching shape `[...]`. The second component (the gradient) is also of shape `[..., n]` like the input value to the function.
`initial_position` Real `Tensor` of shape `[..., n]`. The starting point, or points when using batching dimensions, of the search procedure. At these points the function value and the gradient norm should be finite. Exactly one of `initial_position` and `previous_optimizer_results` can be non-None.
`previous_optimizer_results` An `LBfgsOptimizerResults` namedtuple to intialize the optimizer state from, instead of an `initial_position`. This can be passed in from a previous return value to resume optimization with a different `stopping_condition`. Exactly one of `initial_position` and `previous_optimizer_results` can be non-None.
`num_correction_pairs` Positive integer. Specifies the maximum number of (position_delta, gradient_delta) correction pairs to keep as implicit approximation of the Hessian matrix.
`tolerance` Scalar `Tensor` of real dtype. Specifies the gradient tolerance for the procedure. If the supremum norm of the gradient vector is below this number, the algorithm is stopped.
`x_tolerance` Scalar `Tensor` of real dtype. If the absolute change in the position between one iteration and the next is smaller than this number, the algorithm is stopped.
`f_relative_tolerance` Scalar `Tensor` of real dtype. If the relative change in the objective value between one iteration and the next is smaller than this value, the algorithm is stopped.
`initial_inverse_hessian_estimate` None. Option currently not supported.
`max_iterations` Scalar positive int32 `Tensor`. The maximum number of iterations for L-BFGS updates.
`parallel_iterations` Positive integer. The number of iterations allowed to run in parallel.
`stopping_condition` (Optional) A Python function that takes as input two Boolean tensors of shape `[...]`, and returns a Boolean scalar tensor. The input tensors are `converged` and `failed`, indicating the current status of each respective batch member; the return value states whether the algorithm should stop. The default is tfp.optimizer.converged_all which only stops when all batch members have either converged or failed. An alternative is tfp.optimizer.converged_any which stops as soon as one batch member has converged, or when all have failed.
`max_line_search_iterations` Python int. The maximum number of iterations for the `hager_zhang` line search algorithm.
`f_absolute_tolerance` Scalar `Tensor` of real dtype. If the absolute change in the objective value between one iteration and the next is smaller than this value, the algorithm is stopped.
`name` (Optional) Python str. The name prefixed to the ops created by this function. If not supplied, the default name 'minimize' is used.

`optimizer_results` A namedtuple containing the following items: converged: Scalar boolean tensor indicating whether the minimum was found within tolerance. failed: Scalar boolean tensor indicating whether a line search step failed to find a suitable step size satisfying Wolfe conditions. In the absence of any constraints on the number of objective evaluations permitted, this value will be the complement of `converged`. However, if there is a constraint and the search stopped due to available evaluations being exhausted, both `failed` and `converged` will be simultaneously False. num_objective_evaluations: The total number of objective evaluations performed. position: A tensor containing the last argument value found during the search. If the search converged, then this value is the argmin of the objective function. objective_value: A tensor containing the value of the objective function at the `position`. If the search converged, then this is the (local) minimum of the objective function. objective_gradient: A tensor containing the gradient of the objective function at the `position`. If the search converged the max-norm of this tensor should be below the tolerance. position_deltas: A tensor encoding information about the latest changes in `position` during the algorithm execution. gradient_deltas: A tensor encoding information about the latest changes in `objective_gradient` during the algorithm execution.

[{ "type": "thumb-down", "id": "missingTheInformationINeed", "label":"Missing the information I need" },{ "type": "thumb-down", "id": "tooComplicatedTooManySteps", "label":"Too complicated / too many steps" },{ "type": "thumb-down", "id": "outOfDate", "label":"Out of date" },{ "type": "thumb-down", "id": "samplesCodeIssue", "label":"Samples / code issue" },{ "type": "thumb-down", "id": "otherDown", "label":"Other" }]
[{ "type": "thumb-up", "id": "easyToUnderstand", "label":"Easy to understand" },{ "type": "thumb-up", "id": "solvedMyProblem", "label":"Solved my problem" },{ "type": "thumb-up", "id": "otherUp", "label":"Other" }]