# tfp.glm.fit_sparse_one_step

One step of (the outer loop of) the GLM fitting algorithm.

This function returns a new value of `model_coefficients`, equal to `model_coefficients_start + model_coefficients_update`. The increment `model_coefficients_update in R^n` is computed by a coordinate descent method, that is, by a loop in which each iteration updates exactly one coordinate of `model_coefficients_update`. (Some updates may leave the value of the coordinate unchanged.)

The update applies an L1-based proximity operator, the "soft threshold", whose fixed point `model_coefficients_update^*` is the desired minimum:

```
model_coefficients_update^* = argmin{
    -LogLikelihood(model_coefficients_start + model_coefficients_update')
      + l1_regularizer *
          ||model_coefficients_start + model_coefficients_update'||_1
      + l2_regularizer *
          ||model_coefficients_start + model_coefficients_update'||_2**2
    : model_coefficients_update' }
```

where in each iteration `model_coefficients_update'` has at most one nonzero coordinate.
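For readers unfamiliar with the term, the scalar soft-threshold operator can be sketched in a few lines of plain Python. The function name `soft_threshold` is chosen here for illustration only and is not part of the `tfp.glm` API.

```python
def soft_threshold(x, threshold):
    """Proximity operator of `threshold * |x|` for a scalar `x`.

    Shrinks `x` toward zero by `threshold` and returns exactly 0.0
    when `|x| <= threshold`.
    """
    if x > threshold:
        return x - threshold
    if x < -threshold:
        return x + threshold
    return 0.0


print(soft_threshold(3.0, 1.0))   # shrunk toward zero
print(soft_threshold(-0.5, 1.0))  # inside the threshold band, so zeroed
```

Because values inside the band `[-threshold, threshold]` map to exactly zero, repeated application tends to keep small coordinates at zero.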

This update method preserves sparsity, i.e., tends to find sparse solutions if `model_coefficients_start` is sparse. Additionally, the choice of step size is based on curvature (Fisher information matrix), which significantly speeds up convergence.
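To illustrate how curvature can set the step size, the following is a minimal sketch of a single-coordinate update. It assumes a local quadratic model of the negative log-likelihood, `grad * d + 0.5 * curvature * d**2`, plus the L1 and L2 penalties from the equation above. The names `coordinate_step`, `grad`, and `curvature` are hypothetical, and the library's internal implementation may differ in its details.

```python
def soft_threshold(x, threshold):
    """Scalar soft-threshold (proximity operator of `threshold * |x|`)."""
    if x > threshold:
        return x - threshold
    if x < -threshold:
        return x + threshold
    return 0.0


def coordinate_step(coeff, grad, curvature, l1, l2=0.0):
    """Return the new value of one coefficient (illustrative sketch).

    Minimizes over z:
        grad * (z - coeff) + 0.5 * curvature * (z - coeff)**2
        + l1 * |z| + l2 * z**2
    Assumes `curvature + 2 * l2 > 0`.
    """
    denom = curvature + 2.0 * l2
    # Unpenalized minimizer of the smooth part, then shrink by l1 / denom.
    return soft_threshold((curvature * coeff - grad) / denom, l1 / denom)


# A coefficient that starts at zero stays at zero when the gradient is small
# relative to the L1 weight.
print(coordinate_step(coeff=0.0, grad=0.05, curvature=1.0, l1=0.1))
```

Dividing by the curvature plays the role of a per-coordinate step size, so steeply curved directions take smaller steps than flat ones.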

## Args

- `model_matrix`: (Batch of) matrix-shaped, `float` `Tensor` or `SparseTensor` where each row represents a sample's features. Has shape `[N, n]` where `N` is the number of data samples and `n` is the number of features per sample.
- `response`: (Batch of) vector-shaped `Tensor` with the same dtype as `model_matrix` where each element represents a sample's observed response (to the corresponding row of features).
- `model`: `tfp.glm.ExponentialFamily`-like instance, which specifies the link function and distribution of the GLM, and thus characterizes the negative log-likelihood which will be minimized. Must have sufficient statistic equal to the response, that is, `T(y) = y`.
- `model_coefficients_start`: (Batch of) vector-shaped, `float` `Tensor` with the same dtype as `model_matrix`, representing the initial values of the coefficients for the GLM regression. Has shape `[n]` where `model_matrix` has shape `[N, n]`.
- `tolerance`: scalar, `float` `Tensor` representing the convergence threshold. The optimization step will terminate early, returning its current value of `model_coefficients_start + model_coefficients_update`, once the following condition is met: `||model_coefficients_update_end - model_coefficients_update_start||_2 / (1 + ||model_coefficients_start||_2) < sqrt(tolerance)`, where `model_coefficients_update_end` is the value of `model_coefficients_update` at the end of a sweep and `model_coefficients_update_start` is the value of `model_coefficients_update` at the beginning of that sweep.
- `l1_regularizer`: scalar, `float` `Tensor` representing the weight of the L1 regularization term (see equation above).
- `l2_regularizer`: scalar, `float` `Tensor` representing the weight of the L2 regularization term (see equation above). Default value: `None` (i.e., no L2 regularization).
- `maximum_full_sweeps`: Python integer specifying maximum number of sweeps to run. A "sweep" consists of an iteration of coordinate descent on each coordinate. After this many sweeps, the algorithm will terminate even if convergence has not been reached. Default value: `1`.
- `learning_rate`: scalar, `float` `Tensor` representing a multiplicative factor used to dampen the proximal gradient descent steps. Default value: `None` (i.e., factor is conceptually `1`).
- `name`: Python string representing the name of the TensorFlow operation. The default name is `"fit_sparse_one_step"`.
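The `tolerance` stopping rule described above can be written out in plain Python to make the roles of the three vectors explicit. This is only a sketch over Python lists; the helper names `l2_norm` and `has_converged` are hypothetical and do not exist in the library.

```python
import math


def l2_norm(vector):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vector))


def has_converged(update_start, update_end, coefficients_start, tolerance):
    """Check the per-sweep stopping rule.

    `update_start` / `update_end` are the values of the coefficient update
    at the beginning and end of one sweep.
    """
    change = l2_norm([e - s for s, e in zip(update_start, update_end)])
    return change / (1.0 + l2_norm(coefficients_start)) < math.sqrt(tolerance)


# The update moved by 1e-4 during the sweep; with ||start||_2 = 5 the relative
# change is about 1.7e-5, which is below sqrt(1e-6) = 1e-3.
print(has_converged([0.0, 0.0], [1e-4, 0.0], [3.0, 4.0], tolerance=1e-6))
```

Note that the comparison is against `sqrt(tolerance)`, so a `tolerance` of `1e-6` corresponds to a relative change of roughly `1e-3`.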

## Returns

- `model_coefficients`: (Batch of) `Tensor` having the same shape and dtype as `model_coefficients_start`, representing the updated value of `model_coefficients`, that is, `model_coefficients_start + model_coefficients_update`.
- `is_converged`: scalar, `bool` `Tensor` indicating whether convergence occurred across all batches within the specified number of sweeps.
- `iter`: scalar, `int` `Tensor` representing the actual number of coordinate updates made (before achieving convergence). Since each sweep consists of `tf.size(model_coefficients_start)` iterations, the maximum number of updates is `maximum_full_sweeps * tf.size(model_coefficients_start)`.
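To show how the three return values relate to sweeps and coordinate updates, here is a toy, pure-Python analogue of the procedure. It is a sketch under strong assumptions, not the library's implementation: it handles only a Gaussian linear model with negative log-likelihood `0.5 * sum((X w - y)**2)`, works on Python lists without batching, omits `learning_rate`, and assumes every column of `X` is nonzero. The name `toy_sparse_one_step` is hypothetical.

```python
import math


def soft_threshold(x, threshold):
    if x > threshold:
        return x - threshold
    if x < -threshold:
        return x + threshold
    return 0.0


def toy_sparse_one_step(X, y, start, tolerance, l1, l2=0.0,
                        maximum_full_sweeps=1):
    """Coordinate descent for 0.5*||Xw - y||^2 + l1*||w||_1 + l2*||w||_2^2."""
    n = len(start)
    update = [0.0] * n
    n_updates = 0
    is_converged = False
    start_norm = math.sqrt(sum(c * c for c in start))
    for _ in range(maximum_full_sweeps):
        update_at_sweep_start = list(update)
        for j in range(n):  # one sweep visits each coordinate once
            coeffs = [s + u for s, u in zip(start, update)]
            residual = [sum(row[k] * coeffs[k] for k in range(n)) - yi
                        for row, yi in zip(X, y)]
            grad_j = sum(r * row[j] for r, row in zip(residual, X))
            curvature_j = sum(row[j] ** 2 for row in X)
            denom = curvature_j + 2.0 * l2
            new_coeff = soft_threshold(
                (curvature_j * coeffs[j] - grad_j) / denom, l1 / denom)
            update[j] = new_coeff - start[j]
            n_updates += 1
        # Stopping rule is evaluated once per sweep.
        change = math.sqrt(sum((a - b) ** 2
                               for a, b in zip(update, update_at_sweep_start)))
        if change / (1.0 + start_norm) < math.sqrt(tolerance):
            is_converged = True
            break
    coefficients = [s + u for s, u in zip(start, update)]
    return coefficients, is_converged, n_updates


X = [[1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.05]
coefficients, is_converged, n_updates = toy_sparse_one_step(
    X, y, start=[0.0, 0.0], tolerance=1e-6, l1=0.1, maximum_full_sweeps=5)
print(coefficients, is_converged, n_updates)
```

In this toy run the second coefficient stays at exactly zero because its signal is smaller than the L1 weight, and the update count never exceeds `maximum_full_sweeps * len(start)`.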
