tfp.glm.fit

Runs multiple Fisher scoring steps.

tfp.glm.fit(
    model_matrix,
    response,
    model,
    model_coefficients_start=None,
    predicted_linear_response_start=None,
    l2_regularizer=None,
    dispersion=None,
    offset=None,
    convergence_criteria_fn=None,
    learning_rate=None,
    fast_unsafe_numerics=True,
    maximum_iterations=None,
    l2_regularization_penalty_factor=None,
    name=None
)

Used in the notebooks

Used in the tutorials: Generalized Linear Models

Args
`model_matrix`	(Batch of) `float`-like, matrix-shaped `Tensor` where each row represents a sample's features.
`response`	(Batch of) vector-shaped `Tensor` where each element represents a sample's observed response (to the corresponding row of features). Must have same `dtype` as `model_matrix`.
`model`	`tfp.glm.ExponentialFamily`-like instance which implicitly characterizes a negative log-likelihood loss by specifying the distribution's `mean`, `gradient_mean`, and `variance`.
`model_coefficients_start`	Optional (batch of) vector-shaped `Tensor` representing the initial model coefficients, one for each column in `model_matrix`. Must have same `dtype` as `model_matrix`. Default value: Zeros.
`predicted_linear_response_start`	Optional `Tensor` with `shape`, `dtype` matching `response`; represents the `offset`-shifted initial linear predictions based on `model_coefficients_start`. Default value: `offset` if `model_coefficients_start is None`, and `tf.linalg.matvec(model_matrix, model_coefficients_start) + offset` otherwise.
`l2_regularizer`	Optional scalar `Tensor` representing L2 regularization penalty, i.e., `loss(w) = sum{-log p(y[i]|x[i],w) : i=1..n} + l2_regularizer ||w||_2^2`. Default value: `None` (i.e., no L2 regularization).
`dispersion`	Optional (batch of) `Tensor` representing `response` dispersion, i.e., `p(y|theta) := exp((y theta - A(theta)) / dispersion)`. Must broadcast with rows of `model_matrix`. Default value: `None` (i.e., "no dispersion").
`offset`	Optional `Tensor` representing constant shift applied to `predicted_linear_response`. Must broadcast to `response`. Default value: `None` (i.e., `tf.zeros_like(response)`).
`convergence_criteria_fn`	Python `callable` taking: `is_converged_previous`, `iter_`, `model_coefficients_previous`, `predicted_linear_response_previous`, `model_coefficients_next`, `predicted_linear_response_next`, `response`, `model`, `dispersion` and returning a `bool` `Tensor` indicating that Fisher scoring has converged. See `convergence_criteria_small_relative_norm_weights_change` as an example function. Default value: `None` (i.e., `convergence_criteria_small_relative_norm_weights_change`).
`learning_rate`	Optional (batch of) scalar `Tensor` used to dampen iterative progress. Typically only needed if optimization diverges; it should be no larger than `1` and is typically very close to `1`. Default value: `None` (i.e., `1`).
`fast_unsafe_numerics`	Optional Python `bool` indicating if faster, less numerically accurate methods can be employed for computing the weighted least-squares solution. Default value: `True` (i.e., "fast but possibly diminished accuracy").
`maximum_iterations`	Optional maximum number of iterations of Fisher scoring to run; "and-ed" with result of `convergence_criteria_fn`. Default value: `None` (i.e., `infinity`).
`l2_regularization_penalty_factor`	Optional (batch of) vector-shaped `Tensor` representing a separate penalty factor to apply to each model coefficient; its length equals the number of columns in `model_matrix`. Each penalty factor multiplies `l2_regularizer` to allow differential regularization. A factor can be 0 for some coefficients, which implies no regularization for those coefficients. The loss becomes `loss(w) = sum{-log p(y[i]|x[i],w) : i=1..n} + l2_regularizer ||w * l2_regularization_penalty_factor||_2^2`. Default value: `None` (i.e., a factor of 1 for every coefficient, so no per-coefficient adjustment).
`name`	Python `str` used as name prefix to ops created by this function. Default value: `"fit"`.
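
To make the roles of `l2_regularizer`, `learning_rate`, `maximum_iterations`, and the convergence check more concrete, here is a minimal NumPy sketch of Fisher scoring (iteratively reweighted least squares) for a Bernoulli response with a logit link. It is an illustration under stated assumptions, not the library's implementation: the function name `fisher_scoring_logit`, the `tolerance` parameter, the way `learning_rate` scales the working-response adjustment, and the way the ridge term `l2 * I` enters the normal equations are all choices made for this sketch, and the library's exact scaling may differ.

```python
import numpy as np


def fisher_scoring_logit(x, y, l2=0.0, learning_rate=1.0,
                         maximum_iterations=25, tolerance=1e-5):
    """Illustrative Fisher scoring for a Bernoulli GLM with a logit link.

    Hypothetical helper for this sketch only; not part of tfp.glm.
    """
    d = x.shape[1]
    w = np.zeros(d)          # mirrors the "Zeros" default for the start point
    eta = x @ w              # predicted linear response (no offset here)
    is_converged = False
    num_iter = 0
    for num_iter in range(1, maximum_iterations + 1):
        mean = 1.0 / (1.0 + np.exp(-eta))
        variance = mean * (1.0 - mean)
        grad_mean = variance                 # d(mean)/d(eta) for the logit link
        # Working response; learning_rate dampens the adjustment (assumption).
        z = eta + learning_rate * (y - mean) / grad_mean
        weights = grad_mean**2 / variance    # per-sample Fisher weights
        xtw = x.T * weights
        # Weighted least squares with a ridge term (scaling is an assumption).
        w_next = np.linalg.solve(xtw @ x + l2 * np.eye(d), xtw @ z)
        # Small relative change in the coefficients counts as convergence.
        rel_change = np.linalg.norm(w_next - w) / (1.0 + np.linalg.norm(w))
        w, eta = w_next, x @ w_next
        if rel_change < tolerance:
            is_converged = True
            break
    return w, eta, is_converged, num_iter


# Synthetic data with known coefficients, purely for illustration.
rng = np.random.default_rng(0)
n = 5000
w_true = np.array([0.5, -1.0, 0.25])
x = rng.normal(size=(n, 3))
p = 1.0 / (1.0 + np.exp(-(x @ w_true)))
y = (rng.random(n) < p).astype(np.float64)

w_hat, eta_hat, is_converged, num_iter = fisher_scoring_logit(x, y)
print('is_converged:', is_converged, 'num_iter:', num_iter)
print('w_hat:', w_hat)
```

The four returned values are named after the entries in the Returns section so the correspondence is easy to follow; the loop stops either when the relative change in the coefficients is small or when `maximum_iterations` is reached.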

Returns
`model_coefficients`	(Batch of) vector-shaped `Tensor`; represents the fitted model coefficients, one for each column in `model_matrix`.
`predicted_linear_response`	`response`-shaped `Tensor` representing linear predictions based on new `model_coefficients`, i.e., `tf.linalg.matvec(model_matrix, model_coefficients) + offset`.
`is_converged`	`bool` `Tensor` indicating that the returned `model_coefficients` met the `convergence_criteria_fn` criteria within the `maximum_iterations` limit.
`iter_`	`int32` `Tensor` indicating the number of iterations taken.
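
Note that `predicted_linear_response` is on the scale of the linear predictor, not of the response. To obtain predicted means, apply the model's inverse link. The sketch below assumes a probit model (as with `tfp.glm.BernoulliNormalCDF`, whose mean is the standard normal CDF of the linear response) and uses only NumPy and the standard library; `probit_mean` is a hypothetical helper name, not a library function.

```python
import math

import numpy as np


def probit_mean(predicted_linear_response):
    """Standard normal CDF applied elementwise (hypothetical helper)."""
    eta = np.asarray(predicted_linear_response, dtype=np.float64)
    return 0.5 * (1.0 + np.vectorize(math.erf)(eta / math.sqrt(2.0)))


eta = np.array([-1.0, 0.0, 2.0])   # stand-in for predicted_linear_response
probs = probit_mean(eta)           # predicted P(y = 1) under a probit model
labels = probs > 0.5               # same decision as thresholding eta at 0
print(probs, labels)
```

Because the standard normal CDF is increasing and equals 0.5 at zero, thresholding the probability at 0.5 gives the same labels as thresholding the linear response at 0.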

Example

import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

def make_dataset(n, d, link, scale=1., dtype=np.float32):
  model_coefficients = tfd.Uniform(
      low=np.array(-1, dtype),
      high=np.array(1, dtype)).sample(d, seed=42)
  radius = np.sqrt(2.)
  model_coefficients *= radius / tf.linalg.norm(model_coefficients)
  model_matrix = tfd.Normal(
      loc=np.array(0, dtype),
      scale=np.array(1, dtype)).sample([n, d], seed=43)
  scale = tf.convert_to_tensor(scale, dtype)
  linear_response = tf.tensordot(
      model_matrix, model_coefficients, axes=[[1], [0]])
  if link == 'linear':
    response = tfd.Normal(loc=linear_response, scale=scale).sample(seed=44)
  elif link == 'probit':
    response = tf.cast(
        tfd.Normal(loc=linear_response, scale=scale).sample(seed=44) > 0,
        dtype)
  elif link == 'logit':
    # Cast so that `response` has the same dtype as `model_matrix`.
    response = tf.cast(
        tfd.Bernoulli(logits=linear_response).sample(seed=44), dtype)
  else:
    raise ValueError('unrecognized true link: {}'.format(link))
  return model_matrix, response, model_coefficients

X, Y, w_true = make_dataset(n=int(1e6), d=100, link='probit')

w, linear_response, is_converged, num_iter = tfp.glm.fit(
    model_matrix=X,
    response=Y,
    model=tfp.glm.BernoulliNormalCDF())
log_likelihood = tfp.glm.BernoulliNormalCDF().log_prob(Y, linear_response)

print('is_converged: ', is_converged.numpy())
print('    num_iter: ', num_iter.numpy())
print('    accuracy: ', np.mean((linear_response > 0.) == tf.cast(Y, bool)))
print('    deviance: ', 2. * np.mean(log_likelihood))
print('||w0-w1||_2 / (1+||w0||_2): ', (np.linalg.norm(w_true - w, ord=2) /
                                       (1. + np.linalg.norm(w_true, ord=2))))

# ==>
# is_converged:  True
#     num_iter:  6
#     accuracy:  0.804382
#     deviance:  -0.820746600628
# ||w0-w1||_2 / (1+||w0||_2):  0.00619245105309
