Runs multiple Fisher scoring steps.
tfp.glm.fit(
    model_matrix,
    response,
    model,
    model_coefficients_start=None,
    predicted_linear_response_start=None,
    l2_regularizer=None,
    dispersion=None,
    offset=None,
    convergence_criteria_fn=None,
    learning_rate=None,
    fast_unsafe_numerics=True,
    maximum_iterations=None,
    l2_regularization_penalty_factor=None,
    name=None
)
Args |
model_matrix
|
(Batch of) float-like, matrix-shaped Tensor where each row
represents a sample's features.
|
response
|
(Batch of) vector-shaped Tensor where each element represents a
sample's observed response (to the corresponding row of features). Must
have the same dtype as model_matrix.
|
model
|
tfp.glm.ExponentialFamily-like instance which implicitly
characterizes a negative log-likelihood loss by specifying the
distribution's mean, gradient_mean, and variance.
|
model_coefficients_start
|
Optional (batch of) vector-shaped Tensor
representing the initial model coefficients, one for each column in
model_matrix. Must have the same dtype as model_matrix.
Default value: Zeros.
|
predicted_linear_response_start
|
Optional Tensor with shape and dtype
matching response; represents the offset-shifted initial linear
predictions based on model_coefficients_start.
Default value: offset if model_coefficients_start is None, and
tf.linalg.matvec(model_matrix, model_coefficients_start) + offset
otherwise.
|
l2_regularizer
|
Optional scalar Tensor representing L2 regularization
penalty, i.e.,
loss(w) = sum{-log p(y[i]|x[i],w) : i=1..n} + l2_regularizer ||w||_2^2.
Default value: None (i.e., no L2 regularization).
|
dispersion
|
Optional (batch of) Tensor representing response dispersion,
i.e., as in p(y|theta) := exp((y theta - A(theta)) / dispersion).
Must broadcast with rows of model_matrix.
Default value: None (i.e., "no dispersion").
|
offset
|
Optional Tensor representing a constant shift applied to
predicted_linear_response. Must broadcast to response.
Default value: None (i.e., tf.zeros_like(response)).
|
convergence_criteria_fn
|
Python callable taking:
is_converged_previous, iter_, model_coefficients_previous,
predicted_linear_response_previous, model_coefficients_next,
predicted_linear_response_next, response, model, dispersion and
returning a bool Tensor indicating that Fisher scoring has converged.
See convergence_criteria_small_relative_norm_weights_change as an
example function.
Default value: None (i.e.,
convergence_criteria_small_relative_norm_weights_change).
|
learning_rate
|
Optional (batch of) scalar Tensor used to dampen iterative
progress. Typically only needed if optimization diverges; it should be no
larger than 1 and is typically very close to 1.
Default value: None (i.e., 1).
|
fast_unsafe_numerics
|
Optional Python bool indicating if faster, less
numerically accurate methods can be employed for computing the weighted
least-squares solution.
Default value: True (i.e., "fast but possibly diminished accuracy").
|
maximum_iterations
|
Optional maximum number of iterations of Fisher scoring
to run; "and-ed" with the result of convergence_criteria_fn.
Default value: None (i.e., infinity).
|
l2_regularization_penalty_factor
|
Optional (batch of) vector-shaped
Tensor representing a separate penalty factor to apply to each model
coefficient, with length equal to the number of columns in model_matrix.
Each penalty factor multiplies l2_regularizer to allow differential
regularization. A factor can be 0 for some coefficients, which implies no
regularization of those coefficients. Default is 1 for all coefficients.
loss(w) = sum{-log p(y[i]|x[i],w) : i=1..n} + l2_regularizer ||w * l2_regularization_penalty_factor||_2^2
Default value: None (i.e., no per-coefficient regularization).
|
name
|
Python str used as name prefix to ops created by this function.
Default value: "fit".
|
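The two penalized-loss formulas given for l2_regularizer and l2_regularization_penalty_factor can be checked by hand. The following is a minimal NumPy sketch of that arithmetic only; it is not TFP code, and the function name penalized_loss and all input values are hypothetical, chosen for illustration.

```python
import numpy as np

def penalized_loss(nll_per_sample, w, l2_regularizer=0.0, penalty_factor=None):
    """Sum of per-sample negative log-likelihoods plus a weighted L2 penalty.

    Mirrors the documented formula:
      sum{-log p(y[i]|x[i],w)} + l2_regularizer * ||w * penalty_factor||_2^2
    """
    w = np.asarray(w, dtype=np.float64)
    if penalty_factor is None:
        # Documented default: a factor of 1 for every coefficient.
        penalty_factor = np.ones_like(w)
    penalty = l2_regularizer * np.sum((w * penalty_factor) ** 2)
    return float(np.sum(nll_per_sample) + penalty)

nll = np.array([0.5, 0.25, 0.25])   # made-up per-sample values, sum = 1.0
w = np.array([2.0, -1.0, 3.0])      # first entry plays the role of an intercept
factor = np.array([0.0, 1.0, 1.0])  # 0 exempts the intercept from the penalty

uniform = penalized_loss(nll, w, l2_regularizer=0.5)
exempt = penalized_loss(nll, w, l2_regularizer=0.5, penalty_factor=factor)
print(uniform)  # 1.0 + 0.5 * (4 + 1 + 9) = 8.0
print(exempt)   # 1.0 + 0.5 * (0 + 1 + 9) = 6.0
```

Setting a factor to 0 for the intercept column is a common reason to want per-coefficient factors: the remaining coefficients are still shrunk while the intercept is left free.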
Returns |
model_coefficients
|
(Batch of) vector-shaped Tensor; represents the
fitted model coefficients, one for each column in model_matrix.
|
predicted_linear_response
|
response-shaped Tensor representing linear
predictions based on the new model_coefficients, i.e.,
tf.linalg.matvec(model_matrix, model_coefficients) + offset.
|
is_converged
|
bool Tensor indicating that the returned
model_coefficients met the convergence_criteria_fn criteria within the
maximum_iterations limit.
|
iter_
|
int32 Tensor indicating the number of iterations taken.
|
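To make the four return values concrete, here is a minimal NumPy sketch of Fisher scoring (iteratively reweighted least squares) for a Bernoulli response with a logit link. It is an illustration under stated assumptions, not TFP's implementation: the function name fisher_scoring_logit, the tolerance argument, the damping rule w + learning_rate * (w_new - w), and the relative-change stopping test are choices made for this sketch, and regularization and dispersion are omitted.

```python
import numpy as np

def fisher_scoring_logit(model_matrix, response, offset=0.0, learning_rate=1.0,
                         maximum_iterations=25, tolerance=1e-8):
    """Fits logistic regression by repeated weighted least-squares solves."""
    x = np.asarray(model_matrix, dtype=np.float64)
    y = np.asarray(response, dtype=np.float64)
    w = np.zeros(x.shape[1])          # start from zero coefficients
    is_converged = False
    iter_ = 0
    while iter_ < maximum_iterations and not is_converged:
        r = x @ w + offset                    # predicted linear response
        mean = 1.0 / (1.0 + np.exp(-r))       # inverse logit link
        var = mean * (1.0 - mean)             # Bernoulli variance; equals d(mean)/dr here
        z = (r - offset) + (y - mean) / var   # working response
        xtw = x.T * var                       # X^T W with W = diag(var)
        w_new = np.linalg.solve(xtw @ x, xtw @ z)
        w_next = w + learning_rate * (w_new - w)  # damped update (sketch assumption)
        rel_change = np.linalg.norm(w_next - w) / (1.0 + np.linalg.norm(w))
        is_converged = bool(rel_change < tolerance)
        w = w_next
        iter_ += 1
    return w, x @ w + offset, is_converged, iter_

# Synthetic data with known coefficients (all values are made up).
rng = np.random.default_rng(0)
w_true = np.array([0.5, -1.0, 0.25])
x = rng.normal(size=(2000, 3))
y = (rng.random(2000) < 1.0 / (1.0 + np.exp(-(x @ w_true)))).astype(np.float64)

w_hat, r_hat, is_converged, num_iter = fisher_scoring_logit(x, y)
print("is_converged:", is_converged, "num_iter:", num_iter)
print("w_hat:", np.round(w_hat, 3))
```

The loop condition shows why maximum_iterations is described as being "and-ed" with the convergence test: iteration continues only while the limit has not been reached and the criterion has not yet been met.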
Example
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

def make_dataset(n, d, link, scale=1., dtype=np.float32):
  model_coefficients = tfd.Uniform(
      low=np.array(-1, dtype),
      high=np.array(1, dtype)).sample(d, seed=42)
  radius = np.sqrt(2.)
  model_coefficients *= radius / tf.linalg.norm(model_coefficients)
  model_matrix = tfd.Normal(
      loc=np.array(0, dtype),
      scale=np.array(1, dtype)).sample([n, d], seed=43)
  scale = tf.convert_to_tensor(scale, dtype)
  linear_response = tf.tensordot(
      model_matrix, model_coefficients, axes=[[1], [0]])
  if link == 'linear':
    response = tfd.Normal(loc=linear_response, scale=scale).sample(seed=44)
  elif link == 'probit':
    response = tf.cast(
        tfd.Normal(loc=linear_response, scale=scale).sample(seed=44) > 0,
        dtype)
  elif link == 'logit':
    response = tfd.Bernoulli(logits=linear_response).sample(seed=44)
  else:
    raise ValueError('unrecognized true link: {}'.format(link))
  return model_matrix, response, model_coefficients

X, Y, w_true = make_dataset(n=int(1e6), d=100, link='probit')

w, linear_response, is_converged, num_iter = tfp.glm.fit(
    model_matrix=X,
    response=Y,
    model=tfp.glm.BernoulliNormalCDF())
log_likelihood = tfp.glm.BernoulliNormalCDF().log_prob(Y, linear_response)

print('is_converged: ', is_converged.numpy())
print(' num_iter: ', num_iter.numpy())
print(' accuracy: ', np.mean((linear_response > 0.) == tf.cast(Y, bool)))
print(' deviance: ', 2. * np.mean(log_likelihood))
print('||w0-w1||_2 / (1+||w0||_2): ', (np.linalg.norm(w_true - w, ord=2) /
                                       (1. + np.linalg.norm(w_true, ord=2))))
# ==>
# is_converged: True
# num_iter: 6
# accuracy: 0.804382
# deviance: -0.820746600628
# ||w0-w1||_2 / (1+||w0||_2): 0.00619245105309
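The accuracy line in the example thresholds linear_response at 0. For a probit model this is the same as thresholding the predicted probability at 0.5, because the mean is the standard normal CDF of the linear response. The following standalone sketch illustrates that relationship using only NumPy and the standard library; the helper normal_cdf and the input values are hypothetical and not part of TFP.

```python
import math
import numpy as np

def normal_cdf(r):
    """Standard normal CDF, i.e. the probit inverse link, applied elementwise."""
    r = np.asarray(r, dtype=np.float64)
    return 0.5 * (1.0 + np.vectorize(math.erf)(r / math.sqrt(2.0)))

linear_response = np.array([-2.0, 0.0, 0.5, 3.0])  # made-up linear predictions
prob = normal_cdf(linear_response)                 # predicted P(y = 1)
label = prob > 0.5                                 # same as linear_response > 0

print(prob)
print(label)
```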