Formal representation of a sparse linear regression.
This model defines a time series given by a sparse linear combination of covariate time series provided in a design matrix:
observed_time_series = matmul(design_matrix, weights)
This is identical to
tfp.sts.LinearRegression, except that
SparseLinearRegression uses a parameterization of a Horseshoe
prior [1] to encode the assumption that many of the
weights are zero,
i.e., many of the covariate time series are irrelevant. See the mathematical
details section below for further discussion. The prior parameterization used
by SparseLinearRegression is more suitable for inference than that
obtained by simply passing the equivalent
tfd.Horseshoe prior to
LinearRegression; when sparsity is desired,
SparseLinearRegression will likely yield better results.
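As a rough illustration of the regression equation, the following plain-Python sketch computes matmul(design_matrix, weights) for a small hand-written design matrix. The data values and the helper name `matvec` are hypothetical and are not part of TFP; the sketch only shows that covariates with zero weight contribute nothing to the resulting series.

```python
def matvec(design_matrix, weights):
    """Multiply a [num_timesteps, num_features] matrix by a [num_features] vector."""
    return [sum(x * w for x, w in zip(row, weights)) for row in design_matrix]

# Hypothetical data: 4 timesteps, 3 covariate series (one per column).
design_matrix = [
    [1.0, 10.0, 0.5],
    [2.0, 20.0, 0.25],
    [3.0, 30.0, 0.125],
    [4.0, 40.0, 0.0625],
]
# Sparse weights: only the first covariate is relevant.
weights = [2.0, 0.0, 0.0]

observed_time_series = matvec(design_matrix, weights)
print(observed_time_series)  # [2.0, 4.0, 6.0, 8.0]
```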
This component does not itself include observation noise; it defines a
deterministic distribution with mass at the point
matmul(design_matrix, weights). In practice, it should be combined with
observation noise from another component such as tfp.sts.Sum.
Given series1 and series2 as Tensors, each of shape [num_timesteps],
representing covariate time series, we create a regression model that
conditions on these covariates:
regression = tfp.sts.SparseLinearRegression(
    design_matrix=tf.stack([series1, series2], axis=-1),
    weights_prior_scale=0.1)
weights_prior_scale determines the level of sparsity; small
scales encourage the weights to be sparse. In some cases, such as when
the likelihood is iid Gaussian with known scale, the prior scale can be
analytically related to the expected number of nonzero weights [2]; however,
this is not the case in general for STS models.
If the design matrix has batch dimensions, by default the model will create a
matching batch of weights. For example, if
design_matrix.shape == [num_users, num_timesteps, num_features],
by default the model will fit
separate weights for each user, i.e., it will internally represent
weights.shape == [num_users, num_features]. To share weights across some or
all batch dimensions, you can manually specify the batch shape for the
weights:
# design_matrix.shape == [num_users, num_timesteps, num_features]
regression = tfp.sts.SparseLinearRegression(
    design_matrix=design_matrix,
    weights_batch_shape=[])
# weights.shape -> [num_features]
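To make the difference between per-user and shared weights concrete, here is a minimal plain-Python sketch using nested lists in place of Tensors. The data, the helper `matvec`, and the variable names are hypothetical; the sketch only mirrors the shapes discussed here and does not call TFP.

```python
def matvec(matrix, vector):
    """Multiply a [num_timesteps, num_features] matrix by a [num_features] vector."""
    return [sum(x * w for x, w in zip(row, vector)) for row in matrix]

# Hypothetical batch: 2 users, 3 timesteps, 2 features.
design_matrix = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],  # user 0
    [[2.0, 0.0], [0.0, 2.0], [2.0, 2.0]],  # user 1
]

# Default behavior: one weight vector per user, shape [num_users, num_features].
per_user_weights = [[1.0, 0.0], [0.0, 3.0]]
per_user_series = [matvec(m, w) for m, w in zip(design_matrix, per_user_weights)]

# Shared weights: a single vector of shape [num_features], reused for every user.
shared_weights = [1.0, 0.5]
shared_series = [matvec(m, shared_weights) for m in design_matrix]

print(per_user_series)  # [[1.0, 0.0, 1.0], [0.0, 6.0, 6.0]]
print(shared_series)    # [[1.0, 0.5, 1.5], [2.0, 1.0, 3.0]]
```

In both cases the output has one series per user; only the number of free weight vectors changes.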
The basic horseshoe prior [1] is defined as a Cauchy-normal scale mixture:
scales[i] ~ HalfCauchy(loc=0, scale=1)
weights[i] ~ Normal(loc=0., scale=scales[i] * global_scale)
The Cauchy scale parameters put substantial mass near zero, encouraging
weights to be sparse, but their heavy tails allow weights far from zero to be
estimated without excessive shrinkage. The horseshoe can be thought of as a
continuous relaxation of a traditional 'spike-and-slab' discrete sparsity
prior, in which the latent Cauchy scale mixes between 'spike'
(scales[i] ~= 0) and 'slab' (scales[i] >> 0) regimes.
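The spike-and-slab behavior of the basic horseshoe can be checked empirically. The sketch below uses only the Python standard library rather than TFP; the function name `sample_horseshoe_weights`, the seed, and the thresholds 0.05 and 1.0 are arbitrary choices made for illustration.

```python
import math
import random

def sample_horseshoe_weights(num_features, global_scale, rng):
    """Draw weights from the basic horseshoe prior (Cauchy-normal scale mixture)."""
    weights = []
    for _ in range(num_features):
        # HalfCauchy(0, 1) via the inverse CDF: tan(pi * u / 2) for u in [0, 1).
        local_scale = math.tan(0.5 * math.pi * rng.random())
        weights.append(rng.gauss(0.0, local_scale * global_scale))
    return weights

rng = random.Random(0)  # fixed seed so the sketch is repeatable
weights = sample_horseshoe_weights(num_features=10_000, global_scale=0.1, rng=rng)

# Most draws sit in the 'spike' near zero, but a few land far out in the 'slab'.
near_zero = sum(abs(w) < 0.05 for w in weights) / len(weights)
far_from_zero = sum(abs(w) > 1.0 for w in weights) / len(weights)
print(f"fraction with |w| < 0.05: {near_zero:.2f}")
print(f"fraction with |w| > 1.0:  {far_from_zero:.2f}")
```

A Normal prior with a comparable scale would put essentially no mass beyond ten scale units, whereas the horseshoe keeps a small but non-negligible fraction there.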
Following the recommendations in [2], SparseLinearRegression implements
a horseshoe with the following adaptations:
- The Cauchy prior on scales[i] is represented as an InverseGamma-Normal compound.
- The global_scale parameter is integrated out following a Cauchy(0., scale=weights_prior_scale) hyperprior, which is also represented as an InverseGamma-Normal compound.
- All compound distributions are implemented using a non-centered parameterization.
The compound, non-centered representation defines the same marginal prior as the original horseshoe (up to integrating out the global scale), but allows samplers to mix more efficiently through the heavy tails; for variational inference, the compound representation implicitly expands the representational power of the variational model.
Note that we do not yet implement the regularized ('Finnish') horseshoe, proposed in [2] for models with weak likelihoods, because the likelihood in STS models is typically Gaussian, where it's not clear that additional regularization is appropriate. If you need this functionality, please email firstname.lastname@example.org.
The full prior parameterization implemented in SparseLinearRegression is:
# Sample global_scale from Cauchy(0, scale=weights_prior_scale).
global_scale_variance ~ InverseGamma(alpha=0.5, beta=0.5)
global_scale_noncentered ~ HalfNormal(loc=0, scale=1)
global_scale = (global_scale_noncentered *
                sqrt(global_scale_variance) *
                weights_prior_scale)
# Sample local_scales from Cauchy(0, 1).
local_scale_variances[i] ~ InverseGamma(alpha=0.5, beta=0.5)
local_scales_noncentered[i] ~ HalfNormal(loc=0, scale=1)
local_scales[i] = local_scales_noncentered[i] * sqrt(local_scale_variances[i])
weights[i] ~ Normal(loc=0., scale=local_scales[i] * global_scale)
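The sampling scheme above can be traced step by step with the Python standard library. This is a sketch under stated assumptions, not the TFP implementation: the helper names (`sample_inverse_gamma`, `sample_half_normal`, `sample_weights`) and the seed are hypothetical, and the weights are drawn directly from the final Normal as written in the listing. The last lines check empirically that a HalfNormal draw times the square root of an InverseGamma(0.5, 0.5) draw behaves like a HalfCauchy(0, 1), whose median is 1.

```python
import math
import random

def sample_inverse_gamma(alpha, beta, rng):
    # If X ~ Gamma(shape=alpha, rate=beta), then 1/X ~ InverseGamma(alpha, beta).
    # random.gammavariate takes a scale parameter, i.e. 1 / rate.
    return 1.0 / rng.gammavariate(alpha, 1.0 / beta)

def sample_half_normal(rng):
    return abs(rng.gauss(0.0, 1.0))

def sample_weights(num_features, weights_prior_scale, rng):
    # Global scale: HalfCauchy(0, weights_prior_scale) in compound form.
    global_scale_variance = sample_inverse_gamma(0.5, 0.5, rng)
    global_scale_noncentered = sample_half_normal(rng)
    global_scale = (global_scale_noncentered
                    * math.sqrt(global_scale_variance)
                    * weights_prior_scale)
    weights = []
    for _ in range(num_features):
        # Local scale: HalfCauchy(0, 1) in compound form.
        local_scale_variance = sample_inverse_gamma(0.5, 0.5, rng)
        local_scale_noncentered = sample_half_normal(rng)
        local_scale = local_scale_noncentered * math.sqrt(local_scale_variance)
        weights.append(rng.gauss(0.0, local_scale * global_scale))
    return weights

rng = random.Random(0)  # fixed seed so the sketch is repeatable
weights = sample_weights(num_features=5, weights_prior_scale=0.1, rng=rng)

# Empirical check of the compound representation against HalfCauchy(0, 1).
compound = sorted(
    sample_half_normal(rng) * math.sqrt(sample_inverse_gamma(0.5, 0.5, rng))
    for _ in range(20_000))
median = compound[len(compound) // 2]
print(f"empirical median of compound scale: {median:.2f}")
```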
[1]: Carvalho, C., Polson, N. and Scott, J. Handling Sparsity via the Horseshoe. AISTATS (2009). http://proceedings.mlr.press/v5/carvalho09a/carvalho09a.pdf
[2]: Juho Piironen, Aki Vehtari. Sparsity information and regularization in the horseshoe and other shrinkage priors (2017). https://arxiv.org/abs/1707.01694
__init__( design_matrix, weights_prior_scale=0.1, weights_batch_shape=None, name=None )
Specify a sparse linear regression model.
design_matrix: float Tensor of shape concat([batch_shape, [num_timesteps, num_features]]). This may also optionally be an instance of tf.linalg.LinearOperator.
weights_prior_scale: float Tensor defining the scale of the Horseshoe prior on regression weights. Small values encourage the weights to be sparse. The shape must broadcast with weights_batch_shape. Default value: 0.1.
weights_batch_shape: if None, defaults to design_matrix.batch_shape_tensor(). Must broadcast with the batch shape of design_matrix. Default value: None.
name: the name of this model component. Default value: 'SparseLinearRegression'.
Static batch shape of models represented by this component.
batch_shape: tf.TensorShape giving the broadcast batch shape of all model parameters. This should match the batch shape of derived state space models, i.e., self.make_state_space_model(...).batch_shape. It may be partially defined or unknown.
LinearOperator representing the design matrix.
int dimensionality of the latent space in this model.
Name of this model component.
List of Parameter(name, prior, bijector) namedtuples for this model.
Runtime batch shape of models represented by this component.
batch_shape: int Tensor giving the broadcast batch shape of all model parameters. This should match the batch shape of derived state space models, i.e., self.make_state_space_model(...).batch_shape_tensor().
Build the joint density log p(params) + log p(y|params) as a callable.
observed_time_series: Observed Tensor trajectories of shape sample_shape + batch_shape + [num_timesteps, 1] (the trailing 1 dimension is optional if num_timesteps > 1), where batch_shape should match self.batch_shape (the broadcast batch shape of all priors on parameters for this structural time series model). May optionally be an instance of tfp.sts.MaskedTimeSeries, which includes a mask Tensor to specify timesteps with missing observations.
log_joint_fn: A function taking a Tensor argument for each model parameter, in canonical order, and returning a Tensor log probability of shape batch_shape. Note that, unlike Distribution log_prob methods, the log_joint sums over the sample_shape from y, so that sample_shape does not appear in the output log_prob. This corresponds to viewing multiple samples in y as iid observations from a single model, which is typically the desired behavior for parameter inference.
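The statement that the log joint sums over iid observations can be illustrated with a toy model that is much simpler than an STS model. The sketch below is hypothetical and does not use TFP: it assumes a single parameter mu with a standard Normal prior and unit-variance Normal observations, and the names `normal_log_prob` and `make_log_joint_fn` are invented for this example.

```python
import math

def normal_log_prob(x, loc, scale):
    """Log density of Normal(loc, scale) evaluated at x."""
    return (-0.5 * ((x - loc) / scale) ** 2
            - math.log(scale) - 0.5 * math.log(2.0 * math.pi))

def make_log_joint_fn(observations):
    """Return log p(mu) + sum_i log p(y_i | mu) as a callable of mu."""
    def log_joint_fn(mu):
        log_prior = normal_log_prob(mu, loc=0.0, scale=1.0)
        # Summing over the iid observations removes the sample dimension,
        # so the result is a single number however many samples are passed in.
        log_likelihood = sum(
            normal_log_prob(y, loc=mu, scale=1.0) for y in observations)
        return log_prior + log_likelihood
    return log_joint_fn

y = [0.5, 1.5, 1.0]  # three iid observations
log_joint_fn = make_log_joint_fn(y)
print(log_joint_fn(1.0))
```

The returned callable takes only parameter values, which is the form that optimizers and samplers typically expect.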
make_state_space_model( num_timesteps, param_vals=None, initial_state_prior=None, initial_step=0 )
Instantiate this model as a Distribution over the specified num_timesteps.
num_timesteps: Python int number of timesteps to model.
param_vals: a list of Tensor parameter values in order corresponding to self.parameters, or a dict mapping from parameter names to values.
initial_state_prior: an optional Distribution instance overriding the default prior on the model's initial state. This is used in forecasting ("today's prior is yesterday's posterior").
initial_step: optional int specifying the initial timestep to model. This is relevant when the model contains time-varying components, e.g., holidays or seasonality.
params_to_weights( global_scale_variance, global_scale_noncentered, local_scale_variances, local_scales_noncentered, weights_noncentered )
Build regression weights from model parameters.
prior_sample( num_timesteps, initial_step=0, params_sample_shape=(), trajectories_sample_shape=(), seed=None )
Sample from the joint prior over model parameters and trajectories.
num_timesteps: Scalar int Tensor number of timesteps to model.
initial_step: Optional scalar int Tensor specifying the starting timestep. Default value: 0.
params_sample_shape: Number of possible worlds to sample iid from the parameter prior, or more generally, Tensor int shape to fill with iid samples. Default value: ().
trajectories_sample_shape: For each sampled set of parameters, number of trajectories to sample, or more generally, Tensor int shape to fill with iid samples. Default value: ().
trajectories: Tensor of shape trajectories_sample_shape + params_sample_shape + [num_timesteps, 1] containing all sampled trajectories.
param_samples: list of sampled parameter value Tensors, in order corresponding to self.parameters, each of shape params_sample_shape + prior.batch_shape + prior.event_shape.