tfp.sts.AutoregressiveStateSpaceModel

Class AutoregressiveStateSpaceModel

State space model for an autoregressive process.

Inherits From: LinearGaussianStateSpaceModel

Defined in python/sts/autoregressive.py.

A state space model (SSM) posits a set of latent (unobserved) variables that evolve over time with dynamics specified by a probabilistic transition model p(z[t+1] | z[t]). At each timestep, we observe a value sampled from an observation model conditioned on the current state, p(x[t] | z[t]). The special case where both the transition and observation models are Gaussians with mean specified as a linear function of the inputs, is known as a linear Gaussian state space model and supports tractable exact probabilistic calculations; see tfp.distributions.LinearGaussianStateSpaceModel for details.

In an autoregressive process, the expected level at each timestep is a linear function of previous levels, with added Gaussian noise:

level[t+1] = (sum(coefficients * levels[t:t-order:-1]) +
              Normal(0., level_scale))

The process is characterized by a vector coefficients whose size determines the order of the process (how many previous values it looks at), and by level_scale, the standard deviation of the noise added at each step.
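
The recursion above can be simulated directly. The following is a minimal sketch using only the Python standard library, not the library's implementation; the function name simulate_ar and its arguments initial_levels and seed are hypothetical names introduced for illustration.

```python
import random


def simulate_ar(coefficients, level_scale, initial_levels, num_timesteps, seed=0):
    """Simulate level[t+1] = sum(coefficients * recent levels) + Normal(0, level_scale).

    `initial_levels` is ordered most-recent-first, matching `coefficients`.
    """
    rng = random.Random(seed)
    order = len(coefficients)
    history = list(initial_levels)  # history[0] is level[t], history[1] is level[t-1], ...
    levels = []
    for _ in range(num_timesteps):
        expected = sum(c * h for c, h in zip(coefficients, history))
        new_level = expected + rng.gauss(0.0, level_scale)
        levels.append(new_level)
        # Keep only the `order` most recent levels, newest first.
        history = [new_level] + history[:order - 1]
    return levels


# With level_scale=0 the process is deterministic, which makes it easy to check by hand.
noiseless = simulate_ar([0.8, -0.1], level_scale=0.0, initial_levels=[1.0, 0.0], num_timesteps=3)
noisy = simulate_ar([0.8, -0.1], level_scale=0.5, initial_levels=[1.0, 0.0], num_timesteps=50)
```

Here the length of coefficients (2) is the order of the process, so each new level depends on the two most recent levels.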

This is formulated as a state space model by letting the latent state encode the most recent values; see 'Mathematical Details' below.

The parameters level_scale and observation_noise_scale are each (a batch of) scalars, and coefficients is a (batch) vector of size [order]. The batch shape of this Distribution is the broadcast batch shape of these parameters and of the initial_state_prior.

Mathematical Details

The autoregressive model implements a tfp.distributions.LinearGaussianStateSpaceModel with latent_size = order and observation_size = 1. The latent state vector encodes the recent history of the process, with the current value in the topmost dimension. At each timestep, the transition takes a coefficient-weighted sum of the previous values to produce the new expected value, shifts all other values down by a dimension, and adds noise to the current value. This is formally encoded by the transition model:

transition_matrix = [ coefs[0], coefs[1], ..., coefs[order-2], coefs[order-1]
                      1.,       0.,       ..., 0.,             0.
                      0.,       1.,       ..., 0.,             0.
                      ...
                      0.,       0.,       ..., 1.,             0.            ]
transition_noise ~ N(loc=0., scale=diag([level_scale, 0., 0., ..., 0.]))
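
To make the structure of the transition matrix concrete, here is a small sketch in plain Python. It assumes the top row holds the coefficients coefs[0] through coefs[order-1] and that the remaining rows shift the history down by one position; make_transition_matrix and apply_transition are hypothetical helper names, not part of the library.

```python
def make_transition_matrix(coefficients):
    """Top row holds the coefficients; the rows below shift the history down by one."""
    order = len(coefficients)
    top_row = [float(c) for c in coefficients]
    shift_rows = [[1.0 if col == row else 0.0 for col in range(order)]
                  for row in range(order - 1)]
    return [top_row] + shift_rows


def apply_transition(matrix, state):
    """Noise-free transition: matrix-vector product of the matrix with the state."""
    return [sum(m * s for m, s in zip(row, state)) for row in matrix]


matrix = make_transition_matrix([0.8, -0.1, 0.05])
# state = [level[t], level[t-1], level[t-2]]
next_state = apply_transition(matrix, [1.0, 2.0, 3.0])
```

The first entry of next_state is the new expected level, and the remaining entries are the previous state shifted down, so the oldest value drops out.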

The observation model simply extracts the current (topmost) value, and optionally adds independent noise at each step:

observation_matrix = [[1., 0., ..., 0.]]
observation_noise ~ N(loc=0, scale=observation_noise_scale)

Models with observation_noise_scale = 0. are AR processes in the formal sense. Setting observation_noise_scale to a nonzero value corresponds to a latent AR process observed under an iid noise model.

Examples

A simple model definition:

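import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions
AutoregressiveStateSpaceModel = tfp.sts.AutoregressiveStateSpaceModel

# Second-order AR process; the prior's event shape [2] matches the order.
ar_model = AutoregressiveStateSpaceModel(
    num_timesteps=50,
    coefficients=[0.8, -0.1],
    level_scale=0.5,
    initial_state_prior=tfd.MultivariateNormalDiag(
      scale_diag=[1., 1.]))

y = ar_model.sample() # y has shape [50, 1]
lp = ar_model.log_prob(y) # log_prob is scalar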

Passing additional parameter dimensions constructs a batch of models. The overall batch shape is the broadcast batch shape of the parameters:

ar_model = AutoregressiveStateSpaceModel(
    num_timesteps=50,
    coefficients=[0.8, -0.1],
    level_scale=tf.ones([10]),
    initial_state_prior=tfd.MultivariateNormalDiag(
      scale_diag=tf.ones([10, 10, 2])))

y = ar_model.sample(5) # y has shape [5, 10, 10, 50, 1]
lp = ar_model.log_prob(y) # has shape [5, 10, 10]
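
The shapes in the batch example follow from right-aligned broadcasting of the parameters' batch shapes. The sketch below reproduces that arithmetic in plain Python; broadcast_shapes is a hypothetical helper written for illustration, not a library call.

```python
def broadcast_shapes(*shapes):
    """Broadcast shapes NumPy/TensorFlow style, aligning dimensions from the right."""
    rank = max(len(shape) for shape in shapes)
    padded = [[1] * (rank - len(shape)) + list(shape) for shape in shapes]
    result = []
    for dims in zip(*padded):
        sizes = {d for d in dims if d != 1}
        if len(sizes) > 1:
            raise ValueError("incompatible dimensions: {}".format(dims))
        result.append(sizes.pop() if sizes else 1)
    return result


# Batch shapes in the example: coefficients [], level_scale [10], initial_state_prior [10, 10].
batch_shape = broadcast_shapes([], [10], [10, 10])
sample_shape = [5]
event_shape = [50, 1]  # [num_timesteps, observation_size]
full_shape = sample_shape + batch_shape + event_shape
```

The sample has shape sample_shape + batch_shape + event_shape, while log_prob keeps only sample_shape + batch_shape.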

__init__

__init__(
    num_timesteps,
    coefficients,
    level_scale,
    initial_state_prior,
    observation_noise_scale=0.0,
    initial_step=0,
    validate_args=False,
    name=None
)

Build a state space model implementing an autoregressive process.

Args:

  • num_timesteps: Scalar int Tensor number of timesteps to model with this distribution.
  • coefficients: float Tensor of shape concat(batch_shape, [order]) defining the autoregressive coefficients. The coefficients are defined backwards in time: coefficients[0] * level[t] + coefficients[1] * level[t-1] + ... + coefficients[order-1] * level[t-order+1].
  • level_scale: Scalar (any additional dimensions are treated as batch dimensions) float Tensor indicating the standard deviation of the transition noise at each step.
  • initial_state_prior: instance of tfd.MultivariateNormal representing the prior distribution on latent states. Must have event shape [order].
  • observation_noise_scale: Scalar (any additional dimensions are treated as batch dimensions) float Tensor indicating the standard deviation of the observation noise. Default value: 0.
  • initial_step: Optional scalar int Tensor specifying the starting timestep. Default value: 0.
  • validate_args: Python bool. Whether to validate input with asserts. If validate_args is False, and the inputs are invalid, correct behavior is not guaranteed. Default value: False.
  • name: Python str name prefixed to ops created by this class. Default value: "AutoregressiveStateSpaceModel".

Properties

allow_nan_stats

Python bool describing behavior when a stat is undefined.

Stats return +/- infinity when it makes sense. E.g., the variance of a Cauchy distribution is infinity. However, sometimes the statistic is undefined, e.g., if a distribution's pdf does not achieve a maximum within the support of the distribution, the mode is undefined. If the mean is undefined, then by definition the variance is undefined. E.g. the mean for Student's T for df = 1 is undefined (no clear way to say it is either + or - infinity), so the variance = E[(X - mean)**2] is also undefined.

Returns:

  • allow_nan_stats: Python bool.

batch_shape

Shape of a single sample from a single event index as a TensorShape.

May be partially defined or unknown.

The batch dimensions are indexes into independent, non-identical parameterizations of this distribution.

Returns:

  • batch_shape: TensorShape, possibly unknown.

coefficients

dtype

The DType of Tensors handled by this Distribution.

event_shape

Shape of a single sample from a single batch as a TensorShape.

May be partially defined or unknown.

Returns:

  • event_shape: TensorShape, possibly unknown.

level_scale

name

Name prepended to all ops created by this Distribution.

order

parameters

Dictionary of parameters used to instantiate this Distribution.

reparameterization_type

Describes how samples from the distribution are reparameterized.

Currently this is one of the static instances tfd.FULLY_REPARAMETERIZED or tfd.NOT_REPARAMETERIZED.

Returns:

An instance of ReparameterizationType.

validate_args

Python bool indicating possibly expensive checks are enabled.

Methods

__getitem__

__getitem__(slices)

Slices the batch axes of this distribution, returning a new instance.

b = tfd.Bernoulli(logits=tf.zeros([3, 5, 7, 9]))
b.batch_shape  # => [3, 5, 7, 9]
b2 = b[:, tf.newaxis, ..., -2:, 1::2]
b2.batch_shape  # => [3, 1, 5, 2, 4]

x = tf.random.normal([5, 3, 2, 2])
cov = tf.matmul(x, x, transpose_b=True)
chol = tf.linalg.cholesky(cov)
loc = tf.random.normal([4, 1, 3, 1])
mvn = tfd.MultivariateNormalTriL(loc, chol)
mvn.batch_shape  # => [4, 5, 3]
mvn.event_shape  # => [2]
mvn2 = mvn[:, 3:, ..., ::-1, tf.newaxis]
mvn2.batch_shape  # => [4, 2, 3, 1]
mvn2.event_shape  # => [2]

Args:

  • slices: slices from the [] operator

Returns:

  • dist: A new tfd.Distribution instance with sliced parameters.

__iter__

__iter__()

backward_smoothing_pass

backward_smoothing_pass(
    filtered_means,
    filtered_covs,
    predicted_means,
    predicted_covs
)

Run the backward pass in Kalman smoother.

The backward pass uses the Rauch-Tung-Striebel smoother, as discussed in section 18.3.2 of Kevin P. Murphy, 2012, Machine Learning: A Probabilistic Perspective, The MIT Press. The inputs are the values returned by the forward_filter function.

Args:

  • filtered_means: Means of the per-timestep filtered marginal distributions p(z_t | x_{:t}), as a Tensor of shape sample_shape(x) + batch_shape + [num_timesteps, latent_size].
  • filtered_covs: Covariances of the per-timestep filtered marginal distributions p(z_t | x_{:t}), as a Tensor of shape batch_shape + [num_timesteps, latent_size, latent_size].
  • predicted_means: Means of the per-timestep predictive distributions over latent states, p(z_{t+1} | x_{:t}), as a Tensor of shape sample_shape(x) + batch_shape + [num_timesteps, latent_size].
  • predicted_covs: Covariances of the per-timestep predictive distributions over latent states, p(z_{t+1} | x_{:t}), as a Tensor of shape batch_shape + [num_timesteps, latent_size, latent_size].

Returns:

  • posterior_means: Means of the smoothed marginal distributions p(z_t | x_{1:T}), as a Tensor of shape sample_shape(x) + batch_shape + [num_timesteps, latent_size], which is of the same shape as filtered_means.
  • posterior_covs: Covariances of the smoothed marginal distributions p(z_t | x_{1:T}), as a Tensor of shape batch_shape + [num_timesteps, latent_size, latent_size], which is of the same shape as filtered_covs.
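
For intuition, the backward recursion can be written out for the scalar case (latent_size = 1). This is a minimal sketch of a Rauch-Tung-Striebel pass, not the library's implementation; the function name rts_smoother_1d and the explicit transition coefficient a are assumptions made for illustration, since the library method reads the transition from the model itself.

```python
def rts_smoother_1d(filtered_means, filtered_covs, predicted_means, predicted_covs, a):
    """Scalar Rauch-Tung-Striebel backward pass for z[t+1] = a * z[t] + noise.

    predicted_means[t] and predicted_covs[t] describe p(z[t+1] | x[0..t]).
    """
    num_timesteps = len(filtered_means)
    smoothed_means = list(filtered_means)
    smoothed_covs = list(filtered_covs)
    # The last smoothed marginal equals the last filtered marginal; walk backwards from there.
    for t in range(num_timesteps - 2, -1, -1):
        gain = filtered_covs[t] * a / predicted_covs[t]
        smoothed_means[t] = filtered_means[t] + gain * (smoothed_means[t + 1] - predicted_means[t])
        smoothed_covs[t] = filtered_covs[t] + gain ** 2 * (smoothed_covs[t + 1] - predicted_covs[t])
    return smoothed_means, smoothed_covs


# Two timesteps of a random walk (a = 1) with unit transition noise variance.
means, covs = rts_smoother_1d(
    filtered_means=[0.0, 1.0], filtered_covs=[1.0, 0.5],
    predicted_means=[0.0, 1.0], predicted_covs=[2.0, 1.5], a=1.0)
```

Smoothing pulls the first mean toward the later evidence and shrinks its variance, while the final timestep is left unchanged.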

batch_shape_tensor

batch_shape_tensor(name='batch_shape_tensor')

Shape of a single sample from a single event index as a 1-D Tensor.

The batch dimensions are indexes into independent, non-identical parameterizations of this distribution.

Args:

  • name: name to give to the op

Returns:

  • batch_shape: Tensor.

cdf

cdf(
    value,
    name='cdf',
    **kwargs
)

Cumulative distribution function.

Given random variable X, the cumulative distribution function cdf is:

cdf(x) := P[X <= x]

Args:

  • value: float or double Tensor.
  • name: Python str prepended to names of ops created by this function.
  • **kwargs: Named arguments forwarded to subclass implementation.

Returns:

  • cdf: a Tensor of shape sample_shape(x) + self.batch_shape with values of type self.dtype.

copy

copy(**override_parameters_kwargs)

Creates a deep copy of the distribution.

Args:

  • **override_parameters_kwargs: String/value dictionary of initialization arguments to override with new values.

Returns:

  • distribution: A new instance of type(self) initialized from the union of self.parameters and override_parameters_kwargs, i.e., dict(self.parameters, **override_parameters_kwargs).
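
The override rule dict(self.parameters, **override_parameters_kwargs) can be illustrated with a toy class. ToyDistribution below is a hypothetical stand-in, not a library class; it only mimics the pattern of rebuilding an instance from stored constructor arguments.

```python
class ToyDistribution:
    """Hypothetical stand-in for a distribution that records its constructor arguments."""

    def __init__(self, loc=0.0, scale=1.0):
        self.parameters = dict(loc=loc, scale=scale)

    def copy(self, **override_parameters_kwargs):
        # Later keys win, so overrides replace the stored parameters of the same name.
        parameters = dict(self.parameters, **override_parameters_kwargs)
        return type(self)(**parameters)


original = ToyDistribution(loc=1.0, scale=2.0)
rescaled = original.copy(scale=0.5)
```

The copy keeps loc from the original and takes the new scale, and the original's parameters are left untouched.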

covariance

covariance(
    name='covariance',
    **kwargs
)

Covariance.

Covariance is (possibly) defined only for non-scalar-event distributions.

For example, for a length-k, vector-valued distribution, it is calculated as,

Cov[i, j] = Covariance(X_i, X_j) = E[(X_i - E[X_i]) (X_j - E[X_j])]

where Cov is a (batch of) k x k matrix, 0 <= (i, j) < k, and E denotes expectation.

Alternatively, for non-vector, multivariate distributions (e.g., matrix-valued, Wishart), Covariance shall return a (batch of) matrices under some vectorization of the events, i.e.,

Cov[i, j] = Covariance(Vec(X)_i, Vec(X)_j) = [as above]

where Cov is a (batch of) k' x k' matrices, 0 <= (i, j) < k' = reduce_prod(event_shape), and Vec is some function mapping indices of this distribution's event dimensions to indices of a length-k' vector.

Args:

  • name: Python str prepended to names of ops created by this function.
  • **kwargs: Named arguments forwarded to subclass implementation.

Returns:

  • covariance: Floating-point Tensor with shape [B1, ..., Bn, k', k'] where the first n dimensions are batch coordinates and k' = reduce_prod(self.event_shape).

cross_entropy

cross_entropy(
    other,
    name='cross_entropy'
)

Computes the (Shannon) cross entropy.

Denote this distribution (self) by P and the other distribution by Q. Assuming P, Q are absolutely continuous with respect to one another and permit densities p(x) dr(x) and q(x) dr(x), (Shannon) cross entropy is defined as:

H[P, Q] = E_p[-log q(X)] = -int_F p(x) log q(x) dr(x)

where F denotes the support of the random variable X ~ P.

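Args:

  • other: tfp.distributions.Distribution instance.
  • name: Python str prepended to names of ops created by this function.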

Returns:

  • cross_entropy: self.dtype Tensor with shape [B1, ..., Bn] representing n different calculations of (Shannon) cross entropy.

entropy

entropy(
    name='entropy',
    **kwargs
)

Shannon entropy in nats.

event_shape_tensor

event_shape_tensor(name='event_shape_tensor')

Shape of a single sample from a single batch as a 1-D int32 Tensor.

Args:

  • name: name to give to the op

Returns:

  • event_shape: Tensor.

forward_filter

forward_filter(
    x,
    mask=None
)

Run a Kalman filter over a provided sequence of outputs.

Note that the returned values filtered_means, predicted_means, and observation_means depend on the observed time series x, while the corresponding covariances are independent of the observed series; i.e., they depend only on the model itself. This means that the mean values have shape concat([sample_shape(x), batch_shape, [num_timesteps, {latent/observation}_size]]), while the covariances have shape concat([batch_shape, [num_timesteps, {latent/observation}_size, {latent/observation}_size]]), which does not depend on the sample shape.

Args:

  • x: a float-type Tensor with rightmost dimensions [num_timesteps, observation_size] matching self.event_shape. Additional dimensions must match or be broadcastable to self.batch_shape; any further dimensions are interpreted as a sample shape.
  • mask: optional bool-type Tensor with rightmost dimension [num_timesteps]; True values specify that the value of x at that timestep is masked, i.e., not conditioned on. Additional dimensions must match or be broadcastable to self.batch_shape; any further dimensions must match or be broadcastable to the sample shape of x. Default value: None.

Returns:

  • log_likelihoods: Per-timestep log marginal likelihoods log p(x_t | x_{:t-1}) evaluated at the input x, as a Tensor of shape sample_shape(x) + batch_shape + [num_timesteps].
  • filtered_means: Means of the per-timestep filtered marginal distributions p(z_t | x_{:t}), as a Tensor of shape sample_shape(x) + batch_shape + [num_timesteps, latent_size].
  • filtered_covs: Covariances of the per-timestep filtered marginal distributions p(z_t | x_{:t}), as a Tensor of shape sample_shape(mask) + batch_shape + [num_timesteps, latent_size, latent_size]. Note that the covariances depend only on the model and the mask, not on the data, so this may have fewer dimensions than filtered_means.
  • predicted_means: Means of the per-timestep predictive distributions over latent states, p(z_{t+1} | x_{:t}), as a Tensor of shape sample_shape(x) + batch_shape + [num_timesteps, latent_size].
  • predicted_covs: Covariances of the per-timestep predictive distributions over latent states, p(z_{t+1} | x_{:t}), as a Tensor of shape sample_shape(mask) + batch_shape + [num_timesteps, latent_size, latent_size]. Note that the covariances depend only on the model and the mask, not on the data, so this may have fewer dimensions than predicted_means.
  • observation_means: Means of the per-timestep predictive distributions over observations, p(x_t | x_{:t-1}), as a Tensor of shape sample_shape(x) + batch_shape + [num_timesteps, observation_size].
  • observation_covs: Covariances of the per-timestep predictive distributions over observations, p(x_t | x_{:t-1}), as a Tensor of shape sample_shape(mask) + batch_shape + [num_timesteps, observation_size, observation_size]. Note that the covariances depend only on the model and the mask, not on the data, so this may have fewer dimensions than observation_means.
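
The quantities listed above can be traced through a scalar Kalman filter. The sketch below is an illustration under simplifying assumptions (latent_size = observation_size = 1, an identity observation matrix, and a positive predictive variance), not the library's implementation; kalman_filter_1d and its argument names are hypothetical, and recording a log-likelihood of 0 at masked steps is a convention of this sketch.

```python
import math


def kalman_filter_1d(xs, a, q, r, prior_mean, prior_var, mask=None):
    """Scalar Kalman filter for z[t+1] = a*z[t] + N(0, q) and x[t] = z[t] + N(0, r).

    q and r are variances. mask[t] == True means x[t] is not conditioned on.
    """
    log_likelihoods, filtered, predicted = [], [], []
    mean, var = prior_mean, prior_var  # belief about z[t] before seeing x[t]
    for t, x in enumerate(xs):
        obs_var = var + r  # predictive variance of x[t]
        if mask is not None and mask[t]:
            # Masked step: skip the update (this sketch records a log-likelihood of 0).
            log_likelihoods.append(0.0)
        else:
            gain = var / obs_var
            log_likelihoods.append(
                -0.5 * (math.log(2.0 * math.pi * obs_var) + (x - mean) ** 2 / obs_var))
            mean, var = mean + gain * (x - mean), (1.0 - gain) * var
        filtered.append((mean, var))
        # Predict z[t+1] from the filtered belief about z[t].
        mean, var = a * mean, a * a * var + q
        predicted.append((mean, var))
    return log_likelihoods, filtered, predicted


lls, filt, pred = kalman_filter_1d(
    [1.0, 5.0], a=0.5, q=1.0, r=1.0, prior_mean=0.0, prior_var=1.0, mask=[False, True])
```

Note that the variances in filt and pred never touch the observed values, only the model parameters and the mask, which is why the covariances do not carry the sample shape of x.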

is_scalar_batch

is_scalar_batch(name='is_scalar_batch')

Indicates that batch_shape == [].

Args:

  • name: Python str prepended to names of ops created by this function.

Returns:

  • is_scalar_batch: bool scalar Tensor.

is_scalar_event

is_scalar_event(name='is_scalar_event')

Indicates that event_shape == [].

Args:

  • name: Python str prepended to names of ops created by this function.

Returns:

  • is_scalar_event: bool scalar Tensor.

kl_divergence

kl_divergence(
    other,
    name='kl_divergence'
)

Computes the Kullback--Leibler divergence.

Denote this distribution (self) by p and the other distribution by q. Assuming p, q are absolutely continuous with respect to reference measure r, the KL divergence is defined as:

KL[p, q] = E_p[log(p(X)/q(X))]
         = -int_F p(x) log q(x) dr(x) + int_F p(x) log p(x) dr(x)
         = H[p, q] - H[p]

where F denotes the support of the random variable X ~ p, H[., .] denotes (Shannon) cross entropy, and H[.] denotes (Shannon) entropy.
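
The identity KL[p, q] = H[p, q] - H[p] is easy to verify numerically. The sketch below does so for two discrete distributions, where the reference measure is the counting measure; this is an illustration only (the state space model itself is continuous), and the function names are hypothetical helpers rather than library calls.

```python
import math


def entropy(p):
    """Shannon entropy H[p] in nats of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Shannon cross entropy H[p, q] = E_p[-log q(X)]."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """KL[p, q] = E_p[log(p(X) / q(X))]."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.5]
q = [0.9, 0.1]
kl = kl_divergence(p, q)
difference = cross_entropy(p, q) - entropy(p)
```

Both kl and difference evaluate to the same number, and the divergence of a distribution from itself is zero.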

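Args:

  • other: tfp.distributions.Distribution instance.
  • name: Python str prepended to names of ops created by this function.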

Returns:

  • kl_divergence: self.dtype Tensor with shape [B1, ..., Bn] representing n different calculations of the Kullback-Leibler divergence.

latents_to_observations

latents_to_observations(
    latent_means,
    latent_covs
)

Push latent means and covariances forward through the observation model.

Args:

  • latent_means: float Tensor of shape [..., num_timesteps, latent_size]
  • latent_covs: float Tensor of shape [..., num_timesteps, latent_size, latent_size].

Returns:

  • observation_means: float Tensor of shape [..., num_timesteps, observation_size]
  • observation_covs: float Tensor of shape [..., num_timesteps, observation_size, observation_size]

log_cdf

log_cdf(
    value,
    name='log_cdf',
    **kwargs
)

Log cumulative distribution function.

Given random variable X, the log cumulative distribution function log_cdf is:

log_cdf(x) := Log[ P[X <= x] ]

Often, a numerical approximation can be used for log_cdf(x) that yields a more accurate answer than simply taking the logarithm of the cdf when x << -1.
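
As an illustration of why a dedicated log_cdf can beat log(cdf(x)), consider a standard normal far in the left tail. The sketch below compares the naive computation with the leading-order asymptotic expansion of the normal log-cdf; it is an example of the general idea, not a description of what the library does, and both function names are hypothetical.

```python
import math


def normal_log_cdf_naive(x):
    """log(cdf(x)) for a standard normal; fails once cdf(x) underflows to 0."""
    cdf = 0.5 * math.erfc(-x / math.sqrt(2.0))
    return math.log(cdf)  # raises ValueError when cdf == 0.0


def normal_log_cdf_tail(x):
    """Leading-order asymptotic log Phi(x) ~ -x^2/2 - log(-x) - log(2*pi)/2, for x << -1."""
    return -0.5 * x * x - math.log(-x) - 0.5 * math.log(2.0 * math.pi)


try:
    normal_log_cdf_naive(-50.0)
    naive_failed = False
except ValueError:
    naive_failed = True  # cdf(-50) underflows to 0.0 in double precision

tail_value = normal_log_cdf_tail(-50.0)
```

At x = -10 both approaches still work and agree to within about 0.01; at x = -50 only the tail approximation returns a finite value.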

Args:

  • value: float or double Tensor.
  • name: Python str prepended to names of ops created by this function.
  • **kwargs: Named arguments forwarded to subclass implementation.

Returns:

  • logcdf: a Tensor of shape sample_shape(x) + self.batch_shape with values of type self.dtype.

log_prob

log_prob(
    value,
    name='log_prob',
    **kwargs
)

Log probability density/mass function.

Additional documentation from LinearGaussianStateSpaceModel:

kwargs:
  • mask: optional bool-type Tensor with rightmost dimension [num_timesteps]; True values specify that the value of x at that timestep is masked, i.e., not conditioned on. Additional dimensions must match or be broadcastable to self.batch_shape; any further dimensions must match or be broadcastable to the sample shape of x. Default value: None.

Args:

  • value: float or double Tensor.
  • name: Python str prepended to names of ops created by this function.
  • **kwargs: Named arguments forwarded to subclass implementation.

Returns:

  • log_prob: a Tensor of shape sample_shape(x) + self.batch_shape with values of type self.dtype.

log_survival_function

log_survival_function(
    value,
    name='log_survival_function',
    **kwargs
)

Log survival function.

Given random variable X, the survival function is defined:

log_survival_function(x) = Log[ P[X > x] ]
                         = Log[ 1 - P[X <= x] ]
                         = Log[ 1 - cdf(x) ]

Typically, different numerical approximations can be used for the log survival function, which are more accurate than log(1 - cdf(x)) when x >> 1.

Args:

  • value: float or double Tensor.
  • name: Python str prepended to names of ops created by this function.
  • **kwargs: Named arguments forwarded to subclass implementation.

Returns:

Tensor of shape sample_shape(x) + self.batch_shape with values of type self.dtype.

mean

mean(
    name='mean',
    **kwargs
)

Mean.

mode

mode(
    name='mode',
    **kwargs
)

Mode.

param_shapes

param_shapes(
    cls,
    sample_shape,
    name='DistributionParamShapes'
)

Shapes of parameters given the desired shape of a call to sample().

This is a class method that describes what key/value arguments are required to instantiate the given Distribution so that a particular shape is returned for that instance's call to sample().

Subclasses should override class method _param_shapes.

Args:

  • sample_shape: Tensor or python list/tuple. Desired shape of a call to sample().
  • name: name to prepend ops with.

Returns:

dict of parameter name to Tensor shapes.

param_static_shapes

param_static_shapes(
    cls,
    sample_shape
)

param_shapes with static (i.e. TensorShape) shapes.

This is a class method that describes what key/value arguments are required to instantiate the given Distribution so that a particular shape is returned for that instance's call to sample(). Assumes that the sample's shape is known statically.

Subclasses should override class method _param_shapes to return constant-valued tensors when constant values are fed.

Args:

  • sample_shape: TensorShape or python list/tuple. Desired shape of a call to sample().

Returns:

dict of parameter name to TensorShape.

Raises:

  • ValueError: if sample_shape is a TensorShape and is not fully defined.

posterior_marginals

posterior_marginals(
    x,
    mask=None
)

Run a Kalman smoother to return posterior mean and cov.

Note that the returned smoothed_means depend on the observed time series x, while the smoothed_covs are independent of the observed series; i.e., they depend only on the model itself. This means that the mean values have shape concat([sample_shape(x), batch_shape, [num_timesteps, {latent/observation}_size]]), while the covariances have shape concat([batch_shape, [num_timesteps, {latent/observation}_size, {latent/observation}_size]]), which does not depend on the sample shape.

This function returns only the smoothed values. To also obtain the intermediate values produced by the filtering pass, call forward_filter and backward_smoothing_pass separately:

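(log_likelihoods,
 filtered_means, filtered_covs,
 predicted_means, predicted_covs,
 observation_means, observation_covs) = model.forward_filter(x)
# The smoother consumes the filter outputs, not the raw observations.
smoothed_means, smoothed_covs = model.backward_smoothing_pass(
    filtered_means, filtered_covs, predicted_means, predicted_covs)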

where x is an observation sequence.

Args:

  • x: a float-type Tensor with rightmost dimensions [num_timesteps, observation_size] matching self.event_shape. Additional dimensions must match or be broadcastable to self.batch_shape; any further dimensions are interpreted as a sample shape.
  • mask: optional bool-type Tensor with rightmost dimension [num_timesteps]; True values specify that the value of x at that timestep is masked, i.e., not conditioned on. Additional dimensions must match or be broadcastable to self.batch_shape; any further dimensions must match or be broadcastable to the sample shape of x. Default value: None.

Returns:

  • smoothed_means: Means of the per-timestep smoothed distributions over latent states, p(z_t | x_{:T}), as a Tensor of shape sample_shape(x) + batch_shape + [num_timesteps, latent_size].
  • smoothed_covs: Covariances of the per-timestep smoothed distributions over latent states, p(z_t | x_{:T}), as a Tensor of shape sample_shape(mask) + batch_shape + [num_timesteps, latent_size, latent_size]. Note that the covariances depend only on the model and the mask, not on the data, so this may have fewer dimensions than smoothed_means.

prob

prob(
    value,
    name='prob',
    **kwargs
)

Probability density/mass function.

Additional documentation from LinearGaussianStateSpaceModel:

kwargs:
  • mask: optional bool-type Tensor with rightmost dimension [num_timesteps]; True values specify that the value of x at that timestep is masked, i.e., not conditioned on. Additional dimensions must match or be broadcastable to self.batch_shape; any further dimensions must match or be broadcastable to the sample shape of x. Default value: None.

Args:

  • value: float or double Tensor.
  • name: Python str prepended to names of ops created by this function.
  • **kwargs: Named arguments forwarded to subclass implementation.

Returns:

  • prob: a Tensor of shape sample_shape(x) + self.batch_shape with values of type self.dtype.

quantile

quantile(
    value,
    name='quantile',
    **kwargs
)

Quantile function. Aka "inverse cdf" or "percent point function".

Given random variable X and p in [0, 1], the quantile is:

quantile(p) := x such that P[X <= x] == p

Args:

  • value: float or double Tensor.
  • name: Python str prepended to names of ops created by this function.
  • **kwargs: Named arguments forwarded to subclass implementation.

Returns:

  • quantile: a Tensor of shape sample_shape(x) + self.batch_shape with values of type self.dtype.

sample

sample(
    sample_shape=(),
    seed=None,
    name='sample',
    **kwargs
)

Generate samples of the specified shape.

Note that a call to sample() without arguments will generate a single sample.

Args:

  • sample_shape: 0D or 1D int32 Tensor. Shape of the generated samples.
  • seed: Python integer seed for RNG
  • name: name to give to the op.
  • **kwargs: Named arguments forwarded to subclass implementation.

Returns:

  • samples: a Tensor with prepended dimensions sample_shape.

stddev

stddev(
    name='stddev',
    **kwargs
)

Standard deviation.

Standard deviation is defined as,

stddev = E[(X - E[X])**2]**0.5

where X is the random variable associated with this distribution, E denotes expectation, and stddev.shape = batch_shape + event_shape.

Args:

  • name: Python str prepended to names of ops created by this function.
  • **kwargs: Named arguments forwarded to subclass implementation.

Returns:

  • stddev: Floating-point Tensor with shape identical to batch_shape + event_shape, i.e., the same shape as self.mean().

survival_function

survival_function(
    value,
    name='survival_function',
    **kwargs
)

Survival function.

Given random variable X, the survival function is defined:

survival_function(x) = P[X > x]
                     = 1 - P[X <= x]
                     = 1 - cdf(x).

Args:

  • value: float or double Tensor.
  • name: Python str prepended to names of ops created by this function.
  • **kwargs: Named arguments forwarded to subclass implementation.

Returns:

Tensor of shape sample_shape(x) + self.batch_shape with values of type self.dtype.

variance

variance(
    name='variance',
    **kwargs
)

Variance.

Variance is defined as,

Var = E[(X - E[X])**2]

where X is the random variable associated with this distribution, E denotes expectation, and Var.shape = batch_shape + event_shape.

Args:

  • name: Python str prepended to names of ops created by this function.
  • **kwargs: Named arguments forwarded to subclass implementation.

Returns:

  • variance: Floating-point Tensor with shape identical to batch_shape + event_shape, i.e., the same shape as self.mean().