tfp.sts.impute_missing_values

Runs posterior inference to impute the missing values in a time series.

tfp.sts.impute_missing_values(
    model,
    observed_time_series,
    parameter_samples,
    include_observation_noise=False
)

This method computes the posterior marginals p(latent state | observations), given the time series at observed timesteps (specify the missingness mask using tfp.sts.MaskedTimeSeries). It then pushes this posterior through the observation model to obtain a predictive distribution over the time series. At unobserved steps, this distribution gives the imputed values; at observed steps, it can be interpreted as the model's estimate of the underlying noise-free series.
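To make the idea of posterior marginals with missing observations concrete, here is a minimal NumPy sketch of a Kalman filter and smoother for a scalar local level model. It is not the library's implementation: the function name smooth_local_level, the fixed variances, and the prior are all assumptions chosen for illustration. Missing steps simply skip the measurement update, so the state there is inferred from neighboring observations.

```python
import numpy as np

def smooth_local_level(y, is_missing, level_var, noise_var,
                       prior_mean=0.0, prior_var=100.0):
    """Hypothetical helper: smoothed marginals of a random-walk level."""
    n = len(y)
    pred_mean, pred_var = np.zeros(n), np.zeros(n)
    filt_mean, filt_var = np.zeros(n), np.zeros(n)
    m, p = prior_mean, prior_var
    for t in range(n):
        if t > 0:
            # Random-walk transition: mean carries over, variance grows.
            m, p = filt_mean[t - 1], filt_var[t - 1] + level_var
        pred_mean[t], pred_var[t] = m, p
        if is_missing[t]:
            # No observation: the filtered state equals the prediction.
            filt_mean[t], filt_var[t] = m, p
        else:
            gain = p / (p + noise_var)
            filt_mean[t] = m + gain * (y[t] - m)
            filt_var[t] = (1.0 - gain) * p
    # Backward (Rauch-Tung-Striebel) pass gives p(state_t | all observations).
    sm_mean, sm_var = filt_mean.copy(), filt_var.copy()
    for t in range(n - 2, -1, -1):
        c = filt_var[t] / pred_var[t + 1]
        sm_mean[t] = filt_mean[t] + c * (sm_mean[t + 1] - pred_mean[t + 1])
        sm_var[t] = filt_var[t] + c ** 2 * (sm_var[t + 1] - pred_var[t + 1])
    return sm_mean, sm_var

y = np.array([-1., 1., np.nan, 2.4, np.nan, 5.])
is_missing = np.isnan(y)
# Assumed (not fitted) parameter values, for illustration only.
smoothed_mean, smoothed_var = smooth_local_level(
    y, is_missing, level_var=1.0, noise_var=0.1)
print(smoothed_mean)
print(smoothed_var)
```

In this sketch the observation model is the identity, so the smoothed state is also the noise-free estimate of the series. The variances at the two missing steps come out larger than at their observed neighbors, which is the behavior one would expect from an imputed value.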

Args:

  • model: tfp.sts.Sum instance defining an additive STS model.
  • observed_time_series: float Tensor of shape concat([sample_shape, model.batch_shape, [num_timesteps, 1]]) where sample_shape corresponds to i.i.d. observations, and the trailing [1] dimension may (optionally) be omitted if num_timesteps > 1. May optionally be an instance of tfp.sts.MaskedTimeSeries including a mask Tensor to encode the locations of missing observations.
  • parameter_samples: Python list of Tensors representing posterior samples of model parameters, with shapes [concat([ [num_posterior_draws], param.prior.batch_shape, param.prior.event_shape]) for param in model.parameters]. This may optionally also be a map (Python dict) of parameter names to Tensor values.
  • include_observation_noise: If False, the imputed uncertainties represent the model's estimate of the noise-free time series at each timestep. If True, they represent the model's estimate of the range of values that could be observed at each timestep, including any i.i.d. observation noise. Default value: False.
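The effect of include_observation_noise can be sketched for a single posterior draw, assuming the observation noise is additive, Gaussian, and i.i.d. Under that assumption the mean is unchanged and the noise variance is added to the noise-free variance. The helper predictive_moments and all numeric values below are hypothetical placeholders, not library output.

```python
import numpy as np

# Hypothetical noise-free marginal moments at six timesteps.
latent_mean = np.array([-0.9, 0.9, 1.6, 2.3, 3.6, 4.8])
latent_var = np.array([0.09, 0.08, 0.50, 0.08, 0.50, 0.09])
observation_noise_scale = 0.3  # assumed value for illustration

def predictive_moments(mean, var, noise_scale,
                       include_observation_noise=False):
    """Hypothetical helper: mean and stddev, with or without noise."""
    if include_observation_noise:
        var = var + noise_scale ** 2
    return mean, np.sqrt(var)

clean_mean, clean_std = predictive_moments(
    latent_mean, latent_var, observation_noise_scale)
noisy_mean, noisy_std = predictive_moments(
    latent_mean, latent_var, observation_noise_scale,
    include_observation_noise=True)
print(clean_std)
print(noisy_std)
```

The wider intervals are appropriate when asking what value could have been measured at a step; the narrower ones describe the underlying series itself.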

Returns:

  • imputed_series_dist: a tfd.MixtureSameFamily instance with event shape [num_timesteps] and batch shape concat([sample_shape, model.batch_shape]), with num_posterior_draws mixture components.
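Because the result is a mixture over posterior draws, its mean and standard deviation combine per-draw moments. The sketch below assumes equal mixture weights and Gaussian components, and uses randomly generated stand-ins for the per-draw means and standard deviations. It applies the law of total variance: mixture variance = average component variance + variance of the component means.

```python
import numpy as np

rng = np.random.default_rng(0)
num_posterior_draws, num_timesteps = 50, 6

# Random stand-ins for per-draw marginal moments (not real model output).
component_means = rng.normal(size=(num_posterior_draws, num_timesteps))
component_stddevs = rng.uniform(
    0.1, 1.0, size=(num_posterior_draws, num_timesteps))

# Equal-weight mixture: average over the posterior-draw axis.
mixture_mean = component_means.mean(axis=0)
mixture_var = ((component_stddevs ** 2).mean(axis=0)
               + component_means.var(axis=0))
mixture_stddev = np.sqrt(mixture_var)

print(mixture_mean.shape, mixture_stddev.shape)
```

The second term means the mixture's uncertainty also reflects disagreement between parameter draws, not only the uncertainty within each draw.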

Example

To specify a time series with missing values, use tfp.sts.MaskedTimeSeries:

import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

time_series_with_nans = [-1., 1., np.nan, 2.4, np.nan, 5]
observed_time_series = tfp.sts.MaskedTimeSeries(
  time_series=time_series_with_nans,
  is_missing=tf.math.is_nan(time_series_with_nans))

Masked time series can be passed to tfp.sts methods in place of an observed_time_series Tensor:

# Build model using observed time series to set heuristic priors.
linear_trend_model = tfp.sts.LocalLinearTrend(
  observed_time_series=observed_time_series)
model = tfp.sts.Sum([linear_trend_model],
                    observed_time_series=observed_time_series)

# Fit model to data
parameter_samples, _ = tfp.sts.fit_with_hmc(model, observed_time_series)

After fitting a model, impute_missing_values returns a distribution over the imputed time series:

# Impute missing values
imputed_series_distribution = tfp.sts.impute_missing_values(
  model, observed_time_series, parameter_samples)
print('imputed means and stddevs: ',
      imputed_series_distribution.mean(),
      imputed_series_distribution.stddev())
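A common follow-up step is to fill only the gaps with the imputed means while keeping the observed values untouched. The sketch below uses NumPy and a hypothetical imputed_mean array standing in for the evaluated result of imputed_series_distribution.mean(); the numbers are placeholders, not actual model output.

```python
import numpy as np

time_series_with_nans = np.array([-1., 1., np.nan, 2.4, np.nan, 5.])
is_missing = np.isnan(time_series_with_nans)

# Hypothetical placeholder for the imputed means at every timestep.
imputed_mean = np.array([-0.9, 0.9, 1.7, 2.3, 3.7, 4.9])

# Keep observed values; use the imputed mean only where data is missing.
filled_series = np.where(is_missing, imputed_mean, time_series_with_nans)
print(filled_series)
```

Whether to keep the original observations or replace every step with the model's noise-free estimate depends on the application; this sketch takes the first option.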