Named tuple encoding a time series Tensor and optional missingness mask.
tfp.sts.MaskedTimeSeries(
    time_series, is_missing
)
Structural time series models handle missing values naturally, following the rules of conditional probability. Posterior inference can be used to impute missing values, with uncertainties. Forecasting and posterior decomposition are also supported for time series with missing values; the missing values will generally lead to correspondingly higher forecast uncertainty.
All methods in the tfp.sts API that accept an observed_time_series Tensor should optionally also accept a MaskedTimeSeries instance.
The time series should be a float Tensor of shape [..., num_timesteps] or [..., num_timesteps, 1]. The is_missing mask must be either a boolean Tensor of shape [..., num_timesteps], or None. True values in is_missing denote missing (masked) observations; False denotes observed (unmasked) values. Note that these semantics are the opposite of those of low-level TensorFlow methods like tf.boolean_mask, but are consistent with the behavior of NumPy masked arrays.
The batch dimensions of is_missing must broadcast with the batch dimensions of time_series.
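The mask convention can be illustrated with NumPy alone. The following is a minimal sketch, not part of the tfp.sts API, and the variable names are illustrative. It derives a mask from NaN entries, passes that mask unchanged to a NumPy masked array, and inverts it to obtain the "keep" mask that a selection-style method such as tf.boolean_mask would expect.

```python
import numpy as np

# Illustrative data: NaN marks a missing observation.
series = np.array([-1.0, 1.0, np.nan, 2.4, np.nan, 5.0])

# True means "missing", matching the MaskedTimeSeries convention.
is_missing = np.isnan(series)

# NumPy masked arrays use the same convention: True entries are masked out.
masked = np.ma.masked_array(series, mask=is_missing)
num_observed = masked.count()  # number of unmasked entries

# Selection-style masks (as in tf.boolean_mask) mean the opposite: True = keep.
keep = ~is_missing
observed_values = series[keep]
```

The only step needed to move between the two conventions is the logical negation on the keep line.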
A MaskedTimeSeries is just a collections.namedtuple instance, i.e., a dumb container. Although the convention for the elements is as described here, it's left to downstream methods to validate or convert the elements as required. In particular, most downstream methods will call tf.convert_to_tensor on the components. In order to prevent duplicate Tensor creation, you may (if memory is an issue) wish to ensure that the components are already Tensors, as opposed to NumPy arrays or similar.
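To make the "container only" point concrete, here is a small sketch using the standard library. MaskedSeriesSketch is a hypothetical stand-in with the same two field names, not the real tfp.sts class. It shows that a namedtuple stores whatever it is given without checking shapes or types.

```python
import collections

# Hypothetical stand-in with the same two fields; not the real tfp.sts class.
MaskedSeriesSketch = collections.namedtuple(
    "MaskedSeriesSketch", ["time_series", "is_missing"])

# Mismatched lengths are accepted: the container performs no checks.
item = MaskedSeriesSketch(time_series=[1.0, 2.0, 3.0], is_missing=[True])

# Fields are reachable by name or by position, and unpack like a tuple.
series, mask = item
first_field_is_series = item[0] is item.time_series
```

Any inconsistency between the two fields would therefore only surface later, when a downstream method converts or validates them.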
Examples
To construct a simple MaskedTimeSeries instance:
observed_time_series = tfp.sts.MaskedTimeSeries(
    time_series=tf.random.normal([3, 4, 5]),
    is_missing=[True, False, False, True, False])
Note that the mask we specified will broadcast against the batch dimensions of the time series.
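The broadcasting in this example can be checked with a short NumPy sketch. It assumes the same shapes as above (batch shape [3, 4] and 5 timesteps) and uses zeros in place of random values; it is an illustration of the shape rule, not of how tfp.sts applies the mask internally.

```python
import numpy as np

# Same shapes as the example above: batch shape [3, 4], 5 timesteps.
time_series = np.zeros([3, 4, 5])
is_missing = np.array([True, False, False, True, False])

# The mask has an empty batch shape, so it broadcasts to every batch member.
full_mask = np.broadcast_to(is_missing, time_series.shape)
missing_per_series = full_mask.sum(axis=-1)  # count of missing steps per series
```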
For time series with missing entries specified as NaN 'magic values', you can generate a mask using tf.math.is_nan:
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

time_series_with_nans = [-1., 1., np.nan, 2.4, np.nan, 5.]
observed_time_series = tfp.sts.MaskedTimeSeries(
    time_series=time_series_with_nans,
    is_missing=tf.math.is_nan(time_series_with_nans))

# Build model using observed time series to set heuristic priors.
linear_trend_model = tfp.sts.LocalLinearTrend(
    observed_time_series=observed_time_series)
model = tfp.sts.Sum([linear_trend_model],
                    observed_time_series=observed_time_series)

# Fit model to data
parameter_samples, _ = tfp.sts.fit_with_hmc(model, observed_time_series)

# Forecast, using the fitted parameter samples
forecast_dist = tfp.sts.forecast(
    model, observed_time_series,
    parameter_samples=parameter_samples,
    num_steps_forecast=5)

# Impute missing values, using the fitted parameter samples
observations_dist = tfp.sts.impute_missing_values(
    model, observed_time_series, parameter_samples=parameter_samples)
print('imputed means and stddevs: ',
      observations_dist.mean(),
      observations_dist.stddev())
Attributes | |
---|---|
time_series | A namedtuple alias for field number 0 |
is_missing | A namedtuple alias for field number 1 |