Compute quantiles of `x` along `axis`.
tfp.substrates.numpy.stats.quantiles(
x,
num_quantiles,
axis=None,
interpolation=None,
keepdims=False,
validate_args=False,
name=None
)
The quantiles of a distribution are cut points dividing the range into intervals with equal probabilities.
Given a vector `x` of samples, this function estimates the cut points by returning `num_quantiles + 1` cut points, `(c0, ..., cn)`, such that, roughly speaking, an equal number of sample points lie in the `num_quantiles` intervals `[c0, c1), [c1, c2), ..., [c_{n-1}, cn]`. That is,
- About `1 / n` fraction of the data lies in `[c_{k-1}, c_k)`, `k = 1, ..., n`
- About `k / n` fraction of the data lies below `c_k`. `c0` is the sample minimum and `cn` is the maximum.
The exact number of data points in each interval depends on the size of `x` (e.g. whether the size is divisible by `n`) and the `interpolation` kwarg.
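The dependence on the size of `x` can be seen in a minimal sketch that uses plain NumPy as a stand-in. This is not the library's implementation: `np.quantile` with its default linear interpolation is assumed here to play the role of `interpolation='linear'`, and `np.histogram` is used only to count samples per interval.

```python
import numpy as np

# 11 samples, so the size is not divisible by n = 4.
x = np.arange(11.0)
num_quantiles = 4

# num_quantiles + 1 cut points at probabilities 0, 1/n, ..., 1
# (NumPy's default linear interpolation; a stand-in, not the TFP code path).
cut_points = np.quantile(x, np.linspace(0.0, 1.0, num_quantiles + 1))
# cut points: 0, 2.5, 5, 7.5, 10

# Count samples in [c0, c1), [c1, c2), ..., [c_{n-1}, cn].
# np.histogram treats the last bin as closed on the right.
counts, _ = np.histogram(x, bins=cut_points)
# counts: 3, 2, 3, 3

print(cut_points)
print(counts)
```

The counts are only roughly equal: 11 samples cannot be split evenly across 4 intervals, so one interval holds 2 points and the others hold 3.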
Raises | |
---|---|
`ValueError` | If argument 'interpolation' is not an allowed type. |
`ValueError` | If interpolation type not compatible with `dtype`. |
Examples
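# Assumes the NumPy substrate is imported under the alias `tfp`.
from tensorflow_probability.substrates import numpy as tfp

# Get quartiles of x with various interpolation choices.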
x = [0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]
tfp.stats.quantiles(x, num_quantiles=4, interpolation='nearest')
==> [ 0., 2., 5., 8., 10.]
tfp.stats.quantiles(x, num_quantiles=4, interpolation='linear')
==> [ 0. , 2.5, 5. , 7.5, 10. ]
tfp.stats.quantiles(x, num_quantiles=4, interpolation='lower')
==> [ 0., 2., 5., 7., 10.]
# Get deciles of columns of an R x C data set.
# axis=0 treats rows as samples, so each column gets its own deciles.
data = load_my_columnar_data(...)
tfp.stats.quantiles(data, num_quantiles=10, axis=0)
==> Shape [11, C] Tensor
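The shape of the column-wise result can be checked with a small NumPy-only sketch. It assumes that reducing over the row axis is what the deciles-of-columns example intends. The random `data` array and its `R = 50`, `C = 3` sizes are hypothetical, and `np.quantile` again stands in for the library call.

```python
import numpy as np

# Hypothetical R x C data set (R = 50 rows of samples, C = 3 columns).
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 3))

# Deciles need 10 + 1 cut points per column; reduce over the row axis.
probs = np.linspace(0.0, 1.0, 11)
deciles = np.quantile(data, probs, axis=0)

print(deciles.shape)  # (11, 3): cut points first, columns preserved
# The first and last cut points are each column's minimum and maximum.
print(np.allclose(deciles[0], data.min(axis=0)))
print(np.allclose(deciles[-1], data.max(axis=0)))
```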