Returns an (approximately) normal column with mean 0 and variance 1.
tft.scale_to_gaussian( x: common_types.ConsistentTensorType, elementwise: bool = False, name: Optional[str] = None, output_dtype: Optional[tf.DType] = None ) -> common_types.ConsistentTensorType
We transform the column to values that are approximately distributed according to a standard normal distribution. The transformation is obtained by applying the moments method to estimate the parameters of a Tukey HH distribution and applying the inverse of the estimated function to the column values. The method is partially described in
Georg M. Goerg. "The Lambert Way to Gaussianize Heavy-Tailed Data with the Inverse of Tukey's h Transformation as a Special Case," The Scientific World Journal, vol. 2015, Hindawi Publishing Corporation.
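To make the idea of "applying the inverse of the estimated function" concrete, here is a minimal sketch of the standardized Tukey HH transformation and its inverse in plain Python. It is an illustration under stated assumptions, not the library's implementation: the helper names (`lambert_w`, `tukey_hh`, `inverse_tukey_hh`) are hypothetical, the tail parameters `h_left` and `h_right` are taken as given rather than estimated, and location and scale are omitted.

```python
import math


def lambert_w(z, tol=1e-12, max_iter=100):
    """Principal branch of the Lambert W function for z >= 0 (Newton's method)."""
    if z < 0:
        raise ValueError("this sketch only handles z >= 0")
    w = math.log1p(z)  # starting guess; never below the true root for z >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def tukey_hh(u, h_left, h_right):
    """Map a standard normal value u to a heavy-tailed value.

    Each tail gets its own non-negative parameter: h_left for u < 0,
    h_right otherwise. h = 0 leaves that side unchanged.
    """
    h = h_left if u < 0 else h_right
    return u * math.exp(0.5 * h * u * u)


def inverse_tukey_hh(x, h_left, h_right):
    """Undo tukey_hh: solve x = u * exp(h * u**2 / 2) for u."""
    h = h_left if x < 0 else h_right  # the forward map preserves the sign
    if h == 0.0:
        return x
    return math.copysign(math.sqrt(lambert_w(h * x * x) / h), x)
```

Applying `inverse_tukey_hh` to a heavy-tailed value pulls it toward the center, and the larger the tail parameter on that side, the stronger the compression.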
We use L-moments instead of conventional moments so that long-tailed distributions can be handled. The expressions for the L-moments of the Tukey HH distribution are given in
Todd C. Headrick and Mohan D. Pant. "Characterizing Tukey H and HH-Distributions through L-Moments and the L-Correlation," ISRN Applied Mathematics, vol. 2012, 2012. doi:10.5402/2012/980153
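For readers unfamiliar with L-moments, the following sketch computes the first four sample L-moments from the standard unbiased probability-weighted moment estimators. It is illustrative only: the function name `sample_l_moments` is hypothetical, and how the library matches these quantities to the Tukey HH parameters is not shown here.

```python
def sample_l_moments(values):
    """Return (l1, l2, t3, t4): L-location, L-scale, L-skewness, L-kurtosis."""
    x = sorted(values)
    n = len(x)
    if n < 4:
        raise ValueError("need at least 4 values")
    # Unbiased probability-weighted moments b0..b3; i is the 0-based rank.
    b0 = b1 = b2 = b3 = 0.0
    for i, v in enumerate(x):
        b0 += v
        b1 += v * i / (n - 1)
        b2 += v * i * (i - 1) / ((n - 1) * (n - 2))
        b3 += v * i * (i - 1) * (i - 2) / ((n - 1) * (n - 2) * (n - 3))
    b0, b1, b2, b3 = b0 / n, b1 / n, b2 / n, b3 / n
    # L-moments are fixed linear combinations of the b's.
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    if l2 == 0:
        raise ValueError("all values are equal; L-moment ratios are undefined")
    return l1, l2, l3 / l2, l4 / l2
```

Because L-moments are linear in the ordered values, a few extreme observations influence them far less than they influence conventional skewness and kurtosis, which is what makes them suitable for long-tailed data.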
Note that the transformation to Gaussian is applied only if the column has long tails. If this is not the case, for instance if the values are uniformly distributed, the values are only normalized using the z-score. The same applies when only one of the tails is long: the other tail is only rescaled, not nonlinearly transformed. Also, if the analysis set is empty, the transformation leaves the input values unchanged.
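The z-score fallback and the empty-analysis-set behavior described above can be sketched as follows. This is a simplified illustration, not the library's code: the function name `z_score` is hypothetical, the use of the population variance is an assumption, and so is the handling of a zero-variance column.

```python
import math


def z_score(values, analysis_values):
    """Normalize values using the mean and std of analysis_values."""
    if not analysis_values:
        # Empty analysis set: leave the input unchanged.
        return list(values)
    n = len(analysis_values)
    mean = sum(analysis_values) / n
    # Assumption: population variance (divide by n, not n - 1).
    std = math.sqrt(sum((v - mean) ** 2 for v in analysis_values) / n)
    if std == 0.0:
        # Assumption: with zero variance, only center the values.
        return [v - mean for v in values]
    return [(v - mean) / std for v in values]
```

When the values being transformed are the analysis set itself, the output has mean 0 and variance 1, which is the same normalization the Gaussianized column is meant to satisfy.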
Note that TFLearn generally permits only tf.int64 and tf.float32, so casting this scaler's output may be necessary.