Join the SIG TFX-Addons community and help make TFX even better!

# tft.scale_to_gaussian

Returns an (approximately) normal column with mean to 0 and variance 1.

We transform the column to values that are approximately distributed according to a standard normal distribution. The transformation is obtained by applying the moments method to estimate the parameters of a Tukey HH distribution and applying the inverse of the estimated function to the column values. The method is partially described in

Georg M. Georgm "The Lambert Way to Gaussianize Heavy-Tailed Data with the Inverse of Tukey's h Transformation as a Special Case," The Scientific World Journal, Vol. 2015, Hindawi Publishing Corporation.

We use the L-moments instead of conventional moments to be able to deal with long-tailed distributions. The expressions of the L-moments for the Tukey HH distribution is in

Todd C. Headrick, and Mohan D. Pant. "Characterizing Tukey H and HH-Distributions through L-Moments and the L-Correlation," ISRN Applied Mathematics, vol. 2012, 2012. doi:10.5402/2012/980153

Note that the transformation to Gaussian is applied only if the column has long-tails. If this is not the case, for instance if values are uniformly distributed, the values are only normalized using the z score. This applies also to the cases where only one of the tails is long; the other tail is only rescaled but not non linearly transformed. Also, if the analysis set is empty, the transformation is set to to leave the input vaules unchanged.

`x` A numeric `Tensor` or `SparseTensor`.
`elementwise` If true, scales each element of the tensor independently; otherwise uses the parameters of the whole tensor.
`name` (Optional) A name for this operation.
`output_dtype` (Optional) If not None, casts the output tensor to this type.

A `Tensor` or `SparseTensor` containing the input column transformed to be approximately standard distributed (i.e. a Gaussian with mean 0 and variance 1). If `x` is floating point, the mean will have the same type as `x`. If `x` is integral, the output is cast to tf.float32.

Note that TFLearn generally permits only tf.int64 and tf.float32, so casting this scaler's output may be necessary.

[{ "type": "thumb-down", "id": "missingTheInformationINeed", "label":"Missing the information I need" },{ "type": "thumb-down", "id": "tooComplicatedTooManySteps", "label":"Too complicated / too many steps" },{ "type": "thumb-down", "id": "outOfDate", "label":"Out of date" },{ "type": "thumb-down", "id": "samplesCodeIssue", "label":"Samples / code issue" },{ "type": "thumb-down", "id": "otherDown", "label":"Other" }]
[{ "type": "thumb-up", "id": "easyToUnderstand", "label":"Easy to understand" },{ "type": "thumb-up", "id": "solvedMyProblem", "label":"Solved my problem" },{ "type": "thumb-up", "id": "otherUp", "label":"Other" }]