Drops each element of x with probability rate. Elements that are kept are
scaled up by 1 / (1 - rate); dropped elements are set to 0. The scaling
keeps the expected sum unchanged.
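
The scaling rule can be checked numerically. The following is a minimal NumPy sketch, not the library's implementation; the helper name dropout_sketch and the use of a NumPy random generator are assumptions made for illustration, and rate is assumed to lie in [0, 1).

```python
import numpy as np

def dropout_sketch(x, rate, rng):
    """Illustrative dropout (hypothetical helper, not the library's code)."""
    x = np.asarray(x, dtype=np.float64)
    # Keep each element independently with probability 1 - rate.
    keep_mask = rng.random(x.shape) >= rate
    # Kept elements are scaled by 1 / (1 - rate); dropped elements become 0.
    return np.where(keep_mask, x / (1.0 - rate), 0.0)

rng = np.random.default_rng(0)
x = np.ones(100_000)
y = dropout_sketch(x, rate=0.1, rng=rng)
```

With an all-ones input, the mean of y should stay close to 1 even though roughly 10% of the entries are zero, which is what the 1 / (1 - rate) factor is for.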
By default, each element is kept or dropped independently. If noise_shape
is specified, it must be broadcastable
to the shape of x, and only dimensions with noise_shape[i] == shape(x)[i]
will make independent decisions. For example, if shape(x) = [k, l, m, n]
and noise_shape = [k, 1, 1, n], each batch and channel component will be
kept independently and each row and column will be kept or not kept together.
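
The effect of noise_shape can be illustrated by drawing the keep/drop mask at the smaller shape and broadcasting it over x. This is a rough NumPy sketch under that assumption, not the library's implementation; dropout_with_noise_shape is a hypothetical helper name and the sizes chosen for k, l, m, n are arbitrary.

```python
import numpy as np

def dropout_with_noise_shape(x, rate, noise_shape, rng):
    """Illustrative dropout with a shared mask (hypothetical helper)."""
    x = np.asarray(x, dtype=np.float64)
    # One keep/drop decision per entry of noise_shape.
    keep_mask = rng.random(noise_shape) >= rate
    # Raises ValueError if noise_shape is not broadcastable to x.shape.
    keep_mask = np.broadcast_to(keep_mask, x.shape)
    return np.where(keep_mask, x / (1.0 - rate), 0.0)

rng = np.random.default_rng(0)
k, l, m, n = 4, 3, 3, 5
x = np.ones((k, l, m, n))
y = dropout_with_noise_shape(x, rate=0.5, noise_shape=(k, 1, 1, n), rng=rng)
```

Here every slice y[b, :, :, c] is either entirely zero or entirely scaled, because dimensions of size 1 in noise_shape share a single decision across that axis.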
x: A floating point tensor.
rate: A scalar Tensor with the same type as x. The probability
  that each element is dropped. For example, setting rate=0.1 would drop
  10% of input elements.
noise_shape: A 1-D Tensor of type int32, representing the
  shape for randomly generated keep/drop flags.