Quantizes then dequantizes a tensor.
tf.raw_ops.QuantizeAndDequantizeV2(
input,
input_min,
input_max,
signed_input=True,
num_bits=8,
range_given=False,
round_mode='HALF_TO_EVEN',
narrow_range=False,
axis=-1,
name=None
)
This op simulates the precision loss from the quantized forward pass by:
- Quantizing the tensor to fixed-point numbers, which should match the target quantization method used at inference.
- Dequantizing it back to floating point numbers for the following ops, most likely matmul.
There are different ways to quantize. This version uses only scaling, so 0.0 maps to 0.
From the specified 'num_bits' of the quantized output type, it determines the minimum and maximum representable quantized values.
e.g.
- [-128, 127] for signed, num_bits = 8, or
- [0, 255] for unsigned, num_bits = 8.
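The two ranges above follow directly from the bit width. A minimal sketch, assuming plain two's-complement bounds and ignoring the narrow_range option (the helper name `quantized_range` is hypothetical, not part of the TensorFlow API):

```python
def quantized_range(num_bits=8, signed_input=True):
    """Return (min_quantized, max_quantized) for a given bit width.

    Illustrative helper only; it does not model narrow_range.
    """
    if signed_input:
        half = 1 << (num_bits - 1)
        return -half, half - 1
    return 0, (1 << num_bits) - 1


print(quantized_range(8, True))   # (-128, 127)
print(quantized_range(8, False))  # (0, 255)
```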
If range_given == False, the initial input_min and input_max are determined automatically as the minimum and maximum values in the input tensor; otherwise, the specified values of input_min and input_max are used.
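As a rough illustration of this rule, the sketch below picks the initial range with NumPy. It assumes a single range for the whole tensor (the axis parameter is not modeled), and `initial_range` is a hypothetical helper name:

```python
import numpy as np


def initial_range(values, input_min, input_max, range_given=False):
    """Pick the initial [input_min, input_max] before any scale adjustment."""
    if range_given:
        # Use the caller-supplied range as is.
        return float(input_min), float(input_max)
    # Otherwise derive the range from the data itself.
    values = np.asarray(values, dtype=np.float64)
    return float(values.min()), float(values.max())
```

With range_given=False, the input_min and input_max arguments are ignored in this sketch, mirroring the description above.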
This op determines the maximum scale_factor that would map the initial [input_min, input_max] range to a range that lies within the representable quantized range.
It determines the scale from one of input_min and input_max, then updates the other one to maximize the representable range.
e.g.
- if the output is signed, num_bits = 8, [input_min, input_max] = [-10.0, 5.0]: it would use a scale_factor of -128 / -10.0 = 12.8. In this case, it would update input_max to be 127 / 12.8 = 9.921875.
- if the output is signed, num_bits = 8, [input_min, input_max] = [-10.0, 10.0]: it would use a scale_factor of 127 / 10.0 = 12.7. In this case, it would update input_min to be -128.0 / 12.7 = -10.07874.
- if the output is unsigned, input_min is forced to be 0, and only the specified input_max is used.
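The three cases above can be reproduced with a short sketch. It assumes the rule is "compute a candidate scale from each end of the range and keep the smaller one", which is consistent with the worked examples here but is not taken from the op's source code. The function name `choose_scale` is hypothetical, and degenerate ranges such as an all-zero input are not handled:

```python
import math


def choose_scale(input_min, input_max, num_bits=8, signed_input=True):
    """Return (scale_factor, adjusted_min, adjusted_max). Illustrative only."""
    if signed_input:
        q_min, q_max = -(1 << (num_bits - 1)), (1 << (num_bits - 1)) - 1
    else:
        q_min, q_max = 0, (1 << num_bits) - 1
        input_min = 0.0  # unsigned output: input_min is forced to 0

    # Candidate scale from each end; an end that cannot constrain the
    # scale (zero or wrong sign) is treated as unbounded.
    scale_from_min = q_min / input_min if q_min * input_min > 0 else math.inf
    scale_from_max = q_max / input_max if q_max * input_max > 0 else math.inf

    if scale_from_min < scale_from_max:
        scale = scale_from_min
        input_max = q_max / scale  # widen the max side to use the full range
    else:
        scale = scale_from_max
        input_min = q_min / scale  # widen the min side to use the full range
    return scale, input_min, input_max


print(choose_scale(-10.0, 5.0))
print(choose_scale(-10.0, 10.0))
print(choose_scale(-3.0, 5.0, signed_input=False))
```

Keeping the smaller candidate guarantees that both ends of the original range stay inside the representable quantized range.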
After determining the scale_factor and updating the input range, it applies the following to each value in the 'input' tensor.
output = round(clamp(value, input_min, input_max) * scale_factor) / scale_factor.
The above round function rounds the value based on the given round_mode.
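The per-element formula can be emulated in NumPy as a sanity check. This is a sketch under stated assumptions, not the op's implementation: it covers only the default HALF_TO_EVEN rounding (which is what `np.round` does), takes an already adjusted range and scale_factor as inputs, and uses the hypothetical name `quantize_and_dequantize`:

```python
import numpy as np


def quantize_and_dequantize(values, input_min, input_max, scale_factor):
    """Emulate round(clamp(value, min, max) * scale) / scale elementwise."""
    values = np.asarray(values, dtype=np.float32)
    clamped = np.clip(values, input_min, input_max)
    # np.round rounds halves to the nearest even integer (HALF_TO_EVEN).
    return np.round(clamped * scale_factor) / scale_factor


# Signed 8-bit example from the text: [-10.0, 10.0] gives scale 12.7
# and an adjusted input_min of -128 / 12.7.
scale = 12.7
out = quantize_and_dequantize([0.0, 1.0, 20.0, -20.0], -128 / scale, 10.0, scale)
print(out)
```

In this example 0.0 stays exactly 0.0, 1.0 lands on the nearest grid point 13 / 12.7, and the out-of-range values are clamped to the ends of the adjusted range before rounding.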
Returns | |
---|---|
A Tensor. Has the same type as input. |