tensorflow::ops::QuantizeV2

#include <array_ops.h>

Quantize the 'input' tensor of type float to 'output' tensor of type 'T'.

Summary

[min_range, max_range] are scalar floats that specify the range for the 'input' data. The 'mode' attribute controls exactly which calculations are used to convert the float values to their quantized equivalents.

In 'MIN_COMBINED' mode, each value of the tensor will undergo the following:

``` out[i] = (in[i] - min_range) * range(T) / (max_range - min_range) if T == qint8, out[i] -= (range(T) + 1) / 2.0 `` hererange(T) = numeric_limits::max() - numeric_limits::min()`

MIN_COMBINED Mode Example

Assume the input is type float and has a possible range of [0.0, 6.0] and the output type is quint8 ([0, 255]). The min_range and max_range values should be specified as 0.0 and 6.0. Quantizing from float to quint8 will multiply each value of the input by 255/6 and cast to quint8.

If the output type was qint8 ([-128, 127]), the operation will additionally subtract each value by 128 prior to casting, so that the range of values aligns with the range of qint8.

If the mode is 'MIN_FIRST', then this approach is used:

``` number_of_steps = 1 << (# of bits in T) range_adjust = number_of_steps / (number_of_steps - 1) range = (range_max - range_min) * range_adjust range_scale = number_of_steps / range quantized = round(input * range_scale) - round(range_min * range_scale) + numeric_limits::min() quantized = max(quantized, numeric_limits::min()) quantized = min(quantized, numeric_limits::max()) ```

The biggest difference between this and MIN_COMBINED is that the minimum range is rounded first, before it's subtracted from the rounded value. With MIN_COMBINED, a small bias is introduced where repeated iterations of quantizing and dequantizing will introduce a larger and larger error.

One thing to watch out for is that the operator may choose to adjust the requested minimum and maximum values slightly during the quantization process, so you should always use the output ports as the range for further calculations. For example, if the requested minimum and maximum values are close to equal, they will be separated by a small epsilon value to prevent ill-formed quantized buffers from being created. Otherwise, you can end up with buffers where all the quantized values map to the same float value, which causes problems for operations that have to perform further calculations on them.

Arguments:

  • scope: A Scope object
  • min_range: The minimum scalar value possibly produced for the input.
  • max_range: The maximum scalar value possibly produced for the input.

Returns:

  • Output output: The quantized data produced from the float input.
  • Output output_min: The actual minimum scalar value used for the output.
  • Output output_max: The actual maximum scalar value used for the output.

Constructors and Destructors

QuantizeV2(const ::tensorflow::Scope & scope, ::tensorflow::Input input, ::tensorflow::Input min_range, ::tensorflow::Input max_range, DataType T)
QuantizeV2(const ::tensorflow::Scope & scope, ::tensorflow::Input input, ::tensorflow::Input min_range, ::tensorflow::Input max_range, DataType T, const QuantizeV2::Attrs & attrs)

Public attributes

output
output_max
output_min

Public static functions

Mode(StringPiece x)

Structs

tensorflow::ops::QuantizeV2::Attrs

Optional attribute setters for QuantizeV2.

Public attributes

output

::tensorflow::Output output

output_max

::tensorflow::Output output_max

output_min

::tensorflow::Output output_min

Public functions

QuantizeV2

 QuantizeV2(
  const ::tensorflow::Scope & scope,
  ::tensorflow::Input input,
  ::tensorflow::Input min_range,
  ::tensorflow::Input max_range,
  DataType T
)

QuantizeV2

 QuantizeV2(
  const ::tensorflow::Scope & scope,
  ::tensorflow::Input input,
  ::tensorflow::Input min_range,
  ::tensorflow::Input max_range,
  DataType T,
  const QuantizeV2::Attrs & attrs
)

Public static functions

Mode

Attrs Mode(
  StringPiece x
)