Performs a quantized matrix multiplication of `a` by the matrix `b`, with fused bias add, relu, and requantize.
The inputs must be two-dimensional matrices and a 1D bias vector, and the inner dimension of `a` (after being transposed if `transpose_a` is non-zero) must match the outer dimension of `b` (after being transposed if `transpose_b` is non-zero). The bias values are then broadcast-added to the matrix multiplication result; the bias size must match the inner dimension of `b`. A relu activation is applied to produce a non-negative result, which is finally requantized to produce the uint8 output.
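The float-domain math behind this fused kernel can be illustrated with a plain-Java reference sketch. This is not part of the generated API; it simply spells out matmul, broadcast bias add, relu, and a linear requantization into the frozen output range. The exact quantized arithmetic of the real kernel (including the MIN_FIRST vs. SCALED modes) differs in detail, and all names below are illustrative.

```java
// Reference sketch (assumption, not the actual kernel): computes, in float,
// relu(a * b + bias) and then linearly requantizes into uint8 using a
// pre-computed ("frozen") output range.
class QuantizedMatMulReference {
  static int[][] fusedMatMulBiasReluRequantize(
      float[][] a, float[][] b, float[] bias,
      float minFreezedOutput, float maxFreezedOutput) {
    int m = a.length, k = b.length, n = b[0].length;
    int[][] out = new int[m][n];
    float scale = 255.0f / (maxFreezedOutput - minFreezedOutput);
    for (int i = 0; i < m; i++) {
      for (int j = 0; j < n; j++) {
        float acc = bias[j];                 // bias size matches the inner dimension of `b`
        for (int p = 0; p < k; p++) {
          acc += a[i][p] * b[p][j];          // matrix multiplication
        }
        acc = Math.max(acc, 0.0f);           // relu
        // requantize into [0, 255] using the frozen output range
        int q = Math.round((acc - minFreezedOutput) * scale);
        out[i][j] = Math.min(255, Math.max(0, q));
      }
    }
    return out;
  }
}
```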
Nested Classes
class | QuantizedMatMulWithBiasAndReluAndRequantize.Options | Optional attributes for QuantizedMatMulWithBiasAndReluAndRequantize |
Public Methods
static <W, T, U, V> QuantizedMatMulWithBiasAndReluAndRequantize<W> | create(Scope scope, Operand<T> a, Operand<U> b, Operand<V> bias, Operand<Float> minA, Operand<Float> maxA, Operand<Float> minB, Operand<Float> maxB, Operand<Float> minFreezedOutput, Operand<Float> maxFreezedOutput, Class<W> Toutput, Options... options) | Factory method to create a class wrapping a new QuantizedMatMulWithBiasAndReluAndRequantize operation. |
static QuantizedMatMulWithBiasAndReluAndRequantize.Options | inputQuantMode(String inputQuantMode) |
Output<Float> | maxOut() | The float value that the highest quantized output value represents. |
Output<Float> | minOut() | The float value that the lowest quantized output value represents. |
Output<W> | out() |
static QuantizedMatMulWithBiasAndReluAndRequantize.Options | transposeA(Boolean transposeA) |
static QuantizedMatMulWithBiasAndReluAndRequantize.Options | transposeB(Boolean transposeB) |
Public Methods
public static QuantizedMatMulWithBiasAndReluAndRequantize<W> create(Scope scope, Operand<T> a, Operand<U> b, Operand<V> bias, Operand<Float> minA, Operand<Float> maxA, Operand<Float> minB, Operand<Float> maxB, Operand<Float> minFreezedOutput, Operand<Float> maxFreezedOutput, Class<W> Toutput, Options... options)
Factory method to create a class wrapping a new QuantizedMatMulWithBiasAndReluAndRequantize operation.
Parameters
scope | current scope |
a | A matrix to be multiplied. Must be a two-dimensional tensor of type `quint8`. |
b | A matrix to be multiplied. Must be a two-dimensional tensor of type `qint8`. |
bias | A 1D bias tensor with size matching the inner dimension of `b` (after being transposed if `transpose_b` is non-zero). |
minA | The float value that the lowest quantized `a` value represents. |
maxA | The float value that the highest quantized `a` value represents. |
minB | The float value that the lowest quantized `b` value represents. |
maxB | The float value that the highest quantized `b` value represents. |
minFreezedOutput | The float value that the lowest quantized output value after requantize represents. |
maxFreezedOutput | The float value that the highest quantized output value after requantize represents. |
options | carries optional attributes values |
Returns
- a new instance of QuantizedMatMulWithBiasAndReluAndRequantize
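A hedged usage sketch follows. It assumes a TF 1.x Java setup in which quantized operands `a` (quint8) and `b` (qint8), a bias operand, and the float range scalars have already been built in the current `Scope`; the wrapper method name and the package imports are assumptions based on the usual generated-op layout, not taken from this page. The option factory methods used here are the ones documented below.

```java
import org.tensorflow.Operand;
import org.tensorflow.Output;
import org.tensorflow.op.Scope;
import org.tensorflow.op.core.QuantizedMatMulWithBiasAndReluAndRequantize;
import org.tensorflow.types.UInt8;

// Hypothetical helper: wires up the fused op from pre-built operands.
class FusedQuantizedMatMulExample {
  static <T, U, V> QuantizedMatMulWithBiasAndReluAndRequantize<UInt8> fusedMatMul(
      Scope scope,
      Operand<T> a, Operand<U> b, Operand<V> bias,
      Operand<Float> minA, Operand<Float> maxA,
      Operand<Float> minB, Operand<Float> maxB,
      Operand<Float> minFreezedOutput, Operand<Float> maxFreezedOutput) {
    QuantizedMatMulWithBiasAndReluAndRequantize<UInt8> op =
        QuantizedMatMulWithBiasAndReluAndRequantize.create(
            scope,
            a, b, bias,
            minA, maxA,            // float range represented by the quantized `a` values
            minB, maxB,            // float range represented by the quantized `b` values
            minFreezedOutput,      // pre-computed (frozen) float range of the requantized output
            maxFreezedOutput,
            UInt8.class,           // Toutput: the requantized result is uint8
            QuantizedMatMulWithBiasAndReluAndRequantize.transposeA(false),
            QuantizedMatMulWithBiasAndReluAndRequantize.transposeB(false),
            QuantizedMatMulWithBiasAndReluAndRequantize.inputQuantMode("MIN_FIRST"));

    Output<UInt8> out = op.out();       // the quantized product
    Output<Float> outMin = op.minOut(); // float value that the lowest quantized output represents
    Output<Float> outMax = op.maxOut(); // float value that the highest quantized output represents
    return op;
  }
}
```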
public static QuantizedMatMulWithBiasAndReluAndRequantize.Options inputQuantMode (String inputQuantMode)
Parameters
inputQuantMode | Input data quantization mode. Either MIN_FIRST (default) or SCALED. |
public static QuantizedMatMulWithBiasAndReluAndRequantize.Options transposeA (Boolean transposeA)
Parameters
transposeA | If true, `a` is transposed before multiplication. |
public static QuantizedMatMulWithBiasAndReluAndRequantize.Options transposeB (Boolean transposeB)
Parameters
transposeB | If true, `b` is transposed before multiplication. |