tfmot.quantization.keras.quantize_annotate_model

Annotate a tf.keras model to be quantized.

This function does not actually quantize the model. It merely specifies that the model needs to be quantized. quantize_apply can then be used to quantize the model.

This function is intended to be used in conjunction with the quantize_annotate_layer API. Otherwise, it is simpler to use quantize_model.

Annotate a model while overriding the default behavior for a layer:

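import tensorflow_model_optimization as tfmot
from tensorflow import keras
from tensorflow.keras import layers

quantize_annotate_layer = tfmot.quantization.keras.quantize_annotate_layer
quantize_annotate_model = tfmot.quantization.keras.quantize_annotate_model
quantize_apply = tfmot.quantization.keras.quantize_apply
quantize_scope = tfmot.quantization.keras.quantize_scope

# MyDenseQuantizeConfig is a user-defined QuantizeConfig subclass (not shown).
quantize_config = MyDenseQuantizeConfig()

model = quantize_annotate_model(
  keras.Sequential([
    layers.Dense(10, activation='relu', input_shape=(100,)),
    quantize_annotate_layer(
        layers.Dense(2, activation='sigmoid'),
        quantize_config=quantize_config)
  ]))

# The first Dense layer gets quantized with the default behavior,
# but the second layer uses `MyDenseQuantizeConfig` for quantization.
# quantize_scope makes the custom config class visible while the model is rebuilt.
with quantize_scope({'MyDenseQuantizeConfig': MyDenseQuantizeConfig}):
  quantized_model = quantize_apply(model)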

Note that this function removes the optimizer from the original model.

Args

to_annotate: tf.keras model which needs to be quantized.

Returns

New tf.keras model with each layer in the model wrapped with QuantizeAnnotate. The new model preserves weights from the original model.
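
The two-phase flow described on this page (mark layers first, transform them later) can be pictured with a toy model in plain Python. This is only an illustrative sketch, not the library's implementation: `Layer`, `Annotated`, `annotate_layer`, `annotate_model`, and `apply` are hypothetical stand-ins for a Keras layer, QuantizeAnnotate, quantize_annotate_layer, quantize_annotate_model, and quantize_apply, and the config is represented by a plain string.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Layer:
    """Hypothetical stand-in for a Keras layer: a name plus its weights."""
    name: str
    weights: List[float] = field(default_factory=list)


@dataclass
class Annotated:
    """Hypothetical marker wrapper, playing the role of QuantizeAnnotate."""
    layer: Layer
    quantize_config: Optional[str] = None  # None means "use the default"


def annotate_layer(layer, quantize_config=None):
    # Mark a single layer, optionally with a custom config.
    return Annotated(layer, quantize_config)


def annotate_model(layers):
    # Wrap every layer that is not already annotated; weights are untouched.
    return [
        layer if isinstance(layer, Annotated) else Annotated(layer)
        for layer in layers
    ]


def apply(annotated_layers):
    # Second phase: only here is each marked layer actually transformed.
    return [
        f"quantized({a.layer.name}, config={a.quantize_config or 'default'})"
        for a in annotated_layers
    ]


dense_1 = Layer("dense_1", [0.5, -1.25])
dense_2 = Layer("dense_2", [2.0])

annotated = annotate_model(
    [dense_1, annotate_layer(dense_2, "MyDenseQuantizeConfig")]
)
result = apply(annotated)
print(result)
```

In this sketch the annotation step only records intent: a layer that already carries a custom config keeps it, every other layer gets the default marker, and nothing is transformed until the separate apply step runs.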