Quantisierung des Dynamikbereichs nach dem Training

Auf TensorFlow.org ansehen In Google Colab ausführen Quelle auf GitHub anzeigen Notizbuch herunterladen

Überblick

TensorFlow Lite unterstützt nun die Gewichte auf 8 Bit Genauigkeit als Teil der Modellumwandlung von tensorflow graphdefs bis TensorFlow Lite flachen Pufferformat zu konvertieren. Die dynamische Bereichsquantisierung erreicht eine 4-fache Reduzierung der Modellgröße. Darüber hinaus unterstützt TFLite die On-the-Fly-Quantisierung und Dequantisierung von Aktivierungen, um Folgendes zu ermöglichen:

  1. Verwenden von quantisierten Kerneln für eine schnellere Implementierung, sofern verfügbar.
  2. Mischen von Gleitkomma-Kernels mit quantisierten Kerneln für verschiedene Teile des Graphen.

Die Aktivierungen werden immer in Gleitkomma gespeichert. Für Ops, die quantisierte Kernel unterstützen, werden die Aktivierungen vor der Verarbeitung dynamisch auf eine Genauigkeit von 8 Bit quantisiert und nach der Verarbeitung auf Gleitkomma-Präzision dequantisiert. Abhängig vom zu konvertierenden Modell kann dies eine Beschleunigung gegenüber der reinen Gleitkommaberechnung ergeben.

Im Gegensatz zur Quantisierung bewusst Ausbildung sind die Gewichte quantisiert nach dem Training und die Aktivierungen werden dynamisch zur Inferenz in diesem Verfahren quantisiert. Daher werden die Modellgewichtungen nicht neu trainiert, um quantisierungsinduzierte Fehler zu kompensieren. Es ist wichtig, die Genauigkeit des quantisierten Modells zu überprüfen, um sicherzustellen, dass die Verschlechterung akzeptabel ist.

In diesem Tutorial wird ein MNIST-Modell von Grund auf trainiert, seine Genauigkeit in TensorFlow überprüft und dann das Modell in einen Tensorflow Lite-Flatbuffer mit dynamischer Bereichsquantisierung umgewandelt. Schließlich überprüft es die Genauigkeit des konvertierten Modells und vergleicht es mit dem ursprünglichen Float-Modell.

Erstellen Sie ein MNIST-Modell

Installieren

import logging
logging.getLogger("tensorflow").setLevel(logging.DEBUG)

import tensorflow as tf
from tensorflow import keras
import numpy as np
import pathlib

Trainieren Sie ein TensorFlow-Modell

# Load MNIST dataset
mnist = keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Normalize the input image so that each pixel value is between 0 to 1.
train_images = train_images / 255.0
test_images = test_images / 255.0

# Define the model architecture
model = keras.Sequential([
  keras.layers.InputLayer(input_shape=(28, 28)),
  keras.layers.Reshape(target_shape=(28, 28, 1)),
  keras.layers.Conv2D(filters=12, kernel_size=(3, 3), activation=tf.nn.relu),
  keras.layers.MaxPooling2D(pool_size=(2, 2)),
  keras.layers.Flatten(),
  keras.layers.Dense(10)
])

# Train the digit classification model
model.compile(optimizer='adam',
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(
  train_images,
  train_labels,
  epochs=1,
  validation_data=(test_images, test_labels)
)
2021-08-12 11:16:09.363042: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:09.371096: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:09.371982: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:09.373801: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-08-12 11:16:09.374414: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:09.375415: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:09.376347: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:09.971601: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:09.972501: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:09.973396: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:09.974289: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14648 MB memory:  -> device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0
2021-08-12 11:16:10.859606: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
2021-08-12 11:16:11.609851: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8100
2021-08-12 11:16:12.148395: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
1875/1875 [==============================] - 6s 2ms/step - loss: 0.3089 - accuracy: 0.9132 - val_loss: 0.1487 - val_accuracy: 0.9580
<keras.callbacks.History at 0x7f949c017090>

Da Sie das Modell beispielsweise nur für eine einzelne Epoche trainiert haben, wird es nur mit einer Genauigkeit von ~96 % trainiert.

In ein TensorFlow Lite-Modell umwandeln

Mit Hilfe des Python TFLiteConverter , können Sie nun das trainierte Modell in ein TensorFlow Lite Modell konvertieren.

Laden Sie nun das Modell die Verwendung von TFLiteConverter :

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
2021-08-12 11:16:16.898830: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
INFO:tensorflow:Assets written to: /tmp/tmp6i7azt26/assets
2021-08-12 11:16:17.314524: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:17.314883: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2021-08-12 11:16:17.314984: I tensorflow/core/grappler/clusters/single_machine.cc:357] Starting new session
2021-08-12 11:16:17.315359: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:17.315688: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:17.315957: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:17.316301: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:17.316581: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:17.316831: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14648 MB memory:  -> device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0
2021-08-12 11:16:17.318450: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1137] Optimization results for grappler item: graph_to_optimize
  function_optimizer: function_optimizer did nothing. time = 0.007ms.
  function_optimizer: function_optimizer did nothing. time = 0.002ms.

2021-08-12 11:16:17.351933: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:351] Ignored output_format.
2021-08-12 11:16:17.351977: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:354] Ignored drop_control_dependency.
2021-08-12 11:16:17.355587: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:210] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.

Schreiben Sie es in eine tflite-Datei:

tflite_models_dir = pathlib.Path("/tmp/mnist_tflite_models/")
tflite_models_dir.mkdir(exist_ok=True, parents=True)
tflite_model_file = tflite_models_dir/"mnist_model.tflite"
tflite_model_file.write_bytes(tflite_model)
84500

Um das Modell auf den Export quantisiert, stellen Sie die optimizations Flagge zu optimieren für Größe:

converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()
tflite_model_quant_file = tflite_models_dir/"mnist_model_quant.tflite"
tflite_model_quant_file.write_bytes(tflite_quant_model)
INFO:tensorflow:Assets written to: /tmp/tmp96urda5g/assets
INFO:tensorflow:Assets written to: /tmp/tmp96urda5g/assets
2021-08-12 11:16:17.933090: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:17.933473: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2021-08-12 11:16:17.933569: I tensorflow/core/grappler/clusters/single_machine.cc:357] Starting new session
2021-08-12 11:16:17.933912: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:17.934278: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:17.934568: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:17.934912: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:17.935210: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:17.935467: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14648 MB memory:  -> device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0
2021-08-12 11:16:17.937127: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1137] Optimization results for grappler item: graph_to_optimize
  function_optimizer: function_optimizer did nothing. time = 0.008ms.
  function_optimizer: function_optimizer did nothing. time = 0.002ms.

2021-08-12 11:16:17.971218: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:351] Ignored output_format.
2021-08-12 11:16:17.971263: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:354] Ignored drop_control_dependency.
2021-08-12 11:16:17.991496: I tensorflow/lite/tools/optimize/quantize_weights.cc:225] Skipping quantization of tensor sequential/conv2d/Conv2D because it has fewer than 1024 elements (108).
23904

Beachten Sie, wie die resultierende Datei ist etwa 1/4 der Größe.

ls -lh {tflite_models_dir}
total 180K
-rw-rw-r-- 1 kbuilder kbuilder 83K Aug 12 11:16 mnist_model.tflite
-rw-rw-r-- 1 kbuilder kbuilder 24K Aug 12 11:16 mnist_model_quant.tflite
-rw-rw-r-- 1 kbuilder kbuilder 25K Aug 12 11:15 mnist_model_quant_16x8.tflite
-rw-rw-r-- 1 kbuilder kbuilder 44K Aug 12 11:14 mnist_model_quant_f16.tflite

Führen Sie die TFLite-Modelle aus

Führen Sie das TensorFlow Lite-Modell mit dem Python TensorFlow Lite Interpreter aus.

Laden Sie das Modell in einen Interpreter

interpreter = tf.lite.Interpreter(model_path=str(tflite_model_file))
interpreter.allocate_tensors()
interpreter_quant = tf.lite.Interpreter(model_path=str(tflite_model_quant_file))
interpreter_quant.allocate_tensors()

Testen Sie das Modell auf einem Bild

test_image = np.expand_dims(test_images[0], axis=0).astype(np.float32)

input_index = interpreter.get_input_details()[0]["index"]
output_index = interpreter.get_output_details()[0]["index"]

interpreter.set_tensor(input_index, test_image)
interpreter.invoke()
predictions = interpreter.get_tensor(output_index)
import matplotlib.pylab as plt

plt.imshow(test_images[0])
template = "True:{true}, predicted:{predict}"
_ = plt.title(template.format(true= str(test_labels[0]),
                              predict=str(np.argmax(predictions[0]))))
plt.grid(False)

png

Bewerten Sie die Modelle

# A helper function to evaluate the TF Lite model using "test" dataset.
def evaluate_model(interpreter):
  input_index = interpreter.get_input_details()[0]["index"]
  output_index = interpreter.get_output_details()[0]["index"]

  # Run predictions on every image in the "test" dataset.
  prediction_digits = []
  for test_image in test_images:
    # Pre-processing: add batch dimension and convert to float32 to match with
    # the model's input data format.
    test_image = np.expand_dims(test_image, axis=0).astype(np.float32)
    interpreter.set_tensor(input_index, test_image)

    # Run inference.
    interpreter.invoke()

    # Post-processing: remove batch dimension and find the digit with highest
    # probability.
    output = interpreter.tensor(output_index)
    digit = np.argmax(output()[0])
    prediction_digits.append(digit)

  # Compare prediction results with ground truth labels to calculate accuracy.
  accurate_count = 0
  for index in range(len(prediction_digits)):
    if prediction_digits[index] == test_labels[index]:
      accurate_count += 1
  accuracy = accurate_count * 1.0 / len(prediction_digits)

  return accuracy
print(evaluate_model(interpreter))
0.958

Wiederholen Sie die Auswertung des quantisierten Modells mit dynamischem Bereich, um Folgendes zu erhalten:

print(evaluate_model(interpreter_quant))
0.958

In diesem Beispiel weist das komprimierte Modell keinen Unterschied in der Genauigkeit auf.

Optimierung eines bestehenden Modells

Resnets mit Voraktivierungsschichten (Resnet-v2) werden häufig für Vision-Anwendungen verwendet. Vortrainiert gefrorenen Graph für RESNET-v2-101 auf verfügbar ist Tensorflow Hub .

Sie können den eingefrorenen Graphen in einen TensorFLow Lite-Flatbuffer mit Quantisierung umwandeln, indem Sie:

import tensorflow_hub as hub

resnet_v2_101 = tf.keras.Sequential([
  keras.layers.InputLayer(input_shape=(224, 224, 3)),
  hub.KerasLayer("https://tfhub.dev/google/imagenet/resnet_v2_101/classification/4")
])

converter = tf.lite.TFLiteConverter.from_keras_model(resnet_v2_101)
# Convert to TF Lite without quantization
resnet_tflite_file = tflite_models_dir/"resnet_v2_101.tflite"
resnet_tflite_file.write_bytes(converter.convert())
WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
INFO:tensorflow:Assets written to: /tmp/tmpbckbhpxw/assets
INFO:tensorflow:Assets written to: /tmp/tmpbckbhpxw/assets
2021-08-12 11:16:38.804061: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:38.804486: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2021-08-12 11:16:38.804652: I tensorflow/core/grappler/clusters/single_machine.cc:357] Starting new session
2021-08-12 11:16:38.805086: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:38.805426: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:38.805694: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:38.806093: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:38.806386: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:38.806636: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14648 MB memory:  -> device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0
2021-08-12 11:16:38.929631: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1137] Optimization results for grappler item: graph_to_optimize
  function_optimizer: Graph size after: 3495 nodes (2947), 5719 edges (5171), time = 75.817ms.
  function_optimizer: function_optimizer did nothing. time = 2.571ms.

2021-08-12 11:16:45.142399: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:351] Ignored output_format.
2021-08-12 11:16:45.142451: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:354] Ignored drop_control_dependency.
178509352
# Convert to TF Lite with quantization
converter.optimizations = [tf.lite.Optimize.DEFAULT]
resnet_quantized_tflite_file = tflite_models_dir/"resnet_v2_101_quantized.tflite"
resnet_quantized_tflite_file.write_bytes(converter.convert())
WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
WARNING:tensorflow:Compiled the loaded model, but the compiled metrics have yet to be built. `model.compile_metrics` will be empty until you train or evaluate the model.
INFO:tensorflow:Assets written to: /tmp/tmp5p9jctff/assets
INFO:tensorflow:Assets written to: /tmp/tmp5p9jctff/assets
2021-08-12 11:16:55.845715: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:55.846085: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2021-08-12 11:16:55.846192: I tensorflow/core/grappler/clusters/single_machine.cc:357] Starting new session
2021-08-12 11:16:55.846602: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:55.846932: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:55.847198: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:55.847626: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:55.847908: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-12 11:16:55.848152: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 14648 MB memory:  -> device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0
2021-08-12 11:16:55.973014: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1137] Optimization results for grappler item: graph_to_optimize
  function_optimizer: Graph size after: 3495 nodes (2947), 5719 edges (5171), time = 76.741ms.
  function_optimizer: function_optimizer did nothing. time = 4.109ms.

2021-08-12 11:17:00.507069: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:351] Ignored output_format.
2021-08-12 11:17:00.507116: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:354] Ignored drop_control_dependency.
46256864
ls -lh {tflite_models_dir}/*.tflite
-rw-rw-r-- 1 kbuilder kbuilder  83K Aug 12 11:16 /tmp/mnist_tflite_models/mnist_model.tflite
-rw-rw-r-- 1 kbuilder kbuilder  24K Aug 12 11:16 /tmp/mnist_tflite_models/mnist_model_quant.tflite
-rw-rw-r-- 1 kbuilder kbuilder  25K Aug 12 11:15 /tmp/mnist_tflite_models/mnist_model_quant_16x8.tflite
-rw-rw-r-- 1 kbuilder kbuilder  44K Aug 12 11:14 /tmp/mnist_tflite_models/mnist_model_quant_f16.tflite
-rw-rw-r-- 1 kbuilder kbuilder 171M Aug 12 11:16 /tmp/mnist_tflite_models/resnet_v2_101.tflite
-rw-rw-r-- 1 kbuilder kbuilder  45M Aug 12 11:17 /tmp/mnist_tflite_models/resnet_v2_101_quantized.tflite

Die Modellgröße reduziert sich von 171 MB auf 43 MB. Die Genauigkeit dieses Modells auf IMAGEnet kann mit den vorgesehenen Skripten ausgewertet werden TFLite Messgenauigkeit .

Die optimierte Top-1-Genauigkeit des Modells beträgt 76,8, genau wie beim Gleitkommamodell.