Retraining an image classifier

Introduction

Image classification models have millions of parameters. Training them from scratch requires a lot of labeled training data and a lot of computing power. Transfer learning is a technique that shortcuts much of this by taking a piece of a model that has already been trained on a related task and reusing it in a new model.

This Colab demonstrates how to build a Keras model for classifying five species of flowers by using a pre-trained TF2 SavedModel from TensorFlow Hub for image feature extraction, trained on the much larger and more general ImageNet dataset. Optionally, the feature extractor can be trained ("fine-tuned") alongside the newly added classifier.

Looking for a tool instead?

This is a TensorFlow coding tutorial. If you just want a tool that builds the TensorFlow or TF Lite model for you, take a look at the make_image_classifier command-line tool that gets installed by the PIP package tensorflow-hub[make_image_classifier], or at this TF Lite colab.

Setup

import itertools
import os

import matplotlib.pylab as plt
import numpy as np

import tensorflow as tf
import tensorflow_hub as hub

print("TF version:", tf.__version__)
print("Hub version:", hub.__version__)
print("GPU is", "available" if tf.test.is_gpu_available() else "NOT AVAILABLE")
2021-07-10 11:09:41.908423: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
TF version: 2.5.0
Hub version: 0.12.0
WARNING:tensorflow:From /tmp/ipykernel_26701/2333497024.py:12: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
GPU is available
2021-07-10 11:09:43.415190: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-07-10 11:09:43.416714: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-07-10 11:09:44.071543: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:09:44.072487: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: NVIDIA Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-10 11:09:44.072568: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-07-10 11:09:44.077666: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-07-10 11:09:44.077764: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-07-10 11:09:44.079203: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-07-10 11:09:44.079628: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-07-10 11:09:44.081174: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-07-10 11:09:44.082459: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-07-10 11:09:44.082713: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-07-10 11:09:44.082854: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:09:44.083790: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:09:44.084516: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-10 11:09:44.084569: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-07-10 11:09:44.670922: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-10 11:09:44.670956: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-07-10 11:09:44.670964: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-07-10 11:09:44.671157: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:09:44.671783: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:09:44.672424: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:09:44.673061: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/device:GPU:0 with 14646 MB memory) -> physical GPU (device: 0, name: NVIDIA Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0)

Select the TF2 SavedModel module to use

To get started, use https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/4. The same URL can be used in code to identify the SavedModel and in your browser to show its documentation. (Note that models in TF1 Hub format won't work here.)

You can find more TF2 models that generate image feature vectors on tfhub.dev.

There are multiple possible models to try. All you need to do is select a different one in the cell below and follow along with the notebook.
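The model-selection cell itself is collapsed in this export. The names model_name, model_handle, IMAGE_SIZE, and BATCH_SIZE defined there are used by the cells below; a minimal sketch of such a cell (the values shown are the ones reflected in the output that follows, while the full notebook offers a whole map of handles to choose from) might look like this:

model_name = "efficientnetv2-s"  # pick any image feature-vector model from tfhub.dev
model_handle = "https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet1k_s/feature_vector/1"

pixels = 384                     # input resolution expected by this particular model
IMAGE_SIZE = (pixels, pixels)
BATCH_SIZE = 16

print(f"Selected model: {model_name} : {model_handle}")
print(f"Input size {IMAGE_SIZE}")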

Selected model: efficientnetv2-s : https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet1k_s/feature_vector/1
Input size (384, 384)

Set up the Flowers dataset

Inputs are suitably resized for the selected module. Dataset augmentation (i.e., random distortions of an image each time it is read) improves training, especially when fine-tuning.

data_dir = tf.keras.utils.get_file(
    'flower_photos',
    'https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',
    untar=True)
Downloading data from https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz
228818944/228813984 [==============================] - 7s 0us/step
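The cell that turns the downloaded folder into tf.data pipelines is also collapsed in this export. The names train_ds, val_ds, class_names, train_size, and valid_size are all used below; a minimal sketch of how they could be set up, written against the TF 2.5-era Keras preprocessing API and assuming IMAGE_SIZE and BATCH_SIZE from the model-selection step, is:

def build_dataset(subset):
  # 80/20 train/validation split with one-hot labels, matching the
  # CategoricalCrossentropy loss used further below.
  return tf.keras.preprocessing.image_dataset_from_directory(
      data_dir,
      validation_split=0.20,
      subset=subset,
      label_mode="categorical",
      seed=123,
      image_size=IMAGE_SIZE,
      batch_size=1)

train_ds = build_dataset("training")
class_names = tuple(train_ds.class_names)
train_size = train_ds.cardinality().numpy()   # number of training images
train_ds = train_ds.unbatch().batch(BATCH_SIZE).repeat()

val_ds = build_dataset("validation")
valid_size = val_ds.cardinality().numpy()     # number of validation images
val_ds = val_ds.unbatch().batch(BATCH_SIZE)

# Rescale pixel values to [0, 1]; random flips/rotations could be added
# here as data augmentation, which helps most when fine-tuning.
normalization_layer = tf.keras.layers.experimental.preprocessing.Rescaling(1. / 255)
train_ds = train_ds.map(lambda images, labels: (normalization_layer(images), labels))
val_ds = val_ds.map(lambda images, labels: (normalization_layer(images), labels))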

Found 3670 files belonging to 5 classes.
Using 2936 files for training.
2021-07-10 11:09:55.068048: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:09:55.068753: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: NVIDIA Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-10 11:09:55.068865: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:09:55.069469: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:09:55.070075: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-10 11:09:55.070582: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:09:55.071188: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: NVIDIA Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-10 11:09:55.071258: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:09:55.071913: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:09:55.072499: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-10 11:09:55.072531: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-10 11:09:55.072538: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-07-10 11:09:55.072544: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-07-10 11:09:55.072632: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:09:55.073239: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:09:55.073809: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14646 MB memory) -> physical GPU (device: 0, name: NVIDIA Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0)
Found 3670 files belonging to 5 classes.
Using 734 files for validation.

Defining the model

All it takes is to put a linear classifier on top of the feature_extractor_layer with the Hub module.

For speed, we start out with a non-trainable feature_extractor_layer, but you can also enable fine-tuning for greater accuracy.

do_fine_tuning = False
print("Building model with", model_handle)
model = tf.keras.Sequential([
    # Explicitly define the input shape so the model can be properly
    # loaded by the TFLiteConverter
    tf.keras.layers.InputLayer(input_shape=IMAGE_SIZE + (3,)),
    hub.KerasLayer(model_handle, trainable=do_fine_tuning),
    tf.keras.layers.Dropout(rate=0.2),
    tf.keras.layers.Dense(len(class_names),
                          kernel_regularizer=tf.keras.regularizers.l2(0.0001))
])
model.build((None,)+IMAGE_SIZE+(3,))
model.summary()
Building model with https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet1k_s/feature_vector/1
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
keras_layer (KerasLayer)     (None, 1280)              20331360  
_________________________________________________________________
dropout (Dropout)            (None, 1280)              0         
_________________________________________________________________
dense (Dense)                (None, 5)                 6405      
=================================================================
Total params: 20,337,765
Trainable params: 6,405
Non-trainable params: 20,331,360
_________________________________________________________________

Training the model

model.compile(
  optimizer=tf.keras.optimizers.SGD(lr=0.005, momentum=0.9), 
  loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True, label_smoothing=0.1),
  metrics=['accuracy'])
/tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py:375: UserWarning: The `lr` argument is deprecated, use `learning_rate` instead.
  "The `lr` argument is deprecated, use `learning_rate` instead.")
steps_per_epoch = train_size // BATCH_SIZE
validation_steps = valid_size // BATCH_SIZE
hist = model.fit(
    train_ds,
    epochs=5, steps_per_epoch=steps_per_epoch,
    validation_data=val_ds,
    validation_steps=validation_steps).history
Epoch 1/5
2021-07-10 11:10:12.421017: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-07-10 11:10:12.421548: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 2000194999 Hz
2021-07-10 11:10:21.888082: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-07-10 11:10:23.880143: I tensorflow/stream_executor/cuda/cuda_dnn.cc:359] Loaded cuDNN version 8100
2021-07-10 11:10:29.026270: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-07-10 11:10:29.386777: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
183/183 [==============================] - 33s 87ms/step - loss: 0.8008 - accuracy: 0.8152 - val_loss: 0.6219 - val_accuracy: 0.9111
Epoch 2/5
183/183 [==============================] - 15s 79ms/step - loss: 0.6302 - accuracy: 0.9072 - val_loss: 0.5925 - val_accuracy: 0.9319
Epoch 3/5
183/183 [==============================] - 14s 79ms/step - loss: 0.5983 - accuracy: 0.9260 - val_loss: 0.5788 - val_accuracy: 0.9361
Epoch 4/5
183/183 [==============================] - 14s 79ms/step - loss: 0.5845 - accuracy: 0.9308 - val_loss: 0.5682 - val_accuracy: 0.9431
Epoch 5/5
183/183 [==============================] - 14s 78ms/step - loss: 0.5725 - accuracy: 0.9408 - val_loss: 0.5651 - val_accuracy: 0.9431
plt.figure()
plt.ylabel("Loss (training and validation)")
plt.xlabel("Training Steps")
plt.ylim([0,2])
plt.plot(hist["loss"])
plt.plot(hist["val_loss"])

plt.figure()
plt.ylabel("Accuracy (training and validation)")
plt.xlabel("Training Steps")
plt.ylim([0,1])
plt.plot(hist["accuracy"])
plt.plot(hist["val_accuracy"])
[<matplotlib.lines.Line2D at 0x7f37c0290e10>]

[Figure: training and validation loss vs. training steps]

[Figure: training and validation accuracy vs. training steps]

Try out the model on an image from the validation data:

x, y = next(iter(val_ds))
image = x[0, :, :, :]
true_index = np.argmax(y[0])
plt.imshow(image)
plt.axis('off')
plt.show()

# Expand the validation image to a batch of shape (1, height, width, 3) before predicting the label
prediction_scores = model.predict(np.expand_dims(image, axis=0))
predicted_index = np.argmax(prediction_scores)
print("True label: " + class_names[true_index])
print("Predicted label: " + class_names[predicted_index])

[Image: the validation example, a sunflower]

True label: sunflowers
Predicted label: sunflowers

Finally, the trained model can be saved for deployment to TF Serving or TF Lite (on mobile) as follows.

saved_model_path = f"/tmp/saved_flowers_model_{model_name}"
tf.saved_model.save(model, saved_model_path)
2021-07-10 11:11:49.663732: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
WARNING:absl:Found untraced functions such as restored_function_body, restored_function_body, restored_function_body, restored_function_body, restored_function_body while saving (showing 5 of 855). These functions will not be directly callable after loading.
WARNING:tensorflow:FOR KERAS USERS: The object that you are saving contains one or more Keras models or layers. If you are loading the SavedModel with `tf.keras.models.load_model`, continue reading (otherwise, you may ignore the following instructions). Please change your code to save with `tf.keras.models.save_model` or `model.save`, and confirm that the file "keras.metadata" exists in the export directory. In the future, Keras will only load the SavedModels that have this file. In other words, `tf.saved_model.save` will no longer write SavedModels that can be recovered as Keras models (this will apply in TF 2.5).

FOR DEVS: If you are overwriting _tracking_metadata in your class, this property has been used to save metadata in the SavedModel. The metadta field will be deprecated soon, so please move the metadata to a different file.
INFO:tensorflow:Assets written to: /tmp/saved_flowers_model_efficientnetv2-s/assets

Optional: Deployment to TensorFlow Lite

TensorFlow Lite lets you deploy TensorFlow models to mobile and IoT devices. The code below shows how to convert the trained model to TF Lite and apply post-training tools from the TensorFlow Model Optimization Toolkit. Finally, it runs it in the TF Lite Interpreter to examine the resulting quality:

  • Conversion without optimization provides the same results as before (up to roundoff error).
  • Conversion with optimization without any data quantizes the model weights to 8 bits, but inference still uses floating-point computation for the neural network activations. This reduces model size almost by a factor of 4 and improves CPU latency on mobile devices.
  • On top of that, computation of the neural network activations can be quantized to 8-bit integers as well if a small reference dataset is provided to calibrate the quantization range. On a mobile device, this accelerates inference further and makes it possible to run on accelerators like Edge TPU.

Optimization settings
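The conversion cell itself is collapsed in this export. A sketch of the conversion that produces lite_model_content (used by the interpreter cell below), assuming the saved_model_path and train_ds defined above and using purely illustrative toggles for the optimization settings, could look like:

optimize_lite_model = False    # set True to enable post-training optimization
num_calibration_examples = 60  # >0 also calibrates activation quantization

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_path)
if optimize_lite_model:
  converter.optimizations = [tf.lite.Optimize.DEFAULT]
  if num_calibration_examples:
    # A representative dataset lets the converter calibrate the value
    # ranges needed to quantize the network activations.
    def representative_dataset():
      for image, _ in train_ds.unbatch().take(num_calibration_examples):
        yield [image[None, ...]]
    converter.representative_dataset = representative_dataset
lite_model_content = converter.convert()

with open(f"/tmp/lite_flowers_model_{model_name}.tflite", "wb") as f:
  f.write(lite_model_content)
print("Wrote %sTFLite model of %d bytes." %
      ("optimized " if optimize_lite_model else "", len(lite_model_content)))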

Wrote TFLite model of 80553236 bytes.
2021-07-10 11:12:18.591401: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:345] Ignored output_format.
2021-07-10 11:12:18.591450: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:348] Ignored drop_control_dependency.
2021-07-10 11:12:18.591456: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:354] Ignored change_concat_input_ranges.
2021-07-10 11:12:18.592361: I tensorflow/cc/saved_model/reader.cc:38] Reading SavedModel from: /tmp/saved_flowers_model_efficientnetv2-s
2021-07-10 11:12:18.733110: I tensorflow/cc/saved_model/reader.cc:90] Reading meta graph with tags { serve }
2021-07-10 11:12:18.733153: I tensorflow/cc/saved_model/reader.cc:132] Reading SavedModel debug info (if present) from: /tmp/saved_flowers_model_efficientnetv2-s
2021-07-10 11:12:18.733245: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-10 11:12:18.733262: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      
2021-07-10 11:12:19.277214: I tensorflow/cc/saved_model/loader.cc:206] Restoring SavedModel bundle.
2021-07-10 11:12:20.495780: I tensorflow/cc/saved_model/loader.cc:190] Running initialization op on SavedModel bundle at path: /tmp/saved_flowers_model_efficientnetv2-s
2021-07-10 11:12:20.927889: I tensorflow/cc/saved_model/loader.cc:277] SavedModel load for tags { serve }; Status: success: OK. Took 2335531 microseconds.
2021-07-10 11:12:22.377509: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:210] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2021-07-10 11:12:24.006170: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:12:24.006616: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: NVIDIA Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-10 11:12:24.006755: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:12:24.007069: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:12:24.007322: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-10 11:12:24.007365: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-10 11:12:24.007372: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-07-10 11:12:24.007391: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-07-10 11:12:24.007491: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:12:24.007783: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-10 11:12:24.008071: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14646 MB memory) -> physical GPU (device: 0, name: NVIDIA Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0)
interpreter = tf.lite.Interpreter(model_content=lite_model_content)
# This little helper wraps the TF Lite interpreter as a numpy-to-numpy function.
def lite_model(images):
  interpreter.allocate_tensors()
  interpreter.set_tensor(interpreter.get_input_details()[0]['index'], images)
  interpreter.invoke()
  return interpreter.get_tensor(interpreter.get_output_details()[0]['index'])
num_eval_examples = 50 
eval_dataset = ((image, label)  # TFLite expects batch size 1.
                for batch in train_ds
                for (image, label) in zip(*batch))
count = 0
count_lite_tf_agree = 0
count_lite_correct = 0
for image, label in eval_dataset:
  probs_lite = lite_model(image[None, ...])[0]
  probs_tf = model(image[None, ...]).numpy()[0]
  y_lite = np.argmax(probs_lite)
  y_tf = np.argmax(probs_tf)
  y_true = np.argmax(label)
  count +=1
  if y_lite == y_tf: count_lite_tf_agree += 1
  if y_lite == y_true: count_lite_correct += 1
  if count >= num_eval_examples: break
print("TF Lite model agrees with original model on %d of %d examples (%g%%)." %
      (count_lite_tf_agree, count, 100.0 * count_lite_tf_agree / count))
print("TF Lite model is accurate on %d of %d examples (%g%%)." %
      (count_lite_correct, count, 100.0 * count_lite_correct / count))
TF Lite model agrees with original model on 50 of 50 examples (100%).
TF Lite model is accurate on 47 of 50 examples (94%).