Unisciti alla community di SIG TFX-Addons e aiutaci a rendere TFX ancora migliore! Iscriviti a SIG TFX-Addons

Addestra e servi un modello TensorFlow con TensorFlow Serving

Questa guida addestra un modello di rete neurale per classificare le immagini di abbigliamento, come scarpe da ginnastica e camicie , salva il modello addestrato e quindi lo serve con TensorFlow Serving . L'attenzione si concentra su TensorFlow Serving, piuttosto che sulla modellazione e l'addestramento in TensorFlow, quindi per un esempio completo che si concentra sulla modellazione e l'addestramento, vedere l' esempio di classificazione di base .

Questa guida utilizza tf.keras , un'API di alto livello per creare e addestrare modelli in TensorFlow.

import sys

# Confirm that we're using Python 3
assert sys.version_info.major is 3, 'Oops, not running Python 3. Use Runtime > Change runtime type'
# TensorFlow and tf.keras
print("Installing dependencies for Colab environment")
!pip install -Uq grpcio==1.26.0

import tensorflow as tf
from tensorflow import keras

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
import os
import subprocess

print('TensorFlow version: {}'.format(tf.__version__))

Crea il tuo modello

Importa il set di dati Fashion MNIST

Questa guida utilizza il set di dati Fashion MNIST che contiene 70.000 immagini in scala di grigi in 10 categorie. Le immagini mostrano singoli capi di abbigliamento a bassa risoluzione (28 x 28 pixel), come mostrato qui:

Sprite MNIST di moda
Figura 1. Campioni Fashion-MNIST (di Zalando, MIT License).

Fashion MNIST è inteso come un sostituto immediato del classico set di dati MNIST , spesso utilizzato come "Hello, World" dei programmi di machine learning per la visione artificiale . Puoi accedere a Fashion MNIST direttamente da TensorFlow, basta importare e caricare i dati.

fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# scale the values to 0.0 to 1.0
train_images = train_images / 255.0
test_images = test_images / 255.0

# reshape for feeding into the model
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1)
test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

print('\ntrain_images.shape: {}, of {}'.format(train_images.shape, train_images.dtype))
print('test_images.shape: {}, of {}'.format(test_images.shape, test_images.dtype))
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
32768/29515 [=================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
26427392/26421880 [==============================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
8192/5148 [===============================================] - 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
4423680/4422102 [==============================] - 0s 0us/step

train_images.shape: (60000, 28, 28, 1), of float64
test_images.shape: (10000, 28, 28, 1), of float64

Addestra e valuta il tuo modello

Usiamo la CNN più semplice possibile, poiché non siamo concentrati sulla parte di modellazione.

model = keras.Sequential([
  keras.layers.Conv2D(input_shape=(28,28,1), filters=8, kernel_size=3, 
                      strides=2, activation='relu', name='Conv1'),
  keras.layers.Flatten(),
  keras.layers.Dense(10, name='Dense')
])
model.summary()

testing = False
epochs = 5

model.compile(optimizer='adam', 
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=[keras.metrics.SparseCategoricalAccuracy()])
model.fit(train_images, train_labels, epochs=epochs)

test_loss, test_acc = model.evaluate(test_images, test_labels)
print('\nTest accuracy: {}'.format(test_acc))
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
Conv1 (Conv2D)               (None, 13, 13, 8)         80        
_________________________________________________________________
flatten (Flatten)            (None, 1352)              0         
_________________________________________________________________
Dense (Dense)                (None, 10)                13530     
=================================================================
Total params: 13,610
Trainable params: 13,610
Non-trainable params: 0
_________________________________________________________________
Epoch 1/5
1875/1875 [==============================] - 12s 2ms/step - loss: 0.5205 - sparse_categorical_accuracy: 0.8206
Epoch 2/5
1875/1875 [==============================] - 3s 2ms/step - loss: 0.3819 - sparse_categorical_accuracy: 0.8672
Epoch 3/5
1875/1875 [==============================] - 3s 2ms/step - loss: 0.3472 - sparse_categorical_accuracy: 0.8784
Epoch 4/5
1875/1875 [==============================] - 3s 2ms/step - loss: 0.3266 - sparse_categorical_accuracy: 0.8847
Epoch 5/5
1875/1875 [==============================] - 3s 2ms/step - loss: 0.3129 - sparse_categorical_accuracy: 0.8882
313/313 [==============================] - 1s 1ms/step - loss: 0.3535 - sparse_categorical_accuracy: 0.8735

Test accuracy: 0.8734999895095825

Salva il tuo modello

Per caricare il nostro modello addestrato in TensorFlow Serving, dobbiamo prima salvarlo nel formato SavedModel . Questo creerà un file protobuf in una gerarchia di directory ben definita e includerà un numero di versione. TensorFlow Serving ci consente di selezionare quale versione di un modello, o "servable", vogliamo utilizzare quando facciamo richieste di inferenza. Ogni versione verrà esportata in una sottodirectory diversa nel percorso specificato.

# Fetch the Keras session and save the model
# The signature definition is defined by the input and output tensors,
# and stored with the default serving key
import tempfile

MODEL_DIR = tempfile.gettempdir()
version = 1
export_path = os.path.join(MODEL_DIR, str(version))
print('export_path = {}\n'.format(export_path))

tf.keras.models.save_model(
    model,
    export_path,
    overwrite=True,
    include_optimizer=True,
    save_format=None,
    signatures=None,
    options=None
)

print('\nSaved model:')
!ls -l {export_path}
export_path = /tmp/1
WARNING:absl:Function `_wrapped_model` contains input name(s) Conv1_input with unsupported characters which will be renamed to conv1_input in the SavedModel.
INFO:tensorflow:Assets written to: /tmp/1/assets
INFO:tensorflow:Assets written to: /tmp/1/assets
Saved model:
total 96
drwxr-xr-x 2 kbuilder kbuilder  4096 May 25 09:12 assets
-rw-rw-r-- 1 kbuilder kbuilder  7981 May 25 09:12 keras_metadata.pb
-rw-rw-r-- 1 kbuilder kbuilder 80661 May 25 09:12 saved_model.pb
drwxr-xr-x 2 kbuilder kbuilder  4096 May 25 09:12 variables

Esamina il modello salvato

Useremo l'utilità della riga di comando saved_model_cli per esaminare MetaGraphDefs (i modelli) e SignatureDefs (i metodi che puoi chiamare) nel nostro SavedModel. Vedi questa discussione sulla CLI di SavedModel nella Guida di TensorFlow.

saved_model_cli show --dir {export_path} --all
2021-05-25 09:12:04.142378: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['__saved_model_init_op']:
  The given SavedModel SignatureDef contains the following input(s):
  The given SavedModel SignatureDef contains the following output(s):
    outputs['__saved_model_init_op'] tensor_info:
        dtype: DT_INVALID
        shape: unknown_rank
        name: NoOp
  Method name is: 

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['Conv1_input'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 28, 28, 1)
        name: serving_default_Conv1_input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['Dense'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 10)
        name: StatefulPartitionedCall:0
  Method name is: tensorflow/serving/predict

Defined Functions:
  Function Name: '__call__'
    Option #1
      Callable with:
        Argument #1
          Conv1_input: TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='Conv1_input')
        Argument #2
          DType: bool
          Value: False
        Argument #3
          DType: NoneType
          Value: None
    Option #2
      Callable with:
        Argument #1
          inputs: TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='inputs')
        Argument #2
          DType: bool
          Value: False
        Argument #3
          DType: NoneType
          Value: None
    Option #3
      Callable with:
        Argument #1
          inputs: TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='inputs')
        Argument #2
          DType: bool
          Value: True
        Argument #3
          DType: NoneType
          Value: None
    Option #4
      Callable with:
        Argument #1
          Conv1_input: TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='Conv1_input')
        Argument #2
          DType: bool
          Value: True
        Argument #3
          DType: NoneType
          Value: None

  Function Name: '_default_save_signature'
    Option #1
      Callable with:
        Argument #1
          Conv1_input: TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='Conv1_input')

  Function Name: 'call_and_return_all_conditional_losses'
    Option #1
      Callable with:
        Argument #1
          inputs: TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='inputs')
        Argument #2
          DType: bool
          Value: True
        Argument #3
          DType: NoneType
          Value: None
    Option #2
      Callable with:
        Argument #1
          inputs: TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='inputs')
        Argument #2
          DType: bool
          Value: False
        Argument #3
          DType: NoneType
          Value: None
    Option #3
      Callable with:
        Argument #1
          Conv1_input: TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='Conv1_input')
        Argument #2
          DType: bool
          Value: True
        Argument #3
          DType: NoneType
          Value: None
    Option #4
      Callable with:
        Argument #1
          Conv1_input: TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='Conv1_input')
        Argument #2
          DType: bool
          Value: False
        Argument #3
          DType: NoneType
          Value: None

Questo ci dice molto sul nostro modello! In questo caso abbiamo appena addestrato il nostro modello, quindi conosciamo già gli input e gli output, ma se non lo facessimo sarebbe un'informazione importante. Non ci dice tutto, come il fatto che si tratti di dati di immagine in scala di grigi, ad esempio, ma è un ottimo inizio.

Servi il tuo modello con TensorFlow Serving

Aggiungi l'URI di distribuzione di TensorFlow Serving come origine del pacchetto:

Ci stiamo preparando a installare TensorFlow Serving utilizzando Aptitude poiché questo Colab viene eseguito in un ambiente Debian. Aggiungeremo il tensorflow-model-server all'elenco di pacchetti di cui Aptitude è a conoscenza. Nota che stiamo eseguendo come root.

import sys
# We need sudo prefix if not on a Google Colab.
if 'google.colab' not in sys.modules:
  SUDO_IF_NEEDED = 'sudo'
else:
  SUDO_IF_NEEDED = ''
# This is the same as you would do from your command line, but without the [arch=amd64], and no sudo
# You would instead do:
# echo "deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | sudo tee /etc/apt/sources.list.d/tensorflow-serving.list && \
# curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | sudo apt-key add -

!echo "deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | {SUDO_IF_NEEDED} tee /etc/apt/sources.list.d/tensorflow-serving.list && \
curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | {SUDO_IF_NEEDED} apt-key add -
!{SUDO_IF_NEEDED} apt update
deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2943  100  2943    0     0   5236      0 --:--:-- --:--:-- --:--:--  5236
OK
Hit:1 http://asia-east1.gce.archive.ubuntu.com/ubuntu bionic InRelease
Hit:2 http://asia-east1.gce.archive.ubuntu.com/ubuntu bionic-updates InRelease
Hit:3 http://asia-east1.gce.archive.ubuntu.com/ubuntu bionic-backports InRelease
Hit:4 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64  InRelease
Hit:5 https://nvidia.github.io/nvidia-container-runtime/ubuntu18.04/amd64  InRelease
Hit:6 https://nvidia.github.io/nvidia-docker/ubuntu18.04/amd64  InRelease
Get:7 http://storage.googleapis.com/tensorflow-serving-apt stable InRelease [3012 B]
Ign:8 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  InRelease
Hit:9 http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Release
Hit:10 http://security.ubuntu.com/ubuntu bionic-security InRelease
Get:11 http://packages.cloud.google.com/apt google-cloud-logging-wheezy InRelease [5483 B]
Get:12 https://packages.cloud.google.com/apt eip-cloud-bionic InRelease [5419 B]
Hit:14 http://archive.canonical.com/ubuntu bionic InRelease
Get:15 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server amd64 Packages [340 B]
Err:11 http://packages.cloud.google.com/apt google-cloud-logging-wheezy InRelease
  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY FEEA9169307EA071 NO_PUBKEY 8B57C5C2836F4BEB
Err:12 https://packages.cloud.google.com/apt eip-cloud-bionic InRelease
  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY FEEA9169307EA071 NO_PUBKEY 8B57C5C2836F4BEB
Get:16 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server-universal amd64 Packages [347 B]
Fetched 14.6 kB in 1s (16.0 kB/s)



106 packages can be upgraded. Run 'apt list --upgradable' to see them.
W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: http://packages.cloud.google.com/apt google-cloud-logging-wheezy InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY FEEA9169307EA071 NO_PUBKEY 8B57C5C2836F4BEB
W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: https://packages.cloud.google.com/apt eip-cloud-bionic InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY FEEA9169307EA071 NO_PUBKEY 8B57C5C2836F4BEB
W: Failed to fetch https://packages.cloud.google.com/apt/dists/eip-cloud-bionic/InRelease  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY FEEA9169307EA071 NO_PUBKEY 8B57C5C2836F4BEB
W: Failed to fetch http://packages.cloud.google.com/apt/dists/google-cloud-logging-wheezy/InRelease  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY FEEA9169307EA071 NO_PUBKEY 8B57C5C2836F4BEB
W: Some index files failed to download. They have been ignored, or old ones used instead.

Installa TensorFlow Serving

Questo è tutto ciò di cui hai bisogno: una riga di comando!

{SUDO_IF_NEEDED} apt-get install tensorflow-model-server
The following NEW packages will be installed:
  tensorflow-model-server
0 upgraded, 1 newly installed, 0 to remove and 106 not upgraded.
Need to get 326 MB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server amd64 tensorflow-model-server all 2.5.1 [326 MB]
Fetched 326 MB in 7s (45.2 MB/s)
Selecting previously unselected package tensorflow-model-server.
(Reading database ... 193390 files and directories currently installed.)
Preparing to unpack .../tensorflow-model-server_2.5.1_all.deb ...
Unpacking tensorflow-model-server (2.5.1) ...
Setting up tensorflow-model-server (2.5.1) ...

Inizia a eseguire TensorFlow Serving

È qui che iniziamo a eseguire TensorFlow Serving e carichiamo il nostro modello. Dopo il caricamento possiamo iniziare a fare richieste di inferenza usando REST. Ci sono alcuni parametri importanti:

  • rest_api_port : la porta che utilizzerai per le richieste REST.
  • model_name : lo userai nell'URL delle richieste REST. Può essere qualsiasi cosa.
  • model_base_path : questo è il percorso della directory in cui hai salvato il tuo modello.
os.environ["MODEL_DIR"] = MODEL_DIR
nohup tensorflow_model_server \
  --rest_api_port=8501 \
  --model_name=fashion_model \
  --model_base_path="${MODEL_DIR}" >server.log 2>&1
tail server.log

Fai una richiesta al tuo modello in TensorFlow Serving

Per prima cosa, diamo un'occhiata a un esempio casuale dai nostri dati di test.

def show(idx, title):
  plt.figure()
  plt.imshow(test_images[idx].reshape(28,28))
  plt.axis('off')
  plt.title('\n\n{}'.format(title), fontdict={'size': 16})

import random
rando = random.randint(0,len(test_images)-1)
show(rando, 'An Example Image: {}'.format(class_names[test_labels[rando]]))

png

Ok, sembra interessante. Quanto è difficile per te riconoscerlo? Ora creiamo l'oggetto JSON per un batch di tre richieste di inferenza e vediamo quanto bene il nostro modello riconosce le cose:

import json
data = json.dumps({"signature_name": "serving_default", "instances": test_images[0:3].tolist()})
print('Data: {} ... {}'.format(data[:50], data[len(data)-52:]))
Data: {"signature_name": "serving_default", "instances": ...  [0.0], [0.0], [0.0], [0.0], [0.0], [0.0], [0.0]]]]}

Effettua richieste REST

Versione più recente del servable

Invieremo una richiesta di previsione come POST all'endpoint REST del nostro server e gli passeremo tre esempi. Chiederemo al nostro server di fornirci l'ultima versione del nostro servable non specificando una particolare versione.

!pip install -q requests

import requests
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/fashion_model:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)['predictions']

show(0, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
  class_names[np.argmax(predictions[0])], np.argmax(predictions[0]), class_names[test_labels[0]], test_labels[0]))

Una versione particolare del servable

Ora specifichiamo una particolare versione del nostro servable. Poiché ne abbiamo solo uno, selezioniamo la versione 1. Analizzeremo anche tutti e tre i risultati.

headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/fashion_model/versions/1:predict', data=data, headers=headers)
predictions = json.loads(json_response.text)['predictions']

for i in range(0,3):
  show(i, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(
    class_names[np.argmax(predictions[i])], np.argmax(predictions[i]), class_names[test_labels[i]], test_labels[i]))