Text classification with TensorFlow Lite Model Maker

The TensorFlow Lite Model Maker library simplifies the process of adapting and converting a TensorFlow neural-network model to particular input data when deploying the model in on-device ML applications.

This notebook shows an end-to-end example that uses the Model Maker library to illustrate the adaptation and conversion of a commonly-used text classification model to classify movie reviews on a mobile device.

Prerequisites

To run this example, we first need to install several required packages, including the Model Maker package from its GitHub repo.

pip install -q git+https://github.com/tensorflow/examples.git#egg=tensorflow-examples[model_maker]

Import the required packages.

import numpy as np
import os

import tensorflow as tf
assert tf.__version__.startswith('2')

from tensorflow_examples.lite.model_maker.core.data_util.text_dataloader import TextClassifierDataLoader
from tensorflow_examples.lite.model_maker.core.task.model_spec import AverageWordVecModelSpec
from tensorflow_examples.lite.model_maker.core.task.model_spec import BertClassifierModelSpec
from tensorflow_examples.lite.model_maker.core.task import text_classifier
/tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_addons/utils/ensure_tf_install.py:44: UserWarning: You are currently using a nightly version of TensorFlow (2.3.0-dev20200529). 
TensorFlow Addons offers no support for the nightly versions of TensorFlow. Some things might work, some other might not. 
If you encounter a bug, do not file an issue on GitHub.
  UserWarning,

Simple End-to-End Example

Get the data path

Let's get some text data to play with in this simple end-to-end example.

data_path = tf.keras.utils.get_file(
      fname='aclImdb',
      origin='http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz',
      untar=True)
Downloading data from http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
84131840/84125825 [==============================] - 8s 0us/step

You could replace it with your own text folders. To upload data to Colab, use the upload button in the left sidebar, marked with the red rectangle in the image below. Try uploading a zip file and unzipping it; the root file path is the current path.

Upload File

If you prefer not to upload your data to the cloud, you could run the library locally by following the guide on GitHub.

Run the example

The example consists of just 6 lines of code, as shown below, representing the 5 steps of the overall process.

Step 0. Choose a model_spec that represents a model for the text classifier.

model_spec = AverageWordVecModelSpec()

Step 1. Load train and test data specific to an on-device ML app and preprocess the data according to the model_spec.

train_data = TextClassifierDataLoader.from_folder(os.path.join(data_path, 'train'), model_spec=model_spec, class_labels=['pos', 'neg'])
test_data = TextClassifierDataLoader.from_folder(os.path.join(data_path, 'test'), model_spec=model_spec, is_training=False, shuffle=False)
INFO:tensorflow:Saved vocabulary in /tmp/tmp_yy9n7_p/ec62defe9648a07cf12ee6e273b7597e_vocab.

Step 2. Customize the TensorFlow model.

model = text_classifier.create(train_data, model_spec=model_spec)
INFO:tensorflow:Retraining the models...

Epoch 1/3
781/781 [==============================] - 2s 2ms/step - loss: 0.5334 - accuracy: 0.7438
Epoch 2/3
781/781 [==============================] - 2s 2ms/step - loss: 0.2944 - accuracy: 0.8845
Epoch 3/3
781/781 [==============================] - 2s 2ms/step - loss: 0.2351 - accuracy: 0.9111

Step 3. Evaluate the model.

loss, acc = model.evaluate(test_data)
782/782 [==============================] - 1s 2ms/step - loss: 0.3237 - accuracy: 0.8644

Step 4. Export to a TensorFlow Lite model. You could download it from the left sidebar (the same place as the upload button) for your own use.

model.export(export_dir='.')
INFO:tensorflow:Saving labels in ./labels.txt.

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/keras/backend.py:465: set_learning_phase (from tensorflow.python.keras.backend) is deprecated and will be removed after 2020-10-11.
Instructions for updating:
Simply pass a True/False value to the `training` argument of the `__call__` method of your layer or model.

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/training/tracking/tracking.py:105: Model.state_updates (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
This property should not be used in TensorFlow 2.0, as updates are applied automatically.

INFO:tensorflow:Assets written to: /tmp/tmp628c92sp/assets

INFO:tensorflow:Saved vocabulary in ./vocab.

After these simple 5 steps, we could use the TensorFlow Lite model file and label file in on-device applications, such as the text classification reference app.

Detailed Process

Above, we tried the simple end-to-end example. The following walks through the example step by step to show more detail.

Step 0: Choose a model_spec that represents a model for the text classifier.

Each model_spec object represents a specific model for the text classifier. Currently, the library supports the averaging word embedding model and the BERT-base model.

model_spec = AverageWordVecModelSpec()

Step 1: Load Input Data Specific to an On-device ML App

The IMDB dataset contains 25000 movie reviews for training and 25000 movie reviews for testing from the Internet Movie Database. The dataset has two classes: positive and negative movie reviews.

Download the archive version of the dataset and untar it.

The IMDB dataset has the following directory structure:

aclImdb
|__ train
    |______ pos: [1962_10.txt, 2499_10.txt, ...]
    |______ neg: [104_3.txt, 109_2.txt, ...]
    |______ unsup: [12099_0.txt, 1424_0.txt, ...]
|__ test
    |______ pos: [1384_9.txt, 191_9.txt, ...]
    |______ neg: [1629_1.txt, 21_1.txt]

Note that the text data under the train/unsup folder are unlabeled documents for unsupervised learning; such data are ignored in this tutorial.

data_path = tf.keras.utils.get_file(
      fname='aclImdb',
      origin='http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz',
      untar=True)

Use TextClassifierDataLoader to load data.

The from_folder() method loads data from a folder. It assumes that text data of the same class are in the same subdirectory, and that the subfolder name is the class name. Each text file contains one movie review sample.

The class_labels parameter specifies which subfolders to consider. For the train folder, it is used to skip the unsup subfolder.
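Conceptually, from_folder() walks each subdirectory listed in class_labels and treats every text file inside as one sample of that class. The following is a minimal pure-Python sketch of that behavior; load_text_folder is a hypothetical name, not part of the Model Maker API:

```python
import os

def load_text_folder(root, class_labels):
    """Collect (text, label) pairs from subfolders named after classes.

    Subfolders not listed in class_labels (such as 'unsup') are skipped.
    """
    samples = []
    for label in class_labels:
        class_dir = os.path.join(root, label)
        for fname in sorted(os.listdir(class_dir)):
            with open(os.path.join(class_dir, fname), encoding='utf-8') as f:
                samples.append((f.read(), label))
    return samples
```

Besides loading, the real TextClassifierDataLoader also tokenizes and encodes each review according to the given model_spec.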

train_data = TextClassifierDataLoader.from_folder(os.path.join(data_path, 'train'), model_spec=model_spec, class_labels=['pos', 'neg'])
test_data = TextClassifierDataLoader.from_folder(os.path.join(data_path, 'test'), model_spec=model_spec, is_training=False, shuffle=False)
train_data, validation_data = train_data.split(0.9)
INFO:tensorflow:Saved vocabulary in /tmp/tmp488s4h84/ec62defe9648a07cf12ee6e273b7597e_vocab.

Step 2: Customize the TensorFlow Model

Create a custom text classifier model based on the loaded data. Currently, the library supports the averaging word embedding model and the BERT-base model.

model = text_classifier.create(train_data, model_spec=model_spec, validation_data=validation_data)
INFO:tensorflow:Retraining the models...

Epoch 1/3
703/703 [==============================] - 2s 3ms/step - loss: 0.5364 - accuracy: 0.7485 - val_loss: 0.3322 - val_accuracy: 0.8670
Epoch 2/3
703/703 [==============================] - 2s 3ms/step - loss: 0.2935 - accuracy: 0.8850 - val_loss: 0.2681 - val_accuracy: 0.8938
Epoch 3/3
703/703 [==============================] - 2s 3ms/step - loss: 0.2332 - accuracy: 0.9124 - val_loss: 0.2567 - val_accuracy: 0.9006

Have a look at the detailed model structure.

model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, 256, 16)           160048    
_________________________________________________________________
global_average_pooling1d_1 ( (None, 16)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 16)                272       
_________________________________________________________________
dropout_1 (Dropout)          (None, 16)                0         
_________________________________________________________________
dense_3 (Dense)              (None, 2)                 34        
=================================================================
Total params: 160,354
Trainable params: 160,354
Non-trainable params: 0
_________________________________________________________________
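The parameter counts in this summary can be checked by hand. Assuming the vocabulary holds the 10,000 most frequent words plus a few special tokens (10,003 embedding rows, inferred from the 160,048 figure above):

```python
vocab_rows = 10003    # 10,000 words + special tokens (inferred from the summary)
wordvec_dim = 16      # embedding output dimension
hidden_units = 16     # units in the intermediate dense layer
num_classes = 2       # pos / neg

embedding = vocab_rows * wordvec_dim                      # 160,048
dense_relu = wordvec_dim * hidden_units + hidden_units    # weights + biases = 272
dense_softmax = hidden_units * num_classes + num_classes  # weights + biases = 34

total = embedding + dense_relu + dense_softmax
print(total)  # 160354, matching "Total params" above
```

The pooling and dropout layers have no trainable parameters, which is why the total is just the sum of these three layers.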

Step 3: Evaluate the Customized Model

Evaluate the model to get its loss and accuracy on test_data. If no data is given, the results are evaluated on the data that was split off in the create method.

loss, acc = model.evaluate(test_data)
782/782 [==============================] - 1s 2ms/step - loss: 0.3069 - accuracy: 0.8738

Step 4: Export to TensorFlow Lite Model

Convert the trained model to the TensorFlow Lite model format for later use in an on-device ML application. Meanwhile, save the text labels in a label file and the vocabulary in a vocab file. The default TFLite filename is model.tflite, the default label filename is labels.txt, and the default vocab filename is vocab.

model.export(export_dir='.')
INFO:tensorflow:Saving labels in ./labels.txt.

INFO:tensorflow:Assets written to: /tmp/tmpxvg3vrck/assets

INFO:tensorflow:Saved vocabulary in ./vocab.

The TensorFlow Lite model file and label file could be used in the text classification reference app.

In detail, we could add movie_review_classifier.tflite, text_label.txt and vocab.txt to the app's assets directory, and update the corresponding filenames in the code.

Here, we also demonstrate how to use the above files to run and evaluate the TensorFlow Lite model.

# Read TensorFlow Lite model from TensorFlow Lite file.
with tf.io.gfile.GFile('model.tflite', 'rb') as f:
  model_content = f.read()

# Read label names from label file.
with tf.io.gfile.GFile('labels.txt', 'r') as f:
  label_names = f.read().split('\n')

# Initialize the TensorFlow Lite interpreter.
interpreter = tf.lite.Interpreter(model_content=model_content)
interpreter.allocate_tensors()
input_index = interpreter.get_input_details()[0]['index']
output = interpreter.tensor(interpreter.get_output_details()[0]["index"])

# Run predictions on each test data and calculate accuracy.
accurate_count = 0
for text, label in test_data.dataset:
    # Add batch dimension and convert to float32 to match with the model's input
    # data format.
    text = tf.expand_dims(text, 0)
    text = tf.cast(text, tf.float32)

    # Run inference.
    interpreter.set_tensor(input_index, text)
    interpreter.invoke()

    # Post-processing: remove batch dimension and find the label with highest
    # probability.
    predict_label = np.argmax(output()[0])
    # Get label name with label index.
    predict_label_name = label_names[predict_label]
    accurate_count += (predict_label == label.numpy())

accuracy = accurate_count * 1.0 / test_data.size
print('TensorFlow Lite model accuracy = %.4f' % accuracy)
TensorFlow Lite model accuracy = 0.8738

Note that preprocessing for inference must be the same as for training. Currently, preprocessing consists of splitting the text into tokens on '\W', encoding the tokens to ids, then padding the encoded text with pad_id to the length of seq_length.
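That preprocessing can be sketched in plain Python. This is a simplified illustration, not the Model Maker implementation; the preprocess name and the pad_id/unknown_id values are assumptions:

```python
import re

def preprocess(text, vocab, seq_length, pad_id=0, unknown_id=1):
    """Split on non-word characters, map tokens to ids, then pad/truncate."""
    tokens = [t for t in re.split(r'\W+', text.lower()) if t]
    ids = [vocab.get(t, unknown_id) for t in tokens]
    ids = ids[:seq_length]                     # truncate long reviews
    ids += [pad_id] * (seq_length - len(ids))  # pad short ones to seq_length
    return ids

vocab = {'great': 2, 'movie': 3}
print(preprocess('A great movie!', vocab, seq_length=5))  # [1, 2, 3, 0, 0]
```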

Advanced Usage

The create function is the critical part of this library. Its model_spec parameter defines the specification of the model; currently AverageWordVecModelSpec and BertClassifierModelSpec are supported. For AverageWordVecModelSpec, the create function contains the following steps:

  1. Tokenize the text and select the num_words most frequent words to generate the vocabulary. The default value of num_words in the AverageWordVecModelSpec object is 10000.
  2. Encode the text string tokens to int ids.
  3. Create the text classifier model. Currently, this library supports one such model: average the word embeddings of the text, apply a dense layer with ReLU activation, then a softmax dense layer for classification. For the Embedding layer, the input dimension is the vocabulary size, the output dimension is the AverageWordVecModelSpec object's wordvec_dim variable (default 16), and the input length is its seq_len variable (default 256).
  4. Train the classifier model. The default number of epochs is 3 and the default batch size is 32.

In this section, we describe several advanced topics, including adjusting the model, changing the training hyperparameters etc.

Adjust the model

We could adjust the model by changing variables such as wordvec_dim and seq_len in the AverageWordVecModelSpec class.

  • wordvec_dim: Dimension of the word embedding.
  • seq_len: Length of the input sequence.

For example, we could train with a larger wordvec_dim. If we change the model, we must first construct a new model_spec.

new_model_spec = AverageWordVecModelSpec(wordvec_dim=32)

Second, preprocess the data with the new model_spec.

new_train_data = TextClassifierDataLoader.from_folder(os.path.join(data_path, 'train'), model_spec=new_model_spec, class_labels=['pos', 'neg'])
new_train_data, new_validation_data = new_train_data.split(0.9)
INFO:tensorflow:Saved vocabulary in /tmp/tmpzbnnyfvy/50181340c97c05a9b95e08855e259577_vocab.

Finally, we could train the new model.

model = text_classifier.create(new_train_data, model_spec=new_model_spec, validation_data=new_validation_data)
INFO:tensorflow:Retraining the models...

Epoch 1/3
703/703 [==============================] - 2s 3ms/step - loss: 0.4605 - accuracy: 0.7913 - val_loss: 0.3223 - val_accuracy: 0.8710
Epoch 2/3
703/703 [==============================] - 2s 3ms/step - loss: 0.2484 - accuracy: 0.9033 - val_loss: 0.3081 - val_accuracy: 0.8834
Epoch 3/3
703/703 [==============================] - 2s 3ms/step - loss: 0.1942 - accuracy: 0.9281 - val_loss: 0.3204 - val_accuracy: 0.8798

Change the training hyperparameters

We could also change training hyperparameters like epochs and batch_size, which can affect the model accuracy. For instance,

  • epochs: more epochs could achieve better accuracy, but may lead to overfitting.
  • batch_size: number of samples to use in one training step.

For example, we could train with more epochs.

model = text_classifier.create(train_data, model_spec=model_spec, validation_data=validation_data, epochs=5)
INFO:tensorflow:Retraining the models...

Epoch 1/5
703/703 [==============================] - 2s 3ms/step - loss: 0.5235 - accuracy: 0.7656 - val_loss: 0.3254 - val_accuracy: 0.8734
Epoch 2/5
703/703 [==============================] - 2s 3ms/step - loss: 0.2911 - accuracy: 0.8866 - val_loss: 0.2688 - val_accuracy: 0.8934
Epoch 3/5
703/703 [==============================] - 2s 3ms/step - loss: 0.2317 - accuracy: 0.9130 - val_loss: 0.2554 - val_accuracy: 0.8998
Epoch 4/5
703/703 [==============================] - 2s 3ms/step - loss: 0.1938 - accuracy: 0.9302 - val_loss: 0.2600 - val_accuracy: 0.9026
Epoch 5/5
703/703 [==============================] - 2s 3ms/step - loss: 0.1672 - accuracy: 0.9430 - val_loss: 0.2727 - val_accuracy: 0.8982

Evaluate the newly retrained model with 5 training epochs.

loss, accuracy = model.evaluate(test_data)
782/782 [==============================] - 1s 2ms/step - loss: 0.3471 - accuracy: 0.8650

Change the Model

We could change the model by changing the model_spec. The following shows how to switch to the BERT-base model.

First, change model_spec to BertClassifierModelSpec.

model_spec = BertClassifierModelSpec()
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)

The remaining steps remain the same.

Load data and preprocess the data according to model_spec.

train_data = TextClassifierDataLoader.from_folder(os.path.join(data_path, 'train'), model_spec=model_spec, class_labels=['pos', 'neg'])
test_data = TextClassifierDataLoader.from_folder(os.path.join(data_path, 'test'), model_spec=model_spec, is_training=False, shuffle=False)

Then retrain the model. Note that it could take a long time to retrain the BERT model; we set epochs to 1 to demonstrate it.

model = text_classifier.create(train_data, model_spec=model_spec, epochs=1)
INFO:tensorflow:Retraining the models...

INFO:tensorflow:Reduce to /job:localhost/replica:0/task:0/device:CPU:0 then broadcast to ('/job:localhost/replica:0/task:0/device:CPU:0',).

  1/781 [..............................] - ETA: 1:36 - loss: 0.8518 - test_accuracy: 0.4375WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/ops/summary_ops_v2.py:1277: stop (from tensorflow.python.eager.profiler) is deprecated and will be removed after 2020-07-01.
Instructions for updating:
use `tf.profiler.experimental.stop` instead.

  2/781 [..............................] - ETA: 3:21 - loss: 0.8650 - test_accuracy: 0.4375WARNING:tensorflow:Callbacks method `on_train_batch_end` is slow compared to the batch time (batch time: 0.1385s vs `on_train_batch_end` time: 0.2556s). Check your callbacks.

781/781 [==============================] - 203s 260ms/step - loss: 0.3351 - test_accuracy: 0.8459