Use a TensorFlow Lite model to categorize a paragraph into predefined groups.
If you are new to TensorFlow Lite and are working with Android, we recommend exploring the TensorFlow Lite Task Library guide, which lets you integrate text classification models in just a few lines of code. You can also integrate the model using the TensorFlow Lite Interpreter Java API.
The Android example below demonstrates both methods, as lib_task_api and lib_interpreter, respectively.
If you are using a platform other than Android, or you are already familiar with the TensorFlow Lite APIs, you can download our starter text classification model.
How it works
Text classification categorizes a paragraph into predefined groups based on its content.
This pretrained model predicts whether a paragraph's sentiment is positive or negative. It was trained on the Large Movie Review Dataset v1.0 from Maas et al., which consists of IMDB movie reviews labeled as either positive or negative.
Here are the steps to classify a paragraph with the model:
- Tokenize the paragraph and convert it to a list of word ids using a predefined vocabulary.
- Feed the list to the TensorFlow Lite model.
- Get the probability of the paragraph being positive or negative from the model outputs.
Note the following limitations:
- Only English is supported.
- This model was trained on a movie review dataset, so you may experience reduced accuracy when classifying text from other domains.
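The three classification steps listed above can be sketched in Python as follows. This is a minimal illustration, not the implementation used by the Android example: the special tokens `<PAD>`, `<START>`, and `<UNKNOWN>`, the fixed input length of 256, the regex-based word splitting, and all function names are assumptions made for this sketch. The output order (index 0 negative, index 1 positive) follows the labels used on this page.

```python
import re

# Assumptions for this sketch: the special tokens and the fixed input
# length of 256 are illustrative and must match the vocabulary and input
# shape of the model you actually use.
PAD, START, UNKNOWN = "<PAD>", "<START>", "<UNKNOWN>"
SENTENCE_LEN = 256


def tokenize(text, vocab, sentence_len=SENTENCE_LEN):
    """Step 1: convert a paragraph into a fixed-length list of word ids."""
    words = [w for w in re.split(r"\W+", text.lower()) if w]
    ids = [vocab[START]]
    for word in words:
        # Words missing from the vocabulary map to the <UNKNOWN> id.
        ids.append(vocab.get(word, vocab[UNKNOWN]))
    ids = ids[:sentence_len]                          # truncate long inputs
    ids += [vocab[PAD]] * (sentence_len - len(ids))   # pad short inputs
    return ids


def make_tflite_runner(model_path):
    """Step 2: return a function that feeds word ids to a TFLite model."""
    # Imported lazily so that tokenize() can be used without TensorFlow.
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    input_info = interpreter.get_input_details()[0]
    output_info = interpreter.get_output_details()[0]

    def run(ids):
        # Use whatever dtype the model declares for its input tensor.
        batch = np.array([ids], dtype=input_info["dtype"])
        interpreter.set_tensor(input_info["index"], batch)
        interpreter.invoke()
        return interpreter.get_tensor(output_info["index"])[0].tolist()

    return run


def classify(text, vocab, run_model):
    """Step 3: read the negative and positive probabilities."""
    negative, positive = run_model(tokenize(text, vocab))
    return {"negative": negative, "positive": positive}
```

With TensorFlow installed, `classify(text, vocab, make_tflite_runner("text_classification.tflite"))` would run the full pipeline. The file name is a placeholder, and `vocab` is a dictionary from word to id that you would load from the vocabulary shipped with your model.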
Performance benchmark numbers are generated with the tool described here.
|Model Name||Model size||Device||CPU latency|
|Text Classification||0.6 MB||Pixel 3 (Android 10)||0.05ms*|
|Text Classification||0.6 MB||Pixel 4 (Android 10)||0.05ms*|
|Text Classification||0.6 MB||iPhone XS (iOS 12.4.1)||0.025ms**|
* 4 threads used.
** 2 threads used on iPhone for the best performance result.
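For a rough sanity check on your own hardware, a simple wall-clock timer can approximate per-inference latency. This is not the benchmark tool used to produce the numbers above, so its results are not directly comparable. The function name and the warm-up and run counts are assumptions made for this sketch.

```python
import statistics
import time


def measure_latency_ms(run_once, warmup=5, runs=50):
    """Return the mean wall-clock time of run_once() in milliseconds."""
    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples)
```

Here `run_once` is any zero-argument callable that performs a single inference. In the Python API, the number of CPU threads can be set with the `num_threads` argument of `tf.lite.Interpreter`.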
|Text||Negative (0)||Positive (1)|
|This is the best movie I’ve seen in recent years. Strongly recommend it!||25.3%||74.7%|
|What a waste of my time.||72.5%||27.5%|
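The rows above can be reproduced from a raw model output with a few lines of Python. This is a small sketch that assumes the output is a pair of probabilities ordered as in the table (index 0 negative, index 1 positive). The helper name `describe` is hypothetical.

```python
LABELS = ("Negative", "Positive")  # index 0 and index 1, as in the table


def describe(probabilities):
    """Return the most likely label and a percentage string per label."""
    best = max(range(len(LABELS)), key=lambda i: probabilities[i])
    shares = {label: f"{p:.1%}" for label, p in zip(LABELS, probabilities)}
    return LABELS[best], shares


label, shares = describe([0.253, 0.747])
# label is "Positive"; shares is {"Negative": "25.3%", "Positive": "74.7%"}
```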
Use your training dataset
Follow this tutorial to apply the same technique used here to train a text classification model on your own dataset. With the right dataset, you can create a model for use cases such as document categorization or toxic comment detection.