Text classification

Use a TensorFlow Lite model to categorize a paragraph into predefined groups.

Get started

If you are new to TensorFlow Lite and are working with Android, we recommend exploring the TensorFlow Lite Task Library guide, which lets you integrate text classification models in just a few lines of code. You can also integrate the model using the TensorFlow Lite Interpreter Java API.

The Android example below demonstrates both approaches, as lib_task_api and lib_interpreter, respectively.

Android example

If you are using a platform other than Android, or you are already familiar with the TensorFlow Lite APIs, you can download our starter text classification model.

Download starter model

How it works

Text classification categorizes a paragraph into predefined groups based on its content.

This pretrained model predicts whether a paragraph's sentiment is positive or negative. It was trained on the Large Movie Review Dataset v1.0 from Maas et al., which consists of IMDB movie reviews labeled as either positive or negative.

Here are the steps to classify a paragraph with the model:

Tokenize the paragraph and convert it to a list of word ids using a predefined vocabulary.
Feed the list to the TensorFlow Lite model.
Get the probability of the paragraph being positive or negative from the model outputs.
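
The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the implementation used by the Android example: the reserved ids (0 for padding, 1 for start, 2 for unknown words), the fixed input length of 256, the regular-expression tokenizer, and the toy vocabulary are all assumptions made for the sketch, and the helper names `tokenize`, `to_ids`, and `classify` are hypothetical. The `classify` helper uses the standard `tf.lite.Interpreter` Python API and assumes the model's output is ordered as [negative, positive].

```python
import re

# Assumed conventions (not stated in this guide): reserved ids for padding,
# sequence start, and out-of-vocabulary words, plus a fixed input length.
PAD_ID, START_ID, UNKNOWN_ID = 0, 1, 2
MAX_LEN = 256


def tokenize(text):
    """Step 1a: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def to_ids(text, vocab, max_len=MAX_LEN):
    """Step 1b: map tokens to ids, then truncate or pad to max_len entries."""
    ids = [START_ID] + [vocab.get(tok, UNKNOWN_ID) for tok in tokenize(text)]
    ids = ids[:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


def classify(ids, model_path):
    """Steps 2 and 3: run one inference and return the two class scores."""
    # Imported lazily so the preprocessing helpers work without TensorFlow.
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    input_info = interpreter.get_input_details()[0]
    output_info = interpreter.get_output_details()[0]
    # Batch of one sequence, cast to whatever dtype the model declares.
    batch = np.array([ids], dtype=input_info["dtype"])
    interpreter.set_tensor(input_info["index"], batch)
    interpreter.invoke()
    return interpreter.get_tensor(output_info["index"])[0]


# Toy vocabulary for illustration only; a real one must match the model.
toy_vocab = {"what": 4, "a": 5, "waste": 6, "of": 7, "time": 8}
print(to_ids("What a waste of my time.", toy_vocab, max_len=10))
# prints [1, 4, 5, 6, 7, 2, 8, 0, 0, 0]
```

In the printed list, "my" is not in the toy vocabulary, so it maps to the unknown id 2, and the trailing zeros are padding. In practice the vocabulary and input length must match the ones the model was trained with.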

Note

Only English is supported.
This model was trained on a movie review dataset, so you may see reduced accuracy when classifying text from other domains.

Performance benchmarks

Performance benchmark numbers are generated with the tool described here.

Model name	Model size	Device	CPU latency
Text Classification	0.6 MB	Pixel 3 (Android 10)	0.05ms*
		Pixel 4 (Android 10)	0.05ms*
		iPhone XS (iOS 12.4.1)	0.025ms**

* 4 threads used.

** 2 threads used on iPhone for the best performance result.

Example output

Text	Negative (0)	Positive (1)
This is the best movie I’ve seen in recent years. Strongly recommend it!	25.3%	74.7%
What a waste of my time.	72.5%	27.5%
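
As a small illustration of how such a row might be produced, the sketch below turns a pair of scores into a label and percentage strings. It assumes the scores arrive as two probabilities ordered [negative, positive], matching the column indices in the table; the `describe` helper is hypothetical.

```python
LABELS = ("Negative", "Positive")  # index 0 and 1, matching the table columns


def describe(scores):
    """Return the higher-scoring label and a percentage string per class."""
    best = max(range(len(LABELS)), key=lambda i: scores[i])
    percentages = {LABELS[i]: f"{scores[i]:.1%}" for i in range(len(LABELS))}
    return LABELS[best], percentages


print(describe([0.253, 0.747]))
# prints ('Positive', {'Negative': '25.3%', 'Positive': '74.7%'})
```

An application that needs a more cautious decision could require the winning score to exceed a threshold before acting on the label.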

Use your training dataset

Follow this tutorial to apply the same technique used here to train a text classification model on your own dataset. With the right dataset, you can create a model for use cases such as document categorization or toxic comment detection.