Performance

This document lists TensorFlow Lite performance benchmarks for well-known models running on a selection of Android and iOS devices.

These performance benchmark numbers were generated with the Android TFLite benchmark binary and the iOS benchmark app.

Android performance benchmarks

For Android benchmarks, the CPU affinity is set to use big cores on the device to reduce variance.

The models are assumed to have been downloaded and unzipped to the /data/local/tmp/tflite_models directory. The benchmark binary is built using these instructions and is assumed to be in the /data/local/tmp directory.
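As a rough sketch of that setup (the model filename below is only a placeholder for whichever .tflite file was downloaded and unzipped), the files can be placed on the device with adb:

adb shell mkdir -p /data/local/tmp/tflite_models
adb push mobilenet_v1_1.0_224.tflite /data/local/tmp/tflite_models/
adb push benchmark_model /data/local/tmp
adb shell chmod +x /data/local/tmp/benchmark_model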

To run the benchmark:

adb shell taskset ${CPU_MASK} /data/local/tmp/benchmark_model \
  --num_threads=1 \
  --graph=/data/local/tmp/tflite_models/${GRAPH} \
  --warmup_runs=1 \
  --num_runs=50 \
  --use_nnapi=false

Here, ${GRAPH} is the name of the model file and ${CPU_MASK} is the CPU affinity, chosen according to the following table:

Device     CPU_MASK
Pixel 2    f0
Pixel XL   0c
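For example, to benchmark the float MobileNet model on a Pixel 2 (CPU_MASK f0 from the table above; the model filename is only a placeholder for the actual unzipped file name):

adb shell taskset f0 /data/local/tmp/benchmark_model \
  --num_threads=1 \
  --graph=/data/local/tmp/tflite_models/mobilenet_v1_1.0_224.tflite \
  --warmup_runs=1 \
  --num_runs=50 \
  --use_nnapi=false

The measured mean inference times for each model and device are summarized below.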
Model Name                 Device    Mean inference time (std dev)
Mobilenet_1.0_224 (float)  Pixel 2   166.5 ms (2.6 ms)
Mobilenet_1.0_224 (float)  Pixel XL  122.9 ms (1.8 ms)
Mobilenet_1.0_224 (quant)  Pixel 2   69.5 ms (0.9 ms)
Mobilenet_1.0_224 (quant)  Pixel XL  78.9 ms (2.2 ms)
NASNet mobile              Pixel 2   273.8 ms (3.5 ms)
NASNet mobile              Pixel XL  210.8 ms (4.2 ms)
SqueezeNet                 Pixel 2   234.0 ms (2.1 ms)
SqueezeNet                 Pixel XL  158.0 ms (2.1 ms)
Inception_ResNet_V2        Pixel 2   2846.0 ms (15.0 ms)
Inception_ResNet_V2        Pixel XL  1973.0 ms (15.0 ms)
Inception_V4               Pixel 2   3180.0 ms (11.7 ms)
Inception_V4               Pixel XL  2262.0 ms (21.0 ms)

iOS benchmarks

To run the iOS benchmarks, the benchmark app was modified to include the appropriate model, and benchmark_params.json was modified to set num_threads to 1.
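As a minimal sketch of that change (only the num_threads entry is shown; any other keys, and whether the value is written as a number or a string, should follow the benchmark_params.json that ships with the app):

{
  "num_threads" : 1
}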

Model Name                 Device    Mean inference time (std dev)
Mobilenet_1.0_224 (float)  iPhone 8  32.2 ms (0.8 ms)
Mobilenet_1.0_224 (quant)  iPhone 8  24.4 ms (0.8 ms)
NASNet mobile              iPhone 8  60.3 ms (0.6 ms)
SqueezeNet                 iPhone 8  44.3 ms (0.7 ms)
Inception_ResNet_V2        iPhone 8  562.4 ms (18.2 ms)
Inception_V4               iPhone 8  661.0 ms (29.2 ms)