
TensorFlow Lite guide

TensorFlow Lite is a set of tools to help developers run TensorFlow models on mobile, embedded, and IoT devices. It enables on-device machine learning inference with low latency and a small binary size.

TensorFlow Lite consists of two main components:

  • The TensorFlow Lite interpreter, which runs specially optimized models on many different hardware types, including mobile phones, embedded Linux devices, and microcontrollers.
  • The TensorFlow Lite converter, which converts TensorFlow models into an efficient form for use by the interpreter, and can apply optimizations that reduce binary size and improve performance.

Machine learning at the edge

TensorFlow Lite is designed to make it easy to perform machine learning on devices, "at the edge" of the network, instead of sending data back and forth from a server. For developers, performing machine learning on-device can help improve:

  • Latency: there's no round-trip to a server
  • Privacy: no data needs to leave the device
  • Connectivity: an Internet connection isn't required
  • Power consumption: network connections are power hungry

TensorFlow Lite works with a huge range of devices, from tiny microcontrollers to powerful mobile phones.

Get started

To begin working with TensorFlow Lite on mobile devices, visit Get started. If you want to deploy TensorFlow Lite models to microcontrollers, visit Microcontrollers.

Key features

  • Interpreter tuned for on-device ML, with a small binary size and a set of core operators optimized for on-device applications.
  • Diverse platform support, covering Android and iOS devices, embedded Linux, and microcontrollers, making use of platform APIs for accelerated inference.
  • APIs for multiple languages including Java, Swift, Objective-C, C++, and Python.
  • High performance, with hardware acceleration on supported devices, device-optimized kernels, and pre-fused activations and biases.
  • Model optimization tools, including quantization, that can reduce the size and increase the performance of models with minimal impact on accuracy.
  • Efficient model format, using a FlatBuffer that is optimized for small size and portability.
  • Pre-trained models for common machine learning tasks that can be customized to your application.
  • Samples and tutorials that show you how to deploy machine learning models on supported platforms.
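
As an illustration of the quantization feature listed above, here is a minimal sketch of post-training dynamic range quantization. It assumes a TensorFlow 2.x release that ships the `tf.lite` module; the small Keras model is a hypothetical stand-in, not a model from this guide.

```python
import tensorflow as tf

# Hypothetical stand-in model, large enough that weight quantization
# has a visible effect on the converted file size.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(256,)),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dense(10),
])

# Baseline: convert with default settings (float32 weights).
float_model = tf.lite.TFLiteConverter.from_keras_model(model).convert()

# Same model, converted with the default optimization set enabled.
# With no representative dataset supplied, this applies dynamic range
# quantization, which stores the weights as 8-bit integers.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_model = converter.convert()

print(f"float32 model:   {len(float_model)} bytes")
print(f"quantized model: {len(quantized_model)} bytes")
```

The quantized model is typically several times smaller than the float32 one. The effect on accuracy depends on the model and should be measured on your own evaluation data.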

Development workflow

The workflow for using TensorFlow Lite involves the following steps:

  1. Pick a model

    Bring your own TensorFlow model, find a model online, or pick a model from our Pre-trained models to drop in or retrain.

  2. Convert the model

    If you're using a custom model, use the TensorFlow Lite converter and a few lines of Python to convert it to the TensorFlow Lite format.

  3. Deploy to your device

    Run your model on-device with the TensorFlow Lite interpreter, with APIs in many languages.

  4. Optimize your model

    Use our Model Optimization Toolkit to reduce your model's size and increase its efficiency with minimal impact on accuracy.
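
Steps 2 and 3 above can be sketched in Python roughly as follows. This is a minimal example under stated assumptions: the tiny Keras model is a hypothetical placeholder for your own model, and the code assumes a TensorFlow 2.x release that ships `tf.lite.TFLiteConverter` and `tf.lite.Interpreter`. On a phone or microcontroller you would use the platform's interpreter API instead of the Python one.

```python
import numpy as np
import tensorflow as tf

# Step 1 (placeholder): a tiny hypothetical model standing in for your own.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(2),
])

# Step 2: convert the model to the TensorFlow Lite FlatBuffer format.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()  # serialized model as bytes

# Step 3: load the converted model into the interpreter and run inference.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# One random input sample with the shape and dtype the model expects.
sample = np.random.rand(1, 4).astype(np.float32)
interpreter.set_tensor(input_details[0]["index"], sample)
interpreter.invoke()
prediction = interpreter.get_tensor(output_details[0]["index"])

print(prediction.shape)
```

To deploy, the bytes in `tflite_model` would normally be written to a `.tflite` file and bundled with the application.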

To learn more about using TensorFlow Lite in your project, see Get started.

Technical constraints

TensorFlow Lite aims to provide high-performance on-device inference for any TensorFlow model. However, the TensorFlow Lite interpreter currently supports only a limited subset of TensorFlow operators that have been optimized for on-device use. This means that some models require additional steps to work with TensorFlow Lite.

To learn which operators are available, see Operator compatibility.

If your model uses operators that are not yet supported by the TensorFlow Lite interpreter, you can use TensorFlow Select to include TensorFlow operations in your TensorFlow Lite build. However, this will lead to an increased binary size.
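
On the converter side, enabling this fallback might look like the sketch below. It assumes the Python converter API in a TensorFlow 2.x release. The model is a hypothetical placeholder that does not itself need any TensorFlow operations; it is here only to keep the example self-contained.

```python
import tensorflow as tf

# Hypothetical placeholder model. A real use case would be a model that
# contains operators missing from the TensorFlow Lite builtin set.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(2),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)

# Prefer TensorFlow Lite builtin operators, and allow the converter to fall
# back to selected TensorFlow operators for anything the builtins lack.
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,
]

tflite_model = converter.convert()
print(f"converted model: {len(tflite_model)} bytes")
```

A model that actually uses the fallback operators also needs a runtime build that includes them, which is where the increase in binary size comes from.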

TensorFlow Lite does not currently support on-device training, but it is on our Roadmap, along with other planned improvements.

Next steps

Want to keep learning about TensorFlow Lite? Here are some next steps: