
Get started with microcontrollers

This document will help you start working with TensorFlow Lite for Microcontrollers.

Start by reading through and running our Examples.

For a walkthrough of the code required to run inference, see the Run inference section below.


Examples

There are several examples that demonstrate how to build embedded machine learning applications with TensorFlow Lite:

Hello World example

This example is designed to demonstrate the absolute basics of using TensorFlow Lite for Microcontrollers. It includes the full end-to-end workflow of training a model, converting it for use with TensorFlow Lite, and running inference on a microcontroller.

In the example, a model is trained to replicate a sine function. When deployed to a microcontroller, its predictions are used to either blink LEDs or control an animation.
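
As a rough sketch only (not the example's actual output code), a prediction in the range [-1, 1] could be mapped to an LED brightness like this, where SetLedBrightness() is a hypothetical stand-in for your board's PWM or GPIO call:

// Minimal sketch: map the model's float prediction y in [-1, 1] to an
// 8-bit PWM duty cycle. SetLedBrightness() is a hypothetical platform call.
extern void SetLedBrightness(int duty_cycle);

void HandleSinePrediction(float y) {
  int duty_cycle = static_cast<int>((y + 1.0f) * 127.5f);
  SetLedBrightness(duty_cycle);
}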


The example code includes a Jupyter notebook that demonstrates how the model is trained and converted.


The process of building and converting a model is also covered in the guide Build and convert models.

To see how inference is performed, take a look at the example's source code.

The example is tested on the following platforms:

Micro Speech example

This example uses a simple audio recognition model to identify keywords in speech. The sample code captures audio from a device's microphones. The model classifies this audio in real time, determining whether the word "yes" or "no" has been spoken.


The Run inference section walks through the code of the Micro Speech sample and explains how it works.

The example is tested on the following platforms:

Micro Vision example

This example shows how you can use TensorFlow Lite to run a 250 kilobyte neural network to recognize people in images captured by a camera. It is designed to run on systems with small amounts of memory such as microcontrollers and DSPs.


The example is tested on the following platforms:

Run inference

The following section walks through the Micro Speech sample's code and explains how it uses TensorFlow Lite for Microcontrollers to run inference.


To use the library, we must include the following header files:

#include "tensorflow/lite/experimental/micro/kernels/all_ops_resolver.h"
#include "tensorflow/lite/experimental/micro/micro_error_reporter.h"
#include "tensorflow/lite/experimental/micro/micro_interpreter.h"
#include "tensorflow/lite/schema/schema_generated.h"
#include "tensorflow/lite/version.h"

The sample also includes some other files. These are the most significant:

#include "tensorflow/lite/experimental/micro/examples/micro_speech/feature_provider.h"
#include "tensorflow/lite/experimental/micro/examples/micro_speech/micro_features/micro_model_settings.h"
#include "tensorflow/lite/experimental/micro/examples/micro_speech/micro_features/tiny_conv_micro_features_model_data.h"

Set up logging

To set up logging, a tflite::ErrorReporter pointer is created using a pointer to a tflite::MicroErrorReporter instance:

tflite::MicroErrorReporter micro_error_reporter;
tflite::ErrorReporter* error_reporter = &micro_error_reporter;

This variable will be passed into the interpreter, which allows it to write logs. Since microcontrollers often have a variety of mechanisms for logging, the implementation of tflite::MicroErrorReporter is designed to be customized for your particular device.
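
For example, a reporter that writes to a serial port might look something like the following sketch, where SerialWrite() is a hypothetical platform call (the error_reporter.h header path may differ between TensorFlow versions):

#include <cstdarg>
#include <cstdio>

#include "tensorflow/lite/core/api/error_reporter.h"

extern void SerialWrite(const char* message);  // hypothetical platform call

class SerialErrorReporter : public tflite::ErrorReporter {
 public:
  // Format the message into a local buffer and pass it to the serial port.
  int Report(const char* format, va_list args) override {
    char buffer[128];
    int written = vsnprintf(buffer, sizeof(buffer), format, args);
    SerialWrite(buffer);
    return written;
  }
};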

Load a model

In the following code, the model is instantiated from a char array, g_tiny_conv_micro_features_model_data (to learn how this is created, see Build and convert models). We then check the model to ensure its schema version is compatible with the version we are using:

const tflite::Model* model =
    ::tflite::GetModel(g_tiny_conv_micro_features_model_data);
if (model->version() != TFLITE_SCHEMA_VERSION) {
  error_reporter->Report(
      "Model provided is schema version %d not equal "
      "to supported version %d.\n",
      model->version(), TFLITE_SCHEMA_VERSION);
  return 1;
}

Instantiate operations resolver

An AllOpsResolver instance is required by the interpreter to access TensorFlow operations. This class can be extended to add custom operations to your project:

tflite::ops::micro::AllOpsResolver resolver;
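
If your model uses an operation that is not built in, one approach is to extend the resolver and register your own kernel. The following is only a sketch under some assumptions: Register_MY_CUSTOM_OP() is a hypothetical factory for your own TfLiteRegistration, and the registration method names may differ between TensorFlow versions, so check the resolver headers in your tree:

extern TfLiteRegistration* Register_MY_CUSTOM_OP();  // hypothetical custom kernel

// Extend the stock resolver so the interpreter can also look up "MyCustomOp".
class MyOpsResolver : public tflite::ops::micro::AllOpsResolver {
 public:
  MyOpsResolver() { AddCustom("MyCustomOp", Register_MY_CUSTOM_OP()); }
};

MyOpsResolver resolver;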

Allocate memory

We need to preallocate a certain amount of memory for input, output, and intermediate arrays. This is provided as a uint8_t array of size tensor_arena_size, which is passed into a tflite::SimpleTensorAllocator instance:

const int tensor_arena_size = 10 * 1024;
uint8_t tensor_arena[tensor_arena_size];
tflite::SimpleTensorAllocator tensor_allocator(tensor_arena,
                                               tensor_arena_size);

Instantiate interpreter

We create a tflite::MicroInterpreter instance, passing in the variables created earlier:

tflite::MicroInterpreter interpreter(model, resolver, &tensor_allocator,
                                     error_reporter);

Validate input shape

The MicroInterpreter instance can provide us with a pointer to the model's input tensor by calling .input(0), where 0 represents the first (and only) input tensor. We inspect this tensor to confirm that its shape and type are what we are expecting:

TfLiteTensor* model_input = interpreter.input(0);
if ((model_input->dims->size != 4) || (model_input->dims->data[0] != 1) ||
    (model_input->dims->data[1] != kFeatureSliceCount) ||
    (model_input->dims->data[2] != kFeatureSliceSize) ||
    (model_input->type != kTfLiteUInt8)) {
  error_reporter->Report("Bad input tensor parameters in model");
  return 1;
}

In this snippet, the variables kFeatureSliceCount and kFeatureSliceSize relate to properties of the input and are defined in micro_model_settings.h. The enum value kTfLiteUInt8 is a reference to one of the TensorFlow Lite data types, and is defined in c_api_internal.h.

Generate features

The data we input to our model must be generated from the microcontroller's audio input. The FeatureProvider class defined in feature_provider.h captures audio and converts it into a set of features that will be passed into the model. When it is instantiated, we use the TfLiteTensor obtained earlier to pass in a pointer to the input array. This is used by the FeatureProvider to populate the input data that will be passed into the model:

  FeatureProvider feature_provider(kFeatureElementCount,
                                   model_input->data.uint8);

The following code causes the FeatureProvider to generate a set of features from the most recent second of audio and populate the input tensor:

TfLiteStatus feature_status = feature_provider.PopulateFeatureData(
    error_reporter, previous_time, current_time, &how_many_new_slices);

In the sample, feature generation and inference happen in a loop, so the device is constantly capturing and processing new audio.

If you are writing your own program, you will likely generate features in a different way, but you will always populate the input tensor with data before running the model.
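
For instance, if your own preprocessing code already produces a buffer of quantized feature values, a minimal sketch of populating the input tensor could look like this (my_feature_data is a hypothetical buffer sized to match the model's input):

// Copy preprocessed values into the input tensor obtained earlier.
// kFeatureElementCount comes from micro_model_settings.h.
extern const uint8_t my_feature_data[kFeatureElementCount];

for (int i = 0; i < kFeatureElementCount; ++i) {
  model_input->data.uint8[i] = my_feature_data[i];
}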

Run the model

To run the model, we can call Invoke() on our tflite::MicroInterpreter instance:

TfLiteStatus invoke_status = interpreter.Invoke();
if (invoke_status != kTfLiteOk) {
  error_reporter->Report("Invoke failed");
  return 1;
}

We can check the return value, a TfLiteStatus, to determine if the run was successful. The possible values of TfLiteStatus, defined in c_api_internal.h, are kTfLiteOk and kTfLiteError.

Obtain the output

The model's output tensor can be obtained by calling output(0) on the tflite::MicroInterpreter, where 0 represents the first (and only) output tensor.

In the sample, the output is an array representing the probability of the input belonging to various classes (representing "yes", "no", "unknown", and "silence"). Since they are in a set order, we can use simple logic to determine which class has the highest probability:

    TfLiteTensor* output = interpreter.output(0);
    uint8_t top_category_score = 0;
    int top_category_index = 0;
    for (int category_index = 0; category_index < kCategoryCount;
         ++category_index) {
      const uint8_t category_score = output->data.uint8[category_index];
      if (category_score > top_category_score) {
        top_category_score = category_score;
        top_category_index = category_index;
      }
    }

Elsewhere in the sample, a more sophisticated algorithm is used to smooth recognition results across a number of frames. This is defined in recognize_commands.h. The same technique can be used to improve reliability when processing any continuous stream of data.
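
As an illustration only (this is not the algorithm in recognize_commands.h), one simple way to smooth scores is an exponential moving average over the per-category scores from successive inferences:

// Minimal sketch: keep a running average per category, weighting history
// 3:1 against the newest frame to damp single-frame noise.
uint8_t average_scores[kCategoryCount] = {0};

void UpdateAverageScores(const uint8_t* latest_scores) {
  for (int i = 0; i < kCategoryCount; ++i) {
    average_scores[i] =
        static_cast<uint8_t>((3 * average_scores[i] + latest_scores[i]) / 4);
  }
}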

Next steps

Once you have built and run the samples, read the following documents: