Interpreter

public final class Interpreter

Driver class to drive model inference with TensorFlow Lite.

An Interpreter encapsulates a pre-trained TensorFlow Lite model, in which operations are executed for model inference.

For example, if a model takes only one input and returns only one output:

try (Interpreter interpreter = new Interpreter(file_of_a_tensorflowlite_model)) {
  interpreter.run(input, output);
}

If a model takes multiple inputs or outputs:

Object[] inputs = {input0, input1, ...};
Map<Integer, Object> map_of_indices_to_outputs = new HashMap<>();
FloatBuffer ith_output = ByteBuffer
    .allocateDirect(3 * 2 * 4 * 4)  // Float tensor, shape 3x2x4 (4 bytes per float).
    .order(ByteOrder.nativeOrder())
    .asFloatBuffer();
map_of_indices_to_outputs.put(i, ith_output);
try (Interpreter interpreter = new Interpreter(file_of_a_tensorflowlite_model)) {
  interpreter.runForMultipleInputsOutputs(inputs, map_of_indices_to_outputs);
}

If a model takes or produces string tensors:

String[] input = {"foo", "bar"};  // Input tensor shape is [2].
String[][] output = new String[3][2];  // Output tensor shape is [3, 2].
try (Interpreter interpreter = new Interpreter(file_of_a_tensorflowlite_model)) {
  interpreter.run(input, output);
}

The order of inputs and outputs is determined when converting the TensorFlow model to a TensorFlow Lite model with Toco, as are the default shapes of the inputs.

When inputs are provided as (multi-dimensional) arrays, the corresponding input tensor(s) will be implicitly resized according to that array's shape. When inputs are provided as Buffer types, no implicit resizing is done; the caller must ensure that the Buffer byte size either matches that of the corresponding tensor, or that they first resize the tensor via resizeInput(int, int[]). Tensor shape and type information can be obtained via the Tensor class, available via getInputTensor(int) and getOutputTensor(int).
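As an illustration, a minimal sketch of the two input styles (the shapes, tensor index, and variable names here are hypothetical):

// Array input: the input tensor is implicitly resized to match the array's shape.
float[][][][] arrayInput = new float[1][224][224][3];
float[][] arrayOutput = new float[1][1001];
interpreter.run(arrayInput, arrayOutput);

// Buffer input: no implicit resizing, so resize explicitly if the shapes differ,
// and size the buffer to match the tensor's byte size exactly.
interpreter.resizeInput(0, new int[]{1, 224, 224, 3});
interpreter.allocateTensors();
ByteBuffer bufferInput = ByteBuffer
    .allocateDirect(interpreter.getInputTensor(0).numBytes())
    .order(ByteOrder.nativeOrder());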

WARNING: Instances of an Interpreter are not thread-safe. An Interpreter owns resources that must be explicitly freed by invoking close().

The TFLite library is built against NDK API 19. It may work for Android API levels below 19, but this is not guaranteed.

Nested Classes

class Interpreter.Options An options class for controlling runtime interpreter behavior. 

Public Constructors

Interpreter(File modelFile)
Initializes an Interpreter.
Interpreter(File modelFile, int numThreads)
This constructor was deprecated. Prefer using the Interpreter(File, Options) constructor. This method will be removed in a future release.
Interpreter(File modelFile, Interpreter.Options options)
Initializes an Interpreter and specifies options for customizing interpreter behavior.
Interpreter(ByteBuffer byteBuffer)
Initializes an Interpreter with a ByteBuffer of a model file.
Interpreter(ByteBuffer byteBuffer, int numThreads)
This constructor was deprecated. Prefer using the Interpreter(ByteBuffer, Options) constructor. This method will be removed in a future release.
Interpreter(MappedByteBuffer mappedByteBuffer)
This constructor was deprecated. Prefer using the Interpreter(ByteBuffer, Options) constructor. This method will be removed in a future release.
Interpreter(ByteBuffer byteBuffer, Interpreter.Options options)
Initializes an Interpreter with a ByteBuffer of a model file and a set of custom Interpreter.Options.

Public Methods

void
allocateTensors()
Explicitly updates allocations for all tensors, if necessary.
void
close()
Release resources associated with the Interpreter.
int
getInputIndex(String opName)
Gets index of an input given the op name of the input.
Tensor
getInputTensor(int inputIndex)
Gets the Tensor associated with the provided input index.
int
getInputTensorCount()
Gets the number of input tensors.
Long
getLastNativeInferenceDurationNanoseconds()
Returns native inference timing.
int
getOutputIndex(String opName)
Gets index of an output given the op name of the output.
Tensor
getOutputTensor(int outputIndex)
Gets the Tensor associated with the provided output index.
int
getOutputTensorCount()
Gets the number of output Tensors.
void
modifyGraphWithDelegate(Delegate delegate)
Advanced: Modifies the graph with the provided Delegate.
void
resetVariableTensors()
Advanced: Resets all variable tensors to the default value.
void
resizeInput(int idx, int[] dims, boolean strict)
Resizes idx-th input of the native model to the given dims.
void
resizeInput(int idx, int[] dims)
Resizes idx-th input of the native model to the given dims.
void
run(Object input, Object output)
Runs model inference if the model takes only one input, and provides only one output.
void
runForMultipleInputsOutputs(Object[] inputs, Map<Integer, Object> outputs)
Runs model inference if the model takes multiple inputs, or returns multiple outputs.
void
setNumThreads(int numThreads)
This method was deprecated. Prefer using Interpreter.Options#setNumThreads(int) directly for controlling multi-threading. This method will be removed in a future release.
void
setUseNNAPI(boolean useNNAPI)
This method was deprecated. Prefer using Interpreter.Options#setUseNNAPI(boolean) directly for enabling NN API. This method will be removed in a future release.

Public Constructors

public Interpreter (File modelFile)

Initializes an Interpreter.

Throws
IllegalArgumentException if modelFile does not encode a valid TensorFlow Lite model.

public Interpreter (File modelFile, int numThreads)

This constructor was deprecated.
Prefer using the Interpreter(File, Options) constructor. This method will be removed in a future release.

Initializes an Interpreter and specifies the number of threads used for inference.

public Interpreter (File modelFile, Interpreter.Options options)

Initializes an Interpreter and specifies options for customizing interpreter behavior.

Throws
IllegalArgumentException if modelFile does not encode a valid TensorFlow Lite model.

public Interpreter (ByteBuffer byteBuffer)

Initializes an Interpreter with a ByteBuffer of a model file.

The ByteBuffer should not be modified after the construction of an Interpreter. The ByteBuffer can be either a MappedByteBuffer that memory-maps a model file, or a direct ByteBuffer of nativeOrder() that contains the byte content of a model.

Throws
IllegalArgumentException if byteBuffer is not a MappedByteBuffer nor a direct ByteBuffer of nativeOrder.
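For example, a minimal sketch of memory-mapping a model file into a MappedByteBuffer (the file name is hypothetical):

import java.io.FileInputStream;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

try (FileInputStream stream = new FileInputStream("model.tflite");
     FileChannel channel = stream.getChannel()) {
  // Map the whole file read-only; the mapping remains valid for the Interpreter's lifetime.
  MappedByteBuffer model = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
  try (Interpreter interpreter = new Interpreter(model)) {
    // Run inference...
  }
}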

public Interpreter (ByteBuffer byteBuffer, int numThreads)

This constructor was deprecated.
Prefer using the Interpreter(ByteBuffer, Options) constructor. This method will be removed in a future release.

Initializes an Interpreter with a ByteBuffer of a model file and specifies the number of threads used for inference.

The ByteBuffer should not be modified after the construction of an Interpreter. The ByteBuffer can be either a MappedByteBuffer that memory-maps a model file, or a direct ByteBuffer of nativeOrder() that contains the byte content of a model.

public Interpreter (MappedByteBuffer mappedByteBuffer)

This constructor was deprecated.
Prefer using the Interpreter(ByteBuffer, Options) constructor. This method will be removed in a future release.

Initializes an Interpreter with a MappedByteBuffer to the model file.

The MappedByteBuffer should remain unchanged after the construction of an Interpreter.

public Interpreter (ByteBuffer byteBuffer, Interpreter.Options options)

Initializes an Interpreter with a ByteBuffer of a model file and a set of custom Interpreter.Options.

The ByteBuffer should not be modified after the construction of an Interpreter. The ByteBuffer can be either a MappedByteBuffer that memory-maps a model file, or a direct ByteBuffer of nativeOrder() that contains the byte content of a model.

Throws
IllegalArgumentException if byteBuffer is not a MappedByteBuffer nor a direct ByteBuffer of nativeOrder.

Public Methods

public void allocateTensors ()

Explicitly updates allocations for all tensors, if necessary.

This will propagate shapes and memory allocations for all dependent tensors using the input tensor shape(s) as given.

Note: This call is *purely optional*. Tensor allocation will occur automatically during execution if any input tensors have been resized. This call is most useful in determining the shapes for any output tensors before executing the graph, e.g.,

interpreter.resizeInput(0, new int[]{1, 4, 4, 3});
interpreter.allocateTensors();
FloatBuffer input = FloatBuffer.allocate(interpreter.getInputTensor(0).numElements());
// Populate inputs...
FloatBuffer output = FloatBuffer.allocate(interpreter.getOutputTensor(0).numElements());
interpreter.run(input, output);
// Process outputs...

Throws
IllegalStateException if the graph's tensors could not be successfully allocated.

public void close ()

Release resources associated with the Interpreter.

public int getInputIndex (String opName)

Gets index of an input given the op name of the input.

Throws
IllegalArgumentException if opName does not match any input in the model used to initialize the Interpreter.

public Tensor getInputTensor (int inputIndex)

Gets the Tensor associated with the provided input index.

Throws
IllegalArgumentException if inputIndex is negative or is not smaller than the number of model inputs.
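For example, a small sketch of using the returned Tensor's metadata to size an input buffer (index 0 is assumed to be the input of interest):

Tensor inputTensor = interpreter.getInputTensor(0);
int[] shape = inputTensor.shape();        // e.g., {1, 224, 224, 3}
DataType type = inputTensor.dataType();   // e.g., DataType.FLOAT32
// Allocate a direct buffer whose byte size matches the tensor exactly.
ByteBuffer input = ByteBuffer
    .allocateDirect(inputTensor.numBytes())
    .order(ByteOrder.nativeOrder());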

public int getInputTensorCount ()

Gets the number of input tensors.

public Long getLastNativeInferenceDurationNanoseconds ()

Returns native inference timing.

Throws
IllegalArgumentException if the model is not initialized by the Interpreter.
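For example, a sketch of reading the timing after a call to run(Object, Object) (the use of android.util.Log and the tag are assumptions for this example):

interpreter.run(input, output);
Long latencyNanos = interpreter.getLastNativeInferenceDurationNanoseconds();
if (latencyNanos != null) {  // May be null if timing is unavailable.
  Log.d("tflite", "Inference took " + (latencyNanos / 1_000_000) + " ms");
}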

public int getOutputIndex (String opName)

Gets index of an output given the op name of the output.

Throws
IllegalArgumentException if opName does not match any output in the model used to initialize the Interpreter.

public Tensor getOutputTensor (int outputIndex)

Gets the Tensor associated with the provided output index.

Note: Output tensor details (e.g., shape) may not be fully populated until after inference is executed. If you need updated details *before* running inference (e.g., after resizing an input tensor, which may invalidate output tensor shapes), use allocateTensors() to explicitly trigger allocation and shape propagation. Note that, for graphs with output shapes that are dependent on input *values*, the output shape may not be fully determined until running inference.

Throws
IllegalArgumentException if outputIndex is negative or is not smaller than the number of model outputs.

public int getOutputTensorCount ()

Gets the number of output Tensors.

public void modifyGraphWithDelegate (Delegate delegate)

Advanced: Modifies the graph with the provided Delegate.

Note: The typical path for providing delegates is via Interpreter.Options#addDelegate(Delegate), at creation time. This path should only be used when a delegate might require coordinated interaction between Interpreter creation and delegate application.

WARNING: This is an experimental API and subject to change.

Throws
IllegalArgumentException if an error occurs when modifying the graph with the delegate.
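As a sketch, assuming the optional GPU delegate artifact (org.tensorflow.lite.gpu.GpuDelegate) is available and a model file variable named modelFile:

GpuDelegate delegate = new GpuDelegate();
try (Interpreter interpreter = new Interpreter(modelFile)) {
  // Advanced path: apply the delegate after interpreter creation.
  interpreter.modifyGraphWithDelegate(delegate);
  interpreter.run(input, output);
} finally {
  delegate.close();  // The caller owns the delegate and must release it.
}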

public void resetVariableTensors ()

Advanced: Resets all variable tensors to the default value.

If a variable tensor doesn't have an associated buffer, it will be reset to zero.

WARNING: This is an experimental API and subject to change.

public void resizeInput (int idx, int[] dims, boolean strict)

Resizes idx-th input of the native model to the given dims.

When `strict` is true, only unknown dimensions can be resized. Unknown dimensions are indicated as `-1` in the array returned by `Tensor.shapeSignature()`.

Throws
IllegalArgumentException if idx is negative or is not smaller than the number of model inputs; or if an error occurs when resizing the idx-th input. Additionally, the error occurs when attempting to resize a tensor with fixed dimensions when `strict` is true.
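For example, a sketch of resizing an unknown batch dimension under strict mode (the shapes are hypothetical):

// Suppose shapeSignature() reports {-1, 224, 224, 3}: the batch dimension is unknown.
int[] signature = interpreter.getInputTensor(0).shapeSignature();
// With strict == true, only the -1 (unknown) dimension may be changed.
interpreter.resizeInput(0, new int[]{4, 224, 224, 3}, true);
interpreter.allocateTensors();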

public void resizeInput (int idx, int[] dims)

Resizes idx-th input of the native model to the given dims.

Throws
IllegalArgumentException if idx is negative or is not smaller than the number of model inputs; or if an error occurs when resizing the idx-th input.

public void run (Object input, Object output)

Runs model inference if the model takes only one input, and provides only one output.

Warning: The API is more efficient if a Buffer (preferably direct, but not required) is used as the input/output data type. Please consider using Buffer to feed and fetch primitive data for better performance. The following concrete Buffer types are supported:

ByteBuffer - compatible with any underlying primitive Tensor type.
FloatBuffer - compatible with float Tensors.
IntBuffer - compatible with int32 Tensors.
LongBuffer - compatible with int64 Tensors.

Parameters
input an array or multidimensional array, or a Buffer of primitive types including int, float, long, and byte. Buffer is the preferred way to pass large input data for primitive types, whereas string types require using the (multi-dimensional) array input path. When a Buffer is used, its content should remain unchanged until model inference is done, and the caller must ensure that the Buffer is at the appropriate read position. A null value is allowed only if the caller is using a Delegate that allows buffer handle interop, and such a buffer has been bound to the input Tensor.
output a multidimensional array of output data, or a Buffer of primitive types including int, float, long, and byte. When a Buffer is used, the caller must ensure that it is set to the appropriate write position. A null value is allowed only if the caller is using a Delegate that allows buffer handle interop, and such a buffer has been bound to the output Tensor. See Interpreter.Options#setAllowBufferHandleOutput(boolean).
Throws
IllegalArgumentException if input or output is null or empty, or if an error occurs when running the inference.
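As an illustration, a sketch of a buffer-based call (tensor index 0 and the buffer sizes are assumptions for this example):

ByteBuffer input = ByteBuffer
    .allocateDirect(interpreter.getInputTensor(0).numBytes())
    .order(ByteOrder.nativeOrder());
// ... populate input ...
input.rewind();  // Ensure the buffer is at the appropriate read position.
FloatBuffer output = FloatBuffer.allocate(interpreter.getOutputTensor(0).numElements());
interpreter.run(input, output);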

public void runForMultipleInputsOutputs (Object[] inputs, Map<Integer, Object> outputs)

Runs model inference if the model takes multiple inputs, or returns multiple outputs.

Warning: The API is more efficient if Buffers (preferably direct, but not required) are used as the input/output data types. Please consider using Buffer to feed and fetch primitive data for better performance. The following concrete Buffer types are supported:

ByteBuffer - compatible with any underlying primitive Tensor type.
FloatBuffer - compatible with float Tensors.
IntBuffer - compatible with int32 Tensors.
LongBuffer - compatible with int64 Tensors.

Note: null values for individual elements of inputs and outputs are allowed only if the caller is using a Delegate that allows buffer handle interop, and such a buffer has been bound to the corresponding input or output Tensor(s).

Parameters
inputs an array of input data. The inputs should be in the same order as the inputs of the model. Each input can be an array or multidimensional array, or a Buffer of primitive types including int, float, long, and byte. Buffer is the preferred way to pass large input data, whereas string types require using the (multi-dimensional) array input path. When a Buffer is used, its content should remain unchanged until model inference is done, and the caller must ensure that the Buffer is at the appropriate read position.
outputs a map mapping output indices to multidimensional arrays of output data or Buffers of primitive types including int, float, long, and byte. It only needs to keep entries for the outputs to be used. When a Buffer is used, the caller must ensure that it is set to the appropriate write position.
Throws
IllegalArgumentException if inputs or outputs is null or empty, or if an error occurs when running the inference.

public void setNumThreads (int numThreads)

This method was deprecated.
Prefer using Interpreter.Options#setNumThreads(int) directly for controlling multi-threading. This method will be removed in a future release.

Sets the number of threads to be used for ops that support multi-threading.

public void setUseNNAPI (boolean useNNAPI)

This method was deprecated.
Prefer using Interpreter.Options#setUseNNAPI(boolean) directly for enabling NN API. This method will be removed in a future release.

Turns on/off Android NNAPI for hardware acceleration when it is available.
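
A sketch of the preferred Options-based configuration that replaces both deprecated setters (the thread count and model file variable are hypothetical):

Interpreter.Options options = new Interpreter.Options()
    .setNumThreads(4)      // Replaces the deprecated setNumThreads(int).
    .setUseNNAPI(true);    // Replaces the deprecated setUseNNAPI(boolean).
try (Interpreter interpreter = new Interpreter(modelFile, options)) {
  interpreter.run(input, output);
}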