Post-training float16 quantization

Overview

TensorFlow Lite now supports converting weights to 16-bit floating point values during model conversion from TensorFlow to TensorFlow Lite's flat buffer format. This results in a 2x reduction in model size. Some hardware, like GPUs, can compute natively in this reduced-precision arithmetic, realizing a speedup over traditional floating point execution. The TensorFlow Lite GPU delegate can be configured to run in this way. However, a model converted to float16 weights can still run on the CPU without additional modification: the float16 weights are upsampled to float32 prior to the first inference. This permits a significant reduction in model size in exchange for a minimal impact on latency and accuracy.
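
To make the size and precision trade-off concrete, here is a minimal sketch that uses only Python's standard `struct` module. It does not call the TensorFlow Lite converter; it only illustrates what storing values as float16 instead of float32 does to the raw bytes. The `weights` list is a hypothetical set of values chosen for illustration, and the `"e"` format code (IEEE 754 half precision) assumes Python 3.6 or newer.

```python
import struct

# Hypothetical weight values, chosen only for illustration.
weights = [0.1, -1.5, 3.14159, 0.0004, 42.0]

# Serialize the same values as 32-bit floats ("f") and as
# 16-bit floats ("e", IEEE 754 half precision).
as_float32 = struct.pack("<%df" % len(weights), *weights)
as_float16 = struct.pack("<%de" % len(weights), *weights)

size_ratio = len(as_float32) / len(as_float16)

# Decode the float16 bytes again. Python widens them to its native float,
# similar in spirit to widening stored float16 weights before CPU inference.
restored = struct.unpack("<%de" % len(weights), as_float16)
max_rel_error = max(abs(r - w) / abs(w) for r, w in zip(restored, weights))

print("float32 bytes:", len(as_float32))
print("float16 bytes:", len(as_float16))
print("size ratio:", size_ratio)
print("max relative error:", max_rel_error)
```

The raw weight bytes shrink by exactly half, and each value in this example is recovered to within roughly three significant decimal digits. A real model file also contains data other than weights, so the sketch covers only the weight storage itself.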

In this tutorial, you train an MNIST model from scratch, check its accuracy in TensorFlow, and then convert the saved model into a TensorFlow Lite flatbuffer with float16 quantization. Finally, you check the accuracy of the converted model and compare it to the original saved model. The training script, mnist.py, is available from the TensorFlow official MNIST tutorial.

Build an MNIST model

Setup

! pip uninstall -y tensorflow
! pip install -q -U tf-nightly
WARNING: Skipping tensorflow as it is not installed.
ERROR: tensorflow-gpu 2.0.0b1 has requirement tb-nightly<1.14.0a20190604,>=1.14.0a20190603, but you'll have tb-nightly 1.15.0a20190802 which is incompatible.
import tensorflow as tf
tf.enable_eager_execution()

import numpy as np

tf.logging.set_verbosity(tf.logging.DEBUG)
WARNING: Logging before flag parsing goes to stderr.
W0802 17:52:11.362909 140694297093888 module_wrapper.py:136] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/python/util/module_wrapper.py:163: The name tf.enable_eager_execution is deprecated. Please use tf.compat.v1.enable_eager_execution instead.

W0802 17:52:11.364723 140694297093888 module_wrapper.py:136] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/python/util/module_wrapper.py:163: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.

! git clone --depth 1 https://github.com/tensorflow/models
Cloning into 'models'...
remote: Enumerating objects: 3224, done.
remote: Counting objects: 100% (3224/3224), done.
remote: Compressing objects: 100% (2726/2726), done.
remote: Total 3224 (delta 587), reused 2067 (delta 421), pack-reused 0
Receiving objects: 100% (3224/3224), 370.68 MiB | 43.58 MiB/s, done.
Resolving deltas: 100% (587/587), done.
Checking out files: 100% (3053/3053), done.
tf.lite.constants.FLOAT16
tf.float16
import sys
import os

if sys.version_info.major >= 3:
    import pathlib
else:
    import pathlib2 as pathlib

# Add `models` to the python path.
models_path = os.path.join(os.getcwd(), "models")
sys.path.append(models_path)

Train and export the model

saved_models_root = "/tmp/mnist_saved_model"
# The path added above is not visible to subprocesses, so set PYTHONPATH for the subprocess as well.
!PYTHONPATH={models_path} python models/official/mnist/mnist.py --train_epochs=1 --export_dir {saved_models_root} --data_format=channels_last
WARNING: Logging before flag parsing goes to stderr.
W0802 17:52:29.287535 140699526784768 module_wrapper.py:136] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/python/util/module_wrapper.py:163: The name tf.logging.set_verbosity is deprecated. Please use tf.compat.v1.logging.set_verbosity instead.

W0802 17:52:29.290326 140699526784768 module_wrapper.py:136] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/python/util/module_wrapper.py:163: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

I0802 17:52:29.291357 140699526784768 run_config.py:558] Initializing RunConfig with distribution strategies.
I0802 17:52:29.291565 140699526784768 estimator_training.py:167] Not using Distribute Coordinator.
I0802 17:52:29.292140 140699526784768 estimator.py:209] Using config: {'_eval_distribute': None, '_experimental_max_worker_delay_secs': None, '_save_checkpoints_secs': 600, '_evaluation_master': '', '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7ff6f8f434a8>, '_num_worker_replicas': 1, '_device_fn': None, '_keep_checkpoint_max': 5, '_service': None, '_task_type': 'worker', '_save_summary_steps': 100, '_model_dir': '/tmp/mnist_model', '_distribute_coordinator_mode': None, '_is_chief': True, '_num_ps_replicas': 0, '_train_distribute': <tensorflow.python.distribute.one_device_strategy.OneDeviceStrategyV1 object at 0x7ff6f8f433c8>, '_global_id_in_cluster': 0, '_experimental_distribute': None, '_master': '', '_log_step_count_steps': 100, '_save_checkpoints_steps': None, '_keep_checkpoint_every_n_hours': 10000, '_task_id': 0, '_protocol': None, '_session_config': allow_soft_placement: true
, '_tf_random_seed': None}
W0802 17:52:29.294717 140699526784768 module_wrapper.py:136] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/python/util/module_wrapper.py:163: The name tf.gfile.Exists is deprecated. Please use tf.io.gfile.exists instead.

Downloading https://storage.googleapis.com/cvdf-datasets/mnist/train-images-idx3-ubyte.gz to /tmpfs/tmp/tmp0sb3f18u.gz
Downloading https://storage.googleapis.com/cvdf-datasets/mnist/train-labels-idx1-ubyte.gz to /tmpfs/tmp/tmp6e5_ojxm.gz
W0802 17:52:30.888936 140699526784768 module_wrapper.py:136] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/python/util/module_wrapper.py:163: The name tf.estimator.inputs is deprecated. Please use tf.compat.v1.estimator.inputs instead.

W0802 17:52:31.105814 140699526784768 deprecation.py:506] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1633: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
I0802 17:52:31.109607 140699526784768 estimator.py:1145] Calling model_fn.
W0802 17:52:31.227320 140699526784768 module_wrapper.py:136] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/python/util/module_wrapper.py:163: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.

W0802 17:52:31.263274 140699526784768 module_wrapper.py:136] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/python/util/module_wrapper.py:163: The name tf.losses.sparse_softmax_cross_entropy is deprecated. Please use tf.compat.v1.losses.sparse_softmax_cross_entropy instead.

W0802 17:52:31.273012 140699526784768 deprecation.py:323] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/python/ops/losses/losses_impl.py:121: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W0802 17:52:31.281508 140699526784768 module_wrapper.py:136] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/python/util/module_wrapper.py:163: The name tf.metrics.accuracy is deprecated. Please use tf.compat.v1.metrics.accuracy instead.

W0802 17:52:31.303630 140699526784768 module_wrapper.py:136] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/python/util/module_wrapper.py:163: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.

W0802 17:52:31.605368 140699526784768 deprecation.py:323] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/python/training/optimizer.py:172: BaseResourceVariable.constraint (from tensorflow.python.ops.resource_variable_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Apply a constraint manually following the optimizer update step.
W0802 17:52:31.629992 140699526784768 deprecation.py:323] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_estimator/python/estimator/model_fn.py:337: scalar (from tensorflow.python.framework.tensor_shape) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.TensorShape([]).
I0802 17:52:31.630296 140699526784768 estimator.py:1147] Done calling model_fn.
I0802 17:52:31.663995 140699526784768 basic_session_run_hooks.py:541] Create CheckpointSaverHook.
I0802 17:52:31.821853 140699526784768 monitored_session.py:240] Graph was finalized.
2019-08-02 17:52:31.822320: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-08-02 17:52:31.828186: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200000000 Hz
2019-08-02 17:52:31.828496: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x69467d0 executing computations on platform Host. Devices:
2019-08-02 17:52:31.828528: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
2019-08-02 17:52:31.841254: W tensorflow/core/common_runtime/colocation_graph.cc:960] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
  /job:localhost/replica:0/task:0/device:CPU:0
  /job:localhost/replica:0/task:0/device:XLA_CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU, XLA_CPU] possible_devices_=[]
IteratorToStringHandle: CPU XLA_CPU 
IteratorV2: CPU XLA_CPU 
IteratorGetNext: CPU XLA_CPU 
MakeIterator: CPU XLA_CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  IteratorV2 (IteratorV2) /replica:0/task:0/device:GPU:0
  MakeIterator (MakeIterator) /replica:0/task:0/device:GPU:0
  IteratorToStringHandle (IteratorToStringHandle) /replica:0/task:0/device:GPU:0
  IteratorGetNext (IteratorGetNext) /replica:0/task:0/device:GPU:0

2019-08-02 17:52:31.841383: W tensorflow/core/common_runtime/colocation_graph.cc:960] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
  /job:localhost/replica:0/task:0/device:CPU:0
  /job:localhost/replica:0/task:0/device:XLA_CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU, XLA_CPU] possible_devices_=[]
AssignAddVariableOp: CPU XLA_CPU 
ReadVariableOp: CPU XLA_CPU 
AssignVariableOp: CPU XLA_CPU 
VarIsInitializedOp: CPU XLA_CPU 
Const: CPU XLA_CPU 
VarHandleOp: CPU XLA_CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  global_step/Initializer/zeros (Const) 
  global_step (VarHandleOp) /replica:0/task:0/device:GPU:0
  global_step/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  global_step/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  global_step/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Identity/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/AssignAddVariableOp (AssignAddVariableOp) /replica:0/task:0/device:GPU:0
  Adam/ReadVariableOp_4 (ReadVariableOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables_1/VarIsInitializedOp (VarIsInitializedOp) 
  save/AssignVariableOp_26 (AssignVariableOp) /replica:0/task:0/device:GPU:0

2019-08-02 17:52:31.841531: W tensorflow/core/common_runtime/colocation_graph.cc:960] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
  /job:localhost/replica:0/task:0/device:CPU:0
  /job:localhost/replica:0/task:0/device:XLA_CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU, XLA_CPU] possible_devices_=[]
ResourceApplyAdam: CPU XLA_CPU 
AssignVariableOp: CPU XLA_CPU 
RandomUniform: CPU XLA_CPU 
VarIsInitializedOp: CPU XLA_CPU 
Const: CPU XLA_CPU 
Mul: CPU XLA_CPU 
ReadVariableOp: CPU XLA_CPU 
Sub: CPU XLA_CPU 
VarHandleOp: CPU XLA_CPU 
Add: CPU XLA_CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  conv2d/kernel/Initializer/random_uniform/shape (Const) 
  conv2d/kernel/Initializer/random_uniform/min (Const) 
  conv2d/kernel/Initializer/random_uniform/max (Const) 
  conv2d/kernel/Initializer/random_uniform/RandomUniform (RandomUniform) 
  conv2d/kernel/Initializer/random_uniform/sub (Sub) 
  conv2d/kernel/Initializer/random_uniform/mul (Mul) 
  conv2d/kernel/Initializer/random_uniform (Add) 
  conv2d/kernel (VarHandleOp) /replica:0/task:0/device:GPU:0
  conv2d/kernel/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  conv2d/kernel/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  conv2d/kernel/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  conv2d/Conv2D/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  sequential/conv2d/Conv2D/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  conv2d/kernel/Adam/Initializer/zeros (Const) /replica:0/task:0/device:GPU:0
  conv2d/kernel/Adam (VarHandleOp) /replica:0/task:0/device:GPU:0
  conv2d/kernel/Adam/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  conv2d/kernel/Adam/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  conv2d/kernel/Adam/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  conv2d/kernel/Adam_1/Initializer/zeros (Const) /replica:0/task:0/device:GPU:0
  conv2d/kernel/Adam_1 (VarHandleOp) /replica:0/task:0/device:GPU:0
  conv2d/kernel/Adam_1/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  conv2d/kernel/Adam_1/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  conv2d/kernel/Adam_1/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_conv2d/kernel/ResourceApplyAdam (ResourceApplyAdam) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_1 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_11 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_12 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables_1/VarIsInitializedOp_1 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_11 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_12 (VarIsInitializedOp) 
  save/AssignVariableOp_5 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_6 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_7 (AssignVariableOp) /replica:0/task:0/device:GPU:0

2019-08-02 17:52:31.841766: W tensorflow/core/common_runtime/colocation_graph.cc:960] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
  /job:localhost/replica:0/task:0/device:CPU:0
  /job:localhost/replica:0/task:0/device:XLA_CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU, XLA_CPU] possible_devices_=[]
Mul: CPU XLA_CPU 
VarHandleOp: CPU XLA_CPU 
Const: CPU XLA_CPU 
VarIsInitializedOp: CPU XLA_CPU 
AssignVariableOp: CPU XLA_CPU 
ReadVariableOp: CPU XLA_CPU 
ResourceApplyAdam: CPU XLA_CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  conv2d/bias/Initializer/zeros (Const) 
  conv2d/bias (VarHandleOp) /replica:0/task:0/device:GPU:0
  conv2d/bias/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  conv2d/bias/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  conv2d/bias/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  conv2d/BiasAdd/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  sequential/conv2d/BiasAdd/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  beta1_power/Initializer/initial_value (Const) /replica:0/task:0/device:GPU:0
  beta1_power (VarHandleOp) /replica:0/task:0/device:GPU:0
  beta1_power/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  beta1_power/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  beta1_power/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  beta2_power/Initializer/initial_value (Const) /replica:0/task:0/device:GPU:0
  beta2_power (VarHandleOp) /replica:0/task:0/device:GPU:0
  beta2_power/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  beta2_power/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  beta2_power/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  conv2d/bias/Adam/Initializer/zeros (Const) /replica:0/task:0/device:GPU:0
  conv2d/bias/Adam (VarHandleOp) /replica:0/task:0/device:GPU:0
  conv2d/bias/Adam/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  conv2d/bias/Adam/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  conv2d/bias/Adam/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  conv2d/bias/Adam_1/Initializer/zeros (Const) /replica:0/task:0/device:GPU:0
  conv2d/bias/Adam_1 (VarHandleOp) /replica:0/task:0/device:GPU:0
  conv2d/bias/Adam_1/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  conv2d/bias/Adam_1/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  conv2d/bias/Adam_1/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_conv2d/kernel/ResourceApplyAdam/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_conv2d/kernel/ResourceApplyAdam/ReadVariableOp_1 (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_conv2d/bias/ResourceApplyAdam/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_conv2d/bias/ResourceApplyAdam/ReadVariableOp_1 (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_conv2d/bias/ResourceApplyAdam (ResourceApplyAdam) /replica:0/task:0/device:GPU:0
  Adam/update_conv2d_1/kernel/ResourceApplyAdam/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_conv2d_1/kernel/ResourceApplyAdam/ReadVariableOp_1 (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_conv2d_1/bias/ResourceApplyAdam/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_conv2d_1/bias/ResourceApplyAdam/ReadVariableOp_1 (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_dense/kernel/ResourceApplyAdam/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_dense/kernel/ResourceApplyAdam/ReadVariableOp_1 (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_dense/bias/ResourceApplyAdam/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_dense/bias/ResourceApplyAdam/ReadVariableOp_1 (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_dense_1/kernel/ResourceApplyAdam/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_dense_1/kernel/ResourceApplyAdam/ReadVariableOp_1 (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_dense_1/bias/ResourceApplyAdam/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_dense_1/bias/ResourceApplyAdam/ReadVariableOp_1 (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/mul (Mul) /replica:0/task:0/device:GPU:0
  Adam/AssignVariableOp (AssignVariableOp) /replica:0/task:0/device:GPU:0
  Adam/ReadVariableOp_1 (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/ReadVariableOp_2 (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/mul_1 (Mul) /replica:0/task:0/device:GPU:0
  Adam/AssignVariableOp_1 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  Adam/ReadVariableOp_3 (ReadVariableOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_2 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_9 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_10 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_13 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_14 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables_1/VarIsInitializedOp_2 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_9 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_10 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_13 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_14 (VarIsInitializedOp) 
  save/AssignVariableOp (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_1 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_2 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_3 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_4 (AssignVariableOp) /replica:0/task:0/device:GPU:0

2019-08-02 17:52:31.842025: W tensorflow/core/common_runtime/colocation_graph.cc:960] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
  /job:localhost/replica:0/task:0/device:CPU:0
  /job:localhost/replica:0/task:0/device:XLA_CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU, XLA_CPU] possible_devices_=[]
ResourceApplyAdam: CPU XLA_CPU 
AssignVariableOp: CPU XLA_CPU 
RandomUniform: CPU XLA_CPU 
Fill: CPU XLA_CPU 
VarIsInitializedOp: CPU XLA_CPU 
Const: CPU XLA_CPU 
Mul: CPU XLA_CPU 
ReadVariableOp: CPU XLA_CPU 
Sub: CPU XLA_CPU 
VarHandleOp: CPU XLA_CPU 
Add: CPU XLA_CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  conv2d_1/kernel/Initializer/random_uniform/shape (Const) 
  conv2d_1/kernel/Initializer/random_uniform/min (Const) 
  conv2d_1/kernel/Initializer/random_uniform/max (Const) 
  conv2d_1/kernel/Initializer/random_uniform/RandomUniform (RandomUniform) 
  conv2d_1/kernel/Initializer/random_uniform/sub (Sub) 
  conv2d_1/kernel/Initializer/random_uniform/mul (Mul) 
  conv2d_1/kernel/Initializer/random_uniform (Add) 
  conv2d_1/kernel (VarHandleOp) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  conv2d_1/Conv2D/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  sequential/conv2d_1/Conv2D/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/Adam/Initializer/zeros/shape_as_tensor (Const) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/Adam/Initializer/zeros/Const (Const) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/Adam/Initializer/zeros (Fill) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/Adam (VarHandleOp) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/Adam/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/Adam/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/Adam/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/Adam_1/Initializer/zeros/shape_as_tensor (Const) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/Adam_1/Initializer/zeros/Const (Const) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/Adam_1/Initializer/zeros (Fill) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/Adam_1 (VarHandleOp) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/Adam_1/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/Adam_1/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  conv2d_1/kernel/Adam_1/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_conv2d_1/kernel/ResourceApplyAdam (ResourceApplyAdam) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_3 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_15 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_16 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables_1/VarIsInitializedOp_3 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_15 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_16 (VarIsInitializedOp) 
  save/AssignVariableOp_11 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_12 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_13 (AssignVariableOp) /replica:0/task:0/device:GPU:0

2019-08-02 17:52:31.842198: W tensorflow/core/common_runtime/colocation_graph.cc:960] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
  /job:localhost/replica:0/task:0/device:CPU:0
  /job:localhost/replica:0/task:0/device:XLA_CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU, XLA_CPU] possible_devices_=[]
ResourceApplyAdam: CPU XLA_CPU 
ReadVariableOp: CPU XLA_CPU 
AssignVariableOp: CPU XLA_CPU 
VarIsInitializedOp: CPU XLA_CPU 
Const: CPU XLA_CPU 
VarHandleOp: CPU XLA_CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  conv2d_1/bias/Initializer/zeros (Const) 
  conv2d_1/bias (VarHandleOp) /replica:0/task:0/device:GPU:0
  conv2d_1/bias/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  conv2d_1/bias/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  conv2d_1/bias/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  conv2d_1/BiasAdd/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  sequential/conv2d_1/BiasAdd/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  conv2d_1/bias/Adam/Initializer/zeros (Const) /replica:0/task:0/device:GPU:0
  conv2d_1/bias/Adam (VarHandleOp) /replica:0/task:0/device:GPU:0
  conv2d_1/bias/Adam/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  conv2d_1/bias/Adam/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  conv2d_1/bias/Adam/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  conv2d_1/bias/Adam_1/Initializer/zeros (Const) /replica:0/task:0/device:GPU:0
  conv2d_1/bias/Adam_1 (VarHandleOp) /replica:0/task:0/device:GPU:0
  conv2d_1/bias/Adam_1/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  conv2d_1/bias/Adam_1/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  conv2d_1/bias/Adam_1/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_conv2d_1/bias/ResourceApplyAdam (ResourceApplyAdam) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_4 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_17 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_18 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables_1/VarIsInitializedOp_4 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_17 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_18 (VarIsInitializedOp) 
  save/AssignVariableOp_8 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_9 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_10 (AssignVariableOp) /replica:0/task:0/device:GPU:0

2019-08-02 17:52:31.842406: W tensorflow/core/common_runtime/colocation_graph.cc:960] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
  /job:localhost/replica:0/task:0/device:CPU:0
  /job:localhost/replica:0/task:0/device:XLA_CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU, XLA_CPU] possible_devices_=[]
ResourceApplyAdam: CPU XLA_CPU 
AssignVariableOp: CPU XLA_CPU 
RandomUniform: CPU XLA_CPU 
Fill: CPU XLA_CPU 
VarIsInitializedOp: CPU XLA_CPU 
Const: CPU XLA_CPU 
Mul: CPU XLA_CPU 
ReadVariableOp: CPU XLA_CPU 
Sub: CPU XLA_CPU 
VarHandleOp: CPU XLA_CPU 
Add: CPU XLA_CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  dense/kernel/Initializer/random_uniform/shape (Const) 
  dense/kernel/Initializer/random_uniform/min (Const) 
  dense/kernel/Initializer/random_uniform/max (Const) 
  dense/kernel/Initializer/random_uniform/RandomUniform (RandomUniform) 
  dense/kernel/Initializer/random_uniform/sub (Sub) 
  dense/kernel/Initializer/random_uniform/mul (Mul) 
  dense/kernel/Initializer/random_uniform (Add) 
  dense/kernel (VarHandleOp) /replica:0/task:0/device:GPU:0
  dense/kernel/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  dense/kernel/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  dense/kernel/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  dense/MatMul/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  sequential/dense/MatMul/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  dense/kernel/Adam/Initializer/zeros/shape_as_tensor (Const) /replica:0/task:0/device:GPU:0
  dense/kernel/Adam/Initializer/zeros/Const (Const) /replica:0/task:0/device:GPU:0
  dense/kernel/Adam/Initializer/zeros (Fill) /replica:0/task:0/device:GPU:0
  dense/kernel/Adam (VarHandleOp) /replica:0/task:0/device:GPU:0
  dense/kernel/Adam/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  dense/kernel/Adam/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  dense/kernel/Adam/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  dense/kernel/Adam_1/Initializer/zeros/shape_as_tensor (Const) /replica:0/task:0/device:GPU:0
  dense/kernel/Adam_1/Initializer/zeros/Const (Const) /replica:0/task:0/device:GPU:0
  dense/kernel/Adam_1/Initializer/zeros (Fill) /replica:0/task:0/device:GPU:0
  dense/kernel/Adam_1 (VarHandleOp) /replica:0/task:0/device:GPU:0
  dense/kernel/Adam_1/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  dense/kernel/Adam_1/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  dense/kernel/Adam_1/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_dense/kernel/ResourceApplyAdam (ResourceApplyAdam) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_5 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_19 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_20 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables_1/VarIsInitializedOp_5 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_19 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_20 (VarIsInitializedOp) 
  save/AssignVariableOp_17 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_18 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_19 (AssignVariableOp) /replica:0/task:0/device:GPU:0

2019-08-02 17:52:31.842592: W tensorflow/core/common_runtime/colocation_graph.cc:960] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
  /job:localhost/replica:0/task:0/device:CPU:0
  /job:localhost/replica:0/task:0/device:XLA_CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU, XLA_CPU] possible_devices_=[]
ResourceApplyAdam: CPU XLA_CPU 
VarHandleOp: CPU XLA_CPU 
Fill: CPU XLA_CPU 
Const: CPU XLA_CPU 
VarIsInitializedOp: CPU XLA_CPU 
AssignVariableOp: CPU XLA_CPU 
ReadVariableOp: CPU XLA_CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  dense/bias/Initializer/zeros/shape_as_tensor (Const) 
  dense/bias/Initializer/zeros/Const (Const) 
  dense/bias/Initializer/zeros (Fill) 
  dense/bias (VarHandleOp) /replica:0/task:0/device:GPU:0
  dense/bias/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  dense/bias/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  dense/bias/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  dense/BiasAdd/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  sequential/dense/BiasAdd/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  dense/bias/Adam/Initializer/zeros/shape_as_tensor (Const) /replica:0/task:0/device:GPU:0
  dense/bias/Adam/Initializer/zeros/Const (Const) /replica:0/task:0/device:GPU:0
  dense/bias/Adam/Initializer/zeros (Fill) /replica:0/task:0/device:GPU:0
  dense/bias/Adam (VarHandleOp) /replica:0/task:0/device:GPU:0
  dense/bias/Adam/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  dense/bias/Adam/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  dense/bias/Adam/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  dense/bias/Adam_1/Initializer/zeros/shape_as_tensor (Const) /replica:0/task:0/device:GPU:0
  dense/bias/Adam_1/Initializer/zeros/Const (Const) /replica:0/task:0/device:GPU:0
  dense/bias/Adam_1/Initializer/zeros (Fill) /replica:0/task:0/device:GPU:0
  dense/bias/Adam_1 (VarHandleOp) /replica:0/task:0/device:GPU:0
  dense/bias/Adam_1/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  dense/bias/Adam_1/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  dense/bias/Adam_1/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_dense/bias/ResourceApplyAdam (ResourceApplyAdam) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_6 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_21 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_22 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables_1/VarIsInitializedOp_6 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_21 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_22 (VarIsInitializedOp) 
  save/AssignVariableOp_14 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_15 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_16 (AssignVariableOp) /replica:0/task:0/device:GPU:0

2019-08-02 17:52:31.844526: W tensorflow/core/common_runtime/colocation_graph.cc:960] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
  /job:localhost/replica:0/task:0/device:CPU:0
  /job:localhost/replica:0/task:0/device:XLA_CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU, XLA_CPU] possible_devices_=[]
ResourceApplyAdam: CPU XLA_CPU 
AssignVariableOp: CPU XLA_CPU 
RandomUniform: CPU XLA_CPU 
Fill: CPU XLA_CPU 
VarIsInitializedOp: CPU XLA_CPU 
Const: CPU XLA_CPU 
Mul: CPU XLA_CPU 
ReadVariableOp: CPU XLA_CPU 
Sub: CPU XLA_CPU 
VarHandleOp: CPU XLA_CPU 
Add: CPU XLA_CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  dense_1/kernel/Initializer/random_uniform/shape (Const) 
  dense_1/kernel/Initializer/random_uniform/min (Const) 
  dense_1/kernel/Initializer/random_uniform/max (Const) 
  dense_1/kernel/Initializer/random_uniform/RandomUniform (RandomUniform) 
  dense_1/kernel/Initializer/random_uniform/sub (Sub) 
  dense_1/kernel/Initializer/random_uniform/mul (Mul) 
  dense_1/kernel/Initializer/random_uniform (Add) 
  dense_1/kernel (VarHandleOp) /replica:0/task:0/device:GPU:0
  dense_1/kernel/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  dense_1/kernel/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  dense_1/kernel/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  dense_1/MatMul/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  sequential/dense_1/MatMul/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  dense_1/kernel/Adam/Initializer/zeros/shape_as_tensor (Const) /replica:0/task:0/device:GPU:0
  dense_1/kernel/Adam/Initializer/zeros/Const (Const) /replica:0/task:0/device:GPU:0
  dense_1/kernel/Adam/Initializer/zeros (Fill) /replica:0/task:0/device:GPU:0
  dense_1/kernel/Adam (VarHandleOp) /replica:0/task:0/device:GPU:0
  dense_1/kernel/Adam/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  dense_1/kernel/Adam/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  dense_1/kernel/Adam/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  dense_1/kernel/Adam_1/Initializer/zeros/shape_as_tensor (Const) /replica:0/task:0/device:GPU:0
  dense_1/kernel/Adam_1/Initializer/zeros/Const (Const) /replica:0/task:0/device:GPU:0
  dense_1/kernel/Adam_1/Initializer/zeros (Fill) /replica:0/task:0/device:GPU:0
  dense_1/kernel/Adam_1 (VarHandleOp) /replica:0/task:0/device:GPU:0
  dense_1/kernel/Adam_1/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  dense_1/kernel/Adam_1/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  dense_1/kernel/Adam_1/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_dense_1/kernel/ResourceApplyAdam (ResourceApplyAdam) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_7 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_23 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_24 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables_1/VarIsInitializedOp_7 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_23 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_24 (VarIsInitializedOp) 
  save/AssignVariableOp_23 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_24 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_25 (AssignVariableOp) /replica:0/task:0/device:GPU:0

2019-08-02 17:52:31.844819: W tensorflow/core/common_runtime/colocation_graph.cc:960] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
  /job:localhost/replica:0/task:0/device:CPU:0
  /job:localhost/replica:0/task:0/device:XLA_CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU, XLA_CPU] possible_devices_=[]
ResourceApplyAdam: CPU XLA_CPU 
ReadVariableOp: CPU XLA_CPU 
AssignVariableOp: CPU XLA_CPU 
VarIsInitializedOp: CPU XLA_CPU 
Const: CPU XLA_CPU 
VarHandleOp: CPU XLA_CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  dense_1/bias/Initializer/zeros (Const) 
  dense_1/bias (VarHandleOp) /replica:0/task:0/device:GPU:0
  dense_1/bias/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  dense_1/bias/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  dense_1/bias/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  dense_1/BiasAdd/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  sequential/dense_1/BiasAdd/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  dense_1/bias/Adam/Initializer/zeros (Const) /replica:0/task:0/device:GPU:0
  dense_1/bias/Adam (VarHandleOp) /replica:0/task:0/device:GPU:0
  dense_1/bias/Adam/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  dense_1/bias/Adam/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  dense_1/bias/Adam/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  dense_1/bias/Adam_1/Initializer/zeros (Const) /replica:0/task:0/device:GPU:0
  dense_1/bias/Adam_1 (VarHandleOp) /replica:0/task:0/device:GPU:0
  dense_1/bias/Adam_1/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  dense_1/bias/Adam_1/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  dense_1/bias/Adam_1/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  Adam/update_dense_1/bias/ResourceApplyAdam (ResourceApplyAdam) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_8 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_25 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables/VarIsInitializedOp_26 (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables_1/VarIsInitializedOp_8 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_25 (VarIsInitializedOp) 
  report_uninitialized_variables_1/VarIsInitializedOp_26 (VarIsInitializedOp) 
  save/AssignVariableOp_20 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_21 (AssignVariableOp) /replica:0/task:0/device:GPU:0
  save/AssignVariableOp_22 (AssignVariableOp) /replica:0/task:0/device:GPU:0

2019-08-02 17:52:31.845147: W tensorflow/core/common_runtime/colocation_graph.cc:960] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
  /job:localhost/replica:0/task:0/device:CPU:0
  /job:localhost/replica:0/task:0/device:XLA_CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU, XLA_CPU] possible_devices_=[]
AssignAddVariableOp: CPU XLA_CPU 
ReadVariableOp: CPU XLA_CPU 
AssignVariableOp: CPU XLA_CPU 
VarIsInitializedOp: CPU XLA_CPU 
Const: CPU XLA_CPU 
VarHandleOp: CPU XLA_CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  accuracy/total/Initializer/zeros (Const) 
  accuracy/total (VarHandleOp) /replica:0/task:0/device:GPU:0
  accuracy/total/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  accuracy/total/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  accuracy/total/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  accuracy/AssignAddVariableOp (AssignAddVariableOp) /replica:0/task:0/device:GPU:0
  accuracy/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  accuracy/value/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  accuracy/update_op/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables_1/VarIsInitializedOp_27 (VarIsInitializedOp) 

2019-08-02 17:52:31.845266: W tensorflow/core/common_runtime/colocation_graph.cc:960] Failed to place the graph without changing the devices of some resources. Some of the operations (that had to be colocated with resource generating operations) are not supported on the resources' devices. Current candidate devices are [
  /job:localhost/replica:0/task:0/device:CPU:0
  /job:localhost/replica:0/task:0/device:XLA_CPU:0].
See below for details of this colocation group:
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/replica:0/task:0/device:GPU:0' assigned_device_name_='' resource_device_name_='/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU, XLA_CPU] possible_devices_=[]
AssignAddVariableOp: CPU XLA_CPU 
ReadVariableOp: CPU XLA_CPU 
AssignVariableOp: CPU XLA_CPU 
VarIsInitializedOp: CPU XLA_CPU 
Const: CPU XLA_CPU 
VarHandleOp: CPU XLA_CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  accuracy/count/Initializer/zeros (Const) 
  accuracy/count (VarHandleOp) /replica:0/task:0/device:GPU:0
  accuracy/count/IsInitialized/VarIsInitializedOp (VarIsInitializedOp) /replica:0/task:0/device:GPU:0
  accuracy/count/Assign (AssignVariableOp) /replica:0/task:0/device:GPU:0
  accuracy/count/Read/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  accuracy/AssignAddVariableOp_1 (AssignAddVariableOp) /replica:0/task:0/device:GPU:0
  accuracy/ReadVariableOp_1 (ReadVariableOp) /replica:0/task:0/device:GPU:0
  accuracy/Maximum/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  accuracy/Maximum_1/ReadVariableOp (ReadVariableOp) /replica:0/task:0/device:GPU:0
  report_uninitialized_variables_1/VarIsInitializedOp_28 (VarIsInitializedOp) 

I0802 17:52:32.088811 140699526784768 session_manager.py:500] Running local_init_op.
I0802 17:52:32.104807 140699526784768 session_manager.py:502] Done running local_init_op.
I0802 17:52:32.420798 140699526784768 basic_session_run_hooks.py:606] Saving checkpoints for 0 into /tmp/mnist_model/model.ckpt.
I0802 17:52:37.499570 140699526784768 basic_session_run_hooks.py:262] cross_entropy = 2.304758, learning_rate = 1e-04, train_accuracy = 0.12
I0802 17:52:37.500300 140699526784768 basic_session_run_hooks.py:262] loss = 2.304758, step = 0
I0802 17:52:54.141548 140699526784768 basic_session_run_hooks.py:692] global_step/sec: 6.00874
I0802 17:52:54.142451 140699526784768 basic_session_run_hooks.py:260] cross_entropy = 0.4685873, learning_rate = 1e-04, train_accuracy = 0.475 (16.643 sec)
I0802 17:52:54.142682 140699526784768 basic_session_run_hooks.py:260] loss = 0.4685873, step = 100 (16.642 sec)
I0802 17:53:10.021602 140699526784768 basic_session_run_hooks.py:692] global_step/sec: 6.29723
I0802 17:53:10.022568 140699526784768 basic_session_run_hooks.py:260] cross_entropy = 0.17987648, learning_rate = 1e-04, train_accuracy = 0.63 (15.880 sec)
I0802 17:53:10.022954 140699526784768 basic_session_run_hooks.py:260] loss = 0.17987648, step = 200 (15.880 sec)
I0802 17:53:25.744684 140699526784768 basic_session_run_hooks.py:692] global_step/sec: 6.36007
I0802 17:53:25.745657 140699526784768 basic_session_run_hooks.py:260] cross_entropy = 0.21796136, learning_rate = 1e-04, train_accuracy = 0.705 (15.723 sec)
I0802 17:53:25.745869 140699526784768 basic_session_run_hooks.py:260] loss = 0.21796136, step = 300 (15.723 sec)
I0802 17:53:41.650218 140699526784768 basic_session_run_hooks.py:692] global_step/sec: 6.28713
I0802 17:53:41.651176 140699526784768 basic_session_run_hooks.py:260] cross_entropy = 0.18158635, learning_rate = 1e-04, train_accuracy = 0.754 (15.906 sec)
I0802 17:53:41.651422 140699526784768 basic_session_run_hooks.py:260] loss = 0.18158635, step = 400 (15.906 sec)
I0802 17:53:57.310877 140699526784768 basic_session_run_hooks.py:692] global_step/sec: 6.38543
I0802 17:53:57.311892 140699526784768 basic_session_run_hooks.py:260] cross_entropy = 0.19583084, learning_rate = 1e-04, train_accuracy = 0.785 (15.661 sec)
I0802 17:53:57.312148 140699526784768 basic_session_run_hooks.py:260] loss = 0.19583084, step = 500 (15.661 sec)
I0802 17:54:12.965739 140699526784768 basic_session_run_hooks.py:606] Saving checkpoints for 600 into /tmp/mnist_model/model.ckpt.
I0802 17:54:13.137421 140699526784768 estimator.py:368] Loss for final step: 0.0385497.
Downloading https://storage.googleapis.com/cvdf-datasets/mnist/t10k-images-idx3-ubyte.gz to /tmpfs/tmp/tmpiqu5wfks.gz
Downloading https://storage.googleapis.com/cvdf-datasets/mnist/t10k-labels-idx1-ubyte.gz to /tmpfs/tmp/tmpj6lce3gw.gz
W0802 17:54:13.276775 140699526784768 deprecation.py:323] From models/official/mnist/mnist.py:204: DatasetV1.make_one_shot_iterator (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `for ... in dataset:` to iterate over a dataset. If using `tf.estimator`, return the `Dataset` object directly from your input function. As a last resort, you can use `tf.compat.v1.data.make_one_shot_iterator(dataset)`.
I0802 17:54:13.292047 140699526784768 estimator.py:1145] Calling model_fn.
I0802 17:54:13.457111 140699526784768 estimator.py:1147] Done calling model_fn.
I0802 17:54:13.475410 140699526784768 evaluation.py:255] Starting evaluation at 2019-08-02T17:54:13Z
I0802 17:54:13.550277 140699526784768 monitored_session.py:240] Graph was finalized.
I0802 17:54:13.551791 140699526784768 saver.py:1284] Restoring parameters from /tmp/mnist_model/model.ckpt-600
I0802 17:54:13.603642 140699526784768 session_manager.py:500] Running local_init_op.
I0802 17:54:13.617778 140699526784768 session_manager.py:502] Done running local_init_op.
I0802 17:54:18.592701 140699526784768 evaluation.py:275] Finished evaluation at 2019-08-02-17:54:18
I0802 17:54:18.592995 140699526784768 estimator.py:2039] Saving dict for global step 600: accuracy = 0.9655, global_step = 600, loss = 0.11704015
I0802 17:54:18.644671 140699526784768 estimator.py:2099] Saving 'checkpoint_path' summary for global step 600: /tmp/mnist_model/model.ckpt-600

Evaluation results:
    {'loss': 0.11704015, 'accuracy': 0.9655, 'global_step': 600}

W0802 17:54:18.646510 140699526784768 deprecation.py:323] From models/official/mnist/mnist.py:228: Estimator.export_savedmodel (from tensorflow_estimator.python.estimator.estimator) is deprecated and will be removed in a future version.
Instructions for updating:
This function has been renamed, use `export_saved_model` instead.
I0802 17:54:18.653813 140699526784768 estimator.py:1145] Calling model_fn.
I0802 17:54:18.792591 140699526784768 estimator.py:1147] Done calling model_fn.
W0802 17:54:18.792976 140699526784768 deprecation.py:323] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/python/saved_model/signature_def_utils_impl.py:201: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
I0802 17:54:18.793393 140699526784768 export_utils.py:170] Signatures INCLUDED in export for Classify: None
I0802 17:54:18.793503 140699526784768 export_utils.py:170] Signatures INCLUDED in export for Regress: None
I0802 17:54:18.793580 140699526784768 export_utils.py:170] Signatures INCLUDED in export for Eval: None
I0802 17:54:18.793655 140699526784768 export_utils.py:170] Signatures INCLUDED in export for Predict: ['classify', 'serving_default']
I0802 17:54:18.793719 140699526784768 export_utils.py:170] Signatures INCLUDED in export for Train: None
I0802 17:54:18.817130 140699526784768 saver.py:1284] Restoring parameters from /tmp/mnist_model/model.ckpt-600
I0802 17:54:18.841982 140699526784768 builder_impl.py:662] Assets added to graph.
I0802 17:54:18.842219 140699526784768 builder_impl.py:457] No assets to write.
I0802 17:54:18.908719 140699526784768 builder_impl.py:422] SavedModel written to: /tmp/mnist_saved_model/temp-b'1564793658'/saved_model.pb

In this example, the model is trained for only a single epoch, so it reaches only about 96% accuracy.

Convert to a TensorFlow Lite model

The SavedModel directory is named with a timestamp. Select the most recent one:

saved_model_dir = str(sorted(pathlib.Path(saved_models_root).glob("*"))[-1])
saved_model_dir
'/tmp/mnist_saved_model/1564793658'
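
The lexicographic sort above picks the newest export only while every timestamp has the same number of digits. A minimal sketch of a numeric alternative, assuming each finished export is a subdirectory whose name consists solely of digits (the helper name `latest_export_dir` is hypothetical and not part of TensorFlow):

```python
import pathlib


def latest_export_dir(root):
  """Return the subdirectory of `root` with the largest numeric name."""
  # Keep only directories named like integer timestamps; this skips
  # anything else that may be present, such as temporary export folders.
  candidates = [p for p in pathlib.Path(root).iterdir()
                if p.is_dir() and p.name.isdigit()]
  if not candidates:
    raise FileNotFoundError("no timestamped export found in %s" % root)
  # Compare names as integers, so "1000" sorts after "999".
  return max(candidates, key=lambda p: int(p.name))
```

With the variables used in this tutorial, `str(latest_export_dir(saved_models_root))` should select the same directory as the cell above.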

Using the Python TFLiteConverter, the saved model can be converted into a TensorFlow Lite model.

First load the model using the TFLiteConverter:

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
tflite_model = converter.convert()
W0802 17:54:19.801775 140694297093888 deprecation.py:323] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/lite/python/convert_saved_model.py:60: load (from tensorflow.python.saved_model.loader_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.load or tf.compat.v1.saved_model.load. There will be a new function for importing SavedModels in Tensorflow 2.0.
I0802 17:54:19.852858 140694297093888 saver.py:1284] Restoring parameters from /tmp/mnist_saved_model/1564793658/variables/variables
I0802 17:54:19.891981 140694297093888 convert_saved_model.py:80] The given SavedModel MetaGraphDef contains SignatureDefs with the following keys: {'serving_default', 'classify'}
I0802 17:54:19.893552 140694297093888 convert_saved_model.py:99] input tensors info: 
I0802 17:54:19.894403 140694297093888 convert_saved_model.py:41] Tensor's key in saved_model's tensor_map: image
I0802 17:54:19.895246 140694297093888 convert_saved_model.py:43]  tensor name: Placeholder:0, shape: (-1, 28, 28), type: DT_FLOAT
I0802 17:54:19.895914 140694297093888 convert_saved_model.py:101] output tensors info: 
I0802 17:54:19.896573 140694297093888 convert_saved_model.py:41] Tensor's key in saved_model's tensor_map: probabilities
I0802 17:54:19.897194 140694297093888 convert_saved_model.py:43]  tensor name: Softmax:0, shape: (-1, 10), type: DT_FLOAT
I0802 17:54:19.897789 140694297093888 convert_saved_model.py:41] Tensor's key in saved_model's tensor_map: classes
I0802 17:54:19.898434 140694297093888 convert_saved_model.py:43]  tensor name: ArgMax:0, shape: (-1), type: DT_INT64
I0802 17:54:19.943767 140694297093888 saver.py:1284] Restoring parameters from /tmp/mnist_saved_model/1564793658/variables/variables
W0802 17:54:19.997899 140694297093888 deprecation.py:323] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/lite/python/util.py:249: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
W0802 17:54:19.999256 140694297093888 deprecation.py:323] From /tmpfs/src/tf_docs_env/lib/python3.5/site-packages/tensorflow_core/python/framework/graph_util_impl.py:275: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
I0802 17:54:20.022251 140694297093888 graph_util_impl.py:318] Froze 8 variables.
I0802 17:54:20.053938 140694297093888 graph_util_impl.py:372] Converted 8 variables to const ops.

Write it out to a .tflite file:

tflite_models_dir = pathlib.Path("/tmp/mnist_tflite_models/")
tflite_models_dir.mkdir(exist_ok=True, parents=True)
tflite_model_file = tflite_models_dir/"mnist_model.tflite"
tflite_model_file.write_bytes(tflite_model)
13101276

To instead quantize the model to float16 on export, first set the optimizations flag to use default optimizations. Then specify that float16 is the supported type on the target platform:

tf.logging.set_verbosity(tf.logging.INFO)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.lite.constants.FLOAT16]
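
To build intuition for what float16 weight storage trades away, the following sketch round-trips a few values through IEEE 754 half precision using only the Python standard library. It illustrates the number format, not the converter's implementation, and the weight values are made up for the example:

```python
import struct


def to_float16_and_back(value):
  """Round-trip a Python float through IEEE 754 half precision."""
  packed = struct.pack("<e", value)  # "e" is the 2-byte half-precision format
  return struct.unpack("<e", packed)[0]


# Made-up example weights (an assumption; not taken from the MNIST model).
weights = [0.1, -0.25, 0.333333, 1.5, -0.0078125]

# Storage needed for the same values as float32 ("f") and float16 ("e").
size_fp32 = len(struct.pack("<%df" % len(weights), *weights))
size_fp16 = len(struct.pack("<%de" % len(weights), *weights))

# Largest rounding error introduced by storing the values as float16.
restored = [to_float16_and_back(w) for w in weights]
max_abs_error = max(abs(w - r) for w, r in zip(weights, restored))

print("bytes as float32:", size_fp32)
print("bytes as float16:", size_fp16)
print("max abs rounding error:", max_abs_error)
```

Each value takes 2 bytes instead of 4, and the rounding error for values of this magnitude is small, which is consistent with the size and accuracy results reported in this tutorial.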

Finally, convert the model as usual. Note that, by default, the converted model still uses float inputs and outputs for invocation convenience.

tflite_fp16_model = converter.convert()
tflite_model_fp16_file = tflite_models_dir/"mnist_model_quant_f16.tflite"
tflite_model_fp16_file.write_bytes(tflite_fp16_model)
6552856

Note how the resulting file is approximately half the size of the float32 model (6,552,856 bytes versus 13,101,276 bytes).

!ls -lh {tflite_models_dir}
total 19M
-rw-rw-r-- 1 kbuilder kbuilder 6.3M Aug  2 17:54 mnist_model_quant_f16.tflite
-rw-rw-r-- 1 kbuilder kbuilder  13M Aug  2 17:54 mnist_model.tflite
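
The same comparison as plain arithmetic, using the two byte counts reported by write_bytes above. The ratio lands slightly above one half, presumably because parts of the flatbuffer other than the weights are not reduced (an assumption, not something this tutorial measures):

```python
# Byte counts returned by write_bytes() in the cells above.
float32_model_bytes = 13101276
float16_model_bytes = 6552856

# Fraction of the original size that the float16 model occupies.
ratio = float16_model_bytes / float32_model_bytes
print("float16 model is %.1f%% of the float32 model size" % (100 * ratio))
```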

Run the TensorFlow Lite models

Run the TensorFlow Lite model using the Python TensorFlow Lite Interpreter.

Load the test data

First, let's load the MNIST test data to feed to the model:

_, mnist_test = tf.keras.datasets.mnist.load_data()
images, labels = tf.cast(mnist_test[0], tf.float32)/255.0, mnist_test[1]

mnist_ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(1)
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 0s 0us/step

Load the model into the interpreters

interpreter = tf.lite.Interpreter(model_path=str(tflite_model_file))
interpreter.allocate_tensors()
interpreter_fp16 = tf.lite.Interpreter(model_path=str(tflite_model_fp16_file))
interpreter_fp16.allocate_tensors()

Test the models on one image

# Take the first (image, label) pair from the dataset.
for img, label in mnist_ds:
  break

# Run the float32 model on the image.
interpreter.set_tensor(interpreter.get_input_details()[0]["index"], img)
interpreter.invoke()
predictions = interpreter.get_tensor(
    interpreter.get_output_details()[0]["index"])

import matplotlib.pylab as plt

# Show the image with its true label and the float32 model's prediction.
plt.imshow(img[0])
template = "True:{true}, predicted:{predict}"
_ = plt.title(template.format(true=str(label[0].numpy()),
                              predict=str(predictions[0])))
plt.grid(False)

# Run the float16 model on the same image.
interpreter_fp16.set_tensor(
    interpreter_fp16.get_input_details()[0]["index"], img)
interpreter_fp16.invoke()
predictions = interpreter_fp16.get_tensor(
    interpreter_fp16.get_output_details()[0]["index"])

# Show the image again with the float16 model's prediction.
plt.imshow(img[0])
template = "True:{true}, predicted:{predict}"
_ = plt.title(template.format(true=str(label[0].numpy()),
                              predict=str(predictions[0])))
plt.grid(False)
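
The two cells above repeat the same set_tensor / invoke / get_tensor sequence for each interpreter. As a sketch, that sequence can be wrapped in a small helper; the name `run_single_inference` is hypothetical, and it assumes, as the cells above do, that the first input and first output are the tensors of interest:

```python
def run_single_inference(interpreter, input_data):
  """Feed one batch to a TensorFlow Lite interpreter and return output 0."""
  # Look up the tensor indices of the first input and the first output.
  input_index = interpreter.get_input_details()[0]["index"]
  output_index = interpreter.get_output_details()[0]["index"]
  # Copy the input in, run the model, and read the result back out.
  interpreter.set_tensor(input_index, input_data)
  interpreter.invoke()
  return interpreter.get_tensor(output_index)
```

For example, `run_single_inference(interpreter_fp16, img)` would return the same array as the `predictions` computed above for the float16 model.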


Evaluate the models

def eval_model(interpreter, mnist_ds):
  total_seen = 0
  num_correct = 0

  input_index = interpreter.get_input_details()[0]["index"]
  output_index = interpreter.get_output_details()[0]["index"]
  for img, label in mnist_ds:
    total_seen += 1
    interpreter.set_tensor(input_index, img)
    interpreter.invoke()
    predictions = interpreter.get_tensor(output_index)
    if predictions == label.numpy():
      num_correct += 1

    if total_seen % 500 == 0:
      print("Accuracy after %i images: %f" %
            (total_seen, float(num_correct) / float(total_seen)))

  return float(num_correct) / float(total_seen)

# Create smaller dataset for demonstration purposes
mnist_ds_demo = mnist_ds.take(2000)

print(eval_model(interpreter, mnist_ds_demo))
Accuracy after 500 images: 0.972000
Accuracy after 1000 images: 0.964000
Accuracy after 1500 images: 0.957333
Accuracy after 2000 images: 0.953000
0.953

Repeat the evaluation on the float16 quantized model to obtain:

# NOTE: Colab runs on server CPUs. At the time of writing this, TensorFlow Lite
# doesn't have super optimized server CPU kernels. For this reason this may be
# slower than the above float interpreter. But for mobile CPUs, considerable
# speedup can be observed.
print(eval_model(interpreter_fp16, mnist_ds_demo))
Accuracy after 500 images: 0.972000
Accuracy after 1000 images: 0.964000
Accuracy after 1500 images: 0.957333
Accuracy after 2000 images: 0.953000
0.953

In this example, you quantized a model to float16 with no difference in accuracy: both models reach 95.3% on the 2,000-image demo subset.

It's also possible to evaluate the fp16 quantized model on the GPU. To perform all arithmetic with the reduced precision values, be sure to create the TfLiteGpuDelegateOptions struct in your app and set precision_loss_allowed to 1, like this:

// Prepare GPU delegate.
const TfLiteGpuDelegateOptions options = {
  .metadata = NULL,
  .compile_options = {
    .precision_loss_allowed = 1,  // FP16
    .preferred_gl_object_type = TFLITE_GL_OBJECT_TYPE_FASTEST,
    .dynamic_batch_enabled = 0,   // Not fully functional yet
  },
};

Detailed documentation on the TFLite GPU delegate and how to use it in your application is available in the TensorFlow Lite GPU delegate guide.