TensorFlow Model Analysis

An Example of a Key Component of TensorFlow Extended (TFX)

This example colab notebook illustrates how TensorFlow Model Analysis (TFMA) can be used to investigate and visualize the characteristics of a dataset and the performance of a model. We'll use a model that we trained previously, and now you get to play with the results!

The model we trained was for the Chicago Taxi Example, which uses the Taxi Trips dataset released by the City of Chicago.

Read more about the dataset in Google BigQuery. Explore the full dataset in the BigQuery UI.

The columns in the dataset are:

pickup_community_areafaretrip_start_month
trip_start_hourtrip_start_daytrip_start_timestamp
pickup_latitudepickup_longitudedropoff_latitude
dropoff_longitudetrip_milespickup_census_tract
dropoff_census_tractpayment_typecompany
trip_secondsdropoff_community_areatips

Install Jupyter Extensions

jupyter nbextension enable --py widgetsnbextension
jupyter nbextension install --py --symlink tensorflow_model_analysis
jupyter nbextension enable --py tensorflow_model_analysis

Upgrade Pip

To avoid upgrading Pip in a system when running locally, check to make sure that we're running in Colab. Local systems can of course be upgraded separately.

try:
  import colab
  !pip install --upgrade pip
except:
  pass

Install TensorFlow

pip install tensorflow==2.2.0
Collecting tensorflow==2.2.0
  Using cached tensorflow-2.2.0-cp36-cp36m-manylinux2010_x86_64.whl (516.2 MB)
Requirement already satisfied: opt-einsum>=2.3.2 in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from tensorflow==2.2.0) (3.3.0)
Requirement already satisfied: grpcio>=1.8.6 in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from tensorflow==2.2.0) (1.31.0)
Requirement already satisfied: google-pasta>=0.1.8 in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from tensorflow==2.2.0) (0.2.0)
Requirement already satisfied: h5py<2.11.0,>=2.10.0 in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from tensorflow==2.2.0) (2.10.0)
Requirement already satisfied: keras-preprocessing>=1.1.0 in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from tensorflow==2.2.0) (1.1.2)
Requirement already satisfied: numpy<2.0,>=1.16.0 in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from tensorflow==2.2.0) (1.18.5)
Requirement already satisfied: wrapt>=1.11.1 in /home/kbuilder/.local/lib/python3.6/site-packages (from tensorflow==2.2.0) (1.12.1)
Requirement already satisfied: absl-py>=0.7.0 in /home/kbuilder/.local/lib/python3.6/site-packages (from tensorflow==2.2.0) (0.9.0)
Collecting tensorflow-estimator<2.3.0,>=2.2.0
  Using cached tensorflow_estimator-2.2.0-py2.py3-none-any.whl (454 kB)
Requirement already satisfied: astunparse==1.6.3 in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from tensorflow==2.2.0) (1.6.3)
Requirement already satisfied: termcolor>=1.1.0 in /home/kbuilder/.local/lib/python3.6/site-packages (from tensorflow==2.2.0) (1.1.0)
Requirement already satisfied: wheel>=0.26; python_version >= "3" in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from tensorflow==2.2.0) (0.35.1)
Requirement already satisfied: gast==0.3.3 in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from tensorflow==2.2.0) (0.3.3)
Requirement already satisfied: scipy==1.4.1; python_version >= "3" in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from tensorflow==2.2.0) (1.4.1)
Requirement already satisfied: six>=1.12.0 in /home/kbuilder/.local/lib/python3.6/site-packages (from tensorflow==2.2.0) (1.15.0)
Requirement already satisfied: protobuf>=3.8.0 in /home/kbuilder/.local/lib/python3.6/site-packages (from tensorflow==2.2.0) (3.13.0)
Collecting tensorboard<2.3.0,>=2.2.0
  Using cached tensorboard-2.2.2-py3-none-any.whl (3.0 MB)
Requirement already satisfied: setuptools in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from protobuf>=3.8.0->tensorflow==2.2.0) (49.6.0)
Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (1.7.0)
Requirement already satisfied: google-auth<2,>=1.6.3 in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (1.20.1)
Requirement already satisfied: werkzeug>=0.11.15 in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (1.0.1)
Requirement already satisfied: requests<3,>=2.21.0 in /home/kbuilder/.local/lib/python3.6/site-packages (from tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (2.24.0)
Requirement already satisfied: markdown>=2.6.8 in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (3.2.2)
Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (0.4.1)
Requirement already satisfied: cachetools<5.0,>=2.0.0 in /home/kbuilder/.local/lib/python3.6/site-packages (from google-auth<2,>=1.6.3->tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (4.1.1)
Requirement already satisfied: rsa<5,>=3.1.4; python_version >= "3.5" in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from google-auth<2,>=1.6.3->tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (4.6)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/lib/python3/dist-packages (from google-auth<2,>=1.6.3->tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (0.2.1)
Requirement already satisfied: certifi>=2017.4.17 in /home/kbuilder/.local/lib/python3.6/site-packages (from requests<3,>=2.21.0->tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (2020.6.20)
Requirement already satisfied: idna<3,>=2.5 in /usr/lib/python3/dist-packages (from requests<3,>=2.21.0->tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (2.6)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/lib/python3/dist-packages (from requests<3,>=2.21.0->tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (1.22)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/lib/python3/dist-packages (from requests<3,>=2.21.0->tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (3.0.4)
Requirement already satisfied: importlib-metadata; python_version < "3.8" in /home/kbuilder/.local/lib/python3.6/site-packages (from markdown>=2.6.8->tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (1.7.0)
Requirement already satisfied: requests-oauthlib>=0.7.0 in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (1.3.0)
Requirement already satisfied: pyasn1>=0.1.3 in /usr/lib/python3/dist-packages (from rsa<5,>=3.1.4; python_version >= "3.5"->google-auth<2,>=1.6.3->tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (0.4.2)
Requirement already satisfied: zipp>=0.5 in /home/kbuilder/.local/lib/python3.6/site-packages (from importlib-metadata; python_version < "3.8"->markdown>=2.6.8->tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (3.1.0)
Requirement already satisfied: oauthlib>=3.0.0 in /tmpfs/src/tf_docs_env/lib/python3.6/site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.3.0,>=2.2.0->tensorflow==2.2.0) (3.1.0)
Installing collected packages: tensorflow-estimator, tensorboard, tensorflow
  Attempting uninstall: tensorflow-estimator
    Found existing installation: tensorflow-estimator 2.3.0
    Uninstalling tensorflow-estimator-2.3.0:
      Successfully uninstalled tensorflow-estimator-2.3.0
  Attempting uninstall: tensorboard
    Found existing installation: tensorboard 2.3.0
    Uninstalling tensorboard-2.3.0:
      Successfully uninstalled tensorboard-2.3.0
  Attempting uninstall: tensorflow
    Found existing installation: tensorflow 2.3.0
    Uninstalling tensorflow-2.3.0:
      Successfully uninstalled tensorflow-2.3.0
Successfully installed tensorboard-2.2.2 tensorflow-2.2.0 tensorflow-estimator-2.2.0

Install TensorFlow Model Analysis (TFMA)

This will pull in all the dependencies, and will take a minute. Please ignore the warnings.

import sys

# Confirm that we're using Python 3
assert sys.version_info.major is 3, 'Oops, not running Python 3. Use Runtime > Change runtime type'
import tensorflow as tf
print('TF version: {}'.format(tf.__version__))

print('Installing Apache Beam')
!pip install -Uq apache_beam==2.17.0
import apache_beam as beam
print('Beam version: {}'.format(beam.__version__))

# Install TFMA
# This will pull in all the dependencies, and will take a minute
# Please ignore the warnings
!pip install -q tensorflow-model-analysis==0.21.3

import tensorflow as tf
import tensorflow_model_analysis as tfma
print('TFMA version: {}'.format(tfma.version.VERSION_STRING))
TF version: 2.2.0
Installing Apache Beam
Beam version: 2.17.0
ERROR: After October 2020 you may experience errors when installing or updating packages. This is because pip will change the way that it resolves dependency conflicts.

We recommend you use --use-feature=2020-resolver to test your packages with the new resolver before it becomes the default.

tensorflow-serving-api 2.3.0 requires tensorflow<3,>=2.3, but you'll have tensorflow 2.2.0 which is incompatible.
tfx-bsl 0.22.1 requires apache-beam[gcp]<3,>=2.20, but you'll have apache-beam 2.17.0 which is incompatible.
tfx-bsl 0.22.1 requires pyarrow<0.17,>=0.16.0, but you'll have pyarrow 0.15.1 which is incompatible.
tfx-bsl 0.22.1 requires tensorflow-metadata<0.23,>=0.22.2, but you'll have tensorflow-metadata 0.21.2 which is incompatible.

Error importing tfx_bsl_extension.coders. Some tfx_bsl functionalities are not available
TFMA version: 0.21.3

Load The Files

We'll download a tar file that has everything we need. That includes:

  • Training and evaluation datasets
  • Data schema
  • Training results as EvalSavedModels
# Download the tar file from GCP and extract it
import io, os, tempfile
BASE_DIR = tempfile.mkdtemp()
TFMA_DIR = os.path.join(BASE_DIR, 'eval_saved_models-0.15.0')
DATA_DIR = os.path.join(TFMA_DIR, 'data')
OUTPUT_DIR = os.path.join(TFMA_DIR, 'output')
SCHEMA = os.path.join(TFMA_DIR, 'schema.pbtxt')

!wget https://storage.googleapis.com/artifacts.tfx-oss-public.appspot.com/datasets/eval_saved_models-0.15.0.tar
!tar xf eval_saved_models-0.15.0.tar
!mv eval_saved_models-0.15.0 {BASE_DIR}
!rm eval_saved_models-0.15.0.tar

print("Here's what we downloaded:")
!ls -R {TFMA_DIR}
--2020-08-18 09:11:43--  https://storage.googleapis.com/artifacts.tfx-oss-public.appspot.com/datasets/eval_saved_models-0.15.0.tar
Resolving storage.googleapis.com (storage.googleapis.com)... 74.125.204.128, 74.125.203.128, 108.177.125.128, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|74.125.204.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4311040 (4.1M) [application/x-tar]
Saving to: ‘eval_saved_models-0.15.0.tar’

eval_saved_models-0 100%[===================>]   4.11M  11.2MB/s    in 0.4s    

2020-08-18 09:11:44 (11.2 MB/s) - ‘eval_saved_models-0.15.0.tar’ saved [4311040/4311040]

Here's what we downloaded:
/tmp/tmp70nogsm5/eval_saved_models-0.15.0:
data  run_0  run_1  run_2  schema.pbtxt

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/data:
eval  train

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/data/eval:
data.csv

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/data/train:
data.csv

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_0:
eval_model_dir

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_0/eval_model_dir:
1578507304

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_0/eval_model_dir/1578507304:
assets  saved_model.pb  variables

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_0/eval_model_dir/1578507304/assets:
vocab_compute_and_apply_vocabulary_1_vocabulary
vocab_compute_and_apply_vocabulary_vocabulary

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_0/eval_model_dir/1578507304/variables:
variables.data-00000-of-00001  variables.index

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_1:
eval_model_dir

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_1/eval_model_dir:
1578507304

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_1/eval_model_dir/1578507304:
assets  saved_model.pb  variables

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_1/eval_model_dir/1578507304/assets:
vocab_compute_and_apply_vocabulary_1_vocabulary
vocab_compute_and_apply_vocabulary_vocabulary

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_1/eval_model_dir/1578507304/variables:
variables.data-00000-of-00001  variables.index

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_2:
eval_model_dir

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_2/eval_model_dir:
1578507304

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_2/eval_model_dir/1578507304:
assets  saved_model.pb  variables

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_2/eval_model_dir/1578507304/assets:
vocab_compute_and_apply_vocabulary_1_vocabulary
vocab_compute_and_apply_vocabulary_vocabulary

/tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_2/eval_model_dir/1578507304/variables:
variables.data-00000-of-00001  variables.index

Parse the Schema

Among the things we downloaded was a schema for our data that was created by TensorFlow Data Validation. Let's parse that now so that we can use it with TFMA.

from google.protobuf import text_format
from tensorflow.python.lib.io import file_io
from tensorflow_metadata.proto.v0 import schema_pb2
from tensorflow.core.example import example_pb2

schema = schema_pb2.Schema()
contents = file_io.read_file_to_string(SCHEMA)
schema = text_format.Parse(contents, schema)

Use the Schema to Create TFRecords

We need to give TFMA access to our dataset, so let's create a TFRecords file. We can use our schema to create it, since it gives us the correct type for each feature.

import csv

datafile = os.path.join(DATA_DIR, 'eval', 'data.csv')
reader = csv.DictReader(open(datafile, 'r'))
examples = []
for line in reader:
  example = example_pb2.Example()
  for feature in schema.feature:
    key = feature.name
    if len(line[key]) > 0:
      if feature.type == schema_pb2.FLOAT:
        example.features.feature[key].float_list.value[:] = [float(line[key])]
      elif feature.type == schema_pb2.INT:
        example.features.feature[key].int64_list.value[:] = [int(line[key])]
      elif feature.type == schema_pb2.BYTES:
        example.features.feature[key].bytes_list.value[:] = [line[key].encode('utf8')]
    else:
      if feature.type == schema_pb2.FLOAT:
        example.features.feature[key].float_list.value[:] = []
      elif feature.type == schema_pb2.INT:
        example.features.feature[key].int64_list.value[:] = []
      elif feature.type == schema_pb2.BYTES:
        example.features.feature[key].bytes_list.value[:] = []
  examples.append(example)

TFRecord_file = os.path.join(BASE_DIR, 'train_data.rio')
with tf.io.TFRecordWriter(TFRecord_file) as writer:
  for example in examples:
    writer.write(example.SerializeToString())
  writer.flush()
  writer.close()

!ls {TFRecord_file}
/tmp/tmp70nogsm5/train_data.rio

Run TFMA and Render Metrics

Now we're ready to create a function that we'll use to run TFMA and render metrics. It requires an EvalSavedModel, a list of SliceSpecs, and an index into the SliceSpec list. It will create an EvalResult using tfma.run_model_analysis, and use it to create a SlicingMetricsViewer using tfma.view.render_slicing_metrics, which will render a visualization of our dataset using the slice we created.

def run_and_render(eval_model=None, slice_list=None, slice_idx=0):
  """Runs the model analysis and renders the slicing metrics

  Args:
      eval_model: An instance of tf.saved_model saved with evaluation data
      slice_list: A list of tfma.slicer.SingleSliceSpec giving the slices
      slice_idx: An integer index into slice_list specifying the slice to use

  Returns:
      A SlicingMetricsViewer object if in Jupyter notebook; None if in Colab.
  """
  eval_result = tfma.run_model_analysis(eval_shared_model=eval_model,
                                          data_location=TFRecord_file,
                                          file_format='tfrecords',
                                          slice_spec=slice_list,
                                          output_path='sample_data',
                                          extractors=None)
  return tfma.view.render_slicing_metrics(eval_result, slicing_spec=slice_list[slice_idx])

Slicing and Dicing

We previously trained a model, and now we've loaded the results. Let's take a look at our visualizations, starting with using TFMA to slice along particular features. But first we need to read in the EvalSavedModel from one of our previous training runs.

Plots are interactive:

  • Click and drag to pan
  • Scroll to zoom
  • Right click to reset the view

Simply hover over the desired data point to see more details. Select from four different types of plots using the selections at the bottom.

For example, we'll be setting slicing_column to look at the trip_start_hour feature in our SliceSpec.

# Load the TFMA results for the first training run
# This will take a minute
eval_model_base_dir_0 = os.path.join(TFMA_DIR, 'run_0', 'eval_model_dir')
eval_model_dir_0 = os.path.join(eval_model_base_dir_0, next(os.walk(eval_model_base_dir_0))[1][0])
eval_shared_model_0 = tfma.default_eval_shared_model(eval_saved_model_path=eval_model_dir_0)

# Slice our data by the trip_start_hour feature
slices = [tfma.slicer.SingleSliceSpec(columns=['trip_start_hour'])]

run_and_render(eval_model=eval_shared_model_0, slice_list=slices, slice_idx=0)
WARNING:tensorflow:Tensorflow version (2.2.0) found. Note that TFMA support for TF 2.0 is currently in beta
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_model_analysis/eval_saved_model/load.py:169: load (from tensorflow.python.saved_model.loader_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.load or tf.compat.v1.saved_model.load. There will be a new function for importing SavedModels in Tensorflow 2.0.

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_model_analysis/eval_saved_model/load.py:169: load (from tensorflow.python.saved_model.loader_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.load or tf.compat.v1.saved_model.load. There will be a new function for importing SavedModels in Tensorflow 2.0.

INFO:tensorflow:Restoring parameters from /tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_0/eval_model_dir/1578507304/variables/variables

INFO:tensorflow:Restoring parameters from /tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_0/eval_model_dir/1578507304/variables/variables

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_model_analysis/eval_saved_model/graph_ref.py:189: get_tensor_from_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.get_tensor_from_tensor_info or tf.compat.v1.saved_model.get_tensor_from_tensor_info.

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_model_analysis/eval_saved_model/graph_ref.py:189: get_tensor_from_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.get_tensor_from_tensor_info or tf.compat.v1.saved_model.get_tensor_from_tensor_info.

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py:1666: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py:1666: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
WARNING:root:Couldn't find python-snappy so the implementation of _TFRecordUtil._masked_crc32c is not as fast as it could be.

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_model_analysis/writers/metrics_and_plots_serialization.py:125: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_model_analysis/writers/metrics_and_plots_serialization.py:125: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`

SlicingMetricsViewer(config={'weightedExamplesColumn': 'post_export_metrics/example_count'}, data=[{'slice': '…

Slices Overview

The default visualization is the Slices Overview when the number of slices is small. It shows the values of metrics for each slice. Since we've selected trip_start_hour above, it's showing us metrics like accuracy and AUC for each hour, which allows us to look for issues that are specific to some hours and not others.

In the visualization above:

  • Try sorting the feature column, which is our trip_start_hours feature, by clicking on the column header
  • Try sorting by precision, and notice that the precision for some of the hours with examples is 0, which may indicate a problem

The chart also allows us to select and display different metrics in our slices.

  • Try selecting different metrics from the "Show" menu
  • Try selecting recall in the "Show" menu, and notice that the recall for some of the hours with examples is 0, which may indicate a problem

It is also possible to set a threshold to filter out slices with smaller numbers of examples, or "weights". You can type a minimum number of examples, or use the slider.

Metrics Histogram

This view also supports a Metrics Histogram as an alternative visualization, which is also the default view when the number of slices is large. The results will be divided into buckets and the number of slices / total weights / both can be visualized. Columns can be sorted by clicking on the column header. Slices with small weights can be filtered out by setting the threshold. Further filtering can be applied by dragging the grey band. To reset the range, double click the band. Filtering can also be used to remove outliers in the visualization and the metrics tables. Click the gear icon to switch to a logarithmic scale instead of a linear scale.

  • Try selecting "Metrics Histogram" in the Visualization menu

More Slices

Let's create a whole list of SliceSpecs, which will allow us to select any of the slices in the list. We'll select the trip_start_day slice (days of the week) by setting the slice_idx to 1. Try changing the slice_idx to 0 or 2 and running again to examine different slices.

slices = [tfma.slicer.SingleSliceSpec(columns=['trip_start_hour']),
          tfma.slicer.SingleSliceSpec(columns=['trip_start_day']),
          tfma.slicer.SingleSliceSpec(columns=['trip_start_month'])]
run_and_render(eval_model=eval_shared_model_0, slice_list=slices, slice_idx=1)
WARNING:tensorflow:Tensorflow version (2.2.0) found. Note that TFMA support for TF 2.0 is currently in beta

Warning:tensorflow:Tensorflow version (2.2.0) found. Note that TFMA support for TF 2.0 is currently in beta
WARNING:root:Deleting 1 existing files in target path matching: 
WARNING:root:Deleting 1 existing files in target path matching: 
WARNING:root:Deleting 1 existing files in target path matching: 

SlicingMetricsViewer(config={'weightedExamplesColumn': 'post_export_metrics/example_count'}, data=[{'slice': '…

You can create feature crosses to analyze combinations of features. Let's create a SliceSpec to look at a cross of trip_start_day and trip_start_hour:

slices = [tfma.slicer.SingleSliceSpec(columns=['trip_start_day', 'trip_start_hour'])]
run_and_render(eval_shared_model_0, slices, 0)
WARNING:tensorflow:Tensorflow version (2.2.0) found. Note that TFMA support for TF 2.0 is currently in beta

Warning:tensorflow:Tensorflow version (2.2.0) found. Note that TFMA support for TF 2.0 is currently in beta
WARNING:root:Deleting 1 existing files in target path matching: 
WARNING:root:Deleting 1 existing files in target path matching: 
WARNING:root:Deleting 1 existing files in target path matching: 

SlicingMetricsViewer(config={'weightedExamplesColumn': 'post_export_metrics/example_count'}, data=[{'slice': '…

Crossing the two columns creates a lot of combinations! Let's narrow down our cross to only look at trips that start at noon. Then let's select accuracy from the visualization:

slices = [tfma.slicer.SingleSliceSpec(columns=['trip_start_day'], features=[('trip_start_hour', 12)])]
run_and_render(eval_shared_model_0, slices, 0)
WARNING:tensorflow:Tensorflow version (2.2.0) found. Note that TFMA support for TF 2.0 is currently in beta

Warning:tensorflow:Tensorflow version (2.2.0) found. Note that TFMA support for TF 2.0 is currently in beta
WARNING:root:Deleting 1 existing files in target path matching: 
WARNING:root:Deleting 1 existing files in target path matching: 
WARNING:root:Deleting 1 existing files in target path matching: 

SlicingMetricsViewer(config={'weightedExamplesColumn': 'post_export_metrics/example_count'}, data=[{'slice': '…

Tracking Model Performance Over Time

Your training dataset will be used for training your model, and will hopefully be representative of your test dataset and the data that will be sent to your model in production. However, while the data in inference requests may remain the same as your training data, in many cases it will start to change enough so that the performance of your model will change.

That means that you need to monitor and measure your model's performance on an ongoing basis, so that you can be aware of and react to changes. Let's take a look at how TFMA can help.

Measure Performance For New Data

We downloaded the results of three different training runs above, so let's load them now and use TFMA to see how they compare using render_time_series. We can specify particular slices to look at. Let's compare our training runs for trips that started at noon.

  • Select a metric from the dropdown to add the time series graph for that metric
  • Close unwanted graphs
  • Hover over data points (the ends of line segments in the graph) to get more details
def get_eval_result(base_dir, output_dir, data_loc, slice_spec):
  eval_model_dir = os.path.join(base_dir, next(os.walk(base_dir))[1][0])
  eval_shared_model = tfma.default_eval_shared_model(eval_saved_model_path=eval_model_dir)

  return tfma.run_model_analysis(eval_shared_model=eval_shared_model,
                                          data_location=data_loc,
                                          file_format='tfrecords',
                                          slice_spec=slice_spec,
                                          output_path=output_dir,
                                          extractors=None)

slices = [tfma.slicer.SingleSliceSpec()]
output_dir_0 = os.path.join(TFMA_DIR, 'output', 'run_0')
result_ts0 = get_eval_result(os.path.join(TFMA_DIR, 'run_0', 'eval_model_dir'),
                             output_dir_0, TFRecord_file, slices)
output_dir_1 = os.path.join(TFMA_DIR, 'output', 'run_1')
result_ts1 = get_eval_result(os.path.join(TFMA_DIR, 'run_1', 'eval_model_dir'),
                             output_dir_1, TFRecord_file, slices)
output_dir_2 = os.path.join(TFMA_DIR, 'output', 'run_2')
result_ts2 = get_eval_result(os.path.join(TFMA_DIR, 'run_2', 'eval_model_dir'),
                             output_dir_2, TFRecord_file, slices)
WARNING:tensorflow:Tensorflow version (2.2.0) found. Note that TFMA support for TF 2.0 is currently in beta

Warning:tensorflow:Tensorflow version (2.2.0) found. Note that TFMA support for TF 2.0 is currently in beta

INFO:tensorflow:Restoring parameters from /tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_0/eval_model_dir/1578507304/variables/variables

INFO:tensorflow:Restoring parameters from /tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_0/eval_model_dir/1578507304/variables/variables

Warning:tensorflow:Tensorflow version (2.2.0) found. Note that TFMA support for TF 2.0 is currently in beta

Warning:tensorflow:Tensorflow version (2.2.0) found. Note that TFMA support for TF 2.0 is currently in beta

INFO:tensorflow:Restoring parameters from /tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_1/eval_model_dir/1578507304/variables/variables

INFO:tensorflow:Restoring parameters from /tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_1/eval_model_dir/1578507304/variables/variables

Warning:tensorflow:Tensorflow version (2.2.0) found. Note that TFMA support for TF 2.0 is currently in beta

Warning:tensorflow:Tensorflow version (2.2.0) found. Note that TFMA support for TF 2.0 is currently in beta

INFO:tensorflow:Restoring parameters from /tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_2/eval_model_dir/1578507304/variables/variables

INFO:tensorflow:Restoring parameters from /tmp/tmp70nogsm5/eval_saved_models-0.15.0/run_2/eval_model_dir/1578507304/variables/variables

How does it look today?

First, we'll imagine that we've trained and deployed our model yesterday, and now we want to see how it's doing on the new data coming in today. The visualization will start by displaying accuracy. Add AUC and average loss by using the "Add metric series" menu.

eval_results_from_disk = tfma.load_eval_results([output_dir_0, output_dir_1],
                                                tfma.constants.MODEL_CENTRIC_MODE)

tfma.view.render_time_series(eval_results_from_disk, slices[0])
TimeSeriesViewer(config={'isModelCentric': True}, data=[{'metrics': {'': {'': {'average_loss': {'doubleValue':…

Now we'll imagine that another day has passed and we want to see how it's doing on the new data coming in today, compared to the previous two days. Again add AUC and average loss by using the "Add metric series" menu:

eval_results_from_disk = tfma.load_eval_results([output_dir_0, output_dir_1, output_dir_2],
                                                tfma.constants.MODEL_CENTRIC_MODE)

tfma.view.render_time_series(eval_results_from_disk, slices[0])
TimeSeriesViewer(config={'isModelCentric': True}, data=[{'metrics': {'': {'': {'recall': {'doubleValue': 0.150…