Introduction to Fairness Indicators

Overview

Fairness Indicators is a suite of tools built on top of TensorFlow Model Analysis (TFMA) that enables regular evaluation of fairness metrics in product pipelines. TFMA is a library for evaluating both TensorFlow and non-TensorFlow machine learning models. It allows you to evaluate your models on large amounts of data in a distributed manner, compute in-graph and other metrics over different slices of data, and visualize them in notebooks.

Fairness Indicators is packaged with TensorFlow Data Validation (TFDV) and the What-If Tool. Using Fairness Indicators allows you to:

  • Evaluate model performance, sliced across defined groups of users
  • Gain confidence about results with confidence intervals and evaluations at multiple thresholds
  • Evaluate the distribution of datasets
  • Dive deep into individual slices to explore root causes and opportunities for improvement

In this notebook, you will use Fairness Indicators to fix fairness issues in a model you train using the Civil Comments dataset. Watch this video for more details and context on the real-world scenario this is based on, which is also one of the primary motivations for creating Fairness Indicators.

Dataset

In this notebook, you will work with the Civil Comments dataset: approximately 2 million public comments released by the Civil Comments platform in 2017 for ongoing research. This effort was sponsored by Jigsaw, which has hosted competitions on Kaggle to help classify toxic comments and to minimize unintended model bias.

Each individual text comment in the dataset has a toxicity label, with the label being 1 if the comment is toxic and 0 if the comment is non-toxic. Within the data, a subset of comments are labeled with a variety of identity attributes, including categories for gender, sexual orientation, religion, and race or ethnicity.

Setup

Install fairness-indicators and witwidget.

pip install -q -U pip==20.2

pip install -q fairness-indicators
pip install -q witwidget

You must restart the Colab runtime after installing. Select Runtime > Restart runtime from the Colab menu.

Do not proceed with the rest of this tutorial without first restarting the runtime.

Import all other required libraries.

import os
import tempfile
import apache_beam as beam
import numpy as np
import pandas as pd
from datetime import datetime
import pprint

import tensorflow_hub as hub
import tensorflow as tf
import tensorflow_model_analysis as tfma
import tensorflow_data_validation as tfdv

from tensorflow_model_analysis.addons.fairness.post_export_metrics import fairness_indicators
from tensorflow_model_analysis.addons.fairness.view import widget_view

from fairness_indicators.tutorial_utils import util

from witwidget.notebook.visualization import WitConfigBuilder
from witwidget.notebook.visualization import WitWidget

Download and analyze the data

By default, this notebook downloads a preprocessed version of this dataset, but you may use the original dataset and re-run the processing steps if desired. In the original dataset, each comment is labeled with the percentage of raters who believed that the comment corresponds to a particular identity. For example, a comment might be labeled with the following: { male: 0.3, female: 1.0, transgender: 0.0, heterosexual: 0.8, homosexual_gay_or_lesbian: 1.0 }. The processing step groups identities by category (gender, sexual_orientation, etc.) and removes identities with a score less than 0.5. So the example above would be converted to the following: { gender: [female], sexual_orientation: [heterosexual, homosexual_gay_or_lesbian] }
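The grouping step described above can be sketched in plain Python. This is only an illustration of the rule (group by category, keep scores of at least 0.5), not the code used by the tutorial's preprocessing utility. The `IDENTITY_CATEGORIES` map and the `group_identities` helper are hypothetical names introduced here, and the map covers only the terms from the example.

```python
from typing import Dict, List

# Hypothetical, partial category map for illustration only; the real
# preprocessing utility defines its own list of identity columns.
IDENTITY_CATEGORIES: Dict[str, List[str]] = {
    'gender': ['male', 'female', 'transgender'],
    'sexual_orientation': ['heterosexual', 'homosexual_gay_or_lesbian'],
}


def group_identities(scores: Dict[str, float],
                     threshold: float = 0.5) -> Dict[str, List[str]]:
  """Groups identity terms by category, dropping scores below threshold."""
  grouped = {}
  for category, terms in IDENTITY_CATEGORIES.items():
    kept = [term for term in terms if scores.get(term, 0.0) >= threshold]
    if kept:  # Categories with no remaining terms are omitted.
      grouped[category] = kept
  return grouped


raw = {'male': 0.3, 'female': 1.0, 'transgender': 0.0,
       'heterosexual': 0.8, 'homosexual_gay_or_lesbian': 1.0}
print(group_identities(raw))
```

For the example comment, `male` (0.3) and `transgender` (0.0) fall below the threshold and are dropped, so only `female` remains under `gender`.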

download_original_data = False

if download_original_data:
  train_tf_file = tf.keras.utils.get_file('train_tf.tfrecord',
                                          'https://storage.googleapis.com/civil_comments_dataset/train_tf.tfrecord')
  validate_tf_file = tf.keras.utils.get_file('validate_tf.tfrecord',
                                             'https://storage.googleapis.com/civil_comments_dataset/validate_tf.tfrecord')

  # The identity terms will be grouped by category (see 'IDENTITY_COLUMNS')
  # using a threshold of 0.5. Only the identity term columns, text column,
  # and label column are kept after processing.
  train_tf_file = util.convert_comments_data(train_tf_file)
  validate_tf_file = util.convert_comments_data(validate_tf_file)

else:
  train_tf_file = tf.keras.utils.get_file('train_tf_processed.tfrecord',
                                          'https://storage.googleapis.com/civil_comments_dataset/train_tf_processed.tfrecord')
  validate_tf_file = tf.keras.utils.get_file('validate_tf_processed.tfrecord',
                                             'https://storage.googleapis.com/civil_comments_dataset/validate_tf_processed.tfrecord')
Downloading data from https://storage.googleapis.com/civil_comments_dataset/train_tf_processed.tfrecord
488161280/488153424 [==============================] - 6s 0us/step
Downloading data from https://storage.googleapis.com/civil_comments_dataset/validate_tf_processed.tfrecord
324943872/324941336 [==============================] - 5s 0us/step

Use TFDV to analyze the data and find potential problems in it, such as missing values and data imbalances, that can lead to fairness disparities.

stats = tfdv.generate_statistics_from_tfrecord(data_location=train_tf_file)
tfdv.visualize_statistics(stats)
WARNING:apache_beam.runners.interactive.interactive_environment:Dependencies required for Interactive Beam PCollection visualization are not available, please use: `pip install apache-beam[interactive]` to install necessary dependencies to enable all data visualization features.
WARNING:apache_beam.io.tfrecordio:Couldn't find python-snappy so the implementation of _TFRecordUtil._masked_crc32c is not as fast as it could be.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_data_validation/utils/stats_util.py:247: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`

TFDV shows that there are some significant imbalances in the data which could lead to biased model outcomes.

  • The toxicity label (the value predicted by the model) is unbalanced. Only 8% of the examples in the training set are toxic, which means that a classifier could get 92% accuracy by predicting that all comments are non-toxic.

  • In the fields relating to identity terms, only 6.6k out of the 1.08 million (0.61%) training examples deal with homosexuality, and those related to bisexuality are even more rare. This indicates that performance on these slices may suffer due to lack of training data.
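The two figures quoted above follow from simple arithmetic, sketched here with hypothetical helper names (`majority_baseline_accuracy`, `slice_share`). The inputs are the rounded numbers from the text, not values read from the TFDV statistics.

```python
def majority_baseline_accuracy(positive_fraction: float) -> float:
  """Accuracy of a classifier that always predicts the more common class."""
  return max(positive_fraction, 1.0 - positive_fraction)


def slice_share(slice_count: float, total_count: float) -> float:
  """Fraction of all examples that belong to a slice."""
  return slice_count / total_count


# About 8% of training examples are toxic.
print(f'{majority_baseline_accuracy(0.08):.2f}')  # 0.92
# About 6.6k of 1.08 million examples relate to homosexuality.
print(f'{slice_share(6_600, 1_080_000):.2%}')     # 0.61%
```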

Prepare the data

Define a feature map to parse the data. Each example has a label, the comment text, and the identity features (sexual orientation, gender, religion, race, and disability) associated with the text.

BASE_DIR = tempfile.gettempdir()

TEXT_FEATURE = 'comment_text'
LABEL = 'toxicity'
FEATURE_MAP = {
    # Label:
    LABEL: tf.io.FixedLenFeature([], tf.float32),
    # Text:
    TEXT_FEATURE: tf.io.FixedLenFeature([], tf.string),

    # Identities:
    'sexual_orientation': tf.io.VarLenFeature(tf.string),
    'gender': tf.io.VarLenFeature(tf.string),
    'religion': tf.io.VarLenFeature(tf.string),
    'race': tf.io.VarLenFeature(tf.string),
    'disability': tf.io.VarLenFeature(tf.string),
}

Next, set up an input function to feed data into the model. Add a weight column to each example and upweight the toxic examples to account for the class imbalance identified by TFDV. The identity features are used only during the evaluation phase; only the comment text is fed into the model during training.

def train_input_fn():
  def parse_function(serialized):
    parsed_example = tf.io.parse_single_example(
        serialized=serialized, features=FEATURE_MAP)
    # Adds a weight column to deal with unbalanced classes.
    parsed_example['weight'] = tf.add(parsed_example[LABEL], 0.1)
    return (parsed_example,
            parsed_example[LABEL])
  train_dataset = tf.data.TFRecordDataset(
      filenames=[train_tf_file]).map(parse_function).batch(512)
  return train_dataset
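To see what the `tf.add(parsed_example[LABEL], 0.1)` weighting does, here is a small plain-Python sketch. It assumes labels are exactly 0 or 1 and uses the rounded 8% toxic share from the TFDV analysis. The helper names are hypothetical.

```python
def example_weight(label: float) -> float:
  """Mirrors the weighting rule in the input function: weight = label + 0.1."""
  return label + 0.1


def weighted_class_mass(toxic_fraction: float):
  """Returns the total weight carried by toxic and non-toxic examples."""
  toxic_mass = toxic_fraction * example_weight(1.0)
  non_toxic_mass = (1.0 - toxic_fraction) * example_weight(0.0)
  return toxic_mass, non_toxic_mass


toxic, non_toxic = weighted_class_mass(0.08)
print(f'toxic: {toxic:.3f}, non-toxic: {non_toxic:.3f}')
```

Under these assumptions a toxic example weighs 1.1 and a non-toxic example 0.1, so the two classes contribute roughly equal total weight (about 0.088 versus 0.092) even though toxic examples are far less common.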

Train the model

Create and train a deep learning model on the data.

model_dir = os.path.join(BASE_DIR, 'train', datetime.now().strftime(
    "%Y%m%d-%H%M%S"))

embedded_text_feature_column = hub.text_embedding_column(
    key=TEXT_FEATURE,
    module_spec='https://tfhub.dev/google/nnlm-en-dim128/1')

classifier = tf.estimator.DNNClassifier(
    hidden_units=[500, 100],
    weight_column='weight',
    feature_columns=[embedded_text_feature_column],
    optimizer=tf.keras.optimizers.Adagrad(learning_rate=0.003),
    loss_reduction=tf.losses.Reduction.SUM,
    n_classes=2,
    model_dir=model_dir)

classifier.train(input_fn=train_input_fn, steps=1000)
INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/tmp/train/20210130-101939', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/head/base_head.py:517: NumericColumn._get_dense_tensor (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column.py:2192: NumericColumn._transform_feature (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/keras/optimizer_v2/adagrad.py:83: calling Constant.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmp/train/20210130-101939/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 59.10357, step = 0
INFO:tensorflow:global_step/sec: 21.4277
INFO:tensorflow:loss = 55.88044, step = 100 (4.669 sec)
INFO:tensorflow:global_step/sec: 21.9772
INFO:tensorflow:loss = 47.06628, step = 200 (4.550 sec)
INFO:tensorflow:global_step/sec: 21.5857
INFO:tensorflow:loss = 55.466003, step = 300 (4.633 sec)
INFO:tensorflow:global_step/sec: 20.7493
INFO:tensorflow:loss = 56.189507, step = 400 (4.820 sec)
INFO:tensorflow:global_step/sec: 21.7353
INFO:tensorflow:loss = 41.800144, step = 500 (4.600 sec)
INFO:tensorflow:global_step/sec: 22.551
INFO:tensorflow:loss = 45.347168, step = 600 (4.434 sec)
INFO:tensorflow:global_step/sec: 22.292
INFO:tensorflow:loss = 51.336346, step = 700 (4.486 sec)
INFO:tensorflow:global_step/sec: 21.9374
INFO:tensorflow:loss = 47.501987, step = 800 (4.559 sec)
INFO:tensorflow:global_step/sec: 22.0593
INFO:tensorflow:loss = 48.003296, step = 900 (4.533 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 1000...
INFO:tensorflow:Saving checkpoints for 1000 into /tmp/train/20210130-101939/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 1000...
INFO:tensorflow:Loss for final step: 50.98375.
<tensorflow_estimator.python.estimator.canned.dnn.DNNClassifierV2 at 0x7f76540f3a20>

Analyze the model

After obtaining the trained model, analyze it to compute fairness metrics using TFMA and Fairness Indicators. Begin by exporting the model as a SavedModel.

Export SavedModel

def eval_input_receiver_fn():
  serialized_tf_example = tf.compat.v1.placeholder(
      dtype=tf.string, shape=[None], name='input_example_placeholder')

  # This *must* be a dictionary containing a single key 'examples', which
  # points to the input placeholder.
  receiver_tensors = {'examples': serialized_tf_example}

  features = tf.io.parse_example(serialized_tf_example, FEATURE_MAP)
  features['weight'] = tf.ones_like(features[LABEL])

  return tfma.export.EvalInputReceiver(
    features=features,
    receiver_tensors=receiver_tensors,
    labels=features[LABEL])

tfma_export_dir = tfma.export.export_eval_savedmodel(
  estimator=classifier,
  export_dir_base=os.path.join(BASE_DIR, 'tfma_eval_model'),
  eval_input_receiver_fn=eval_input_receiver_fn)
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_model_analysis/eval_saved_model/encoding.py:141: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Signatures INCLUDED in export for Classify: None
INFO:tensorflow:Signatures INCLUDED in export for Regress: None
INFO:tensorflow:Signatures INCLUDED in export for Predict: None
INFO:tensorflow:Signatures INCLUDED in export for Train: None
INFO:tensorflow:Signatures INCLUDED in export for Eval: ['eval']
WARNING:tensorflow:Export includes no default signature!
INFO:tensorflow:Restoring parameters from /tmp/train/20210130-101939/model.ckpt-1000
INFO:tensorflow:Assets added to graph.
INFO:tensorflow:Assets written to: /tmp/tfma_eval_model/temp-1612002036/assets
INFO:tensorflow:SavedModel written to: /tmp/tfma_eval_model/temp-1612002036/saved_model.pb

Compute Fairness Metrics

Select the identity to compute metrics for and whether to run with confidence intervals using the dropdown in the panel on the right.

Fairness Indicators Computation Options

Slice selection: sexual_orientation
Compute confidence intervals: False
WARNING:apache_beam.typehints.typehints:Ignoring send_type hint: <class 'NoneType'>
WARNING:apache_beam.typehints.typehints:Ignoring return_type hint: <class 'NoneType'>
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_model_analysis/eval_saved_model/load.py:169: load (from tensorflow.python.saved_model.loader_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.load or tf.compat.v1.saved_model.load. There will be a new function for importing SavedModels in Tensorflow 2.0.
INFO:tensorflow:Restoring parameters from /tmp/tfma_eval_model/1612002036/variables/variables
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_model_analysis/eval_saved_model/graph_ref.py:189: get_tensor_from_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.get_tensor_from_tensor_info or tf.compat.v1.saved_model.get_tensor_from_tensor_info.
WARNING:apache_beam.io.filebasedsink:Deleting 1 existing files in target path matching:
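
To make the sliced metrics concrete, here is a minimal, library-free sketch of a per-slice false positive rate and false negative rate at a single threshold. It is not how TFMA or Fairness Indicators computes them. The `rates_by_slice` helper, the record layout, and the made-up toy examples are assumptions for illustration, as is the choice to treat a score strictly above the threshold as a toxic prediction.

```python
from collections import defaultdict


def rates_by_slice(examples, slice_key, threshold):
  """Returns {slice_value: {rate_name: value}} at one decision threshold.

  Each example is a dict with 'label' (0 or 1), 'score' (predicted
  probability of toxicity), and `slice_key` (a list of identity terms).
  Every example also counts toward the 'Overall' slice.
  """
  counts = defaultdict(lambda: {'tp': 0, 'fp': 0, 'tn': 0, 'fn': 0})
  for example in examples:
    predicted_toxic = example['score'] > threshold  # Assumed convention.
    actually_toxic = example['label'] == 1
    if predicted_toxic:
      cell = 'tp' if actually_toxic else 'fp'
    else:
      cell = 'fn' if actually_toxic else 'tn'
    for slice_value in ['Overall'] + list(example.get(slice_key, [])):
      counts[slice_value][cell] += 1

  def ratio(numerator, denominator):
    # A rate is undefined (None) when the slice has no relevant examples.
    return numerator / denominator if denominator else None

  return {
      slice_value: {
          'false_positive_rate': ratio(c['fp'], c['fp'] + c['tn']),
          'false_negative_rate': ratio(c['fn'], c['fn'] + c['tp']),
      }
      for slice_value, c in counts.items()
  }


# Made-up toy examples, not drawn from the Civil Comments dataset.
toy_examples = [
    {'label': 0, 'score': 0.8,
     'sexual_orientation': ['homosexual_gay_or_lesbian']},
    {'label': 0, 'score': 0.2,
     'sexual_orientation': ['homosexual_gay_or_lesbian']},
    {'label': 1, 'score': 0.9, 'sexual_orientation': ['heterosexual']},
    {'label': 0, 'score': 0.1, 'sexual_orientation': []},
    {'label': 1, 'score': 0.3, 'sexual_orientation': []},
]
toy_rates = rates_by_slice(toy_examples, 'sexual_orientation', threshold=0.5)
for slice_value, rates in toy_rates.items():
  print(slice_value, rates)
```

Repeating the computation at several thresholds yields one value per metric and threshold for each slice, which is the kind of breakdown the Fairness Indicators metrics report.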

Visualize data using the What-If Tool

In this section, you'll use the What-If Tool's interactive visual interface to explore and manipulate data at a micro-level.

Each point on the scatter plot on the right-hand panel represents one of the examples in the subset loaded into the tool. Click on one of the points to see details about this particular example in the left-hand panel. The comment text, ground truth toxicity, and applicable identities are shown. At the bottom of this left-hand panel, you see the inference results from the model you just trained.

Modify the text of the example and then click the Run inference button to see how your changes affect the predicted toxicity.

DEFAULT_MAX_EXAMPLES = 1000

# Load 100000 examples in memory. When first rendered, 
# What-If Tool should only display 1000 of these due to browser constraints.
def wit_dataset(file, num_examples=100000):
  dataset = tf.data.TFRecordDataset(
      filenames=[file]).take(num_examples)
  return [tf.train.Example.FromString(d.numpy()) for d in dataset]

wit_data = wit_dataset(train_tf_file)
config_builder = WitConfigBuilder(wit_data[:DEFAULT_MAX_EXAMPLES]).set_estimator_and_feature_spec(
    classifier, FEATURE_MAP).set_label_vocab(['non-toxicity', LABEL]).set_target_feature(LABEL)
wit = WitWidget(config_builder)

Render Fairness Indicators

Render the Fairness Indicators widget with the exported evaluation results.

Below you will see bar charts displaying performance of each slice of the data on selected metrics. You can adjust the baseline comparison slice as well as the displayed threshold(s) using the dropdown menus at the top of the visualization.

The Fairness Indicators widget is integrated with the What-If Tool rendered above. If you select one slice of the data in the bar chart, the What-If Tool will update to show you examples from the selected slice. When the data reloads in the What-If Tool above, try modifying Color By to toxicity. This can give you a visual understanding of the toxicity balance of examples by slice.

event_handlers={'slice-selected':
                wit.create_selection_callback(wit_data, DEFAULT_MAX_EXAMPLES)}
widget_view.render_fairness_indicator(eval_result=eval_result,
                                      slicing_column=slice_selection,
                                      event_handlers=event_handlers
                                      )
FairnessIndicatorViewer(slicingMetrics=[{'sliceValue': 'Overall', 'slice': 'Overall', 'metrics': {'auc_precisi…

With this particular dataset and task, systematically higher false positive and false negative rates for certain identities can lead to negative consequences. For example, in a content moderation system, a higher-than-overall false positive rate for a certain group can lead to those voices being silenced. Thus, it is important to regularly evaluate these types of criteria as you develop and improve models, and utilize tools such as Fairness Indicators, TFDV, and WIT to help illuminate potential problems. Once you've identified fairness issues, you can experiment with new data sources, data balancing, or other techniques to improve performance on underperforming groups.

See here for more information and guidance on how to use Fairness Indicators.

Use fairness evaluation results

The eval_result object, rendered above in render_fairness_indicator(), has its own API that you can leverage to read TFMA results into your programs.

Get evaluated slices and metrics

Use get_slice_names() and get_metric_names() to get the evaluated slices and metrics, respectively.

pp = pprint.PrettyPrinter()

print("Slices:")
pp.pprint(eval_result.get_slice_names())
print("\nMetrics:")
pp.pprint(eval_result.get_metric_names())
Slices:
[(),
 (('sexual_orientation', 'homosexual_gay_or_lesbian'),),
 (('sexual_orientation', 'heterosexual'),),
 (('sexual_orientation', 'bisexual'),),
 (('sexual_orientation', 'other_sexual_orientation'),)]

Metrics:
['post_export_metrics/true_positive_rate@0.90',
 'post_export_metrics/false_omission_rate@0.50',
 'post_export_metrics/negative_rate@0.70',
 'post_export_metrics/negative_rate@0.90',
 'post_export_metrics/false_positive_rate@0.70',
 'post_export_metrics/false_omission_rate@0.70',
 'post_export_metrics/false_discovery_rate@0.70',
 'accuracy',
 'prediction/mean',
 'post_export_metrics/false_omission_rate@0.30',
 'post_export_metrics/example_count',
 'post_export_metrics/false_negative_rate@0.30',
 'post_export_metrics/fairness/confusion_matrix_at_thresholds',
 'average_loss',
 'post_export_metrics/false_positive_rate@0.30',
 'post_export_metrics/false_negative_rate@0.10',
 'label/mean',
 'post_export_metrics/positive_rate@0.70',
 'auc',
 'post_export_metrics/true_negative_rate@0.50',
 'precision',
 'post_export_metrics/false_positive_rate@0.90',
 'post_export_metrics/positive_rate@0.10',
 'post_export_metrics/false_omission_rate@0.90',
 'post_export_metrics/false_discovery_rate@0.10',
 'post_export_metrics/false_negative_rate@0.70',
 'post_export_metrics/positive_rate@0.50',
 'post_export_metrics/false_discovery_rate@0.30',
 'post_export_metrics/positive_rate@0.30',
 'post_export_metrics/negative_rate@0.30',
 'post_export_metrics/negative_rate@0.50',
 'post_export_metrics/false_discovery_rate@0.90',
 'post_export_metrics/true_negative_rate@0.10',
 'auc_precision_recall',
 'post_export_metrics/true_positive_rate@0.50',
 'accuracy_baseline',
 'post_export_metrics/false_positive_rate@0.10',
 'post_export_metrics/false_negative_rate@0.50',
 'post_export_metrics/false_omission_rate@0.10',
 'post_export_metrics/false_discovery_rate@0.50',
 'post_export_metrics/false_negative_rate@0.90',
 'post_export_metrics/true_positive_rate@0.70',
 'post_export_metrics/true_positive_rate@0.10',
 'post_export_metrics/true_negative_rate@0.30',
 'post_export_metrics/true_positive_rate@0.30',
 'post_export_metrics/false_positive_rate@0.50',
 'post_export_metrics/true_negative_rate@0.90',
 'post_export_metrics/positive_rate@0.90',
 'post_export_metrics/true_negative_rate@0.70',
 'post_export_metrics/negative_rate@0.10',
 'recall']
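
The metric names above come back in no particular order, and many of them encode a decision threshold after an `@`. As a minimal sketch, the hypothetical `thresholds_by_metric` helper below groups such names by metric and sorts the thresholds. It works on plain strings shaped like the ones printed above and does not call any TFMA API.

```python
from collections import defaultdict


def thresholds_by_metric(metric_names):
  """Maps each thresholded metric to its sorted list of thresholds.

  Names without an '@' suffix (for example 'accuracy') are skipped.
  """
  grouped = defaultdict(list)
  for name in metric_names:
    if '@' not in name:
      continue
    metric_path, threshold = name.rsplit('@', 1)
    # Drop the 'post_export_metrics/' style prefix, keep the metric name.
    grouped[metric_path.split('/')[-1]].append(float(threshold))
  return {metric: sorted(values) for metric, values in sorted(grouped.items())}


# A few names copied from the printed list above.
sample_names = [
    'post_export_metrics/false_positive_rate@0.70',
    'accuracy',
    'post_export_metrics/false_positive_rate@0.30',
    'post_export_metrics/negative_rate@0.90',
]
print(thresholds_by_metric(sample_names))
```

In the notebook, the same helper could be applied to the list returned by `eval_result.get_metric_names()`, assuming it returns strings in the format shown.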

Use get_metrics_for_slice() to get the metrics for a particular slice as a dictionary mapping metric names to metric values.

baseline_slice = ()
heterosexual_slice = (('sexual_orientation', 'heterosexual'),)

print("Baseline metric values:")
pp.pprint(eval_result.get_metrics_for_slice(baseline_slice))
print("\nHeterosexual metric values:")
pp.pprint(eval_result.get_metrics_for_slice(heterosexual_slice))
Baseline metric values:
{'accuracy': {'doubleValue': 0.7191107273101807},
 'accuracy_baseline': {'doubleValue': 0.9198060631752014},
 'auc': {'doubleValue': 0.7972093224525452},
 'auc_precision_recall': {'doubleValue': 0.3029831349849701},
 'average_loss': {'doubleValue': 0.5623680949211121},
 'label/mean': {'doubleValue': 0.08019392192363739},
 'post_export_metrics/example_count': {'doubleValue': 721950.0},
 'post_export_metrics/fairness/confusion_matrix_at_thresholds': {'confusionMatrixAtThresholds': {'matrices': [{'boundedFalseNegatives': {'value': 340.0},
                                                                                                               'boundedFalsePositives': {'value': 612773.0},
                                                                                                               'boundedPrecision': {'value': 0.08586231619119644},
                                                                                                               'boundedRecall': {'value': 0.9941273927688599},
                                                                                                               'boundedTrueNegatives': {'value': 51281.0},
                                                                                                               'boundedTruePositives': {'value': 57556.0},
                                                                                                               'falseNegatives': 340.0,
                                                                                                               'falsePositives': 612773.0,
                                                                                                               'precision': 0.08586231619119644,
                                                                                                               'recall': 0.9941273927688599,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 340.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 612773.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.08586231619119644},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.9941273927688599},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 51281.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 57556.0},
                                                                                                               'threshold': 0.10000000149011612,
                                                                                                               'trueNegatives': 51281.0,
                                                                                                               'truePositives': 57556.0},
                                                                                                              {'boundedFalseNegatives': {'value': 4971.0},
                                                                                                               'boundedFalsePositives': {'value': 385684.0},
                                                                                                               'boundedPrecision': {'value': 0.12066555768251419},
                                                                                                               'boundedRecall': {'value': 0.9141391515731812},
                                                                                                               'boundedTrueNegatives': {'value': 278370.0},
                                                                                                               'boundedTruePositives': {'value': 52925.0},
                                                                                                               'falseNegatives': 4971.0,
                                                                                                               'falsePositives': 385684.0,
                                                                                                               'precision': 0.12066555768251419,
                                                                                                               'recall': 0.9141391515731812,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 4971.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 385684.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.12066555768251419},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.9141391515731812},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 278370.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 52925.0},
                                                                                                               'threshold': 0.30000001192092896,
                                                                                                               'trueNegatives': 278370.0,
                                                                                                               'truePositives': 52925.0},
                                                                                                              {'boundedFalseNegatives': {'value': 15684.0},
                                                                                                               'boundedFalsePositives': {'value': 187104.0},
                                                                                                               'boundedPrecision': {'value': 0.18407785892486572},
                                                                                                               'boundedRecall': {'value': 0.7291004657745361},
                                                                                                               'boundedTrueNegatives': {'value': 476950.0},
                                                                                                               'boundedTruePositives': {'value': 42212.0},
                                                                                                               'falseNegatives': 15684.0,
                                                                                                               'falsePositives': 187104.0,
                                                                                                               'precision': 0.18407785892486572,
                                                                                                               'recall': 0.7291004657745361,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 15684.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 187104.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.18407785892486572},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.7291004657745361},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 476950.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 42212.0},
                                                                                                               'threshold': 0.5,
                                                                                                               'trueNegatives': 476950.0,
                                                                                                               'truePositives': 42212.0},
                                                                                                              {'boundedFalseNegatives': {'value': 31392.0},
                                                                                                               'boundedFalsePositives': {'value': 64497.0},
                                                                                                               'boundedPrecision': {'value': 0.291249543428421},
                                                                                                               'boundedRecall': {'value': 0.4577863812446594},
                                                                                                               'boundedTrueNegatives': {'value': 599557.0},
                                                                                                               'boundedTruePositives': {'value': 26504.0},
                                                                                                               'falseNegatives': 31392.0,
                                                                                                               'falsePositives': 64497.0,
                                                                                                               'precision': 0.291249543428421,
                                                                                                               'recall': 0.4577863812446594,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 31392.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 64497.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.291249543428421},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.4577863812446594},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 599557.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 26504.0},
                                                                                                               'threshold': 0.699999988079071,
                                                                                                               'trueNegatives': 599557.0,
                                                                                                               'truePositives': 26504.0},
                                                                                                              {'boundedFalseNegatives': {'value': 51089.0},
                                                                                                               'boundedFalsePositives': {'value': 6354.0},
                                                                                                               'boundedPrecision': {'value': 0.5172099471092224},
                                                                                                               'boundedRecall': {'value': 0.11757288873195648},
                                                                                                               'boundedTrueNegatives': {'value': 657700.0},
                                                                                                               'boundedTruePositives': {'value': 6807.0},
                                                                                                               'falseNegatives': 51089.0,
                                                                                                               'falsePositives': 6354.0,
                                                                                                               'precision': 0.5172099471092224,
                                                                                                               'recall': 0.11757288873195648,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 51089.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 6354.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.5172099471092224},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.11757288873195648},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 657700.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 6807.0},
                                                                                                               'threshold': 0.8999999761581421,
                                                                                                               'trueNegatives': 657700.0,
                                                                                                               'truePositives': 6807.0}]}},
 'post_export_metrics/false_discovery_rate@0.10': {'doubleValue': 0.9141376614570618},
 'post_export_metrics/false_discovery_rate@0.30': {'doubleValue': 0.8793344497680664},
 'post_export_metrics/false_discovery_rate@0.50': {'doubleValue': 0.8159221410751343},
 'post_export_metrics/false_discovery_rate@0.70': {'doubleValue': 0.7087504267692566},
 'post_export_metrics/false_discovery_rate@0.90': {'doubleValue': 0.4827900528907776},
 'post_export_metrics/false_negative_rate@0.10': {'doubleValue': 0.0058725993148982525},
 'post_export_metrics/false_negative_rate@0.30': {'doubleValue': 0.08586085587739944},
 'post_export_metrics/false_negative_rate@0.50': {'doubleValue': 0.27089953422546387},
 'post_export_metrics/false_negative_rate@0.70': {'doubleValue': 0.5422136187553406},
 'post_export_metrics/false_negative_rate@0.90': {'doubleValue': 0.8824270963668823},
 'post_export_metrics/false_omission_rate@0.10': {'doubleValue': 0.006586466915905476},
 'post_export_metrics/false_omission_rate@0.30': {'doubleValue': 0.017544230446219444},
 'post_export_metrics/false_omission_rate@0.50': {'doubleValue': 0.03183702379465103},
 'post_export_metrics/false_omission_rate@0.70': {'doubleValue': 0.04975362494587898},
 'post_export_metrics/false_omission_rate@0.90': {'doubleValue': 0.07207927852869034},
 'post_export_metrics/false_positive_rate@0.10': {'doubleValue': 0.9227758646011353},
 'post_export_metrics/false_positive_rate@0.30': {'doubleValue': 0.5808021426200867},
 'post_export_metrics/false_positive_rate@0.50': {'doubleValue': 0.28176021575927734},
 'post_export_metrics/false_positive_rate@0.70': {'doubleValue': 0.09712613373994827},
 'post_export_metrics/false_positive_rate@0.90': {'doubleValue': 0.009568499401211739},
 'post_export_metrics/negative_rate@0.10': {'doubleValue': 0.07150217890739441},
 'post_export_metrics/negative_rate@0.30': {'doubleValue': 0.3924662470817566},
 'post_export_metrics/negative_rate@0.50': {'doubleValue': 0.6823658347129822},
 'post_export_metrics/negative_rate@0.70': {'doubleValue': 0.8739510774612427},
 'post_export_metrics/negative_rate@0.90': {'doubleValue': 0.9817702174186707},
 'post_export_metrics/positive_rate@0.10': {'doubleValue': 0.9284977912902832},
 'post_export_metrics/positive_rate@0.30': {'doubleValue': 0.6075337529182434},
 'post_export_metrics/positive_rate@0.50': {'doubleValue': 0.3176341950893402},
 'post_export_metrics/positive_rate@0.70': {'doubleValue': 0.12604889273643494},
 'post_export_metrics/positive_rate@0.90': {'doubleValue': 0.01822979375720024},
 'post_export_metrics/true_negative_rate@0.10': {'doubleValue': 0.07722414284944534},
 'post_export_metrics/true_negative_rate@0.30': {'doubleValue': 0.41919782757759094},
 'post_export_metrics/true_negative_rate@0.50': {'doubleValue': 0.7182397842407227},
 'post_export_metrics/true_negative_rate@0.70': {'doubleValue': 0.9028738737106323},
 'post_export_metrics/true_negative_rate@0.90': {'doubleValue': 0.9904314875602722},
 'post_export_metrics/true_positive_rate@0.10': {'doubleValue': 0.9941273927688599},
 'post_export_metrics/true_positive_rate@0.30': {'doubleValue': 0.9141391515731812},
 'post_export_metrics/true_positive_rate@0.50': {'doubleValue': 0.7291004657745361},
 'post_export_metrics/true_positive_rate@0.70': {'doubleValue': 0.4577863812446594},
 'post_export_metrics/true_positive_rate@0.90': {'doubleValue': 0.11757288873195648},
 'precision': {'doubleValue': 0.18407785892486572},
 'prediction/mean': {'doubleValue': 0.40011298656463623},
 'recall': {'doubleValue': 0.7291004657745361}}
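
The per-threshold rates in the baseline output can be cross-checked against the raw confusion-matrix counts. The following is a minimal sketch and not part of TFMA: the helper name rates_from_counts is hypothetical, the formulas are the standard confusion-matrix definitions (assumed here to be what the post_export_metrics entries report), and the counts are copied by hand from the baseline matrix at threshold 0.5.

```python
import math


def rates_from_counts(tp, fp, tn, fn):
    """Derive standard rates from confusion-matrix counts (hypothetical helper)."""

    def ratio(numerator, denominator):
        # Return NaN instead of raising when a denominator is zero.
        return numerator / denominator if denominator else math.nan

    total = tp + fp + tn + fn
    return {
        'true_positive_rate': ratio(tp, tp + fn),
        'false_negative_rate': ratio(fn, tp + fn),
        'true_negative_rate': ratio(tn, tn + fp),
        'false_positive_rate': ratio(fp, tn + fp),
        'false_discovery_rate': ratio(fp, tp + fp),
        'false_omission_rate': ratio(fn, tn + fn),
        'positive_rate': ratio(tp + fp, total),
        'negative_rate': ratio(tn + fn, total),
    }


# Counts copied from the baseline confusion matrix at threshold 0.5.
baseline_at_050 = rates_from_counts(
    tp=42212.0, fp=187104.0, tn=476950.0, fn=15684.0)

for name, value in sorted(baseline_at_050.items()):
    print(f'{name}@0.50: {value:.4f}')
```

The printed values should agree with the corresponding post_export_metrics/...@0.50 entries to within float32 rounding, and the four counts sum to the slice's example_count of 721950. The same helper can be applied to the counts of any other slice or threshold to compare groups by hand.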

Heterosexual metric values:
{'accuracy': {'doubleValue': 0.5325203537940979},
 'accuracy_baseline': {'doubleValue': 0.7601625919342041},
 'auc': {'doubleValue': 0.6657300591468811},
 'auc_precision_recall': {'doubleValue': 0.40228524804115295},
 'average_loss': {'doubleValue': 0.8246824741363525},
 'label/mean': {'doubleValue': 0.2398373931646347},
 'post_export_metrics/example_count': {'doubleValue': 492.0},
 'post_export_metrics/fairness/confusion_matrix_at_thresholds': {'confusionMatrixAtThresholds': {'matrices': [{'boundedFalseNegatives': {'value': 0.0},
                                                                                                               'boundedFalsePositives': {'value': 362.0},
                                                                                                               'boundedPrecision': {'value': 0.24583333730697632},
                                                                                                               'boundedRecall': {'value': 1.0},
                                                                                                               'boundedTrueNegatives': {'value': 12.0},
                                                                                                               'boundedTruePositives': {'value': 118.0},
                                                                                                               'falsePositives': 362.0,
                                                                                                               'precision': 0.24583333730697632,
                                                                                                               'recall': 1.0,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 0.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 362.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.24583333730697632},
                                                                                                               'tDistributionRecall': {'unsampledValue': 1.0},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 12.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 118.0},
                                                                                                               'threshold': 0.10000000149011612,
                                                                                                               'trueNegatives': 12.0,
                                                                                                               'truePositives': 118.0},
                                                                                                              {'boundedFalseNegatives': {'value': 10.0},
                                                                                                               'boundedFalsePositives': {'value': 289.0},
                                                                                                               'boundedPrecision': {'value': 0.27204030752182007},
                                                                                                               'boundedRecall': {'value': 0.9152542352676392},
                                                                                                               'boundedTrueNegatives': {'value': 85.0},
                                                                                                               'boundedTruePositives': {'value': 108.0},
                                                                                                               'falseNegatives': 10.0,
                                                                                                               'falsePositives': 289.0,
                                                                                                               'precision': 0.27204030752182007,
                                                                                                               'recall': 0.9152542352676392,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 10.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 289.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.27204030752182007},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.9152542352676392},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 85.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 108.0},
                                                                                                               'threshold': 0.30000001192092896,
                                                                                                               'trueNegatives': 85.0,
                                                                                                               'truePositives': 108.0},
                                                                                                              {'boundedFalseNegatives': {'value': 31.0},
                                                                                                               'boundedFalsePositives': {'value': 199.0},
                                                                                                               'boundedPrecision': {'value': 0.3041957914829254},
                                                                                                               'boundedRecall': {'value': 0.7372881174087524},
                                                                                                               'boundedTrueNegatives': {'value': 175.0},
                                                                                                               'boundedTruePositives': {'value': 87.0},
                                                                                                               'falseNegatives': 31.0,
                                                                                                               'falsePositives': 199.0,
                                                                                                               'precision': 0.3041957914829254,
                                                                                                               'recall': 0.7372881174087524,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 31.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 199.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.3041957914829254},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.7372881174087524},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 175.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 87.0},
                                                                                                               'threshold': 0.5,
                                                                                                               'trueNegatives': 175.0,
                                                                                                               'truePositives': 87.0},
                                                                                                              {'boundedFalseNegatives': {'value': 58.0},
                                                                                                               'boundedFalsePositives': {'value': 111.0},
                                                                                                               'boundedPrecision': {'value': 0.35087719559669495},
                                                                                                               'boundedRecall': {'value': 0.508474588394165},
                                                                                                               'boundedTrueNegatives': {'value': 263.0},
                                                                                                               'boundedTruePositives': {'value': 60.0},
                                                                                                               'falseNegatives': 58.0,
                                                                                                               'falsePositives': 111.0,
                                                                                                               'precision': 0.35087719559669495,
                                                                                                               'recall': 0.508474588394165,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 58.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 111.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.35087719559669495},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.508474588394165},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 263.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 60.0},
                                                                                                               'threshold': 0.699999988079071,
                                                                                                               'trueNegatives': 263.0,
                                                                                                               'truePositives': 60.0},
                                                                                                              {'boundedFalseNegatives': {'value': 97.0},
                                                                                                               'boundedFalsePositives': {'value': 16.0},
                                                                                                               'boundedPrecision': {'value': 0.5675675868988037},
                                                                                                               'boundedRecall': {'value': 0.17796610295772552},
                                                                                                               'boundedTrueNegatives': {'value': 358.0},
                                                                                                               'boundedTruePositives': {'value': 21.0},
                                                                                                               'falseNegatives': 97.0,
                                                                                                               'falsePositives': 16.0,
                                                                                                               'precision': 0.5675675868988037,
                                                                                                               'recall': 0.17796610295772552,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 97.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 16.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.5675675868988037},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.17796610295772552},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 358.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 21.0},
                                                                                                               'threshold': 0.8999999761581421,
                                                                                                               'trueNegatives': 358.0,
                                                                                                               'truePositives': 21.0}]} },
 'post_export_metrics/false_discovery_rate@0.10': {'doubleValue': 0.7541666626930237},
 'post_export_metrics/false_discovery_rate@0.30': {'doubleValue': 0.7279596924781799},
 'post_export_metrics/false_discovery_rate@0.50': {'doubleValue': 0.6958041787147522},
 'post_export_metrics/false_discovery_rate@0.70': {'doubleValue': 0.6491228342056274},
 'post_export_metrics/false_discovery_rate@0.90': {'doubleValue': 0.4324324429035187},
 'post_export_metrics/false_negative_rate@0.10': {'doubleValue': 0.0},
 'post_export_metrics/false_negative_rate@0.30': {'doubleValue': 0.08474576473236084},
 'post_export_metrics/false_negative_rate@0.50': {'doubleValue': 0.26271185278892517},
 'post_export_metrics/false_negative_rate@0.70': {'doubleValue': 0.49152541160583496},
 'post_export_metrics/false_negative_rate@0.90': {'doubleValue': 0.8220338821411133},
 'post_export_metrics/false_omission_rate@0.10': {'doubleValue': 0.0},
 'post_export_metrics/false_omission_rate@0.30': {'doubleValue': 0.10526315867900848},
 'post_export_metrics/false_omission_rate@0.50': {'doubleValue': 0.15048544108867645},
 'post_export_metrics/false_omission_rate@0.70': {'doubleValue': 0.180685356259346},
 'post_export_metrics/false_omission_rate@0.90': {'doubleValue': 0.2131868153810501},
 'post_export_metrics/false_positive_rate@0.10': {'doubleValue': 0.9679144620895386},
 'post_export_metrics/false_positive_rate@0.30': {'doubleValue': 0.7727272510528564},
 'post_export_metrics/false_positive_rate@0.50': {'doubleValue': 0.5320855379104614},
 'post_export_metrics/false_positive_rate@0.70': {'doubleValue': 0.2967914342880249},
 'post_export_metrics/false_positive_rate@0.90': {'doubleValue': 0.04278074949979782},
 'post_export_metrics/negative_rate@0.10': {'doubleValue': 0.024390242993831635},
 'post_export_metrics/negative_rate@0.30': {'doubleValue': 0.19308942556381226},
 'post_export_metrics/negative_rate@0.50': {'doubleValue': 0.4186991751194},
 'post_export_metrics/negative_rate@0.70': {'doubleValue': 0.6524389982223511},
 'post_export_metrics/negative_rate@0.90': {'doubleValue': 0.9247967600822449},
 'post_export_metrics/positive_rate@0.10': {'doubleValue': 0.9756097793579102},
 'post_export_metrics/positive_rate@0.30': {'doubleValue': 0.8069105744361877},
 'post_export_metrics/positive_rate@0.50': {'doubleValue': 0.5813007950782776},
 'post_export_metrics/positive_rate@0.70': {'doubleValue': 0.34756097197532654},
 'post_export_metrics/positive_rate@0.90': {'doubleValue': 0.07520325481891632},
 'post_export_metrics/true_negative_rate@0.10': {'doubleValue': 0.03208556026220322},
 'post_export_metrics/true_negative_rate@0.30': {'doubleValue': 0.22727273404598236},
 'post_export_metrics/true_negative_rate@0.50': {'doubleValue': 0.4679144322872162},
 'post_export_metrics/true_negative_rate@0.70': {'doubleValue': 0.7032085657119751},
 'post_export_metrics/true_negative_rate@0.90': {'doubleValue': 0.9572192430496216},
 'post_export_metrics/true_positive_rate@0.10': {'doubleValue': 1.0},
 'post_export_metrics/true_positive_rate@0.30': {'doubleValue': 0.9152542352676392},
 'post_export_metrics/true_positive_rate@0.50': {'doubleValue': 0.7372881174087524},
 'post_export_metrics/true_positive_rate@0.70': {'doubleValue': 0.508474588394165},
 'post_export_metrics/true_positive_rate@0.90': {'doubleValue': 0.17796610295772552},
 'precision': {'doubleValue': 0.3041957914829254},
 'prediction/mean': {'doubleValue': 0.5550199747085571},
 'recall': {'doubleValue': 0.7372881174087524} }
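
The rate metrics in the output above follow from the confusion-matrix counts at the same threshold. As a minimal sketch, the snippet below recomputes the threshold-0.9 rates from the four counts copied from that output. The helper name rates_from_counts is hypothetical and not part of TFMA, and the formulas are the standard confusion-matrix definitions, not TFMA's own implementation.

```python
# Counts copied from the threshold-0.9 confusion matrix printed above.
TP, FP, TN, FN = 21.0, 16.0, 358.0, 97.0


def rates_from_counts(tp, fp, tn, fn):
    """Recompute rate metrics from raw confusion-matrix counts.

    Hypothetical helper for illustration; assumes every denominator is non-zero.
    """
    total = tp + fp + tn + fn
    return {
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "true_positive_rate": tp / (tp + fn),
        "true_negative_rate": tn / (tn + fp),
        "false_discovery_rate": fp / (fp + tp),
        "false_omission_rate": fn / (fn + tn),
        "positive_rate": (tp + fp) / total,  # share of examples predicted positive
        "negative_rate": (tn + fn) / total,  # share of examples predicted negative
    }


rates_at_090 = rates_from_counts(TP, FP, TN, FN)
for name, value in sorted(rates_at_090.items()):
    print(f"{name}@0.90 = {value:.6f}")
```

The recomputed values agree with the corresponding post_export_metrics/...@0.90 entries up to float32 rounding, which is a quick way to check how each rate is defined.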

Use get_metrics_for_all_slices() to get the metrics for every slice at once. It returns a dictionary that maps each slice key to the same metrics dictionary that get_metrics_for_slice() returns for that slice.
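
Because the result is a plain dictionary keyed by slice, one metric can be compared across slices with a short loop. The sketch below is illustrative only. It uses a trimmed stand-in dictionary with two slices and two metrics, with values copied from this notebook's printed output, in place of the real eval_result.get_metrics_for_all_slices() call. The helper name metric_by_slice is hypothetical and not part of TFMA.

```python
# Trimmed stand-in for eval_result.get_metrics_for_all_slices().
# Keys are slice keys: () is the overall slice, and every other key is a
# tuple of (feature, value) pairs. Values are copied from the printed output.
all_slice_metrics = {
    (): {
        "accuracy": {"doubleValue": 0.7191107273101807},
        "auc": {"doubleValue": 0.7972093224525452},
    },
    (("sexual_orientation", "bisexual"),): {
        "accuracy": {"doubleValue": 0.5431034564971924},
        "auc": {"doubleValue": 0.6259934902191162},
    },
}


def metric_by_slice(slice_metrics, metric_name):
    """Return {readable slice label: metric value} for one metric (hypothetical helper)."""
    result = {}
    for slice_key, metrics in slice_metrics.items():
        # The empty tuple has no (feature, value) pairs, so label it "Overall".
        label = ", ".join(f"{feature}={value}" for feature, value in slice_key) or "Overall"
        result[label] = metrics[metric_name]["doubleValue"]
    return result


accuracy_by_slice = metric_by_slice(all_slice_metrics, "accuracy")
for label, value in accuracy_by_slice.items():
    print(f"{label}: accuracy = {value:.4f}")
```

In the notebook itself, the same helper would be called on the dictionary returned by eval_result.get_metrics_for_all_slices(), assuming each metric of interest is stored under a 'doubleValue' entry as in the output shown here.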

pp.pprint(eval_result.get_metrics_for_all_slices())
{(): {'accuracy': {'doubleValue': 0.7191107273101807},
      'accuracy_baseline': {'doubleValue': 0.9198060631752014},
      'auc': {'doubleValue': 0.7972093224525452},
      'auc_precision_recall': {'doubleValue': 0.3029831349849701},
      'average_loss': {'doubleValue': 0.5623680949211121},
      'label/mean': {'doubleValue': 0.08019392192363739},
      'post_export_metrics/example_count': {'doubleValue': 721950.0},
      'post_export_metrics/fairness/confusion_matrix_at_thresholds': {'confusionMatrixAtThresholds': {'matrices': [{'boundedFalseNegatives': {'value': 340.0},
                                                                                                                    'boundedFalsePositives': {'value': 612773.0},
                                                                                                                    'boundedPrecision': {'value': 0.08586231619119644},
                                                                                                                    'boundedRecall': {'value': 0.9941273927688599},
                                                                                                                    'boundedTrueNegatives': {'value': 51281.0},
                                                                                                                    'boundedTruePositives': {'value': 57556.0},
                                                                                                                    'falseNegatives': 340.0,
                                                                                                                    'falsePositives': 612773.0,
                                                                                                                    'precision': 0.08586231619119644,
                                                                                                                    'recall': 0.9941273927688599,
                                                                                                                    'tDistributionFalseNegatives': {'unsampledValue': 340.0},
                                                                                                                    'tDistributionFalsePositives': {'unsampledValue': 612773.0},
                                                                                                                    'tDistributionPrecision': {'unsampledValue': 0.08586231619119644},
                                                                                                                    'tDistributionRecall': {'unsampledValue': 0.9941273927688599},
                                                                                                                    'tDistributionTrueNegatives': {'unsampledValue': 51281.0},
                                                                                                                    'tDistributionTruePositives': {'unsampledValue': 57556.0},
                                                                                                                    'threshold': 0.10000000149011612,
                                                                                                                    'trueNegatives': 51281.0,
                                                                                                                    'truePositives': 57556.0},
                                                                                                                   {'boundedFalseNegatives': {'value': 4971.0},
                                                                                                                    'boundedFalsePositives': {'value': 385684.0},
                                                                                                                    'boundedPrecision': {'value': 0.12066555768251419},
                                                                                                                    'boundedRecall': {'value': 0.9141391515731812},
                                                                                                                    'boundedTrueNegatives': {'value': 278370.0},
                                                                                                                    'boundedTruePositives': {'value': 52925.0},
                                                                                                                    'falseNegatives': 4971.0,
                                                                                                                    'falsePositives': 385684.0,
                                                                                                                    'precision': 0.12066555768251419,
                                                                                                                    'recall': 0.9141391515731812,
                                                                                                                    'tDistributionFalseNegatives': {'unsampledValue': 4971.0},
                                                                                                                    'tDistributionFalsePositives': {'unsampledValue': 385684.0},
                                                                                                                    'tDistributionPrecision': {'unsampledValue': 0.12066555768251419},
                                                                                                                    'tDistributionRecall': {'unsampledValue': 0.9141391515731812},
                                                                                                                    'tDistributionTrueNegatives': {'unsampledValue': 278370.0},
                                                                                                                    'tDistributionTruePositives': {'unsampledValue': 52925.0},
                                                                                                                    'threshold': 0.30000001192092896,
                                                                                                                    'trueNegatives': 278370.0,
                                                                                                                    'truePositives': 52925.0},
                                                                                                                   {'boundedFalseNegatives': {'value': 15684.0},
                                                                                                                    'boundedFalsePositives': {'value': 187104.0},
                                                                                                                    'boundedPrecision': {'value': 0.18407785892486572},
                                                                                                                    'boundedRecall': {'value': 0.7291004657745361},
                                                                                                                    'boundedTrueNegatives': {'value': 476950.0},
                                                                                                                    'boundedTruePositives': {'value': 42212.0},
                                                                                                                    'falseNegatives': 15684.0,
                                                                                                                    'falsePositives': 187104.0,
                                                                                                                    'precision': 0.18407785892486572,
                                                                                                                    'recall': 0.7291004657745361,
                                                                                                                    'tDistributionFalseNegatives': {'unsampledValue': 15684.0},
                                                                                                                    'tDistributionFalsePositives': {'unsampledValue': 187104.0},
                                                                                                                    'tDistributionPrecision': {'unsampledValue': 0.18407785892486572},
                                                                                                                    'tDistributionRecall': {'unsampledValue': 0.7291004657745361},
                                                                                                                    'tDistributionTrueNegatives': {'unsampledValue': 476950.0},
                                                                                                                    'tDistributionTruePositives': {'unsampledValue': 42212.0},
                                                                                                                    'threshold': 0.5,
                                                                                                                    'trueNegatives': 476950.0,
                                                                                                                    'truePositives': 42212.0},
                                                                                                                   {'boundedFalseNegatives': {'value': 31392.0},
                                                                                                                    'boundedFalsePositives': {'value': 64497.0},
                                                                                                                    'boundedPrecision': {'value': 0.291249543428421},
                                                                                                                    'boundedRecall': {'value': 0.4577863812446594},
                                                                                                                    'boundedTrueNegatives': {'value': 599557.0},
                                                                                                                    'boundedTruePositives': {'value': 26504.0},
                                                                                                                    'falseNegatives': 31392.0,
                                                                                                                    'falsePositives': 64497.0,
                                                                                                                    'precision': 0.291249543428421,
                                                                                                                    'recall': 0.4577863812446594,
                                                                                                                    'tDistributionFalseNegatives': {'unsampledValue': 31392.0},
                                                                                                                    'tDistributionFalsePositives': {'unsampledValue': 64497.0},
                                                                                                                    'tDistributionPrecision': {'unsampledValue': 0.291249543428421},
                                                                                                                    'tDistributionRecall': {'unsampledValue': 0.4577863812446594},
                                                                                                                    'tDistributionTrueNegatives': {'unsampledValue': 599557.0},
                                                                                                                    'tDistributionTruePositives': {'unsampledValue': 26504.0},
                                                                                                                    'threshold': 0.699999988079071,
                                                                                                                    'trueNegatives': 599557.0,
                                                                                                                    'truePositives': 26504.0},
                                                                                                                   {'boundedFalseNegatives': {'value': 51089.0},
                                                                                                                    'boundedFalsePositives': {'value': 6354.0},
                                                                                                                    'boundedPrecision': {'value': 0.5172099471092224},
                                                                                                                    'boundedRecall': {'value': 0.11757288873195648},
                                                                                                                    'boundedTrueNegatives': {'value': 657700.0},
                                                                                                                    'boundedTruePositives': {'value': 6807.0},
                                                                                                                    'falseNegatives': 51089.0,
                                                                                                                    'falsePositives': 6354.0,
                                                                                                                    'precision': 0.5172099471092224,
                                                                                                                    'recall': 0.11757288873195648,
                                                                                                                    'tDistributionFalseNegatives': {'unsampledValue': 51089.0},
                                                                                                                    'tDistributionFalsePositives': {'unsampledValue': 6354.0},
                                                                                                                    'tDistributionPrecision': {'unsampledValue': 0.5172099471092224},
                                                                                                                    'tDistributionRecall': {'unsampledValue': 0.11757288873195648},
                                                                                                                    'tDistributionTrueNegatives': {'unsampledValue': 657700.0},
                                                                                                                    'tDistributionTruePositives': {'unsampledValue': 6807.0},
                                                                                                                    'threshold': 0.8999999761581421,
                                                                                                                    'trueNegatives': 657700.0,
                                                                                                                    'truePositives': 6807.0}]} },
      'post_export_metrics/false_discovery_rate@0.10': {'doubleValue': 0.9141376614570618},
      'post_export_metrics/false_discovery_rate@0.30': {'doubleValue': 0.8793344497680664},
      'post_export_metrics/false_discovery_rate@0.50': {'doubleValue': 0.8159221410751343},
      'post_export_metrics/false_discovery_rate@0.70': {'doubleValue': 0.7087504267692566},
      'post_export_metrics/false_discovery_rate@0.90': {'doubleValue': 0.4827900528907776},
      'post_export_metrics/false_negative_rate@0.10': {'doubleValue': 0.0058725993148982525},
      'post_export_metrics/false_negative_rate@0.30': {'doubleValue': 0.08586085587739944},
      'post_export_metrics/false_negative_rate@0.50': {'doubleValue': 0.27089953422546387},
      'post_export_metrics/false_negative_rate@0.70': {'doubleValue': 0.5422136187553406},
      'post_export_metrics/false_negative_rate@0.90': {'doubleValue': 0.8824270963668823},
      'post_export_metrics/false_omission_rate@0.10': {'doubleValue': 0.006586466915905476},
      'post_export_metrics/false_omission_rate@0.30': {'doubleValue': 0.017544230446219444},
      'post_export_metrics/false_omission_rate@0.50': {'doubleValue': 0.03183702379465103},
      'post_export_metrics/false_omission_rate@0.70': {'doubleValue': 0.04975362494587898},
      'post_export_metrics/false_omission_rate@0.90': {'doubleValue': 0.07207927852869034},
      'post_export_metrics/false_positive_rate@0.10': {'doubleValue': 0.9227758646011353},
      'post_export_metrics/false_positive_rate@0.30': {'doubleValue': 0.5808021426200867},
      'post_export_metrics/false_positive_rate@0.50': {'doubleValue': 0.28176021575927734},
      'post_export_metrics/false_positive_rate@0.70': {'doubleValue': 0.09712613373994827},
      'post_export_metrics/false_positive_rate@0.90': {'doubleValue': 0.009568499401211739},
      'post_export_metrics/negative_rate@0.10': {'doubleValue': 0.07150217890739441},
      'post_export_metrics/negative_rate@0.30': {'doubleValue': 0.3924662470817566},
      'post_export_metrics/negative_rate@0.50': {'doubleValue': 0.6823658347129822},
      'post_export_metrics/negative_rate@0.70': {'doubleValue': 0.8739510774612427},
      'post_export_metrics/negative_rate@0.90': {'doubleValue': 0.9817702174186707},
      'post_export_metrics/positive_rate@0.10': {'doubleValue': 0.9284977912902832},
      'post_export_metrics/positive_rate@0.30': {'doubleValue': 0.6075337529182434},
      'post_export_metrics/positive_rate@0.50': {'doubleValue': 0.3176341950893402},
      'post_export_metrics/positive_rate@0.70': {'doubleValue': 0.12604889273643494},
      'post_export_metrics/positive_rate@0.90': {'doubleValue': 0.01822979375720024},
      'post_export_metrics/true_negative_rate@0.10': {'doubleValue': 0.07722414284944534},
      'post_export_metrics/true_negative_rate@0.30': {'doubleValue': 0.41919782757759094},
      'post_export_metrics/true_negative_rate@0.50': {'doubleValue': 0.7182397842407227},
      'post_export_metrics/true_negative_rate@0.70': {'doubleValue': 0.9028738737106323},
      'post_export_metrics/true_negative_rate@0.90': {'doubleValue': 0.9904314875602722},
      'post_export_metrics/true_positive_rate@0.10': {'doubleValue': 0.9941273927688599},
      'post_export_metrics/true_positive_rate@0.30': {'doubleValue': 0.9141391515731812},
      'post_export_metrics/true_positive_rate@0.50': {'doubleValue': 0.7291004657745361},
      'post_export_metrics/true_positive_rate@0.70': {'doubleValue': 0.4577863812446594},
      'post_export_metrics/true_positive_rate@0.90': {'doubleValue': 0.11757288873195648},
      'precision': {'doubleValue': 0.18407785892486572},
      'prediction/mean': {'doubleValue': 0.40011298656463623},
      'recall': {'doubleValue': 0.7291004657745361} },
 (('sexual_orientation', 'bisexual'),): {'accuracy': {'doubleValue': 0.5431034564971924},
                                         'accuracy_baseline': {'doubleValue': 0.8017241358757019},
                                         'auc': {'doubleValue': 0.6259934902191162},
                                         'auc_precision_recall': {'doubleValue': 0.3277454376220703},
                                         'average_loss': {'doubleValue': 0.7425483465194702},
                                         'label/mean': {'doubleValue': 0.1982758641242981},
                                         'post_export_metrics/example_count': {'doubleValue': 116.0},
                                         'post_export_metrics/fairness/confusion_matrix_at_thresholds': {'confusionMatrixAtThresholds': {'matrices': [{'boundedFalseNegatives': {'value': 0.0},
                                                                                                                                                       'boundedFalsePositives': {'value': 84.0},
                                                                                                                                                       'boundedPrecision': {'value': 0.21495327353477478},
                                                                                                                                                       'boundedRecall': {'value': 1.0},
                                                                                                                                                       'boundedTrueNegatives': {'value': 9.0},
                                                                                                                                                       'boundedTruePositives': {'value': 23.0},
                                                                                                                                                       'falsePositives': 84.0,
                                                                                                                                                       'precision': 0.21495327353477478,
                                                                                                                                                       'recall': 1.0,
                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 0.0},
                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 84.0},
                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.21495327353477478},
                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 1.0},
                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 9.0},
                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 23.0},
                                                                                                                                                       'threshold': 0.10000000149011612,
                                                                                                                                                       'trueNegatives': 9.0,
                                                                                                                                                       'truePositives': 23.0},
                                                                                                                                                      {'boundedFalseNegatives': {'value': 4.0},
                                                                                                                                                       'boundedFalsePositives': {'value': 67.0},
                                                                                                                                                       'boundedPrecision': {'value': 0.22093023359775543},
                                                                                                                                                       'boundedRecall': {'value': 0.8260869383811951},
                                                                                                                                                       'boundedTrueNegatives': {'value': 26.0},
                                                                                                                                                       'boundedTruePositives': {'value': 19.0},
                                                                                                                                                       'falseNegatives': 4.0,
                                                                                                                                                       'falsePositives': 67.0,
                                                                                                                                                       'precision': 0.22093023359775543,
                                                                                                                                                       'recall': 0.8260869383811951,
                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 4.0},
                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 67.0},
                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.22093023359775543},
                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 0.8260869383811951},
                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 26.0},
                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 19.0},
                                                                                                                                                       'threshold': 0.30000001192092896,
                                                                                                                                                       'trueNegatives': 26.0,
                                                                                                                                                       'truePositives': 19.0},
                                                                                                                                                      {'boundedFalseNegatives': {'value': 9.0},
                                                                                                                                                       'boundedFalsePositives': {'value': 44.0},
                                                                                                                                                       'boundedPrecision': {'value': 0.24137930572032928},
                                                                                                                                                       'boundedRecall': {'value': 0.6086956262588501},
                                                                                                                                                       'boundedTrueNegatives': {'value': 49.0},
                                                                                                                                                       'boundedTruePositives': {'value': 14.0},
                                                                                                                                                       'falseNegatives': 9.0,
                                                                                                                                                       'falsePositives': 44.0,
                                                                                                                                                       'precision': 0.24137930572032928,
                                                                                                                                                       'recall': 0.6086956262588501,
                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 9.0},
                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 44.0},
                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.24137930572032928},
                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 0.6086956262588501},
                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 49.0},
                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 14.0},
                                                                                                                                                       'threshold': 0.5,
                                                                                                                                                       'trueNegatives': 49.0,
                                                                                                                                                       'truePositives': 14.0},
                                                                                                                                                      {'boundedFalseNegatives': {'value': 16.0},
                                                                                                                                                       'boundedFalsePositives': {'value': 20.0},
                                                                                                                                                       'boundedPrecision': {'value': 0.25925925374031067},
                                                                                                                                                       'boundedRecall': {'value': 0.30434781312942505},
                                                                                                                                                       'boundedTrueNegatives': {'value': 73.0},
                                                                                                                                                       'boundedTruePositives': {'value': 7.0},
                                                                                                                                                       'falseNegatives': 16.0,
                                                                                                                                                       'falsePositives': 20.0,
                                                                                                                                                       'precision': 0.25925925374031067,
                                                                                                                                                       'recall': 0.30434781312942505,
                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 16.0},
                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 20.0},
                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.25925925374031067},
                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 0.30434781312942505},
                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 73.0},
                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 7.0},
                                                                                                                                                       'threshold': 0.699999988079071,
                                                                                                                                                       'trueNegatives': 73.0,
                                                                                                                                                       'truePositives': 7.0},
                                                                                                                                                      {'boundedFalseNegatives': {'value': 22.0},
                                                                                                                                                       'boundedFalsePositives': {'value': 0.0},
                                                                                                                                                       'boundedPrecision': {'value': 1.0},
                                                                                                                                                       'boundedRecall': {'value': 0.043478261679410934},
                                                                                                                                                       'boundedTrueNegatives': {'value': 93.0},
                                                                                                                                                       'boundedTruePositives': {'value': 1.0},
                                                                                                                                                       'falseNegatives': 22.0,
                                                                                                                                                       'precision': 1.0,
                                                                                                                                                       'recall': 0.043478261679410934,
                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 22.0},
                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 0.0},
                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 1.0},
                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 0.043478261679410934},
                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 93.0},
                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 1.0},
                                                                                                                                                       'threshold': 0.8999999761581421,
                                                                                                                                                       'trueNegatives': 93.0,
                                                                                                                                                       'truePositives': 1.0}]} },
                                         'post_export_metrics/false_discovery_rate@0.10': {'doubleValue': 0.7850467562675476},
                                         'post_export_metrics/false_discovery_rate@0.30': {'doubleValue': 0.7790697813034058},
                                         'post_export_metrics/false_discovery_rate@0.50': {'doubleValue': 0.7586206793785095},
                                         'post_export_metrics/false_discovery_rate@0.70': {'doubleValue': 0.7407407164573669},
                                         'post_export_metrics/false_discovery_rate@0.90': {'doubleValue': 0.0},
                                         'post_export_metrics/false_negative_rate@0.10': {'doubleValue': 0.0},
                                         'post_export_metrics/false_negative_rate@0.30': {'doubleValue': 0.17391304671764374},
                                         'post_export_metrics/false_negative_rate@0.50': {'doubleValue': 0.3913043439388275},
                                         'post_export_metrics/false_negative_rate@0.70': {'doubleValue': 0.695652186870575},
                                         'post_export_metrics/false_negative_rate@0.90': {'doubleValue': 0.95652174949646},
                                         'post_export_metrics/false_omission_rate@0.10': {'doubleValue': 0.0},
                                         'post_export_metrics/false_omission_rate@0.30': {'doubleValue': 0.13333334028720856},
                                         'post_export_metrics/false_omission_rate@0.50': {'doubleValue': 0.1551724076271057},
                                         'post_export_metrics/false_omission_rate@0.70': {'doubleValue': 0.17977528274059296},
                                         'post_export_metrics/false_omission_rate@0.90': {'doubleValue': 0.19130434095859528},
                                         'post_export_metrics/false_positive_rate@0.10': {'doubleValue': 0.9032257795333862},
                                         'post_export_metrics/false_positive_rate@0.30': {'doubleValue': 0.7204301357269287},
                                         'post_export_metrics/false_positive_rate@0.50': {'doubleValue': 0.47311827540397644},
                                         'post_export_metrics/false_positive_rate@0.70': {'doubleValue': 0.2150537669658661},
                                         'post_export_metrics/false_positive_rate@0.90': {'doubleValue': 0.0},
                                         'post_export_metrics/negative_rate@0.10': {'doubleValue': 0.07758620381355286},
                                         'post_export_metrics/negative_rate@0.30': {'doubleValue': 0.2586206793785095},
                                         'post_export_metrics/negative_rate@0.50': {'doubleValue': 0.5},
                                         'post_export_metrics/negative_rate@0.70': {'doubleValue': 0.767241358757019},
                                         'post_export_metrics/negative_rate@0.90': {'doubleValue': 0.9913793206214905},
                                         'post_export_metrics/positive_rate@0.10': {'doubleValue': 0.9224137663841248},
                                         'post_export_metrics/positive_rate@0.30': {'doubleValue': 0.7413793206214905},
                                         'post_export_metrics/positive_rate@0.50': {'doubleValue': 0.5},
                                         'post_export_metrics/positive_rate@0.70': {'doubleValue': 0.23275862634181976},
                                         'post_export_metrics/positive_rate@0.90': {'doubleValue': 0.008620689623057842},
                                         'post_export_metrics/true_negative_rate@0.10': {'doubleValue': 0.09677419066429138},
                                         'post_export_metrics/true_negative_rate@0.30': {'doubleValue': 0.2795698940753937},
                                         'post_export_metrics/true_negative_rate@0.50': {'doubleValue': 0.5268816947937012},
                                         'post_export_metrics/true_negative_rate@0.70': {'doubleValue': 0.7849462628364563},
                                         'post_export_metrics/true_negative_rate@0.90': {'doubleValue': 1.0},
                                         'post_export_metrics/true_positive_rate@0.10': {'doubleValue': 1.0},
                                         'post_export_metrics/true_positive_rate@0.30': {'doubleValue': 0.8260869383811951},
                                         'post_export_metrics/true_positive_rate@0.50': {'doubleValue': 0.6086956262588501},
                                         'post_export_metrics/true_positive_rate@0.70': {'doubleValue': 0.30434781312942505},
                                         'post_export_metrics/true_positive_rate@0.90': {'doubleValue': 0.043478261679410934},
                                         'precision': {'doubleValue': 0.24137930572032928},
                                         'prediction/mean': {'doubleValue': 0.4882037043571472},
                                         'recall': {'doubleValue': 0.6086956262588501} },
 (('sexual_orientation', 'heterosexual'),): {'accuracy': {'doubleValue': 0.5325203537940979},
                                             'accuracy_baseline': {'doubleValue': 0.7601625919342041},
                                             'auc': {'doubleValue': 0.6657300591468811},
                                             'auc_precision_recall': {'doubleValue': 0.40228524804115295},
                                             'average_loss': {'doubleValue': 0.8246824741363525},
                                             'label/mean': {'doubleValue': 0.2398373931646347},
                                             'post_export_metrics/example_count': {'doubleValue': 492.0},
                                             'post_export_metrics/fairness/confusion_matrix_at_thresholds': {'confusionMatrixAtThresholds': {'matrices': [{'boundedFalseNegatives': {'value': 0.0},
                                                                                                                                                           'boundedFalsePositives': {'value': 362.0},
                                                                                                                                                           'boundedPrecision': {'value': 0.24583333730697632},
                                                                                                                                                           'boundedRecall': {'value': 1.0},
                                                                                                                                                           'boundedTrueNegatives': {'value': 12.0},
                                                                                                                                                           'boundedTruePositives': {'value': 118.0},
                                                                                                                                                           'falsePositives': 362.0,
                                                                                                                                                           'precision': 0.24583333730697632,
                                                                                                                                                           'recall': 1.0,
                                                                                                                                                           'tDistributionFalseNegatives': {'unsampledValue': 0.0},
                                                                                                                                                           'tDistributionFalsePositives': {'unsampledValue': 362.0},
                                                                                                                                                           'tDistributionPrecision': {'unsampledValue': 0.24583333730697632},
                                                                                                                                                           'tDistributionRecall': {'unsampledValue': 1.0},
                                                                                                                                                           'tDistributionTrueNegatives': {'unsampledValue': 12.0},
                                                                                                                                                           'tDistributionTruePositives': {'unsampledValue': 118.0},
                                                                                                                                                           'threshold': 0.10000000149011612,
                                                                                                                                                           'trueNegatives': 12.0,
                                                                                                                                                           'truePositives': 118.0},
                                                                                                                                                          {'boundedFalseNegatives': {'value': 10.0},
                                                                                                                                                           'boundedFalsePositives': {'value': 289.0},
                                                                                                                                                           'boundedPrecision': {'value': 0.27204030752182007},
                                                                                                                                                           'boundedRecall': {'value': 0.9152542352676392},
                                                                                                                                                           'boundedTrueNegatives': {'value': 85.0},
                                                                                                                                                           'boundedTruePositives': {'value': 108.0},
                                                                                                                                                           'falseNegatives': 10.0,
                                                                                                                                                           'falsePositives': 289.0,
                                                                                                                                                           'precision': 0.27204030752182007,
                                                                                                                                                           'recall': 0.9152542352676392,
                                                                                                                                                           'tDistributionFalseNegatives': {'unsampledValue': 10.0},
                                                                                                                                                           'tDistributionFalsePositives': {'unsampledValue': 289.0},
                                                                                                                                                           'tDistributionPrecision': {'unsampledValue': 0.27204030752182007},
                                                                                                                                                           'tDistributionRecall': {'unsampledValue': 0.9152542352676392},
                                                                                                                                                           'tDistributionTrueNegatives': {'unsampledValue': 85.0},
                                                                                                                                                           'tDistributionTruePositives': {'unsampledValue': 108.0},
                                                                                                                                                           'threshold': 0.30000001192092896,
                                                                                                                                                           'trueNegatives': 85.0,
                                                                                                                                                           'truePositives': 108.0},
                                                                                                                                                          {'boundedFalseNegatives': {'value': 31.0},
                                                                                                                                                           'boundedFalsePositives': {'value': 199.0},
                                                                                                                                                           'boundedPrecision': {'value': 0.3041957914829254},
                                                                                                                                                           'boundedRecall': {'value': 0.7372881174087524},
                                                                                                                                                           'boundedTrueNegatives': {'value': 175.0},
                                                                                                                                                           'boundedTruePositives': {'value': 87.0},
                                                                                                                                                           'falseNegatives': 31.0,
                                                                                                                                                           'falsePositives': 199.0,
                                                                                                                                                           'precision': 0.3041957914829254,
                                                                                                                                                           'recall': 0.7372881174087524,
                                                                                                                                                           'tDistributionFalseNegatives': {'unsampledValue': 31.0},
                                                                                                                                                           'tDistributionFalsePositives': {'unsampledValue': 199.0},
                                                                                                                                                           'tDistributionPrecision': {'unsampledValue': 0.3041957914829254},
                                                                                                                                                           'tDistributionRecall': {'unsampledValue': 0.7372881174087524},
                                                                                                                                                           'tDistributionTrueNegatives': {'unsampledValue': 175.0},
                                                                                                                                                           'tDistributionTruePositives': {'unsampledValue': 87.0},
                                                                                                                                                           'threshold': 0.5,
                                                                                                                                                           'trueNegatives': 175.0,
                                                                                                                                                           'truePositives': 87.0},
                                                                                                                                                          {'boundedFalseNegatives': {'value': 58.0},
                                                                                                                                                           'boundedFalsePositives': {'value': 111.0},
                                                                                                                                                           'boundedPrecision': {'value': 0.35087719559669495},
                                                                                                                                                           'boundedRecall': {'value': 0.508474588394165},
                                                                                                                                                           'boundedTrueNegatives': {'value': 263.0},
                                                                                                                                                           'boundedTruePositives': {'value': 60.0},
                                                                                                                                                           'falseNegatives': 58.0,
                                                                                                                                                           'falsePositives': 111.0,
                                                                                                                                                           'precision': 0.35087719559669495,
                                                                                                                                                           'recall': 0.508474588394165,
                                                                                                                                                           'tDistributionFalseNegatives': {'unsampledValue': 58.0},
                                                                                                                                                           'tDistributionFalsePositives': {'unsampledValue': 111.0},
                                                                                                                                                           'tDistributionPrecision': {'unsampledValue': 0.35087719559669495},
                                                                                                                                                           'tDistributionRecall': {'unsampledValue': 0.508474588394165},
                                                                                                                                                           'tDistributionTrueNegatives': {'unsampledValue': 263.0},
                                                                                                                                                           'tDistributionTruePositives': {'unsampledValue': 60.0},
                                                                                                                                                           'threshold': 0.699999988079071,
                                                                                                                                                           'trueNegatives': 263.0,
                                                                                                                                                           'truePositives': 60.0},
                                                                                                                                                          {'boundedFalseNegatives': {'value': 97.0},
                                                                                                                                                           'boundedFalsePositives': {'value': 16.0},
                                                                                                                                                           'boundedPrecision': {'value': 0.5675675868988037},
                                                                                                                                                           'boundedRecall': {'value': 0.17796610295772552},
                                                                                                                                                           'boundedTrueNegatives': {'value': 358.0},
                                                                                                                                                           'boundedTruePositives': {'value': 21.0},
                                                                                                                                                           'falseNegatives': 97.0,
                                                                                                                                                           'falsePositives': 16.0,
                                                                                                                                                           'precision': 0.5675675868988037,
                                                                                                                                                           'recall': 0.17796610295772552,
                                                                                                                                                           'tDistributionFalseNegatives': {'unsampledValue': 97.0},
                                                                                                                                                           'tDistributionFalsePositives': {'unsampledValue': 16.0},
                                                                                                                                                           'tDistributionPrecision': {'unsampledValue': 0.5675675868988037},
                                                                                                                                                           'tDistributionRecall': {'unsampledValue': 0.17796610295772552},
                                                                                                                                                           'tDistributionTrueNegatives': {'unsampledValue': 358.0},
                                                                                                                                                           'tDistributionTruePositives': {'unsampledValue': 21.0},
                                                                                                                                                           'threshold': 0.8999999761581421,
                                                                                                                                                           'trueNegatives': 358.0,
                                                                                                                                                           'truePositives': 21.0}]}},
                                             'post_export_metrics/false_discovery_rate@0.10': {'doubleValue': 0.7541666626930237},
                                             'post_export_metrics/false_discovery_rate@0.30': {'doubleValue': 0.7279596924781799},
                                             'post_export_metrics/false_discovery_rate@0.50': {'doubleValue': 0.6958041787147522},
                                             'post_export_metrics/false_discovery_rate@0.70': {'doubleValue': 0.6491228342056274},
                                             'post_export_metrics/false_discovery_rate@0.90': {'doubleValue': 0.4324324429035187},
                                             'post_export_metrics/false_negative_rate@0.10': {'doubleValue': 0.0},
                                             'post_export_metrics/false_negative_rate@0.30': {'doubleValue': 0.08474576473236084},
                                             'post_export_metrics/false_negative_rate@0.50': {'doubleValue': 0.26271185278892517},
                                             'post_export_metrics/false_negative_rate@0.70': {'doubleValue': 0.49152541160583496},
                                             'post_export_metrics/false_negative_rate@0.90': {'doubleValue': 0.8220338821411133},
                                             'post_export_metrics/false_omission_rate@0.10': {'doubleValue': 0.0},
                                             'post_export_metrics/false_omission_rate@0.30': {'doubleValue': 0.10526315867900848},
                                             'post_export_metrics/false_omission_rate@0.50': {'doubleValue': 0.15048544108867645},
                                             'post_export_metrics/false_omission_rate@0.70': {'doubleValue': 0.180685356259346},
                                             'post_export_metrics/false_omission_rate@0.90': {'doubleValue': 0.2131868153810501},
                                             'post_export_metrics/false_positive_rate@0.10': {'doubleValue': 0.9679144620895386},
                                             'post_export_metrics/false_positive_rate@0.30': {'doubleValue': 0.7727272510528564},
                                             'post_export_metrics/false_positive_rate@0.50': {'doubleValue': 0.5320855379104614},
                                             'post_export_metrics/false_positive_rate@0.70': {'doubleValue': 0.2967914342880249},
                                             'post_export_metrics/false_positive_rate@0.90': {'doubleValue': 0.04278074949979782},
                                             'post_export_metrics/negative_rate@0.10': {'doubleValue': 0.024390242993831635},
                                             'post_export_metrics/negative_rate@0.30': {'doubleValue': 0.19308942556381226},
                                             'post_export_metrics/negative_rate@0.50': {'doubleValue': 0.4186991751194},
                                             'post_export_metrics/negative_rate@0.70': {'doubleValue': 0.6524389982223511},
                                             'post_export_metrics/negative_rate@0.90': {'doubleValue': 0.9247967600822449},
                                             'post_export_metrics/positive_rate@0.10': {'doubleValue': 0.9756097793579102},
                                             'post_export_metrics/positive_rate@0.30': {'doubleValue': 0.8069105744361877},
                                             'post_export_metrics/positive_rate@0.50': {'doubleValue': 0.5813007950782776},
                                             'post_export_metrics/positive_rate@0.70': {'doubleValue': 0.34756097197532654},
                                             'post_export_metrics/positive_rate@0.90': {'doubleValue': 0.07520325481891632},
                                             'post_export_metrics/true_negative_rate@0.10': {'doubleValue': 0.03208556026220322},
                                             'post_export_metrics/true_negative_rate@0.30': {'doubleValue': 0.22727273404598236},
                                             'post_export_metrics/true_negative_rate@0.50': {'doubleValue': 0.4679144322872162},
                                             'post_export_metrics/true_negative_rate@0.70': {'doubleValue': 0.7032085657119751},
                                             'post_export_metrics/true_negative_rate@0.90': {'doubleValue': 0.9572192430496216},
                                             'post_export_metrics/true_positive_rate@0.10': {'doubleValue': 1.0},
                                             'post_export_metrics/true_positive_rate@0.30': {'doubleValue': 0.9152542352676392},
                                             'post_export_metrics/true_positive_rate@0.50': {'doubleValue': 0.7372881174087524},
                                             'post_export_metrics/true_positive_rate@0.70': {'doubleValue': 0.508474588394165},
                                             'post_export_metrics/true_positive_rate@0.90': {'doubleValue': 0.17796610295772552},
                                             'precision': {'doubleValue': 0.3041957914829254},
                                             'prediction/mean': {'doubleValue': 0.5550199747085571},
                                             'recall': {'doubleValue': 0.7372881174087524}},
 (('sexual_orientation', 'homosexual_gay_or_lesbian'),): {'accuracy': {'doubleValue': 0.5879271030426025},
                                                          'accuracy_baseline': {'doubleValue': 0.7182232141494751},
                                                          'auc': {'doubleValue': 0.7076172232627869},
                                                          'auc_precision_recall': {'doubleValue': 0.4747510254383087},
                                                          'average_loss': {'doubleValue': 0.73641037940979},
                                                          'label/mean': {'doubleValue': 0.2817767560482025},
                                                          'post_export_metrics/example_count': {'doubleValue': 4390.0},
                                                          'post_export_metrics/fairness/confusion_matrix_at_thresholds': {'confusionMatrixAtThresholds': {'matrices': [{'boundedFalseNegatives': {'value': 2.0},
                                                                                                                                                                        'boundedFalsePositives': {'value': 3031.0},
                                                                                                                                                                        'boundedPrecision': {'value': 0.28949835896492004},
                                                                                                                                                                        'boundedRecall': {'value': 0.9983831644058228},
                                                                                                                                                                        'boundedTrueNegatives': {'value': 122.0},
                                                                                                                                                                        'boundedTruePositives': {'value': 1235.0},
                                                                                                                                                                        'falseNegatives': 2.0,
                                                                                                                                                                        'falsePositives': 3031.0,
                                                                                                                                                                        'precision': 0.28949835896492004,
                                                                                                                                                                        'recall': 0.9983831644058228,
                                                                                                                                                                        'tDistributionFalseNegatives': {'unsampledValue': 2.0},
                                                                                                                                                                        'tDistributionFalsePositives': {'unsampledValue': 3031.0},
                                                                                                                                                                        'tDistributionPrecision': {'unsampledValue': 0.28949835896492004},
                                                                                                                                                                        'tDistributionRecall': {'unsampledValue': 0.9983831644058228},
                                                                                                                                                                        'tDistributionTrueNegatives': {'unsampledValue': 122.0},
                                                                                                                                                                        'tDistributionTruePositives': {'unsampledValue': 1235.0},
                                                                                                                                                                        'threshold': 0.10000000149011612,
                                                                                                                                                                        'trueNegatives': 122.0,
                                                                                                                                                                        'truePositives': 1235.0},
                                                                                                                                                                       {'boundedFalseNegatives': {'value': 76.0},
                                                                                                                                                                        'boundedFalsePositives': {'value': 2368.0},
                                                                                                                                                                        'boundedPrecision': {'value': 0.32898837327957153},
                                                                                                                                                                        'boundedRecall': {'value': 0.9385610222816467},
                                                                                                                                                                        'boundedTrueNegatives': {'value': 785.0},
                                                                                                                                                                        'boundedTruePositives': {'value': 1161.0},
                                                                                                                                                                        'falseNegatives': 76.0,
                                                                                                                                                                        'falsePositives': 2368.0,
                                                                                                                                                                        'precision': 0.32898837327957153,
                                                                                                                                                                        'recall': 0.9385610222816467,
                                                                                                                                                                        'tDistributionFalseNegatives': {'unsampledValue': 76.0},
                                                                                                                                                                        'tDistributionFalsePositives': {'unsampledValue': 2368.0},
                                                                                                                                                                        'tDistributionPrecision': {'unsampledValue': 0.32898837327957153},
                                                                                                                                                                        'tDistributionRecall': {'unsampledValue': 0.9385610222816467},
                                                                                                                                                                        'tDistributionTrueNegatives': {'unsampledValue': 785.0},
                                                                                                                                                                        'tDistributionTruePositives': {'unsampledValue': 1161.0},
                                                                                                                                                                        'threshold': 0.30000001192092896,
                                                                                                                                                                        'trueNegatives': 785.0,
                                                                                                                                                                        'truePositives': 1161.0},
                                                                                                                                                                       {'boundedFalseNegatives': {'value': 281.0},
                                                                                                                                                                        'boundedFalsePositives': {'value': 1528.0},
                                                                                                                                                                        'boundedPrecision': {'value': 0.38486313819885254},
                                                                                                                                                                        'boundedRecall': {'value': 0.7728375196456909},
                                                                                                                                                                        'boundedTrueNegatives': {'value': 1625.0},
                                                                                                                                                                        'boundedTruePositives': {'value': 956.0},
                                                                                                                                                                        'falseNegatives': 281.0,
                                                                                                                                                                        'falsePositives': 1528.0,
                                                                                                                                                                        'precision': 0.38486313819885254,
                                                                                                                                                                        'recall': 0.7728375196456909,
                                                                                                                                                                        'tDistributionFalseNegatives': {'unsampledValue': 281.0},
                                                                                                                                                                        'tDistributionFalsePositives': {'unsampledValue': 1528.0},
                                                                                                                                                                        'tDistributionPrecision': {'unsampledValue': 0.38486313819885254},
                                                                                                                                                                        'tDistributionRecall': {'unsampledValue': 0.7728375196456909},
                                                                                                                                                                        'tDistributionTrueNegatives': {'unsampledValue': 1625.0},
                                                                                                                                                                        'tDistributionTruePositives': {'unsampledValue': 956.0},
                                                                                                                                                                        'threshold': 0.5,
                                                                                                                                                                        'trueNegatives': 1625.0,
                                                                                                                                                                        'truePositives': 956.0},
                                                                                                                                                                       {'boundedFalseNegatives': {'value': 611.0},
                                                                                                                                                                        'boundedFalsePositives': {'value': 731.0},
                                                                                                                                                                        'boundedPrecision': {'value': 0.4613117277622223},
                                                                                                                                                                        'boundedRecall': {'value': 0.5060630440711975},
                                                                                                                                                                        'boundedTrueNegatives': {'value': 2422.0},
                                                                                                                                                                        'boundedTruePositives': {'value': 626.0},
                                                                                                                                                                        'falseNegatives': 611.0,
                                                                                                                                                                        'falsePositives': 731.0,
                                                                                                                                                                        'precision': 0.4613117277622223,
                                                                                                                                                                        'recall': 0.5060630440711975,
                                                                                                                                                                        'tDistributionFalseNegatives': {'unsampledValue': 611.0},
                                                                                                                                                                        'tDistributionFalsePositives': {'unsampledValue': 731.0},
                                                                                                                                                                        'tDistributionPrecision': {'unsampledValue': 0.4613117277622223},
                                                                                                                                                                        'tDistributionRecall': {'unsampledValue': 0.5060630440711975},
                                                                                                                                                                        'tDistributionTrueNegatives': {'unsampledValue': 2422.0},
                                                                                                                                                                        'tDistributionTruePositives': {'unsampledValue': 626.0},
                                                                                                                                                                        'threshold': 0.699999988079071,
                                                                                                                                                                        'trueNegatives': 2422.0,
                                                                                                                                                                        'truePositives': 626.0},
                                                                                                                                                                       {'boundedFalseNegatives': {'value': 1072.0},
                                                                                                                                                                        'boundedFalsePositives': {'value': 107.0},
                                                                                                                                                                        'boundedPrecision': {'value': 0.6066176295280457},
                                                                                                                                                                        'boundedRecall': {'value': 0.1333872228860855},
                                                                                                                                                                        'boundedTrueNegatives': {'value': 3046.0},
                                                                                                                                                                        'boundedTruePositives': {'value': 165.0},
                                                                                                                                                                        'falseNegatives': 1072.0,
                                                                                                                                                                        'falsePositives': 107.0,
                                                                                                                                                                        'precision': 0.6066176295280457,
                                                                                                                                                                        'recall': 0.1333872228860855,
                                                                                                                                                                        'tDistributionFalseNegatives': {'unsampledValue': 1072.0},
                                                                                                                                                                        'tDistributionFalsePositives': {'unsampledValue': 107.0},
                                                                                                                                                                        'tDistributionPrecision': {'unsampledValue': 0.6066176295280457},
                                                                                                                                                                        'tDistributionRecall': {'unsampledValue': 0.1333872228860855},
                                                                                                                                                                        'tDistributionTrueNegatives': {'unsampledValue': 3046.0},
                                                                                                                                                                        'tDistributionTruePositives': {'unsampledValue': 165.0},
                                                                                                                                                                        'threshold': 0.8999999761581421,
                                                                                                                                                                        'trueNegatives': 3046.0,
                                                                                                                                                                        'truePositives': 165.0}]}},
                                                          'post_export_metrics/false_discovery_rate@0.10': {'doubleValue': 0.7105016112327576},
                                                          'post_export_metrics/false_discovery_rate@0.30': {'doubleValue': 0.6710116267204285},
                                                          'post_export_metrics/false_discovery_rate@0.50': {'doubleValue': 0.6151368618011475},
                                                          'post_export_metrics/false_discovery_rate@0.70': {'doubleValue': 0.5386883020401001},
                                                          'post_export_metrics/false_discovery_rate@0.90': {'doubleValue': 0.39338234066963196},
                                                          'post_export_metrics/false_negative_rate@0.10': {'doubleValue': 0.001616814872249961},
                                                          'post_export_metrics/false_negative_rate@0.30': {'doubleValue': 0.061438966542482376},
                                                          'post_export_metrics/false_negative_rate@0.50': {'doubleValue': 0.22716249525547028},
                                                          'post_export_metrics/false_negative_rate@0.70': {'doubleValue': 0.4939369559288025},
                                                          'post_export_metrics/false_negative_rate@0.90': {'doubleValue': 0.8666127920150757},
                                                          'post_export_metrics/false_omission_rate@0.10': {'doubleValue': 0.016129031777381897},
                                                          'post_export_metrics/false_omission_rate@0.30': {'doubleValue': 0.08826945722103119},
                                                          'post_export_metrics/false_omission_rate@0.50': {'doubleValue': 0.14742916822433472},
                                                          'post_export_metrics/false_omission_rate@0.70': {'doubleValue': 0.20145070552825928},
                                                          'post_export_metrics/false_omission_rate@0.90': {'doubleValue': 0.2603205442428589},
                                                          'post_export_metrics/false_positive_rate@0.10': {'doubleValue': 0.9613066911697388},
                                                          'post_export_metrics/false_positive_rate@0.30': {'doubleValue': 0.7510307431221008},
                                                          'post_export_metrics/false_positive_rate@0.50': {'doubleValue': 0.48461782932281494},
                                                          'post_export_metrics/false_positive_rate@0.70': {'doubleValue': 0.23184269666671753},
                                                          'post_export_metrics/false_positive_rate@0.90': {'doubleValue': 0.03393593430519104},
                                                          'post_export_metrics/negative_rate@0.10': {'doubleValue': 0.028246013447642326},
                                                          'post_export_metrics/negative_rate@0.30': {'doubleValue': 0.19612756371498108},
                                                          'post_export_metrics/negative_rate@0.50': {'doubleValue': 0.43416857719421387},
                                                          'post_export_metrics/negative_rate@0.70': {'doubleValue': 0.6908884048461914},
                                                          'post_export_metrics/negative_rate@0.90': {'doubleValue': 0.9380410313606262},
                                                          'post_export_metrics/positive_rate@0.10': {'doubleValue': 0.9717540144920349},
                                                          'post_export_metrics/positive_rate@0.30': {'doubleValue': 0.8038724660873413},
                                                          'post_export_metrics/positive_rate@0.50': {'doubleValue': 0.5658314228057861},
                                                          'post_export_metrics/positive_rate@0.70': {'doubleValue': 0.309111624956131},
                                                          'post_export_metrics/positive_rate@0.90': {'doubleValue': 0.06195899844169617},
                                                          'post_export_metrics/true_negative_rate@0.10': {'doubleValue': 0.03869330883026123},
                                                          'post_export_metrics/true_negative_rate@0.30': {'doubleValue': 0.24896924197673798},
                                                          'post_export_metrics/true_negative_rate@0.50': {'doubleValue': 0.5153821706771851},
                                                          'post_export_metrics/true_negative_rate@0.70': {'doubleValue': 0.7681573033332825},
                                                          'post_export_metrics/true_negative_rate@0.90': {'doubleValue': 0.9660640954971313},
                                                          'post_export_metrics/true_positive_rate@0.10': {'doubleValue': 0.9983831644058228},
                                                          'post_export_metrics/true_positive_rate@0.30': {'doubleValue': 0.9385610222816467},
                                                          'post_export_metrics/true_positive_rate@0.50': {'doubleValue': 0.7728375196456909},
                                                          'post_export_metrics/true_positive_rate@0.70': {'doubleValue': 0.5060630440711975},
                                                          'post_export_metrics/true_positive_rate@0.90': {'doubleValue': 0.1333872228860855},
                                                          'precision': {'doubleValue': 0.38486313819885254},
                                                          'prediction/mean': {'doubleValue': 0.5426298975944519},
                                                          'recall': {'doubleValue': 0.7728375196456909} },
 (('sexual_orientation', 'other_sexual_orientation'),): {'accuracy': {'doubleValue': 0.6000000238418579},
                                                         'accuracy_baseline': {'doubleValue': 0.800000011920929},
                                                         'auc': {'doubleValue': 1.0},
                                                         'auc_precision_recall': {'doubleValue': 1.0},
                                                         'average_loss': {'doubleValue': 0.7428082823753357},
                                                         'label/mean': {'doubleValue': 0.20000000298023224},
                                                         'post_export_metrics/example_count': {'doubleValue': 5.0},
                                                         'post_export_metrics/fairness/confusion_matrix_at_thresholds': {'confusionMatrixAtThresholds': {'matrices': [{'boundedFalseNegatives': {'value': 0.0},
                                                                                                                                                                       'boundedFalsePositives': {'value': 4.0},
                                                                                                                                                                       'boundedPrecision': {'value': 0.20000000298023224},
                                                                                                                                                                       'boundedRecall': {'value': 1.0},
                                                                                                                                                                       'boundedTrueNegatives': {'value': 0.0},
                                                                                                                                                                       'boundedTruePositives': {'value': 1.0},
                                                                                                                                                                       'falsePositives': 4.0,
                                                                                                                                                                       'precision': 0.20000000298023224,
                                                                                                                                                                       'recall': 1.0,
                                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 0.0},
                                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 4.0},
                                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.20000000298023224},
                                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 1.0},
                                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 0.0},
                                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 1.0},
                                                                                                                                                                       'threshold': 0.10000000149011612,
                                                                                                                                                                       'truePositives': 1.0},
                                                                                                                                                                      {'boundedFalseNegatives': {'value': 0.0},
                                                                                                                                                                       'boundedFalsePositives': {'value': 3.0},
                                                                                                                                                                       'boundedPrecision': {'value': 0.25},
                                                                                                                                                                       'boundedRecall': {'value': 1.0},
                                                                                                                                                                       'boundedTrueNegatives': {'value': 1.0},
                                                                                                                                                                       'boundedTruePositives': {'value': 1.0},
                                                                                                                                                                       'falsePositives': 3.0,
                                                                                                                                                                       'precision': 0.25,
                                                                                                                                                                       'recall': 1.0,
                                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 0.0},
                                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 3.0},
                                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.25},
                                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 1.0},
                                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 1.0},
                                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 1.0},
                                                                                                                                                                       'threshold': 0.30000001192092896,
                                                                                                                                                                       'trueNegatives': 1.0,
                                                                                                                                                                       'truePositives': 1.0},
                                                                                                                                                                      {'boundedFalseNegatives': {'value': 0.0},
                                                                                                                                                                       'boundedFalsePositives': {'value': 2.0},
                                                                                                                                                                       'boundedPrecision': {'value': 0.3333333432674408},
                                                                                                                                                                       'boundedRecall': {'value': 1.0},
                                                                                                                                                                       'boundedTrueNegatives': {'value': 2.0},
                                                                                                                                                                       'boundedTruePositives': {'value': 1.0},
                                                                                                                                                                       'falsePositives': 2.0,
                                                                                                                                                                       'precision': 0.3333333432674408,
                                                                                                                                                                       'recall': 1.0,
                                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 0.0},
                                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 2.0},
                                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.3333333432674408},
                                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 1.0},
                                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 2.0},
                                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 1.0},
                                                                                                                                                                       'threshold': 0.5,
                                                                                                                                                                       'trueNegatives': 2.0,
                                                                                                                                                                       'truePositives': 1.0},
                                                                                                                                                                      {'boundedFalseNegatives': {'value': 0.0},
                                                                                                                                                                       'boundedFalsePositives': {'value': 1.0},
                                                                                                                                                                       'boundedPrecision': {'value': 0.5},
                                                                                                                                                                       'boundedRecall': {'value': 1.0},
                                                                                                                                                                       'boundedTrueNegatives': {'value': 3.0},
                                                                                                                                                                       'boundedTruePositives': {'value': 1.0},
                                                                                                                                                                       'falsePositives': 1.0,
                                                                                                                                                                       'precision': 0.5,
                                                                                                                                                                       'recall': 1.0,
                                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 0.0},
                                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 1.0},
                                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.5},
                                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 1.0},
                                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 3.0},
                                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 1.0},
                                                                                                                                                                       'threshold': 0.699999988079071,
                                                                                                                                                                       'trueNegatives': 3.0,
                                                                                                                                                                       'truePositives': 1.0},
                                                                                                                                                                      {'boundedFalseNegatives': {'value': 1.0},
                                                                                                                                                                       'boundedFalsePositives': {'value': 0.0},
                                                                                                                                                                       'boundedPrecision': {'value': 0.0},
                                                                                                                                                                       'boundedRecall': {'value': 0.0},
                                                                                                                                                                       'boundedTrueNegatives': {'value': 4.0},
                                                                                                                                                                       'boundedTruePositives': {'value': 0.0},
                                                                                                                                                                       'falseNegatives': 1.0,
                                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 1.0},
                                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 0.0},
                                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.0},
                                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 0.0},
                                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 4.0},
                                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 0.0},
                                                                                                                                                                       'threshold': 0.8999999761581421,
                                                                                                                                                                       'trueNegatives': 4.0}]} },
                                                         'post_export_metrics/false_discovery_rate@0.10': {'doubleValue': 0.800000011920929},
                                                         'post_export_metrics/false_discovery_rate@0.30': {'doubleValue': 0.75},
                                                         'post_export_metrics/false_discovery_rate@0.50': {'doubleValue': 0.6666666865348816},
                                                         'post_export_metrics/false_discovery_rate@0.70': {'doubleValue': 0.5},
                                                         'post_export_metrics/false_discovery_rate@0.90': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_negative_rate@0.10': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_negative_rate@0.30': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_negative_rate@0.50': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_negative_rate@0.70': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_negative_rate@0.90': {'doubleValue': 1.0},
                                                         'post_export_metrics/false_omission_rate@0.10': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_omission_rate@0.30': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_omission_rate@0.50': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_omission_rate@0.70': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_omission_rate@0.90': {'doubleValue': 0.20000000298023224},
                                                         'post_export_metrics/false_positive_rate@0.10': {'doubleValue': 1.0},
                                                         'post_export_metrics/false_positive_rate@0.30': {'doubleValue': 0.75},
                                                         'post_export_metrics/false_positive_rate@0.50': {'doubleValue': 0.5},
                                                         'post_export_metrics/false_positive_rate@0.70': {'doubleValue': 0.25},
                                                         'post_export_metrics/false_positive_rate@0.90': {'doubleValue': 0.0},
                                                         'post_export_metrics/negative_rate@0.10': {'doubleValue': 0.0},
                                                         'post_export_metrics/negative_rate@0.30': {'doubleValue': 0.20000000298023224},
                                                         'post_export_metrics/negative_rate@0.50': {'doubleValue': 0.4000000059604645},
                                                         'post_export_metrics/negative_rate@0.70': {'doubleValue': 0.6000000238418579},
                                                         'post_export_metrics/negative_rate@0.90': {'doubleValue': 1.0},
                                                         'post_export_metrics/positive_rate@0.10': {'doubleValue': 1.0},
                                                         'post_export_metrics/positive_rate@0.30': {'doubleValue': 0.800000011920929},
                                                         'post_export_metrics/positive_rate@0.50': {'doubleValue': 0.6000000238418579},
                                                         'post_export_metrics/positive_rate@0.70': {'doubleValue': 0.4000000059604645},
                                                         'post_export_metrics/positive_rate@0.90': {'doubleValue': 0.0},
                                                         'post_export_metrics/true_negative_rate@0.10': {'doubleValue': 0.0},
                                                         'post_export_metrics/true_negative_rate@0.30': {'doubleValue': 0.25},
                                                         'post_export_metrics/true_negative_rate@0.50': {'doubleValue': 0.5},
                                                         'post_export_metrics/true_negative_rate@0.70': {'doubleValue': 0.75},
                                                         'post_export_metrics/true_negative_rate@0.90': {'doubleValue': 1.0},
                                                         'post_export_metrics/true_positive_rate@0.10': {'doubleValue': 1.0},
                                                         'post_export_metrics/true_positive_rate@0.30': {'doubleValue': 1.0},
                                                         'post_export_metrics/true_positive_rate@0.50': {'doubleValue': 1.0},
                                                         'post_export_metrics/true_positive_rate@0.70': {'doubleValue': 1.0},
                                                         'post_export_metrics/true_positive_rate@0.90': {'doubleValue': 0.0},
                                                         'precision': {'doubleValue': 0.3333333432674408},
                                                         'prediction/mean': {'doubleValue': 0.5955214500427246},
                                                         'recall': {'doubleValue': 1.0} } }
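
The per-threshold rates printed above can be checked against the raw confusion-matrix counts for the same slice. The following is a minimal sketch, not TFMA's implementation: the helper name `rates_from_counts` is hypothetical, and returning 0.0 for a zero denominator is an assumption chosen only so the sketch agrees with the 0.0 printed for `false_discovery_rate@0.90`. The counts are those shown for the `other_sexual_orientation` slice at threshold 0.5 (1 true positive, 2 false positives, 2 true negatives, 0 false negatives).

```python
def rates_from_counts(tp, fp, tn, fn):
    """Derive rate metrics from raw confusion-matrix counts (illustrative helper)."""

    def ratio(numerator, denominator):
        # Assumption: report 0.0 when the denominator is zero, to match
        # the values printed in the output above.
        return numerator / denominator if denominator else 0.0

    total = tp + fp + tn + fn
    return {
        "true_positive_rate": ratio(tp, tp + fn),
        "false_negative_rate": ratio(fn, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
        "true_negative_rate": ratio(tn, fp + tn),
        "false_discovery_rate": ratio(fp, tp + fp),
        "false_omission_rate": ratio(fn, tn + fn),
        "positive_rate": ratio(tp + fp, total),
        "negative_rate": ratio(tn + fn, total),
    }


# Counts for the other_sexual_orientation slice at threshold 0.5.
rates = rates_from_counts(tp=1, fp=2, tn=2, fn=0)
for name, value in sorted(rates.items()):
    print(f"{name}@0.50: {value:.4f}")
```

Because this slice contains only 5 examples, a single comment changes each rate by a large step (for example, the false positive rate moves in increments of 0.25), so values such as `auc` of 1.0 for this slice should be read with caution.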