Introduction to Fairness Indicators

Overview

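Fairness Indicators is a suite of tools built on top of TensorFlow Model Analysis (TFMA) that enables regular evaluation of fairness metrics in product pipelines. TFMA is a library for evaluating both TensorFlow and non-TensorFlow machine learning models. It lets you evaluate models on large amounts of data in a distributed manner, compute in-graph and other metrics over different slices of data, and visualize the results in notebooks.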

Fairness Indicators is packaged with TensorFlow Data Validation (TFDV) and the What-If Tool. With Fairness Indicators you can:

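  • Evaluate model performance, sliced across defined groups of users
  • Gain confidence in the results with confidence intervals and evaluations at multiple thresholds
  • Evaluate the distribution of datasets
  • Dive deep into individual slices to explore root causes and opportunities for improvement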

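In this notebook, you will use Fairness Indicators to fix fairness issues in a model that you train on the Civil Comments dataset. Watch this video for more details and context on the real-world scenario this is based on, which is also one of the primary motivations for creating Fairness Indicators.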

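Dataset

In this notebook, you will work with the Civil Comments dataset: approximately 2 million public comments released by the Civil Comments platform in 2017 for ongoing research. This effort was sponsored by Jigsaw, which has hosted competitions on Kaggle to help classify toxic comments and minimize unintended model bias.

Each individual text comment in the dataset has a toxicity label: 1 if the comment is toxic and 0 if it is non-toxic. Within the data, a subset of comments is labeled with a variety of identity attributes, including categories for gender, sexual orientation, religion, and race or ethnicity.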

Setup

Install fairness-indicators and witwidget.

pip install -q -U pip==20.2

pip install -q fairness-indicators
pip install -q witwidget

You must restart the Colab runtime after installing. Select Runtime > Restart runtime from the Colab menu.

Do not proceed with the rest of this tutorial without first restarting the runtime.

Import all other required libraries.

import os
import tempfile
import apache_beam as beam
import numpy as np
import pandas as pd
from datetime import datetime
import pprint

import tensorflow_hub as hub
import tensorflow as tf
import tensorflow_model_analysis as tfma
import tensorflow_data_validation as tfdv

from tensorflow_model_analysis.addons.fairness.post_export_metrics import fairness_indicators
from tensorflow_model_analysis.addons.fairness.view import widget_view

from fairness_indicators.documentation.examples import util

from witwidget.notebook.visualization import WitConfigBuilder
from witwidget.notebook.visualization import WitWidget

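Download and analyze the data

By default, this notebook downloads a preprocessed version of the dataset, but you may use the original dataset and re-run the processing steps if desired. In the original dataset, each comment is labeled with the percentage of raters who believed that the comment corresponds to a particular identity. For example, a comment might be labeled with the following: { male: 0.3, female: 1.0, transgender: 0.0, heterosexual: 0.8, homosexual_gay_or_lesbian: 1.0 }. The processing step groups identities by category (gender, sexual_orientation, etc.) and removes identities with a score below 0.5. So the example above would be converted to the following: { gender: [female], sexual_orientation: [heterosexual, homosexual_gay_or_lesbian] }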
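
The grouping-and-thresholding step described above can be sketched in plain Python. This is only an illustration, not the notebook's util.convert_comments_data implementation: the IDENTITY_CATEGORIES mapping and the group_identities helper are hypothetical names, and dropping categories that end up empty is an assumption.

```python
# Hypothetical category mapping for illustration only; the real mapping
# lives in the notebook's util module and covers more identity terms.
IDENTITY_CATEGORIES = {
    'gender': ['male', 'female', 'transgender'],
    'sexual_orientation': ['heterosexual', 'homosexual_gay_or_lesbian'],
}


def group_identities(scores, threshold=0.5):
    """Groups identity terms by category, dropping terms scored below threshold."""
    grouped = {}
    for category, terms in IDENTITY_CATEGORIES.items():
        kept = [term for term in terms if scores.get(term, 0.0) >= threshold]
        if kept:  # Assumption: categories with no remaining terms are omitted.
            grouped[category] = kept
    return grouped


raw_labels = {'male': 0.3, 'female': 1.0, 'transgender': 0.0,
              'heterosexual': 0.8, 'homosexual_gay_or_lesbian': 1.0}
print(group_identities(raw_labels))
```

Run on the rater scores from the example in the text, the sketch yields the same converted labels: female under gender, and both heterosexual and homosexual_gay_or_lesbian under sexual_orientation.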

download_original_data = False

if download_original_data:
  train_tf_file = tf.keras.utils.get_file('train_tf.tfrecord',
                                          'https://storage.googleapis.com/civil_comments_dataset/train_tf.tfrecord')
  validate_tf_file = tf.keras.utils.get_file('validate_tf.tfrecord',
                                             'https://storage.googleapis.com/civil_comments_dataset/validate_tf.tfrecord')

  # The identity terms list will be grouped together by their categories
  # (see 'IDENTITY_COLUMNS') using a threshold of 0.5. Only the identity term
  # columns, the text column and the label column are kept after processing.
  train_tf_file = util.convert_comments_data(train_tf_file)
  validate_tf_file = util.convert_comments_data(validate_tf_file)

else:
  train_tf_file = tf.keras.utils.get_file('train_tf_processed.tfrecord',
                                          'https://storage.googleapis.com/civil_comments_dataset/train_tf_processed.tfrecord')
  validate_tf_file = tf.keras.utils.get_file('validate_tf_processed.tfrecord',
                                             'https://storage.googleapis.com/civil_comments_dataset/validate_tf_processed.tfrecord')
Downloading data from https://storage.googleapis.com/civil_comments_dataset/train_tf_processed.tfrecord
488161280/488153424 [==============================] - 14s 0us/step
Downloading data from https://storage.googleapis.com/civil_comments_dataset/validate_tf_processed.tfrecord
324943872/324941336 [==============================] - 2s 0us/step

Use TFDV to analyze the data and find potential problems in it, such as missing values and data imbalances, that can lead to fairness disparities.

stats = tfdv.generate_statistics_from_tfrecord(data_location=train_tf_file)
tfdv.visualize_statistics(stats)
WARNING:apache_beam.runners.interactive.interactive_environment:Dependencies required for Interactive Beam PCollection visualization are not available, please use: `pip install apache-beam[interactive]` to install necessary dependencies to enable all data visualization features.

Warning:apache_beam.io.tfrecordio:Couldn't find python-snappy so the implementation of _TFRecordUtil._masked_crc32c is not as fast as it could be.

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_data_validation/utils/stats_util.py:247: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`

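TFDV shows that there are some significant imbalances in the data which could lead to biased model outcomes.

  • The toxicity label (the value predicted by the model) is unbalanced. Only 8% of the examples in the training set are toxic, which means that a classifier could reach 92% accuracy simply by predicting that all comments are non-toxic.

  • In the fields relating to identity terms, only 6.6k out of the 1.08 million (0.61%) training examples deal with homosexuality, and those related to bisexuality are even rarer. This indicates that performance on these slices may suffer due to a lack of training data.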
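
The two figures above follow from simple arithmetic, sketched here with the rounded numbers quoted in the text (8% toxic, 6.6k of 1.08 million examples). The variable names are illustrative only.

```python
# Rounded values quoted in the text above.
toxic_fraction = 0.08
homosexuality_examples = 6_600
training_examples = 1_080_000

# A classifier that always predicts "non-toxic" is right on every
# non-toxic example and wrong on every toxic one.
majority_baseline_accuracy = 1.0 - toxic_fraction

# Share of the training set that falls into the identity slice.
slice_share = homosexuality_examples / training_examples

print(f"Accuracy of always predicting non-toxic: {majority_baseline_accuracy:.0%}")
print(f"Share of training examples in the slice: {slice_share:.2%}")
```

Accuracy alone is therefore a weak signal on this dataset, which is one reason the later sections look at false positive and false negative rates per slice.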

Prepare the data

Define a feature map to parse the data. Each example has a label, the comment text, and the identity features sexual_orientation, gender, religion, race, and disability that are associated with the text.

BASE_DIR = tempfile.gettempdir()

TEXT_FEATURE = 'comment_text'
LABEL = 'toxicity'
FEATURE_MAP = {
    # Label:
    LABEL: tf.io.FixedLenFeature([], tf.float32),
    # Text:
    TEXT_FEATURE:  tf.io.FixedLenFeature([], tf.string),

    # Identities:
    'sexual_orientation':tf.io.VarLenFeature(tf.string),
    'gender':tf.io.VarLenFeature(tf.string),
    'religion':tf.io.VarLenFeature(tf.string),
    'race':tf.io.VarLenFeature(tf.string),
    'disability':tf.io.VarLenFeature(tf.string),
}

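Next, set up an input function to feed data into the model. To account for the class imbalance identified by TFDV, add a weight column to each example and upweight the toxic examples. Use the identity features only during the evaluation phase, since only the comments are fed into the model during training.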

def train_input_fn():
  def parse_function(serialized):
    parsed_example = tf.io.parse_single_example(
        serialized=serialized, features=FEATURE_MAP)
    # Adds a weight column to deal with unbalanced classes.
    parsed_example['weight'] = tf.add(parsed_example[LABEL], 0.1)
    return (parsed_example,
            parsed_example[LABEL])
  train_dataset = tf.data.TFRecordDataset(
      filenames=[train_tf_file]).map(parse_function).batch(512)
  return train_dataset
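
To see what the weight column does, the following sketch reproduces the label + 0.1 rule in plain Python and combines it with the roughly 8% toxic share reported by TFDV. It assumes labels are exactly 0 or 1, as described in the dataset section; the helper and variable names are illustrative.

```python
EXTRA_WEIGHT = 0.1  # Same constant that parse_function adds to the label.


def example_weight(label):
    """Mirrors tf.add(label, 0.1) for a single scalar label."""
    return label + EXTRA_WEIGHT


toxic_weight = example_weight(1.0)
non_toxic_weight = example_weight(0.0)
weight_ratio = toxic_weight / non_toxic_weight

# Share of the total weight carried by toxic examples when about 8% of the
# examples are toxic (rounded figure from the TFDV analysis).
toxic_fraction = 0.08
weighted_toxic = toxic_fraction * toxic_weight
weighted_non_toxic = (1.0 - toxic_fraction) * non_toxic_weight
weighted_toxic_share = weighted_toxic / (weighted_toxic + weighted_non_toxic)

print(f"A toxic example weighs {weight_ratio:.0f}x a non-toxic one")
print(f"Toxic share of total weight: {weighted_toxic_share:.1%}")
```

Under these assumptions the weighting brings the two classes close to parity in the loss, even though toxic comments remain a small minority of the examples.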

Train the model

Create and train a deep learning model on the data.

model_dir = os.path.join(BASE_DIR, 'train', datetime.now().strftime(
    "%Y%m%d-%H%M%S"))

embedded_text_feature_column = hub.text_embedding_column(
    key=TEXT_FEATURE,
    module_spec='https://tfhub.dev/google/nnlm-en-dim128/1')

classifier = tf.estimator.DNNClassifier(
    hidden_units=[500, 100],
    weight_column='weight',
    feature_columns=[embedded_text_feature_column],
    optimizer=tf.keras.optimizers.Adagrad(learning_rate=0.003),
    loss_reduction=tf.losses.Reduction.SUM,
    n_classes=2,
    model_dir=model_dir)

classifier.train(input_fn=train_input_fn, steps=1000)
INFO:tensorflow:Using default config.

INFO:tensorflow:Using config: {'_model_dir': '/tmp/train/20210111-203552', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/head/base_head.py:517: NumericColumn._get_dense_tensor (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column.py:2192: NumericColumn._transform_feature (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/keras/optimizer_v2/adagrad.py:83: calling Constant.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor

INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmp/train/20210111-203552/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 59.633625, step = 0
INFO:tensorflow:global_step/sec: 20.8493
INFO:tensorflow:loss = 56.093784, step = 100 (4.799 sec)
INFO:tensorflow:global_step/sec: 21.0896
INFO:tensorflow:loss = 47.710705, step = 200 (4.742 sec)
INFO:tensorflow:global_step/sec: 21.1662
INFO:tensorflow:loss = 56.390984, step = 300 (4.725 sec)
INFO:tensorflow:global_step/sec: 20.8717
INFO:tensorflow:loss = 55.78325, step = 400 (4.791 sec)
INFO:tensorflow:global_step/sec: 20.6858
INFO:tensorflow:loss = 41.514656, step = 500 (4.834 sec)
INFO:tensorflow:global_step/sec: 21.335
INFO:tensorflow:loss = 45.96388, step = 600 (4.687 sec)
INFO:tensorflow:global_step/sec: 20.9603
INFO:tensorflow:loss = 51.020523, step = 700 (4.771 sec)
INFO:tensorflow:global_step/sec: 20.9778
INFO:tensorflow:loss = 47.464096, step = 800 (4.767 sec)
INFO:tensorflow:global_step/sec: 21.1824
INFO:tensorflow:loss = 48.542847, step = 900 (4.721 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 1000...
INFO:tensorflow:Saving checkpoints for 1000 into /tmp/train/20210111-203552/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 1000...
INFO:tensorflow:Loss for final step: 51.33647.

<tensorflow_estimator.python.estimator.canned.dnn.DNNClassifierV2 at 0x7feb8e3bce80>

Analyze the model

After obtaining the trained model, analyze it to compute fairness metrics using TFMA and Fairness Indicators. Begin by exporting the model as a SavedModel.

Export SavedModel

def eval_input_receiver_fn():
  serialized_tf_example = tf.compat.v1.placeholder(
      dtype=tf.string, shape=[None], name='input_example_placeholder')

  # This *must* be a dictionary containing a single key 'examples', which
  # points to the input placeholder.
  receiver_tensors = {'examples': serialized_tf_example}

  features = tf.io.parse_example(serialized_tf_example, FEATURE_MAP)
  features['weight'] = tf.ones_like(features[LABEL])

  return tfma.export.EvalInputReceiver(
    features=features,
    receiver_tensors=receiver_tensors,
    labels=features[LABEL])

tfma_export_dir = tfma.export.export_eval_savedmodel(
  estimator=classifier,
  export_dir_base=os.path.join(BASE_DIR, 'tfma_eval_model'),
  eval_input_receiver_fn=eval_input_receiver_fn)
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_model_analysis/eval_saved_model/encoding.py:141: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Signatures INCLUDED in export for Classify: None
INFO:tensorflow:Signatures INCLUDED in export for Regress: None
INFO:tensorflow:Signatures INCLUDED in export for Predict: None
INFO:tensorflow:Signatures INCLUDED in export for Train: None
INFO:tensorflow:Signatures INCLUDED in export for Eval: ['eval']
Warning:tensorflow:Export includes no default signature!
INFO:tensorflow:Restoring parameters from /tmp/train/20210111-203552/model.ckpt-1000
INFO:tensorflow:Assets added to graph.
INFO:tensorflow:Assets written to: /tmp/tfma_eval_model/temp-1610397418/assets
INFO:tensorflow:SavedModel written to: /tmp/tfma_eval_model/temp-1610397418/saved_model.pb

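Compute Fairness Metrics

Use the dropdown in the panel on the right to select the identity to compute metrics for and whether to run with confidence intervals.

Fairness Indicators Computation Options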

Slice selection: sexual_orientation
Compute confidence intervals: False
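
As a rough illustration of what slicing by an identity column means, the sketch below computes a false positive rate for the overall slice and for each value of the slicing column. It is a toy, single-process stand-in and not how TFMA computes its metrics: the example data, the helper name, and the "score above threshold counts as a positive prediction" rule are all assumptions made for the illustration. The slice keys follow the tuple format that eval_result.get_slice_names() prints later in this notebook, with () standing for the overall slice.

```python
from collections import defaultdict

# Hypothetical toy data: (identity values in the slicing column, label, score).
EXAMPLES = [
    (['heterosexual'], 0, 0.2),
    (['heterosexual'], 0, 0.6),
    (['homosexual_gay_or_lesbian'], 0, 0.7),
    (['homosexual_gay_or_lesbian'], 1, 0.9),
    (['heterosexual', 'bisexual'], 0, 0.4),
    ([], 0, 0.1),
    ([], 1, 0.8),
]


def false_positive_rate_by_slice(examples, column, threshold=0.5):
    """Returns {slice_key: false positive rate}; the key () is the overall slice."""
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for identities, label, score in examples:
        if label != 0:
            continue  # Only non-toxic examples can become false positives.
        # Each example counts toward the overall slice and toward one slice
        # per identity value it carries (the feature is multi-valued).
        keys = [()] + [((column, value),) for value in identities]
        for key in keys:
            negatives[key] += 1
            if score > threshold:
                false_positives[key] += 1
    return {key: false_positives[key] / count for key, count in negatives.items()}


fpr_by_slice = false_positive_rate_by_slice(EXAMPLES, 'sexual_orientation')
for slice_key, fpr in fpr_by_slice.items():
    print(slice_key or 'Overall', round(fpr, 3))
```

Note how small slices produce extreme rates from one or two examples; this is the situation in which confidence intervals become useful, at the cost of slower computation.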

Warning:apache_beam.typehints.typehints:Ignoring send_type hint: <class 'NoneType'>
WARNING:apache_beam.typehints.typehints:Ignoring return_type hint: <class 'NoneType'>

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_model_analysis/eval_saved_model/load.py:169: load (from tensorflow.python.saved_model.loader_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.load or tf.compat.v1.saved_model.load. There will be a new function for importing SavedModels in Tensorflow 2.0.

INFO:tensorflow:Restoring parameters from /tmp/tfma_eval_model/1610397418/variables/variables

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_model_analysis/eval_saved_model/graph_ref.py:189: get_tensor_from_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.get_tensor_from_tensor_info or tf.compat.v1.saved_model.get_tensor_from_tensor_info.

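Visualize data using the What-If Tool

In this section, you'll use the What-If Tool's interactive visual interface to explore and manipulate data at a micro level.

Each point on the scatter plot in the right-hand panel represents one of the examples in the subset loaded into the tool. Click on one of the points to see details about that example in the left-hand panel. The comment text, ground truth toxicity, and applicable identities are shown. At the bottom of the left-hand panel, you see the inference results from the model you just trained.

Modify the text of the example and then click the Run inference button to see how your changes affect the perceived toxicity prediction.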

DEFAULT_MAX_EXAMPLES = 1000

# Load 100000 examples in memory. When first rendered, 
# What-If Tool should only display 1000 of these due to browser constraints.
def wit_dataset(file, num_examples=100000):
  dataset = tf.data.TFRecordDataset(
      filenames=[file]).take(num_examples)
  return [tf.train.Example.FromString(d.numpy()) for d in dataset]

wit_data = wit_dataset(train_tf_file)
config_builder = WitConfigBuilder(wit_data[:DEFAULT_MAX_EXAMPLES]).set_estimator_and_feature_spec(
    classifier, FEATURE_MAP).set_label_vocab(['non-toxicity', LABEL]).set_target_feature(LABEL)
wit = WitWidget(config_builder)

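Render Fairness Indicators

Render the Fairness Indicators widget with the exported evaluation results.

Below you will see bar charts displaying the performance of each slice of the data on the selected metrics. You can adjust the baseline comparison slice as well as the displayed threshold(s) using the dropdown menus at the top of the visualization.

The Fairness Indicators widget is integrated with the What-If Tool rendered above. If you select one slice of the data in the bar chart, the What-If Tool updates to show examples from the selected slice. When the data reloads in the What-If Tool above, try changing Color By to toxicity. This gives you a visual understanding of the toxicity balance of examples by slice.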

event_handlers={'slice-selected':
                wit.create_selection_callback(wit_data, DEFAULT_MAX_EXAMPLES)}
widget_view.render_fairness_indicator(eval_result=eval_result,
                                      slicing_column=slice_selection,
                                      event_handlers=event_handlers
                                      )
FairnessIndicatorViewer(slicingMetrics=[{'sliceValue': 'Overall', 'slice': 'Overall', 'metrics': {'post_export…

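With this particular dataset and task, systematically higher false positive and false negative rates for certain identities can lead to negative consequences. For example, in a content moderation system, a higher-than-overall false positive rate for a certain group can lead to those voices being silenced. It is therefore important to evaluate these kinds of criteria regularly as you develop and improve models, and to use tools such as Fairness Indicators, TFDV, and WIT to help surface potential problems. Once you've identified fairness issues, you can experiment with new data sources, data balancing, or other techniques to improve performance on underperforming groups.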
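
For reference, the rates discussed above are simple ratios of confusion-matrix counts. The sketch below uses the standard definitions together with the overall-slice counts at threshold 0.1 that this notebook prints further down in the get_metrics_for_slice() output; the helper name confusion_rates is illustrative and not part of TFMA.

```python
def confusion_rates(true_positives, false_positives, true_negatives, false_negatives):
    """Computes common rates from raw confusion-matrix counts."""
    return {
        'false_positive_rate': false_positives / (false_positives + true_negatives),
        'false_negative_rate': false_negatives / (false_negatives + true_positives),
        'precision': true_positives / (true_positives + false_positives),
        'recall': true_positives / (true_positives + false_negatives),
    }


# Overall-slice counts at threshold 0.1, copied from the notebook output.
overall_rates = confusion_rates(
    true_positives=57580,
    false_positives=614843,
    true_negatives=49211,
    false_negatives=316,
)
for name, value in overall_rates.items():
    print(f"{name}: {value:.4f}")
```

At such a low threshold almost every comment is flagged, so recall is very high while precision and the false positive rate are poor. Comparing the same rates between a slice and the overall baseline is what the bar charts in the widget visualize.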

See here for more information and guidance on how to use Fairness Indicators.

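Use fairness evaluation results

The eval_result object, rendered above in render_fairness_indicator(), has its own API that you can use to read TFMA results into your programs.

Get evaluated slices and metrics

Use get_slice_names() and get_metric_names() to get the evaluated slices and metrics, respectively.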

pp = pprint.PrettyPrinter()

print("Slices:")
pp.pprint(eval_result.get_slice_names())
print("\nMetrics:")
pp.pprint(eval_result.get_metric_names())
Slices:
[(),
 (('sexual_orientation', 'homosexual_gay_or_lesbian'),),
 (('sexual_orientation', 'heterosexual'),),
 (('sexual_orientation', 'bisexual'),),
 (('sexual_orientation', 'other_sexual_orientation'),)]

Metrics:
['post_export_metrics/fairness/confusion_matrix_at_thresholds',
 'post_export_metrics/false_discovery_rate@0.10',
 'post_export_metrics/false_negative_rate@0.70',
 'precision',
 'post_export_metrics/false_positive_rate@0.90',
 'post_export_metrics/false_discovery_rate@0.50',
 'post_export_metrics/true_positive_rate@0.90',
 'post_export_metrics/true_negative_rate@0.70',
 'post_export_metrics/false_positive_rate@0.30',
 'post_export_metrics/negative_rate@0.90',
 'post_export_metrics/negative_rate@0.10',
 'label/mean',
 'post_export_metrics/positive_rate@0.10',
 'prediction/mean',
 'post_export_metrics/false_discovery_rate@0.30',
 'accuracy',
 'post_export_metrics/false_negative_rate@0.90',
 'post_export_metrics/true_positive_rate@0.10',
 'post_export_metrics/negative_rate@0.70',
 'post_export_metrics/false_negative_rate@0.10',
 'post_export_metrics/false_discovery_rate@0.90',
 'post_export_metrics/false_omission_rate@0.50',
 'post_export_metrics/true_positive_rate@0.70',
 'post_export_metrics/false_discovery_rate@0.70',
 'post_export_metrics/positive_rate@0.30',
 'post_export_metrics/false_omission_rate@0.70',
 'post_export_metrics/true_negative_rate@0.90',
 'post_export_metrics/false_omission_rate@0.10',
 'recall',
 'post_export_metrics/false_positive_rate@0.50',
 'post_export_metrics/negative_rate@0.30',
 'post_export_metrics/true_positive_rate@0.50',
 'post_export_metrics/false_omission_rate@0.90',
 'post_export_metrics/example_count',
 'post_export_metrics/false_positive_rate@0.10',
 'post_export_metrics/true_negative_rate@0.30',
 'post_export_metrics/true_negative_rate@0.10',
 'post_export_metrics/false_omission_rate@0.30',
 'auc',
 'post_export_metrics/false_positive_rate@0.70',
 'post_export_metrics/false_negative_rate@0.30',
 'post_export_metrics/true_positive_rate@0.30',
 'average_loss',
 'post_export_metrics/negative_rate@0.50',
 'accuracy_baseline',
 'post_export_metrics/true_negative_rate@0.50',
 'post_export_metrics/false_negative_rate@0.50',
 'post_export_metrics/positive_rate@0.50',
 'auc_precision_recall',
 'post_export_metrics/positive_rate@0.90',
 'post_export_metrics/positive_rate@0.70']
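
The names printed above follow visible conventions: fairness metrics carry a post_export_metrics/ prefix and an @threshold suffix, and a slice is a tuple of (column, value) pairs, with the empty tuple standing for the overall slice. The helpers below are a small sketch for turning them into friendlier forms. They are hypothetical conveniences, not part of the TFMA API, and they rely only on the naming patterns seen in this output.

```python
PREFIX = 'post_export_metrics/'


def parse_metric_name(name):
    """Splits a metric name into (base name, threshold or None)."""
    base = name[len(PREFIX):] if name.startswith(PREFIX) else name
    if '@' in base:
        metric, threshold = base.split('@', 1)
        return metric, float(threshold)
    return base, None


def describe_slice(slice_key):
    """Formats a slice key such as (('gender', 'female'),) for display."""
    if not slice_key:
        return 'Overall'
    return ', '.join(f'{column}={value}' for column, value in slice_key)


print(parse_metric_name('post_export_metrics/false_positive_rate@0.50'))
print(parse_metric_name('auc'))
print(describe_slice(()))
print(describe_slice((('sexual_orientation', 'heterosexual'),)))
```

Grouping the parsed names by base metric makes it easy to pull out, say, every false positive rate across the evaluated thresholds for one slice.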

Use get_metrics_for_slice() to get the metrics for a particular slice as a dictionary mapping metric names to metric values.

baseline_slice = ()
heterosexual_slice = (('sexual_orientation', 'heterosexual'),)

print("Baseline metric values:")
pp.pprint(eval_result.get_metrics_for_slice(baseline_slice))
print("\nHeterosexual metric values:")
pp.pprint(eval_result.get_metrics_for_slice(heterosexual_slice))
Baseline metric values:
{'accuracy': {'doubleValue': 0.7181273102760315},
 'accuracy_baseline': {'doubleValue': 0.9198060631752014},
 'auc': {'doubleValue': 0.7963067293167114},
 'auc_precision_recall': {'doubleValue': 0.3019316792488098},
 'average_loss': {'doubleValue': 0.5641244053840637},
 'label/mean': {'doubleValue': 0.08019392192363739},
 'post_export_metrics/example_count': {'doubleValue': 721950.0},
 'post_export_metrics/fairness/confusion_matrix_at_thresholds': {'confusionMatrixAtThresholds': {'matrices': [{'boundedFalseNegatives': {'value': 316.0},
                                                                                                               'boundedFalsePositives': {'value': 614843.0},
                                                                                                               'boundedPrecision': {'value': 0.0856306254863739},
                                                                                                               'boundedRecall': {'value': 0.9945419430732727},
                                                                                                               'boundedTrueNegatives': {'value': 49211.0},
                                                                                                               'boundedTruePositives': {'value': 57580.0},
                                                                                                               'falseNegatives': 316.0,
                                                                                                               'falsePositives': 614843.0,
                                                                                                               'precision': 0.0856306254863739,
                                                                                                               'recall': 0.9945419430732727,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 316.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 614843.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.0856306254863739},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.9945419430732727},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 49211.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 57580.0},
                                                                                                               'threshold': 0.10000000149011612,
                                                                                                               'trueNegatives': 49211.0,
                                                                                                               'truePositives': 57580.0},
                                                                                                              {'boundedFalseNegatives': {'value': 5026.0},
                                                                                                               'boundedFalsePositives': {'value': 386489.0},
                                                                                                               'boundedPrecision': {'value': 0.12033439427614212},
                                                                                                               'boundedRecall': {'value': 0.913189172744751},
                                                                                                               'boundedTrueNegatives': {'value': 277565.0},
                                                                                                               'boundedTruePositives': {'value': 52870.0},
                                                                                                               'falseNegatives': 5026.0,
                                                                                                               'falsePositives': 386489.0,
                                                                                                               'precision': 0.12033439427614212,
                                                                                                               'recall': 0.913189172744751,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 5026.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 386489.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.12033439427614212},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.913189172744751},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 277565.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 52870.0},
                                                                                                               'threshold': 0.30000001192092896,
                                                                                                               'trueNegatives': 277565.0,
                                                                                                               'truePositives': 52870.0},
                                                                                                              {'boundedFalseNegatives': {'value': 15685.0},
                                                                                                               'boundedFalsePositives': {'value': 187813.0},
                                                                                                               'boundedPrecision': {'value': 0.18350693583488464},
                                                                                                               'boundedRecall': {'value': 0.7290831804275513},
                                                                                                               'boundedTrueNegatives': {'value': 476241.0},
                                                                                                               'boundedTruePositives': {'value': 42211.0},
                                                                                                               'falseNegatives': 15685.0,
                                                                                                               'falsePositives': 187813.0,
                                                                                                               'precision': 0.18350693583488464,
                                                                                                               'recall': 0.7290831804275513,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 15685.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 187813.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.18350693583488464},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.7290831804275513},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 476241.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 42211.0},
                                                                                                               'threshold': 0.5,
                                                                                                               'trueNegatives': 476241.0,
                                                                                                               'truePositives': 42211.0},
                                                                                                              {'boundedFalseNegatives': {'value': 31441.0},
                                                                                                               'boundedFalsePositives': {'value': 64692.0},
                                                                                                               'boundedPrecision': {'value': 0.2902454137802124},
                                                                                                               'boundedRecall': {'value': 0.45694002509117126},
                                                                                                               'boundedTrueNegatives': {'value': 599362.0},
                                                                                                               'boundedTruePositives': {'value': 26455.0},
                                                                                                               'falseNegatives': 31441.0,
                                                                                                               'falsePositives': 64692.0,
                                                                                                               'precision': 0.2902454137802124,
                                                                                                               'recall': 0.45694002509117126,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 31441.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 64692.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.2902454137802124},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.45694002509117126},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 599362.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 26455.0},
                                                                                                               'threshold': 0.699999988079071,
                                                                                                               'trueNegatives': 599362.0,
                                                                                                               'truePositives': 26455.0},
                                                                                                              {'boundedFalseNegatives': {'value': 51099.0},
                                                                                                               'boundedFalsePositives': {'value': 6463.0},
                                                                                                               'boundedPrecision': {'value': 0.5125942826271057},
                                                                                                               'boundedRecall': {'value': 0.1174001693725586},
                                                                                                               'boundedTrueNegatives': {'value': 657591.0},
                                                                                                               'boundedTruePositives': {'value': 6797.0},
                                                                                                               'falseNegatives': 51099.0,
                                                                                                               'falsePositives': 6463.0,
                                                                                                               'precision': 0.5125942826271057,
                                                                                                               'recall': 0.1174001693725586,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 51099.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 6463.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.5125942826271057},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.1174001693725586},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 657591.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 6797.0},
                                                                                                               'threshold': 0.8999999761581421,
                                                                                                               'trueNegatives': 657591.0,
                                                                                                               'truePositives': 6797.0}]}},
 'post_export_metrics/false_discovery_rate@0.10': {'doubleValue': 0.9143694043159485},
 'post_export_metrics/false_discovery_rate@0.30': {'doubleValue': 0.8796656131744385},
 'post_export_metrics/false_discovery_rate@0.50': {'doubleValue': 0.816493034362793},
 'post_export_metrics/false_discovery_rate@0.70': {'doubleValue': 0.7097545862197876},
 'post_export_metrics/false_discovery_rate@0.90': {'doubleValue': 0.4874057173728943},
 'post_export_metrics/false_negative_rate@0.10': {'doubleValue': 0.005458062514662743},
 'post_export_metrics/false_negative_rate@0.30': {'doubleValue': 0.08681083470582962},
 'post_export_metrics/false_negative_rate@0.50': {'doubleValue': 0.27091681957244873},
 'post_export_metrics/false_negative_rate@0.70': {'doubleValue': 0.5430599451065063},
 'post_export_metrics/false_negative_rate@0.90': {'doubleValue': 0.8825998306274414},
 'post_export_metrics/false_omission_rate@0.10': {'doubleValue': 0.006380358245223761},
 'post_export_metrics/false_omission_rate@0.30': {'doubleValue': 0.017785420641303062},
 'post_export_metrics/false_omission_rate@0.50': {'doubleValue': 0.031884875148534775},
 'post_export_metrics/false_omission_rate@0.70': {'doubleValue': 0.049842819571495056},
 'post_export_metrics/false_omission_rate@0.90': {'doubleValue': 0.07210345566272736},
 'post_export_metrics/false_positive_rate@0.10': {'doubleValue': 0.9258930683135986},
 'post_export_metrics/false_positive_rate@0.30': {'doubleValue': 0.5820144414901733},
 'post_export_metrics/false_positive_rate@0.50': {'doubleValue': 0.2828279137611389},
 'post_export_metrics/false_positive_rate@0.70': {'doubleValue': 0.09741979092359543},
 'post_export_metrics/false_positive_rate@0.90': {'doubleValue': 0.009732642211019993},
 'post_export_metrics/negative_rate@0.10': {'doubleValue': 0.06860170513391495},
 'post_export_metrics/negative_rate@0.30': {'doubleValue': 0.3914273977279663},
 'post_export_metrics/negative_rate@0.50': {'doubleValue': 0.6813851594924927},
 'post_export_metrics/negative_rate@0.70': {'doubleValue': 0.8737488985061646},
 'post_export_metrics/negative_rate@0.90': {'doubleValue': 0.9816330671310425},
 'post_export_metrics/positive_rate@0.10': {'doubleValue': 0.9313982725143433},
 'post_export_metrics/positive_rate@0.30': {'doubleValue': 0.6085726022720337},
 'post_export_metrics/positive_rate@0.50': {'doubleValue': 0.3186148703098297},
 'post_export_metrics/positive_rate@0.70': {'doubleValue': 0.12625113129615784},
 'post_export_metrics/positive_rate@0.90': {'doubleValue': 0.018366923555731773},
 'post_export_metrics/true_negative_rate@0.10': {'doubleValue': 0.07410692423582077},
 'post_export_metrics/true_negative_rate@0.30': {'doubleValue': 0.41798558831214905},
 'post_export_metrics/true_negative_rate@0.50': {'doubleValue': 0.7171720862388611},
 'post_export_metrics/true_negative_rate@0.70': {'doubleValue': 0.902580201625824},
 'post_export_metrics/true_negative_rate@0.90': {'doubleValue': 0.9902673363685608},
 'post_export_metrics/true_positive_rate@0.10': {'doubleValue': 0.9945419430732727},
 'post_export_metrics/true_positive_rate@0.30': {'doubleValue': 0.913189172744751},
 'post_export_metrics/true_positive_rate@0.50': {'doubleValue': 0.7290831804275513},
 'post_export_metrics/true_positive_rate@0.70': {'doubleValue': 0.45694002509117126},
 'post_export_metrics/true_positive_rate@0.90': {'doubleValue': 0.1174001693725586},
 'precision': {'doubleValue': 0.18350693583488464},
 'prediction/mean': {'doubleValue': 0.40113359689712524},
 'recall': {'doubleValue': 0.7290831804275513}}

Heterosexual metric values:
{'accuracy': {'doubleValue': 0.5304877758026123},
 'accuracy_baseline': {'doubleValue': 0.7601625919342041},
 'auc': {'doubleValue': 0.666443943977356},
 'auc_precision_recall': {'doubleValue': 0.40453678369522095},
 'average_loss': {'doubleValue': 0.8366135358810425},
 'label/mean': {'doubleValue': 0.2398373931646347},
 'post_export_metrics/example_count': {'doubleValue': 492.0},
 'post_export_metrics/fairness/confusion_matrix_at_thresholds': {'confusionMatrixAtThresholds': {'matrices': [{'boundedFalseNegatives': {'value': 0.0},
                                                                                                               'boundedFalsePositives': {'value': 363.0},
                                                                                                               'boundedPrecision': {'value': 0.24532224237918854},
                                                                                                               'boundedRecall': {'value': 1.0},
                                                                                                               'boundedTrueNegatives': {'value': 11.0},
                                                                                                               'boundedTruePositives': {'value': 118.0},
                                                                                                               'falsePositives': 363.0,
                                                                                                               'precision': 0.24532224237918854,
                                                                                                               'recall': 1.0,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 0.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 363.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.24532224237918854},
                                                                                                               'tDistributionRecall': {'unsampledValue': 1.0},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 11.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 118.0},
                                                                                                               'threshold': 0.10000000149011612,
                                                                                                               'trueNegatives': 11.0,
                                                                                                               'truePositives': 118.0},
                                                                                                              {'boundedFalseNegatives': {'value': 10.0},
                                                                                                               'boundedFalsePositives': {'value': 287.0},
                                                                                                               'boundedPrecision': {'value': 0.27341771125793457},
                                                                                                               'boundedRecall': {'value': 0.9152542352676392},
                                                                                                               'boundedTrueNegatives': {'value': 87.0},
                                                                                                               'boundedTruePositives': {'value': 108.0},
                                                                                                               'falseNegatives': 10.0,
                                                                                                               'falsePositives': 287.0,
                                                                                                               'precision': 0.27341771125793457,
                                                                                                               'recall': 0.9152542352676392,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 10.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 287.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.27341771125793457},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.9152542352676392},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 87.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 108.0},
                                                                                                               'threshold': 0.30000001192092896,
                                                                                                               'trueNegatives': 87.0,
                                                                                                               'truePositives': 108.0},
                                                                                                              {'boundedFalseNegatives': {'value': 34.0},
                                                                                                               'boundedFalsePositives': {'value': 197.0},
                                                                                                               'boundedPrecision': {'value': 0.29893237352371216},
                                                                                                               'boundedRecall': {'value': 0.7118644118309021},
                                                                                                               'boundedTrueNegatives': {'value': 177.0},
                                                                                                               'boundedTruePositives': {'value': 84.0},
                                                                                                               'falseNegatives': 34.0,
                                                                                                               'falsePositives': 197.0,
                                                                                                               'precision': 0.29893237352371216,
                                                                                                               'recall': 0.7118644118309021,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 34.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 197.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.29893237352371216},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.7118644118309021},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 177.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 84.0},
                                                                                                               'threshold': 0.5,
                                                                                                               'trueNegatives': 177.0,
                                                                                                               'truePositives': 84.0},
                                                                                                              {'boundedFalseNegatives': {'value': 56.0},
                                                                                                               'boundedFalsePositives': {'value': 116.0},
                                                                                                               'boundedPrecision': {'value': 0.3483146131038666},
                                                                                                               'boundedRecall': {'value': 0.5254237055778503},
                                                                                                               'boundedTrueNegatives': {'value': 258.0},
                                                                                                               'boundedTruePositives': {'value': 62.0},
                                                                                                               'falseNegatives': 56.0,
                                                                                                               'falsePositives': 116.0,
                                                                                                               'precision': 0.3483146131038666,
                                                                                                               'recall': 0.5254237055778503,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 56.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 116.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.3483146131038666},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.5254237055778503},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 258.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 62.0},
                                                                                                               'threshold': 0.699999988079071,
                                                                                                               'trueNegatives': 258.0,
                                                                                                               'truePositives': 62.0},
                                                                                                              {'boundedFalseNegatives': {'value': 96.0},
                                                                                                               'boundedFalsePositives': {'value': 17.0},
                                                                                                               'boundedPrecision': {'value': 0.5641025900840759},
                                                                                                               'boundedRecall': {'value': 0.18644067645072937},
                                                                                                               'boundedTrueNegatives': {'value': 357.0},
                                                                                                               'boundedTruePositives': {'value': 22.0},
                                                                                                               'falseNegatives': 96.0,
                                                                                                               'falsePositives': 17.0,
                                                                                                               'precision': 0.5641025900840759,
                                                                                                               'recall': 0.18644067645072937,
                                                                                                               'tDistributionFalseNegatives': {'unsampledValue': 96.0},
                                                                                                               'tDistributionFalsePositives': {'unsampledValue': 17.0},
                                                                                                               'tDistributionPrecision': {'unsampledValue': 0.5641025900840759},
                                                                                                               'tDistributionRecall': {'unsampledValue': 0.18644067645072937},
                                                                                                               'tDistributionTrueNegatives': {'unsampledValue': 357.0},
                                                                                                               'tDistributionTruePositives': {'unsampledValue': 22.0},
                                                                                                               'threshold': 0.8999999761581421,
                                                                                                               'trueNegatives': 357.0,
                                                                                                               'truePositives': 22.0}]}},
 'post_export_metrics/false_discovery_rate@0.10': {'doubleValue': 0.7546777725219727},
 'post_export_metrics/false_discovery_rate@0.30': {'doubleValue': 0.7265822887420654},
 'post_export_metrics/false_discovery_rate@0.50': {'doubleValue': 0.7010676264762878},
 'post_export_metrics/false_discovery_rate@0.70': {'doubleValue': 0.6516854166984558},
 'post_export_metrics/false_discovery_rate@0.90': {'doubleValue': 0.43589743971824646},
 'post_export_metrics/false_negative_rate@0.10': {'doubleValue': 0.0},
 'post_export_metrics/false_negative_rate@0.30': {'doubleValue': 0.08474576473236084},
 'post_export_metrics/false_negative_rate@0.50': {'doubleValue': 0.2881355881690979},
 'post_export_metrics/false_negative_rate@0.70': {'doubleValue': 0.47457626461982727},
 'post_export_metrics/false_negative_rate@0.90': {'doubleValue': 0.8135592937469482},
 'post_export_metrics/false_omission_rate@0.10': {'doubleValue': 0.0},
 'post_export_metrics/false_omission_rate@0.30': {'doubleValue': 0.10309278219938278},
 'post_export_metrics/false_omission_rate@0.50': {'doubleValue': 0.1611374467611313},
 'post_export_metrics/false_omission_rate@0.70': {'doubleValue': 0.17834395170211792},
 'post_export_metrics/false_omission_rate@0.90': {'doubleValue': 0.21192052960395813},
 'post_export_metrics/false_positive_rate@0.10': {'doubleValue': 0.970588207244873},
 'post_export_metrics/false_positive_rate@0.30': {'doubleValue': 0.7673797011375427},
 'post_export_metrics/false_positive_rate@0.50': {'doubleValue': 0.5267379879951477},
 'post_export_metrics/false_positive_rate@0.70': {'doubleValue': 0.31016042828559875},
 'post_export_metrics/false_positive_rate@0.90': {'doubleValue': 0.04545454680919647},
 'post_export_metrics/negative_rate@0.10': {'doubleValue': 0.022357722744345665},
 'post_export_metrics/negative_rate@0.30': {'doubleValue': 0.1971544772386551},
 'post_export_metrics/negative_rate@0.50': {'doubleValue': 0.4288617968559265},
 'post_export_metrics/negative_rate@0.70': {'doubleValue': 0.6382113695144653},
 'post_export_metrics/negative_rate@0.90': {'doubleValue': 0.9207317233085632},
 'post_export_metrics/positive_rate@0.10': {'doubleValue': 0.977642297744751},
 'post_export_metrics/positive_rate@0.30': {'doubleValue': 0.8028455376625061},
 'post_export_metrics/positive_rate@0.50': {'doubleValue': 0.5711382031440735},
 'post_export_metrics/positive_rate@0.70': {'doubleValue': 0.36178863048553467},
 'post_export_metrics/positive_rate@0.90': {'doubleValue': 0.07926829159259796},
 'post_export_metrics/true_negative_rate@0.10': {'doubleValue': 0.029411764815449715},
 'post_export_metrics/true_negative_rate@0.30': {'doubleValue': 0.23262031376361847},
 'post_export_metrics/true_negative_rate@0.50': {'doubleValue': 0.4732620418071747},
 'post_export_metrics/true_negative_rate@0.70': {'doubleValue': 0.6898396015167236},
 'post_export_metrics/true_negative_rate@0.90': {'doubleValue': 0.9545454382896423},
 'post_export_metrics/true_positive_rate@0.10': {'doubleValue': 1.0},
 'post_export_metrics/true_positive_rate@0.30': {'doubleValue': 0.9152542352676392},
 'post_export_metrics/true_positive_rate@0.50': {'doubleValue': 0.7118644118309021},
 'post_export_metrics/true_positive_rate@0.70': {'doubleValue': 0.5254237055778503},
 'post_export_metrics/true_positive_rate@0.90': {'doubleValue': 0.18644067645072937},
 'precision': {'doubleValue': 0.29893237352371216},
 'prediction/mean': {'doubleValue': 0.5596572160720825},
 'recall': {'doubleValue': 0.7118644118309021}}
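
Several of the rates in the output above come in complementary pairs, which gives a quick way to sanity-check a slice's metrics. The following is a minimal sketch, not part of the original notebook: it copies the threshold-0.5 values from the output above into a plain dictionary (the names `m`, `complements`, and `checks` are hypothetical) and verifies that each pair sums to 1 within float32 rounding. In this output the top-level `precision` and `recall` also match the rates at threshold 0.5, so they are included in the check.

```python
import math

# Values copied from the slice output above; nothing here calls TFMA.
m = {
    'precision': 0.29893237352371216,
    'recall': 0.7118644118309021,
    'post_export_metrics/false_discovery_rate@0.50': 0.7010676264762878,
    'post_export_metrics/false_negative_rate@0.50': 0.2881355881690979,
    'post_export_metrics/false_positive_rate@0.50': 0.5267379879951477,
    'post_export_metrics/true_negative_rate@0.50': 0.4732620418071747,
    'post_export_metrics/true_positive_rate@0.50': 0.7118644118309021,
    'post_export_metrics/negative_rate@0.50': 0.4288617968559265,
    'post_export_metrics/positive_rate@0.50': 0.5711382031440735,
}


def complements(a, b, tol=1e-6):
    """Return True when two rates sum to 1 within a small tolerance."""
    return math.isclose(a + b, 1.0, abs_tol=tol)


prefix = 'post_export_metrics/'
checks = {
    'TPR + FNR = 1': complements(m[prefix + 'true_positive_rate@0.50'],
                                 m[prefix + 'false_negative_rate@0.50']),
    'TNR + FPR = 1': complements(m[prefix + 'true_negative_rate@0.50'],
                                 m[prefix + 'false_positive_rate@0.50']),
    'positive_rate + negative_rate = 1': complements(
        m[prefix + 'positive_rate@0.50'], m[prefix + 'negative_rate@0.50']),
    'precision + FDR = 1': complements(
        m['precision'], m[prefix + 'false_discovery_rate@0.50']),
    'recall = TPR': math.isclose(
        m['recall'], m[prefix + 'true_positive_rate@0.50'], abs_tol=1e-6),
}

for name, ok in checks.items():
    print(f'{name}: {"ok" if ok else "MISMATCH"}')
```

The tolerance is an assumption chosen to absorb the float32 rounding visible in the printed values; a mismatch would suggest that the metrics were copied from different slices or thresholds.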

Use get_metrics_for_all_slices() to get the metrics for all slices as a dictionary that maps each slice to the corresponding metrics dictionary, that is, the same dictionary that get_metrics_for_slice() returns for that slice.
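
Because the result is keyed by slice tuples, one metric can be pulled out across all slices with an ordinary dictionary walk. The following is a minimal sketch, assuming the shape shown in the output below: the empty tuple `()` is the overall slice, and each value wraps its number in a `doubleValue` entry. The `all_slices` dictionary is a hand-built stand-in with a few values copied from that output, and `slice_name` and `metric_by_slice` are hypothetical helpers, not part of TFMA.

```python
# Hand-built stand-in for eval_result.get_metrics_for_all_slices();
# values are copied from the notebook output and trimmed to three metrics.
all_slices = {
    (): {
        'accuracy': {'doubleValue': 0.7181273102760315},
        'auc': {'doubleValue': 0.7963067293167114},
        'post_export_metrics/example_count': {'doubleValue': 721950.0},
    },
    (('sexual_orientation', 'bisexual'),): {
        'accuracy': {'doubleValue': 0.5431034564971924},
        'auc': {'doubleValue': 0.6243571639060974},
        'post_export_metrics/example_count': {'doubleValue': 116.0},
    },
}


def slice_name(slice_key):
    """Render a slice key as text; the empty tuple is the overall slice."""
    if not slice_key:
        return 'Overall'
    return ', '.join(f'{feature}={value}' for feature, value in slice_key)


def metric_by_slice(metrics_by_slice, metric_name):
    """Map each slice name to one metric's doubleValue, skipping slices without it."""
    return {
        slice_name(key): metrics[metric_name]['doubleValue']
        for key, metrics in metrics_by_slice.items()
        if 'doubleValue' in metrics.get(metric_name, {})
    }


auc_by_slice = metric_by_slice(all_slices, 'auc')
for name, value in auc_by_slice.items():
    print(f'{name}: auc={value:.4f}')
```

Reading `post_export_metrics/example_count` the same way is a useful companion check: the bisexual slice here has only 116 examples against 721,950 overall, so its metrics carry far more uncertainty.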

pp.pprint(eval_result.get_metrics_for_all_slices())
{(): {'accuracy': {'doubleValue': 0.7181273102760315},
      'accuracy_baseline': {'doubleValue': 0.9198060631752014},
      'auc': {'doubleValue': 0.7963067293167114},
      'auc_precision_recall': {'doubleValue': 0.3019316792488098},
      'average_loss': {'doubleValue': 0.5641244053840637},
      'label/mean': {'doubleValue': 0.08019392192363739},
      'post_export_metrics/example_count': {'doubleValue': 721950.0},
      'post_export_metrics/fairness/confusion_matrix_at_thresholds': {'confusionMatrixAtThresholds': {'matrices': [{'boundedFalseNegatives': {'value': 316.0},
                                                                                                                    'boundedFalsePositives': {'value': 614843.0},
                                                                                                                    'boundedPrecision': {'value': 0.0856306254863739},
                                                                                                                    'boundedRecall': {'value': 0.9945419430732727},
                                                                                                                    'boundedTrueNegatives': {'value': 49211.0},
                                                                                                                    'boundedTruePositives': {'value': 57580.0},
                                                                                                                    'falseNegatives': 316.0,
                                                                                                                    'falsePositives': 614843.0,
                                                                                                                    'precision': 0.0856306254863739,
                                                                                                                    'recall': 0.9945419430732727,
                                                                                                                    'tDistributionFalseNegatives': {'unsampledValue': 316.0},
                                                                                                                    'tDistributionFalsePositives': {'unsampledValue': 614843.0},
                                                                                                                    'tDistributionPrecision': {'unsampledValue': 0.0856306254863739},
                                                                                                                    'tDistributionRecall': {'unsampledValue': 0.9945419430732727},
                                                                                                                    'tDistributionTrueNegatives': {'unsampledValue': 49211.0},
                                                                                                                    'tDistributionTruePositives': {'unsampledValue': 57580.0},
                                                                                                                    'threshold': 0.10000000149011612,
                                                                                                                    'trueNegatives': 49211.0,
                                                                                                                    'truePositives': 57580.0},
                                                                                                                   {'boundedFalseNegatives': {'value': 5026.0},
                                                                                                                    'boundedFalsePositives': {'value': 386489.0},
                                                                                                                    'boundedPrecision': {'value': 0.12033439427614212},
                                                                                                                    'boundedRecall': {'value': 0.913189172744751},
                                                                                                                    'boundedTrueNegatives': {'value': 277565.0},
                                                                                                                    'boundedTruePositives': {'value': 52870.0},
                                                                                                                    'falseNegatives': 5026.0,
                                                                                                                    'falsePositives': 386489.0,
                                                                                                                    'precision': 0.12033439427614212,
                                                                                                                    'recall': 0.913189172744751,
                                                                                                                    'tDistributionFalseNegatives': {'unsampledValue': 5026.0},
                                                                                                                    'tDistributionFalsePositives': {'unsampledValue': 386489.0},
                                                                                                                    'tDistributionPrecision': {'unsampledValue': 0.12033439427614212},
                                                                                                                    'tDistributionRecall': {'unsampledValue': 0.913189172744751},
                                                                                                                    'tDistributionTrueNegatives': {'unsampledValue': 277565.0},
                                                                                                                    'tDistributionTruePositives': {'unsampledValue': 52870.0},
                                                                                                                    'threshold': 0.30000001192092896,
                                                                                                                    'trueNegatives': 277565.0,
                                                                                                                    'truePositives': 52870.0},
                                                                                                                   {'boundedFalseNegatives': {'value': 15685.0},
                                                                                                                    'boundedFalsePositives': {'value': 187813.0},
                                                                                                                    'boundedPrecision': {'value': 0.18350693583488464},
                                                                                                                    'boundedRecall': {'value': 0.7290831804275513},
                                                                                                                    'boundedTrueNegatives': {'value': 476241.0},
                                                                                                                    'boundedTruePositives': {'value': 42211.0},
                                                                                                                    'falseNegatives': 15685.0,
                                                                                                                    'falsePositives': 187813.0,
                                                                                                                    'precision': 0.18350693583488464,
                                                                                                                    'recall': 0.7290831804275513,
                                                                                                                    'tDistributionFalseNegatives': {'unsampledValue': 15685.0},
                                                                                                                    'tDistributionFalsePositives': {'unsampledValue': 187813.0},
                                                                                                                    'tDistributionPrecision': {'unsampledValue': 0.18350693583488464},
                                                                                                                    'tDistributionRecall': {'unsampledValue': 0.7290831804275513},
                                                                                                                    'tDistributionTrueNegatives': {'unsampledValue': 476241.0},
                                                                                                                    'tDistributionTruePositives': {'unsampledValue': 42211.0},
                                                                                                                    'threshold': 0.5,
                                                                                                                    'trueNegatives': 476241.0,
                                                                                                                    'truePositives': 42211.0},
                                                                                                                   {'boundedFalseNegatives': {'value': 31441.0},
                                                                                                                    'boundedFalsePositives': {'value': 64692.0},
                                                                                                                    'boundedPrecision': {'value': 0.2902454137802124},
                                                                                                                    'boundedRecall': {'value': 0.45694002509117126},
                                                                                                                    'boundedTrueNegatives': {'value': 599362.0},
                                                                                                                    'boundedTruePositives': {'value': 26455.0},
                                                                                                                    'falseNegatives': 31441.0,
                                                                                                                    'falsePositives': 64692.0,
                                                                                                                    'precision': 0.2902454137802124,
                                                                                                                    'recall': 0.45694002509117126,
                                                                                                                    'tDistributionFalseNegatives': {'unsampledValue': 31441.0},
                                                                                                                    'tDistributionFalsePositives': {'unsampledValue': 64692.0},
                                                                                                                    'tDistributionPrecision': {'unsampledValue': 0.2902454137802124},
                                                                                                                    'tDistributionRecall': {'unsampledValue': 0.45694002509117126},
                                                                                                                    'tDistributionTrueNegatives': {'unsampledValue': 599362.0},
                                                                                                                    'tDistributionTruePositives': {'unsampledValue': 26455.0},
                                                                                                                    'threshold': 0.699999988079071,
                                                                                                                    'trueNegatives': 599362.0,
                                                                                                                    'truePositives': 26455.0},
                                                                                                                   {'boundedFalseNegatives': {'value': 51099.0},
                                                                                                                    'boundedFalsePositives': {'value': 6463.0},
                                                                                                                    'boundedPrecision': {'value': 0.5125942826271057},
                                                                                                                    'boundedRecall': {'value': 0.1174001693725586},
                                                                                                                    'boundedTrueNegatives': {'value': 657591.0},
                                                                                                                    'boundedTruePositives': {'value': 6797.0},
                                                                                                                    'falseNegatives': 51099.0,
                                                                                                                    'falsePositives': 6463.0,
                                                                                                                    'precision': 0.5125942826271057,
                                                                                                                    'recall': 0.1174001693725586,
                                                                                                                    'tDistributionFalseNegatives': {'unsampledValue': 51099.0},
                                                                                                                    'tDistributionFalsePositives': {'unsampledValue': 6463.0},
                                                                                                                    'tDistributionPrecision': {'unsampledValue': 0.5125942826271057},
                                                                                                                    'tDistributionRecall': {'unsampledValue': 0.1174001693725586},
                                                                                                                    'tDistributionTrueNegatives': {'unsampledValue': 657591.0},
                                                                                                                    'tDistributionTruePositives': {'unsampledValue': 6797.0},
                                                                                                                    'threshold': 0.8999999761581421,
                                                                                                                    'trueNegatives': 657591.0,
                                                                                                                    'truePositives': 6797.0}]} },
      'post_export_metrics/false_discovery_rate@0.10': {'doubleValue': 0.9143694043159485},
      'post_export_metrics/false_discovery_rate@0.30': {'doubleValue': 0.8796656131744385},
      'post_export_metrics/false_discovery_rate@0.50': {'doubleValue': 0.816493034362793},
      'post_export_metrics/false_discovery_rate@0.70': {'doubleValue': 0.7097545862197876},
      'post_export_metrics/false_discovery_rate@0.90': {'doubleValue': 0.4874057173728943},
      'post_export_metrics/false_negative_rate@0.10': {'doubleValue': 0.005458062514662743},
      'post_export_metrics/false_negative_rate@0.30': {'doubleValue': 0.08681083470582962},
      'post_export_metrics/false_negative_rate@0.50': {'doubleValue': 0.27091681957244873},
      'post_export_metrics/false_negative_rate@0.70': {'doubleValue': 0.5430599451065063},
      'post_export_metrics/false_negative_rate@0.90': {'doubleValue': 0.8825998306274414},
      'post_export_metrics/false_omission_rate@0.10': {'doubleValue': 0.006380358245223761},
      'post_export_metrics/false_omission_rate@0.30': {'doubleValue': 0.017785420641303062},
      'post_export_metrics/false_omission_rate@0.50': {'doubleValue': 0.031884875148534775},
      'post_export_metrics/false_omission_rate@0.70': {'doubleValue': 0.049842819571495056},
      'post_export_metrics/false_omission_rate@0.90': {'doubleValue': 0.07210345566272736},
      'post_export_metrics/false_positive_rate@0.10': {'doubleValue': 0.9258930683135986},
      'post_export_metrics/false_positive_rate@0.30': {'doubleValue': 0.5820144414901733},
      'post_export_metrics/false_positive_rate@0.50': {'doubleValue': 0.2828279137611389},
      'post_export_metrics/false_positive_rate@0.70': {'doubleValue': 0.09741979092359543},
      'post_export_metrics/false_positive_rate@0.90': {'doubleValue': 0.009732642211019993},
      'post_export_metrics/negative_rate@0.10': {'doubleValue': 0.06860170513391495},
      'post_export_metrics/negative_rate@0.30': {'doubleValue': 0.3914273977279663},
      'post_export_metrics/negative_rate@0.50': {'doubleValue': 0.6813851594924927},
      'post_export_metrics/negative_rate@0.70': {'doubleValue': 0.8737488985061646},
      'post_export_metrics/negative_rate@0.90': {'doubleValue': 0.9816330671310425},
      'post_export_metrics/positive_rate@0.10': {'doubleValue': 0.9313982725143433},
      'post_export_metrics/positive_rate@0.30': {'doubleValue': 0.6085726022720337},
      'post_export_metrics/positive_rate@0.50': {'doubleValue': 0.3186148703098297},
      'post_export_metrics/positive_rate@0.70': {'doubleValue': 0.12625113129615784},
      'post_export_metrics/positive_rate@0.90': {'doubleValue': 0.018366923555731773},
      'post_export_metrics/true_negative_rate@0.10': {'doubleValue': 0.07410692423582077},
      'post_export_metrics/true_negative_rate@0.30': {'doubleValue': 0.41798558831214905},
      'post_export_metrics/true_negative_rate@0.50': {'doubleValue': 0.7171720862388611},
      'post_export_metrics/true_negative_rate@0.70': {'doubleValue': 0.902580201625824},
      'post_export_metrics/true_negative_rate@0.90': {'doubleValue': 0.9902673363685608},
      'post_export_metrics/true_positive_rate@0.10': {'doubleValue': 0.9945419430732727},
      'post_export_metrics/true_positive_rate@0.30': {'doubleValue': 0.913189172744751},
      'post_export_metrics/true_positive_rate@0.50': {'doubleValue': 0.7290831804275513},
      'post_export_metrics/true_positive_rate@0.70': {'doubleValue': 0.45694002509117126},
      'post_export_metrics/true_positive_rate@0.90': {'doubleValue': 0.1174001693725586},
      'precision': {'doubleValue': 0.18350693583488464},
      'prediction/mean': {'doubleValue': 0.40113359689712524},
      'recall': {'doubleValue': 0.7290831804275513}},
 (('sexual_orientation', 'bisexual'),): {'accuracy': {'doubleValue': 0.5431034564971924},
                                         'accuracy_baseline': {'doubleValue': 0.8017241358757019},
                                         'auc': {'doubleValue': 0.6243571639060974},
                                         'auc_precision_recall': {'doubleValue': 0.303098201751709},
                                         'average_loss': {'doubleValue': 0.7430298328399658},
                                         'label/mean': {'doubleValue': 0.1982758641242981},
                                         'post_export_metrics/example_count': {'doubleValue': 116.0},
                                         'post_export_metrics/fairness/confusion_matrix_at_thresholds': {'confusionMatrixAtThresholds': {'matrices': [{'boundedFalseNegatives': {'value': 0.0},
                                                                                                                                                       'boundedFalsePositives': {'value': 85.0},
                                                                                                                                                       'boundedPrecision': {'value': 0.21296297013759613},
                                                                                                                                                       'boundedRecall': {'value': 1.0},
                                                                                                                                                       'boundedTrueNegatives': {'value': 8.0},
                                                                                                                                                       'boundedTruePositives': {'value': 23.0},
                                                                                                                                                       'falsePositives': 85.0,
                                                                                                                                                       'precision': 0.21296297013759613,
                                                                                                                                                       'recall': 1.0,
                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 0.0},
                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 85.0},
                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.21296297013759613},
                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 1.0},
                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 8.0},
                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 23.0},
                                                                                                                                                       'threshold': 0.10000000149011612,
                                                                                                                                                       'trueNegatives': 8.0,
                                                                                                                                                       'truePositives': 23.0},
                                                                                                                                                      {'boundedFalseNegatives': {'value': 4.0},
                                                                                                                                                       'boundedFalsePositives': {'value': 68.0},
                                                                                                                                                       'boundedPrecision': {'value': 0.2183908075094223},
                                                                                                                                                       'boundedRecall': {'value': 0.8260869383811951},
                                                                                                                                                       'boundedTrueNegatives': {'value': 25.0},
                                                                                                                                                       'boundedTruePositives': {'value': 19.0},
                                                                                                                                                       'falseNegatives': 4.0,
                                                                                                                                                       'falsePositives': 68.0,
                                                                                                                                                       'precision': 0.2183908075094223,
                                                                                                                                                       'recall': 0.8260869383811951,
                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 4.0},
                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 68.0},
                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.2183908075094223},
                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 0.8260869383811951},
                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 25.0},
                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 19.0},
                                                                                                                                                       'threshold': 0.30000001192092896,
                                                                                                                                                       'trueNegatives': 25.0,
                                                                                                                                                       'truePositives': 19.0},
                                                                                                                                                      {'boundedFalseNegatives': {'value': 9.0},
                                                                                                                                                       'boundedFalsePositives': {'value': 44.0},
                                                                                                                                                       'boundedPrecision': {'value': 0.24137930572032928},
                                                                                                                                                       'boundedRecall': {'value': 0.6086956262588501},
                                                                                                                                                       'boundedTrueNegatives': {'value': 49.0},
                                                                                                                                                       'boundedTruePositives': {'value': 14.0},
                                                                                                                                                       'falseNegatives': 9.0,
                                                                                                                                                       'falsePositives': 44.0,
                                                                                                                                                       'precision': 0.24137930572032928,
                                                                                                                                                       'recall': 0.6086956262588501,
                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 9.0},
                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 44.0},
                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.24137930572032928},
                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 0.6086956262588501},
                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 49.0},
                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 14.0},
                                                                                                                                                       'threshold': 0.5,
                                                                                                                                                       'trueNegatives': 49.0,
                                                                                                                                                       'truePositives': 14.0},
                                                                                                                                                      {'boundedFalseNegatives': {'value': 16.0},
                                                                                                                                                       'boundedFalsePositives': {'value': 18.0},
                                                                                                                                                       'boundedPrecision': {'value': 0.2800000011920929},
                                                                                                                                                       'boundedRecall': {'value': 0.30434781312942505},
                                                                                                                                                       'boundedTrueNegatives': {'value': 75.0},
                                                                                                                                                       'boundedTruePositives': {'value': 7.0},
                                                                                                                                                       'falseNegatives': 16.0,
                                                                                                                                                       'falsePositives': 18.0,
                                                                                                                                                       'precision': 0.2800000011920929,
                                                                                                                                                       'recall': 0.30434781312942505,
                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 16.0},
                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 18.0},
                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.2800000011920929},
                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 0.30434781312942505},
                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 75.0},
                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 7.0},
                                                                                                                                                       'threshold': 0.699999988079071,
                                                                                                                                                       'trueNegatives': 75.0,
                                                                                                                                                       'truePositives': 7.0},
                                                                                                                                                      {'boundedFalseNegatives': {'value': 22.0},
                                                                                                                                                       'boundedFalsePositives': {'value': 1.0},
                                                                                                                                                       'boundedPrecision': {'value': 0.5},
                                                                                                                                                       'boundedRecall': {'value': 0.043478261679410934},
                                                                                                                                                       'boundedTrueNegatives': {'value': 92.0},
                                                                                                                                                       'boundedTruePositives': {'value': 1.0},
                                                                                                                                                       'falseNegatives': 22.0,
                                                                                                                                                       'falsePositives': 1.0,
                                                                                                                                                       'precision': 0.5,
                                                                                                                                                       'recall': 0.043478261679410934,
                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 22.0},
                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 1.0},
                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.5},
                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 0.043478261679410934},
                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 92.0},
                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 1.0},
                                                                                                                                                       'threshold': 0.8999999761581421,
                                                                                                                                                       'trueNegatives': 92.0,
                                                                                                                                                       'truePositives': 1.0}]} },
                                         'post_export_metrics/false_discovery_rate@0.10': {'doubleValue': 0.7870370149612427},
                                         'post_export_metrics/false_discovery_rate@0.30': {'doubleValue': 0.7816091775894165},
                                         'post_export_metrics/false_discovery_rate@0.50': {'doubleValue': 0.7586206793785095},
                                         'post_export_metrics/false_discovery_rate@0.70': {'doubleValue': 0.7200000286102295},
                                         'post_export_metrics/false_discovery_rate@0.90': {'doubleValue': 0.5},
                                         'post_export_metrics/false_negative_rate@0.10': {'doubleValue': 0.0},
                                         'post_export_metrics/false_negative_rate@0.30': {'doubleValue': 0.17391304671764374},
                                         'post_export_metrics/false_negative_rate@0.50': {'doubleValue': 0.3913043439388275},
                                         'post_export_metrics/false_negative_rate@0.70': {'doubleValue': 0.695652186870575},
                                         'post_export_metrics/false_negative_rate@0.90': {'doubleValue': 0.95652174949646},
                                         'post_export_metrics/false_omission_rate@0.10': {'doubleValue': 0.0},
                                         'post_export_metrics/false_omission_rate@0.30': {'doubleValue': 0.13793103396892548},
                                         'post_export_metrics/false_omission_rate@0.50': {'doubleValue': 0.1551724076271057},
                                         'post_export_metrics/false_omission_rate@0.70': {'doubleValue': 0.17582418024539948},
                                         'post_export_metrics/false_omission_rate@0.90': {'doubleValue': 0.19298245012760162},
                                         'post_export_metrics/false_positive_rate@0.10': {'doubleValue': 0.9139785170555115},
                                         'post_export_metrics/false_positive_rate@0.30': {'doubleValue': 0.7311828136444092},
                                         'post_export_metrics/false_positive_rate@0.50': {'doubleValue': 0.47311827540397644},
                                         'post_export_metrics/false_positive_rate@0.70': {'doubleValue': 0.19354838132858276},
                                         'post_export_metrics/false_positive_rate@0.90': {'doubleValue': 0.01075268816202879},
                                         'post_export_metrics/negative_rate@0.10': {'doubleValue': 0.06896551698446274},
                                         'post_export_metrics/negative_rate@0.30': {'doubleValue': 0.25},
                                         'post_export_metrics/negative_rate@0.50': {'doubleValue': 0.5},
                                         'post_export_metrics/negative_rate@0.70': {'doubleValue': 0.7844827771186829},
                                         'post_export_metrics/negative_rate@0.90': {'doubleValue': 0.982758641242981},
                                         'post_export_metrics/positive_rate@0.10': {'doubleValue': 0.931034505367279},
                                         'post_export_metrics/positive_rate@0.30': {'doubleValue': 0.75},
                                         'post_export_metrics/positive_rate@0.50': {'doubleValue': 0.5},
                                         'post_export_metrics/positive_rate@0.70': {'doubleValue': 0.21551723778247833},
                                         'post_export_metrics/positive_rate@0.90': {'doubleValue': 0.017241379246115685},
                                         'post_export_metrics/true_negative_rate@0.10': {'doubleValue': 0.08602150529623032},
                                         'post_export_metrics/true_negative_rate@0.30': {'doubleValue': 0.2688172161579132},
                                         'post_export_metrics/true_negative_rate@0.50': {'doubleValue': 0.5268816947937012},
                                         'post_export_metrics/true_negative_rate@0.70': {'doubleValue': 0.8064516186714172},
                                         'post_export_metrics/true_negative_rate@0.90': {'doubleValue': 0.9892473220825195},
                                         'post_export_metrics/true_positive_rate@0.10': {'doubleValue': 1.0},
                                         'post_export_metrics/true_positive_rate@0.30': {'doubleValue': 0.8260869383811951},
                                         'post_export_metrics/true_positive_rate@0.50': {'doubleValue': 0.6086956262588501},
                                         'post_export_metrics/true_positive_rate@0.70': {'doubleValue': 0.30434781312942505},
                                         'post_export_metrics/true_positive_rate@0.90': {'doubleValue': 0.043478261679410934},
                                         'precision': {'doubleValue': 0.24137930572032928},
                                         'prediction/mean': {'doubleValue': 0.48801353573799133},
                                         'recall': {'doubleValue': 0.6086956262588501} },
 (('sexual_orientation', 'heterosexual'),): {'accuracy': {'doubleValue': 0.5304877758026123},
                                             'accuracy_baseline': {'doubleValue': 0.7601625919342041},
                                             'auc': {'doubleValue': 0.666443943977356},
                                             'auc_precision_recall': {'doubleValue': 0.40453678369522095},
                                             'average_loss': {'doubleValue': 0.8366135358810425},
                                             'label/mean': {'doubleValue': 0.2398373931646347},
                                             'post_export_metrics/example_count': {'doubleValue': 492.0},
                                             'post_export_metrics/fairness/confusion_matrix_at_thresholds': {'confusionMatrixAtThresholds': {'matrices': [{'boundedFalseNegatives': {'value': 0.0},
                                                                                                                                                           'boundedFalsePositives': {'value': 363.0},
                                                                                                                                                           'boundedPrecision': {'value': 0.24532224237918854},
                                                                                                                                                           'boundedRecall': {'value': 1.0},
                                                                                                                                                           'boundedTrueNegatives': {'value': 11.0},
                                                                                                                                                           'boundedTruePositives': {'value': 118.0},
                                                                                                                                                           'falsePositives': 363.0,
                                                                                                                                                           'precision': 0.24532224237918854,
                                                                                                                                                           'recall': 1.0,
                                                                                                                                                           'tDistributionFalseNegatives': {'unsampledValue': 0.0},
                                                                                                                                                           'tDistributionFalsePositives': {'unsampledValue': 363.0},
                                                                                                                                                           'tDistributionPrecision': {'unsampledValue': 0.24532224237918854},
                                                                                                                                                           'tDistributionRecall': {'unsampledValue': 1.0},
                                                                                                                                                           'tDistributionTrueNegatives': {'unsampledValue': 11.0},
                                                                                                                                                           'tDistributionTruePositives': {'unsampledValue': 118.0},
                                                                                                                                                           'threshold': 0.10000000149011612,
                                                                                                                                                           'trueNegatives': 11.0,
                                                                                                                                                           'truePositives': 118.0},
                                                                                                                                                          {'boundedFalseNegatives': {'value': 10.0},
                                                                                                                                                           'boundedFalsePositives': {'value': 287.0},
                                                                                                                                                           'boundedPrecision': {'value': 0.27341771125793457},
                                                                                                                                                           'boundedRecall': {'value': 0.9152542352676392},
                                                                                                                                                           'boundedTrueNegatives': {'value': 87.0},
                                                                                                                                                           'boundedTruePositives': {'value': 108.0},
                                                                                                                                                           'falseNegatives': 10.0,
                                                                                                                                                           'falsePositives': 287.0,
                                                                                                                                                           'precision': 0.27341771125793457,
                                                                                                                                                           'recall': 0.9152542352676392,
                                                                                                                                                           'tDistributionFalseNegatives': {'unsampledValue': 10.0},
                                                                                                                                                           'tDistributionFalsePositives': {'unsampledValue': 287.0},
                                                                                                                                                           'tDistributionPrecision': {'unsampledValue': 0.27341771125793457},
                                                                                                                                                           'tDistributionRecall': {'unsampledValue': 0.9152542352676392},
                                                                                                                                                           'tDistributionTrueNegatives': {'unsampledValue': 87.0},
                                                                                                                                                           'tDistributionTruePositives': {'unsampledValue': 108.0},
                                                                                                                                                           'threshold': 0.30000001192092896,
                                                                                                                                                           'trueNegatives': 87.0,
                                                                                                                                                           'truePositives': 108.0},
                                                                                                                                                          {'boundedFalseNegatives': {'value': 34.0},
                                                                                                                                                           'boundedFalsePositives': {'value': 197.0},
                                                                                                                                                           'boundedPrecision': {'value': 0.29893237352371216},
                                                                                                                                                           'boundedRecall': {'value': 0.7118644118309021},
                                                                                                                                                           'boundedTrueNegatives': {'value': 177.0},
                                                                                                                                                           'boundedTruePositives': {'value': 84.0},
                                                                                                                                                           'falseNegatives': 34.0,
                                                                                                                                                           'falsePositives': 197.0,
                                                                                                                                                           'precision': 0.29893237352371216,
                                                                                                                                                           'recall': 0.7118644118309021,
                                                                                                                                                           'tDistributionFalseNegatives': {'unsampledValue': 34.0},
                                                                                                                                                           'tDistributionFalsePositives': {'unsampledValue': 197.0},
                                                                                                                                                           'tDistributionPrecision': {'unsampledValue': 0.29893237352371216},
                                                                                                                                                           'tDistributionRecall': {'unsampledValue': 0.7118644118309021},
                                                                                                                                                           'tDistributionTrueNegatives': {'unsampledValue': 177.0},
                                                                                                                                                           'tDistributionTruePositives': {'unsampledValue': 84.0},
                                                                                                                                                           'threshold': 0.5,
                                                                                                                                                           'trueNegatives': 177.0,
                                                                                                                                                           'truePositives': 84.0},
                                                                                                                                                          {'boundedFalseNegatives': {'value': 56.0},
                                                                                                                                                           'boundedFalsePositives': {'value': 116.0},
                                                                                                                                                           'boundedPrecision': {'value': 0.3483146131038666},
                                                                                                                                                           'boundedRecall': {'value': 0.5254237055778503},
                                                                                                                                                           'boundedTrueNegatives': {'value': 258.0},
                                                                                                                                                           'boundedTruePositives': {'value': 62.0},
                                                                                                                                                           'falseNegatives': 56.0,
                                                                                                                                                           'falsePositives': 116.0,
                                                                                                                                                           'precision': 0.3483146131038666,
                                                                                                                                                           'recall': 0.5254237055778503,
                                                                                                                                                           'tDistributionFalseNegatives': {'unsampledValue': 56.0},
                                                                                                                                                           'tDistributionFalsePositives': {'unsampledValue': 116.0},
                                                                                                                                                           'tDistributionPrecision': {'unsampledValue': 0.3483146131038666},
                                                                                                                                                           'tDistributionRecall': {'unsampledValue': 0.5254237055778503},
                                                                                                                                                           'tDistributionTrueNegatives': {'unsampledValue': 258.0},
                                                                                                                                                           'tDistributionTruePositives': {'unsampledValue': 62.0},
                                                                                                                                                           'threshold': 0.699999988079071,
                                                                                                                                                           'trueNegatives': 258.0,
                                                                                                                                                           'truePositives': 62.0},
                                                                                                                                                          {'boundedFalseNegatives': {'value': 96.0},
                                                                                                                                                           'boundedFalsePositives': {'value': 17.0},
                                                                                                                                                           'boundedPrecision': {'value': 0.5641025900840759},
                                                                                                                                                           'boundedRecall': {'value': 0.18644067645072937},
                                                                                                                                                           'boundedTrueNegatives': {'value': 357.0},
                                                                                                                                                           'boundedTruePositives': {'value': 22.0},
                                                                                                                                                           'falseNegatives': 96.0,
                                                                                                                                                           'falsePositives': 17.0,
                                                                                                                                                           'precision': 0.5641025900840759,
                                                                                                                                                           'recall': 0.18644067645072937,
                                                                                                                                                           'tDistributionFalseNegatives': {'unsampledValue': 96.0},
                                                                                                                                                           'tDistributionFalsePositives': {'unsampledValue': 17.0},
                                                                                                                                                           'tDistributionPrecision': {'unsampledValue': 0.5641025900840759},
                                                                                                                                                           'tDistributionRecall': {'unsampledValue': 0.18644067645072937},
                                                                                                                                                           'tDistributionTrueNegatives': {'unsampledValue': 357.0},
                                                                                                                                                           'tDistributionTruePositives': {'unsampledValue': 22.0},
                                                                                                                                                           'threshold': 0.8999999761581421,
                                                                                                                                                           'trueNegatives': 357.0,
                                                                                                                                                           'truePositives': 22.0}]}},
                                             'post_export_metrics/false_discovery_rate@0.10': {'doubleValue': 0.7546777725219727},
                                             'post_export_metrics/false_discovery_rate@0.30': {'doubleValue': 0.7265822887420654},
                                             'post_export_metrics/false_discovery_rate@0.50': {'doubleValue': 0.7010676264762878},
                                             'post_export_metrics/false_discovery_rate@0.70': {'doubleValue': 0.6516854166984558},
                                             'post_export_metrics/false_discovery_rate@0.90': {'doubleValue': 0.43589743971824646},
                                             'post_export_metrics/false_negative_rate@0.10': {'doubleValue': 0.0},
                                             'post_export_metrics/false_negative_rate@0.30': {'doubleValue': 0.08474576473236084},
                                             'post_export_metrics/false_negative_rate@0.50': {'doubleValue': 0.2881355881690979},
                                             'post_export_metrics/false_negative_rate@0.70': {'doubleValue': 0.47457626461982727},
                                             'post_export_metrics/false_negative_rate@0.90': {'doubleValue': 0.8135592937469482},
                                             'post_export_metrics/false_omission_rate@0.10': {'doubleValue': 0.0},
                                             'post_export_metrics/false_omission_rate@0.30': {'doubleValue': 0.10309278219938278},
                                             'post_export_metrics/false_omission_rate@0.50': {'doubleValue': 0.1611374467611313},
                                             'post_export_metrics/false_omission_rate@0.70': {'doubleValue': 0.17834395170211792},
                                             'post_export_metrics/false_omission_rate@0.90': {'doubleValue': 0.21192052960395813},
                                             'post_export_metrics/false_positive_rate@0.10': {'doubleValue': 0.970588207244873},
                                             'post_export_metrics/false_positive_rate@0.30': {'doubleValue': 0.7673797011375427},
                                             'post_export_metrics/false_positive_rate@0.50': {'doubleValue': 0.5267379879951477},
                                             'post_export_metrics/false_positive_rate@0.70': {'doubleValue': 0.31016042828559875},
                                             'post_export_metrics/false_positive_rate@0.90': {'doubleValue': 0.04545454680919647},
                                             'post_export_metrics/negative_rate@0.10': {'doubleValue': 0.022357722744345665},
                                             'post_export_metrics/negative_rate@0.30': {'doubleValue': 0.1971544772386551},
                                             'post_export_metrics/negative_rate@0.50': {'doubleValue': 0.4288617968559265},
                                             'post_export_metrics/negative_rate@0.70': {'doubleValue': 0.6382113695144653},
                                             'post_export_metrics/negative_rate@0.90': {'doubleValue': 0.9207317233085632},
                                             'post_export_metrics/positive_rate@0.10': {'doubleValue': 0.977642297744751},
                                             'post_export_metrics/positive_rate@0.30': {'doubleValue': 0.8028455376625061},
                                             'post_export_metrics/positive_rate@0.50': {'doubleValue': 0.5711382031440735},
                                             'post_export_metrics/positive_rate@0.70': {'doubleValue': 0.36178863048553467},
                                             'post_export_metrics/positive_rate@0.90': {'doubleValue': 0.07926829159259796},
                                             'post_export_metrics/true_negative_rate@0.10': {'doubleValue': 0.029411764815449715},
                                             'post_export_metrics/true_negative_rate@0.30': {'doubleValue': 0.23262031376361847},
                                             'post_export_metrics/true_negative_rate@0.50': {'doubleValue': 0.4732620418071747},
                                             'post_export_metrics/true_negative_rate@0.70': {'doubleValue': 0.6898396015167236},
                                             'post_export_metrics/true_negative_rate@0.90': {'doubleValue': 0.9545454382896423},
                                             'post_export_metrics/true_positive_rate@0.10': {'doubleValue': 1.0},
                                             'post_export_metrics/true_positive_rate@0.30': {'doubleValue': 0.9152542352676392},
                                             'post_export_metrics/true_positive_rate@0.50': {'doubleValue': 0.7118644118309021},
                                             'post_export_metrics/true_positive_rate@0.70': {'doubleValue': 0.5254237055778503},
                                             'post_export_metrics/true_positive_rate@0.90': {'doubleValue': 0.18644067645072937},
                                             'precision': {'doubleValue': 0.29893237352371216},
                                             'prediction/mean': {'doubleValue': 0.5596572160720825},
                                             'recall': {'doubleValue': 0.7118644118309021}},
 (('sexual_orientation', 'homosexual_gay_or_lesbian'),): {'accuracy': {'doubleValue': 0.5872437357902527},
                                                          'accuracy_baseline': {'doubleValue': 0.7182232141494751},
                                                          'auc': {'doubleValue': 0.706774890422821},
                                                          'auc_precision_recall': {'doubleValue': 0.4722827970981598},
                                                          'average_loss': {'doubleValue': 0.7415628433227539},
                                                          'label/mean': {'doubleValue': 0.2817767560482025},
                                                          'post_export_metrics/example_count': {'doubleValue': 4390.0},
                                                          'post_export_metrics/fairness/confusion_matrix_at_thresholds': {'confusionMatrixAtThresholds': {'matrices': [{'boundedFalseNegatives': {'value': 2.0},
                                                                                                                                                                        'boundedFalsePositives': {'value': 3035.0},
                                                                                                                                                                        'boundedPrecision': {'value': 0.289227157831192},
                                                                                                                                                                        'boundedRecall': {'value': 0.9983831644058228},
                                                                                                                                                                        'boundedTrueNegatives': {'value': 118.0},
                                                                                                                                                                        'boundedTruePositives': {'value': 1235.0},
                                                                                                                                                                        'falseNegatives': 2.0,
                                                                                                                                                                        'falsePositives': 3035.0,
                                                                                                                                                                        'precision': 0.289227157831192,
                                                                                                                                                                        'recall': 0.9983831644058228,
                                                                                                                                                                        'tDistributionFalseNegatives': {'unsampledValue': 2.0},
                                                                                                                                                                        'tDistributionFalsePositives': {'unsampledValue': 3035.0},
                                                                                                                                                                        'tDistributionPrecision': {'unsampledValue': 0.289227157831192},
                                                                                                                                                                        'tDistributionRecall': {'unsampledValue': 0.9983831644058228},
                                                                                                                                                                        'tDistributionTrueNegatives': {'unsampledValue': 118.0},
                                                                                                                                                                        'tDistributionTruePositives': {'unsampledValue': 1235.0},
                                                                                                                                                                        'threshold': 0.10000000149011612,
                                                                                                                                                                        'trueNegatives': 118.0,
                                                                                                                                                                        'truePositives': 1235.0},
                                                                                                                                                                       {'boundedFalseNegatives': {'value': 80.0},
                                                                                                                                                                        'boundedFalsePositives': {'value': 2379.0},
                                                                                                                                                                        'boundedPrecision': {'value': 0.3272058963775635},
                                                                                                                                                                        'boundedRecall': {'value': 0.935327410697937},
                                                                                                                                                                        'boundedTrueNegatives': {'value': 774.0},
                                                                                                                                                                        'boundedTruePositives': {'value': 1157.0},
                                                                                                                                                                        'falseNegatives': 80.0,
                                                                                                                                                                        'falsePositives': 2379.0,
                                                                                                                                                                        'precision': 0.3272058963775635,
                                                                                                                                                                        'recall': 0.935327410697937,
                                                                                                                                                                        'tDistributionFalseNegatives': {'unsampledValue': 80.0},
                                                                                                                                                                        'tDistributionFalsePositives': {'unsampledValue': 2379.0},
                                                                                                                                                                        'tDistributionPrecision': {'unsampledValue': 0.3272058963775635},
                                                                                                                                                                        'tDistributionRecall': {'unsampledValue': 0.935327410697937},
                                                                                                                                                                        'tDistributionTrueNegatives': {'unsampledValue': 774.0},
                                                                                                                                                                        'tDistributionTruePositives': {'unsampledValue': 1157.0},
                                                                                                                                                                        'threshold': 0.30000001192092896,
                                                                                                                                                                        'trueNegatives': 774.0,
                                                                                                                                                                        'truePositives': 1157.0},
                                                                                                                                                                       {'boundedFalseNegatives': {'value': 280.0},
                                                                                                                                                                        'boundedFalsePositives': {'value': 1532.0},
                                                                                                                                                                        'boundedPrecision': {'value': 0.38449177145957947},
                                                                                                                                                                        'boundedRecall': {'value': 0.7736459374427795},
                                                                                                                                                                        'boundedTrueNegatives': {'value': 1621.0},
                                                                                                                                                                        'boundedTruePositives': {'value': 957.0},
                                                                                                                                                                        'falseNegatives': 280.0,
                                                                                                                                                                        'falsePositives': 1532.0,
                                                                                                                                                                        'precision': 0.38449177145957947,
                                                                                                                                                                        'recall': 0.7736459374427795,
                                                                                                                                                                        'tDistributionFalseNegatives': {'unsampledValue': 280.0},
                                                                                                                                                                        'tDistributionFalsePositives': {'unsampledValue': 1532.0},
                                                                                                                                                                        'tDistributionPrecision': {'unsampledValue': 0.38449177145957947},
                                                                                                                                                                        'tDistributionRecall': {'unsampledValue': 0.7736459374427795},
                                                                                                                                                                        'tDistributionTrueNegatives': {'unsampledValue': 1621.0},
                                                                                                                                                                        'tDistributionTruePositives': {'unsampledValue': 957.0},
                                                                                                                                                                        'threshold': 0.5,
                                                                                                                                                                        'trueNegatives': 1621.0,
                                                                                                                                                                        'truePositives': 957.0},
                                                                                                                                                                       {'boundedFalseNegatives': {'value': 605.0},
                                                                                                                                                                        'boundedFalsePositives': {'value': 749.0},
                                                                                                                                                                        'boundedPrecision': {'value': 0.4576393961906433},
                                                                                                                                                                        'boundedRecall': {'value': 0.5109134912490845},
                                                                                                                                                                        'boundedTrueNegatives': {'value': 2404.0},
                                                                                                                                                                        'boundedTruePositives': {'value': 632.0},
                                                                                                                                                                        'falseNegatives': 605.0,
                                                                                                                                                                        'falsePositives': 749.0,
                                                                                                                                                                        'precision': 0.4576393961906433,
                                                                                                                                                                        'recall': 0.5109134912490845,
                                                                                                                                                                        'tDistributionFalseNegatives': {'unsampledValue': 605.0},
                                                                                                                                                                        'tDistributionFalsePositives': {'unsampledValue': 749.0},
                                                                                                                                                                        'tDistributionPrecision': {'unsampledValue': 0.4576393961906433},
                                                                                                                                                                        'tDistributionRecall': {'unsampledValue': 0.5109134912490845},
                                                                                                                                                                        'tDistributionTrueNegatives': {'unsampledValue': 2404.0},
                                                                                                                                                                        'tDistributionTruePositives': {'unsampledValue': 632.0},
                                                                                                                                                                        'threshold': 0.699999988079071,
                                                                                                                                                                        'trueNegatives': 2404.0,
                                                                                                                                                                        'truePositives': 632.0},
                                                                                                                                                                       {'boundedFalseNegatives': {'value': 1069.0},
                                                                                                                                                                        'boundedFalsePositives': {'value': 119.0},
                                                                                                                                                                        'boundedPrecision': {'value': 0.5853658318519592},
                                                                                                                                                                        'boundedRecall': {'value': 0.135812446475029},
                                                                                                                                                                        'boundedTrueNegatives': {'value': 3034.0},
                                                                                                                                                                        'boundedTruePositives': {'value': 168.0},
                                                                                                                                                                        'falseNegatives': 1069.0,
                                                                                                                                                                        'falsePositives': 119.0,
                                                                                                                                                                        'precision': 0.5853658318519592,
                                                                                                                                                                        'recall': 0.135812446475029,
                                                                                                                                                                        'tDistributionFalseNegatives': {'unsampledValue': 1069.0},
                                                                                                                                                                        'tDistributionFalsePositives': {'unsampledValue': 119.0},
                                                                                                                                                                        'tDistributionPrecision': {'unsampledValue': 0.5853658318519592},
                                                                                                                                                                        'tDistributionRecall': {'unsampledValue': 0.135812446475029},
                                                                                                                                                                        'tDistributionTrueNegatives': {'unsampledValue': 3034.0},
                                                                                                                                                                        'tDistributionTruePositives': {'unsampledValue': 168.0},
                                                                                                                                                                        'threshold': 0.8999999761581421,
                                                                                                                                                                        'trueNegatives': 3034.0,
                                                                                                                                                                        'truePositives': 168.0}]}},
                                                          'post_export_metrics/false_discovery_rate@0.10': {'doubleValue': 0.7107728123664856},
                                                          'post_export_metrics/false_discovery_rate@0.30': {'doubleValue': 0.6727941036224365},
                                                          'post_export_metrics/false_discovery_rate@0.50': {'doubleValue': 0.6155082583427429},
                                                          'post_export_metrics/false_discovery_rate@0.70': {'doubleValue': 0.5423606038093567},
                                                          'post_export_metrics/false_discovery_rate@0.90': {'doubleValue': 0.4146341383457184},
                                                          'post_export_metrics/false_negative_rate@0.10': {'doubleValue': 0.001616814872249961},
                                                          'post_export_metrics/false_negative_rate@0.30': {'doubleValue': 0.06467259675264359},
                                                          'post_export_metrics/false_negative_rate@0.50': {'doubleValue': 0.22635407745838165},
                                                          'post_export_metrics/false_negative_rate@0.70': {'doubleValue': 0.4890865087509155},
                                                          'post_export_metrics/false_negative_rate@0.90': {'doubleValue': 0.8641875386238098},
                                                          'post_export_metrics/false_omission_rate@0.10': {'doubleValue': 0.01666666753590107},
                                                          'post_export_metrics/false_omission_rate@0.30': {'doubleValue': 0.09367681294679642},
                                                          'post_export_metrics/false_omission_rate@0.50': {'doubleValue': 0.147290900349617},
                                                          'post_export_metrics/false_omission_rate@0.70': {'doubleValue': 0.20106346905231476},
                                                          'post_export_metrics/false_omission_rate@0.90': {'doubleValue': 0.26054108142852783},
                                                          'post_export_metrics/false_positive_rate@0.10': {'doubleValue': 0.9625753164291382},
                                                          'post_export_metrics/false_positive_rate@0.30': {'doubleValue': 0.754519522190094},
                                                          'post_export_metrics/false_positive_rate@0.50': {'doubleValue': 0.48588645458221436},
                                                          'post_export_metrics/false_positive_rate@0.70': {'doubleValue': 0.23755154013633728},
                                                          'post_export_metrics/false_positive_rate@0.90': {'doubleValue': 0.03774183243513107},
                                                          'post_export_metrics/negative_rate@0.10': {'doubleValue': 0.027334852144122124},
                                                          'post_export_metrics/negative_rate@0.30': {'doubleValue': 0.19453303515911102},
                                                          'post_export_metrics/negative_rate@0.50': {'doubleValue': 0.4330296218395233},
                                                          'post_export_metrics/negative_rate@0.70': {'doubleValue': 0.6854214072227478},
                                                          'post_export_metrics/negative_rate@0.90': {'doubleValue': 0.9346241354942322},
                                                          'post_export_metrics/positive_rate@0.10': {'doubleValue': 0.9726651310920715},
                                                          'post_export_metrics/positive_rate@0.30': {'doubleValue': 0.8054669499397278},
                                                          'post_export_metrics/positive_rate@0.50': {'doubleValue': 0.5669704079627991},
                                                          'post_export_metrics/positive_rate@0.70': {'doubleValue': 0.3145785927772522},
                                                          'post_export_metrics/positive_rate@0.90': {'doubleValue': 0.06537585705518723},
                                                          'post_export_metrics/true_negative_rate@0.10': {'doubleValue': 0.03742467612028122},
                                                          'post_export_metrics/true_negative_rate@0.30': {'doubleValue': 0.2454804927110672},
                                                          'post_export_metrics/true_negative_rate@0.50': {'doubleValue': 0.5141135454177856},
                                                          'post_export_metrics/true_negative_rate@0.70': {'doubleValue': 0.7624484896659851},
                                                          'post_export_metrics/true_negative_rate@0.90': {'doubleValue': 0.9622581601142883},
                                                          'post_export_metrics/true_positive_rate@0.10': {'doubleValue': 0.9983831644058228},
                                                          'post_export_metrics/true_positive_rate@0.30': {'doubleValue': 0.935327410697937},
                                                          'post_export_metrics/true_positive_rate@0.50': {'doubleValue': 0.7736459374427795},
                                                          'post_export_metrics/true_positive_rate@0.70': {'doubleValue': 0.5109134912490845},
                                                          'post_export_metrics/true_positive_rate@0.90': {'doubleValue': 0.135812446475029},
                                                          'precision': {'doubleValue': 0.38449177145957947},
                                                          'prediction/mean': {'doubleValue': 0.5443003177642822},
                                                          'recall': {'doubleValue': 0.7736459374427795} },
 (('sexual_orientation', 'other_sexual_orientation'),): {'accuracy': {'doubleValue': 0.6000000238418579},
                                                         'accuracy_baseline': {'doubleValue': 0.800000011920929},
                                                         'auc': {'doubleValue': 1.0},
                                                         'auc_precision_recall': {'doubleValue': 1.0},
                                                         'average_loss': {'doubleValue': 0.7427176237106323},
                                                         'label/mean': {'doubleValue': 0.20000000298023224},
                                                         'post_export_metrics/example_count': {'doubleValue': 5.0},
                                                         'post_export_metrics/fairness/confusion_matrix_at_thresholds': {'confusionMatrixAtThresholds': {'matrices': [{'boundedFalseNegatives': {'value': 0.0},
                                                                                                                                                                       'boundedFalsePositives': {'value': 4.0},
                                                                                                                                                                       'boundedPrecision': {'value': 0.20000000298023224},
                                                                                                                                                                       'boundedRecall': {'value': 1.0},
                                                                                                                                                                       'boundedTrueNegatives': {'value': 0.0},
                                                                                                                                                                       'boundedTruePositives': {'value': 1.0},
                                                                                                                                                                       'falsePositives': 4.0,
                                                                                                                                                                       'precision': 0.20000000298023224,
                                                                                                                                                                       'recall': 1.0,
                                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 0.0},
                                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 4.0},
                                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.20000000298023224},
                                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 1.0},
                                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 0.0},
                                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 1.0},
                                                                                                                                                                       'threshold': 0.10000000149011612,
                                                                                                                                                                       'truePositives': 1.0},
                                                                                                                                                                      {'boundedFalseNegatives': {'value': 0.0},
                                                                                                                                                                       'boundedFalsePositives': {'value': 3.0},
                                                                                                                                                                       'boundedPrecision': {'value': 0.25},
                                                                                                                                                                       'boundedRecall': {'value': 1.0},
                                                                                                                                                                       'boundedTrueNegatives': {'value': 1.0},
                                                                                                                                                                       'boundedTruePositives': {'value': 1.0},
                                                                                                                                                                       'falsePositives': 3.0,
                                                                                                                                                                       'precision': 0.25,
                                                                                                                                                                       'recall': 1.0,
                                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 0.0},
                                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 3.0},
                                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.25},
                                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 1.0},
                                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 1.0},
                                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 1.0},
                                                                                                                                                                       'threshold': 0.30000001192092896,
                                                                                                                                                                       'trueNegatives': 1.0,
                                                                                                                                                                       'truePositives': 1.0},
                                                                                                                                                                      {'boundedFalseNegatives': {'value': 0.0},
                                                                                                                                                                       'boundedFalsePositives': {'value': 2.0},
                                                                                                                                                                       'boundedPrecision': {'value': 0.3333333432674408},
                                                                                                                                                                       'boundedRecall': {'value': 1.0},
                                                                                                                                                                       'boundedTrueNegatives': {'value': 2.0},
                                                                                                                                                                       'boundedTruePositives': {'value': 1.0},
                                                                                                                                                                       'falsePositives': 2.0,
                                                                                                                                                                       'precision': 0.3333333432674408,
                                                                                                                                                                       'recall': 1.0,
                                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 0.0},
                                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 2.0},
                                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.3333333432674408},
                                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 1.0},
                                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 2.0},
                                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 1.0},
                                                                                                                                                                       'threshold': 0.5,
                                                                                                                                                                       'trueNegatives': 2.0,
                                                                                                                                                                       'truePositives': 1.0},
                                                                                                                                                                      {'boundedFalseNegatives': {'value': 0.0},
                                                                                                                                                                       'boundedFalsePositives': {'value': 1.0},
                                                                                                                                                                       'boundedPrecision': {'value': 0.5},
                                                                                                                                                                       'boundedRecall': {'value': 1.0},
                                                                                                                                                                       'boundedTrueNegatives': {'value': 3.0},
                                                                                                                                                                       'boundedTruePositives': {'value': 1.0},
                                                                                                                                                                       'falsePositives': 1.0,
                                                                                                                                                                       'precision': 0.5,
                                                                                                                                                                       'recall': 1.0,
                                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 0.0},
                                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 1.0},
                                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.5},
                                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 1.0},
                                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 3.0},
                                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 1.0},
                                                                                                                                                                       'threshold': 0.699999988079071,
                                                                                                                                                                       'trueNegatives': 3.0,
                                                                                                                                                                       'truePositives': 1.0},
                                                                                                                                                                      {'boundedFalseNegatives': {'value': 1.0},
                                                                                                                                                                       'boundedFalsePositives': {'value': 0.0},
                                                                                                                                                                       'boundedPrecision': {'value': 0.0},
                                                                                                                                                                       'boundedRecall': {'value': 0.0},
                                                                                                                                                                       'boundedTrueNegatives': {'value': 4.0},
                                                                                                                                                                       'boundedTruePositives': {'value': 0.0},
                                                                                                                                                                       'falseNegatives': 1.0,
                                                                                                                                                                       'tDistributionFalseNegatives': {'unsampledValue': 1.0},
                                                                                                                                                                       'tDistributionFalsePositives': {'unsampledValue': 0.0},
                                                                                                                                                                       'tDistributionPrecision': {'unsampledValue': 0.0},
                                                                                                                                                                       'tDistributionRecall': {'unsampledValue': 0.0},
                                                                                                                                                                       'tDistributionTrueNegatives': {'unsampledValue': 4.0},
                                                                                                                                                                       'tDistributionTruePositives': {'unsampledValue': 0.0},
                                                                                                                                                                       'threshold': 0.8999999761581421,
                                                                                                                                                                       'trueNegatives': 4.0}]} },
                                                         'post_export_metrics/false_discovery_rate@0.10': {'doubleValue': 0.800000011920929},
                                                         'post_export_metrics/false_discovery_rate@0.30': {'doubleValue': 0.75},
                                                         'post_export_metrics/false_discovery_rate@0.50': {'doubleValue': 0.6666666865348816},
                                                         'post_export_metrics/false_discovery_rate@0.70': {'doubleValue': 0.5},
                                                         'post_export_metrics/false_discovery_rate@0.90': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_negative_rate@0.10': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_negative_rate@0.30': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_negative_rate@0.50': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_negative_rate@0.70': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_negative_rate@0.90': {'doubleValue': 1.0},
                                                         'post_export_metrics/false_omission_rate@0.10': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_omission_rate@0.30': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_omission_rate@0.50': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_omission_rate@0.70': {'doubleValue': 0.0},
                                                         'post_export_metrics/false_omission_rate@0.90': {'doubleValue': 0.20000000298023224},
                                                         'post_export_metrics/false_positive_rate@0.10': {'doubleValue': 1.0},
                                                         'post_export_metrics/false_positive_rate@0.30': {'doubleValue': 0.75},
                                                         'post_export_metrics/false_positive_rate@0.50': {'doubleValue': 0.5},
                                                         'post_export_metrics/false_positive_rate@0.70': {'doubleValue': 0.25},
                                                         'post_export_metrics/false_positive_rate@0.90': {'doubleValue': 0.0},
                                                         'post_export_metrics/negative_rate@0.10': {'doubleValue': 0.0},
                                                         'post_export_metrics/negative_rate@0.30': {'doubleValue': 0.20000000298023224},
                                                         'post_export_metrics/negative_rate@0.50': {'doubleValue': 0.4000000059604645},
                                                         'post_export_metrics/negative_rate@0.70': {'doubleValue': 0.6000000238418579},
                                                         'post_export_metrics/negative_rate@0.90': {'doubleValue': 1.0},
                                                         'post_export_metrics/positive_rate@0.10': {'doubleValue': 1.0},
                                                         'post_export_metrics/positive_rate@0.30': {'doubleValue': 0.800000011920929},
                                                         'post_export_metrics/positive_rate@0.50': {'doubleValue': 0.6000000238418579},
                                                         'post_export_metrics/positive_rate@0.70': {'doubleValue': 0.4000000059604645},
                                                         'post_export_metrics/positive_rate@0.90': {'doubleValue': 0.0},
                                                         'post_export_metrics/true_negative_rate@0.10': {'doubleValue': 0.0},
                                                         'post_export_metrics/true_negative_rate@0.30': {'doubleValue': 0.25},
                                                         'post_export_metrics/true_negative_rate@0.50': {'doubleValue': 0.5},
                                                         'post_export_metrics/true_negative_rate@0.70': {'doubleValue': 0.75},
                                                         'post_export_metrics/true_negative_rate@0.90': {'doubleValue': 1.0},
                                                         'post_export_metrics/true_positive_rate@0.10': {'doubleValue': 1.0},
                                                         'post_export_metrics/true_positive_rate@0.30': {'doubleValue': 1.0},
                                                         'post_export_metrics/true_positive_rate@0.50': {'doubleValue': 1.0},
                                                         'post_export_metrics/true_positive_rate@0.70': {'doubleValue': 1.0},
                                                         'post_export_metrics/true_positive_rate@0.90': {'doubleValue': 0.0},
                                                         'precision': {'doubleValue': 0.3333333432674408},
                                                         'prediction/mean': {'doubleValue': 0.5998175144195557},
                                                         'recall': {'doubleValue': 1.0} } }
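The derived rates in the output above can be checked by hand from the raw confusion-matrix counts. The following is a minimal sketch, not part of TFMA: `rates_from_matrix` is a hypothetical helper, and the input dict is copied from the threshold-0.5 entry printed for the `('sexual_orientation', 'other_sexual_orientation')` slice. In the printed output, count fields whose value is zero (here `falseNegatives`) are not shown, so the helper assumes a missing count means 0.

```python
import math


def _safe_div(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is 0."""
    return numerator / denominator if denominator else math.nan


def rates_from_matrix(matrix):
    """Derive the rate metrics from one confusion-matrix entry.

    Hypothetical helper for illustration only. Counts that are absent
    from the dict are assumed to be zero.
    """
    tp = matrix.get('truePositives', 0.0)
    fp = matrix.get('falsePositives', 0.0)
    tn = matrix.get('trueNegatives', 0.0)
    fn = matrix.get('falseNegatives', 0.0)
    total = tp + fp + tn + fn
    return {
        'example_count': total,
        'accuracy': _safe_div(tp + tn, total),
        'true_positive_rate': _safe_div(tp, tp + fn),
        'false_positive_rate': _safe_div(fp, fp + tn),
        'true_negative_rate': _safe_div(tn, tn + fp),
        'false_negative_rate': _safe_div(fn, fn + tp),
        'positive_rate': _safe_div(tp + fp, total),
        'negative_rate': _safe_div(tn + fn, total),
        # Note: this sketch returns NaN when nothing is predicted positive,
        # whereas the printed output shows 0.0 in that case (threshold 0.9).
        'false_discovery_rate': _safe_div(fp, fp + tp),
        'false_omission_rate': _safe_div(fn, fn + tn),
    }


# Raw counts copied from the threshold-0.5 matrix in the output above.
matrix_at_050 = {
    'falsePositives': 2.0,
    'threshold': 0.5,
    'trueNegatives': 2.0,
    'truePositives': 1.0,
}

rates = rates_from_matrix(matrix_at_050)
for name, value in sorted(rates.items()):
    print(f'{name}: {value:.4f}')
```

The values agree with the `@0.50` metrics reported for this slice. Because the slice contains only 5 examples, each rate moves in steps of 0.2 or more as a single example changes class, so these numbers should be read with caution when comparing slices.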