Fairness Indicators on TF-Hub Text Embeddings

In this colab, you will learn how to use Fairness Indicators to evaluate embeddings from TF-Hub. Fairness Indicators is a suite of tools that facilitates evaluation and visualization of fairness metrics on machine learning models. Fairness Indicators is built on top of TensorFlow Model Analysis, TensorFlow's official model evaluation library.

Imports

!pip install -q fairness-indicators \
  "absl-py==0.8.0" \
  "pyarrow==0.15.1" \
  "apache-beam==2.17.0" \
  "avro-python3==1.9.1"
ERROR: witwidget 1.7.0 has requirement oauth2client>=4.1.3, but you'll have oauth2client 3.0.0 which is incompatible.
ERROR: tensorflow-serving-api 2.2.0 has requirement tensorflow~=2.2.0, but you'll have tensorflow 2.3.0 which is incompatible.
ERROR: tfx-bsl 0.22.1 has requirement apache-beam[gcp]<3,>=2.20, but you'll have apache-beam 2.17.0 which is incompatible.
ERROR: tfx-bsl 0.22.1 has requirement pyarrow<0.17,>=0.16.0, but you'll have pyarrow 0.15.1 which is incompatible.
ERROR: tensorflow-model-analysis 0.22.2 has requirement apache-beam[gcp]<3,>=2.20, but you'll have apache-beam 2.17.0 which is incompatible.
ERROR: tensorflow-model-analysis 0.22.2 has requirement pyarrow<0.17,>=0.16, but you'll have pyarrow 0.15.1 which is incompatible.
ERROR: tensorflow-transform 0.22.0 has requirement apache-beam[gcp]<3,>=2.20, but you'll have apache-beam 2.17.0 which is incompatible.
ERROR: tensorflow-transform 0.22.0 has requirement tensorflow!=2.0.*,<2.3,>=1.15, but you'll have tensorflow 2.3.0 which is incompatible.
ERROR: tensorflow-data-validation 0.22.2 has requirement apache-beam[gcp]<3,>=2.22, but you'll have apache-beam 2.17.0 which is incompatible.
ERROR: tensorflow-data-validation 0.22.2 has requirement pyarrow<0.17,>=0.16, but you'll have pyarrow 0.15.1 which is incompatible.

import os
import tempfile
import apache_beam as beam
from datetime import datetime
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_model_analysis as tfma
from tensorflow_model_analysis.addons.fairness.view import widget_view
from tensorflow_model_analysis.addons.fairness.post_export_metrics import fairness_indicators
from fairness_indicators import example_model
from fairness_indicators.examples import util
Error importing tfx_bsl_extension.coders. Some tfx_bsl functionalities are not available
Error importing tfx_bsl_extension.arrow.array_util. Some tfx_bsl functionalities are not available
Error importing tfx_bsl_extension.arrow.table_util. Some tfx_bsl functionalities are not available: libarrow.so.16: cannot open shared object file: No such file or directory

Defining Constants

TensorFlow parses features from data using FixedLenFeature and VarLenFeature. So to allow TensorFlow to parse our data, we will need to map out our input feature, output feature, and any slicing features that we will want to analyze via Fairness Indicators.

BASE_DIR = tempfile.gettempdir()

# The input and output features of the classifier
TEXT_FEATURE = 'comment_text'
LABEL = 'toxicity'

FEATURE_MAP = {
    # input and output features
    LABEL: tf.io.FixedLenFeature([], tf.float32),
    TEXT_FEATURE: tf.io.FixedLenFeature([], tf.string),

    # slicing features
    'sexual_orientation': tf.io.VarLenFeature(tf.string),
    'gender': tf.io.VarLenFeature(tf.string),
    'religion': tf.io.VarLenFeature(tf.string),
    'race': tf.io.VarLenFeature(tf.string),
    'disability': tf.io.VarLenFeature(tf.string)
}

IDENTITY_TERMS = ['gender', 'sexual_orientation', 'race', 'religion', 'disability']

Data

In this exercise, we'll work with the Civil Comments dataset: approximately 2 million comments that the Civil Comments platform made public in 2017 for ongoing research. This effort was sponsored by Jigsaw, which has hosted competitions on Kaggle to help classify toxic comments as well as minimize unintended model bias.

Each individual text comment in the dataset has a toxicity label, with the label being 1 if the comment is toxic and 0 if the comment is non-toxic. Within the data, a subset of comments are labeled with a variety of identity attributes, including categories for gender, sexual orientation, religion, and race or ethnicity.

You can choose to download the original dataset and process it in the colab, which may take minutes, or you can download the preprocessed data.

download_original_data = True

if download_original_data:
  train_tf_file = tf.keras.utils.get_file('train_tf.tfrecord',
                                          'https://storage.googleapis.com/civil_comments_dataset/train_tf.tfrecord')
  validate_tf_file = tf.keras.utils.get_file('validate_tf.tfrecord',
                                             'https://storage.googleapis.com/civil_comments_dataset/validate_tf.tfrecord')

  # The identity terms are grouped by category using a threshold of 0.5.
  # Only the identity term column, text column, and label column are
  # kept after processing.
  train_tf_file = util.convert_comments_data(train_tf_file)
  validate_tf_file = util.convert_comments_data(validate_tf_file)

else:
  train_tf_file = tf.keras.utils.get_file('train_tf_processed.tfrecord',
                                          'https://storage.googleapis.com/civil_comments_dataset/train_tf_processed.tfrecord')
  validate_tf_file = tf.keras.utils.get_file('validate_tf_processed.tfrecord',
                                             'https://storage.googleapis.com/civil_comments_dataset/validate_tf_processed.tfrecord')
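The grouping step mentioned in the comments above can be pictured with a small pure-Python sketch. It assumes that each raw comment carries a score between 0 and 1 for individual identity terms and that a term counts as present when its score is at least 0.5. The term names, the IDENTITY_COLUMNS mapping, and the group_identity_terms helper are hypothetical; they are not taken from util.convert_comments_data, whose actual behavior may differ.

```python
# Hypothetical mapping from identity category to the raw term columns it covers.
IDENTITY_COLUMNS = {
    'gender': ['male', 'female', 'transgender'],
    'religion': ['christian', 'jewish', 'muslim'],
}
THRESHOLD = 0.5  # assumed cut-off, matching the 0.5 mentioned above


def group_identity_terms(row, text_feature='comment_text', label='toxicity'):
  """Keeps the text and label, and groups identity terms by category."""
  processed = {
      text_feature: row[text_feature],
      label: row[label],
  }
  for category, terms in IDENTITY_COLUMNS.items():
    # A term is kept only if its score reaches the threshold (assumption).
    processed[category] = [
        term for term in terms if row.get(term, 0.0) >= THRESHOLD
    ]
  return processed


# Made-up row for illustration only.
raw_row = {
    'comment_text': 'an example comment',
    'toxicity': 0.0,
    'female': 0.8,
    'male': 0.1,
    'muslim': 0.5,
}
processed_row = group_identity_terms(raw_row)
print(processed_row)
```

In this sketch the per-term columns disappear from the output and only the text, the label, and one list per identity category remain, which mirrors the shape of FEATURE_MAP, where each slicing feature is a variable-length list of strings.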

Creating a TensorFlow Model Analysis Pipeline

The Fairness Indicators library operates on TensorFlow Model Analysis (TFMA) models. TFMA models wrap TensorFlow models with additional functionality to evaluate and visualize their results. The actual evaluation occurs inside of an Apache Beam pipeline.

So we need to...

  1. Build a TensorFlow model.
  2. Build a TFMA model on top of the TensorFlow model.
  3. Run the model analysis in a Beam pipeline.

Putting it all Together

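def embedding_fairness_result(embedding, identity_term='gender'):
  """Trains a classifier on a TF-Hub embedding and returns its TFMA eval result.

  The evaluation is sliced by `identity_term`.
  """
  model_dir = os.path.join(BASE_DIR, 'train',
                           datetime.now().strftime('%Y%m%d-%H%M%S'))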

  print("Training classifier for " + embedding)
  classifier = example_model.train_model(model_dir,
                                         train_tf_file,
                                         LABEL,
                                         TEXT_FEATURE,
                                         FEATURE_MAP,
                                         embedding)

  # We need to create a unique path to store our results for this embedding.
  embedding_name = embedding.split('/')[-2]
  eval_result_path = os.path.join(BASE_DIR, 'eval_result', embedding_name)

  example_model.evaluate_model(classifier,
                               validate_tf_file,
                               eval_result_path,
                               identity_term,
                               LABEL,
                               FEATURE_MAP)
  return tfma.load_eval_result(output_path=eval_result_path)
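To see what the two directory names in embedding_fairness_result look like, the path logic can be run on its own. This is a sketch using only the standard library; the fixed timestamp is an arbitrary value chosen for illustration, and result_paths is a hypothetical helper that is not part of the colab.

```python
import os
import tempfile
from datetime import datetime

BASE_DIR = tempfile.gettempdir()


def result_paths(embedding, now):
  """Returns (model_dir, eval_result_path) built the same way as above."""
  model_dir = os.path.join(BASE_DIR, 'train', now.strftime('%Y%m%d-%H%M%S'))
  # The handles used here end in '<name>/<version>', so [-2] picks the name.
  embedding_name = embedding.split('/')[-2]
  eval_result_path = os.path.join(BASE_DIR, 'eval_result', embedding_name)
  return model_dir, eval_result_path


model_dir, eval_result_path = result_paths(
    'https://tfhub.dev/google/random-nnlm-en-dim128/1',
    datetime(2020, 1, 2, 3, 4, 5))  # arbitrary example timestamp
print(model_dir)
print(eval_result_path)
```

Because the result path depends only on the embedding name, evaluating the same embedding twice writes to the same eval_result directory, while each training run gets a fresh timestamped model directory. A handle with a trailing slash would make split('/')[-2] return the version instead of the name.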

Run TFMA & Fairness Indicators

Fairness Indicators Metrics

Refer here for more information on Fairness Indicators. Below are some of the available metrics.
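As an illustration of what a sliced, thresholded metric measures, the sketch below computes false positive rate and false negative rate per slice in plain Python. It is not how Fairness Indicators or TFMA compute their metrics; the toy examples, the 0.5 threshold, and the rates_by_slice helper are assumptions made for illustration.

```python
from collections import defaultdict


def rates_by_slice(examples, threshold=0.5):
  """Returns {slice: {'fpr': ..., 'fnr': ...}} for (slice, label, score) rows."""
  counts = defaultdict(lambda: {'tp': 0, 'fp': 0, 'tn': 0, 'fn': 0})
  for slice_name, label, score in examples:
    predicted_toxic = score >= threshold
    if label == 1:
      counts[slice_name]['tp' if predicted_toxic else 'fn'] += 1
    else:
      counts[slice_name]['fp' if predicted_toxic else 'tn'] += 1

  rates = {}
  for slice_name, c in counts.items():
    negatives = c['fp'] + c['tn']
    positives = c['tp'] + c['fn']
    rates[slice_name] = {
        # None marks a rate that is undefined because the slice has no
        # examples of the relevant class.
        'fpr': c['fp'] / negatives if negatives else None,
        'fnr': c['fn'] / positives if positives else None,
    }
  return rates


# Toy data (made up): (slice, true label, predicted toxicity score).
toy_examples = [
    ('female', 0, 0.7), ('female', 0, 0.2), ('female', 1, 0.9),
    ('male', 0, 0.1), ('male', 0, 0.3), ('male', 1, 0.4),
]
slice_rates = rates_by_slice(toy_examples)
print(slice_rates)
```

A gap between slices, such as the differing false positive rates in this toy data, is the kind of disparity that a per-slice view is meant to surface.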

Text Embeddings

TF-Hub provides several text embeddings. These embeddings will serve as the feature column for our different models. For this Colab, we use the following embeddings:

Fairness Indicator Results

For each of the above embeddings, we will compute fairness indicators with our embedding_fairness_result pipeline, and then render the results in the Fairness Indicator UI widget with widget_view.render_fairness_indicator.

Note that the widget_view.render_fairness_indicator cells may need to be run twice for the visualization to be displayed.

Random NNLM

eval_result_random_nnlm = embedding_fairness_result('https://tfhub.dev/google/random-nnlm-en-dim128/1')
Training classifier for https://tfhub.dev/google/random-nnlm-en-dim128/1
INFO:tensorflow:Using default config.

INFO:tensorflow:Using config: {'_model_dir': '/tmp/train/20200728-101738', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Saver not created because there are no variables in the graph to restore

INFO:tensorflow:Saver not created because there are no variables in the graph to restore

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/canned/head.py:402: NumericColumn._get_dense_tensor (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/feature_column/feature_column.py:2192: NumericColumn._transform_feature (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
The old _FeatureColumn APIs are being deprecated. Please use the new FeatureColumn APIs instead.

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/training/adagrad.py:77: calling Constant.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Create CheckpointSaverHook.

INFO:tensorflow:Graph was finalized.

INFO:tensorflow:Running local_init_op.

INFO:tensorflow:Done running local_init_op.

INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...

INFO:tensorflow:Saving checkpoints for 0 into /tmp/train/20200728-101738/model.ckpt.

INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...

INFO:tensorflow:loss = 60.395515, step = 0

INFO:tensorflow:global_step/sec: 22.1997

INFO:tensorflow:loss = 70.53755, step = 100 (4.507 sec)

INFO:tensorflow:global_step/sec: 23.024

INFO:tensorflow:loss = 60.31934, step = 200 (4.343 sec)

INFO:tensorflow:global_step/sec: 23.0476

INFO:tensorflow:loss = 59.56571, step = 300 (4.339 sec)

INFO:tensorflow:global_step/sec: 22.5428

INFO:tensorflow:loss = 61.309155, step = 400 (4.436 sec)

INFO:tensorflow:global_step/sec: 22.8946

INFO:tensorflow:loss = 57.682594, step = 500 (4.368 sec)

INFO:tensorflow:global_step/sec: 22.2927

INFO:tensorflow:loss = 57.669823, step = 600 (4.486 sec)

INFO:tensorflow:global_step/sec: 22.5556

INFO:tensorflow:loss = 60.49809, step = 700 (4.433 sec)

INFO:tensorflow:global_step/sec: 22.5791

INFO:tensorflow:loss = 59.164856, step = 800 (4.429 sec)

INFO:tensorflow:global_step/sec: 19.9118

INFO:tensorflow:loss = 57.124557, step = 900 (5.022 sec)

INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 1000...

INFO:tensorflow:Saving checkpoints for 1000 into /tmp/train/20200728-101738/model.ckpt.

INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 1000...

INFO:tensorflow:Loss for final step: 60.099632.

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_model_analysis/eval_saved_model/encoding.py:141: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Saver not created because there are no variables in the graph to restore

INFO:tensorflow:Saver not created because there are no variables in the graph to restore

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/canned/head.py:642: auc (from tensorflow.python.ops.metrics_impl) is deprecated and will be removed in a future version.
Instructions for updating:
The value of AUC returned by this may race with the update so this is deprecated. Please use tf.keras.metrics.AUC instead.

Warning:tensorflow:Trapezoidal rule is known to produce incorrect PR-AUCs; please switch to "careful_interpolation" instead.

Warning:tensorflow:Trapezoidal rule is known to produce incorrect PR-AUCs; please switch to "careful_interpolation" instead.

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Signatures INCLUDED in export for Classify: None

INFO:tensorflow:Signatures INCLUDED in export for Regress: None

INFO:tensorflow:Signatures INCLUDED in export for Predict: None

INFO:tensorflow:Signatures INCLUDED in export for Train: None

INFO:tensorflow:Signatures INCLUDED in export for Eval: ['eval']

Warning:tensorflow:Export includes no default signature!

INFO:tensorflow:Restoring parameters from /tmp/train/20200728-101738/model.ckpt-1000

INFO:tensorflow:Assets added to graph.

INFO:tensorflow:Assets written to: /tmp/tfma_eval_model/temp-1595931514/assets

INFO:tensorflow:SavedModel written to: /tmp/tfma_eval_model/temp-1595931514/saved_model.pb
WARNING:absl:Tensorflow version (2.3.0) found. Note that TFMA support for TF 2.0 is currently in beta

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_model_analysis/eval_saved_model/load.py:169: load (from tensorflow.python.saved_model.loader_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.load or tf.compat.v1.saved_model.load. There will be a new function for importing SavedModels in Tensorflow 2.0.

INFO:tensorflow:Restoring parameters from /tmp/tfma_eval_model/1595931514/variables/variables

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_model_analysis/eval_saved_model/graph_ref.py:189: get_tensor_from_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.get_tensor_from_tensor_info or tf.compat.v1.saved_model.get_tensor_from_tensor_info.

WARNING:root:Couldn't find python-snappy so the implementation of _TFRecordUtil._masked_crc32c is not as fast as it could be.

Warning:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_model_analysis/writers/metrics_plots_and_validations_writer.py:68: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`

widget_view.render_fairness_indicator(eval_result=eval_result_random_nnlm)
FairnessIndicatorViewer(slicingMetrics=[{'sliceValue': 'Overall', 'slice': 'Overall', 'metrics': {'recall': {'…

NNLM

eval_result_nnlm = embedding_fairness_result('https://tfhub.dev/google/nnlm-en-dim128/1')
Training classifier for https://tfhub.dev/google/nnlm-en-dim128/1
INFO:tensorflow:Using default config.

INFO:tensorflow:Using config: {'_model_dir': '/tmp/train/20200728-102039', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Saver not created because there are no variables in the graph to restore

INFO:tensorflow:Saver not created because there are no variables in the graph to restore

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Create CheckpointSaverHook.

INFO:tensorflow:Graph was finalized.

INFO:tensorflow:Running local_init_op.

INFO:tensorflow:Done running local_init_op.

INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...

INFO:tensorflow:Saving checkpoints for 0 into /tmp/train/20200728-102039/model.ckpt.

INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...

INFO:tensorflow:loss = 59.663322, step = 0

INFO:tensorflow:global_step/sec: 22.6638

INFO:tensorflow:loss = 56.25737, step = 100 (4.414 sec)

INFO:tensorflow:global_step/sec: 22.8206

INFO:tensorflow:loss = 47.597294, step = 200 (4.382 sec)

INFO:tensorflow:global_step/sec: 22.6709

INFO:tensorflow:loss = 55.987038, step = 300 (4.411 sec)

INFO:tensorflow:global_step/sec: 22.6271

INFO:tensorflow:loss = 55.864807, step = 400 (4.420 sec)

INFO:tensorflow:global_step/sec: 22.721

INFO:tensorflow:loss = 41.99434, step = 500 (4.401 sec)

INFO:tensorflow:global_step/sec: 22.4847

INFO:tensorflow:loss = 45.392982, step = 600 (4.448 sec)

INFO:tensorflow:global_step/sec: 22.4545

INFO:tensorflow:loss = 51.30686, step = 700 (4.453 sec)

INFO:tensorflow:global_step/sec: 22.2672

INFO:tensorflow:loss = 47.349762, step = 800 (4.492 sec)

INFO:tensorflow:global_step/sec: 22.4969

INFO:tensorflow:loss = 48.462177, step = 900 (4.445 sec)

INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 1000...

INFO:tensorflow:Saving checkpoints for 1000 into /tmp/train/20200728-102039/model.ckpt.

INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 1000...

INFO:tensorflow:Loss for final step: 50.60894.

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Saver not created because there are no variables in the graph to restore

INFO:tensorflow:Saver not created because there are no variables in the graph to restore

Warning:tensorflow:Trapezoidal rule is known to produce incorrect PR-AUCs; please switch to "careful_interpolation" instead.

Warning:tensorflow:Trapezoidal rule is known to produce incorrect PR-AUCs; please switch to "careful_interpolation" instead.

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Signatures INCLUDED in export for Classify: None

INFO:tensorflow:Signatures INCLUDED in export for Regress: None

INFO:tensorflow:Signatures INCLUDED in export for Predict: None

INFO:tensorflow:Signatures INCLUDED in export for Train: None

INFO:tensorflow:Signatures INCLUDED in export for Eval: ['eval']

Warning:tensorflow:Export includes no default signature!

INFO:tensorflow:Restoring parameters from /tmp/train/20200728-102039/model.ckpt-1000

INFO:tensorflow:Assets added to graph.

INFO:tensorflow:Assets written to: /tmp/tfma_eval_model/temp-1595931686/assets

INFO:tensorflow:SavedModel written to: /tmp/tfma_eval_model/temp-1595931686/saved_model.pb
WARNING:absl:Tensorflow version (2.3.0) found. Note that TFMA support for TF 2.0 is currently in beta

INFO:tensorflow:Restoring parameters from /tmp/tfma_eval_model/1595931686/variables/variables

INFO:tensorflow:Restoring parameters from /tmp/tfma_eval_model/1595931686/variables/variables

widget_view.render_fairness_indicator(eval_result=eval_result_nnlm)
FairnessIndicatorViewer(slicingMetrics=[{'sliceValue': 'Overall', 'slice': 'Overall', 'metrics': {'post_export…

Universal Sentence Encoder

eval_result_use = embedding_fairness_result('https://tfhub.dev/google/universal-sentence-encoder/2')
Training classifier for https://tfhub.dev/google/universal-sentence-encoder/2
INFO:tensorflow:Using default config.

INFO:tensorflow:Using config: {'_model_dir': '/tmp/train/20200728-102330', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Saver not created because there are no variables in the graph to restore

INFO:tensorflow:Saver not created because there are no variables in the graph to restore

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Create CheckpointSaverHook.

INFO:tensorflow:Graph was finalized.

INFO:tensorflow:Running local_init_op.

INFO:tensorflow:Done running local_init_op.

INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...

INFO:tensorflow:Saving checkpoints for 0 into /tmp/train/20200728-102330/model.ckpt.

INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...

INFO:tensorflow:loss = 59.31093, step = 0

INFO:tensorflow:global_step/sec: 8.11647

INFO:tensorflow:loss = 50.083836, step = 100 (12.322 sec)

INFO:tensorflow:global_step/sec: 8.34408

INFO:tensorflow:loss = 46.30419, step = 200 (11.985 sec)

INFO:tensorflow:global_step/sec: 8.31844

INFO:tensorflow:loss = 48.47805, step = 300 (12.022 sec)

INFO:tensorflow:global_step/sec: 8.38671

INFO:tensorflow:loss = 44.44144, step = 400 (11.923 sec)

INFO:tensorflow:global_step/sec: 8.41145

INFO:tensorflow:loss = 35.44813, step = 500 (11.889 sec)

INFO:tensorflow:global_step/sec: 8.36842

INFO:tensorflow:loss = 42.151596, step = 600 (11.950 sec)

INFO:tensorflow:global_step/sec: 8.16525

INFO:tensorflow:loss = 40.697086, step = 700 (12.247 sec)

INFO:tensorflow:global_step/sec: 8.34284

INFO:tensorflow:loss = 37.473423, step = 800 (11.986 sec)

INFO:tensorflow:global_step/sec: 8.36163

INFO:tensorflow:loss = 32.704132, step = 900 (11.959 sec)

INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 1000...

INFO:tensorflow:Saving checkpoints for 1000 into /tmp/train/20200728-102330/model.ckpt.

INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 1000...

INFO:tensorflow:Loss for final step: 47.284683.

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Saver not created because there are no variables in the graph to restore

INFO:tensorflow:Saver not created because there are no variables in the graph to restore

Warning:tensorflow:Trapezoidal rule is known to produce incorrect PR-AUCs; please switch to "careful_interpolation" instead.

Warning:tensorflow:Trapezoidal rule is known to produce incorrect PR-AUCs; please switch to "careful_interpolation" instead.

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Signatures INCLUDED in export for Classify: None

INFO:tensorflow:Signatures INCLUDED in export for Regress: None

INFO:tensorflow:Signatures INCLUDED in export for Predict: None

INFO:tensorflow:Signatures INCLUDED in export for Train: None

INFO:tensorflow:Signatures INCLUDED in export for Eval: ['eval']

Warning:tensorflow:Export includes no default signature!

INFO:tensorflow:Restoring parameters from /tmp/train/20200728-102330/model.ckpt-1000

INFO:tensorflow:Assets added to graph.

INFO:tensorflow:No assets to write.

INFO:tensorflow:SavedModel written to: /tmp/tfma_eval_model/temp-1595931968/saved_model.pb
WARNING:absl:Tensorflow version (2.3.0) found. Note that TFMA support for TF 2.0 is currently in beta

INFO:tensorflow:Restoring parameters from /tmp/tfma_eval_model/1595931968/variables/variables

widget_view.render_fairness_indicator(eval_result=eval_result_use)
FairnessIndicatorViewer(slicingMetrics=[{'sliceValue': 'Overall', 'slice': 'Overall', 'metrics': {'post_export…

Comparing Embeddings

Fairness Indicators can also compare embeddings side by side. Passing both evaluation results through the multi_eval_results argument renders the classifiers trained on the NNLM and USE embeddings in a single view.

widget_view.render_fairness_indicator(multi_eval_results={'nnlm': eval_result_nnlm, 'use': eval_result_use})
FairnessIndicatorViewer(evalName='nnlm', evalNameCompare='use', slicingMetrics=[{'sliceValue': 'Overall', 'sli…
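
To make the side-by-side comparison concrete outside the widget, the sketch below computes the per-slice difference in one metric between two models. It is a minimal illustration and not part of the Fairness Indicators API: the helper compare_slices, the slice names group_a and group_b, and every number are hypothetical placeholders, not results from the runs above.

```python
from typing import Dict


def compare_slices(baseline: Dict[str, float],
                   candidate: Dict[str, float]) -> Dict[str, float]:
    """Return candidate - baseline for every slice present in both inputs."""
    shared = sorted(set(baseline) & set(candidate))
    return {name: candidate[name] - baseline[name] for name in shared}


# Hypothetical false positive rates per slice (made-up values, NOT colab output).
fpr_nnlm = {"Overall": 0.10, "group_a": 0.12, "group_b": 0.20}
fpr_use = {"Overall": 0.08, "group_a": 0.09, "group_b": 0.15}

deltas = compare_slices(fpr_nnlm, fpr_use)
for name, delta in deltas.items():
    # A negative delta means the candidate (USE here) has the lower rate.
    print(f"{name}: {delta:+.3f}")
```

Looking at the per-slice deltas rather than only the Overall delta matters here: an embedding can improve the aggregate metric while leaving the gap for a specific slice unchanged or worse.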

Exercises

  1. Pick an identity category, such as religion or sexual orientation, and look at the False Positive Rate for the Universal Sentence Encoder. How do different slices compare to each other? How do they compare to the Overall baseline?
  2. Now pick a different identity category and compare its results with the previous one. Does the model treat one category as more "toxic" than the other? Does this change with the embedding used?
  3. Does the model generally tend to overestimate or underestimate the number of toxic comments?
  4. Look at the graphs for different fairness metrics. Which metrics seem most informative? Which embeddings perform best and worst for that metric?
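
When working through these exercises, it can help to recall how the rates shown in the widget are defined. The sketch below derives the false positive rate and false negative rate from confusion-matrix counts, reports each slice's gap from the Overall baseline, and checks whether the predicted number of toxic comments exceeds the actual number. The Counts class, the helper functions, the slice name group_a, and all counts are hypothetical and chosen only for illustration; they are not output from the models trained above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion-matrix counts for one slice (positive class = toxic)."""
    tp: int
    fp: int
    tn: int
    fn: int


def false_positive_rate(c: Counts) -> float:
    # Share of truly non-toxic comments that were flagged as toxic.
    negatives = c.fp + c.tn
    return c.fp / negatives if negatives else float("nan")


def false_negative_rate(c: Counts) -> float:
    # Share of truly toxic comments that were missed.
    positives = c.fn + c.tp
    return c.fn / positives if positives else float("nan")


# Made-up counts for illustration only.
slices = {
    "Overall": Counts(tp=80, fp=40, tn=860, fn=20),
    "group_a": Counts(tp=10, fp=15, tn=70, fn=5),
}

baseline_fpr = false_positive_rate(slices["Overall"])
for name, c in slices.items():
    fpr = false_positive_rate(c)
    fnr = false_negative_rate(c)
    predicted_toxic = c.tp + c.fp  # comments the model labeled toxic
    actual_toxic = c.tp + c.fn     # comments that are labeled toxic in the data
    print(f"{name}: FPR={fpr:.3f} (gap {fpr - baseline_fpr:+.3f}), "
          f"FNR={fnr:.3f}, predicted={predicted_toxic}, actual={actual_toxic}")
```

With these placeholder counts, a slice whose false positive rate sits well above the baseline is one where non-toxic comments are flagged more often than average, which is the pattern the first two exercises ask you to look for.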