How to build a simple text classifier with TF-Hub

Caution: This tutorial uses deprecated TensorFlow 1 functionality. For a modern approach to this task, see the TensorFlow 2 version.

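TensorFlow Hub (TF-Hub) is a platform for sharing machine learning expertise packaged in reusable resources, notably pre-trained modules. This tutorial is organized into two main parts:

Introduction: Training a text classifier with TF-Hub

We will use a TF-Hub text embedding module to train a simple sentiment classifier with a reasonable baseline accuracy. We will then analyze the predictions to make sure our model is reasonable and propose improvements to increase the accuracy.

Advanced: Transfer learning analysis

In this section, we will use various TF-Hub modules to compare their effect on the accuracy of the Estimator and demonstrate the advantages and pitfalls of transfer learning.

Optional prerequisites

Setup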

# Install seaborn, which is used below to plot the confusion matrix.
pip install -q seaborn

More detailed information about installing TensorFlow can be found at https://www.tensorflow.org/install/.

from absl import logging

import tensorflow as tf
import tensorflow_hub as hub
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
import re
import seaborn as sns

Getting started

Data

We will try to solve the Large Movie Review Dataset v1.0 task (Maas et al., 2011). The dataset consists of IMDB movie reviews labeled by positivity from 1 to 10. The task is to label each review as negative or positive.

# Load all files from a directory in a DataFrame.
def load_directory_data(directory):
  data = {}
  data["sentence"] = []
  data["sentiment"] = []
  for file_path in os.listdir(directory):
    with tf.io.gfile.GFile(os.path.join(directory, file_path), "r") as f:
      data["sentence"].append(f.read())
      data["sentiment"].append(re.match(r"\d+_(\d+)\.txt", file_path).group(1))
  return pd.DataFrame.from_dict(data)

# Merge positive and negative examples, add a polarity column and shuffle.
def load_dataset(directory):
  pos_df = load_directory_data(os.path.join(directory, "pos"))
  neg_df = load_directory_data(os.path.join(directory, "neg"))
  pos_df["polarity"] = 1
  neg_df["polarity"] = 0
  return pd.concat([pos_df, neg_df]).sample(frac=1).reset_index(drop=True)

# Download and process the dataset files.
def download_and_load_datasets(force_download=False):
  dataset = tf.keras.utils.get_file(
      fname="aclImdb.tar.gz", 
      origin="http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz", 
      extract=True)

  train_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                       "aclImdb", "train"))
  test_df = load_dataset(os.path.join(os.path.dirname(dataset), 
                                      "aclImdb", "test"))

  return train_df, test_df

# Reduce logging output.
logging.set_verbosity(logging.ERROR)

train_df, test_df = download_and_load_datasets()
train_df.head()
Downloading data from http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
84131840/84125825 [==============================] - 7s 0us/step
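
The loader above reads the rating out of each file name with a regular expression. As a minimal sketch of that step in isolation, assuming file names of the form `<id>_<rating>.txt` (the name `127_9.txt` and the helper `rating_from_filename` are made up for illustration, and converting the rating to an integer is a choice of this sketch, since the loader keeps it as a string):

```python
import re

# File names are assumed to follow "<id>_<rating>.txt", e.g. "127_9.txt".
FILENAME_PATTERN = re.compile(r"\d+_(\d+)\.txt")


def rating_from_filename(file_name):
    """Return the rating encoded in a review file name as an int."""
    match = FILENAME_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"Unexpected file name: {file_name!r}")
    return int(match.group(1))


print(rating_from_filename("127_9.txt"))  # 9
```

Note that the polarity label does not come from this rating: the loader assigns it from the directory ("pos" or "neg") the file was read from.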

Model

Input functions

The Estimator framework provides input functions that wrap Pandas dataframes.

# Training input on the whole training set with no limit on training epochs.
train_input_fn = tf.compat.v1.estimator.inputs.pandas_input_fn(
    train_df, train_df["polarity"], num_epochs=None, shuffle=True)

# Prediction on the whole training set.
predict_train_input_fn = tf.compat.v1.estimator.inputs.pandas_input_fn(
    train_df, train_df["polarity"], shuffle=False)
# Prediction on the test set.
predict_test_input_fn = tf.compat.v1.estimator.inputs.pandas_input_fn(
    test_df, test_df["polarity"], shuffle=False)
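
To make the roles of `num_epochs` and `shuffle` concrete, here is a simplified pure-Python sketch of the batching behaviour. It is not how `pandas_input_fn` is implemented; the generator `batches` and its `seed` parameter are hypothetical, and only the semantics of the arguments are mirrored (128 is the default batch size referred to in the training cell of this tutorial).

```python
import itertools
import random


def batches(examples, batch_size=128, num_epochs=1, shuffle=False, seed=0):
    """Yield lists of examples.

    num_epochs=None repeats the data indefinitely (training);
    num_epochs=1 makes a single pass (evaluation and prediction).
    """
    rng = random.Random(seed)
    epochs = itertools.count() if num_epochs is None else range(num_epochs)
    for _ in epochs:
        order = list(examples)
        if shuffle:
            rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            yield order[start:start + batch_size]


# One pass over 300 toy examples: two full batches and a remainder.
print([len(b) for b in batches(range(300))])  # [128, 128, 44]

# An endless, shuffled stream: the caller decides how many steps to take.
train_stream = batches(range(300), num_epochs=None, shuffle=True)
first_five = list(itertools.islice(train_stream, 5))
print(len(first_five))  # 5
```

Because the training stream never ends, the number of training steps has to be bounded by the caller, which is what the `steps` argument of `estimator.train` does.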

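Feature columns

TF-Hub provides a feature column that applies a module to the given text feature and passes the outputs of the module further downstream. In this tutorial we will use the nnlm-en-dim128 module. For the purpose of this tutorial, the most important facts are:

  • The module takes a batch of sentences in a 1-D tensor of strings as input.
  • The module is responsible for preprocessing the sentences (e.g. removing punctuation and splitting on spaces).
  • The module works with any input (e.g. nnlm-en-dim128 hashes words not present in the vocabulary into 20,000 buckets).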
embedded_text_feature_column = hub.text_embedding_column(
    key="sentence", 
    module_spec="https://tfhub.dev/google/nnlm-en-dim128/1")
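
The idea of hashing out-of-vocabulary words into a fixed number of buckets can be sketched as follows. This is only an illustration: the hash function the module really uses is not stated in this tutorial, so `zlib.crc32` is a stand-in, and the helper `oov_bucket` is hypothetical.

```python
import zlib

NUM_OOV_BUCKETS = 20000  # bucket count mentioned in the module description


def oov_bucket(word, num_buckets=NUM_OOV_BUCKETS):
    """Map an out-of-vocabulary word to a stable id in [0, num_buckets)."""
    # crc32 is an assumption made for this sketch, not the module's hash.
    return zlib.crc32(word.encode("utf-8")) % num_buckets


# The same unknown word always lands in the same bucket, so it can still
# be associated with an embedding row even though it is not in the vocabulary.
print(oov_bucket("xyzzy") == oov_bucket("xyzzy"))  # True
```

Different unknown words may collide in one bucket; that is the trade-off for being able to accept arbitrary input with a fixed-size embedding table.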

Estimator

For classification we can use a DNN Classifier. (Note: further remarks about different modelling of the label function are at the end of the tutorial.)

estimator = tf.estimator.DNNClassifier(
    hidden_units=[500, 100],
    feature_columns=[embedded_text_feature_column],
    n_classes=2,
    optimizer=tf.keras.optimizers.Adagrad(learning_rate=0.003))
INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmpdkwhugtd
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpdkwhugtd', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}

Training

Train the Estimator for a reasonable number of steps.
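
To judge what a reasonable number of steps is, it helps to convert steps into epochs. The short calculation below uses the figures from this tutorial: 5,000 training steps, the default batch size of 128, and 25,000 training examples.

```python
steps = 5000
batch_size = 128        # default batch size of the input function
train_examples = 25000  # size of the training set

examples_seen = steps * batch_size
epochs = examples_seen / train_examples
print(examples_seen, epochs)  # 640000 25.6
```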

# Training for 5,000 steps means 640,000 training examples with the default
# batch size. This is roughly equivalent to 25 epochs since the training dataset
# contains 25,000 examples.
estimator.train(input_fn=train_input_fn, steps=5000);
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/inputs/queues/feeding_queue_runner.py:65: QueueRunner.__init__ (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/inputs/queues/feeding_functions.py:491: add_queue_runner (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/keras/optimizer_v2/adagrad.py:83: calling Constant.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.6/site-packages/tensorflow/python/training/monitored_session.py:906: start_queue_runners (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmpdkwhugtd/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 0.69349724, step = 0
INFO:tensorflow:global_step/sec: 71.0329
INFO:tensorflow:loss = 0.6794902, step = 100 (1.414 sec)
INFO:tensorflow:global_step/sec: 91.0063
INFO:tensorflow:loss = 0.6654321, step = 200 (1.095 sec)
INFO:tensorflow:global_step/sec: 103.532
INFO:tensorflow:loss = 0.66587555, step = 300 (0.966 sec)
INFO:tensorflow:global_step/sec: 103.498
INFO:tensorflow:loss = 0.64757437, step = 400 (0.966 sec)
INFO:tensorflow:global_step/sec: 102.333
INFO:tensorflow:loss = 0.63865006, step = 500 (0.977 sec)
INFO:tensorflow:global_step/sec: 104.901
INFO:tensorflow:loss = 0.6285503, step = 600 (0.953 sec)
INFO:tensorflow:global_step/sec: 99.1627
INFO:tensorflow:loss = 0.61981493, step = 700 (1.012 sec)
INFO:tensorflow:global_step/sec: 96.5171
INFO:tensorflow:loss = 0.59588444, step = 800 (1.033 sec)
INFO:tensorflow:global_step/sec: 99.764
INFO:tensorflow:loss = 0.6028495, step = 900 (1.002 sec)
INFO:tensorflow:global_step/sec: 96.5284
INFO:tensorflow:loss = 0.60471284, step = 1000 (1.037 sec)
INFO:tensorflow:global_step/sec: 101.166
INFO:tensorflow:loss = 0.59343255, step = 1100 (0.988 sec)
INFO:tensorflow:global_step/sec: 69.1799
INFO:tensorflow:loss = 0.60358894, step = 1200 (1.445 sec)
INFO:tensorflow:global_step/sec: 96.9234
INFO:tensorflow:loss = 0.59524465, step = 1300 (1.032 sec)
INFO:tensorflow:global_step/sec: 102.805
INFO:tensorflow:loss = 0.5466298, step = 1400 (0.972 sec)
INFO:tensorflow:global_step/sec: 102.171
INFO:tensorflow:loss = 0.5659929, step = 1500 (0.979 sec)
INFO:tensorflow:global_step/sec: 103.462
INFO:tensorflow:loss = 0.5457308, step = 1600 (0.966 sec)
INFO:tensorflow:global_step/sec: 104.962
INFO:tensorflow:loss = 0.54733324, step = 1700 (0.953 sec)
INFO:tensorflow:global_step/sec: 103.402
INFO:tensorflow:loss = 0.49836773, step = 1800 (0.968 sec)
INFO:tensorflow:global_step/sec: 101.553
INFO:tensorflow:loss = 0.534271, step = 1900 (0.984 sec)
INFO:tensorflow:global_step/sec: 99.9588
INFO:tensorflow:loss = 0.5265736, step = 2000 (1.000 sec)
INFO:tensorflow:global_step/sec: 100.492
INFO:tensorflow:loss = 0.4579283, step = 2100 (0.995 sec)
INFO:tensorflow:global_step/sec: 98.3662
INFO:tensorflow:loss = 0.5117551, step = 2200 (1.017 sec)
INFO:tensorflow:global_step/sec: 97.8359
INFO:tensorflow:loss = 0.5549089, step = 2300 (1.023 sec)
INFO:tensorflow:global_step/sec: 94.8334
INFO:tensorflow:loss = 0.46146345, step = 2400 (1.054 sec)
INFO:tensorflow:global_step/sec: 96.3887
INFO:tensorflow:loss = 0.4327197, step = 2500 (1.038 sec)
INFO:tensorflow:global_step/sec: 96.6847
INFO:tensorflow:loss = 0.5277687, step = 2600 (1.034 sec)
INFO:tensorflow:global_step/sec: 98.7156
INFO:tensorflow:loss = 0.3798782, step = 2700 (1.013 sec)
INFO:tensorflow:global_step/sec: 95.9932
INFO:tensorflow:loss = 0.48133224, step = 2800 (1.042 sec)
INFO:tensorflow:global_step/sec: 93.8834
INFO:tensorflow:loss = 0.43487376, step = 2900 (1.065 sec)
INFO:tensorflow:global_step/sec: 95.5893
INFO:tensorflow:loss = 0.50173616, step = 3000 (1.046 sec)
INFO:tensorflow:global_step/sec: 92.8243
INFO:tensorflow:loss = 0.44710934, step = 3100 (1.077 sec)
INFO:tensorflow:global_step/sec: 94.9422
INFO:tensorflow:loss = 0.4261893, step = 3200 (1.054 sec)
INFO:tensorflow:global_step/sec: 100.42
INFO:tensorflow:loss = 0.45536137, step = 3300 (0.995 sec)
INFO:tensorflow:global_step/sec: 101.866
INFO:tensorflow:loss = 0.55128187, step = 3400 (0.982 sec)
INFO:tensorflow:global_step/sec: 100.78
INFO:tensorflow:loss = 0.55404973, step = 3500 (0.992 sec)
INFO:tensorflow:global_step/sec: 100.322
INFO:tensorflow:loss = 0.45455348, step = 3600 (0.997 sec)
INFO:tensorflow:global_step/sec: 102.235
INFO:tensorflow:loss = 0.48876747, step = 3700 (0.978 sec)
INFO:tensorflow:global_step/sec: 100.242
INFO:tensorflow:loss = 0.5316795, step = 3800 (0.998 sec)
INFO:tensorflow:global_step/sec: 100.909
INFO:tensorflow:loss = 0.4505452, step = 3900 (0.991 sec)
INFO:tensorflow:global_step/sec: 96.9219
INFO:tensorflow:loss = 0.45720708, step = 4000 (1.031 sec)
INFO:tensorflow:global_step/sec: 96.7579
INFO:tensorflow:loss = 0.47777736, step = 4100 (1.034 sec)
INFO:tensorflow:global_step/sec: 98.867
INFO:tensorflow:loss = 0.48704547, step = 4200 (1.011 sec)
INFO:tensorflow:global_step/sec: 100.343
INFO:tensorflow:loss = 0.4784004, step = 4300 (0.997 sec)
INFO:tensorflow:global_step/sec: 98.7306
INFO:tensorflow:loss = 0.49343526, step = 4400 (1.013 sec)
INFO:tensorflow:global_step/sec: 99.0127
INFO:tensorflow:loss = 0.48327613, step = 4500 (1.009 sec)
INFO:tensorflow:global_step/sec: 99.4392
INFO:tensorflow:loss = 0.38068467, step = 4600 (1.006 sec)
INFO:tensorflow:global_step/sec: 98.1745
INFO:tensorflow:loss = 0.41735762, step = 4700 (1.019 sec)
INFO:tensorflow:global_step/sec: 98.5399
INFO:tensorflow:loss = 0.47647473, step = 4800 (1.015 sec)
INFO:tensorflow:global_step/sec: 100.345
INFO:tensorflow:loss = 0.40401012, step = 4900 (0.997 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 5000...
INFO:tensorflow:Saving checkpoints for 5000 into /tmp/tmpdkwhugtd/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 5000...
INFO:tensorflow:Loss for final step: 0.49753642.
<tensorflow_estimator.python.estimator.canned.dnn.DNNClassifierV2 at 0x7fb55f847da0>

Prediction

Run predictions for both the training and test sets.

train_eval_result = estimator.evaluate(input_fn=predict_train_input_fn)
test_eval_result = estimator.evaluate(input_fn=predict_test_input_fn)

print("Training set accuracy: {accuracy}".format(**train_eval_result))
print("Test set accuracy: {accuracy}".format(**test_eval_result))
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-02-12T21:07:11Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpdkwhugtd/model.ckpt-5000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 3.64340s
INFO:tensorflow:Finished evaluation at 2021-02-12-21:07:15
INFO:tensorflow:Saving dict for global step 5000: accuracy = 0.7926, accuracy_baseline = 0.5, auc = 0.8736185, auc_precision_recall = 0.87475705, average_loss = 0.44760704, global_step = 5000, label/mean = 0.5, loss = 0.44769168, precision = 0.805326, prediction/mean = 0.4884011, recall = 0.77176
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 5000: /tmp/tmpdkwhugtd/model.ckpt-5000
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-02-12T21:07:16Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpdkwhugtd/model.ckpt-5000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 3.59406s
INFO:tensorflow:Finished evaluation at 2021-02-12-21:07:20
INFO:tensorflow:Saving dict for global step 5000: accuracy = 0.78468, accuracy_baseline = 0.5, auc = 0.86784285, auc_precision_recall = 0.8701854, average_loss = 0.4562724, global_step = 5000, label/mean = 0.5, loss = 0.4558364, precision = 0.8037559, prediction/mean = 0.48375455, recall = 0.75328
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 5000: /tmp/tmpdkwhugtd/model.ckpt-5000
Training set accuracy: 0.7925999760627747
Test set accuracy: 0.7846800088882446
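
Two quick derived numbers help when reading these results: the gap between training and test accuracy (a rough indicator of overfitting) and the F1 score on the test set. The sketch below simply plugs in the values printed in the evaluation log above; the helper `f1_score` is defined here for illustration and is not part of the Estimator API.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation output above.
train_accuracy = 0.7926
test_accuracy = 0.78468
test_precision = 0.8037559
test_recall = 0.75328

gap = train_accuracy - test_accuracy
test_f1 = f1_score(test_precision, test_recall)

print(f"train/test accuracy gap: {gap:.4f}")
print(f"test F1: {test_f1:.4f}")
```

A gap of less than one percentage point suggests that, at this point, the model is not overfitting the training set to any large degree.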

Confusion matrix

We can inspect the confusion matrix visually to understand the distribution of misclassifications.

def get_predictions(estimator, input_fn):
  return [x["class_ids"][0] for x in estimator.predict(input_fn=input_fn)]

LABELS = [
    "negative", "positive"
]

# Create a confusion matrix on training data.
cm = tf.math.confusion_matrix(train_df["polarity"], 
                              get_predictions(estimator, predict_train_input_fn))

# Normalize the confusion matrix so that each row sums to 1.
cm = tf.cast(cm, dtype=tf.float32)
cm = cm / tf.math.reduce_sum(cm, axis=1)[:, np.newaxis]

sns.heatmap(cm, annot=True, xticklabels=LABELS, yticklabels=LABELS);
plt.xlabel("Predicted");
plt.ylabel("True");
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpdkwhugtd/model.ckpt-5000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
Text(33.0, 0.5, 'True')
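
For readers who want to see what the confusion matrix and its row normalization compute, here is a dependency-free sketch on toy labels. The labels are invented for illustration, and the helpers `confusion_matrix` and `normalize_rows` are hypothetical stand-ins for the TensorFlow operations used above.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes=2):
    """Entry [i][j] counts examples of true class i predicted as class j."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def normalize_rows(matrix):
    """Divide each row by its sum so that every non-empty row sums to 1."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


# Toy labels, made up for illustration (0 = negative, 1 = positive).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]

counts = confusion_matrix(y_true, y_pred)
print(counts)                  # [[3, 1], [2, 2]]
print(normalize_rows(counts))  # [[0.75, 0.25], [0.5, 0.5]]
```

After row normalization, each row reads as "of all examples that truly belong to this class, what fraction was predicted as each class", so the diagonal holds the per-class recall.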

[Figure: heatmap of the row-normalized confusion matrix; x-axis: Predicted, y-axis: True]

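Further improvements

  1. Regression on sentiment: We used a classifier to assign each example to a polarity class. But we actually have another categorical feature at our disposal: sentiment. Here the classes represent a scale, and the underlying value (positive/negative) could be mapped well onto a continuous range. We could make use of this property by computing a regression (DNN Regressor) instead of a classification (DNN Classifier).
  2. Larger module: For the purposes of this tutorial we used a small module to restrict memory use. Modules with larger vocabularies and larger embedding spaces could give additional accuracy points.
  3. Parameter tuning: We can improve the accuracy by tuning meta-parameters such as the learning rate or the number of steps, especially if we use a different module. A validation set is very important if we want reasonable results, because it is very easy to set up a model that learns to predict the training data without generalizing well to the test set.
  4. More complex model: We used a module that computes a sentence embedding by embedding each individual word and then combining the embeddings with an average. One could also use a sequential module (e.g. the Universal Sentence Encoder module) to better capture the nature of sentences, or an ensemble of two or more TF-Hub modules.
  5. Regularization: To prevent overfitting, we could try an optimizer that performs some form of regularization, for example the Proximal Adagrad optimizer.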
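
As a minimal sketch of the validation set mentioned under parameter tuning, the training examples can be shuffled once and split before any tuning begins. The helper `train_validation_split`, the 20% fraction, and the fixed seed are assumptions of this sketch, not choices made by the tutorial.

```python
import random


def train_validation_split(examples, validation_fraction=0.2, seed=0):
    """Shuffle once with a fixed seed, then split off a validation part."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


# Indices of the 25,000 training reviews stand in for the real rows here.
train_part, validation_part = train_validation_split(range(25000))
print(len(train_part), len(validation_part))  # 20000 5000
```

Meta-parameters would then be compared on the validation part only, leaving the test set untouched until the final evaluation.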

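Advanced: Transfer learning analysis

Transfer learning makes it possible to save training resources and to achieve good model generalization even when training on a small dataset. In this part, we will demonstrate this by training with two different TF-Hub modules:

  • nnlm-en-dim128 - a pretrained text embedding module,
  • random-nnlm-en-dim128 - a text embedding module that has the same vocabulary and network as nnlm-en-dim128, but whose weights were only randomly initialized and never trained on real data.

And by training in two modes:

  • training only the classifier (i.e. freezing the module), and
  • training the classifier together with the module.

Let's run a couple of trainings and evaluations to see how using various modules affects the accuracy.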
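
Two modules and two training modes give four runs in total. One way to enumerate them, sketched here without TensorFlow, is a small grid keyed by the same result names this tutorial uses; the helper `experiment_grid` and the `MODULES` mapping are hypothetical conveniences, while the module URLs are the ones used in this tutorial.

```python
import itertools

MODULES = {
    "nnlm-en-dim128": "https://tfhub.dev/google/nnlm-en-dim128/1",
    "random-nnlm-en-dim128": "https://tfhub.dev/google/random-nnlm-en-dim128/1",
}


def experiment_grid(modules=MODULES):
    """Return {result name: (module URL, train_module flag)} for every run."""
    grid = {}
    for (name, url), train_module in itertools.product(
            modules.items(), [False, True]):
        key = name + ("-with-module-training" if train_module else "")
        grid[key] = (url, train_module)
    return grid


for key, (url, train_module) in experiment_grid().items():
    print(key, url, train_module)
```

Each `(url, train_module)` pair corresponds to one call of a training-and-evaluation function that takes a module handle and a flag saying whether the module weights are trainable.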

def train_and_evaluate_with_module(hub_module, train_module=False):
  embedded_text_feature_column = hub.text_embedding_column(
      key="sentence", module_spec=hub_module, trainable=train_module)

  estimator = tf.estimator.DNNClassifier(
      hidden_units=[500, 100],
      feature_columns=[embedded_text_feature_column],
      n_classes=2,
      optimizer=tf.keras.optimizers.Adagrad(learning_rate=0.003))

  estimator.train(input_fn=train_input_fn, steps=1000)

  train_eval_result = estimator.evaluate(input_fn=predict_train_input_fn)
  test_eval_result = estimator.evaluate(input_fn=predict_test_input_fn)

  training_set_accuracy = train_eval_result["accuracy"]
  test_set_accuracy = test_eval_result["accuracy"]

  return {
      "Training accuracy": training_set_accuracy,
      "Test accuracy": test_set_accuracy
  }


results = {}
results["nnlm-en-dim128"] = train_and_evaluate_with_module(
    "https://tfhub.dev/google/nnlm-en-dim128/1")
results["nnlm-en-dim128-with-module-training"] = train_and_evaluate_with_module(
    "https://tfhub.dev/google/nnlm-en-dim128/1", True)
results["random-nnlm-en-dim128"] = train_and_evaluate_with_module(
    "https://tfhub.dev/google/random-nnlm-en-dim128/1")
results["random-nnlm-en-dim128-with-module-training"] = train_and_evaluate_with_module(
    "https://tfhub.dev/google/random-nnlm-en-dim128/1", True)
INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmp7w2py65s
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmp7w2py65s', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmp7w2py65s/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 0.7113131, step = 0
INFO:tensorflow:global_step/sec: 94.3368
INFO:tensorflow:loss = 0.6764325, step = 100 (1.062 sec)
INFO:tensorflow:global_step/sec: 98.1725
INFO:tensorflow:loss = 0.65599465, step = 200 (1.019 sec)
INFO:tensorflow:global_step/sec: 99.0769
INFO:tensorflow:loss = 0.64254063, step = 300 (1.009 sec)
INFO:tensorflow:global_step/sec: 97.1168
INFO:tensorflow:loss = 0.63655424, step = 400 (1.030 sec)
INFO:tensorflow:global_step/sec: 98.0562
INFO:tensorflow:loss = 0.6260065, step = 500 (1.020 sec)
INFO:tensorflow:global_step/sec: 96.3862
INFO:tensorflow:loss = 0.625863, step = 600 (1.038 sec)
INFO:tensorflow:global_step/sec: 100.74
INFO:tensorflow:loss = 0.6222489, step = 700 (0.993 sec)
INFO:tensorflow:loss = 0.6222489, step = 700 (0.993 sec)
INFO:tensorflow:global_step/sec: 95.7688
INFO:tensorflow:global_step/sec: 95.7688
INFO:tensorflow:loss = 0.5913098, step = 800 (1.044 sec)
INFO:tensorflow:loss = 0.5913098, step = 800 (1.044 sec)
INFO:tensorflow:global_step/sec: 98.9735
INFO:tensorflow:global_step/sec: 98.9735
INFO:tensorflow:loss = 0.5569434, step = 900 (1.010 sec)
INFO:tensorflow:loss = 0.5569434, step = 900 (1.010 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 1000...
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 1000...
INFO:tensorflow:Saving checkpoints for 1000 into /tmp/tmp7w2py65s/model.ckpt.
INFO:tensorflow:Saving checkpoints for 1000 into /tmp/tmp7w2py65s/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 1000...
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 1000...
INFO:tensorflow:Loss for final step: 0.56597835.
INFO:tensorflow:Loss for final step: 0.56597835.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-02-12T21:07:38Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmp7w2py65s/model.ckpt-1000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 3.62319s
INFO:tensorflow:Finished evaluation at 2021-02-12-21:07:41
INFO:tensorflow:Saving dict for global step 1000: accuracy = 0.73124, accuracy_baseline = 0.5, auc = 0.80642134, auc_precision_recall = 0.8075039, average_loss = 0.57480353, global_step = 1000, label/mean = 0.5, loss = 0.5747586, precision = 0.73547864, prediction/mean = 0.501575, recall = 0.72224
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1000: /tmp/tmp7w2py65s/model.ckpt-1000
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-02-12T21:07:43Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmp7w2py65s/model.ckpt-1000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 3.57959s
INFO:tensorflow:Finished evaluation at 2021-02-12-21:07:46
INFO:tensorflow:Saving dict for global step 1000: accuracy = 0.72492, accuracy_baseline = 0.5, auc = 0.79834723, auc_precision_recall = 0.79879713, average_loss = 0.57974845, global_step = 1000, label/mean = 0.5, loss = 0.57965636, precision = 0.73580474, prediction/mean = 0.4974555, recall = 0.70184
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1000: /tmp/tmp7w2py65s/model.ckpt-1000
INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmpcuj0j_zw
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpcuj0j_zw', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmpcuj0j_zw/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 0.696195, step = 0
INFO:tensorflow:global_step/sec: 91.5797
INFO:tensorflow:loss = 0.685055, step = 100 (1.094 sec)
INFO:tensorflow:global_step/sec: 100.283
INFO:tensorflow:loss = 0.674756, step = 200 (0.997 sec)
INFO:tensorflow:global_step/sec: 97.74
INFO:tensorflow:loss = 0.65558565, step = 300 (1.023 sec)
INFO:tensorflow:global_step/sec: 100.253
INFO:tensorflow:loss = 0.66118705, step = 400 (0.998 sec)
INFO:tensorflow:global_step/sec: 99.4053
INFO:tensorflow:loss = 0.63699234, step = 500 (1.006 sec)
INFO:tensorflow:global_step/sec: 99.5299
INFO:tensorflow:loss = 0.620423, step = 600 (1.005 sec)
INFO:tensorflow:global_step/sec: 98.2414
INFO:tensorflow:loss = 0.6484535, step = 700 (1.017 sec)
INFO:tensorflow:global_step/sec: 99.8481
INFO:tensorflow:loss = 0.6057408, step = 800 (1.001 sec)
INFO:tensorflow:global_step/sec: 96.1552
INFO:tensorflow:loss = 0.56978875, step = 900 (1.040 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 1000...
INFO:tensorflow:Saving checkpoints for 1000 into /tmp/tmpcuj0j_zw/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 1000...
INFO:tensorflow:Loss for final step: 0.5444518.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-02-12T21:08:01Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpcuj0j_zw/model.ckpt-1000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 3.59808s
INFO:tensorflow:Finished evaluation at 2021-02-12-21:08:04
INFO:tensorflow:Saving dict for global step 1000: accuracy = 0.7342, accuracy_baseline = 0.5, auc = 0.81058335, auc_precision_recall = 0.81121254, average_loss = 0.5834282, global_step = 1000, label/mean = 0.5, loss = 0.5833859, precision = 0.73748684, prediction/mean = 0.5024869, recall = 0.72728
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1000: /tmp/tmpcuj0j_zw/model.ckpt-1000
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-02-12T21:08:05Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpcuj0j_zw/model.ckpt-1000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 3.78260s
INFO:tensorflow:Finished evaluation at 2021-02-12-21:08:09
INFO:tensorflow:Saving dict for global step 1000: accuracy = 0.72648, accuracy_baseline = 0.5, auc = 0.8017419, auc_precision_recall = 0.8010956, average_loss = 0.5883348, global_step = 1000, label/mean = 0.5, loss = 0.5882809, precision = 0.73412174, prediction/mean = 0.49961936, recall = 0.71016
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1000: /tmp/tmpcuj0j_zw/model.ckpt-1000
INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmp9a0hlmrm
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmp9a0hlmrm', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmp9a0hlmrm/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 0.6971753, step = 0
INFO:tensorflow:global_step/sec: 95.8891
INFO:tensorflow:loss = 0.6725138, step = 100 (1.045 sec)
INFO:tensorflow:global_step/sec: 101.915
INFO:tensorflow:loss = 0.6222487, step = 200 (0.981 sec)
INFO:tensorflow:global_step/sec: 102.506
INFO:tensorflow:loss = 0.6487738, step = 300 (0.976 sec)
INFO:tensorflow:global_step/sec: 102.782
INFO:tensorflow:loss = 0.66589504, step = 400 (0.973 sec)
INFO:tensorflow:global_step/sec: 102.939
INFO:tensorflow:loss = 0.5936918, step = 500 (0.972 sec)
INFO:tensorflow:global_step/sec: 103.992
INFO:tensorflow:loss = 0.5987308, step = 600 (0.961 sec)
INFO:tensorflow:global_step/sec: 104.541
INFO:tensorflow:loss = 0.55824125, step = 700 (0.957 sec)
INFO:tensorflow:global_step/sec: 103.713
INFO:tensorflow:loss = 0.6031821, step = 800 (0.964 sec)
INFO:tensorflow:global_step/sec: 104.16
INFO:tensorflow:loss = 0.63235676, step = 900 (0.959 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 1000...
INFO:tensorflow:Saving checkpoints for 1000 into /tmp/tmp9a0hlmrm/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 1000...
INFO:tensorflow:Loss for final step: 0.5835521.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-02-12T21:08:33Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmp9a0hlmrm/model.ckpt-1000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 3.50186s
INFO:tensorflow:Finished evaluation at 2021-02-12-21:08:36
INFO:tensorflow:Saving dict for global step 1000: accuracy = 0.67772, accuracy_baseline = 0.5, auc = 0.7420038, auc_precision_recall = 0.73254406, average_loss = 0.5978992, global_step = 1000, label/mean = 0.5, loss = 0.5979569, precision = 0.6771955, prediction/mean = 0.5002821, recall = 0.6792
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1000: /tmp/tmp9a0hlmrm/model.ckpt-1000
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-02-12T21:08:38Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmp9a0hlmrm/model.ckpt-1000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 3.44084s
INFO:tensorflow:Finished evaluation at 2021-02-12-21:08:41
INFO:tensorflow:Saving dict for global step 1000: accuracy = 0.66364, accuracy_baseline = 0.5, auc = 0.7241649, auc_precision_recall = 0.71436924, average_loss = 0.61208284, global_step = 1000, label/mean = 0.5, loss = 0.612271, precision = 0.6647338, prediction/mean = 0.50022006, recall = 0.66032
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1000: /tmp/tmp9a0hlmrm/model.ckpt-1000
INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmpb3jj1xco
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpb3jj1xco', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmpb3jj1xco/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 0.76369715, step = 0
INFO:tensorflow:global_step/sec: 98.1455
INFO:tensorflow:loss = 0.6576936, step = 100 (1.021 sec)
INFO:tensorflow:global_step/sec: 103.247
INFO:tensorflow:loss = 0.6220906, step = 200 (0.968 sec)
INFO:tensorflow:global_step/sec: 101.594
INFO:tensorflow:loss = 0.6021811, step = 300 (0.984 sec)
INFO:tensorflow:global_step/sec: 101.32
INFO:tensorflow:loss = 0.6112791, step = 400 (0.987 sec)
INFO:tensorflow:global_step/sec: 103.923
INFO:tensorflow:loss = 0.6482002, step = 500 (0.962 sec)
INFO:tensorflow:global_step/sec: 100.859
INFO:tensorflow:loss = 0.56012577, step = 600 (0.992 sec)
INFO:tensorflow:global_step/sec: 101.543
INFO:tensorflow:loss = 0.56843096, step = 700 (0.985 sec)
INFO:tensorflow:global_step/sec: 102.58
INFO:tensorflow:loss = 0.6630769, step = 800 (0.974 sec)
INFO:tensorflow:global_step/sec: 103.531
INFO:tensorflow:loss = 0.54503405, step = 900 (0.966 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 1000...
INFO:tensorflow:Saving checkpoints for 1000 into /tmp/tmpb3jj1xco/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 1000...
INFO:tensorflow:Loss for final step: 0.5460258.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-02-12T21:08:55Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpb3jj1xco/model.ckpt-1000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 3.59697s
INFO:tensorflow:Finished evaluation at 2021-02-12-21:08:59
INFO:tensorflow:Saving dict for global step 1000: accuracy = 0.68, accuracy_baseline = 0.5, auc = 0.74497974, auc_precision_recall = 0.73682225, average_loss = 0.5951442, global_step = 1000, label/mean = 0.5, loss = 0.5953157, precision = 0.68277824, prediction/mean = 0.4955421, recall = 0.6724
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1000: /tmp/tmpb3jj1xco/model.ckpt-1000
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-02-12T21:09:00Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpb3jj1xco/model.ckpt-1000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 3.50390s
INFO:tensorflow:Finished evaluation at 2021-02-12-21:09:04
INFO:tensorflow:Saving dict for global step 1000: accuracy = 0.6666, accuracy_baseline = 0.5, auc = 0.72640914, auc_precision_recall = 0.7165327, average_loss = 0.61048025, global_step = 1000, label/mean = 0.5, loss = 0.61059827, precision = 0.66951567, prediction/mean = 0.4960133, recall = 0.658
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1000: /tmp/tmpb3jj1xco/model.ckpt-1000

Let's look at the results.

pd.DataFrame.from_dict(results, orient="index")

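Some patterns are already visible, but first we should establish the baseline accuracy on the test set: the lower bound that can be achieved by always outputting the label of the most represented class.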

estimator.evaluate(input_fn=predict_test_input_fn)["accuracy_baseline"]
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-02-12T21:09:05Z
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpdkwhugtd/model.ckpt-5000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 3.53047s
INFO:tensorflow:Finished evaluation at 2021-02-12-21:09:08
INFO:tensorflow:Saving dict for global step 5000: accuracy = 0.78468, accuracy_baseline = 0.5, auc = 0.86784285, auc_precision_recall = 0.8701854, average_loss = 0.4562724, global_step = 5000, label/mean = 0.5, loss = 0.4558364, precision = 0.8037559, prediction/mean = 0.48375455, recall = 0.75328
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 5000: /tmp/tmpdkwhugtd/model.ckpt-5000
0.5
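
The `accuracy_baseline` value reported by the Estimator can be cross-checked by hand. The following is a minimal sketch, assuming only NumPy; the helper name `majority_class_baseline` and the toy label lists are hypothetical and are not part of the tutorial. It computes the accuracy obtained by always predicting the most frequent label.

```python
import numpy as np

def majority_class_baseline(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    labels = np.asarray(labels)
    _, counts = np.unique(labels, return_counts=True)
    return counts.max() / labels.size

# Toy labels (hypothetical). The first list is balanced, like the test set
# in the log above, where label/mean = 0.5.
balanced = [0, 1] * 5
skewed = [1] * 8 + [0] * 2

print(majority_class_baseline(balanced))  # 0.5
print(majority_class_baseline(skewed))    # 0.8
```

On a balanced binary dataset the function returns 0.5, which matches the value printed above. On a skewed dataset the baseline rises, so a reported accuracy should always be read against it.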

Always assigning the most represented class gives an accuracy of 50%. A few points are worth noting here:

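  1. It may be surprising, but a model can still learn on top of fixed, random embeddings. Even if every word in the dictionary is mapped to a random vector, the Estimator can separate the space using its fully connected layers alone.
  2. Allowing the module with random embeddings to be trained increases both training and test accuracy compared with training only the classifier.
  3. Training the module with pre-trained embeddings also increases both training and test accuracy. Watch out for overfitting on the training set, however. Training a pre-trained module can be dangerous even with regularization, in the sense that the embedding weights no longer represent a language model trained on diverse data. Instead, they converge to the ideal representation of the new dataset.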
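
The first point can be illustrated with a small sketch that is independent of TF-Hub. It makes several simplifying assumptions that are not part of the tutorial: a synthetic corpus in which positive and negative documents draw from disjoint halves of a 50-word vocabulary, NumPy in place of TensorFlow, and a single logistic-regression layer instead of the Estimator's fully connected layers. All names below are hypothetical. The embeddings are drawn once at random and never updated; only the classifier on top of them is trained.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB_SIZE, EMBED_DIM, WORDS_PER_DOC = 50, 64, 10

# Fixed random embeddings: drawn once and never updated during training.
embeddings = rng.normal(size=(VOCAB_SIZE, EMBED_DIM))

def make_docs(n_docs):
    """Toy corpus: positive docs use word ids 0-24, negative docs use 25-49."""
    labels = rng.integers(0, 2, size=n_docs)
    low = np.where(labels == 1, 0, VOCAB_SIZE // 2)
    offsets = rng.integers(0, VOCAB_SIZE // 2, size=(n_docs, WORDS_PER_DOC))
    word_ids = low[:, None] + offsets
    # Represent each document as the mean of its word embeddings.
    features = embeddings[word_ids].mean(axis=1)
    return features, labels

def train_logistic_regression(x, y, steps=300, learning_rate=0.5):
    """Train only the classifier weights with plain gradient descent."""
    weights = np.zeros(x.shape[1])
    bias = 0.0
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-(x @ weights + bias)))
        error = probs - y
        weights -= learning_rate * (x.T @ error) / len(y)
        bias -= learning_rate * error.mean()
    return weights, bias

def accuracy(x, y, weights, bias):
    """Fraction of documents whose predicted class matches the label."""
    return float(np.mean(((x @ weights + bias) > 0) == (y == 1)))

train_x, train_y = make_docs(400)
test_x, test_y = make_docs(200)
weights, bias = train_logistic_regression(train_x, train_y)
test_accuracy = accuracy(test_x, test_y, weights, bias)
print(f"test accuracy: {test_accuracy:.3f}")
```

Because the random vectors still give each word a distinct, consistent position in the embedding space, the classifier can find a direction that separates the two classes. The toy task is much easier than IMDB reviews, so the accuracy it reaches says nothing about the numbers reported in this tutorial.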