TensorFlow Model Analysis

An Example of a Key Component of TensorFlow Extended (TFX)

TensorFlow Model Analysis (TFMA) is a library for evaluating models across different slices of data. TFMA performs its computations in a distributed manner over large amounts of data using Apache Beam.

This example colab notebook illustrates how TFMA can be used to investigate and visualize the performance of a model with respect to characteristics of the dataset. We'll use a model that we trained previously, and now you get to play with the results! The model we trained was for the Chicago Taxi Example, which uses the Taxi Trips dataset released by the City of Chicago. Explore the full dataset in the BigQuery UI.

As a modeler and developer, think about how this data is used and the potential benefits and harm a model's predictions can cause. A model like this could reinforce societal biases and disparities. Is a feature relevant to the problem you want to solve, or will it introduce bias? For more information, read about ML fairness.

The columns in the dataset are:

pickup_community_area	fare	trip_start_month
trip_start_hour	trip_start_day	trip_start_timestamp
pickup_latitude	pickup_longitude	dropoff_latitude
dropoff_longitude	trip_miles	pickup_census_tract
dropoff_census_tract	payment_type	company
trip_seconds	dropoff_community_area	tips

Install Jupyter Extensions

jupyter nbextension enable --py widgetsnbextension --sys-prefix 
jupyter nbextension install --py --symlink tensorflow_model_analysis --sys-prefix 
jupyter nbextension enable --py tensorflow_model_analysis --sys-prefix

Install TensorFlow Model Analysis (TFMA)

This will pull in all the dependencies and will take a minute.

# Upgrade pip to the latest, and install TFMA.
pip install -U pip
pip install tensorflow-model-analysis

Now you must restart the runtime before running the cells below.

# This setup was tested with TF 2.5 and TFMA 0.31 (using colab), but it should
# also work with the latest release.
import sys

# Confirm that we're using Python 3
assert sys.version_info.major==3, 'This notebook must be run using Python 3.'

import tensorflow as tf
print('TF version: {}'.format(tf.__version__))
import apache_beam as beam
print('Beam version: {}'.format(beam.__version__))
import tensorflow_model_analysis as tfma
print('TFMA version: {}'.format(tfma.__version__))

TF version: 2.4.4
Beam version: 2.34.0
TFMA version: 0.29.0

Load the Files

We'll download a tar file that has everything we need. That includes:

Training and evaluation datasets
Data schema
Training and serving saved models (keras and estimator) and eval saved models (estimator)

# Download the tar file from GCP and extract it
import io, os, tempfile
TAR_NAME = 'saved_models-2.2'
BASE_DIR = tempfile.mkdtemp()
DATA_DIR = os.path.join(BASE_DIR, TAR_NAME, 'data')
MODELS_DIR = os.path.join(BASE_DIR, TAR_NAME, 'models')
SCHEMA = os.path.join(BASE_DIR, TAR_NAME, 'schema.pbtxt')
OUTPUT_DIR = os.path.join(BASE_DIR, 'output')

!curl -O https://storage.googleapis.com/artifacts.tfx-oss-public.appspot.com/datasets/{TAR_NAME}.tar
!tar xf {TAR_NAME}.tar
!mv {TAR_NAME} {BASE_DIR}
!rm {TAR_NAME}.tar

print("Here's what we downloaded:")
!ls -R {BASE_DIR}

% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 6800k  100 6800k    0     0  28.2M      0 --:--:-- --:--:-- --:--:-- 28.2M
Here's what we downloaded:
/tmp/tmp_at9q62d:
saved_models-2.2

/tmp/tmp_at9q62d/saved_models-2.2:
data  models  schema.pbtxt

/tmp/tmp_at9q62d/saved_models-2.2/data:
eval  train

/tmp/tmp_at9q62d/saved_models-2.2/data/eval:
data.csv

/tmp/tmp_at9q62d/saved_models-2.2/data/train:
data.csv

/tmp/tmp_at9q62d/saved_models-2.2/models:
estimator  keras

/tmp/tmp_at9q62d/saved_models-2.2/models/estimator:
eval_model_dir  serving_model_dir

/tmp/tmp_at9q62d/saved_models-2.2/models/estimator/eval_model_dir:
1591221811

/tmp/tmp_at9q62d/saved_models-2.2/models/estimator/eval_model_dir/1591221811:
saved_model.pb  tmp.pbtxt  variables

/tmp/tmp_at9q62d/saved_models-2.2/models/estimator/eval_model_dir/1591221811/variables:
variables.data-00000-of-00001  variables.index

/tmp/tmp_at9q62d/saved_models-2.2/models/estimator/serving_model_dir:
checkpoint
eval_chicago-taxi-eval
events.out.tfevents.1591221780.my-pipeline-b57vp-237544850
export
graph.pbtxt
model.ckpt-100.data-00000-of-00001
model.ckpt-100.index
model.ckpt-100.meta

/tmp/tmp_at9q62d/saved_models-2.2/models/estimator/serving_model_dir/eval_chicago-taxi-eval:
events.out.tfevents.1591221799.my-pipeline-b57vp-237544850

/tmp/tmp_at9q62d/saved_models-2.2/models/estimator/serving_model_dir/export:
chicago-taxi

/tmp/tmp_at9q62d/saved_models-2.2/models/estimator/serving_model_dir/export/chicago-taxi:
1591221801

/tmp/tmp_at9q62d/saved_models-2.2/models/estimator/serving_model_dir/export/chicago-taxi/1591221801:
saved_model.pb  variables

/tmp/tmp_at9q62d/saved_models-2.2/models/estimator/serving_model_dir/export/chicago-taxi/1591221801/variables:
variables.data-00000-of-00001  variables.index

/tmp/tmp_at9q62d/saved_models-2.2/models/keras:
0  1  2

/tmp/tmp_at9q62d/saved_models-2.2/models/keras/0:
saved_model.pb  variables

/tmp/tmp_at9q62d/saved_models-2.2/models/keras/0/variables:
variables.data-00000-of-00001  variables.index

/tmp/tmp_at9q62d/saved_models-2.2/models/keras/1:
saved_model.pb  variables

/tmp/tmp_at9q62d/saved_models-2.2/models/keras/1/variables:
variables.data-00000-of-00001  variables.index

/tmp/tmp_at9q62d/saved_models-2.2/models/keras/2:
saved_model.pb  variables

/tmp/tmp_at9q62d/saved_models-2.2/models/keras/2/variables:
variables.data-00000-of-00001  variables.index
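
The curl, tar, and mv lines above rely on notebook shell magics. Outside a notebook, roughly the same steps could be done with the Python standard library. The sketch below is one possible equivalent and is not part of the original tutorial; download_tar and extract_tar are hypothetical helper names, and the URL prefix is the one from the curl command above.

```python
import os
import tarfile
import tempfile
import urllib.request

# URL prefix taken from the curl command in this tutorial.
URL_PREFIX = ('https://storage.googleapis.com/'
              'artifacts.tfx-oss-public.appspot.com/datasets/')


def download_tar(tar_name, dest_dir):
  """Downloads <tar_name>.tar into dest_dir and returns the local path."""
  local_path = os.path.join(dest_dir, tar_name + '.tar')
  urllib.request.urlretrieve(URL_PREFIX + tar_name + '.tar', local_path)
  return local_path


def extract_tar(tar_path, dest_dir):
  """Extracts a tar archive into dest_dir and returns the top-level names."""
  with tarfile.open(tar_path) as archive:
    archive.extractall(dest_dir)
  return sorted(os.listdir(dest_dir))
```

With network access, calling extract_tar(download_tar('saved_models-2.2', some_download_dir), base_dir) with two separate directories should leave a saved_models-2.2 directory under base_dir, matching the listing above.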

Parse the Schema

Among the things we downloaded was a schema for our data that was created by TensorFlow Data Validation. Let's parse that now so that we can use it with TFMA.

import tensorflow as tf
from google.protobuf import text_format
from tensorflow.python.lib.io import file_io
from tensorflow_metadata.proto.v0 import schema_pb2
from tensorflow.core.example import example_pb2

schema = schema_pb2.Schema()
contents = file_io.read_file_to_string(SCHEMA)
schema = text_format.Parse(contents, schema)

Use the Schema to Create TFRecords

We need to give TFMA access to our dataset, so let's create a TFRecords file. We can use our schema to create it, since it gives us the correct type for each feature.

import csv

datafile = os.path.join(DATA_DIR, 'eval', 'data.csv')
reader = csv.DictReader(open(datafile, 'r'))
examples = []
for line in reader:
  example = example_pb2.Example()
  for feature in schema.feature:
    key = feature.name
    if feature.type == schema_pb2.FLOAT:
      example.features.feature[key].float_list.value[:] = (
          [float(line[key])] if len(line[key]) > 0 else [])
    elif feature.type == schema_pb2.INT:
      example.features.feature[key].int64_list.value[:] = (
          [int(line[key])] if len(line[key]) > 0 else [])
    elif feature.type == schema_pb2.BYTES:
      example.features.feature[key].bytes_list.value[:] = (
          [line[key].encode('utf8')] if len(line[key]) > 0 else [])
  # Add a new column 'big_tipper' that indicates if tips was > 20% of the fare. 
  # TODO(b/157064428): Remove after label transformation is supported for Keras.
  big_tipper = float(line['tips']) > float(line['fare']) * 0.2
  example.features.feature['big_tipper'].float_list.value[:] = [big_tipper]
  examples.append(example)

tfrecord_file = os.path.join(BASE_DIR, 'train_data.rio')
with tf.io.TFRecordWriter(tfrecord_file) as writer:
  for example in examples:
    writer.write(example.SerializeToString())

!ls {tfrecord_file}

/tmp/tmp_at9q62d/train_data.rio
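
The conversion loop above does two things for each CSV row: it casts every non-empty field to the type the schema declares, and it derives the big_tipper label. A plain-Python sketch of that logic, without TensorFlow, is shown below. The helper names and the sample row are made up for illustration; only the 20% rule and the empty-field handling come from the code above.

```python
def typed_value(raw, feature_type):
  """Casts a CSV field to a one-element list, or [] when the field is empty."""
  if len(raw) == 0:
    return []
  if feature_type == 'FLOAT':
    return [float(raw)]
  if feature_type == 'INT':
    return [int(raw)]
  return [raw.encode('utf8')]  # Treat everything else as BYTES.


def is_big_tipper(row):
  """Returns 1.0 if the tip exceeded 20% of the fare, else 0.0."""
  return float(float(row['tips']) > float(row['fare']) * 0.2)


# Hypothetical row; the values are invented for illustration.
row = {'fare': '10.0', 'tips': '2.5', 'trip_start_hour': '12', 'company': ''}
print(typed_value(row['trip_start_hour'], 'INT'))  # [12]
print(typed_value(row['company'], 'BYTES'))        # []
print(is_big_tipper(row))                          # 1.0
```

Like the original loop, is_big_tipper raises a ValueError if the tips or fare field is empty, because float('') is not a valid conversion.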

Set Up and Run TFMA

TFMA supports a number of different model types, including TF keras models, models based on generic TF2 signature APIs, and TF estimator based models. The get_started guide has the full list of supported model types and any restrictions. For this example we are going to show how to configure a keras based model as well as an estimator based model that was saved as an EvalSavedModel. See the FAQ for examples of other configurations.

TFMA supports calculating metrics that were used at training time (i.e. built-in metrics) as well as metrics defined after the model was saved, as part of the TFMA configuration settings. For our keras setup we will demonstrate adding our metrics and plots manually as part of our configuration (see the metrics guide for information on the metrics and plots that are supported). For the estimator setup we will use the built-in metrics that were saved with the model. Our setups also include a number of slicing specs, which are discussed in more detail in the following sections.

After creating a tfma.EvalConfig and a tfma.EvalSharedModel, we can run TFMA using tfma.run_model_analysis. This will create a tfma.EvalResult, which we can use later for rendering our metrics and plots.

Keras

import tensorflow_model_analysis as tfma

# Setup tfma.EvalConfig settings
keras_eval_config = text_format.Parse("""
  ## Model information
  model_specs {
    # For keras (and serving models) we need to add a `label_key`.
    label_key: "big_tipper"
  }

  ## Post training metric information. These will be merged with any built-in
  ## metrics from training.
  metrics_specs {
    metrics { class_name: "ExampleCount" }
    metrics { class_name: "BinaryAccuracy" }
    metrics { class_name: "BinaryCrossentropy" }
    metrics { class_name: "AUC" }
    metrics { class_name: "AUCPrecisionRecall" }
    metrics { class_name: "Precision" }
    metrics { class_name: "Recall" }
    metrics { class_name: "MeanLabel" }
    metrics { class_name: "MeanPrediction" }
    metrics { class_name: "Calibration" }
    metrics { class_name: "CalibrationPlot" }
    metrics { class_name: "ConfusionMatrixPlot" }
    # ... add additional metrics and plots ...
  }

  ## Slicing information
  slicing_specs {}  # overall slice
  slicing_specs {
    feature_keys: ["trip_start_hour"]
  }
  slicing_specs {
    feature_keys: ["trip_start_day"]
  }
  slicing_specs {
    feature_values: {
      key: "trip_start_month"
      value: "1"
    }
  }
  slicing_specs {
    feature_keys: ["trip_start_hour", "trip_start_day"]
  }
""", tfma.EvalConfig())

# Create a tfma.EvalSharedModel that points at our keras model.
keras_model_path = os.path.join(MODELS_DIR, 'keras', '2')
keras_eval_shared_model = tfma.default_eval_shared_model(
    eval_saved_model_path=keras_model_path,
    eval_config=keras_eval_config)

keras_output_path = os.path.join(OUTPUT_DIR, 'keras')

# Run TFMA
keras_eval_result = tfma.run_model_analysis(
    eval_shared_model=keras_eval_shared_model,
    eval_config=keras_eval_config,
    data_location=tfrecord_file,
    output_path=keras_output_path)

2021-12-04 10:18:15.463173: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory
2021-12-04 10:18:15.464249: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1757] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
WARNING:absl:Tensorflow version (2.4.4) found. Note that TFMA support for TF 2.0 is currently in beta
WARNING:apache_beam.runners.interactive.interactive_environment:Dependencies required for Interactive Beam PCollection visualization are not available, please use: `pip install apache-beam[interactive]` to install necessary dependencies to enable all data visualization features.
WARNING:root:Make sure that locally built Python SDK docker image has Python 3.7 interpreter.
WARNING:apache_beam.io.tfrecordio:Couldn't find python-snappy so the implementation of _TFRecordUtil._masked_crc32c is not as fast as it could be.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow_model_analysis/writers/metrics_plots_and_validations_writer.py:113: tf_record_iterator (from tensorflow.python.lib.io.tf_record) is deprecated and will be removed in a future version.
Instructions for updating:
Use eager execution and:
`tf.data.TFRecordDataset(path)`

Estimator

import tensorflow_model_analysis as tfma

# Setup tfma.EvalConfig settings
estimator_eval_config = text_format.Parse("""
  ## Model information
  model_specs {
    # To use EvalSavedModel set `signature_name` to "eval".
    signature_name: "eval"
  }

  ## Post training metric information. These will be merged with any built-in
  ## metrics from training.
  metrics_specs {
    metrics { class_name: "ConfusionMatrixPlot" }
    # ... add additional metrics and plots ...
  }

  ## Slicing information
  slicing_specs {}  # overall slice
  slicing_specs {
    feature_keys: ["trip_start_hour"]
  }
  slicing_specs {
    feature_keys: ["trip_start_day"]
  }
  slicing_specs {
    feature_values: {
      key: "trip_start_month"
      value: "1"
    }
  }
  slicing_specs {
    feature_keys: ["trip_start_hour", "trip_start_day"]
  }
""", tfma.EvalConfig())

# Create a tfma.EvalSharedModel that points at our eval saved model.
estimator_base_model_path = os.path.join(
    MODELS_DIR, 'estimator', 'eval_model_dir')
estimator_model_path = os.path.join(
    estimator_base_model_path, os.listdir(estimator_base_model_path)[0])
estimator_eval_shared_model = tfma.default_eval_shared_model(
    eval_saved_model_path=estimator_model_path,
    eval_config=estimator_eval_config)

estimator_output_path = os.path.join(OUTPUT_DIR, 'estimator')

# Run TFMA
estimator_eval_result = tfma.run_model_analysis(
    eval_shared_model=estimator_eval_shared_model,
    eval_config=estimator_eval_config,
    data_location=tfrecord_file,
    output_path=estimator_output_path)

WARNING:absl:Tensorflow version (2.4.4) found. Note that TFMA support for TF 2.0 is currently in beta
WARNING:root:Make sure that locally built Python SDK docker image has Python 3.7 interpreter.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow_model_analysis/eval_saved_model/load.py:169: load (from tensorflow.python.saved_model.loader_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.load or tf.compat.v1.saved_model.load. There will be a new function for importing SavedModels in Tensorflow 2.0.
INFO:tensorflow:Restoring parameters from /tmp/tmp_at9q62d/saved_models-2.2/models/estimator/eval_model_dir/1591221811/variables/variables
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow_model_analysis/eval_saved_model/graph_ref.py:189: get_tensor_from_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.get_tensor_from_tensor_info or tf.compat.v1.saved_model.get_tensor_from_tensor_info.

Visualizing Metrics and Plots

Now that we've run the evaluation, let's take a look at our visualizations using TFMA. For the following examples, we will visualize the results from running the evaluation on the keras model. To view the estimator based model instead, update eval_result to point at our estimator_eval_result variable.

eval_result = keras_eval_result
# eval_result = estimator_eval_result

Rendering Metrics

To view metrics, use tfma.view.render_slicing_metrics.

By default the views will display the Overall slice. To view a particular slice you can either use the name of the column (by setting slicing_column) or provide a tfma.SlicingSpec.

The metrics visualization supports the following interactions:

Click and drag to pan
Scroll to zoom
Right click to reset the view
Hover over the desired data point to see more details
Select from four different types of views using the selections at the bottom

For example, we'll set slicing_column to look at the trip_start_hour feature from our previous slicing_specs.

tfma.view.render_slicing_metrics(eval_result, slicing_column='trip_start_hour')

SlicingMetricsViewer(config={'weightedExamplesColumn': 'example_count'}, data=[{'slice': 'trip_start_hour:2', …

Slices Overview

The default visualization is the Slices Overview when the number of slices is small. It shows the values of metrics for each slice. Since we've selected trip_start_hour above, it's showing us metrics like accuracy and AUC for each hour, which allows us to look for issues that are specific to some hours and not others.

In the visualization above:

Try sorting the feature column, which is our trip_start_hour feature, by clicking on the column header
Try sorting by precision, and notice that the precision for some of the hours with examples is 0, which may indicate a problem

The chart also allows us to select and display different metrics in our slices.

Try selecting different metrics from the "Show" menu
Try selecting recall in the "Show" menu, and notice that the recall for some of the hours with examples is 0, which may indicate a problem

It is also possible to set a threshold to filter out slices with smaller numbers of examples, or "weights". You can type a minimum number of examples, or use the slider.
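
To make the per-slice numbers concrete, the sketch below computes precision and recall per slice in plain Python and then drops slices below a minimum example count. It is not how TFMA computes metrics internally (TFMA runs on Apache Beam); the data and the slice_metrics helper are invented for illustration.

```python
from collections import defaultdict


def slice_metrics(examples, slice_key, threshold=0.5):
  """Returns {slice_value: metrics dict} for binary predictions."""
  counts = defaultdict(lambda: {'tp': 0, 'fp': 0, 'fn': 0, 'n': 0})
  for ex in examples:
    c = counts[ex[slice_key]]
    predicted = ex['prediction'] >= threshold
    actual = ex['label'] == 1
    c['n'] += 1
    c['tp'] += int(predicted and actual)
    c['fp'] += int(predicted and not actual)
    c['fn'] += int(actual and not predicted)
  result = {}
  for value, c in counts.items():
    # Report 0.0 when a denominator is empty (a choice made for this sketch).
    precision = c['tp'] / (c['tp'] + c['fp']) if c['tp'] + c['fp'] else 0.0
    recall = c['tp'] / (c['tp'] + c['fn']) if c['tp'] + c['fn'] else 0.0
    result[value] = {'example_count': c['n'],
                     'precision': precision,
                     'recall': recall}
  return result


# Invented examples: hour 1 has only two examples and no true positives.
examples = [
    {'trip_start_hour': 1, 'label': 1, 'prediction': 0.2},
    {'trip_start_hour': 1, 'label': 0, 'prediction': 0.1},
    {'trip_start_hour': 12, 'label': 1, 'prediction': 0.9},
    {'trip_start_hour': 12, 'label': 1, 'prediction': 0.4},
    {'trip_start_hour': 12, 'label': 0, 'prediction': 0.7},
]
metrics = slice_metrics(examples, 'trip_start_hour')

# Keep only slices with at least 3 examples, like the threshold in the UI.
filtered = {k: v for k, v in metrics.items() if v['example_count'] >= 3}
print(sorted(filtered))  # [12]
```

A recall of 0 on a slice with very few examples, such as hour 1 here, may reflect a small sample rather than a real problem, which is one reason to filter by example count before drawing conclusions.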

Metrics Histogram

This view also supports a Metrics Histogram as an alternative visualization, which is also the default view when the number of slices is large. The results are divided into buckets, and the number of slices, the total weights, or both can be visualized. Columns can be sorted by clicking on the column header. Slices with small weights can be filtered out by setting the threshold. Further filtering can be applied by dragging the grey band. To reset the range, double click the band. Filtering can also be used to remove outliers in the visualization and the metrics tables. Click the gear icon to switch to a logarithmic scale instead of a linear scale.

Try selecting "Metrics Histogram" in the Visualization menu
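
The bucketing behind a histogram of slice metrics can be sketched in a few lines. This only illustrates the idea, with invented slice values and a hypothetical metric_histogram helper; the actual widget's bucket boundaries may differ.

```python
def metric_histogram(slices, metric, num_buckets=4, lo=0.0, hi=1.0):
  """Buckets slices by a metric value assumed to lie within [lo, hi]."""
  width = (hi - lo) / num_buckets
  buckets = [{'range': (lo + i * width, lo + (i + 1) * width),
              'num_slices': 0,
              'total_weight': 0}
             for i in range(num_buckets)]
  for s in slices:
    # Clamp so that a value equal to `hi` lands in the last bucket.
    index = min(int((s[metric] - lo) / width), num_buckets - 1)
    buckets[index]['num_slices'] += 1
    buckets[index]['total_weight'] += s['example_count']
  return buckets


# Invented per-slice results for illustration.
slices = [
    {'slice': 'trip_start_hour:1', 'binary_accuracy': 0.80, 'example_count': 40},
    {'slice': 'trip_start_hour:2', 'binary_accuracy': 0.95, 'example_count': 25},
    {'slice': 'trip_start_hour:3', 'binary_accuracy': 1.00, 'example_count': 5},
    {'slice': 'trip_start_hour:4', 'binary_accuracy': 0.30, 'example_count': 10},
]
hist = metric_histogram(slices, 'binary_accuracy')
print([b['num_slices'] for b in hist])    # [0, 1, 0, 3]
print([b['total_weight'] for b in hist])  # [0, 10, 0, 70]
```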

More Slices

Our initial tfma.EvalConfig created a whole list of slicing_specs, which we can visualize by updating the slice information passed to tfma.view.render_slicing_metrics. Here we'll select the trip_start_day slice (days of the week). Try changing trip_start_day to trip_start_month and rendering again to examine different slices.

tfma.view.render_slicing_metrics(eval_result, slicing_column='trip_start_day')

SlicingMetricsViewer(config={'weightedExamplesColumn': 'example_count'}, data=[{'slice': 'trip_start_day:3', '…

TFMA also supports creating feature crosses to analyze combinations of features. Our original settings created a cross of trip_start_hour and trip_start_day:

tfma.view.render_slicing_metrics(
    eval_result,
    slicing_spec=tfma.SlicingSpec(
        feature_keys=['trip_start_hour', 'trip_start_day']))

SlicingMetricsViewer(config={'weightedExamplesColumn': 'example_count'}, data=[{'slice': 'trip_start_day_X_tri…

Crossing the two columns creates a lot of combinations! Let's narrow down our cross to only look at trips that start at noon. Then let's select binary_accuracy from the visualization:

tfma.view.render_slicing_metrics(
    eval_result,
    slicing_spec=tfma.SlicingSpec(
        feature_keys=['trip_start_day'], feature_values={'trip_start_hour': '12'}))

SlicingMetricsViewer(config={'weightedExamplesColumn': 'example_count'}, data=[{'slice': 'trip_start_day_X_tri…
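
One way to read the slicing specs used in this section: feature_keys produces one slice per distinct value of the listed features, feature_values restricts the slice to examples that match, and an empty spec is the overall slice. The plain-Python sketch below mimics that reading. slice_for is a hypothetical helper, not a TFMA function, and comparing values as strings is a simplification.

```python
def slice_for(example, feature_keys=(), feature_values=None):
  """Returns the slice an example falls in, or None if it is excluded."""
  feature_values = feature_values or {}
  # feature_values: the example must match every listed value.
  for key, wanted in feature_values.items():
    if str(example.get(key)) != str(wanted):
      return None
  parts = [(key, str(wanted)) for key, wanted in feature_values.items()]
  # feature_keys: the example's own values define the slice.
  for key in feature_keys:
    if key not in example:
      return None
    parts.append((key, str(example[key])))
  return tuple(sorted(parts))  # () stands for the overall slice.


noon_trip = {'trip_start_hour': 12, 'trip_start_day': 3}
late_trip = {'trip_start_hour': 23, 'trip_start_day': 3}

print(slice_for(noon_trip))  # ()
print(slice_for(noon_trip, feature_keys=['trip_start_day'],
                feature_values={'trip_start_hour': '12'}))
# (('trip_start_day', '3'), ('trip_start_hour', '12'))
print(slice_for(late_trip, feature_keys=['trip_start_day'],
                feature_values={'trip_start_hour': '12'}))  # None
```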

Rendering Plots

Any plots that were added to the tfma.EvalConfig as post-training metrics_specs can be displayed using tfma.view.render_plot.

As with metrics, plots can be viewed by slice. Unlike metrics, only plots for a particular slice value can be displayed, so a tfma.SlicingSpec must be used and it must specify both a slice feature name and a value. If no slice is provided, the plots for the Overall slice are used.

In the example below we are displaying the CalibrationPlot and ConfusionMatrixPlot plots that were computed for the trip_start_hour:1 slice.

tfma.view.render_plot(
    eval_result,
    tfma.SlicingSpec(feature_values={'trip_start_hour': '1'}))

PlotViewer(config={'sliceName': 'trip_start_hour:1', 'metricKeys': {'calibrationPlot': {'metricName': 'calibra…
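
For intuition about what these two plots summarize, the sketch below computes calibration buckets (mean prediction versus mean label) and confusion-matrix counts at a threshold in plain Python. It is only an illustration with made-up predictions and hypothetical helper names; TFMA's own bucket counts and thresholds are not reproduced here.

```python
def calibration_buckets(labels, predictions, num_buckets=2):
  """Groups predictions into equal-width buckets over [0, 1]."""
  sums = [{'n': 0, 'pred': 0.0, 'label': 0.0} for _ in range(num_buckets)]
  for label, pred in zip(labels, predictions):
    # Clamp so that a prediction of exactly 1.0 lands in the last bucket.
    index = min(int(pred * num_buckets), num_buckets - 1)
    sums[index]['n'] += 1
    sums[index]['pred'] += pred
    sums[index]['label'] += label
  return [{'count': s['n'],
           'mean_prediction': s['pred'] / s['n'],
           'mean_label': s['label'] / s['n']}
          for s in sums if s['n']]


def confusion_at(labels, predictions, threshold):
  """Returns (tp, fp, tn, fn) when predicting positive at >= threshold."""
  tp = fp = tn = fn = 0
  for label, pred in zip(labels, predictions):
    positive = pred >= threshold
    if positive and label == 1:
      tp += 1
    elif positive:
      fp += 1
    elif label == 1:
      fn += 1
    else:
      tn += 1
  return tp, fp, tn, fn


# Invented labels and predictions for illustration.
labels = [0, 1, 0, 1]
predictions = [0.25, 0.25, 0.75, 0.75]
print(calibration_buckets(labels, predictions))
print(confusion_at(labels, predictions, threshold=0.5))  # (1, 1, 1, 1)
```

In a well-calibrated model the mean prediction in each bucket is close to the mean label; in this toy data both buckets have a mean label of 0.5, so the predictions of 0.25 and 0.75 are miscalibrated in opposite directions.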

Tracking Model Performance Over Time

Your training dataset will be used for training your model, and will hopefully be representative of your test dataset and the data that will be sent to your model in production. However, while the data in inference requests may remain the same as your training data, in many cases it will start to change enough that the performance of your model will change.

That means that you need to monitor and measure your model's performance on an ongoing basis, so that you can be aware of and react to changes. Let's take a look at how TFMA can help.

Let's load 3 different model runs and use TFMA to see how they compare using render_time_series.

# Note this re-uses the EvalConfig from the keras setup.

# Run eval on each saved model
output_paths = []
for i in range(3):
  # Create a tfma.EvalSharedModel that points at our saved model.
  eval_shared_model = tfma.default_eval_shared_model(
      eval_saved_model_path=os.path.join(MODELS_DIR, 'keras', str(i)),
      eval_config=keras_eval_config)

  output_path = os.path.join(OUTPUT_DIR, 'time_series', str(i))
  output_paths.append(output_path)

  # Run TFMA
  tfma.run_model_analysis(eval_shared_model=eval_shared_model,
                          eval_config=keras_eval_config,
                          data_location=tfrecord_file,
                          output_path=output_path)

WARNING:absl:Tensorflow version (2.4.4) found. Note that TFMA support for TF 2.0 is currently in beta
WARNING:root:Make sure that locally built Python SDK docker image has Python 3.7 interpreter.
WARNING:absl:Tensorflow version (2.4.4) found. Note that TFMA support for TF 2.0 is currently in beta
WARNING:root:Make sure that locally built Python SDK docker image has Python 3.7 interpreter.
WARNING:absl:Tensorflow version (2.4.4) found. Note that TFMA support for TF 2.0 is currently in beta
WARNING:root:Make sure that locally built Python SDK docker image has Python 3.7 interpreter.

First, we'll imagine that we trained and deployed our model yesterday, and now we want to see how it's doing on the new data coming in today. The visualization will start by displaying AUC. From the UI you can:

Add other metrics using the "Add metric series" menu
Close unwanted graphs by clicking on x
Hover over data points (the ends of line segments in the graph) to get more details

eval_results_from_disk = tfma.load_eval_results(output_paths[:2])

tfma.view.render_time_series(eval_results_from_disk)

TimeSeriesViewer(config={'isModelCentric': True}, data=[{'metrics': {'': {'': {'binary_accuracy': {'doubleValu…

Now we'll imagine that another day has passed and we want to see how the model is doing on the new data coming in today, compared to the previous two days:

eval_results_from_disk = tfma.load_eval_results(output_paths)

tfma.view.render_time_series(eval_results_from_disk)

TimeSeriesViewer(config={'isModelCentric': True}, data=[{'metrics': {'': {'': {'binary_accuracy': {'doubleValu…

Model Validation

TFMA can be configured to evaluate multiple models at the same time. Typically this is done to compare a new model against a baseline (such as the currently serving model) to determine how metrics (e.g. AUC) differ relative to the baseline. When thresholds are configured, TFMA will produce a tfma.ValidationResult record indicating whether the performance matches expectations.

Let's re-configure our keras evaluation to compare two models: a candidate and a baseline. We will also validate the candidate's performance against the baseline by setting a tfma.MetricThreshold on the AUC metric.

# Setup tfma.EvalConfig setting
eval_config_with_thresholds = text_format.Parse("""
  ## Model information
  model_specs {
    name: "candidate"
    # For keras we need to add a `label_key`.
    label_key: "big_tipper"
  }
  model_specs {
    name: "baseline"
    # For keras we need to add a `label_key`.
    label_key: "big_tipper"
    is_baseline: true
  }

  ## Post training metric information
  metrics_specs {
    metrics { class_name: "ExampleCount" }
    metrics { class_name: "BinaryAccuracy" }
    metrics { class_name: "BinaryCrossentropy" }
    metrics {
      class_name: "AUC"
      threshold {
        # Ensure that AUC is always > 0.9
        value_threshold {
          lower_bound { value: 0.9 }
        }
        # Ensure that AUC does not drop by more than a small epsilon
        # e.g. (candidate - baseline) > -1e-10 or candidate > baseline - 1e-10
        change_threshold {
          direction: HIGHER_IS_BETTER
          absolute { value: -1e-10 }
        }
      }
    }
    metrics { class_name: "AUCPrecisionRecall" }
    metrics { class_name: "Precision" }
    metrics { class_name: "Recall" }
    metrics { class_name: "MeanLabel" }
    metrics { class_name: "MeanPrediction" }
    metrics { class_name: "Calibration" }
    metrics { class_name: "CalibrationPlot" }
    metrics { class_name: "ConfusionMatrixPlot" }
    # ... add additional metrics and plots ...
  }

  ## Slicing information
  slicing_specs {}  # overall slice
  slicing_specs {
    feature_keys: ["trip_start_hour"]
  }
  slicing_specs {
    feature_keys: ["trip_start_day"]
  }
  slicing_specs {
    feature_keys: ["trip_start_month"]
  }
  slicing_specs {
    feature_keys: ["trip_start_hour", "trip_start_day"]
  }
""", tfma.EvalConfig())

# Create tfma.EvalSharedModels that point at our keras models.
candidate_model_path = os.path.join(MODELS_DIR, 'keras', '2')
baseline_model_path = os.path.join(MODELS_DIR, 'keras', '1')
eval_shared_models = [
  tfma.default_eval_shared_model(
      model_name=tfma.CANDIDATE_KEY,
      eval_saved_model_path=candidate_model_path,
      eval_config=eval_config_with_thresholds),
  tfma.default_eval_shared_model(
      model_name=tfma.BASELINE_KEY,
      eval_saved_model_path=baseline_model_path,
      eval_config=eval_config_with_thresholds),
]

validation_output_path = os.path.join(OUTPUT_DIR, 'validation')

# Run TFMA
eval_result_with_validation = tfma.run_model_analysis(
    eval_shared_models,
    eval_config=eval_config_with_thresholds,
    data_location=tfrecord_file,
    output_path=validation_output_path)

WARNING:absl:Tensorflow version (2.4.4) found. Note that TFMA support for TF 2.0 is currently in beta
WARNING:root:Make sure that locally built Python SDK docker image has Python 3.7 interpreter.

When running evaluations with one or more models against a baseline, TFMA automatically adds diff metrics for all of the metrics computed during the evaluation. These metrics are named after the corresponding metric, with _diff appended to the metric name.

Let's take a look at the metrics produced by our run:

tfma.view.render_time_series(eval_result_with_validation)

TimeSeriesViewer(config={'isModelCentric': True}, data=[{'metrics': {'': {'': {'binary_accuracy': {'doubleValu…
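
The naming convention for diff metrics can be illustrated with a small sketch: subtract the baseline's value from the candidate's and append _diff to the metric name. The metric values below are invented, and add_diff_metrics is a hypothetical helper rather than a TFMA function.

```python
def add_diff_metrics(candidate, baseline):
  """Returns candidate metrics plus <name>_diff = candidate - baseline."""
  combined = dict(candidate)
  for name, value in candidate.items():
    if name in baseline:
      combined[name + '_diff'] = value - baseline[name]
  return combined


# Invented metric values for illustration.
candidate = {'auc': 0.75, 'binary_accuracy': 0.5}
baseline = {'auc': 0.5, 'binary_accuracy': 0.75}
metrics = add_diff_metrics(candidate, baseline)
print(metrics['auc_diff'])              # 0.25
print(metrics['binary_accuracy_diff'])  # -0.25
```

A positive diff means the candidate scored higher than the baseline on that metric; whether higher is better depends on the metric.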

Now let's look at the output from our validation checks. To view the validation results we use tfma.load_validation_result. For our example, the validation fails because AUC is below the threshold.

validation_result = tfma.load_validation_result(validation_output_path)
print(validation_result.validation_ok)

False
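
The two thresholds in the config can be read as two independent checks on the candidate's AUC, both of which must pass. The sketch below encodes that reading in plain Python. It is an approximation for illustration: the function name and AUC values are invented, and treating the bounds as inclusive is an assumption.

```python
def auc_passes(candidate_auc, baseline_auc,
               lower_bound=0.9, min_absolute_change=-1e-10):
  """Checks a value threshold and a HIGHER_IS_BETTER change threshold."""
  # value_threshold: the candidate's AUC must reach the lower bound.
  value_ok = candidate_auc >= lower_bound
  # change_threshold: (candidate - baseline) must not fall below the limit.
  change_ok = (candidate_auc - baseline_auc) >= min_absolute_change
  return value_ok and change_ok


print(auc_passes(0.95, 0.94))  # True: above 0.9 and no regression
print(auc_passes(0.85, 0.80))  # False: improved, but below the 0.9 bound
print(auc_passes(0.95, 0.97))  # False: above 0.9, but worse than baseline
```

The second case mirrors the failure described above: a candidate can beat its baseline and still fail validation if its AUC stays under the fixed lower bound.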

Note: This site provides applications using data that has been modified for use from its original source, www.cityofchicago.org, the official website of the City of Chicago. The City of Chicago makes no claims as to the content, accuracy, timeliness, or completeness of any of the data provided at this site. The data provided at this site is subject to change at any time. It is understood that the data provided at this site is being used at one's own risk.