Classification on imbalanced data


This tutorial demonstrates how to classify a highly imbalanced dataset in which the number of examples in one class vastly outnumbers the examples in the other. You will work with the Credit Card Fraud Detection dataset hosted on Kaggle. The aim is to detect a mere 492 fraudulent transactions from 284,807 transactions in total. You will use Keras to define the model and class weights to help the model learn from the imbalanced data.

This tutorial contains complete code to:

  • Load a CSV file using Pandas.
  • Create train, validation, and test sets.
  • Define and train a model using Keras (including setting class weights).
  • Evaluate the model using various metrics (including precision and recall).
  • Try common techniques for dealing with imbalanced data, such as:
    • Class weighting
    • Oversampling

Setup

import tensorflow as tf
from tensorflow import keras

import os
import tempfile

import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

import sklearn
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
mpl.rcParams['figure.figsize'] = (12, 10)
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

Data processing and exploration

Download the Kaggle Credit Card Fraud dataset

Pandas is a Python library with many helpful utilities for loading and working with structured data. It can be used to download CSVs into a DataFrame.

Note: This dataset has been collected and analysed during a research collaboration of Worldline and the Machine Learning Group of ULB (Université Libre de Bruxelles) on big data mining and fraud detection. More details on current and past projects on related topics are available on the DefeatFraud project page.

raw_df = pd.read_csv('https://storage.googleapis.com/download.tensorflow.org/data/creditcard.csv')
raw_df.head()
raw_df[['Time', 'V1', 'V2', 'V3', 'V4', 'V5', 'V26', 'V27', 'V28', 'Amount', 'Class']].describe()

Examine the class label imbalance

Let's look at the dataset imbalance:

neg, pos = np.bincount(raw_df['Class'])
total = neg + pos
print('Examples:\n    Total: {}\n    Positive: {} ({:.2f}% of total)\n'.format(
    total, pos, 100 * pos / total))
Examples:
    Total: 284807
    Positive: 492 (0.17% of total)

This shows the small fraction of positive samples.

Clean, split, and normalize the data

The raw data has a few issues. First, the Time and Amount columns are too variable to use directly. Drop the Time column (since it's not clear what it means) and take the log of the Amount column to reduce its range.

cleaned_df = raw_df.copy()

# You don't want the `Time` column.
cleaned_df.pop('Time')

# The `Amount` column covers a huge range. Convert to log-space.
eps = 0.001  # 0 => 0.1¢
cleaned_df['Log Amount'] = np.log(cleaned_df.pop('Amount') + eps)

Split the dataset into train, validation, and test sets. The validation set is used during model fitting to evaluate the loss and any metrics; however, the model is never fit with this data. The test set is completely unused during the training phase and is only used at the end to evaluate how well the model generalizes to new data. This is especially important with imbalanced datasets, where overfitting is a significant concern due to the lack of training data.

# Use a utility from sklearn to split and shuffle your dataset.
train_df, test_df = train_test_split(cleaned_df, test_size=0.2)
train_df, val_df = train_test_split(train_df, test_size=0.2)

# Form np arrays of labels and features.
train_labels = np.array(train_df.pop('Class'))
bool_train_labels = train_labels != 0
val_labels = np.array(val_df.pop('Class'))
test_labels = np.array(test_df.pop('Class'))

train_features = np.array(train_df)
val_features = np.array(val_df)
test_features = np.array(test_df)

Normalize the input features using the sklearn StandardScaler. This will set the mean to 0 and the standard deviation to 1.

Note: The StandardScaler is only fit using the train_features, to be sure the model is not peeking at the validation or test sets.

scaler = StandardScaler()
train_features = scaler.fit_transform(train_features)

val_features = scaler.transform(val_features)
test_features = scaler.transform(test_features)

train_features = np.clip(train_features, -5, 5)
val_features = np.clip(val_features, -5, 5)
test_features = np.clip(test_features, -5, 5)


print('Training labels shape:', train_labels.shape)
print('Validation labels shape:', val_labels.shape)
print('Test labels shape:', test_labels.shape)

print('Training features shape:', train_features.shape)
print('Validation features shape:', val_features.shape)
print('Test features shape:', test_features.shape)
Training labels shape: (182276,)
Validation labels shape: (45569,)
Test labels shape: (56962,)
Training features shape: (182276, 29)
Validation features shape: (45569, 29)
Test features shape: (56962, 29)

Caution: If you want to deploy a model, it's critical that you preserve the preprocessing calculations. The easiest way is to implement them as layers and attach them to your model before export.

Look at the data distribution

Next compare the distributions of the positive and negative examples over a few features. Good questions to ask yourself at this point are:

  • Do these distributions make sense?
    • Yes. You've normalized the input and these are mostly concentrated in the +/- 2 range.
  • Can you see the differences between the distributions?
    • Yes, the positive examples contain a much higher rate of extreme values.
pos_df = pd.DataFrame(train_features[ bool_train_labels], columns=train_df.columns)
neg_df = pd.DataFrame(train_features[~bool_train_labels], columns=train_df.columns)

sns.jointplot(x=pos_df['V5'], y=pos_df['V6'],
              kind='hex', xlim=(-5,5), ylim=(-5,5))
plt.suptitle("Positive distribution")

sns.jointplot(x=neg_df['V5'], y=neg_df['V6'],
              kind='hex', xlim=(-5,5), ylim=(-5,5))
_ = plt.suptitle("Negative distribution")

[figure: hexbin joint plots of V5 vs V6 for the positive and negative distributions]

Define the model and metrics

Define a function that creates a simple neural network with a densely connected hidden layer, a dropout layer to reduce overfitting, and an output sigmoid layer that returns the probability of a transaction being fraudulent:

METRICS = [
      keras.metrics.TruePositives(name='tp'),
      keras.metrics.FalsePositives(name='fp'),
      keras.metrics.TrueNegatives(name='tn'),
      keras.metrics.FalseNegatives(name='fn'), 
      keras.metrics.BinaryAccuracy(name='accuracy'),
      keras.metrics.Precision(name='precision'),
      keras.metrics.Recall(name='recall'),
      keras.metrics.AUC(name='auc'),
      keras.metrics.AUC(name='prc', curve='PR'), # precision-recall curve
]

def make_model(metrics=METRICS, output_bias=None):
  if output_bias is not None:
    output_bias = tf.keras.initializers.Constant(output_bias)
  model = keras.Sequential([
      keras.layers.Dense(
          16, activation='relu',
          input_shape=(train_features.shape[-1],)),
      keras.layers.Dropout(0.5),
      keras.layers.Dense(1, activation='sigmoid',
                         bias_initializer=output_bias),
  ])

  model.compile(
      optimizer=keras.optimizers.Adam(learning_rate=1e-3),
      loss=keras.losses.BinaryCrossentropy(),
      metrics=metrics)

  return model

Understand useful metrics

Notice that a few of the metrics defined above can be computed by the model, which will be helpful when evaluating the performance.

  • False negatives and false positives are samples that were incorrectly classified
  • True negatives and true positives are samples that were correctly classified
  • Accuracy is the percentage of examples correctly classified

\(\frac{\text{true samples} }{\text{total samples} }\)

  • Precision is the percentage of predicted positives that were correctly classified

\(\frac{\text{true positives} }{\text{true positives + false positives} }\)

  • Recall is the percentage of actual positives that were correctly classified

\(\frac{\text{true positives} }{\text{true positives + false negatives} }\)

  • AUC refers to the Area Under the Curve of a Receiver Operating Characteristic curve (ROC-AUC). This metric is equal to the probability that a classifier will rank a random positive sample higher than a random negative sample.
  • AUPRC refers to the Area Under the Curve of the Precision-Recall Curve. This metric computes precision-recall pairs for different probability thresholds.

Note: Accuracy is not a helpful metric for this task. You can achieve 99.8%+ accuracy on this task by predicting False all the time.
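To make this concrete, here is a minimal NumPy sketch using the class counts from this dataset, showing that a classifier that never predicts fraud still scores above 99.8% accuracy:

```python
import numpy as np

# Labels with the same imbalance as the dataset:
# 492 positives out of 284,807 examples.
labels = np.zeros(284807, dtype=int)
labels[:492] = 1

# A degenerate "classifier" that always predicts the negative class.
always_negative = np.zeros_like(labels)

accuracy = np.mean(always_negative == labels)
print('Accuracy of always predicting "not fraud": {:.4%}'.format(accuracy))
```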


Baseline model

Build the model

Now create and train your model using the function that was defined earlier. Notice that the model is fit using a larger than default batch size of 2048. This is important to ensure that each batch has a decent chance of containing a few positive samples. If the batch size were too small, they would likely have no fraudulent transactions to learn from.
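As a rough sanity check of this batch-size choice, you can compute the expected number of positive examples per batch from the positive fraction (a sketch using the full-dataset counts, before splitting):

```python
# Fraction of positive (fraudulent) examples in the full dataset.
pos_fraction = 492 / 284807

# Expected number of positives per batch for a large vs. a small batch.
for batch_size in [2048, 32]:
    expected_pos = batch_size * pos_fraction
    print('batch_size={}: ~{:.2f} positives per batch'.format(batch_size, expected_pos))
```

With the Keras default batch size of 32, most batches would contain no fraud examples at all.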

Note: This model will not handle the class imbalance well. You will improve it later in this tutorial.

EPOCHS = 100
BATCH_SIZE = 2048

early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor='val_auc', 
    verbose=1,
    patience=10,
    mode='max',
    restore_best_weights=True)
model = make_model()
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense (Dense)               (None, 16)                480       
                                                                 
 dropout (Dropout)           (None, 16)                0         
                                                                 
 dense_1 (Dense)             (None, 1)                 17        
                                                                 
=================================================================
Total params: 497
Trainable params: 497
Non-trainable params: 0
_________________________________________________________________

Test run the model:

model.predict(train_features[:10])
1/1 [==============================] - 0s 358ms/step
array([[0.32526094],
       [0.37401223],
       [0.4207074 ],
       [0.16697319],
       [0.22081244],
       [0.20874254],
       [0.1679387 ],
       [0.21296714],
       [0.17615263],
       [0.75215197]], dtype=float32)

Optional: Set the correct initial bias.

These initial guesses are not great. You know the dataset is imbalanced, so set the output layer's bias to reflect that (see: A Recipe for Training Neural Networks: "init well"). This can help with initial convergence.

With the default bias initialization the loss should be about math.log(2) = 0.69314

results = model.evaluate(train_features, train_labels, batch_size=BATCH_SIZE, verbose=0)
print("Loss: {:0.4f}".format(results[0]))
Loss: 0.4891

The correct bias to set can be derived from:

\[ p_0 = \frac{pos}{pos + neg} = \frac{1}{1 + e^{-b_0}} \]

\[ b_0 = -\log_e(1/p_0 - 1) \]

\[ b_0 = \log_e(pos/neg) \]

initial_bias = np.log([pos/neg])
initial_bias
array([-6.35935934])

Set that as the initial bias, and the model will give much more reasonable initial guesses.

It should be near: pos/total = 0.0018

model = make_model(output_bias = initial_bias)
model.predict(train_features[:10])
1/1 [==============================] - 0s 49ms/step
array([[0.00064369],
       [0.00234391],
       [0.00045576],
       [0.00055661],
       [0.00105842],
       [0.00170113],
       [0.00088736],
       [0.00071619],
       [0.00078793],
       [0.00407396]], dtype=float32)

With this initialization the initial loss should be approximately:

\[-p_0log(p_0)-(1-p_0)log(1-p_0) = 0.01317\]

results = model.evaluate(train_features, train_labels, batch_size=BATCH_SIZE, verbose=0)
print("Loss: {:0.4f}".format(results[0]))
Loss: 0.0196

This initial loss is about 50 times smaller than it would have been with naive initialization.

This way the model doesn't need to spend the first few epochs just learning that positive examples are unlikely. It also makes it easier to read plots of the loss during training.
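The two expected losses above can be checked directly from the formulas (a quick sketch; p0 here uses the full-dataset counts, so the ratio is approximate):

```python
import math

p0 = 492 / 284807  # fraction of positive examples

# Expected loss with zero output bias: the model predicts 0.5 everywhere.
naive_loss = math.log(2)

# Expected loss when the model predicts p0 everywhere (careful bias).
careful_loss = -p0 * math.log(p0) - (1 - p0) * math.log(1 - p0)

print('naive:   {:.5f}'.format(naive_loss))
print('careful: {:.5f}'.format(careful_loss))
print('ratio:   {:.0f}x'.format(naive_loss / careful_loss))
```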

Checkpoint the initial weights

To make the various training runs more comparable, keep this initial model's weights in a checkpoint file, and load them into each model before training:

initial_weights = os.path.join(tempfile.mkdtemp(),'initial_weights')
model.save_weights(initial_weights)

Confirm that the bias fix helps

Before moving on, quickly confirm that the careful bias initialization actually helped.

Train the model for 20 epochs, with and without this careful initialization, and compare the losses:

model = make_model()
model.load_weights(initial_weights)
model.layers[-1].bias.assign([0.0])
zero_bias_history = model.fit(
    train_features,
    train_labels,
    batch_size=BATCH_SIZE,
    epochs=20,
    validation_data=(val_features, val_labels), 
    verbose=0)
model = make_model()
model.load_weights(initial_weights)
careful_bias_history = model.fit(
    train_features,
    train_labels,
    batch_size=BATCH_SIZE,
    epochs=20,
    validation_data=(val_features, val_labels), 
    verbose=0)
def plot_loss(history, label, n):
  # Use a log scale to show the wide range of values.
  plt.semilogy(history.epoch,  history.history['loss'],
               color=colors[n], label='Train '+label)
  plt.semilogy(history.epoch,  history.history['val_loss'],
          color=colors[n], label='Val '+label,
          linestyle="--")
  plt.xlabel('Epoch')
  plt.ylabel('Loss')

  plt.legend()
plot_loss(zero_bias_history, "Zero Bias", 0)
plot_loss(careful_bias_history, "Careful Bias", 1)

[figure: training and validation loss for zero vs careful bias initialization]

The above figure makes it clear: in terms of validation loss, on this problem, this careful initialization gives a clear advantage.

Train the model

model = make_model()
model.load_weights(initial_weights)
baseline_history = model.fit(
    train_features,
    train_labels,
    batch_size=BATCH_SIZE,
    epochs=EPOCHS,
    callbacks = [early_stopping],
    validation_data=(val_features, val_labels))
Epoch 1/100
90/90 [==============================] - 2s 11ms/step - loss: 0.0157 - tp: 73.0000 - fp: 43.0000 - tn: 227417.0000 - fn: 312.0000 - accuracy: 0.9984 - precision: 0.6293 - recall: 0.1896 - auc: 0.6768 - prc: 0.1811 - val_loss: 0.0074 - val_tp: 9.0000 - val_fp: 2.0000 - val_tn: 45505.0000 - val_fn: 53.0000 - val_accuracy: 0.9988 - val_precision: 0.8182 - val_recall: 0.1452 - val_auc: 0.8106 - val_prc: 0.3590
Epoch 2/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0085 - tp: 94.0000 - fp: 29.0000 - tn: 181924.0000 - fn: 229.0000 - accuracy: 0.9986 - precision: 0.7642 - recall: 0.2910 - auc: 0.8230 - prc: 0.4120 - val_loss: 0.0048 - val_tp: 24.0000 - val_fp: 4.0000 - val_tn: 45503.0000 - val_fn: 38.0000 - val_accuracy: 0.9991 - val_precision: 0.8571 - val_recall: 0.3871 - val_auc: 0.8946 - val_prc: 0.6622
Epoch 3/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0076 - tp: 124.0000 - fp: 25.0000 - tn: 181928.0000 - fn: 199.0000 - accuracy: 0.9988 - precision: 0.8322 - recall: 0.3839 - auc: 0.8531 - prc: 0.5030 - val_loss: 0.0040 - val_tp: 32.0000 - val_fp: 4.0000 - val_tn: 45503.0000 - val_fn: 30.0000 - val_accuracy: 0.9993 - val_precision: 0.8889 - val_recall: 0.5161 - val_auc: 0.9030 - val_prc: 0.6871
Epoch 4/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0066 - tp: 140.0000 - fp: 27.0000 - tn: 181926.0000 - fn: 183.0000 - accuracy: 0.9988 - precision: 0.8383 - recall: 0.4334 - auc: 0.8814 - prc: 0.5728 - val_loss: 0.0035 - val_tp: 37.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 25.0000 - val_accuracy: 0.9993 - val_precision: 0.8810 - val_recall: 0.5968 - val_auc: 0.9110 - val_prc: 0.7066
Epoch 5/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0059 - tp: 172.0000 - fp: 28.0000 - tn: 181925.0000 - fn: 151.0000 - accuracy: 0.9990 - precision: 0.8600 - recall: 0.5325 - auc: 0.8984 - prc: 0.6243 - val_loss: 0.0033 - val_tp: 38.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 24.0000 - val_accuracy: 0.9994 - val_precision: 0.8837 - val_recall: 0.6129 - val_auc: 0.9191 - val_prc: 0.7188
Epoch 6/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0056 - tp: 174.0000 - fp: 25.0000 - tn: 181928.0000 - fn: 149.0000 - accuracy: 0.9990 - precision: 0.8744 - recall: 0.5387 - auc: 0.8941 - prc: 0.6433 - val_loss: 0.0031 - val_tp: 41.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 21.0000 - val_accuracy: 0.9994 - val_precision: 0.8913 - val_recall: 0.6613 - val_auc: 0.9272 - val_prc: 0.7391
Epoch 7/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0058 - tp: 170.0000 - fp: 33.0000 - tn: 181920.0000 - fn: 153.0000 - accuracy: 0.9990 - precision: 0.8374 - recall: 0.5263 - auc: 0.8944 - prc: 0.6126 - val_loss: 0.0030 - val_tp: 41.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 21.0000 - val_accuracy: 0.9994 - val_precision: 0.8913 - val_recall: 0.6613 - val_auc: 0.9272 - val_prc: 0.7405
Epoch 8/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0055 - tp: 180.0000 - fp: 30.0000 - tn: 181923.0000 - fn: 143.0000 - accuracy: 0.9991 - precision: 0.8571 - recall: 0.5573 - auc: 0.8978 - prc: 0.6433 - val_loss: 0.0028 - val_tp: 41.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 21.0000 - val_accuracy: 0.9994 - val_precision: 0.8913 - val_recall: 0.6613 - val_auc: 0.9352 - val_prc: 0.7752
Epoch 9/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0049 - tp: 191.0000 - fp: 30.0000 - tn: 181923.0000 - fn: 132.0000 - accuracy: 0.9991 - precision: 0.8643 - recall: 0.5913 - auc: 0.9073 - prc: 0.6758 - val_loss: 0.0027 - val_tp: 41.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 21.0000 - val_accuracy: 0.9994 - val_precision: 0.8913 - val_recall: 0.6613 - val_auc: 0.9352 - val_prc: 0.7759
Epoch 10/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0051 - tp: 176.0000 - fp: 35.0000 - tn: 181918.0000 - fn: 147.0000 - accuracy: 0.9990 - precision: 0.8341 - recall: 0.5449 - auc: 0.9090 - prc: 0.6622 - val_loss: 0.0026 - val_tp: 42.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 20.0000 - val_accuracy: 0.9995 - val_precision: 0.8936 - val_recall: 0.6774 - val_auc: 0.9352 - val_prc: 0.7819
Epoch 11/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0046 - tp: 184.0000 - fp: 40.0000 - tn: 181913.0000 - fn: 139.0000 - accuracy: 0.9990 - precision: 0.8214 - recall: 0.5697 - auc: 0.9246 - prc: 0.6863 - val_loss: 0.0026 - val_tp: 41.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 21.0000 - val_accuracy: 0.9994 - val_precision: 0.8913 - val_recall: 0.6613 - val_auc: 0.9352 - val_prc: 0.7899
Epoch 12/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0048 - tp: 175.0000 - fp: 34.0000 - tn: 181919.0000 - fn: 148.0000 - accuracy: 0.9990 - precision: 0.8373 - recall: 0.5418 - auc: 0.9199 - prc: 0.6704 - val_loss: 0.0025 - val_tp: 42.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 20.0000 - val_accuracy: 0.9995 - val_precision: 0.8936 - val_recall: 0.6774 - val_auc: 0.9432 - val_prc: 0.8006
Epoch 13/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0047 - tp: 176.0000 - fp: 37.0000 - tn: 181916.0000 - fn: 147.0000 - accuracy: 0.9990 - precision: 0.8263 - recall: 0.5449 - auc: 0.9213 - prc: 0.6640 - val_loss: 0.0024 - val_tp: 44.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 18.0000 - val_accuracy: 0.9995 - val_precision: 0.8980 - val_recall: 0.7097 - val_auc: 0.9432 - val_prc: 0.8141
Epoch 14/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0046 - tp: 192.0000 - fp: 36.0000 - tn: 181917.0000 - fn: 131.0000 - accuracy: 0.9991 - precision: 0.8421 - recall: 0.5944 - auc: 0.9152 - prc: 0.6753 - val_loss: 0.0023 - val_tp: 45.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 17.0000 - val_accuracy: 0.9995 - val_precision: 0.9000 - val_recall: 0.7258 - val_auc: 0.9432 - val_prc: 0.8173
Epoch 15/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0046 - tp: 183.0000 - fp: 37.0000 - tn: 181916.0000 - fn: 140.0000 - accuracy: 0.9990 - precision: 0.8318 - recall: 0.5666 - auc: 0.9168 - prc: 0.6892 - val_loss: 0.0023 - val_tp: 44.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 18.0000 - val_accuracy: 0.9995 - val_precision: 0.8980 - val_recall: 0.7097 - val_auc: 0.9432 - val_prc: 0.8218
Epoch 16/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0044 - tp: 190.0000 - fp: 27.0000 - tn: 181926.0000 - fn: 133.0000 - accuracy: 0.9991 - precision: 0.8756 - recall: 0.5882 - auc: 0.9246 - prc: 0.7111 - val_loss: 0.0022 - val_tp: 48.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 14.0000 - val_accuracy: 0.9996 - val_precision: 0.9057 - val_recall: 0.7742 - val_auc: 0.9432 - val_prc: 0.8243
Epoch 17/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0043 - tp: 198.0000 - fp: 29.0000 - tn: 181924.0000 - fn: 125.0000 - accuracy: 0.9992 - precision: 0.8722 - recall: 0.6130 - auc: 0.9339 - prc: 0.7133 - val_loss: 0.0022 - val_tp: 48.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 14.0000 - val_accuracy: 0.9996 - val_precision: 0.9057 - val_recall: 0.7742 - val_auc: 0.9432 - val_prc: 0.8253
Epoch 18/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0040 - tp: 202.0000 - fp: 29.0000 - tn: 181924.0000 - fn: 121.0000 - accuracy: 0.9992 - precision: 0.8745 - recall: 0.6254 - auc: 0.9309 - prc: 0.7365 - val_loss: 0.0022 - val_tp: 49.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 13.0000 - val_accuracy: 0.9996 - val_precision: 0.9074 - val_recall: 0.7903 - val_auc: 0.9432 - val_prc: 0.8292
Epoch 19/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0042 - tp: 193.0000 - fp: 34.0000 - tn: 181919.0000 - fn: 130.0000 - accuracy: 0.9991 - precision: 0.8502 - recall: 0.5975 - auc: 0.9310 - prc: 0.7272 - val_loss: 0.0022 - val_tp: 49.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 13.0000 - val_accuracy: 0.9996 - val_precision: 0.9074 - val_recall: 0.7903 - val_auc: 0.9433 - val_prc: 0.8267
Epoch 20/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0043 - tp: 198.0000 - fp: 32.0000 - tn: 181921.0000 - fn: 125.0000 - accuracy: 0.9991 - precision: 0.8609 - recall: 0.6130 - auc: 0.9295 - prc: 0.7027 - val_loss: 0.0022 - val_tp: 49.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 13.0000 - val_accuracy: 0.9996 - val_precision: 0.9074 - val_recall: 0.7903 - val_auc: 0.9433 - val_prc: 0.8359
Epoch 21/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0041 - tp: 204.0000 - fp: 31.0000 - tn: 181922.0000 - fn: 119.0000 - accuracy: 0.9992 - precision: 0.8681 - recall: 0.6316 - auc: 0.9248 - prc: 0.7159 - val_loss: 0.0021 - val_tp: 49.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 13.0000 - val_accuracy: 0.9996 - val_precision: 0.9074 - val_recall: 0.7903 - val_auc: 0.9433 - val_prc: 0.8354
Epoch 22/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0043 - tp: 200.0000 - fp: 33.0000 - tn: 181920.0000 - fn: 123.0000 - accuracy: 0.9991 - precision: 0.8584 - recall: 0.6192 - auc: 0.9217 - prc: 0.7024 - val_loss: 0.0021 - val_tp: 49.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 13.0000 - val_accuracy: 0.9996 - val_precision: 0.9074 - val_recall: 0.7903 - val_auc: 0.9433 - val_prc: 0.8360
Epoch 23/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0042 - tp: 197.0000 - fp: 33.0000 - tn: 181920.0000 - fn: 126.0000 - accuracy: 0.9991 - precision: 0.8565 - recall: 0.6099 - auc: 0.9310 - prc: 0.7171 - val_loss: 0.0021 - val_tp: 47.0000 - val_fp: 3.0000 - val_tn: 45504.0000 - val_fn: 15.0000 - val_accuracy: 0.9996 - val_precision: 0.9400 - val_recall: 0.7581 - val_auc: 0.9433 - val_prc: 0.8411
Epoch 24/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0043 - tp: 186.0000 - fp: 32.0000 - tn: 181921.0000 - fn: 137.0000 - accuracy: 0.9991 - precision: 0.8532 - recall: 0.5759 - auc: 0.9248 - prc: 0.7086 - val_loss: 0.0021 - val_tp: 49.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 13.0000 - val_accuracy: 0.9996 - val_precision: 0.9074 - val_recall: 0.7903 - val_auc: 0.9433 - val_prc: 0.8387
Epoch 25/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0038 - tp: 202.0000 - fp: 29.0000 - tn: 181924.0000 - fn: 121.0000 - accuracy: 0.9992 - precision: 0.8745 - recall: 0.6254 - auc: 0.9341 - prc: 0.7615 - val_loss: 0.0021 - val_tp: 49.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 13.0000 - val_accuracy: 0.9996 - val_precision: 0.9074 - val_recall: 0.7903 - val_auc: 0.9433 - val_prc: 0.8385
Epoch 26/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0039 - tp: 204.0000 - fp: 36.0000 - tn: 181917.0000 - fn: 119.0000 - accuracy: 0.9991 - precision: 0.8500 - recall: 0.6316 - auc: 0.9358 - prc: 0.7386 - val_loss: 0.0021 - val_tp: 48.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 14.0000 - val_accuracy: 0.9996 - val_precision: 0.9057 - val_recall: 0.7742 - val_auc: 0.9433 - val_prc: 0.8430
Epoch 27/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0041 - tp: 186.0000 - fp: 30.0000 - tn: 181923.0000 - fn: 137.0000 - accuracy: 0.9991 - precision: 0.8611 - recall: 0.5759 - auc: 0.9295 - prc: 0.7240 - val_loss: 0.0021 - val_tp: 49.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 13.0000 - val_accuracy: 0.9996 - val_precision: 0.9074 - val_recall: 0.7903 - val_auc: 0.9433 - val_prc: 0.8433
Epoch 28/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0040 - tp: 213.0000 - fp: 36.0000 - tn: 181917.0000 - fn: 110.0000 - accuracy: 0.9992 - precision: 0.8554 - recall: 0.6594 - auc: 0.9295 - prc: 0.7335 - val_loss: 0.0020 - val_tp: 48.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 14.0000 - val_accuracy: 0.9996 - val_precision: 0.9057 - val_recall: 0.7742 - val_auc: 0.9433 - val_prc: 0.8422
Epoch 29/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0038 - tp: 204.0000 - fp: 30.0000 - tn: 181923.0000 - fn: 119.0000 - accuracy: 0.9992 - precision: 0.8718 - recall: 0.6316 - auc: 0.9342 - prc: 0.7523 - val_loss: 0.0020 - val_tp: 49.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 13.0000 - val_accuracy: 0.9996 - val_precision: 0.9074 - val_recall: 0.7903 - val_auc: 0.9433 - val_prc: 0.8447
Epoch 30/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0041 - tp: 192.0000 - fp: 34.0000 - tn: 181919.0000 - fn: 131.0000 - accuracy: 0.9991 - precision: 0.8496 - recall: 0.5944 - auc: 0.9295 - prc: 0.7214 - val_loss: 0.0020 - val_tp: 49.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 13.0000 - val_accuracy: 0.9996 - val_precision: 0.9074 - val_recall: 0.7903 - val_auc: 0.9433 - val_prc: 0.8449
Epoch 31/100
90/90 [==============================] - 0s 5ms/step - loss: 0.0040 - tp: 192.0000 - fp: 35.0000 - tn: 181918.0000 - fn: 131.0000 - accuracy: 0.9991 - precision: 0.8458 - recall: 0.5944 - auc: 0.9372 - prc: 0.7340 - val_loss: 0.0020 - val_tp: 49.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 13.0000 - val_accuracy: 0.9996 - val_precision: 0.9074 - val_recall: 0.7903 - val_auc: 0.9433 - val_prc: 0.8432
Epoch 32/100
79/90 [=========================>....] - ETA: 0s - loss: 0.0038 - tp: 167.0000 - fp: 27.0000 - tn: 161490.0000 - fn: 108.0000 - accuracy: 0.9992 - precision: 0.8608 - recall: 0.6073 - auc: 0.9319 - prc: 0.7352Restoring model weights from the end of the best epoch: 22.
90/90 [==============================] - 0s 5ms/step - loss: 0.0040 - tp: 201.0000 - fp: 32.0000 - tn: 181921.0000 - fn: 122.0000 - accuracy: 0.9992 - precision: 0.8627 - recall: 0.6223 - auc: 0.9311 - prc: 0.7394 - val_loss: 0.0020 - val_tp: 49.0000 - val_fp: 5.0000 - val_tn: 45502.0000 - val_fn: 13.0000 - val_accuracy: 0.9996 - val_precision: 0.9074 - val_recall: 0.7903 - val_auc: 0.9433 - val_prc: 0.8447
Epoch 32: early stopping

Check training history

In this section, you will produce plots of your model's accuracy and loss on the training and validation sets. These are useful to check for overfitting, which you can learn more about in the Overfit and underfit tutorial.

Additionally, you can produce these plots for any of the metrics you created above. False negatives are included as an example.

def plot_metrics(history):
  metrics = ['loss', 'prc', 'precision', 'recall']
  for n, metric in enumerate(metrics):
    name = metric.replace("_"," ").capitalize()
    plt.subplot(2,2,n+1)
    plt.plot(history.epoch, history.history[metric], color=colors[0], label='Train')
    plt.plot(history.epoch, history.history['val_'+metric],
             color=colors[0], linestyle="--", label='Val')
    plt.xlabel('Epoch')
    plt.ylabel(name)
    if metric == 'loss':
      plt.ylim([0, plt.ylim()[1]])
    elif metric == 'auc':
      plt.ylim([0.8,1])
    else:
      plt.ylim([0,1])

    plt.legend();
plot_metrics(baseline_history)

[figure: loss, PRC, precision, and recall curves for train and validation]

Note: The validation curve generally performs better than the training curve. This is mainly because the dropout layer is not active when evaluating the model.

Evaluate metrics

You can use a confusion matrix to summarize the actual vs. predicted labels, where the X axis is the predicted label and the Y axis is the actual label:

train_predictions_baseline = model.predict(train_features, batch_size=BATCH_SIZE)
test_predictions_baseline = model.predict(test_features, batch_size=BATCH_SIZE)
90/90 [==============================] - 0s 1ms/step
28/28 [==============================] - 0s 1ms/step
def plot_cm(labels, predictions, p=0.5):
  cm = confusion_matrix(labels, predictions > p)
  plt.figure(figsize=(5,5))
  sns.heatmap(cm, annot=True, fmt="d")
  plt.title('Confusion matrix @{:.2f}'.format(p))
  plt.ylabel('Actual label')
  plt.xlabel('Predicted label')

  print('Legitimate Transactions Detected (True Negatives): ', cm[0][0])
  print('Legitimate Transactions Incorrectly Detected (False Positives): ', cm[0][1])
  print('Fraudulent Transactions Missed (False Negatives): ', cm[1][0])
  print('Fraudulent Transactions Detected (True Positives): ', cm[1][1])
  print('Total Fraudulent Transactions: ', np.sum(cm[1]))

Evaluate your model on the test dataset and display the results for the metrics you created above:

baseline_results = model.evaluate(test_features, test_labels,
                                  batch_size=BATCH_SIZE, verbose=0)
for name, value in zip(model.metrics_names, baseline_results):
  print(name, ': ', value)
print()

plot_cm(test_labels, test_predictions_baseline)
loss :  0.0036762808449566364
tp :  75.0
fp :  11.0
tn :  56844.0
fn :  32.0
accuracy :  0.9992451071739197
precision :  0.8720930218696594
recall :  0.7009345889091492
auc :  0.920269787311554
prc :  0.7936798930168152

Legitimate Transactions Detected (True Negatives):  56844
Legitimate Transactions Incorrectly Detected (False Positives):  11
Fraudulent Transactions Missed (False Negatives):  32
Fraudulent Transactions Detected (True Positives):  75
Total Fraudulent Transactions:  107

[figure: baseline confusion matrix heatmap at threshold 0.50]

If the model had predicted everything perfectly, this would be a diagonal matrix where values off the main diagonal, indicating incorrect predictions, would be zero. In this case, the matrix shows that you have relatively few false positives, meaning that there were relatively few legitimate transactions that were incorrectly flagged. However, you would likely want to have even fewer false negatives despite the cost of increasing the number of false positives. This trade-off may be preferable because false negatives would allow fraudulent transactions to go through, whereas false positives may cause an email to be sent to a customer asking them to verify their card activity.
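This trade-off can be explored simply by lowering the decision threshold. The sketch below uses synthetic scores (an assumption, standing in for the model's predictions) to show that a lower threshold trades false negatives for false positives:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)

# Synthetic scores: legitimate transactions cluster low, frauds high.
labels = np.concatenate([np.zeros(1000, dtype=int), np.ones(50, dtype=int)])
scores = np.concatenate([rng.beta(1, 8, 1000), rng.beta(5, 2, 50)])

results = {}
for threshold in (0.5, 0.1):
    tn, fp, fn, tp = confusion_matrix(labels, scores > threshold).ravel()
    results[threshold] = (fn, fp)
    print('threshold={}: missed frauds (FN)={}, false alarms (FP)={}'.format(
        threshold, fn, fp))
```

Lowering the threshold catches more fraud but flags more legitimate transactions.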

Plot the ROC

Now plot the ROC. This plot is useful because it shows, at a glance, the range of performance the model can reach just by tuning the output threshold.

def plot_roc(name, labels, predictions, **kwargs):
  fp, tp, _ = sklearn.metrics.roc_curve(labels, predictions)

  plt.plot(100*fp, 100*tp, label=name, linewidth=2, **kwargs)
  plt.xlabel('False positives [%]')
  plt.ylabel('True positives [%]')
  plt.xlim([-0.5,20])
  plt.ylim([80,100.5])
  plt.grid(True)
  ax = plt.gca()
  ax.set_aspect('equal')
plot_roc("Train Baseline", train_labels, train_predictions_baseline, color=colors[0])
plot_roc("Test Baseline", test_labels, test_predictions_baseline, color=colors[0], linestyle='--')
plt.legend(loc='lower right');

[figure: ROC curves for the train and test baseline]

Plot the AUPRC

Now plot the AUPRC: the area under the interpolated precision-recall curve, obtained by plotting (recall, precision) points for different values of the classification threshold. Depending on how it's calculated, PR AUC may be equivalent to the model's average precision.
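The near-equivalence with average precision can be checked with sklearn on synthetic scores (an illustrative sketch; the two values differ only in how the curve is interpolated):

```python
import numpy as np
from sklearn.metrics import auc, average_precision_score, precision_recall_curve

rng = np.random.default_rng(1)

# Synthetic imbalanced labels and overlapping score distributions.
labels = np.concatenate([np.zeros(500, dtype=int), np.ones(25, dtype=int)])
scores = np.concatenate([rng.uniform(0.0, 0.6, 500), rng.uniform(0.2, 1.0, 25)])

precision, recall, _ = precision_recall_curve(labels, scores)
ap = average_precision_score(labels, scores)  # step-wise summation
pr_auc = auc(recall, precision)               # trapezoidal area

print('average precision:   {:.4f}'.format(ap))
print('interpolated PR AUC: {:.4f}'.format(pr_auc))
```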

def plot_prc(name, labels, predictions, **kwargs):
    precision, recall, _ = sklearn.metrics.precision_recall_curve(labels, predictions)

    plt.plot(precision, recall, label=name, linewidth=2, **kwargs)
    plt.xlabel('Precision')
    plt.ylabel('Recall')
    plt.grid(True)
    ax = plt.gca()
    ax.set_aspect('equal')
plot_prc("Train Baseline", train_labels, train_predictions_baseline, color=colors[0])
plot_prc("Test Baseline", test_labels, test_predictions_baseline, color=colors[0], linestyle='--')
plt.legend(loc='lower right');

[figure: precision-recall curves for the train and test baseline]

It looks like the precision is relatively high, but the recall and the area under the ROC curve (AUC) aren't as high as you might like. Classifiers often face challenges when trying to maximize both precision and recall, which is especially true when working with imbalanced datasets. It is important to consider the costs of different types of errors in the context of the problem you care about. In this example, a false negative (a fraudulent transaction is missed) may have a financial cost, while a false positive (a transaction is incorrectly flagged as fraudulent) may decrease user happiness.

Class weights

Calculate class weights

The goal is to identify fraudulent transactions, but you don't have very many of those positive samples to work with, so you would want the classifier to heavily weight the few examples that are available. You can do this by passing Keras weights for each class through a parameter. These will cause the model to "pay more attention" to examples from an under-represented class.

# Scaling by total/2 helps keep the loss to a similar magnitude.
# The sum of the weights of all examples stays the same.
weight_for_0 = (1 / neg)*(total)/2.0 
weight_for_1 = (1 / pos)*(total)/2.0

class_weight = {0: weight_for_0, 1: weight_for_1}

print('Weight for class 0: {:.2f}'.format(weight_for_0))
print('Weight for class 1: {:.2f}'.format(weight_for_1))
Weight for class 0: 0.50
Weight for class 1: 289.44
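A quick check of the claim in the code comment above: with this scaling each class contributes half of the total weight, so the summed weight (and hence the loss magnitude) is unchanged, while each positive example counts hundreds of times more than each negative one:

```python
# Class counts for the dataset.
pos, neg = 492, 284807 - 492
total = pos + neg

weight_for_0 = (1 / neg) * (total / 2.0)
weight_for_1 = (1 / pos) * (total / 2.0)

# Each class contributes total/2, so the summed weight equals `total`.
print('sum of all example weights:', neg * weight_for_0 + pos * weight_for_1)
print('per-example weight ratio: {:.1f}'.format(weight_for_1 / weight_for_0))
```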

Train a model with class weights

Now try re-training and evaluating the model with class weights to see how that affects the predictions.

Note: Using class_weights changes the range of the loss. This may affect the stability of the training, depending on the optimizer. Optimizers whose step size is dependent on the magnitude of the gradient, like optimizers.SGD, may fail. The optimizer used here, optimizers.Adam, is unaffected by the scaling change. Also note that because of the weighting, the total losses are not comparable between the two models.
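To see why the loss ranges differ, here is a minimal NumPy sketch of per-example weighting of binary cross-entropy. This mirrors the effect of the `class_weight` argument (the sketch is illustrative, not Keras's actual implementation):

```python
import numpy as np

def weighted_bce(y_true, y_pred, class_weight):
    # Per-example binary cross-entropy, scaled by the weight of each
    # example's true class.
    w = np.where(y_true == 1, class_weight[1], class_weight[0])
    bce = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(w * bce))

y_true = np.array([0.0, 0.0, 0.0, 1.0])
y_pred = np.array([0.1, 0.2, 0.1, 0.3])

unweighted = weighted_bce(y_true, y_pred, {0: 1.0, 1: 1.0})
weighted = weighted_bce(y_true, y_pred, {0: 0.5, 1: 289.44})

print('unweighted loss:', unweighted)
print('weighted loss:  ', weighted)
```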

weighted_model = make_model()
weighted_model.load_weights(initial_weights)

weighted_history = weighted_model.fit(
    train_features,
    train_labels,
    batch_size=BATCH_SIZE,
    epochs=EPOCHS,
    callbacks = [early_stopping],
    validation_data=(val_features, val_labels),
    # The class weights go here
    class_weight=class_weight)
Epoch 1/100
90/90 [==============================] - 2s 11ms/step - loss: 3.0544 - tp: 117.0000 - fp: 267.0000 - tn: 238541.0000 - fn: 313.0000 - accuracy: 0.9976 - precision: 0.3047 - recall: 0.2721 - auc: 0.7178 - prc: 0.1929 - val_loss: 0.0087 - val_tp: 15.0000 - val_fp: 4.0000 - val_tn: 45503.0000 - val_fn: 47.0000 - val_accuracy: 0.9989 - val_precision: 0.7895 - val_recall: 0.2419 - val_auc: 0.8394 - val_prc: 0.3448
Epoch 2/100
90/90 [==============================] - 0s 5ms/step - loss: 1.2871 - tp: 149.0000 - fp: 725.0000 - tn: 181228.0000 - fn: 174.0000 - accuracy: 0.9951 - precision: 0.1705 - recall: 0.4613 - auc: 0.8389 - prc: 0.2961 - val_loss: 0.0099 - val_tp: 40.0000 - val_fp: 14.0000 - val_tn: 45493.0000 - val_fn: 22.0000 - val_accuracy: 0.9992 - val_precision: 0.7407 - val_recall: 0.6452 - val_auc: 0.9212 - val_prc: 0.6390
Epoch 3/100
90/90 [==============================] - 0s 5ms/step - loss: 0.7951 - tp: 214.0000 - fp: 1383.0000 - tn: 180570.0000 - fn: 109.0000 - accuracy: 0.9918 - precision: 0.1340 - recall: 0.6625 - auc: 0.8920 - prc: 0.4301 - val_loss: 0.0137 - val_tp: 49.0000 - val_fp: 29.0000 - val_tn: 45478.0000 - val_fn: 13.0000 - val_accuracy: 0.9991 - val_precision: 0.6282 - val_recall: 0.7903 - val_auc: 0.9424 - val_prc: 0.6984
Epoch 4/100
90/90 [==============================] - 0s 5ms/step - loss: 0.6179 - tp: 228.0000 - fp: 2211.0000 - tn: 179742.0000 - fn: 95.0000 - accuracy: 0.9873 - precision: 0.0935 - recall: 0.7059 - auc: 0.9169 - prc: 0.4223 - val_loss: 0.0194 - val_tp: 49.0000 - val_fp: 82.0000 - val_tn: 45425.0000 - val_fn: 13.0000 - val_accuracy: 0.9979 - val_precision: 0.3740 - val_recall: 0.7903 - val_auc: 0.9566 - val_prc: 0.7149
Epoch 5/100
90/90 [==============================] - 0s 5ms/step - loss: 0.5532 - tp: 242.0000 - fp: 3212.0000 - tn: 178741.0000 - fn: 81.0000 - accuracy: 0.9819 - precision: 0.0701 - recall: 0.7492 - auc: 0.9203 - prc: 0.3818 - val_loss: 0.0252 - val_tp: 52.0000 - val_fp: 168.0000 - val_tn: 45339.0000 - val_fn: 10.0000 - val_accuracy: 0.9961 - val_precision: 0.2364 - val_recall: 0.8387 - val_auc: 0.9699 - val_prc: 0.7220
Epoch 6/100
90/90 [==============================] - 0s 5ms/step - loss: 0.4612 - tp: 253.0000 - fp: 4060.0000 - tn: 177893.0000 - fn: 70.0000 - accuracy: 0.9773 - precision: 0.0587 - recall: 0.7833 - auc: 0.9383 - prc: 0.3516 - val_loss: 0.0333 - val_tp: 52.0000 - val_fp: 294.0000 - val_tn: 45213.0000 - val_fn: 10.0000 - val_accuracy: 0.9933 - val_precision: 0.1503 - val_recall: 0.8387 - val_auc: 0.9794 - val_prc: 0.7146
Epoch 7/100
90/90 [==============================] - 0s 5ms/step - loss: 0.5076 - tp: 251.0000 - fp: 5167.0000 - tn: 176786.0000 - fn: 72.0000 - accuracy: 0.9713 - precision: 0.0463 - recall: 0.7771 - auc: 0.9198 - prc: 0.3070 - val_loss: 0.0413 - val_tp: 54.0000 - val_fp: 400.0000 - val_tn: 45107.0000 - val_fn: 8.0000 - val_accuracy: 0.9910 - val_precision: 0.1189 - val_recall: 0.8710 - val_auc: 0.9813 - val_prc: 0.7105
Epoch 8/100
90/90 [==============================] - 0s 5ms/step - loss: 0.4176 - tp: 263.0000 - fp: 5837.0000 - tn: 176116.0000 - fn: 60.0000 - accuracy: 0.9676 - precision: 0.0431 - recall: 0.8142 - auc: 0.9311 - prc: 0.2726 - val_loss: 0.0472 - val_tp: 54.0000 - val_fp: 470.0000 - val_tn: 45037.0000 - val_fn: 8.0000 - val_accuracy: 0.9895 - val_precision: 0.1031 - val_recall: 0.8710 - val_auc: 0.9831 - val_prc: 0.6863
Epoch 9/100
90/90 [==============================] - 0s 5ms/step - loss: 0.3858 - tp: 260.0000 - fp: 6518.0000 - tn: 175435.0000 - fn: 63.0000 - accuracy: 0.9639 - precision: 0.0384 - recall: 0.8050 - auc: 0.9440 - prc: 0.2364 - val_loss: 0.0561 - val_tp: 54.0000 - val_fp: 617.0000 - val_tn: 44890.0000 - val_fn: 8.0000 - val_accuracy: 0.9863 - val_precision: 0.0805 - val_recall: 0.8710 - val_auc: 0.9866 - val_prc: 0.6177
Epoch 10/100
90/90 [==============================] - 0s 5ms/step - loss: 0.4043 - tp: 270.0000 - fp: 7311.0000 - tn: 174642.0000 - fn: 53.0000 - accuracy: 0.9596 - precision: 0.0356 - recall: 0.8359 - auc: 0.9309 - prc: 0.2219 - val_loss: 0.0613 - val_tp: 55.0000 - val_fp: 664.0000 - val_tn: 44843.0000 - val_fn: 7.0000 - val_accuracy: 0.9853 - val_precision: 0.0765 - val_recall: 0.8871 - val_auc: 0.9875 - val_prc: 0.6111
Epoch 11/100
90/90 [==============================] - 0s 5ms/step - loss: 0.3505 - tp: 275.0000 - fp: 7558.0000 - tn: 174395.0000 - fn: 48.0000 - accuracy: 0.9583 - precision: 0.0351 - recall: 0.8514 - auc: 0.9441 - prc: 0.2239 - val_loss: 0.0636 - val_tp: 55.0000 - val_fp: 686.0000 - val_tn: 44821.0000 - val_fn: 7.0000 - val_accuracy: 0.9848 - val_precision: 0.0742 - val_recall: 0.8871 - val_auc: 0.9881 - val_prc: 0.5893
Epoch 12/100
90/90 [==============================] - 0s 5ms/step - loss: 0.3636 - tp: 273.0000 - fp: 7711.0000 - tn: 174242.0000 - fn: 50.0000 - accuracy: 0.9574 - precision: 0.0342 - recall: 0.8452 - auc: 0.9411 - prc: 0.2117 - val_loss: 0.0687 - val_tp: 57.0000 - val_fp: 720.0000 - val_tn: 44787.0000 - val_fn: 5.0000 - val_accuracy: 0.9841 - val_precision: 0.0734 - val_recall: 0.9194 - val_auc: 0.9899 - val_prc: 0.5428
Epoch 13/100
90/90 [==============================] - 0s 5ms/step - loss: 0.3460 - tp: 280.0000 - fp: 7762.0000 - tn: 174191.0000 - fn: 43.0000 - accuracy: 0.9572 - precision: 0.0348 - recall: 0.8669 - auc: 0.9426 - prc: 0.2167 - val_loss: 0.0686 - val_tp: 57.0000 - val_fp: 700.0000 - val_tn: 44807.0000 - val_fn: 5.0000 - val_accuracy: 0.9845 - val_precision: 0.0753 - val_recall: 0.9194 - val_auc: 0.9903 - val_prc: 0.5322
Epoch 14/100
90/90 [==============================] - 0s 5ms/step - loss: 0.3031 - tp: 282.0000 - fp: 7653.0000 - tn: 174300.0000 - fn: 41.0000 - accuracy: 0.9578 - precision: 0.0355 - recall: 0.8731 - auc: 0.9552 - prc: 0.2294 - val_loss: 0.0703 - val_tp: 57.0000 - val_fp: 705.0000 - val_tn: 44802.0000 - val_fn: 5.0000 - val_accuracy: 0.9844 - val_precision: 0.0748 - val_recall: 0.9194 - val_auc: 0.9905 - val_prc: 0.5162
Epoch 15/100
90/90 [==============================] - 0s 5ms/step - loss: 0.3166 - tp: 283.0000 - fp: 7675.0000 - tn: 174278.0000 - fn: 40.0000 - accuracy: 0.9577 - precision: 0.0356 - recall: 0.8762 - auc: 0.9485 - prc: 0.2119 - val_loss: 0.0712 - val_tp: 57.0000 - val_fp: 710.0000 - val_tn: 44797.0000 - val_fn: 5.0000 - val_accuracy: 0.9843 - val_precision: 0.0743 - val_recall: 0.9194 - val_auc: 0.9907 - val_prc: 0.5222
Epoch 16/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2911 - tp: 284.0000 - fp: 7798.0000 - tn: 174155.0000 - fn: 39.0000 - accuracy: 0.9570 - precision: 0.0351 - recall: 0.8793 - auc: 0.9565 - prc: 0.2035 - val_loss: 0.0729 - val_tp: 57.0000 - val_fp: 721.0000 - val_tn: 44786.0000 - val_fn: 5.0000 - val_accuracy: 0.9841 - val_precision: 0.0733 - val_recall: 0.9194 - val_auc: 0.9908 - val_prc: 0.5133
Epoch 17/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2892 - tp: 281.0000 - fp: 7640.0000 - tn: 174313.0000 - fn: 42.0000 - accuracy: 0.9579 - precision: 0.0355 - recall: 0.8700 - auc: 0.9591 - prc: 0.2057 - val_loss: 0.0745 - val_tp: 57.0000 - val_fp: 739.0000 - val_tn: 44768.0000 - val_fn: 5.0000 - val_accuracy: 0.9837 - val_precision: 0.0716 - val_recall: 0.9194 - val_auc: 0.9910 - val_prc: 0.5185
Epoch 18/100
90/90 [==============================] - 0s 5ms/step - loss: 0.3260 - tp: 279.0000 - fp: 7687.0000 - tn: 174266.0000 - fn: 44.0000 - accuracy: 0.9576 - precision: 0.0350 - recall: 0.8638 - auc: 0.9476 - prc: 0.2022 - val_loss: 0.0745 - val_tp: 57.0000 - val_fp: 738.0000 - val_tn: 44769.0000 - val_fn: 5.0000 - val_accuracy: 0.9837 - val_precision: 0.0717 - val_recall: 0.9194 - val_auc: 0.9920 - val_prc: 0.5078
Epoch 19/100
90/90 [==============================] - 0s 6ms/step - loss: 0.2705 - tp: 290.0000 - fp: 7517.0000 - tn: 174436.0000 - fn: 33.0000 - accuracy: 0.9586 - precision: 0.0371 - recall: 0.8978 - auc: 0.9570 - prc: 0.2163 - val_loss: 0.0733 - val_tp: 57.0000 - val_fp: 722.0000 - val_tn: 44785.0000 - val_fn: 5.0000 - val_accuracy: 0.9840 - val_precision: 0.0732 - val_recall: 0.9194 - val_auc: 0.9922 - val_prc: 0.5083
Epoch 20/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2920 - tp: 281.0000 - fp: 7354.0000 - tn: 174599.0000 - fn: 42.0000 - accuracy: 0.9594 - precision: 0.0368 - recall: 0.8700 - auc: 0.9557 - prc: 0.2178 - val_loss: 0.0729 - val_tp: 57.0000 - val_fp: 721.0000 - val_tn: 44786.0000 - val_fn: 5.0000 - val_accuracy: 0.9841 - val_precision: 0.0733 - val_recall: 0.9194 - val_auc: 0.9923 - val_prc: 0.5088
Epoch 21/100
90/90 [==============================] - 0s 5ms/step - loss: 0.3080 - tp: 279.0000 - fp: 7332.0000 - tn: 174621.0000 - fn: 44.0000 - accuracy: 0.9595 - precision: 0.0367 - recall: 0.8638 - auc: 0.9513 - prc: 0.2143 - val_loss: 0.0754 - val_tp: 57.0000 - val_fp: 746.0000 - val_tn: 44761.0000 - val_fn: 5.0000 - val_accuracy: 0.9835 - val_precision: 0.0710 - val_recall: 0.9194 - val_auc: 0.9930 - val_prc: 0.4993
Epoch 22/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2502 - tp: 285.0000 - fp: 7372.0000 - tn: 174581.0000 - fn: 38.0000 - accuracy: 0.9593 - precision: 0.0372 - recall: 0.8824 - auc: 0.9654 - prc: 0.2125 - val_loss: 0.0780 - val_tp: 58.0000 - val_fp: 777.0000 - val_tn: 44730.0000 - val_fn: 4.0000 - val_accuracy: 0.9829 - val_precision: 0.0695 - val_recall: 0.9355 - val_auc: 0.9923 - val_prc: 0.4887
Epoch 23/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2459 - tp: 286.0000 - fp: 7305.0000 - tn: 174648.0000 - fn: 37.0000 - accuracy: 0.9597 - precision: 0.0377 - recall: 0.8854 - auc: 0.9657 - prc: 0.2204 - val_loss: 0.0749 - val_tp: 58.0000 - val_fp: 730.0000 - val_tn: 44777.0000 - val_fn: 4.0000 - val_accuracy: 0.9839 - val_precision: 0.0736 - val_recall: 0.9355 - val_auc: 0.9925 - val_prc: 0.4943
Epoch 24/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2825 - tp: 290.0000 - fp: 6816.0000 - tn: 175137.0000 - fn: 33.0000 - accuracy: 0.9624 - precision: 0.0408 - recall: 0.8978 - auc: 0.9505 - prc: 0.2292 - val_loss: 0.0703 - val_tp: 57.0000 - val_fp: 692.0000 - val_tn: 44815.0000 - val_fn: 5.0000 - val_accuracy: 0.9847 - val_precision: 0.0761 - val_recall: 0.9194 - val_auc: 0.9927 - val_prc: 0.5167
Epoch 25/100
90/90 [==============================] - 0s 5ms/step - loss: 0.3076 - tp: 278.0000 - fp: 6451.0000 - tn: 175502.0000 - fn: 45.0000 - accuracy: 0.9644 - precision: 0.0413 - recall: 0.8607 - auc: 0.9524 - prc: 0.2296 - val_loss: 0.0719 - val_tp: 58.0000 - val_fp: 701.0000 - val_tn: 44806.0000 - val_fn: 4.0000 - val_accuracy: 0.9845 - val_precision: 0.0764 - val_recall: 0.9355 - val_auc: 0.9926 - val_prc: 0.5167
Epoch 26/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2743 - tp: 279.0000 - fp: 6815.0000 - tn: 175138.0000 - fn: 44.0000 - accuracy: 0.9624 - precision: 0.0393 - recall: 0.8638 - auc: 0.9625 - prc: 0.2236 - val_loss: 0.0762 - val_tp: 58.0000 - val_fp: 760.0000 - val_tn: 44747.0000 - val_fn: 4.0000 - val_accuracy: 0.9832 - val_precision: 0.0709 - val_recall: 0.9355 - val_auc: 0.9925 - val_prc: 0.4897
Epoch 27/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2353 - tp: 287.0000 - fp: 7110.0000 - tn: 174843.0000 - fn: 36.0000 - accuracy: 0.9608 - precision: 0.0388 - recall: 0.8885 - auc: 0.9683 - prc: 0.2186 - val_loss: 0.0777 - val_tp: 58.0000 - val_fp: 775.0000 - val_tn: 44732.0000 - val_fn: 4.0000 - val_accuracy: 0.9829 - val_precision: 0.0696 - val_recall: 0.9355 - val_auc: 0.9931 - val_prc: 0.4753
Epoch 28/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2370 - tp: 288.0000 - fp: 6812.0000 - tn: 175141.0000 - fn: 35.0000 - accuracy: 0.9624 - precision: 0.0406 - recall: 0.8916 - auc: 0.9672 - prc: 0.2298 - val_loss: 0.0752 - val_tp: 58.0000 - val_fp: 749.0000 - val_tn: 44758.0000 - val_fn: 4.0000 - val_accuracy: 0.9835 - val_precision: 0.0719 - val_recall: 0.9355 - val_auc: 0.9925 - val_prc: 0.4855
Epoch 29/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2516 - tp: 285.0000 - fp: 6771.0000 - tn: 175182.0000 - fn: 38.0000 - accuracy: 0.9626 - precision: 0.0404 - recall: 0.8824 - auc: 0.9679 - prc: 0.2265 - val_loss: 0.0804 - val_tp: 58.0000 - val_fp: 810.0000 - val_tn: 44697.0000 - val_fn: 4.0000 - val_accuracy: 0.9821 - val_precision: 0.0668 - val_recall: 0.9355 - val_auc: 0.9923 - val_prc: 0.4532
Epoch 30/100
90/90 [==============================] - 0s 6ms/step - loss: 0.2844 - tp: 279.0000 - fp: 6930.0000 - tn: 175023.0000 - fn: 44.0000 - accuracy: 0.9617 - precision: 0.0387 - recall: 0.8638 - auc: 0.9583 - prc: 0.2226 - val_loss: 0.0815 - val_tp: 59.0000 - val_fp: 836.0000 - val_tn: 44671.0000 - val_fn: 3.0000 - val_accuracy: 0.9816 - val_precision: 0.0659 - val_recall: 0.9516 - val_auc: 0.9923 - val_prc: 0.4532
Epoch 31/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2346 - tp: 287.0000 - fp: 6929.0000 - tn: 175024.0000 - fn: 36.0000 - accuracy: 0.9618 - precision: 0.0398 - recall: 0.8885 - auc: 0.9682 - prc: 0.2284 - val_loss: 0.0777 - val_tp: 58.0000 - val_fp: 782.0000 - val_tn: 44725.0000 - val_fn: 4.0000 - val_accuracy: 0.9828 - val_precision: 0.0690 - val_recall: 0.9355 - val_auc: 0.9924 - val_prc: 0.4714
Epoch 32/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2336 - tp: 287.0000 - fp: 6727.0000 - tn: 175226.0000 - fn: 36.0000 - accuracy: 0.9629 - precision: 0.0409 - recall: 0.8885 - auc: 0.9689 - prc: 0.2463 - val_loss: 0.0789 - val_tp: 59.0000 - val_fp: 806.0000 - val_tn: 44701.0000 - val_fn: 3.0000 - val_accuracy: 0.9822 - val_precision: 0.0682 - val_recall: 0.9516 - val_auc: 0.9931 - val_prc: 0.4717
Epoch 33/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2275 - tp: 287.0000 - fp: 6485.0000 - tn: 175468.0000 - fn: 36.0000 - accuracy: 0.9642 - precision: 0.0424 - recall: 0.8885 - auc: 0.9708 - prc: 0.2569 - val_loss: 0.0750 - val_tp: 58.0000 - val_fp: 748.0000 - val_tn: 44759.0000 - val_fn: 4.0000 - val_accuracy: 0.9835 - val_precision: 0.0720 - val_recall: 0.9355 - val_auc: 0.9925 - val_prc: 0.4868
Epoch 34/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2112 - tp: 293.0000 - fp: 6540.0000 - tn: 175413.0000 - fn: 30.0000 - accuracy: 0.9640 - precision: 0.0429 - recall: 0.9071 - auc: 0.9725 - prc: 0.2631 - val_loss: 0.0749 - val_tp: 58.0000 - val_fp: 743.0000 - val_tn: 44764.0000 - val_fn: 4.0000 - val_accuracy: 0.9836 - val_precision: 0.0724 - val_recall: 0.9355 - val_auc: 0.9932 - val_prc: 0.4872
Epoch 35/100
90/90 [==============================] - 0s 5ms/step - loss: 0.1742 - tp: 295.0000 - fp: 6452.0000 - tn: 175501.0000 - fn: 28.0000 - accuracy: 0.9644 - precision: 0.0437 - recall: 0.9133 - auc: 0.9836 - prc: 0.2557 - val_loss: 0.0735 - val_tp: 58.0000 - val_fp: 737.0000 - val_tn: 44770.0000 - val_fn: 4.0000 - val_accuracy: 0.9837 - val_precision: 0.0730 - val_recall: 0.9355 - val_auc: 0.9932 - val_prc: 0.4979
Epoch 36/100
90/90 [==============================] - 0s 5ms/step - loss: 0.1936 - tp: 292.0000 - fp: 6315.0000 - tn: 175638.0000 - fn: 31.0000 - accuracy: 0.9652 - precision: 0.0442 - recall: 0.9040 - auc: 0.9780 - prc: 0.2886 - val_loss: 0.0710 - val_tp: 58.0000 - val_fp: 689.0000 - val_tn: 44818.0000 - val_fn: 4.0000 - val_accuracy: 0.9848 - val_precision: 0.0776 - val_recall: 0.9355 - val_auc: 0.9939 - val_prc: 0.5025
Epoch 37/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2613 - tp: 278.0000 - fp: 6396.0000 - tn: 175557.0000 - fn: 45.0000 - accuracy: 0.9647 - precision: 0.0417 - recall: 0.8607 - auc: 0.9633 - prc: 0.2542 - val_loss: 0.0746 - val_tp: 58.0000 - val_fp: 737.0000 - val_tn: 44770.0000 - val_fn: 4.0000 - val_accuracy: 0.9837 - val_precision: 0.0730 - val_recall: 0.9355 - val_auc: 0.9937 - val_prc: 0.4918
Epoch 38/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2671 - tp: 283.0000 - fp: 6465.0000 - tn: 175488.0000 - fn: 40.0000 - accuracy: 0.9643 - precision: 0.0419 - recall: 0.8762 - auc: 0.9557 - prc: 0.2458 - val_loss: 0.0739 - val_tp: 58.0000 - val_fp: 726.0000 - val_tn: 44781.0000 - val_fn: 4.0000 - val_accuracy: 0.9840 - val_precision: 0.0740 - val_recall: 0.9355 - val_auc: 0.9938 - val_prc: 0.4969
Epoch 39/100
90/90 [==============================] - 1s 6ms/step - loss: 0.2119 - tp: 296.0000 - fp: 5810.0000 - tn: 176143.0000 - fn: 27.0000 - accuracy: 0.9680 - precision: 0.0485 - recall: 0.9164 - auc: 0.9685 - prc: 0.2967 - val_loss: 0.0657 - val_tp: 58.0000 - val_fp: 631.0000 - val_tn: 44876.0000 - val_fn: 4.0000 - val_accuracy: 0.9861 - val_precision: 0.0842 - val_recall: 0.9355 - val_auc: 0.9940 - val_prc: 0.5188
Epoch 40/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2004 - tp: 294.0000 - fp: 5516.0000 - tn: 176437.0000 - fn: 29.0000 - accuracy: 0.9696 - precision: 0.0506 - recall: 0.9102 - auc: 0.9755 - prc: 0.3311 - val_loss: 0.0661 - val_tp: 58.0000 - val_fp: 638.0000 - val_tn: 44869.0000 - val_fn: 4.0000 - val_accuracy: 0.9859 - val_precision: 0.0833 - val_recall: 0.9355 - val_auc: 0.9935 - val_prc: 0.5144
Epoch 41/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2000 - tp: 294.0000 - fp: 5583.0000 - tn: 176370.0000 - fn: 29.0000 - accuracy: 0.9692 - precision: 0.0500 - recall: 0.9102 - auc: 0.9717 - prc: 0.3051 - val_loss: 0.0643 - val_tp: 58.0000 - val_fp: 626.0000 - val_tn: 44881.0000 - val_fn: 4.0000 - val_accuracy: 0.9862 - val_precision: 0.0848 - val_recall: 0.9355 - val_auc: 0.9941 - val_prc: 0.5312
Epoch 42/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2316 - tp: 286.0000 - fp: 5605.0000 - tn: 176348.0000 - fn: 37.0000 - accuracy: 0.9690 - precision: 0.0485 - recall: 0.8854 - auc: 0.9688 - prc: 0.3018 - val_loss: 0.0670 - val_tp: 58.0000 - val_fp: 649.0000 - val_tn: 44858.0000 - val_fn: 4.0000 - val_accuracy: 0.9857 - val_precision: 0.0820 - val_recall: 0.9355 - val_auc: 0.9940 - val_prc: 0.5310
Epoch 43/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2384 - tp: 290.0000 - fp: 5628.0000 - tn: 176325.0000 - fn: 33.0000 - accuracy: 0.9689 - precision: 0.0490 - recall: 0.8978 - auc: 0.9626 - prc: 0.3067 - val_loss: 0.0659 - val_tp: 58.0000 - val_fp: 644.0000 - val_tn: 44863.0000 - val_fn: 4.0000 - val_accuracy: 0.9858 - val_precision: 0.0826 - val_recall: 0.9355 - val_auc: 0.9940 - val_prc: 0.5368
Epoch 44/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2075 - tp: 291.0000 - fp: 5661.0000 - tn: 176292.0000 - fn: 32.0000 - accuracy: 0.9688 - precision: 0.0489 - recall: 0.9009 - auc: 0.9719 - prc: 0.2974 - val_loss: 0.0674 - val_tp: 58.0000 - val_fp: 663.0000 - val_tn: 44844.0000 - val_fn: 4.0000 - val_accuracy: 0.9854 - val_precision: 0.0804 - val_recall: 0.9355 - val_auc: 0.9940 - val_prc: 0.5437
Epoch 45/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2102 - tp: 289.0000 - fp: 5726.0000 - tn: 176227.0000 - fn: 34.0000 - accuracy: 0.9684 - precision: 0.0480 - recall: 0.8947 - auc: 0.9709 - prc: 0.3091 - val_loss: 0.0698 - val_tp: 58.0000 - val_fp: 697.0000 - val_tn: 44810.0000 - val_fn: 4.0000 - val_accuracy: 0.9846 - val_precision: 0.0768 - val_recall: 0.9355 - val_auc: 0.9939 - val_prc: 0.5209
Epoch 46/100
90/90 [==============================] - 0s 5ms/step - loss: 0.2114 - tp: 288.0000 - fp: 5755.0000 - tn: 176198.0000 - fn: 35.0000 - accuracy: 0.9682 - precision: 0.0477 - recall: 0.8916 - auc: 0.9739 - prc: 0.2887 - val_loss: 0.0682 - val_tp: 58.0000 - val_fp: 674.0000 - val_tn: 44833.0000 - val_fn: 4.0000 - val_accuracy: 0.9851 - val_precision: 0.0792 - val_recall: 0.9355 - val_auc: 0.9940 - val_prc: 0.5264
Epoch 47/100
90/90 [==============================] - 0s 6ms/step - loss: 0.2127 - tp: 287.0000 - fp: 5897.0000 - tn: 176056.0000 - fn: 36.0000 - accuracy: 0.9675 - precision: 0.0464 - recall: 0.8885 - auc: 0.9732 - prc: 0.2851 - val_loss: 0.0709 - val_tp: 59.0000 - val_fp: 722.0000 - val_tn: 44785.0000 - val_fn: 3.0000 - val_accuracy: 0.9841 - val_precision: 0.0755 - val_recall: 0.9516 - val_auc: 0.9939 - val_prc: 0.5151
Epoch 48/100
90/90 [==============================] - 0s 5ms/step - loss: 0.1867 - tp: 292.0000 - fp: 5726.0000 - tn: 176227.0000 - fn: 31.0000 - accuracy: 0.9684 - precision: 0.0485 - recall: 0.9040 - auc: 0.9796 - prc: 0.2983 - val_loss: 0.0685 - val_tp: 58.0000 - val_fp: 698.0000 - val_tn: 44809.0000 - val_fn: 4.0000 - val_accuracy: 0.9846 - val_precision: 0.0767 - val_recall: 0.9355 - val_auc: 0.9935 - val_prc: 0.5268
Epoch 49/100
90/90 [==============================] - 0s 5ms/step - loss: 0.1625 - tp: 298.0000 - fp: 5454.0000 - tn: 176499.0000 - fn: 25.0000 - accuracy: 0.9699 - precision: 0.0518 - recall: 0.9226 - auc: 0.9832 - prc: 0.3148 - val_loss: 0.0663 - val_tp: 58.0000 - val_fp: 662.0000 - val_tn: 44845.0000 - val_fn: 4.0000 - val_accuracy: 0.9854 - val_precision: 0.0806 - val_recall: 0.9355 - val_auc: 0.9936 - val_prc: 0.5326
Epoch 50/100
90/90 [==============================] - 0s 5ms/step - loss: 0.1926 - tp: 292.0000 - fp: 5470.0000 - tn: 176483.0000 - fn: 31.0000 - accuracy: 0.9698 - precision: 0.0507 - recall: 0.9040 - auc: 0.9791 - prc: 0.3001 - val_loss: 0.0680 - val_tp: 57.0000 - val_fp: 693.0000 - val_tn: 44814.0000 - val_fn: 5.0000 - val_accuracy: 0.9847 - val_precision: 0.0760 - val_recall: 0.9194 - val_auc: 0.9940 - val_prc: 0.5270
Epoch 51/100
85/90 [===========================>..] - ETA: 0s - loss: 0.1912 - tp: 284.0000 - fp: 5282.0000 - tn: 168487.0000 - fn: 27.0000 - accuracy: 0.9695 - precision: 0.0510 - recall: 0.9132 - auc: 0.9767 - prc: 0.3209Restoring model weights from the end of the best epoch: 41.
90/90 [==============================] - 0s 5ms/step - loss: 0.1926 - tp: 294.0000 - fp: 5533.0000 - tn: 176420.0000 - fn: 29.0000 - accuracy: 0.9695 - precision: 0.0505 - recall: 0.9102 - auc: 0.9766 - prc: 0.3170 - val_loss: 0.0649 - val_tp: 57.0000 - val_fp: 645.0000 - val_tn: 44862.0000 - val_fn: 5.0000 - val_accuracy: 0.9857 - val_precision: 0.0812 - val_recall: 0.9194 - val_auc: 0.9936 - val_prc: 0.5441
Epoch 51: early stopping

查看训练历史记录

plot_metrics(weighted_history)

png

评估指标

train_predictions_weighted = weighted_model.predict(train_features, batch_size=BATCH_SIZE)
test_predictions_weighted = weighted_model.predict(test_features, batch_size=BATCH_SIZE)
90/90 [==============================] - 0s 1ms/step
28/28 [==============================] - 0s 1ms/step
weighted_results = weighted_model.evaluate(test_features, test_labels,
                                           batch_size=BATCH_SIZE, verbose=0)
for name, value in zip(weighted_model.metrics_names, weighted_results):
  print(name, ': ', value)
print()

plot_cm(test_labels, test_predictions_weighted)
loss :  0.06824768334627151
tp :  94.0
fp :  792.0
tn :  56063.0
fn :  13.0
accuracy :  0.9858677983283997
precision :  0.10609480738639832
recall :  0.8785046935081482
auc :  0.969123363494873
prc :  0.5543282628059387

Legitimate Transactions Detected (True Negatives):  56063
Legitimate Transactions Incorrectly Detected (False Positives):  792
Fraudulent Transactions Missed (False Negatives):  13
Fraudulent Transactions Detected (True Positives):  94
Total Fraudulent Transactions:  107

png

在这里,您可以看到,使用类权重时,由于存在更多假正例,准确率和精确率较低,但是相反,由于模型也找到了更多真正例,召回率和 AUC 较高。尽管准确率较低,但是此模型具有较高的召回率(且识别出了更多欺诈交易)。当然,两种类型的错误都有代价(您也不希望因将过多合法交易标记为欺诈来打扰客户)。请在应用时认真权衡这些不同类型的错误。
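上面打印的指标可以直接从混淆矩阵的计数验证。下面是一个极简的核算示例,使用的 tp、fp、fn 取自上方的评估输出:

```python
# 根据上面打印的混淆矩阵计数(tp=94, fp=792, fn=13)手动验证精确率与召回率。
tp, fp, fn = 94, 792, 13

precision = tp / (tp + fp)   # 被标记为欺诈的交易中,真正欺诈的比例
recall = tp / (tp + fn)      # 全部 107 笔欺诈交易中,被模型找出的比例

print(f"precision: {precision:.4f}")  # ≈ 0.1061
print(f"recall: {recall:.4f}")        # ≈ 0.8785
```

这与 `weighted_model.evaluate` 报告的 precision ≈ 0.1061、recall ≈ 0.8785 一致:类权重以大量假正例为代价换取了很高的召回率。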

绘制 ROC

plot_roc("Train Baseline", train_labels, train_predictions_baseline, color=colors[0])
plot_roc("Test Baseline", test_labels, test_predictions_baseline, color=colors[0], linestyle='--')

plot_roc("Train Weighted", train_labels, train_predictions_weighted, color=colors[1])
plot_roc("Test Weighted", test_labels, test_predictions_weighted, color=colors[1], linestyle='--')


plt.legend(loc='lower right');

png

绘制 AUPRC

plot_prc("Train Baseline", train_labels, train_predictions_baseline, color=colors[0])
plot_prc("Test Baseline", test_labels, test_predictions_baseline, color=colors[0], linestyle='--')

plot_prc("Train Weighted", train_labels, train_predictions_weighted, color=colors[1])
plot_prc("Test Weighted", test_labels, test_predictions_weighted, color=colors[1], linestyle='--')


plt.legend(loc='lower right');

png

过采样

对占少数的类进行过采样

一种相关方法是通过对占少数的类进行过采样来对数据集进行重新采样。

pos_features = train_features[bool_train_labels]
neg_features = train_features[~bool_train_labels]

pos_labels = train_labels[bool_train_labels]
neg_labels = train_labels[~bool_train_labels]

使用 NumPy

您可以通过从正样本中选择正确数量的随机索引来手动平衡数据集:

ids = np.arange(len(pos_features))
choices = np.random.choice(ids, len(neg_features))

res_pos_features = pos_features[choices]
res_pos_labels = pos_labels[choices]

res_pos_features.shape
(181953, 29)
resampled_features = np.concatenate([res_pos_features, neg_features], axis=0)
resampled_labels = np.concatenate([res_pos_labels, neg_labels], axis=0)

order = np.arange(len(resampled_labels))
np.random.shuffle(order)
resampled_features = resampled_features[order]
resampled_labels = resampled_labels[order]

resampled_features.shape
(363906, 29)
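上面的过采样逻辑可以用一个小规模的玩具示例来验证(样本数量是任意假设的,与本数据集无关):对少数类有放回地抽样,直到与多数类数量相同,合并后两类各占一半。

```python
import numpy as np

rng = np.random.default_rng(0)
pos = np.ones(5, dtype=int)      # 5 个正样本(少数类),标签为 1
neg = np.zeros(100, dtype=int)   # 100 个负样本(多数类),标签为 0

# 与上文相同的做法:从正样本索引中有放回地抽取 len(neg) 个
choices = rng.choice(len(pos), len(neg))
res_pos = pos[choices]

resampled = np.concatenate([res_pos, neg])
rng.shuffle(resampled)

# 总量翻倍,正样本占比正好一半
print(resampled.size, resampled.mean())  # 200 0.5
```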

使用 tf.data

如果您使用的是 tf.data,则生成平衡样本最简单的方法是从 positive 和 negative 两个数据集开始,然后将它们合并。有关更多示例,请参阅 tf.data 指南。

BUFFER_SIZE = 100000

def make_ds(features, labels):
  ds = tf.data.Dataset.from_tensor_slices((features, labels))#.cache()
  ds = ds.shuffle(BUFFER_SIZE).repeat()
  return ds

pos_ds = make_ds(pos_features, pos_labels)
neg_ds = make_ds(neg_features, neg_labels)

每个数据集都会提供 (feature, label) 对:

for features, label in pos_ds.take(1):
  print("Features:\n", features.numpy())
  print()
  print("Label: ", label.numpy())
Features:
 [-2.07840741  0.58089391 -3.34012343  3.5071109  -0.1178388  -2.12072939
 -5.          1.35353276 -2.72516911 -5.          5.         -5.
 -1.82559924 -5.          0.24355001 -5.         -5.         -5.
  3.2355405   0.0782585   1.41074231  0.22912395 -1.63502264 -0.78836674
 -0.58449057 -0.21862085  4.67832086  1.50506599 -1.45299089]

Label:  1

使用 tf.data.Dataset.sample_from_datasets 将二者合并起来:

resampled_ds = tf.data.Dataset.sample_from_datasets([pos_ds, neg_ds], weights=[0.5, 0.5])
resampled_ds = resampled_ds.batch(BATCH_SIZE).prefetch(2)
for features, label in resampled_ds.take(1):
  print(label.numpy().mean())
0.509765625

要使用此数据集,您还需要知道每个周期包含的步骤数。

在这种情况下,“周期”的定义就不那么明确了。假设它是遍历一次所有负样本所需的批次数量:

resampled_steps_per_epoch = np.ceil(2.0*neg/BATCH_SIZE)
resampled_steps_per_epoch
278.0

在过采样数据上进行训练

现在尝试使用重新采样后的数据集(而非使用类权重)来训练模型,对比一下这两种方法有何区别。

注:因为数据平衡是通过复制正样本实现的,所以数据集的总大小变大了,且每个周期运行的训练步骤也增加了。

resampled_model = make_model()
resampled_model.load_weights(initial_weights)

# Reset the bias to zero, since this dataset is balanced.
output_layer = resampled_model.layers[-1] 
output_layer.bias.assign([0])

val_ds = tf.data.Dataset.from_tensor_slices((val_features, val_labels)).cache()
val_ds = val_ds.batch(BATCH_SIZE).prefetch(2) 

resampled_history = resampled_model.fit(
    resampled_ds,
    epochs=EPOCHS,
    steps_per_epoch=resampled_steps_per_epoch,
    callbacks = [early_stopping],
    validation_data=val_ds)
Epoch 1/100
278/278 [==============================] - 8s 24ms/step - loss: 0.5942 - tp: 217449.0000 - fp: 73226.0000 - tn: 268001.0000 - fn: 67630.0000 - accuracy: 0.7751 - precision: 0.7481 - recall: 0.7628 - auc: 0.8456 - prc: 0.8749 - val_loss: 0.2242 - val_tp: 56.0000 - val_fp: 1803.0000 - val_tn: 43704.0000 - val_fn: 6.0000 - val_accuracy: 0.9603 - val_precision: 0.0301 - val_recall: 0.9032 - val_auc: 0.9777 - val_prc: 0.7398
Epoch 2/100
278/278 [==============================] - 6s 22ms/step - loss: 0.2280 - tp: 252608.0000 - fp: 22427.0000 - tn: 262095.0000 - fn: 32214.0000 - accuracy: 0.9040 - precision: 0.9185 - recall: 0.8869 - auc: 0.9646 - prc: 0.9718 - val_loss: 0.1238 - val_tp: 58.0000 - val_fp: 890.0000 - val_tn: 44617.0000 - val_fn: 4.0000 - val_accuracy: 0.9804 - val_precision: 0.0612 - val_recall: 0.9355 - val_auc: 0.9897 - val_prc: 0.7672
Epoch 3/100
278/278 [==============================] - 6s 22ms/step - loss: 0.1737 - tp: 257274.0000 - fp: 12975.0000 - tn: 271731.0000 - fn: 27364.0000 - accuracy: 0.9291 - precision: 0.9520 - recall: 0.9039 - auc: 0.9803 - prc: 0.9833 - val_loss: 0.0906 - val_tp: 59.0000 - val_fp: 722.0000 - val_tn: 44785.0000 - val_fn: 3.0000 - val_accuracy: 0.9841 - val_precision: 0.0755 - val_recall: 0.9516 - val_auc: 0.9931 - val_prc: 0.7815
Epoch 4/100
278/278 [==============================] - 6s 22ms/step - loss: 0.1483 - tp: 260348.0000 - fp: 9793.0000 - tn: 274810.0000 - fn: 24393.0000 - accuracy: 0.9400 - precision: 0.9637 - recall: 0.9143 - auc: 0.9861 - prc: 0.9878 - val_loss: 0.0755 - val_tp: 59.0000 - val_fp: 660.0000 - val_tn: 44847.0000 - val_fn: 3.0000 - val_accuracy: 0.9855 - val_precision: 0.0821 - val_recall: 0.9516 - val_auc: 0.9945 - val_prc: 0.7813
Epoch 5/100
278/278 [==============================] - 6s 23ms/step - loss: 0.1334 - tp: 262624.0000 - fp: 8486.0000 - tn: 275648.0000 - fn: 22586.0000 - accuracy: 0.9454 - precision: 0.9687 - recall: 0.9208 - auc: 0.9890 - prc: 0.9901 - val_loss: 0.0663 - val_tp: 59.0000 - val_fp: 617.0000 - val_tn: 44890.0000 - val_fn: 3.0000 - val_accuracy: 0.9864 - val_precision: 0.0873 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7461
Epoch 6/100
278/278 [==============================] - 6s 22ms/step - loss: 0.1241 - tp: 262799.0000 - fp: 7943.0000 - tn: 277018.0000 - fn: 21584.0000 - accuracy: 0.9481 - precision: 0.9707 - recall: 0.9241 - auc: 0.9907 - prc: 0.9913 - val_loss: 0.0591 - val_tp: 59.0000 - val_fp: 558.0000 - val_tn: 44949.0000 - val_fn: 3.0000 - val_accuracy: 0.9877 - val_precision: 0.0956 - val_recall: 0.9516 - val_auc: 0.9954 - val_prc: 0.7482
Epoch 7/100
278/278 [==============================] - 6s 23ms/step - loss: 0.1164 - tp: 264003.0000 - fp: 7421.0000 - tn: 277055.0000 - fn: 20865.0000 - accuracy: 0.9503 - precision: 0.9727 - recall: 0.9268 - auc: 0.9922 - prc: 0.9925 - val_loss: 0.0545 - val_tp: 59.0000 - val_fp: 530.0000 - val_tn: 44977.0000 - val_fn: 3.0000 - val_accuracy: 0.9883 - val_precision: 0.1002 - val_recall: 0.9516 - val_auc: 0.9962 - val_prc: 0.7377
Epoch 8/100
278/278 [==============================] - 6s 22ms/step - loss: 0.1082 - tp: 265047.0000 - fp: 6975.0000 - tn: 277742.0000 - fn: 19580.0000 - accuracy: 0.9534 - precision: 0.9744 - recall: 0.9312 - auc: 0.9935 - prc: 0.9935 - val_loss: 0.0493 - val_tp: 58.0000 - val_fp: 505.0000 - val_tn: 45002.0000 - val_fn: 4.0000 - val_accuracy: 0.9888 - val_precision: 0.1030 - val_recall: 0.9355 - val_auc: 0.9966 - val_prc: 0.7386
Epoch 9/100
278/278 [==============================] - 6s 22ms/step - loss: 0.1024 - tp: 265887.0000 - fp: 6678.0000 - tn: 277932.0000 - fn: 18847.0000 - accuracy: 0.9552 - precision: 0.9755 - recall: 0.9338 - auc: 0.9943 - prc: 0.9942 - val_loss: 0.0448 - val_tp: 58.0000 - val_fp: 490.0000 - val_tn: 45017.0000 - val_fn: 4.0000 - val_accuracy: 0.9892 - val_precision: 0.1058 - val_recall: 0.9355 - val_auc: 0.9968 - val_prc: 0.7401
Epoch 10/100
278/278 [==============================] - 6s 22ms/step - loss: 0.0972 - tp: 267711.0000 - fp: 6632.0000 - tn: 277318.0000 - fn: 17683.0000 - accuracy: 0.9573 - precision: 0.9758 - recall: 0.9380 - auc: 0.9949 - prc: 0.9948 - val_loss: 0.0406 - val_tp: 58.0000 - val_fp: 464.0000 - val_tn: 45043.0000 - val_fn: 4.0000 - val_accuracy: 0.9897 - val_precision: 0.1111 - val_recall: 0.9355 - val_auc: 0.9968 - val_prc: 0.7299
Epoch 11/100
278/278 [==============================] - 6s 22ms/step - loss: 0.0919 - tp: 267793.0000 - fp: 6502.0000 - tn: 278251.0000 - fn: 16798.0000 - accuracy: 0.9591 - precision: 0.9763 - recall: 0.9410 - auc: 0.9955 - prc: 0.9953 - val_loss: 0.0383 - val_tp: 58.0000 - val_fp: 449.0000 - val_tn: 45058.0000 - val_fn: 4.0000 - val_accuracy: 0.9901 - val_precision: 0.1144 - val_recall: 0.9355 - val_auc: 0.9968 - val_prc: 0.7198
Epoch 12/100
278/278 [==============================] - 6s 22ms/step - loss: 0.0883 - tp: 268569.0000 - fp: 6626.0000 - tn: 278126.0000 - fn: 16023.0000 - accuracy: 0.9602 - precision: 0.9759 - recall: 0.9437 - auc: 0.9958 - prc: 0.9956 - val_loss: 0.0345 - val_tp: 58.0000 - val_fp: 417.0000 - val_tn: 45090.0000 - val_fn: 4.0000 - val_accuracy: 0.9908 - val_precision: 0.1221 - val_recall: 0.9355 - val_auc: 0.9966 - val_prc: 0.7206
Epoch 13/100
278/278 [==============================] - 6s 22ms/step - loss: 0.0853 - tp: 269144.0000 - fp: 6723.0000 - tn: 278376.0000 - fn: 15101.0000 - accuracy: 0.9617 - precision: 0.9756 - recall: 0.9469 - auc: 0.9961 - prc: 0.9958 - val_loss: 0.0329 - val_tp: 58.0000 - val_fp: 413.0000 - val_tn: 45094.0000 - val_fn: 4.0000 - val_accuracy: 0.9908 - val_precision: 0.1231 - val_recall: 0.9355 - val_auc: 0.9966 - val_prc: 0.7216
Epoch 14/100
278/278 [==============================] - 6s 22ms/step - loss: 0.0818 - tp: 270875.0000 - fp: 6738.0000 - tn: 277592.0000 - fn: 14139.0000 - accuracy: 0.9633 - precision: 0.9757 - recall: 0.9504 - auc: 0.9964 - prc: 0.9961 - val_loss: 0.0310 - val_tp: 58.0000 - val_fp: 410.0000 - val_tn: 45097.0000 - val_fn: 4.0000 - val_accuracy: 0.9909 - val_precision: 0.1239 - val_recall: 0.9355 - val_auc: 0.9963 - val_prc: 0.7231
Epoch 15/100
278/278 [==============================] - 6s 22ms/step - loss: 0.0787 - tp: 270967.0000 - fp: 6801.0000 - tn: 278050.0000 - fn: 13526.0000 - accuracy: 0.9643 - precision: 0.9755 - recall: 0.9525 - auc: 0.9966 - prc: 0.9964 - val_loss: 0.0299 - val_tp: 58.0000 - val_fp: 404.0000 - val_tn: 45103.0000 - val_fn: 4.0000 - val_accuracy: 0.9910 - val_precision: 0.1255 - val_recall: 0.9355 - val_auc: 0.9961 - val_prc: 0.7227
Epoch 16/100
278/278 [==============================] - 6s 22ms/step - loss: 0.0753 - tp: 271791.0000 - fp: 6734.0000 - tn: 278346.0000 - fn: 12473.0000 - accuracy: 0.9663 - precision: 0.9758 - recall: 0.9561 - auc: 0.9969 - prc: 0.9966 - val_loss: 0.0276 - val_tp: 58.0000 - val_fp: 379.0000 - val_tn: 45128.0000 - val_fn: 4.0000 - val_accuracy: 0.9916 - val_precision: 0.1327 - val_recall: 0.9355 - val_auc: 0.9956 - val_prc: 0.7239
Epoch 17/100
278/278 [==============================] - 6s 22ms/step - loss: 0.0724 - tp: 272182.0000 - fp: 6699.0000 - tn: 278370.0000 - fn: 12093.0000 - accuracy: 0.9670 - precision: 0.9760 - recall: 0.9575 - auc: 0.9971 - prc: 0.9968 - val_loss: 0.0257 - val_tp: 58.0000 - val_fp: 368.0000 - val_tn: 45139.0000 - val_fn: 4.0000 - val_accuracy: 0.9918 - val_precision: 0.1362 - val_recall: 0.9355 - val_auc: 0.9883 - val_prc: 0.7231
Epoch 18/100
278/278 [==============================] - 6s 22ms/step - loss: 0.0697 - tp: 272881.0000 - fp: 6827.0000 - tn: 278320.0000 - fn: 11316.0000 - accuracy: 0.9681 - precision: 0.9756 - recall: 0.9602 - auc: 0.9973 - prc: 0.9970 - val_loss: 0.0247 - val_tp: 58.0000 - val_fp: 367.0000 - val_tn: 45140.0000 - val_fn: 4.0000 - val_accuracy: 0.9919 - val_precision: 0.1365 - val_recall: 0.9355 - val_auc: 0.9883 - val_prc: 0.7242
Epoch 19/100
278/278 [==============================] - ETA: 0s - loss: 0.0674 - tp: 274138.0000 - fp: 6777.0000 - tn: 277388.0000 - fn: 11041.0000 - accuracy: 0.9687 - precision: 0.9759 - recall: 0.9613 - auc: 0.9974 - prc: 0.9971Restoring model weights from the end of the best epoch: 9.
278/278 [==============================] - 6s 22ms/step - loss: 0.0674 - tp: 274138.0000 - fp: 6777.0000 - tn: 277388.0000 - fn: 11041.0000 - accuracy: 0.9687 - precision: 0.9759 - recall: 0.9613 - auc: 0.9974 - prc: 0.9971 - val_loss: 0.0244 - val_tp: 58.0000 - val_fp: 383.0000 - val_tn: 45124.0000 - val_fn: 4.0000 - val_accuracy: 0.9915 - val_precision: 0.1315 - val_recall: 0.9355 - val_auc: 0.9883 - val_prc: 0.7233
Epoch 19: early stopping

If the training process considered the whole dataset on each gradient update, this oversampling would be basically identical to class weighting.

But when training the model batch-wise, as you did above, the oversampled data provides a smoother gradient signal: instead of each positive example being shown in one batch with a large weight, they are shown in many different batches, each time with a small weight.

This smoother gradient signal makes the model easier to train.
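The balanced `resampled_ds` used in this section is constructed earlier in the tutorial; the core idea — oversample the minority class with replacement so every minibatch is roughly 50/50 — can be sketched with plain NumPy. This is a minimal sketch on toy data; the array names and shapes here are hypothetical, not the tutorial's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy imbalanced data: 1000 negatives, only 10 positives (hypothetical shapes).
neg_features = rng.normal(size=(1000, 4))
pos_features = rng.normal(loc=2.0, size=(10, 4))

# Oversample the positive class with replacement until the classes match.
ids = rng.integers(0, len(pos_features), size=len(neg_features))
pos_resampled = pos_features[ids]

features = np.concatenate([neg_features, pos_resampled])
labels = np.concatenate([np.zeros(len(neg_features)),
                         np.ones(len(pos_resampled))])

# Shuffle so each minibatch drawn in order is approximately 50/50.
order = rng.permutation(len(features))
features, labels = features[order], labels[order]

print(labels.mean())  # → 0.5
```

With balanced batches, every gradient step carries signal from positive examples, instead of most batches containing none at all.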

Check the training history

Note that the metric distributions will be different here, because the training data has a totally different distribution from the validation and test data.

plot_metrics(resampled_history)

(plots of the training-history metrics for the resampled model)
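One way to see why the training-set metrics look so much better than the validation metrics: precision depends directly on class prevalence, and the resampled training data is ~50% fraud while the validation data keeps the real ~0.17% rate. A back-of-the-envelope check — the 95% recall / 98% specificity operating point below is made up for illustration, not taken from the model:

```python
# Hypothetical classifier operating point (illustrative, not the model's).
recall, specificity = 0.95, 0.98

def precision(prevalence):
    """Precision of the classifier above at a given positive-class prevalence."""
    tp = recall * prevalence                 # true-positive rate mass
    fp = (1 - specificity) * (1 - prevalence)  # false-positive rate mass
    return tp / (tp + fp)

print(precision(0.5))          # balanced training batches  → ≈ 0.979
print(precision(492 / 284807)) # real fraud prevalence      → ≈ 0.076
```

The same classifier drops from ~98% precision on balanced batches to under 8% at the real prevalence, which is consistent with the gap between `precision` and `val_precision` in the logs below.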

Re-train

Because training is easier on the balanced data, the above training procedure may overfit quickly.

So, break up the epochs to give the callbacks.EarlyStopping finer control over when to stop training.

resampled_model = make_model()
resampled_model.load_weights(initial_weights)

# Reset the output bias to zero, since the resampled dataset is balanced.
output_layer = resampled_model.layers[-1]
output_layer.bias.assign([0])

resampled_history = resampled_model.fit(
    resampled_ds,
    # These are not real epochs: each "epoch" is only 20 training steps.
    steps_per_epoch=20,
    epochs=10*EPOCHS,
    callbacks=[early_stopping],
    validation_data=val_ds)
Epoch 1/1000
20/20 [==============================] - 2s 50ms/step - loss: 1.8499 - tp: 6136.0000 - fp: 8418.0000 - tn: 57604.0000 - fn: 14371.0000 - accuracy: 0.7366 - precision: 0.4216 - recall: 0.2992 - auc: 0.7069 - prc: 0.4457 - val_loss: 0.6770 - val_tp: 15.0000 - val_fp: 16830.0000 - val_tn: 28677.0000 - val_fn: 47.0000 - val_accuracy: 0.6296 - val_precision: 8.9047e-04 - val_recall: 0.2419 - val_auc: 0.3590 - val_prc: 0.0260
Epoch 2/1000
20/20 [==============================] - 0s 25ms/step - loss: 1.1067 - tp: 11189.0000 - fp: 8286.0000 - tn: 12181.0000 - fn: 9304.0000 - accuracy: 0.5706 - precision: 0.5745 - recall: 0.5460 - auc: 0.5735 - prc: 0.6885 - val_loss: 0.6708 - val_tp: 43.0000 - val_fp: 16500.0000 - val_tn: 29007.0000 - val_fn: 19.0000 - val_accuracy: 0.6375 - val_precision: 0.0026 - val_recall: 0.6935 - val_auc: 0.7002 - val_prc: 0.1706
Epoch 3/1000
20/20 [==============================] - 0s 25ms/step - loss: 0.8005 - tp: 14043.0000 - fp: 7850.0000 - tn: 12490.0000 - fn: 6577.0000 - accuracy: 0.6478 - precision: 0.6414 - recall: 0.6810 - auc: 0.7054 - prc: 0.7913 - val_loss: 0.6280 - val_tp: 50.0000 - val_fp: 14843.0000 - val_tn: 30664.0000 - val_fn: 12.0000 - val_accuracy: 0.6740 - val_precision: 0.0034 - val_recall: 0.8065 - val_auc: 0.8319 - val_prc: 0.2981
Epoch 4/1000
20/20 [==============================] - 0s 25ms/step - loss: 0.6561 - tp: 15515.0000 - fp: 7307.0000 - tn: 12943.0000 - fn: 5195.0000 - accuracy: 0.6948 - precision: 0.6798 - recall: 0.7492 - auc: 0.7789 - prc: 0.8453 - val_loss: 0.5684 - val_tp: 54.0000 - val_fp: 12490.0000 - val_tn: 33017.0000 - val_fn: 8.0000 - val_accuracy: 0.7257 - val_precision: 0.0043 - val_recall: 0.8710 - val_auc: 0.8853 - val_prc: 0.4567
Epoch 5/1000
20/20 [==============================] - 0s 25ms/step - loss: 0.5556 - tp: 16100.0000 - fp: 6353.0000 - tn: 14172.0000 - fn: 4335.0000 - accuracy: 0.7391 - precision: 0.7171 - recall: 0.7879 - auc: 0.8283 - prc: 0.8788 - val_loss: 0.5071 - val_tp: 54.0000 - val_fp: 10207.0000 - val_tn: 35300.0000 - val_fn: 8.0000 - val_accuracy: 0.7758 - val_precision: 0.0053 - val_recall: 0.8710 - val_auc: 0.9114 - val_prc: 0.5618
Epoch 6/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.4985 - tp: 16451.0000 - fp: 5675.0000 - tn: 14851.0000 - fn: 3983.0000 - accuracy: 0.7642 - precision: 0.7435 - recall: 0.8051 - auc: 0.8550 - prc: 0.8970 - val_loss: 0.4507 - val_tp: 54.0000 - val_fp: 8146.0000 - val_tn: 37361.0000 - val_fn: 8.0000 - val_accuracy: 0.8211 - val_precision: 0.0066 - val_recall: 0.8710 - val_auc: 0.9299 - val_prc: 0.6431
Epoch 7/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.4455 - tp: 16902.0000 - fp: 5135.0000 - tn: 15357.0000 - fn: 3566.0000 - accuracy: 0.7876 - precision: 0.7670 - recall: 0.8258 - auc: 0.8796 - prc: 0.9143 - val_loss: 0.4036 - val_tp: 55.0000 - val_fp: 6512.0000 - val_tn: 38995.0000 - val_fn: 7.0000 - val_accuracy: 0.8569 - val_precision: 0.0084 - val_recall: 0.8871 - val_auc: 0.9440 - val_prc: 0.6831
Epoch 8/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.4069 - tp: 17088.0000 - fp: 4455.0000 - tn: 16121.0000 - fn: 3296.0000 - accuracy: 0.8108 - precision: 0.7932 - recall: 0.8383 - auc: 0.8949 - prc: 0.9252 - val_loss: 0.3646 - val_tp: 55.0000 - val_fp: 5245.0000 - val_tn: 40262.0000 - val_fn: 7.0000 - val_accuracy: 0.8847 - val_precision: 0.0104 - val_recall: 0.8871 - val_auc: 0.9542 - val_prc: 0.6983
Epoch 9/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.3817 - tp: 17396.0000 - fp: 4094.0000 - tn: 16275.0000 - fn: 3195.0000 - accuracy: 0.8220 - precision: 0.8095 - recall: 0.8448 - auc: 0.9066 - prc: 0.9333 - val_loss: 0.3320 - val_tp: 56.0000 - val_fp: 4248.0000 - val_tn: 41259.0000 - val_fn: 6.0000 - val_accuracy: 0.9066 - val_precision: 0.0130 - val_recall: 0.9032 - val_auc: 0.9606 - val_prc: 0.6950
Epoch 10/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.3564 - tp: 17443.0000 - fp: 3656.0000 - tn: 16835.0000 - fn: 3026.0000 - accuracy: 0.8369 - precision: 0.8267 - recall: 0.8522 - auc: 0.9169 - prc: 0.9400 - val_loss: 0.3030 - val_tp: 56.0000 - val_fp: 3424.0000 - val_tn: 42083.0000 - val_fn: 6.0000 - val_accuracy: 0.9247 - val_precision: 0.0161 - val_recall: 0.9032 - val_auc: 0.9658 - val_prc: 0.7059
Epoch 11/1000
20/20 [==============================] - 1s 28ms/step - loss: 0.3385 - tp: 17350.0000 - fp: 3304.0000 - tn: 17366.0000 - fn: 2940.0000 - accuracy: 0.8476 - precision: 0.8400 - recall: 0.8551 - auc: 0.9242 - prc: 0.9438 - val_loss: 0.2776 - val_tp: 56.0000 - val_fp: 2832.0000 - val_tn: 42675.0000 - val_fn: 6.0000 - val_accuracy: 0.9377 - val_precision: 0.0194 - val_recall: 0.9032 - val_auc: 0.9697 - val_prc: 0.7178
Epoch 12/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.3141 - tp: 17655.0000 - fp: 3062.0000 - tn: 17475.0000 - fn: 2768.0000 - accuracy: 0.8577 - precision: 0.8522 - recall: 0.8645 - auc: 0.9341 - prc: 0.9511 - val_loss: 0.2562 - val_tp: 56.0000 - val_fp: 2346.0000 - val_tn: 43161.0000 - val_fn: 6.0000 - val_accuracy: 0.9484 - val_precision: 0.0233 - val_recall: 0.9032 - val_auc: 0.9733 - val_prc: 0.7251
Epoch 13/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.2985 - tp: 17806.0000 - fp: 2632.0000 - tn: 17822.0000 - fn: 2700.0000 - accuracy: 0.8698 - precision: 0.8712 - recall: 0.8683 - auc: 0.9401 - prc: 0.9553 - val_loss: 0.2386 - val_tp: 56.0000 - val_fp: 2020.0000 - val_tn: 43487.0000 - val_fn: 6.0000 - val_accuracy: 0.9555 - val_precision: 0.0270 - val_recall: 0.9032 - val_auc: 0.9763 - val_prc: 0.7338
Epoch 14/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.2857 - tp: 17779.0000 - fp: 2558.0000 - tn: 17972.0000 - fn: 2651.0000 - accuracy: 0.8728 - precision: 0.8742 - recall: 0.8702 - auc: 0.9450 - prc: 0.9582 - val_loss: 0.2225 - val_tp: 56.0000 - val_fp: 1795.0000 - val_tn: 43712.0000 - val_fn: 6.0000 - val_accuracy: 0.9605 - val_precision: 0.0303 - val_recall: 0.9032 - val_auc: 0.9786 - val_prc: 0.7412
Epoch 15/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.2742 - tp: 17908.0000 - fp: 2298.0000 - tn: 18154.0000 - fn: 2600.0000 - accuracy: 0.8804 - precision: 0.8863 - recall: 0.8732 - auc: 0.9486 - prc: 0.9609 - val_loss: 0.2093 - val_tp: 56.0000 - val_fp: 1658.0000 - val_tn: 43849.0000 - val_fn: 6.0000 - val_accuracy: 0.9635 - val_precision: 0.0327 - val_recall: 0.9032 - val_auc: 0.9806 - val_prc: 0.7430
Epoch 16/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.2649 - tp: 17957.0000 - fp: 2189.0000 - tn: 18248.0000 - fn: 2566.0000 - accuracy: 0.8839 - precision: 0.8913 - recall: 0.8750 - auc: 0.9521 - prc: 0.9632 - val_loss: 0.1977 - val_tp: 57.0000 - val_fp: 1517.0000 - val_tn: 43990.0000 - val_fn: 5.0000 - val_accuracy: 0.9666 - val_precision: 0.0362 - val_recall: 0.9194 - val_auc: 0.9821 - val_prc: 0.7447
Epoch 17/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.2550 - tp: 17988.0000 - fp: 2062.0000 - tn: 18465.0000 - fn: 2445.0000 - accuracy: 0.8900 - precision: 0.8972 - recall: 0.8803 - auc: 0.9562 - prc: 0.9657 - val_loss: 0.1869 - val_tp: 57.0000 - val_fp: 1392.0000 - val_tn: 44115.0000 - val_fn: 5.0000 - val_accuracy: 0.9693 - val_precision: 0.0393 - val_recall: 0.9194 - val_auc: 0.9833 - val_prc: 0.7479
Epoch 18/1000
20/20 [==============================] - 1s 29ms/step - loss: 0.2473 - tp: 18166.0000 - fp: 1916.0000 - tn: 18475.0000 - fn: 2403.0000 - accuracy: 0.8946 - precision: 0.9046 - recall: 0.8832 - auc: 0.9586 - prc: 0.9677 - val_loss: 0.1768 - val_tp: 57.0000 - val_fp: 1290.0000 - val_tn: 44217.0000 - val_fn: 5.0000 - val_accuracy: 0.9716 - val_precision: 0.0423 - val_recall: 0.9194 - val_auc: 0.9841 - val_prc: 0.7493
Epoch 19/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.2430 - tp: 18161.0000 - fp: 1862.0000 - tn: 18558.0000 - fn: 2379.0000 - accuracy: 0.8965 - precision: 0.9070 - recall: 0.8842 - auc: 0.9599 - prc: 0.9688 - val_loss: 0.1682 - val_tp: 57.0000 - val_fp: 1215.0000 - val_tn: 44292.0000 - val_fn: 5.0000 - val_accuracy: 0.9732 - val_precision: 0.0448 - val_recall: 0.9194 - val_auc: 0.9847 - val_prc: 0.7525
Epoch 20/1000
20/20 [==============================] - 0s 25ms/step - loss: 0.2342 - tp: 18028.0000 - fp: 1701.0000 - tn: 18869.0000 - fn: 2362.0000 - accuracy: 0.9008 - precision: 0.9138 - recall: 0.8842 - auc: 0.9630 - prc: 0.9704 - val_loss: 0.1607 - val_tp: 57.0000 - val_fp: 1143.0000 - val_tn: 44364.0000 - val_fn: 5.0000 - val_accuracy: 0.9748 - val_precision: 0.0475 - val_recall: 0.9194 - val_auc: 0.9857 - val_prc: 0.7559
Epoch 21/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.2263 - tp: 18344.0000 - fp: 1587.0000 - tn: 18737.0000 - fn: 2292.0000 - accuracy: 0.9053 - precision: 0.9204 - recall: 0.8889 - auc: 0.9650 - prc: 0.9726 - val_loss: 0.1541 - val_tp: 57.0000 - val_fp: 1094.0000 - val_tn: 44413.0000 - val_fn: 5.0000 - val_accuracy: 0.9759 - val_precision: 0.0495 - val_recall: 0.9194 - val_auc: 0.9865 - val_prc: 0.7574
Epoch 22/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.2208 - tp: 18114.0000 - fp: 1529.0000 - tn: 19017.0000 - fn: 2300.0000 - accuracy: 0.9065 - precision: 0.9222 - recall: 0.8873 - auc: 0.9663 - prc: 0.9729 - val_loss: 0.1483 - val_tp: 57.0000 - val_fp: 1054.0000 - val_tn: 44453.0000 - val_fn: 5.0000 - val_accuracy: 0.9768 - val_precision: 0.0513 - val_recall: 0.9194 - val_auc: 0.9870 - val_prc: 0.7593
Epoch 23/1000
20/20 [==============================] - 1s 32ms/step - loss: 0.2161 - tp: 18199.0000 - fp: 1449.0000 - tn: 19038.0000 - fn: 2274.0000 - accuracy: 0.9091 - precision: 0.9263 - recall: 0.8889 - auc: 0.9678 - prc: 0.9743 - val_loss: 0.1438 - val_tp: 57.0000 - val_fp: 1026.0000 - val_tn: 44481.0000 - val_fn: 5.0000 - val_accuracy: 0.9774 - val_precision: 0.0526 - val_recall: 0.9194 - val_auc: 0.9876 - val_prc: 0.7610
Epoch 24/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.2136 - tp: 18294.0000 - fp: 1366.0000 - tn: 19070.0000 - fn: 2230.0000 - accuracy: 0.9122 - precision: 0.9305 - recall: 0.8913 - auc: 0.9691 - prc: 0.9751 - val_loss: 0.1390 - val_tp: 57.0000 - val_fp: 1001.0000 - val_tn: 44506.0000 - val_fn: 5.0000 - val_accuracy: 0.9779 - val_precision: 0.0539 - val_recall: 0.9194 - val_auc: 0.9881 - val_prc: 0.7624
Epoch 25/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.2080 - tp: 18105.0000 - fp: 1313.0000 - tn: 19342.0000 - fn: 2200.0000 - accuracy: 0.9142 - precision: 0.9324 - recall: 0.8917 - auc: 0.9706 - prc: 0.9758 - val_loss: 0.1343 - val_tp: 58.0000 - val_fp: 965.0000 - val_tn: 44542.0000 - val_fn: 4.0000 - val_accuracy: 0.9787 - val_precision: 0.0567 - val_recall: 0.9355 - val_auc: 0.9888 - val_prc: 0.7624
Epoch 26/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.2016 - tp: 18246.0000 - fp: 1199.0000 - tn: 19365.0000 - fn: 2150.0000 - accuracy: 0.9182 - precision: 0.9383 - recall: 0.8946 - auc: 0.9726 - prc: 0.9775 - val_loss: 0.1302 - val_tp: 58.0000 - val_fp: 945.0000 - val_tn: 44562.0000 - val_fn: 4.0000 - val_accuracy: 0.9792 - val_precision: 0.0578 - val_recall: 0.9355 - val_auc: 0.9891 - val_prc: 0.7642
Epoch 27/1000
20/20 [==============================] - 1s 28ms/step - loss: 0.2001 - tp: 18151.0000 - fp: 1243.0000 - tn: 19386.0000 - fn: 2180.0000 - accuracy: 0.9164 - precision: 0.9359 - recall: 0.8928 - auc: 0.9729 - prc: 0.9776 - val_loss: 0.1263 - val_tp: 58.0000 - val_fp: 918.0000 - val_tn: 44589.0000 - val_fn: 4.0000 - val_accuracy: 0.9798 - val_precision: 0.0594 - val_recall: 0.9355 - val_auc: 0.9895 - val_prc: 0.7667
Epoch 28/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.1938 - tp: 18435.0000 - fp: 1173.0000 - tn: 19292.0000 - fn: 2060.0000 - accuracy: 0.9211 - precision: 0.9402 - recall: 0.8995 - auc: 0.9749 - prc: 0.9792 - val_loss: 0.1226 - val_tp: 58.0000 - val_fp: 895.0000 - val_tn: 44612.0000 - val_fn: 4.0000 - val_accuracy: 0.9803 - val_precision: 0.0609 - val_recall: 0.9355 - val_auc: 0.9900 - val_prc: 0.7676
Epoch 29/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.1892 - tp: 18504.0000 - fp: 1097.0000 - tn: 19246.0000 - fn: 2113.0000 - accuracy: 0.9216 - precision: 0.9440 - recall: 0.8975 - auc: 0.9759 - prc: 0.9802 - val_loss: 0.1201 - val_tp: 58.0000 - val_fp: 895.0000 - val_tn: 44612.0000 - val_fn: 4.0000 - val_accuracy: 0.9803 - val_precision: 0.0609 - val_recall: 0.9355 - val_auc: 0.9907 - val_prc: 0.7698
Epoch 30/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.1889 - tp: 18295.0000 - fp: 1096.0000 - tn: 19498.0000 - fn: 2071.0000 - accuracy: 0.9227 - precision: 0.9435 - recall: 0.8983 - auc: 0.9759 - prc: 0.9801 - val_loss: 0.1171 - val_tp: 58.0000 - val_fp: 873.0000 - val_tn: 44634.0000 - val_fn: 4.0000 - val_accuracy: 0.9808 - val_precision: 0.0623 - val_recall: 0.9355 - val_auc: 0.9910 - val_prc: 0.7701
Epoch 31/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.1884 - tp: 18284.0000 - fp: 1097.0000 - tn: 19550.0000 - fn: 2029.0000 - accuracy: 0.9237 - precision: 0.9434 - recall: 0.9001 - auc: 0.9766 - prc: 0.9801 - val_loss: 0.1139 - val_tp: 59.0000 - val_fp: 854.0000 - val_tn: 44653.0000 - val_fn: 3.0000 - val_accuracy: 0.9812 - val_precision: 0.0646 - val_recall: 0.9516 - val_auc: 0.9913 - val_prc: 0.7708
Epoch 32/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.1812 - tp: 18544.0000 - fp: 1009.0000 - tn: 19378.0000 - fn: 2029.0000 - accuracy: 0.9258 - precision: 0.9484 - recall: 0.9014 - auc: 0.9781 - prc: 0.9816 - val_loss: 0.1111 - val_tp: 59.0000 - val_fp: 848.0000 - val_tn: 44659.0000 - val_fn: 3.0000 - val_accuracy: 0.9813 - val_precision: 0.0650 - val_recall: 0.9516 - val_auc: 0.9916 - val_prc: 0.7722
Epoch 33/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.1796 - tp: 18420.0000 - fp: 963.0000 - tn: 19517.0000 - fn: 2060.0000 - accuracy: 0.9262 - precision: 0.9503 - recall: 0.8994 - auc: 0.9787 - prc: 0.9820 - val_loss: 0.1086 - val_tp: 59.0000 - val_fp: 835.0000 - val_tn: 44672.0000 - val_fn: 3.0000 - val_accuracy: 0.9816 - val_precision: 0.0660 - val_recall: 0.9516 - val_auc: 0.9918 - val_prc: 0.7717
Epoch 34/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.1767 - tp: 18639.0000 - fp: 1016.0000 - tn: 19365.0000 - fn: 1940.0000 - accuracy: 0.9278 - precision: 0.9483 - recall: 0.9057 - auc: 0.9795 - prc: 0.9828 - val_loss: 0.1062 - val_tp: 59.0000 - val_fp: 829.0000 - val_tn: 44678.0000 - val_fn: 3.0000 - val_accuracy: 0.9817 - val_precision: 0.0664 - val_recall: 0.9516 - val_auc: 0.9920 - val_prc: 0.7734
Epoch 35/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.1729 - tp: 18423.0000 - fp: 922.0000 - tn: 19659.0000 - fn: 1956.0000 - accuracy: 0.9297 - precision: 0.9523 - recall: 0.9040 - auc: 0.9807 - prc: 0.9835 - val_loss: 0.1039 - val_tp: 59.0000 - val_fp: 814.0000 - val_tn: 44693.0000 - val_fn: 3.0000 - val_accuracy: 0.9821 - val_precision: 0.0676 - val_recall: 0.9516 - val_auc: 0.9921 - val_prc: 0.7740
Epoch 36/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.1725 - tp: 18603.0000 - fp: 952.0000 - tn: 19452.0000 - fn: 1953.0000 - accuracy: 0.9291 - precision: 0.9513 - recall: 0.9050 - auc: 0.9805 - prc: 0.9835 - val_loss: 0.1018 - val_tp: 59.0000 - val_fp: 800.0000 - val_tn: 44707.0000 - val_fn: 3.0000 - val_accuracy: 0.9824 - val_precision: 0.0687 - val_recall: 0.9516 - val_auc: 0.9923 - val_prc: 0.7738
Epoch 37/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.1697 - tp: 18657.0000 - fp: 929.0000 - tn: 19463.0000 - fn: 1911.0000 - accuracy: 0.9307 - precision: 0.9526 - recall: 0.9071 - auc: 0.9813 - prc: 0.9842 - val_loss: 0.0995 - val_tp: 59.0000 - val_fp: 786.0000 - val_tn: 44721.0000 - val_fn: 3.0000 - val_accuracy: 0.9827 - val_precision: 0.0698 - val_recall: 0.9516 - val_auc: 0.9924 - val_prc: 0.7766
Epoch 38/1000
20/20 [==============================] - 1s 28ms/step - loss: 0.1677 - tp: 18545.0000 - fp: 862.0000 - tn: 19667.0000 - fn: 1886.0000 - accuracy: 0.9329 - precision: 0.9556 - recall: 0.9077 - auc: 0.9819 - prc: 0.9843 - val_loss: 0.0973 - val_tp: 59.0000 - val_fp: 770.0000 - val_tn: 44737.0000 - val_fn: 3.0000 - val_accuracy: 0.9830 - val_precision: 0.0712 - val_recall: 0.9516 - val_auc: 0.9927 - val_prc: 0.7777
Epoch 39/1000
20/20 [==============================] - 1s 30ms/step - loss: 0.1633 - tp: 18532.0000 - fp: 840.0000 - tn: 19693.0000 - fn: 1895.0000 - accuracy: 0.9332 - precision: 0.9566 - recall: 0.9072 - auc: 0.9829 - prc: 0.9850 - val_loss: 0.0954 - val_tp: 59.0000 - val_fp: 755.0000 - val_tn: 44752.0000 - val_fn: 3.0000 - val_accuracy: 0.9834 - val_precision: 0.0725 - val_recall: 0.9516 - val_auc: 0.9928 - val_prc: 0.7788
Epoch 40/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.1657 - tp: 18582.0000 - fp: 831.0000 - tn: 19612.0000 - fn: 1935.0000 - accuracy: 0.9325 - precision: 0.9572 - recall: 0.9057 - auc: 0.9822 - prc: 0.9848 - val_loss: 0.0937 - val_tp: 59.0000 - val_fp: 740.0000 - val_tn: 44767.0000 - val_fn: 3.0000 - val_accuracy: 0.9837 - val_precision: 0.0738 - val_recall: 0.9516 - val_auc: 0.9931 - val_prc: 0.7786
Epoch 41/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.1614 - tp: 18720.0000 - fp: 794.0000 - tn: 19558.0000 - fn: 1888.0000 - accuracy: 0.9345 - precision: 0.9593 - recall: 0.9084 - auc: 0.9833 - prc: 0.9858 - val_loss: 0.0920 - val_tp: 59.0000 - val_fp: 727.0000 - val_tn: 44780.0000 - val_fn: 3.0000 - val_accuracy: 0.9840 - val_precision: 0.0751 - val_recall: 0.9516 - val_auc: 0.9932 - val_prc: 0.7799
Epoch 42/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.1566 - tp: 18704.0000 - fp: 738.0000 - tn: 19658.0000 - fn: 1860.0000 - accuracy: 0.9366 - precision: 0.9620 - recall: 0.9096 - auc: 0.9843 - prc: 0.9866 - val_loss: 0.0904 - val_tp: 59.0000 - val_fp: 728.0000 - val_tn: 44779.0000 - val_fn: 3.0000 - val_accuracy: 0.9840 - val_precision: 0.0750 - val_recall: 0.9516 - val_auc: 0.9933 - val_prc: 0.7818
Epoch 43/1000
20/20 [==============================] - 1s 31ms/step - loss: 0.1585 - tp: 18702.0000 - fp: 795.0000 - tn: 19639.0000 - fn: 1824.0000 - accuracy: 0.9361 - precision: 0.9592 - recall: 0.9111 - auc: 0.9843 - prc: 0.9862 - val_loss: 0.0884 - val_tp: 59.0000 - val_fp: 711.0000 - val_tn: 44796.0000 - val_fn: 3.0000 - val_accuracy: 0.9843 - val_precision: 0.0766 - val_recall: 0.9516 - val_auc: 0.9935 - val_prc: 0.7820
Epoch 44/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.1551 - tp: 18612.0000 - fp: 755.0000 - tn: 19741.0000 - fn: 1852.0000 - accuracy: 0.9364 - precision: 0.9610 - recall: 0.9095 - auc: 0.9846 - prc: 0.9866 - val_loss: 0.0866 - val_tp: 59.0000 - val_fp: 693.0000 - val_tn: 44814.0000 - val_fn: 3.0000 - val_accuracy: 0.9847 - val_precision: 0.0785 - val_recall: 0.9516 - val_auc: 0.9936 - val_prc: 0.7841
Epoch 45/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.1522 - tp: 18444.0000 - fp: 767.0000 - tn: 19966.0000 - fn: 1783.0000 - accuracy: 0.9377 - precision: 0.9601 - recall: 0.9119 - auc: 0.9852 - prc: 0.9870 - val_loss: 0.0851 - val_tp: 59.0000 - val_fp: 685.0000 - val_tn: 44822.0000 - val_fn: 3.0000 - val_accuracy: 0.9849 - val_precision: 0.0793 - val_recall: 0.9516 - val_auc: 0.9938 - val_prc: 0.7884
Epoch 46/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.1528 - tp: 18971.0000 - fp: 713.0000 - tn: 19452.0000 - fn: 1824.0000 - accuracy: 0.9381 - precision: 0.9638 - recall: 0.9123 - auc: 0.9853 - prc: 0.9873 - val_loss: 0.0841 - val_tp: 59.0000 - val_fp: 678.0000 - val_tn: 44829.0000 - val_fn: 3.0000 - val_accuracy: 0.9851 - val_precision: 0.0801 - val_recall: 0.9516 - val_auc: 0.9939 - val_prc: 0.7885
Epoch 47/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.1519 - tp: 18691.0000 - fp: 740.0000 - tn: 19735.0000 - fn: 1794.0000 - accuracy: 0.9381 - precision: 0.9619 - recall: 0.9124 - auc: 0.9853 - prc: 0.9872 - val_loss: 0.0826 - val_tp: 59.0000 - val_fp: 665.0000 - val_tn: 44842.0000 - val_fn: 3.0000 - val_accuracy: 0.9853 - val_precision: 0.0815 - val_recall: 0.9516 - val_auc: 0.9940 - val_prc: 0.7889
Epoch 48/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.1481 - tp: 18755.0000 - fp: 686.0000 - tn: 19767.0000 - fn: 1752.0000 - accuracy: 0.9405 - precision: 0.9647 - recall: 0.9146 - auc: 0.9861 - prc: 0.9880 - val_loss: 0.0819 - val_tp: 59.0000 - val_fp: 665.0000 - val_tn: 44842.0000 - val_fn: 3.0000 - val_accuracy: 0.9853 - val_precision: 0.0815 - val_recall: 0.9516 - val_auc: 0.9941 - val_prc: 0.7889
Epoch 49/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.1451 - tp: 18674.0000 - fp: 644.0000 - tn: 19870.0000 - fn: 1772.0000 - accuracy: 0.9410 - precision: 0.9667 - recall: 0.9133 - auc: 0.9867 - prc: 0.9884 - val_loss: 0.0814 - val_tp: 59.0000 - val_fp: 673.0000 - val_tn: 44834.0000 - val_fn: 3.0000 - val_accuracy: 0.9852 - val_precision: 0.0806 - val_recall: 0.9516 - val_auc: 0.9942 - val_prc: 0.7926
Epoch 50/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.1445 - tp: 18860.0000 - fp: 658.0000 - tn: 19720.0000 - fn: 1722.0000 - accuracy: 0.9419 - precision: 0.9663 - recall: 0.9163 - auc: 0.9869 - prc: 0.9886 - val_loss: 0.0807 - val_tp: 59.0000 - val_fp: 673.0000 - val_tn: 44834.0000 - val_fn: 3.0000 - val_accuracy: 0.9852 - val_precision: 0.0806 - val_recall: 0.9516 - val_auc: 0.9943 - val_prc: 0.7949
Epoch 51/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.1439 - tp: 18864.0000 - fp: 694.0000 - tn: 19708.0000 - fn: 1694.0000 - accuracy: 0.9417 - precision: 0.9645 - recall: 0.9176 - auc: 0.9870 - prc: 0.9885 - val_loss: 0.0798 - val_tp: 59.0000 - val_fp: 675.0000 - val_tn: 44832.0000 - val_fn: 3.0000 - val_accuracy: 0.9851 - val_precision: 0.0804 - val_recall: 0.9516 - val_auc: 0.9944 - val_prc: 0.7949
Epoch 52/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.1446 - tp: 18903.0000 - fp: 693.0000 - tn: 19697.0000 - fn: 1667.0000 - accuracy: 0.9424 - precision: 0.9646 - recall: 0.9190 - auc: 0.9870 - prc: 0.9885 - val_loss: 0.0782 - val_tp: 59.0000 - val_fp: 662.0000 - val_tn: 44845.0000 - val_fn: 3.0000 - val_accuracy: 0.9854 - val_precision: 0.0818 - val_recall: 0.9516 - val_auc: 0.9945 - val_prc: 0.7948
Epoch 53/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.1429 - tp: 18650.0000 - fp: 653.0000 - tn: 20004.0000 - fn: 1653.0000 - accuracy: 0.9437 - precision: 0.9662 - recall: 0.9186 - auc: 0.9872 - prc: 0.9887 - val_loss: 0.0771 - val_tp: 59.0000 - val_fp: 654.0000 - val_tn: 44853.0000 - val_fn: 3.0000 - val_accuracy: 0.9856 - val_precision: 0.0827 - val_recall: 0.9516 - val_auc: 0.9946 - val_prc: 0.7950
Epoch 54/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.1421 - tp: 18776.0000 - fp: 642.0000 - tn: 19888.0000 - fn: 1654.0000 - accuracy: 0.9439 - precision: 0.9669 - recall: 0.9190 - auc: 0.9875 - prc: 0.9889 - val_loss: 0.0763 - val_tp: 59.0000 - val_fp: 651.0000 - val_tn: 44856.0000 - val_fn: 3.0000 - val_accuracy: 0.9856 - val_precision: 0.0831 - val_recall: 0.9516 - val_auc: 0.9947 - val_prc: 0.7955
Epoch 55/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.1436 - tp: 18656.0000 - fp: 701.0000 - tn: 19886.0000 - fn: 1717.0000 - accuracy: 0.9410 - precision: 0.9638 - recall: 0.9157 - auc: 0.9873 - prc: 0.9886 - val_loss: 0.0751 - val_tp: 59.0000 - val_fp: 642.0000 - val_tn: 44865.0000 - val_fn: 3.0000 - val_accuracy: 0.9858 - val_precision: 0.0842 - val_recall: 0.9516 - val_auc: 0.9946 - val_prc: 0.7822
Epoch 56/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.1413 - tp: 18756.0000 - fp: 668.0000 - tn: 19832.0000 - fn: 1704.0000 - accuracy: 0.9421 - precision: 0.9656 - recall: 0.9167 - auc: 0.9874 - prc: 0.9888 - val_loss: 0.0737 - val_tp: 59.0000 - val_fp: 629.0000 - val_tn: 44878.0000 - val_fn: 3.0000 - val_accuracy: 0.9861 - val_precision: 0.0858 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7823
Epoch 57/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.1382 - tp: 18692.0000 - fp: 609.0000 - tn: 19938.0000 - fn: 1721.0000 - accuracy: 0.9431 - precision: 0.9684 - recall: 0.9157 - auc: 0.9881 - prc: 0.9893 - val_loss: 0.0729 - val_tp: 59.0000 - val_fp: 627.0000 - val_tn: 44880.0000 - val_fn: 3.0000 - val_accuracy: 0.9862 - val_precision: 0.0860 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7824
Epoch 58/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.1342 - tp: 18936.0000 - fp: 594.0000 - tn: 19790.0000 - fn: 1640.0000 - accuracy: 0.9455 - precision: 0.9696 - recall: 0.9203 - auc: 0.9888 - prc: 0.9901 - val_loss: 0.0725 - val_tp: 59.0000 - val_fp: 634.0000 - val_tn: 44873.0000 - val_fn: 3.0000 - val_accuracy: 0.9860 - val_precision: 0.0851 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7826
Epoch 59/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.1389 - tp: 18737.0000 - fp: 639.0000 - tn: 19946.0000 - fn: 1638.0000 - accuracy: 0.9444 - precision: 0.9670 - recall: 0.9196 - auc: 0.9883 - prc: 0.9892 - val_loss: 0.0717 - val_tp: 59.0000 - val_fp: 623.0000 - val_tn: 44884.0000 - val_fn: 3.0000 - val_accuracy: 0.9863 - val_precision: 0.0865 - val_recall: 0.9516 - val_auc: 0.9947 - val_prc: 0.7827
Epoch 60/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.1353 - tp: 18861.0000 - fp: 627.0000 - tn: 19779.0000 - fn: 1693.0000 - accuracy: 0.9434 - precision: 0.9678 - recall: 0.9176 - auc: 0.9888 - prc: 0.9900 - val_loss: 0.0712 - val_tp: 59.0000 - val_fp: 624.0000 - val_tn: 44883.0000 - val_fn: 3.0000 - val_accuracy: 0.9862 - val_precision: 0.0864 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7829
Epoch 61/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.1351 - tp: 18764.0000 - fp: 606.0000 - tn: 19963.0000 - fn: 1627.0000 - accuracy: 0.9455 - precision: 0.9687 - recall: 0.9202 - auc: 0.9888 - prc: 0.9898 - val_loss: 0.0704 - val_tp: 59.0000 - val_fp: 619.0000 - val_tn: 44888.0000 - val_fn: 3.0000 - val_accuracy: 0.9864 - val_precision: 0.0870 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7830
Epoch 62/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.1315 - tp: 18808.0000 - fp: 571.0000 - tn: 19949.0000 - fn: 1632.0000 - accuracy: 0.9462 - precision: 0.9705 - recall: 0.9202 - auc: 0.9893 - prc: 0.9904 - val_loss: 0.0701 - val_tp: 59.0000 - val_fp: 622.0000 - val_tn: 44885.0000 - val_fn: 3.0000 - val_accuracy: 0.9863 - val_precision: 0.0866 - val_recall: 0.9516 - val_auc: 0.9947 - val_prc: 0.7834
Epoch 63/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.1336 - tp: 18950.0000 - fp: 609.0000 - tn: 19753.0000 - fn: 1648.0000 - accuracy: 0.9449 - precision: 0.9689 - recall: 0.9200 - auc: 0.9890 - prc: 0.9902 - val_loss: 0.0695 - val_tp: 59.0000 - val_fp: 622.0000 - val_tn: 44885.0000 - val_fn: 3.0000 - val_accuracy: 0.9863 - val_precision: 0.0866 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7706
Epoch 64/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.1351 - tp: 18989.0000 - fp: 641.0000 - tn: 19633.0000 - fn: 1697.0000 - accuracy: 0.9429 - precision: 0.9673 - recall: 0.9180 - auc: 0.9889 - prc: 0.9901 - val_loss: 0.0689 - val_tp: 59.0000 - val_fp: 621.0000 - val_tn: 44886.0000 - val_fn: 3.0000 - val_accuracy: 0.9863 - val_precision: 0.0868 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7837
Epoch 65/1000
20/20 [==============================] - 0s 25ms/step - loss: 0.1328 - tp: 18724.0000 - fp: 620.0000 - tn: 19977.0000 - fn: 1639.0000 - accuracy: 0.9448 - precision: 0.9679 - recall: 0.9195 - auc: 0.9891 - prc: 0.9902 - val_loss: 0.0680 - val_tp: 59.0000 - val_fp: 610.0000 - val_tn: 44897.0000 - val_fn: 3.0000 - val_accuracy: 0.9865 - val_precision: 0.0882 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7590
Epoch 66/1000
20/20 [==============================] - 1s 28ms/step - loss: 0.1295 - tp: 18851.0000 - fp: 555.0000 - tn: 19941.0000 - fn: 1613.0000 - accuracy: 0.9471 - precision: 0.9714 - recall: 0.9212 - auc: 0.9896 - prc: 0.9906 - val_loss: 0.0677 - val_tp: 59.0000 - val_fp: 612.0000 - val_tn: 44895.0000 - val_fn: 3.0000 - val_accuracy: 0.9865 - val_precision: 0.0879 - val_recall: 0.9516 - val_auc: 0.9949 - val_prc: 0.7586
Epoch 67/1000
20/20 [==============================] - 1s 28ms/step - loss: 0.1292 - tp: 18807.0000 - fp: 577.0000 - tn: 19997.0000 - fn: 1579.0000 - accuracy: 0.9474 - precision: 0.9702 - recall: 0.9225 - auc: 0.9898 - prc: 0.9906 - val_loss: 0.0671 - val_tp: 59.0000 - val_fp: 609.0000 - val_tn: 44898.0000 - val_fn: 3.0000 - val_accuracy: 0.9866 - val_precision: 0.0883 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7585
Epoch 68/1000
20/20 [==============================] - 0s 25ms/step - loss: 0.1309 - tp: 18911.0000 - fp: 570.0000 - tn: 19852.0000 - fn: 1627.0000 - accuracy: 0.9464 - precision: 0.9707 - recall: 0.9208 - auc: 0.9894 - prc: 0.9904 - val_loss: 0.0668 - val_tp: 59.0000 - val_fp: 614.0000 - val_tn: 44893.0000 - val_fn: 3.0000 - val_accuracy: 0.9865 - val_precision: 0.0877 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7589
Epoch 69/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.1315 - tp: 18832.0000 - fp: 584.0000 - tn: 19908.0000 - fn: 1636.0000 - accuracy: 0.9458 - precision: 0.9699 - recall: 0.9201 - auc: 0.9894 - prc: 0.9903 - val_loss: 0.0660 - val_tp: 59.0000 - val_fp: 598.0000 - val_tn: 44909.0000 - val_fn: 3.0000 - val_accuracy: 0.9868 - val_precision: 0.0898 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7475
Epoch 70/1000
20/20 [==============================] - 1s 26ms/step - loss: 0.1303 - tp: 19009.0000 - fp: 595.0000 - tn: 19760.0000 - fn: 1596.0000 - accuracy: 0.9465 - precision: 0.9696 - recall: 0.9225 - auc: 0.9897 - prc: 0.9907 - val_loss: 0.0655 - val_tp: 59.0000 - val_fp: 599.0000 - val_tn: 44908.0000 - val_fn: 3.0000 - val_accuracy: 0.9868 - val_precision: 0.0897 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7476
Epoch 71/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.1270 - tp: 19006.0000 - fp: 547.0000 - tn: 19849.0000 - fn: 1558.0000 - accuracy: 0.9486 - precision: 0.9720 - recall: 0.9242 - auc: 0.9903 - prc: 0.9911 - val_loss: 0.0653 - val_tp: 59.0000 - val_fp: 602.0000 - val_tn: 44905.0000 - val_fn: 3.0000 - val_accuracy: 0.9867 - val_precision: 0.0893 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7479
Epoch 72/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.1259 - tp: 18876.0000 - fp: 554.0000 - tn: 19944.0000 - fn: 1586.0000 - accuracy: 0.9478 - precision: 0.9715 - recall: 0.9225 - auc: 0.9904 - prc: 0.9909 - val_loss: 0.0649 - val_tp: 59.0000 - val_fp: 598.0000 - val_tn: 44909.0000 - val_fn: 3.0000 - val_accuracy: 0.9868 - val_precision: 0.0898 - val_recall: 0.9516 - val_auc: 0.9949 - val_prc: 0.7481
Epoch 73/1000
20/20 [==============================] - 0s 26ms/step - loss: 0.1295 - tp: 18926.0000 - fp: 562.0000 - tn: 19932.0000 - fn: 1540.0000 - accuracy: 0.9487 - precision: 0.9712 - recall: 0.9248 - auc: 0.9900 - prc: 0.9906 - val_loss: 0.0641 - val_tp: 59.0000 - val_fp: 595.0000 - val_tn: 44912.0000 - val_fn: 3.0000 - val_accuracy: 0.9869 - val_precision: 0.0902 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7481
Epoch 74/1000
20/20 [==============================] - 0s 25ms/step - loss: 0.1269 - tp: 18914.0000 - fp: 546.0000 - tn: 19893.0000 - fn: 1607.0000 - accuracy: 0.9474 - precision: 0.9719 - recall: 0.9217 - auc: 0.9904 - prc: 0.9911 - val_loss: 0.0636 - val_tp: 59.0000 - val_fp: 594.0000 - val_tn: 44913.0000 - val_fn: 3.0000 - val_accuracy: 0.9869 - val_precision: 0.0904 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7484
Epoch 75/1000
20/20 [==============================] - 1s 27ms/step - loss: 0.1243 - tp: 19086.0000 - fp: 538.0000 - tn: 19765.0000 - fn: 1571.0000 - accuracy: 0.9485 - precision: 0.9726 - recall: 0.9239 - auc: 0.9908 - prc: 0.9917 - val_loss: 0.0636 - val_tp: 59.0000 - val_fp: 596.0000 - val_tn: 44911.0000 - val_fn: 3.0000 - val_accuracy: 0.9869 - val_precision: 0.0901 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7489
Epoch 76/1000
20/20 [==============================] - ETA: 0s - loss: 0.1255 - tp: 18993.0000 - fp: 570.0000 - tn: 19854.0000 - fn: 1543.0000 - accuracy: 0.9484 - precision: 0.9709 - recall: 0.9249 - auc: 0.9906 - prc: 0.9911Restoring model weights from the end of the best epoch: 66.
20/20 [==============================] - 1s 26ms/step - loss: 0.1255 - tp: 18993.0000 - fp: 570.0000 - tn: 19854.0000 - fn: 1543.0000 - accuracy: 0.9484 - precision: 0.9709 - recall: 0.9249 - auc: 0.9906 - prc: 0.9911 - val_loss: 0.0628 - val_tp: 59.0000 - val_fp: 591.0000 - val_tn: 44916.0000 - val_fn: 3.0000 - val_accuracy: 0.9870 - val_precision: 0.0908 - val_recall: 0.9516 - val_auc: 0.9948 - val_prc: 0.7490
Epoch 76: early stopping
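The run above halts at epoch 76 yet restores the weights from epoch 66: with `restore_best_weights=True`, `tf.keras.callbacks.EarlyStopping` remembers the epoch with the best monitored value (here `val_prc`, `mode='max'`) and stops once `patience` epochs pass without improvement. A pure-Python sketch of that bookkeeping (the `early_stop_best` helper is hypothetical, for illustration only; Keras handles this internally):

```python
def early_stop_best(values, patience):
    """Simulate EarlyStopping(mode='max', restore_best_weights=True).

    values: the monitored validation metric, one entry per epoch.
    Returns (best_epoch_index, stop_epoch_index), both 0-based.
    """
    best_i, best_v = 0, values[0]
    wait = 0  # epochs since the last improvement
    for i, v in enumerate(values[1:], start=1):
        if v > best_v:
            best_v, best_i, wait = v, i, 0
        else:
            wait += 1
            if wait >= patience:  # no improvement for `patience` epochs: stop
                return best_i, i
    return best_i, len(values) - 1  # ran out of epochs without triggering

# The metric peaks at index 1; training stops 3 stagnant epochs later.
print(early_stop_best([0.70, 0.76, 0.75, 0.75, 0.74], patience=3))  # → (1, 4)
```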

Re-check training history

plot_metrics(resampled_history)

[Figure: training and validation metric curves for the resampled model]

Evaluate metrics

train_predictions_resampled = resampled_model.predict(train_features, batch_size=BATCH_SIZE)
test_predictions_resampled = resampled_model.predict(test_features, batch_size=BATCH_SIZE)
90/90 [==============================] - 0s 1ms/step
28/28 [==============================] - 0s 1ms/step
resampled_results = resampled_model.evaluate(test_features, test_labels,
                                             batch_size=BATCH_SIZE, verbose=0)
for name, value in zip(resampled_model.metrics_names, resampled_results):
  print(name, ': ', value)
print()

plot_cm(test_labels, test_predictions_resampled)
loss :  0.07047542929649353
tp :  94.0
fp :  748.0
tn :  56107.0
fn :  13.0
accuracy :  0.986640214920044
precision :  0.11163895577192307
recall :  0.8785046935081482
auc :  0.9670301079750061
prc :  0.6843827962875366

Legitimate Transactions Detected (True Negatives):  56107
Legitimate Transactions Incorrectly Detected (False Positives):  748
Fraudulent Transactions Missed (False Negatives):  13
Fraudulent Transactions Detected (True Positives):  94
Total Fraudulent Transactions:  107

[Figure: confusion matrix heatmap at threshold 0.50]
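The counts printed above come from thresholding the predicted probabilities at 0.5 and tallying the confusion matrix, which is what the tutorial's `plot_cm` helper does before drawing the heatmap. A minimal sketch of that tally (the `cm_counts` name is ours, not the tutorial's):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def cm_counts(labels, predictions, p=0.5):
    """Threshold probabilities at p and return (tn, fp, fn, tp)."""
    cm = confusion_matrix(labels, np.asarray(predictions) > p)
    tn, fp, fn, tp = cm.ravel()  # 2x2 matrix in [[tn, fp], [fn, tp]] order
    print('Legitimate Transactions Detected (True Negatives): ', tn)
    print('Legitimate Transactions Incorrectly Detected (False Positives): ', fp)
    print('Fraudulent Transactions Missed (False Negatives): ', fn)
    print('Fraudulent Transactions Detected (True Positives): ', tp)
    print('Total Fraudulent Transactions: ', fn + tp)
    return tn, fp, fn, tp
```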

Plot the ROC

plot_roc("Train Baseline", train_labels, train_predictions_baseline, color=colors[0])
plot_roc("Test Baseline", test_labels, test_predictions_baseline, color=colors[0], linestyle='--')

plot_roc("Train Weighted", train_labels, train_predictions_weighted, color=colors[1])
plot_roc("Test Weighted", test_labels, test_predictions_weighted, color=colors[1], linestyle='--')

plot_roc("Train Resampled", train_labels, train_predictions_resampled, color=colors[2])
plot_roc("Test Resampled", test_labels, test_predictions_resampled, color=colors[2], linestyle='--')
plt.legend(loc='lower right');

[Figure: ROC curves for the baseline, weighted, and resampled models (train solid, test dashed)]
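`plot_roc` is defined earlier in the notebook; it is a thin wrapper around `sklearn.metrics.roc_curve`, roughly along these lines (a sketch — axis limits and styling may differ from the earlier cell):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

def plot_roc(name, labels, predictions, **kwargs):
    """Plot one ROC curve, zoomed in on the top-left corner."""
    fp, tp, _ = roc_curve(labels, predictions)
    plt.plot(100 * fp, 100 * tp, label=name, linewidth=2, **kwargs)
    plt.xlabel('False positives [%]')
    plt.ylabel('True positives [%]')
    plt.xlim([-0.5, 20])   # the low-false-positive region is what matters here
    plt.ylim([80, 100.5])
    plt.grid(True)
```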

Plot the AUPRC

plot_prc("Train Baseline", train_labels, train_predictions_baseline, color=colors[0])
plot_prc("Test Baseline", test_labels, test_predictions_baseline, color=colors[0], linestyle='--')

plot_prc("Train Weighted", train_labels, train_predictions_weighted, color=colors[1])
plot_prc("Test Weighted", test_labels, test_predictions_weighted, color=colors[1], linestyle='--')

plot_prc("Train Resampled", train_labels, train_predictions_resampled, color=colors[2])
plot_prc("Test Resampled", test_labels, test_predictions_resampled, color=colors[2], linestyle='--')
plt.legend(loc='lower right');

[Figure: precision-recall curves for the baseline, weighted, and resampled models (train solid, test dashed)]
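Likewise, `plot_prc` wraps `sklearn.metrics.precision_recall_curve`; a sketch of the shape it takes (details may differ from the cell that defined it earlier):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve

def plot_prc(name, labels, predictions, **kwargs):
    """Plot one precision-recall curve for a set of predicted probabilities."""
    precision, recall, _ = precision_recall_curve(labels, predictions)
    plt.plot(recall, precision, label=name, linewidth=2, **kwargs)
    plt.xlabel('Recall')
    plt.ylabel('Precision')
    plt.grid(True)
```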

Applying this tutorial to your problem

Classification on imbalanced data is an inherently difficult task, because there are so few samples of the minority class to learn from. You should always start with the data: collect as many samples as possible, and give substantial thought to which features may be relevant, so the model can make the most of the minority class. At some point your model may struggle to improve and still not yield the results you want, so keep the context of your problem in mind, along with the trade-offs between the different types of errors.