Migration examples: Canned Estimators

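Canned (or premade) Estimators have traditionally been used in TensorFlow 1 as a quick and easy way to train models for a variety of typical use cases. TensorFlow 2 provides straightforward approximate substitutes for many of them by way of Keras models. For canned Estimators that have no built-in TensorFlow 2 substitute, you can still build your own replacement fairly easily.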

This guide walks through a few examples of direct equivalents and custom substitutions that demonstrate how models derived from TensorFlow 1's tf.estimator can be migrated to TensorFlow 2 with Keras.

Namely, this guide includes examples for migrating:

From tf.estimator's LinearEstimator, Classifier or Regressor in TensorFlow 1 to Keras tf.compat.v1.keras.models.LinearModel in TensorFlow 2
From tf.estimator's DNNEstimator, Classifier or Regressor in TensorFlow 1 to a custom Keras DNN model in TensorFlow 2
From tf.estimator's DNNLinearCombinedEstimator, Classifier or Regressor in TensorFlow 1 to tf.compat.v1.keras.models.WideDeepModel in TensorFlow 2
From tf.estimator's BoostedTreesEstimator, Classifier or Regressor in TensorFlow 1 to tfdf.keras.GradientBoostedTreesModel in TensorFlow 2

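A common precursor to training a model is feature preprocessing, which for TensorFlow 1 Estimator models is done with tf.feature_column. For more information on feature preprocessing in TensorFlow 2, see the guide on migrating from feature columns to the Keras preprocessing layers API.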

Setup

Start with a couple of necessary TensorFlow imports:

pip install tensorflow_decision_forests

import keras
import pandas as pd
import tensorflow as tf
import tensorflow.compat.v1 as tf1
import tensorflow_decision_forests as tfdf


Prepare some simple data for demonstration from the standard Titanic dataset:

x_train = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')
x_eval = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/eval.csv')
# Encode the categorical string columns as integers. Assigning the mapped
# column back avoids chained in-place modification of a column selection.
x_train['sex'] = x_train['sex'].map({'male': 0, 'female': 1})
x_eval['sex'] = x_eval['sex'].map({'male': 0, 'female': 1})

x_train['alone'] = x_train['alone'].map({'n': 0, 'y': 1})
x_eval['alone'] = x_eval['alone'].map({'n': 0, 'y': 1})

x_train['class'] = x_train['class'].map({'First': 1, 'Second': 2, 'Third': 3})
x_eval['class'] = x_eval['class'].map({'First': 1, 'Second': 2, 'Third': 3})

x_train.drop(['embark_town', 'deck'], axis=1, inplace=True)
x_eval.drop(['embark_town', 'deck'], axis=1, inplace=True)

y_train = x_train.pop('survived')
y_eval = x_eval.pop('survived')

# Data setup for TensorFlow 1 with `tf.estimator`
def _input_fn():
  return tf1.data.Dataset.from_tensor_slices((dict(x_train), y_train)).batch(32)


def _eval_input_fn():
  return tf1.data.Dataset.from_tensor_slices((dict(x_eval), y_eval)).batch(32)


FEATURE_NAMES = [
    'age', 'fare', 'sex', 'n_siblings_spouses', 'parch', 'class', 'alone'
]

feature_columns = []
for fn in FEATURE_NAMES:
  feat_col = tf1.feature_column.numeric_column(fn, dtype=tf.float32)
  feature_columns.append(feat_col)
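
The string-to-integer conversions above can be illustrated without pandas. The following is a minimal sketch in plain Python; the two rows are hypothetical and only mimic the shape of the Titanic columns, and the helper name encode_row is introduced here for illustration.

```python
# Hypothetical rows shaped like the categorical Titanic columns used in this guide.
rows = [
    {"sex": "male", "class": "Third", "alone": "n"},
    {"sex": "female", "class": "First", "alone": "y"},
]

# The same string-to-integer encodings that the data preparation applies.
ENCODINGS = {
    "sex": {"male": 0, "female": 1},
    "alone": {"n": 0, "y": 1},
    "class": {"First": 1, "Second": 2, "Third": 3},
}


def encode_row(row):
    """Return a copy of `row` with each categorical string mapped to an integer."""
    return {name: ENCODINGS[name][value] for name, value in row.items()}


encoded = [encode_row(r) for r in rows]
print(encoded)
```

After this step every feature is numeric, which is why the guide can describe all seven features with numeric feature columns.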

Create a method that instantiates a simple sample optimizer to use with the various TensorFlow 1 Estimator and TensorFlow 2 Keras models.

def create_sample_optimizer(tf_version):
  if tf_version == 'tf1':
    # Wrapped in a lambda so that the optimizer and its global-step-based
    # schedule are created lazily, when the Estimator builds its graph.
    optimizer = lambda: tf.keras.optimizers.legacy.Ftrl(
        l1_regularization_strength=0.001,
        learning_rate=tf1.train.exponential_decay(
            learning_rate=0.1,
            global_step=tf1.train.get_global_step(),
            decay_steps=10000,
            decay_rate=0.9))
  elif tf_version == 'tf2':
    optimizer = tf.keras.optimizers.legacy.Ftrl(
        l1_regularization_strength=0.001,
        learning_rate=tf.keras.optimizers.schedules.ExponentialDecay(
            initial_learning_rate=0.1, decay_steps=10000, decay_rate=0.9))
  else:
    raise ValueError(f"tf_version must be 'tf1' or 'tf2', got {tf_version!r}")
  return optimizer
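
Both branches configure the same learning-rate schedule. As a rough sketch of what that schedule computes, the plain-Python function below evaluates the non-staircase exponential decay formula, initial_learning_rate * decay_rate ** (step / decay_steps), with the values used in this guide. The function name exponential_decay is local to this sketch and is not a TensorFlow call.

```python
def exponential_decay(step, initial_learning_rate=0.1, decay_steps=10000,
                      decay_rate=0.9):
    """Continuous (non-staircase) exponential decay of the learning rate."""
    return initial_learning_rate * decay_rate ** (step / decay_steps)


# Print the learning rate at a few training steps.
for step in (0, 200, 10000, 20000):
    print(step, exponential_decay(step))
```

With decay_steps=10000, the rate drops by 10% only every 10,000 steps, so over the short training runs in this guide it stays very close to 0.1.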

Example 1: Migrating from LinearEstimator

TensorFlow 1: Using LinearEstimator

In TensorFlow 1, you can use tf.estimator.LinearEstimator to create a baseline linear model for regression and classification problems.

linear_estimator = tf.estimator.LinearEstimator(
    head=tf.estimator.BinaryClassHead(),
    feature_columns=feature_columns,
    optimizer=create_sample_optimizer('tf1'))

INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: /tmpfs/tmp/tmpqfovz752
INFO:tensorflow:Using config: {'_model_dir': '/tmpfs/tmp/tmpqfovz752', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}

linear_estimator.train(input_fn=_input_fn, steps=100)
linear_estimator.evaluate(input_fn=_eval_input_fn, steps=10)

WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/training_util.py:396: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
INFO:tensorflow:Calling model_fn.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/keras/optimizers/optimizer_v2/ftrl.py:173: calling Constant.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmpfs/tmp/tmpqfovz752/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 0.6931472, step = 0
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 20...
INFO:tensorflow:Saving checkpoints for 20 into /tmpfs/tmp/tmpqfovz752/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 20...
INFO:tensorflow:Loss for final step: 0.552688.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2022-12-14T20:20:57
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmpfs/tmp/tmpqfovz752/model.ckpt-20
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [1/10]
INFO:tensorflow:Evaluation [2/10]
INFO:tensorflow:Evaluation [3/10]
INFO:tensorflow:Evaluation [4/10]
INFO:tensorflow:Evaluation [5/10]
INFO:tensorflow:Evaluation [6/10]
INFO:tensorflow:Evaluation [7/10]
INFO:tensorflow:Evaluation [8/10]
INFO:tensorflow:Evaluation [9/10]
INFO:tensorflow:Inference Time : 0.51618s
INFO:tensorflow:Finished evaluation at 2022-12-14-20:20:58
INFO:tensorflow:Saving dict for global step 20: accuracy = 0.70075756, accuracy_baseline = 0.625, auc = 0.75472915, auc_precision_recall = 0.65362054, average_loss = 0.5759378, global_step = 20, label/mean = 0.375, loss = 0.5704811, precision = 0.6388889, prediction/mean = 0.41331065, recall = 0.46464646
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 20: /tmpfs/tmp/tmpqfovz752/model.ckpt-20
{'accuracy': 0.70075756,
 'accuracy_baseline': 0.625,
 'auc': 0.75472915,
 'auc_precision_recall': 0.65362054,
 'average_loss': 0.5759378,
 'label/mean': 0.375,
 'loss': 0.5704811,
 'precision': 0.6388889,
 'prediction/mean': 0.41331065,
 'recall': 0.46464646,
 'global_step': 20}
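
The log above ends at global step 20 even though steps=100 was requested. This is consistent with the input function making a single, non-repeating pass over the data: an Estimator stops training when its input is exhausted. The arithmetic can be checked with a small sketch; the count of 627 training examples is taken from the "Found 627 examples" line in the TensorFlow Decision Forests log later in this guide, and the helper name steps_per_pass is introduced here for illustration.

```python
import math


def steps_per_pass(num_examples, batch_size):
    """Number of batches in one pass over the data; the last batch may be partial."""
    return math.ceil(num_examples / batch_size)


train_steps = steps_per_pass(num_examples=627, batch_size=32)
print(train_steps)
```

To train for the full 100 steps, one option would be to call .repeat() on the dataset returned by the input function.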

TensorFlow 2: Using Keras LinearModel

In TensorFlow 2, you can create an instance of the Keras tf.compat.v1.keras.models.LinearModel, which is the substitute for tf.estimator.LinearEstimator. The tf.compat.v1.keras path signifies that the premade model exists for compatibility.

linear_model = tf.compat.v1.keras.experimental.LinearModel()
linear_model.compile(loss='mse', optimizer=create_sample_optimizer('tf2'), metrics=['accuracy'])
linear_model.fit(x_train, y_train, epochs=10)
linear_model.evaluate(x_eval, y_eval, return_dict=True)

Epoch 1/10
20/20 [==============================] - 0s 2ms/step - loss: 6.1600 - accuracy: 0.6459
Epoch 2/10
20/20 [==============================] - 0s 2ms/step - loss: 0.2290 - accuracy: 0.6411
Epoch 3/10
20/20 [==============================] - 0s 2ms/step - loss: 0.2093 - accuracy: 0.6762
Epoch 4/10
20/20 [==============================] - 0s 2ms/step - loss: 0.2013 - accuracy: 0.6794
Epoch 5/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1920 - accuracy: 0.6858
Epoch 6/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1894 - accuracy: 0.7002
Epoch 7/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1801 - accuracy: 0.7352
Epoch 8/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1732 - accuracy: 0.7544
Epoch 9/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1722 - accuracy: 0.7624
Epoch 10/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1696 - accuracy: 0.7879
9/9 [==============================] - 0s 2ms/step - loss: 0.1866 - accuracy: 0.7538
{'loss': 0.18664006888866425, 'accuracy': 0.7537878751754761}
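
Conceptually, a linear model with a single output computes a weighted sum of its input features plus a bias. The sketch below shows that computation in plain Python. The passenger values and the weights are hypothetical, chosen only for illustration; they are not the values learned in the run above.

```python
def linear_predict(features, weights, bias):
    """Weighted sum of the features plus a bias term."""
    return sum(w * x for w, x in zip(weights, features)) + bias


# Hypothetical passenger, in the order of FEATURE_NAMES:
# age, fare, sex, n_siblings_spouses, parch, class, alone
features = [30.0, 10.0, 1, 0, 0, 3, 1]
# Hypothetical weights and bias (not learned values).
weights = [-0.01, 0.002, 0.5, -0.05, -0.02, -0.1, 0.0]
bias = 0.3

score = linear_predict(features, weights, bias)
print(score)
```

Training adjusts the weights and bias so that this score tracks the survived label as closely as the chosen loss allows.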

Example 2: Migrating from DNNEstimator

TensorFlow 1: Using DNNEstimator

In TensorFlow 1, you can use tf.estimator.DNNEstimator to create a baseline deep neural network (DNN) model for regression and classification problems.

dnn_estimator = tf.estimator.DNNEstimator(
    head=tf.estimator.BinaryClassHead(),
    feature_columns=feature_columns,
    hidden_units=[128],
    activation_fn=tf.nn.relu,
    optimizer=create_sample_optimizer('tf1'))

INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: /tmpfs/tmp/tmpolwmhcb7
INFO:tensorflow:Using config: {'_model_dir': '/tmpfs/tmp/tmpolwmhcb7', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}

dnn_estimator.train(input_fn=_input_fn, steps=100)
dnn_estimator.evaluate(input_fn=_eval_input_fn, steps=10)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
2022-12-14 20:20:59.847616: W tensorflow/core/common_runtime/type_inference.cc:339] Type inference failed. This indicates an invalid graph that escaped type checking. Error message: INVALID_ARGUMENT: expected compatible input types, but input 1:
type_id: TFT_OPTIONAL
args {
  type_id: TFT_PRODUCT
  args {
    type_id: TFT_TENSOR
    args {
      type_id: TFT_INT64
    }
  }
}
 is neither a subtype nor a supertype of the combined inputs preceding it:
type_id: TFT_OPTIONAL
args {
  type_id: TFT_PRODUCT
  args {
    type_id: TFT_TENSOR
    args {
      type_id: TFT_INT32
    }
  }
}

    while inferring type of node 'dnn/zero_fraction/cond/output/_18'
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmpfs/tmp/tmpolwmhcb7/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 1.2786384, step = 0
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 20...
INFO:tensorflow:Saving checkpoints for 20 into /tmpfs/tmp/tmpolwmhcb7/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 20...
INFO:tensorflow:Loss for final step: 0.6017107.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2022-12-14T20:21:01
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmpfs/tmp/tmpolwmhcb7/model.ckpt-20
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [1/10]
INFO:tensorflow:Evaluation [2/10]
INFO:tensorflow:Evaluation [3/10]
INFO:tensorflow:Evaluation [4/10]
INFO:tensorflow:Evaluation [5/10]
INFO:tensorflow:Evaluation [6/10]
INFO:tensorflow:Evaluation [7/10]
INFO:tensorflow:Evaluation [8/10]
INFO:tensorflow:Evaluation [9/10]
INFO:tensorflow:Inference Time : 0.45178s
INFO:tensorflow:Finished evaluation at 2022-12-14-20:21:01
INFO:tensorflow:Saving dict for global step 20: accuracy = 0.70075756, accuracy_baseline = 0.625, auc = 0.69831645, auc_precision_recall = 0.6083623, average_loss = 0.5983687, global_step = 20, label/mean = 0.375, loss = 0.5946772, precision = 0.64705884, prediction/mean = 0.39281103, recall = 0.44444445
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 20: /tmpfs/tmp/tmpolwmhcb7/model.ckpt-20
{'accuracy': 0.70075756,
 'accuracy_baseline': 0.625,
 'auc': 0.69831645,
 'auc_precision_recall': 0.6083623,
 'average_loss': 0.5983687,
 'label/mean': 0.375,
 'loss': 0.5946772,
 'precision': 0.64705884,
 'prediction/mean': 0.39281103,
 'recall': 0.44444445,
 'global_step': 20}

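TensorFlow 2: Using Keras to create a custom DNN model

In TensorFlow 2, you can create a custom DNN model to substitute for one generated by tf.estimator.DNNEstimator, with a similar level of user-specified customization (for instance, as in the previous example, the ability to customize a chosen model optimizer).

A similar workflow can be used to replace tf.estimator.experimental.RNNEstimator with a Keras recurrent neural network (RNN) model. Keras provides a number of built-in, customizable choices by way of tf.keras.layers.RNN, tf.keras.layers.LSTM, and tf.keras.layers.GRU. To learn more, check out the Built-in RNN layers: a simple example section of the RNN with Keras guide.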

dnn_model = tf.keras.models.Sequential(
    [tf.keras.layers.Dense(128, activation='relu'),
     tf.keras.layers.Dense(1)])

dnn_model.compile(loss='mse', optimizer=create_sample_optimizer('tf2'), metrics=['accuracy'])

dnn_model.fit(x_train, y_train, epochs=10)
dnn_model.evaluate(x_eval, y_eval, return_dict=True)

Epoch 1/10
20/20 [==============================] - 0s 2ms/step - loss: 176.7372 - accuracy: 0.6093
Epoch 2/10
20/20 [==============================] - 0s 2ms/step - loss: 0.3162 - accuracy: 0.6619
Epoch 3/10
20/20 [==============================] - 0s 2ms/step - loss: 0.2453 - accuracy: 0.7177
Epoch 4/10
20/20 [==============================] - 0s 2ms/step - loss: 0.2101 - accuracy: 0.7448
Epoch 5/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1874 - accuracy: 0.7448
Epoch 6/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1761 - accuracy: 0.7687
Epoch 7/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1649 - accuracy: 0.7895
Epoch 8/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1675 - accuracy: 0.7687
Epoch 9/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1576 - accuracy: 0.7895
Epoch 10/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1531 - accuracy: 0.8022
9/9 [==============================] - 0s 2ms/step - loss: 0.1707 - accuracy: 0.7311
{'loss': 0.17067377269268036, 'accuracy': 0.7310606241226196}
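
To get a feel for the size of the custom DNN above, the following sketch counts its trainable parameters. It assumes the model receives the seven numeric columns listed in FEATURE_NAMES; the helper name dense_param_count is introduced here for illustration.

```python
def dense_param_count(inputs, units):
    """Weights plus biases of one fully connected layer."""
    return inputs * units + units


hidden = dense_param_count(inputs=7, units=128)   # Dense(128) on 7 features
output = dense_param_count(inputs=128, units=1)   # Dense(1) on 128 hidden units
total = hidden + output
print(hidden, output, total)
```

Under the same assumption, the linear model from Example 1 has only 7 weights and 1 bias, so the DNN is a much more flexible model on the same inputs.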

Example 3: Migrating from DNNLinearCombinedEstimator

TensorFlow 1: Using DNNLinearCombinedEstimator

In TensorFlow 1, you can use tf.estimator.DNNLinearCombinedEstimator to create a baseline combined model for regression and classification problems, with customization capacity for both its linear and DNN components.

optimizer = create_sample_optimizer('tf1')

combined_estimator = tf.estimator.DNNLinearCombinedEstimator(
    head=tf.estimator.BinaryClassHead(),
    # Wide settings
    linear_feature_columns=feature_columns,
    linear_optimizer=optimizer,
    # Deep settings
    dnn_feature_columns=feature_columns,
    dnn_hidden_units=[128],
    dnn_optimizer=optimizer)

INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: /tmpfs/tmp/tmpcmo2jmcd
INFO:tensorflow:Using config: {'_model_dir': '/tmpfs/tmp/tmpcmo2jmcd', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}

combined_estimator.train(input_fn=_input_fn, steps=100)
combined_estimator.evaluate(input_fn=_eval_input_fn, steps=10)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmpfs/tmp/tmpcmo2jmcd/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 1.7963595, step = 0
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 20...
INFO:tensorflow:Saving checkpoints for 20 into /tmpfs/tmp/tmpcmo2jmcd/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 20...
INFO:tensorflow:Loss for final step: 0.55054855.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2022-12-14T20:21:05
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmpfs/tmp/tmpcmo2jmcd/model.ckpt-20
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [1/10]
INFO:tensorflow:Evaluation [2/10]
INFO:tensorflow:Evaluation [3/10]
INFO:tensorflow:Evaluation [4/10]
INFO:tensorflow:Evaluation [5/10]
INFO:tensorflow:Evaluation [6/10]
INFO:tensorflow:Evaluation [7/10]
INFO:tensorflow:Evaluation [8/10]
INFO:tensorflow:Evaluation [9/10]
INFO:tensorflow:Inference Time : 0.54160s
INFO:tensorflow:Finished evaluation at 2022-12-14-20:21:06
INFO:tensorflow:Saving dict for global step 20: accuracy = 0.70075756, accuracy_baseline = 0.625, auc = 0.7516375, auc_precision_recall = 0.6489425, average_loss = 0.5854902, global_step = 20, label/mean = 0.375, loss = 0.57909745, precision = 0.6351351, prediction/mean = 0.41934198, recall = 0.47474748
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 20: /tmpfs/tmp/tmpcmo2jmcd/model.ckpt-20
{'accuracy': 0.70075756,
 'accuracy_baseline': 0.625,
 'auc': 0.7516375,
 'auc_precision_recall': 0.6489425,
 'average_loss': 0.5854902,
 'label/mean': 0.375,
 'loss': 0.57909745,
 'precision': 0.6351351,
 'prediction/mean': 0.41934198,
 'recall': 0.47474748,
 'global_step': 20}

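TensorFlow 2: Using Keras WideDeepModel

In TensorFlow 2, you can create an instance of the Keras tf.compat.v1.keras.models.WideDeepModel to substitute for one generated by tf.estimator.DNNLinearCombinedEstimator, with a similar level of user-specified customization (for instance, as in the previous example, the ability to customize a chosen model optimizer).

This WideDeepModel is built from a constituent LinearModel and a custom DNN model, both of which are discussed in the preceding two examples. If desired, a custom linear model can be used in place of the built-in Keras LinearModel.

If you would like to build your own model instead of using a canned Estimator, check out the Keras Sequential model guide. For more information on custom training and optimizers, check out the Custom training: walkthrough guide.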

# Create LinearModel and DNN Model as in Examples 1 and 2
optimizer = create_sample_optimizer('tf2')

linear_model = tf.compat.v1.keras.experimental.LinearModel()
linear_model.compile(loss='mse', optimizer=optimizer, metrics=['accuracy'])
linear_model.fit(x_train, y_train, epochs=10, verbose=0)

dnn_model = tf.keras.models.Sequential(
    [tf.keras.layers.Dense(128, activation='relu'),
     tf.keras.layers.Dense(1)])
dnn_model.compile(loss='mse', optimizer=optimizer, metrics=['accuracy'])

combined_model = tf.compat.v1.keras.experimental.WideDeepModel(linear_model,
                                                               dnn_model)
combined_model.compile(
    optimizer=[optimizer, optimizer], loss='mse', metrics=['accuracy'])
combined_model.fit([x_train, x_train], y_train, epochs=10)
combined_model.evaluate(x_eval, y_eval, return_dict=True)

Epoch 1/10
20/20 [==============================] - 0s 3ms/step - loss: 426.9457 - accuracy: 0.4817
Epoch 2/10
20/20 [==============================] - 0s 3ms/step - loss: 0.5450 - accuracy: 0.7145
Epoch 3/10
20/20 [==============================] - 0s 3ms/step - loss: 0.2486 - accuracy: 0.7624
Epoch 4/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1728 - accuracy: 0.7879
Epoch 5/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1634 - accuracy: 0.7990
Epoch 6/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1569 - accuracy: 0.8006
Epoch 7/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1621 - accuracy: 0.7927
Epoch 8/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1767 - accuracy: 0.7767
Epoch 9/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1677 - accuracy: 0.7831
Epoch 10/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1522 - accuracy: 0.8006
9/9 [==============================] - 0s 2ms/step - loss: 0.1819 - accuracy: 0.7348
{'loss': 0.18186123669147491, 'accuracy': 0.7348484992980957}
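
The idea behind a wide and deep model can be sketched in a few lines of plain Python: a linear ("wide") component and a DNN ("deep") component each produce an output for the same input, and the two outputs are added. This is a simplified illustration under that assumption, with hypothetical parameters; it is not the WideDeepModel implementation, and it omits any final activation.

```python
def relu(v):
    return max(0.0, v)


def wide_part(x, w, b):
    """Linear ("wide") component: one weighted sum plus a bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def deep_part(x, hidden_w, hidden_b, out_w, out_b):
    """DNN ("deep") component: one ReLU hidden layer, then a linear output."""
    hidden = [relu(sum(wi * xi for wi, xi in zip(row, x)) + bi)
              for row, bi in zip(hidden_w, hidden_b)]
    return sum(wi * hi for wi, hi in zip(out_w, hidden)) + out_b


def wide_deep(x, wide_params, deep_params):
    """Combined prediction: the sum of the two component outputs."""
    return wide_part(x, *wide_params) + deep_part(x, *deep_params)


# Hypothetical two-feature input and hand-picked parameters.
x = [1.0, 2.0]
wide_params = ([0.5, -0.25], 0.1)
deep_params = ([[1.0, 0.0], [-1.0, 1.0]], [0.0, -2.0], [0.5, 2.0], 0.0)

combined = wide_deep(x, wide_params, deep_params)
print(combined)
```

Because the components are summed, each one can keep its own optimizer during training, which is why the guide passes a list of two optimizers to combined_model.compile.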

Example 4: Migrating from BoostedTreesEstimator

TensorFlow 1: Using BoostedTreesEstimator

In TensorFlow 1, you could use tf.estimator.BoostedTreesEstimator to create a baseline gradient boosting model that uses an ensemble of decision trees for regression and classification problems. This functionality is no longer included in TensorFlow 2.

bt_estimator = tf1.estimator.BoostedTreesEstimator(
    head=tf.estimator.BinaryClassHead(),
    n_batches_per_layer=1,
    max_depth=10,
    n_trees=1000,
    feature_columns=feature_columns)

bt_estimator.train(input_fn=_input_fn, steps=1000)
bt_estimator.evaluate(input_fn=_eval_input_fn, steps=100)

TensorFlow 2: Using TensorFlow Decision Forests

In TensorFlow 2, tf.estimator.BoostedTreesEstimator is replaced by tfdf.keras.GradientBoostedTreesModel from the TensorFlow Decision Forests package.

TensorFlow Decision Forests provides various advantages over tf.estimator.BoostedTreesEstimator, notably regarding quality, speed, ease of use, and flexibility. To learn about TensorFlow Decision Forests, start with the beginner colab.

The following example shows how to train a gradient boosted trees model using TensorFlow 2.

Install TensorFlow Decision Forests.

pip install tensorflow_decision_forests

Create a TensorFlow dataset. Note that Decision Forests natively support many types of features and do not require preprocessing.

train_dataframe = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')
eval_dataframe = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/eval.csv')

# Convert the Pandas Dataframes into TensorFlow datasets.
train_dataset = tfdf.keras.pd_dataframe_to_tf_dataset(train_dataframe, label="survived")
eval_dataset = tfdf.keras.pd_dataframe_to_tf_dataset(eval_dataframe, label="survived")

Train the model on the train_dataset dataset.

# Use the default hyper-parameters of the model.
gbt_model = tfdf.keras.GradientBoostedTreesModel()
gbt_model.fit(train_dataset)

Warning: The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
WARNING:absl:The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
Use /tmpfs/tmp/tmp1qf0bvpd as temporary training directory
Reading training dataset...
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/autograph/pyct/static_analysis/liveness.py:83: Analyzer.lamba_check (from tensorflow.python.autograph.pyct.static_analysis.liveness) is deprecated and will be removed after 2023-09-23.
Instructions for updating:
Lambda fuctions will be no more assumed to be used in the statement where they are used, or at least in the same block. https://github.com/tensorflow/tensorflow/issues/56089
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/autograph/pyct/static_analysis/liveness.py:83: Analyzer.lamba_check (from tensorflow.python.autograph.pyct.static_analysis.liveness) is deprecated and will be removed after 2023-09-23.
Instructions for updating:
Lambda fuctions will be no more assumed to be used in the statement where they are used, or at least in the same block. https://github.com/tensorflow/tensorflow/issues/56089
Training dataset read in 0:00:03.104184. Found 627 examples.
Training model...
2022-12-14 20:21:13.279635: W external/ydf/yggdrasil_decision_forests/learner/gradient_boosted_trees/gradient_boosted_trees.cc:1765] Subsample hyperparameter given but sampling method does not match.
2022-12-14 20:21:13.279676: W external/ydf/yggdrasil_decision_forests/learner/gradient_boosted_trees/gradient_boosted_trees.cc:1778] GOSS alpha hyperparameter given but GOSS is disabled.
2022-12-14 20:21:13.279683: W external/ydf/yggdrasil_decision_forests/learner/gradient_boosted_trees/gradient_boosted_trees.cc:1787] GOSS beta hyperparameter given but GOSS is disabled.
2022-12-14 20:21:13.279689: W external/ydf/yggdrasil_decision_forests/learner/gradient_boosted_trees/gradient_boosted_trees.cc:1799] SelGB ratio hyperparameter given but SelGB is disabled.
Model trained in 0:00:00.219057
Compiling model...
[INFO 2022-12-14T20:21:13.489275094+00:00 kernel.cc:1175] Loading model from path /tmpfs/tmp/tmp1qf0bvpd/model/ with prefix d65ec7f5c8094db0
[INFO 2022-12-14T20:21:13.492734623+00:00 abstract_model.cc:1306] Engine "GradientBoostedTreesQuickScorerExtended" built
[INFO 2022-12-14T20:21:13.492762848+00:00 kernel.cc:1021] Use fast generic engine
WARNING:tensorflow:AutoGraph could not transform <function simple_ml_inference_op_with_handle at 0x7f688dbd8d30> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING:tensorflow:AutoGraph could not transform <function simple_ml_inference_op_with_handle at 0x7f688dbd8d30> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING: AutoGraph could not transform <function simple_ml_inference_op_with_handle at 0x7f688dbd8d30> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
Model compiled.
<keras.callbacks.History at 0x7f67883ca790>

Evaluate the quality of the model on the eval_dataset dataset.

gbt_model.compile(metrics=['accuracy'])
gbt_evaluation = gbt_model.evaluate(eval_dataset, return_dict=True)
print(gbt_evaluation)

1/1 [==============================] - 0s 282ms/step - loss: 0.0000e+00 - accuracy: 0.8295
{'loss': 0.0, 'accuracy': 0.8295454382896423}
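
For intuition about what "gradient boosted trees" means, the sketch below implements a toy version of boosting in plain Python: each new depth-1 tree (a "stump") is fit to the residual errors of the ensemble so far, and its shrunken output is added to the prediction. This is a generic illustration with squared-error loss on hypothetical one-dimensional data. It is not the algorithm used by tfdf.keras.GradientBoostedTreesModel, and the names fit_stump and fit_boosted_stumps are introduced here.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error."""
    best = None
    # Dropping the largest value keeps the right-hand side non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_boosted_stumps(xs, ys, num_trees=20, shrinkage=0.5):
    """Fit each new stump to the residuals left by the ensemble so far."""
    base = sum(ys) / len(ys)
    stumps = []

    def predict(x):
        return base + shrinkage * sum(stump(x) for stump in stumps)

    for _ in range(num_trees):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict


# Hypothetical data: the label switches from 0 to 1 between x=3 and x=4.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
model = fit_boosted_stumps(xs, ys)
print([round(model(x), 3) for x in xs])
```

Each round shrinks the remaining error, so the ensemble's predictions move steadily toward the labels as more stumps are added.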

Gradient Boosted Trees is just one of the many decision forest algorithms available in TensorFlow Decision Forests. For example, Random Forests (available as tfdf.keras.RandomForestModel) is very resistant to overfitting, while CART (available as tfdf.keras.CartModel) is well suited to model interpretation.

The next example trains and evaluates a Random Forest model.

# Train a Random Forest model
rf_model = tfdf.keras.RandomForestModel()
rf_model.fit(train_dataset)

# Evaluate the Random Forest model
rf_model.compile(metrics=['accuracy'])
rf_evaluation = rf_model.evaluate(eval_dataset, return_dict=True)
print(rf_evaluation)

Warning: The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
WARNING:absl:The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
Use /tmpfs/tmp/tmpdy5b8a3n as temporary training directory
Reading training dataset...
Training dataset read in 0:00:00.180211. Found 627 examples.
Training model...
Model trained in 0:00:00.189182
Compiling model...
[INFO 2022-12-14T20:21:15.408555748+00:00 kernel.cc:1175] Loading model from path /tmpfs/tmp/tmpdy5b8a3n/model/ with prefix 0ea37c04c3024366
[INFO 2022-12-14T20:21:15.505069923+00:00 kernel.cc:1021] Use fast generic engine
Model compiled.
1/1 [==============================] - 0s 127ms/step - loss: 0.0000e+00 - accuracy: 0.8333
{'loss': 0.0, 'accuracy': 0.8333333134651184}

The final example trains and plots a CART model.

# Train a CART model
cart_model = tfdf.keras.CartModel()
cart_model.fit(train_dataset)

# Plot the CART model
tfdf.model_plotter.plot_model_in_colab(cart_model, max_depth=2)

Warning: The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
WARNING:absl:The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
Use /tmpfs/tmp/tmprdgpsale as temporary training directory
Reading training dataset...
Training dataset read in 0:00:00.172604. Found 627 examples.
Training model...
Model trained in 0:00:00.016789
Compiling model...
2022-12-14 20:21:15.990105: W external/ydf/yggdrasil_decision_forests/model/random_forest/random_forest.cc:607] ValidationEvaluation requires OOB evaluation enabled.Random Forest models should be trained with compute_oob_performances:true. CART models do not support OOB evaluation.
[INFO 2022-12-14T20:21:16.001366191+00:00 kernel.cc:1175] Loading model from path /tmpfs/tmp/tmprdgpsale/model/ with prefix adef0aee997c4b5e
[INFO 2022-12-14T20:21:16.00171176+00:00 kernel.cc:1021] Use fast generic engine
Model compiled.