Migrasikan kode TensorFlow 1 Anda ke TensorFlow 2

Lihat di TensorFlow.org Jalankan di Google Colab Lihat sumber di GitHub Unduh buku catatan

Panduan ini ditujukan untuk pengguna TensorFlow API tingkat rendah. Jika Anda menggunakan API tingkat tinggi ( tf.keras ) mungkin ada sedikit atau tidak ada tindakan yang perlu Anda ambil untuk membuat kode Anda sepenuhnya TensorFlow 2.x kompatibel:

Hal ini masih mungkin untuk menjalankan kode 1.x, dimodifikasi ( kecuali untuk contrib ), di TensorFlow 2.x:

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

Namun, ini tidak memungkinkan Anda memanfaatkan banyak peningkatan yang dibuat di TensorFlow 2.x. Panduan ini akan membantu Anda meningkatkan kode, membuatnya lebih sederhana, lebih berkinerja, dan lebih mudah dirawat.

Skrip konversi otomatis

Langkah pertama, sebelum mencoba untuk mengimplementasikan perubahan yang dijelaskan dalam panduan ini, adalah untuk mencoba menjalankan script upgrade .

Ini akan mengeksekusi pass awal saat memutakhirkan kode Anda ke TensorFlow 2.x tetapi tidak dapat membuat kode Anda menjadi idiomatis ke v2. Kode Anda mungkin masih memanfaatkan tf.compat.v1 endpoint ke placeholder akses, sesi, koleksi, dan fungsi 1.x-gaya lainnya.

Perubahan perilaku tingkat atas

Jika karya kode Anda di TensorFlow 2.x menggunakan tf.compat.v1.disable_v2_behavior , masih ada perubahan perilaku global yang Anda mungkin perlu alamat. Perubahan utama adalah:

  • Eksekusi bersemangat, v1.enable_eager_execution() : Setiap kode yang secara implisit menggunakan tf.Graph akan gagal. Pastikan untuk membungkus kode ini dalam with tf.Graph().as_default() konteks.

  • Variabel sumber daya, v1.enable_resource_variables() : Beberapa kode mungkin tergantung pada perilaku non-deterministik diaktifkan oleh variabel referensi TensorFlow. Variabel sumber daya dikunci saat sedang ditulis, sehingga memberikan jaminan konsistensi yang lebih intuitif.

    • Ini dapat mengubah perilaku dalam kasus tepi.
    • Ini dapat membuat salinan tambahan dan dapat memiliki penggunaan memori yang lebih tinggi.
    • Hal ini dapat dinonaktifkan dengan melewati use_resource=False ke tf.Variable konstruktor.
  • Tensor bentuk, v1.enable_v2_tensorshape() : TensorFlow 2.x menyederhanakan perilaku bentuk tensor. Alih-alih t.shape[0].value Anda dapat mengatakan t.shape[0] . Perubahan ini harus kecil, dan masuk akal untuk segera memperbaikinya. Mengacu pada TensorShape bagian untuk contoh.

  • Aliran kontrol, v1.enable_control_flow_v2() : The TensorFlow 2.x pelaksanaan kontrol aliran telah disederhanakan, sehingga menghasilkan representasi grafik yang berbeda. Silakan bug berkas untuk setiap masalah.

Buat kode untuk TensorFlow 2.x

Panduan ini akan menjelaskan beberapa contoh konversi kode TensorFlow 1.x ke TensorFlow 2.x. Perubahan ini akan memungkinkan kode Anda memanfaatkan pengoptimalan kinerja dan panggilan API yang disederhanakan.

Dalam setiap kasus, polanya adalah:

1. Ganti v1.Session.run panggilan

Setiap v1.Session.run panggilan harus diganti dengan fungsi Python.

  • The feed_dict dan v1.placeholder s menjadi argumen fungsi.
  • The fetches menjadi nilai kembali fungsi ini.
  • Selama konversi eksekusi bersemangat memungkinkan debugging mudah dengan alat Python standar seperti pdb .

Setelah itu, tambahkan tf.function dekorator untuk membuatnya berjalan efisien dalam grafik. Check out Autograph panduan untuk informasi lebih lanjut tentang cara kerja ini.

Perhatikan bahwa:

  • Tidak seperti v1.Session.run , sebuah tf.function memiliki tanda tangan kembali tetap dan selalu mengembalikan semua output. Jika ini menyebabkan masalah kinerja, buat dua fungsi terpisah.

  • Tidak perlu untuk tf.control_dependencies atau operasi serupa: Sebuah tf.function berperilaku seolah-olah itu dijalankan dalam perintah tertulis. tf.Variable tugas dan tf.assert s, misalnya, dijalankan secara otomatis.

The Bagian model mengkonversi berisi contoh kerja proses konversi ini.

2. Gunakan objek Python untuk melacak variabel dan kerugian

Semua pelacakan variabel berbasis nama sangat tidak disarankan di TensorFlow 2.x. Gunakan objek Python untuk melacak variabel.

Gunakan tf.Variable bukan v1.get_variable .

Setiap v1.variable_scope harus dikonversi ke objek Python. Biasanya ini akan menjadi salah satu dari:

Jika Anda perlu daftar agregat dari variabel (seperti tf.Graph.get_collection(tf.GraphKeys.VARIABLES) ), menggunakan .variables dan .trainable_variables atribut dari Layer dan Model objek.

Ini Layer dan Model kelas mengimplementasikan beberapa properti lain yang menghapus kebutuhan untuk koleksi global. Mereka .losses properti bisa menjadi pengganti untuk menggunakan tf.GraphKeys.LOSSES koleksi.

Mengacu pada panduan Keras untuk lebih jelasnya.

3. Tingkatkan loop pelatihan Anda

Gunakan API tingkat tertinggi yang berfungsi untuk kasus penggunaan Anda. Lebih tf.keras.Model.fit lebih membangun loop pelatihan Anda sendiri.

Fungsi tingkat tinggi ini mengelola banyak detail tingkat rendah yang mungkin mudah terlewatkan jika Anda menulis loop pelatihan Anda sendiri. Misalnya, mereka secara otomatis mengumpulkan kerugian regularisasi, dan mengatur training=True argumen saat memanggil model.

4. Tingkatkan saluran input data Anda

Gunakan tf.data dataset untuk input data. Objek ini efisien, ekspresif, dan terintegrasi dengan baik dengan tensorflow.

Mereka dapat dikirimkan secara langsung ke tf.keras.Model.fit metode.

model.fit(dataset, epochs=5)

Mereka dapat diulang melalui Python standar langsung:

for example_batch, label_batch in dataset:
    break

5. Migrasi off compat.v1 simbol

The tf.compat.v1 modul berisi lengkap TensorFlow 1.x API, dengan semantik aslinya.

The TensorFlow 2.x meng-upgrade skrip akan mengkonversi simbol untuk v2 setara mereka jika konversi tersebut aman, yaitu, jika dapat menentukan bahwa perilaku versi TensorFlow 2.x adalah persis sama (misalnya, itu akan mengubah nama v1.arg_max untuk tf.argmax , karena mereka adalah fungsi yang sama).

Setelah script upgrade dilakukan dengan sepotong kode, kemungkinan ada banyak menyebutkan compat.v1 . Sebaiknya menelusuri kode dan mengonversinya secara manual ke yang setara dengan v2 (harus disebutkan dalam log jika ada).

Mengonversi model

Variabel tingkat rendah & eksekusi operator

Contoh penggunaan API tingkat rendah meliputi:

Sebelum mengonversi

Berikut adalah tampilan pola-pola ini dalam kode menggunakan TensorFlow 1.x.

import tensorflow as tf
import tensorflow.compat.v1 as v1

import tensorflow_datasets as tfds
2021-07-19 23:37:03.701382: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
g = v1.Graph()

with g.as_default():
  in_a = v1.placeholder(dtype=v1.float32, shape=(2))
  in_b = v1.placeholder(dtype=v1.float32, shape=(2))

  def forward(x):
    with v1.variable_scope("matmul", reuse=v1.AUTO_REUSE):
      W = v1.get_variable("W", initializer=v1.ones(shape=(2,2)),
                          regularizer=lambda x:tf.reduce_mean(x**2))
      b = v1.get_variable("b", initializer=v1.zeros(shape=(2)))
      return W * x + b

  out_a = forward(in_a)
  out_b = forward(in_b)
  reg_loss=v1.losses.get_regularization_loss(scope="matmul")

with v1.Session(graph=g) as sess:
  sess.run(v1.global_variables_initializer())
  outs = sess.run([out_a, out_b, reg_loss],
                feed_dict={in_a: [1, 0], in_b: [0, 1]})

print(outs[0])
print()
print(outs[1])
print()
print(outs[2])
2021-07-19 23:37:05.720243: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-07-19 23:37:06.406838: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:06.407495: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: NVIDIA Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-19 23:37:06.407533: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-07-19 23:37:06.410971: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-07-19 23:37:06.411090: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-07-19 23:37:06.412239: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-07-19 23:37:06.412612: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-07-19 23:37:06.413657: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-07-19 23:37:06.414637: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-07-19 23:37:06.414862: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-07-19 23:37:06.415002: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:06.415823: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:06.416461: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-19 23:37:06.417159: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-07-19 23:37:06.417858: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:06.418588: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: NVIDIA Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-19 23:37:06.418704: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:06.419416: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:06.420021: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-19 23:37:06.420085: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-07-19 23:37:07.053897: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-19 23:37:07.053954: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-07-19 23:37:07.053964: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-07-19 23:37:07.054212: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:07.054962: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:07.055685: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:07.056348: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14646 MB memory) -> physical GPU (device: 0, name: NVIDIA Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0)
2021-07-19 23:37:07.060371: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 2000165000 Hz
[[1. 0.]
 [1. 0.]]

[[0. 1.]
 [0. 1.]]

1.0

Setelah mengkonversi

Dalam kode yang dikonversi:

  • Variabel adalah objek Python lokal.
  • The forward fungsi masih mendefinisikan perhitungan.
  • The Session.run panggilan diganti dengan panggilan untuk forward .
  • Opsional tf.function dekorator dapat ditambahkan untuk kinerja.
  • Regularisasi dihitung secara manual, tanpa mengacu pada koleksi global apa pun.
  • Tidak ada penggunaan sesi atau penampung.
W = tf.Variable(tf.ones(shape=(2,2)), name="W")
b = tf.Variable(tf.zeros(shape=(2)), name="b")

@tf.function
def forward(x):
  return W * x + b

out_a = forward([1,0])
print(out_a)
tf.Tensor(
[[1. 0.]
 [1. 0.]], shape=(2, 2), dtype=float32)
2021-07-19 23:37:07.370160: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:07.370572: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: NVIDIA Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-19 23:37:07.370699: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:07.371011: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:07.371278: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-19 23:37:07.371360: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-19 23:37:07.371370: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-07-19 23:37:07.371377: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-07-19 23:37:07.371511: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:07.371844: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:07.372131: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14646 MB memory) -> physical GPU (device: 0, name: NVIDIA Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0)
2021-07-19 23:37:07.419147: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
out_b = forward([0,1])

regularizer = tf.keras.regularizers.l2(0.04)
reg_loss=regularizer(W)

Model berdasarkan tf.layers

The v1.layers modul yang digunakan untuk mengandung lapisan-fungsi yang bergantung pada v1.variable_scope untuk mendefinisikan dan variabel digunakan kembali.

Sebelum mengonversi

def model(x, training, scope='model'):
  with v1.variable_scope(scope, reuse=v1.AUTO_REUSE):
    x = v1.layers.conv2d(x, 32, 3, activation=v1.nn.relu,
          kernel_regularizer=lambda x:0.004*tf.reduce_mean(x**2))
    x = v1.layers.max_pooling2d(x, (2, 2), 1)
    x = v1.layers.flatten(x)
    x = v1.layers.dropout(x, 0.1, training=training)
    x = v1.layers.dense(x, 64, activation=v1.nn.relu)
    x = v1.layers.batch_normalization(x, training=training)
    x = v1.layers.dense(x, 10)
    return x
train_data = tf.ones(shape=(1, 28, 28, 1))
test_data = tf.ones(shape=(1, 28, 28, 1))

train_out = model(train_data, training=True)
test_out = model(test_data, training=False)

print(train_out)
print()
print(test_out)
/tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow/python/keras/legacy_tf_layers/convolutional.py:414: UserWarning: `tf.layers.conv2d` is deprecated and will be removed in a future version. Please Use `tf.keras.layers.Conv2D` instead.
  warnings.warn('`tf.layers.conv2d` is deprecated and '
/tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow/python/keras/engine/base_layer.py:2183: UserWarning: `layer.apply` is deprecated and will be removed in a future version. Please use `layer.__call__` method instead.
  warnings.warn('`layer.apply` is deprecated and '
2021-07-19 23:37:07.471106: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-07-19 23:37:09.562531: I tensorflow/stream_executor/cuda/cuda_dnn.cc:359] Loaded cuDNN version 8100
2021-07-19 23:37:14.794726: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
tf.Tensor([[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]], shape=(1, 10), dtype=float32)

tf.Tensor(
[[ 0.04853132 -0.08974641 -0.32679698  0.07017353  0.12982666 -0.2153313
  -0.09793851  0.10957378  0.01823931  0.00898573]], shape=(1, 10), dtype=float32)
2021-07-19 23:37:15.173234: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
/tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow/python/keras/legacy_tf_layers/pooling.py:310: UserWarning: `tf.layers.max_pooling2d` is deprecated and will be removed in a future version. Please use `tf.keras.layers.MaxPooling2D` instead.
  warnings.warn('`tf.layers.max_pooling2d` is deprecated and '
/tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow/python/keras/legacy_tf_layers/core.py:329: UserWarning: `tf.layers.flatten` is deprecated and will be removed in a future version. Please use `tf.keras.layers.Flatten` instead.
  warnings.warn('`tf.layers.flatten` is deprecated and '
/tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow/python/keras/legacy_tf_layers/core.py:268: UserWarning: `tf.layers.dropout` is deprecated and will be removed in a future version. Please use `tf.keras.layers.Dropout` instead.
  warnings.warn('`tf.layers.dropout` is deprecated and '
/tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow/python/keras/legacy_tf_layers/core.py:171: UserWarning: `tf.layers.dense` is deprecated and will be removed in a future version. Please use `tf.keras.layers.Dense` instead.
  warnings.warn('`tf.layers.dense` is deprecated and '
/tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow/python/keras/legacy_tf_layers/normalization.py:308: UserWarning: `tf.layers.batch_normalization` is deprecated and will be removed in a future version. Please use `tf.keras.layers.BatchNormalization` instead. In particular, `tf.control_dependencies(tf.GraphKeys.UPDATE_OPS)` should not be used (consult the `tf.keras.layers.BatchNormalization` documentation).
  '`tf.layers.batch_normalization` is deprecated and '

Setelah mengkonversi

Sebagian besar argumen tetap sama. Tapi perhatikan perbedaannya:

  • The training argumen dilewatkan ke setiap lapisan dengan model ketika berjalan.
  • Argumen pertama yang asli model fungsi (input x ) adalah hilang. Ini karena lapisan objek memisahkan pembuatan model dari pemanggilan model.

Perhatikan juga bahwa:

  • Jika Anda menggunakan regularizers atau initializers dari tf.contrib , ini memiliki perubahan argumen yang lebih daripada yang lain.
  • Kode tidak lagi menulis untuk koleksi, sehingga fungsi seperti v1.losses.get_regularization_loss tidak akan lagi kembali nilai-nilai ini, berpotensi melanggar loop pelatihan Anda.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation='relu',
                           kernel_regularizer=tf.keras.regularizers.l2(0.04),
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.1),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(10)
])

train_data = tf.ones(shape=(1, 28, 28, 1))
test_data = tf.ones(shape=(1, 28, 28, 1))
train_out = model(train_data, training=True)
print(train_out)
tf.Tensor([[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]], shape=(1, 10), dtype=float32)
test_out = model(test_data, training=False)
print(test_out)
tf.Tensor(
[[-0.06252427  0.30122417 -0.18610534 -0.04890637 -0.01496555  0.41607457
   0.24905115  0.014429   -0.12719882 -0.22354674]], shape=(1, 10), dtype=float32)
# Here are all the trainable variables
len(model.trainable_variables)
8
# Here is the regularization loss
model.losses
[<tf.Tensor: shape=(), dtype=float32, numpy=0.07443664>]

Campuran variabel & v1.layers

Yang ada kode sering campuran menurunkan tingkat TensorFlow 1.x variabel dan operasi dengan tingkat yang lebih tinggi v1.layers .

Sebelum mengonversi

def model(x, training, scope='model'):
  with v1.variable_scope(scope, reuse=v1.AUTO_REUSE):
    W = v1.get_variable(
      "W", dtype=v1.float32,
      initializer=v1.ones(shape=x.shape),
      regularizer=lambda x:0.004*tf.reduce_mean(x**2),
      trainable=True)
    if training:
      x = x + W
    else:
      x = x + W * 0.5
    x = v1.layers.conv2d(x, 32, 3, activation=tf.nn.relu)
    x = v1.layers.max_pooling2d(x, (2, 2), 1)
    x = v1.layers.flatten(x)
    return x

train_out = model(train_data, training=True)
test_out = model(test_data, training=False)

Setelah mengkonversi

Untuk mengonversi kode ini, ikuti pola pemetaan lapisan ke lapisan seperti pada contoh sebelumnya.

Pola umumnya adalah:

  • Parameter lapisan Kumpulkan di __init__ .
  • Membangun variabel dalam build .
  • Mengeksekusi perhitungan dalam call , dan mengembalikan hasil.

The v1.variable_scope pada dasarnya adalah lapisan sendiri. Jadi menulis ulang sebagai tf.keras.layers.Layer . Check out Layers baru Membuat dan Model melalui subclassing panduan untuk rincian.

# Create a custom layer for part of the model
class CustomLayer(tf.keras.layers.Layer):
  def __init__(self, *args, **kwargs):
    super(CustomLayer, self).__init__(*args, **kwargs)

  def build(self, input_shape):
    self.w = self.add_weight(
        shape=input_shape[1:],
        dtype=tf.float32,
        initializer=tf.keras.initializers.ones(),
        regularizer=tf.keras.regularizers.l2(0.02),
        trainable=True)

  # Call method will sometimes get used in graph mode,
  # training will get turned into a tensor
  @tf.function
  def call(self, inputs, training=None):
    if training:
      return inputs + self.w
    else:
      return inputs + self.w * 0.5
custom_layer = CustomLayer()
print(custom_layer([1]).numpy())
print(custom_layer([1], training=True).numpy())
[1.5]
[2.]
train_data = tf.ones(shape=(1, 28, 28, 1))
test_data = tf.ones(shape=(1, 28, 28, 1))

# Build the model including the custom layer
model = tf.keras.Sequential([
    CustomLayer(input_shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
])

train_out = model(train_data, training=True)
test_out = model(test_data, training=False)

Beberapa hal yang perlu diperhatikan:

  • Model dan lapisan Keras subkelas perlu dijalankan di grafik v1 (tidak ada dependensi kontrol otomatis) dan dalam mode bersemangat:

    • Bungkus call dalam tf.function untuk mendapatkan tanda tangan dan kontrol otomatis dependensi.
  • Jangan lupa untuk menerima training argumen untuk call :

    • Kadang-kadang itu adalah tf.Tensor
    • Terkadang itu adalah boolean Python
  • Buat variabel model dalam konstruktor atau Model.build menggunakan `self.add_weight:

    • Dalam Model.build Anda memiliki akses ke bentuk masukan, sehingga dapat membuat bobot dengan bentuk yang cocok
    • Menggunakan tf.keras.layers.Layer.add_weight memungkinkan Keras untuk variabel track dan kerugian regularisasi
  • Jangan menyimpan tf.Tensors dalam objek Anda:

    • Mereka mungkin akan dibuat baik dalam tf.function atau dalam konteks bersemangat, dan tensor ini berperilaku berbeda
    • Gunakan tf.Variable s bagi negara, mereka selalu dapat digunakan dari kedua konteks
    • tf.Tensors hanya untuk nilai menengah

Catatan tentang Slim dan contrib.layers

Sejumlah besar kode TensorFlow 1.x tua menggunakan Slim perpustakaan, yang dikemas dengan TensorFlow 1.x sebagai tf.contrib.layers . Sebagai contrib modul, ini tidak lagi tersedia di TensorFlow 2.x, bahkan dalam tf.compat.v1 . Kode Konversi menggunakan Slim untuk TensorFlow 2.x lebih terlibat daripada mengkonversi repositori yang menggunakan v1.layers . Bahkan, mungkin masuk akal untuk mengkonversi kode Slim Anda untuk v1.layers pertama, kemudian dikonversi ke Keras.

  • Hapus arg_scopes , semua args harus eksplisit.
  • Jika Anda menggunakannya, split normalizer_fn dan activation_fn ke lapisan mereka sendiri.
  • Lapisan konv yang dapat dipisahkan dipetakan ke satu atau lebih lapisan Keras yang berbeda (lapisan Keras secara mendalam, titik, dan dapat dipisahkan).
  • Slim dan v1.layers memiliki nama argumen yang berbeda dan nilai-nilai default.
  • Beberapa argumen memiliki skala yang berbeda.
  • Jika Anda menggunakan Slim model pra-dilatih, mencoba model pra-traimed Keras ini dari tf.keras.applications atau TF Hub 's TensorFlow 2.x SavedModels diekspor dari kode Slim asli.

Beberapa tf.contrib lapisan tidak mungkin telah dipindahkan ke inti TensorFlow tetapi sebaliknya telah dipindahkan ke paket TensorFlow Addons .

Latihan

Ada banyak cara untuk data umpan ke tf.keras Model. Mereka akan menerima generator Python dan array Numpy sebagai input.

Cara yang direkomendasikan untuk data yang pakan untuk model adalah dengan menggunakan tf.data paket, yang berisi kumpulan kelas kinerja tinggi untuk memanipulasi data.

Jika Anda masih menggunakan tf.queue , ini sekarang hanya didukung sebagai data-struktur, bukan sebagai pipa masukan.

Menggunakan Kumpulan Data TensorFlow

The TensorFlow Datasets paket ( tfds ) mengandung utilitas untuk memuat dataset telah ditetapkan sebagai tf.data.Dataset objek.

Untuk contoh ini, Anda dapat memuat dataset MNIST menggunakan tfds :

datasets, info = tfds.load(name='mnist', with_info=True, as_supervised=True)
mnist_train, mnist_test = datasets['train'], datasets['test']

Kemudian siapkan data untuk pelatihan:

  • Skala ulang setiap gambar.
  • Acak urutan contoh.
  • Kumpulkan kumpulan gambar dan label.
BUFFER_SIZE = 10 # Use a much larger value for real code
BATCH_SIZE = 64
NUM_EPOCHS = 5


def scale(image, label):
  image = tf.cast(image, tf.float32)
  image /= 255

  return image, label

Untuk mempersingkat contoh, pangkas kumpulan data agar hanya mengembalikan 5 kumpulan:

train_data = mnist_train.map(scale).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)
test_data = mnist_test.map(scale).batch(BATCH_SIZE)

STEPS_PER_EPOCH = 5

train_data = train_data.take(STEPS_PER_EPOCH)
test_data = test_data.take(STEPS_PER_EPOCH)
image_batch, label_batch = next(iter(train_data))
2021-07-19 23:37:19.049077: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.

Gunakan loop pelatihan Keras

Jika Anda tidak perlu kontrol tingkat rendah dari proses pelatihan Anda, menggunakan Keras built-in fit , evaluate , dan predict metode dianjurkan. Metode ini menyediakan antarmuka yang seragam untuk melatih model terlepas dari implementasinya (berurutan, fungsional, atau sub-kelas).

Keuntungan dari metode ini meliputi:

  • Mereka menerima Numpy array, generator Python dan, tf.data.Datasets .
  • Mereka menerapkan regularisasi, dan kerugian aktivasi secara otomatis.
  • Mereka mendukung tf.distribute untuk pelatihan multi-perangkat .
  • Mereka mendukung callable sewenang-wenang sebagai kerugian dan metrik.
  • Mereka mendukung callback seperti tf.keras.callbacks.TensorBoard , dan callback kustom.
  • Mereka berkinerja baik, secara otomatis menggunakan grafik TensorFlow.

Berikut adalah contoh dari pelatihan model menggunakan Dataset . (Untuk rincian tentang cara kerja ini, memeriksa tutorial bagian.)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation='relu',
                           kernel_regularizer=tf.keras.regularizers.l2(0.02),
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.1),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(10)
])

# Model is the full model w/o custom layers
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(train_data, epochs=NUM_EPOCHS)
loss, acc = model.evaluate(test_data)

print("Loss {}, Accuracy {}".format(loss, acc))
Epoch 1/5
5/5 [==============================] - 2s 8ms/step - loss: 1.5874 - accuracy: 0.4719
Epoch 2/5
2021-07-19 23:37:20.919125: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
5/5 [==============================] - 0s 5ms/step - loss: 0.4435 - accuracy: 0.9094
Epoch 3/5
2021-07-19 23:37:21.242435: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
5/5 [==============================] - 0s 6ms/step - loss: 0.2764 - accuracy: 0.9594
Epoch 4/5
2021-07-19 23:37:21.576808: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
5/5 [==============================] - 0s 5ms/step - loss: 0.1889 - accuracy: 0.9844
Epoch 5/5
2021-07-19 23:37:21.888991: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
5/5 [==============================] - 1s 6ms/step - loss: 0.1504 - accuracy: 0.9906
2021-07-19 23:37:23.082199: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
5/5 [==============================] - 1s 3ms/step - loss: 1.6299 - accuracy: 0.7031
Loss 1.6299388408660889, Accuracy 0.703125
2021-07-19 23:37:23.932781: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.

Tulis lingkaran Anda sendiri

Jika model Keras ini langkah pelatihan bekerja untuk Anda, tetapi Anda perlu lebih banyak kontrol luar yang langkah, pertimbangkan untuk menggunakan tf.keras.Model.train_on_batch metode, di sendiri lingkaran data iterasi Anda.

Ingat: Banyak hal dapat diimplementasikan sebagai tf.keras.callbacks.Callback .

Metode ini memiliki banyak keuntungan dari metode yang disebutkan di bagian sebelumnya, tetapi memberikan kontrol loop luar kepada pengguna.

Anda juga dapat menggunakan tf.keras.Model.test_on_batch atau tf.keras.Model.evaluate kinerja cek selama pelatihan.

Untuk melanjutkan pelatihan model di atas:

# Model is the full model w/o custom layers
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

for epoch in range(NUM_EPOCHS):
  # Reset the metric accumulators
  model.reset_metrics()

  for image_batch, label_batch in train_data:
    result = model.train_on_batch(image_batch, label_batch)
    metrics_names = model.metrics_names
    print("train: ",
          "{}: {:.3f}".format(metrics_names[0], result[0]),
          "{}: {:.3f}".format(metrics_names[1], result[1]))
  for image_batch, label_batch in test_data:
    result = model.test_on_batch(image_batch, label_batch,
                                 # Return accumulated metrics
                                 reset_metrics=False)
  metrics_names = model.metrics_names
  print("\neval: ",
        "{}: {:.3f}".format(metrics_names[0], result[0]),
        "{}: {:.3f}".format(metrics_names[1], result[1]))
train:  loss: 0.131 accuracy: 1.000
train:  loss: 0.179 accuracy: 0.969
train:  loss: 0.117 accuracy: 0.984
train:  loss: 0.187 accuracy: 0.969
train:  loss: 0.168 accuracy: 0.969
2021-07-19 23:37:24.758128: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
2021-07-19 23:37:25.476778: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
eval:  loss: 1.655 accuracy: 0.703
train:  loss: 0.083 accuracy: 1.000
train:  loss: 0.080 accuracy: 1.000
train:  loss: 0.099 accuracy: 0.984
train:  loss: 0.088 accuracy: 1.000
train:  loss: 0.084 accuracy: 1.000
2021-07-19 23:37:25.822978: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
2021-07-19 23:37:26.103858: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
eval:  loss: 1.645 accuracy: 0.759
train:  loss: 0.066 accuracy: 1.000
train:  loss: 0.070 accuracy: 1.000
train:  loss: 0.062 accuracy: 1.000
train:  loss: 0.067 accuracy: 1.000
train:  loss: 0.061 accuracy: 1.000
2021-07-19 23:37:26.454306: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
2021-07-19 23:37:26.715112: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
eval:  loss: 1.609 accuracy: 0.819
train:  loss: 0.056 accuracy: 1.000
train:  loss: 0.053 accuracy: 1.000
train:  loss: 0.048 accuracy: 1.000
train:  loss: 0.057 accuracy: 1.000
train:  loss: 0.069 accuracy: 0.984
2021-07-19 23:37:27.059747: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
2021-07-19 23:37:27.327066: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
eval:  loss: 1.568 accuracy: 0.825
train:  loss: 0.048 accuracy: 1.000
train:  loss: 0.048 accuracy: 1.000
train:  loss: 0.044 accuracy: 1.000
train:  loss: 0.045 accuracy: 1.000
train:  loss: 0.045 accuracy: 1.000
2021-07-19 23:37:28.593597: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
eval:  loss: 1.531 accuracy: 0.841
2021-07-19 23:37:29.220455: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.

Sesuaikan langkah pelatihan

Jika Anda membutuhkan lebih banyak fleksibilitas dan kontrol, Anda dapat memilikinya dengan menerapkan loop pelatihan Anda sendiri. Ada tiga langkah:

  1. Iterate lebih generator Python atau tf.data.Dataset untuk mendapatkan batch contoh.
  2. Gunakan tf.GradientTape gradien mengumpulkan.
  3. Gunakan salah satu tf.keras.optimizers untuk menerapkan update bobot variabel model.

Ingat:

  • Selalu menyertakan training argumen pada call metode lapisan subclassed dan model.
  • Pastikan untuk memanggil model dengan training argumen set dengan benar.
  • Tergantung pada penggunaan, variabel model mungkin tidak ada sampai model dijalankan pada kumpulan data.
  • Anda perlu menangani hal-hal seperti kerugian regularisasi untuk model secara manual.

Perhatikan penyederhanaan relatif terhadap v1:

  • Tidak perlu menjalankan inisialisasi variabel. Variabel diinisialisasi pada pembuatan.
  • Tidak perlu menambahkan dependensi kontrol manual. Bahkan di tf.function operasi bertindak sebagai dalam mode bersemangat.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation='relu',
                           kernel_regularizer=tf.keras.regularizers.l2(0.02),
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.1),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(10)
])

optimizer = tf.keras.optimizers.Adam(0.001)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

@tf.function
def train_step(inputs, labels):
  with tf.GradientTape() as tape:
    predictions = model(inputs, training=True)
    regularization_loss=tf.math.add_n(model.losses)
    pred_loss=loss_fn(labels, predictions)
    total_loss=pred_loss + regularization_loss

  gradients = tape.gradient(total_loss, model.trainable_variables)
  optimizer.apply_gradients(zip(gradients, model.trainable_variables))

for epoch in range(NUM_EPOCHS):
  for inputs, labels in train_data:
    train_step(inputs, labels)
  print("Finished epoch", epoch)
2021-07-19 23:37:29.998049: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
Finished epoch 0
2021-07-19 23:37:30.316333: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
Finished epoch 1
2021-07-19 23:37:30.618560: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
Finished epoch 2
2021-07-19 23:37:30.946881: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
Finished epoch 3
Finished epoch 4
2021-07-19 23:37:31.261594: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.

Metrik dan kerugian gaya baru

Di TensorFlow 2.x, metrik dan kerugian adalah objek. Ini bekerja baik bersemangat dan tf.function s.

Objek kerugian dapat dipanggil, dan mengharapkan (y_true, y_pred) sebagai argumen:

cce = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
cce([[1, 0]], [[-1.0,3.0]]).numpy()
4.01815

Objek metrik memiliki metode berikut:

  • Metric.update_state() : menambahkan pengamatan baru.
  • Metric.result() : mendapatkan hasil saat metrik, mengingat nilai-nilai yang diamati.
  • Metric.reset_states() : menghapus semua pengamatan.

Objek itu sendiri dapat dipanggil. Memanggil update negara dengan pengamatan baru, seperti update_state , dan mengembalikan hasil baru dari metrik.

Anda tidak perlu menginisialisasi variabel metrik secara manual, dan karena TensorFlow 2.x memiliki dependensi kontrol otomatis, Anda juga tidak perlu mengkhawatirkannya.

Kode di bawah ini menggunakan metrik untuk melacak kerugian rata-rata yang diamati dalam loop pelatihan khusus.

# Create the metrics
loss_metric = tf.keras.metrics.Mean(name='train_loss')
accuracy_metric = tf.keras.metrics.SparseCategoricalAccuracy(name='train_accuracy')

@tf.function
def train_step(inputs, labels):
  with tf.GradientTape() as tape:
    predictions = model(inputs, training=True)
    regularization_loss=tf.math.add_n(model.losses)
    pred_loss=loss_fn(labels, predictions)
    total_loss=pred_loss + regularization_loss

  gradients = tape.gradient(total_loss, model.trainable_variables)
  optimizer.apply_gradients(zip(gradients, model.trainable_variables))
  # Update the metrics
  loss_metric.update_state(total_loss)
  accuracy_metric.update_state(labels, predictions)


for epoch in range(NUM_EPOCHS):
  # Reset the metrics
  loss_metric.reset_states()
  accuracy_metric.reset_states()

  for inputs, labels in train_data:
    train_step(inputs, labels)
  # Get the metric results
  mean_loss=loss_metric.result()
  mean_accuracy = accuracy_metric.result()

  print('Epoch: ', epoch)
  print('  loss:     {:.3f}'.format(mean_loss))
  print('  accuracy: {:.3f}'.format(mean_accuracy))
2021-07-19 23:37:31.878403: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
Epoch:  0
  loss:     0.172
  accuracy: 0.988
2021-07-19 23:37:32.177136: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
Epoch:  1
  loss:     0.143
  accuracy: 0.997
2021-07-19 23:37:32.493570: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
Epoch:  2
  loss:     0.126
  accuracy: 0.997
2021-07-19 23:37:32.807739: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
Epoch:  3
  loss:     0.109
  accuracy: 1.000
Epoch:  4
  loss:     0.092
  accuracy: 1.000
2021-07-19 23:37:33.155028: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.

Nama metrik keras

Di TensorFlow 2.x, model Keras lebih konsisten dalam menangani nama metrik.

Sekarang ketika Anda melewati string dalam daftar metrik, string tepat digunakan sebagai metrik ini name . Nama-nama ini terlihat di objek sejarah dikembalikan oleh model.fit , dan di log diteruskan ke keras.callbacks . disetel ke string yang Anda berikan dalam daftar metrik.

model.compile(
    optimizer = tf.keras.optimizers.Adam(0.001),
    loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics = ['acc', 'accuracy', tf.keras.metrics.SparseCategoricalAccuracy(name="my_accuracy")])
history = model.fit(train_data)
5/5 [==============================] - 1s 6ms/step - loss: 0.1042 - acc: 0.9969 - accuracy: 0.9969 - my_accuracy: 0.9969
2021-07-19 23:37:34.039643: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
history.history.keys()
dict_keys(['loss', 'acc', 'accuracy', 'my_accuracy'])

Ini berbeda dari versi sebelumnya di mana lewat metrics=["accuracy"] akan menghasilkan dict_keys(['loss', 'acc'])

Pengoptimal keras

Pengoptimalan di v1.train , seperti v1.train.AdamOptimizer dan v1.train.GradientDescentOptimizer , memiliki setara dalam tf.keras.optimizers .

Mengkonversi v1.train ke keras.optimizers

Berikut adalah hal-hal yang perlu diingat saat mengonversi pengoptimal Anda:

Default baru untuk beberapa tf.keras.optimizers

Tidak ada perubahan untuk optimizers.SGD , optimizers.Adam , atau optimizers.RMSprop .

Tingkat pembelajaran default berikut telah berubah:

Papan Tensor

TensorFlow 2.x termasuk perubahan yang signifikan terhadap tf.summary API digunakan untuk data ringkasan menulis untuk visualisasi di TensorBoard. Untuk pengantar umum baru tf.summary , ada beberapa tutorial yang tersedia yang menggunakan TensorFlow 2.x API. Ini mencakup TensorBoard TensorFlow panduan migrasi 2.x .

Menyimpan dan memuat

Kompatibilitas pos pemeriksaan

TensorFlow 2.x menggunakan berbasis obyek pemeriksaan .

Pos pemeriksaan berbasis nama gaya lama masih dapat dimuat, jika Anda berhati-hati. Proses konversi kode dapat mengakibatkan perubahan nama variabel, tetapi ada solusi.

Pendekatan paling sederhana untuk menyejajarkan nama-nama model baru dengan nama-nama di pos pemeriksaan:

  • Variabel masih semua memiliki name argumen Anda dapat mengatur.
  • Model Keras juga mengambil name argumen seperti yang mereka ditetapkan sebagai awalan untuk variabel mereka.
  • The v1.name_scope fungsi dapat digunakan untuk mengatur awalan nama variabel. Hal ini sangat berbeda dari tf.variable_scope . Itu hanya memengaruhi nama, dan tidak melacak variabel dan menggunakan kembali.

Jika itu tidak bekerja untuk kasus penggunaan Anda, cobalah v1.train.init_from_checkpoint fungsi. Dibutuhkan assignment_map argumen, yang menentukan pemetaan dari nama-nama lama ke nama baru.

The TensorFlow Pengukur repositori termasuk alat konversi untuk meningkatkan pos-pos pemeriksaan untuk estimator premade dari TensorFlow 1.x ke 2.0. Ini dapat berfungsi sebagai contoh bagaimana membangun alat untuk kasus penggunaan serupa.

Kompatibilitas model yang disimpan

Tidak ada masalah kompatibilitas yang signifikan untuk model yang disimpan.

  • TensorFlow 1.x stored_models berfungsi di TensorFlow 2.x.
  • TensorFlow 2.x save_models berfungsi di TensorFlow 1.x jika semua operasi didukung.

Sebuah Graph.pb atau Graph.pbtxt

Tidak ada cara mudah untuk meng-upgrade baku Graph.pb file TensorFlow 2.x. Taruhan terbaik Anda adalah memutakhirkan kode yang menghasilkan file.

Tapi, jika Anda memiliki "grafik beku" (a tf.Graph mana variabel telah berubah menjadi konstanta), maka adalah mungkin untuk mengubahnya ke concrete_function menggunakan v1.wrap_function :

def wrap_frozen_graph(graph_def, inputs, outputs):
  def _imports_graph_def():
    tf.compat.v1.import_graph_def(graph_def, name="")
  wrapped_import = tf.compat.v1.wrap_function(_imports_graph_def, [])
  import_graph = wrapped_import.graph
  return wrapped_import.prune(
      tf.nest.map_structure(import_graph.as_graph_element, inputs),
      tf.nest.map_structure(import_graph.as_graph_element, outputs))

Misalnya, berikut adalah grafik beku untuk Inception v1, dari 2016:

path = tf.keras.utils.get_file(
    'inception_v1_2016_08_28_frozen.pb',
    'http://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz',
    untar=True)
Downloading data from http://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz
24698880/24695710 [==============================] - 1s 0us/step

Memuat tf.GraphDef :

graph_def = tf.compat.v1.GraphDef()
loaded = graph_def.ParseFromString(open(path,'rb').read())

Bungkus menjadi concrete_function :

inception_func = wrap_frozen_graph(
    graph_def, inputs='input:0',
    outputs='InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu:0')

Berikan tensor sebagai input:

input_img = tf.ones([1,224,224,3], dtype=tf.float32)
inception_func(input_img).shape
TensorShape([1, 28, 28, 96])

Penaksir

Pelatihan dengan Penaksir

Penaksir didukung di TensorFlow 2.x.

Bila Anda menggunakan estimator, Anda dapat menggunakan input_fn , tf.estimator.TrainSpec , dan tf.estimator.EvalSpec dari TensorFlow 1.x.

Berikut adalah contoh menggunakan input_fn dengan kereta api dan mengevaluasi spesifikasi.

Membuat spesifikasi input_fn dan train/eval

# Define the estimator's input_fn
def input_fn():
  datasets, info = tfds.load(name='mnist', with_info=True, as_supervised=True)
  mnist_train, mnist_test = datasets['train'], datasets['test']

  BUFFER_SIZE = 10000
  BATCH_SIZE = 64

  def scale(image, label):
    image = tf.cast(image, tf.float32)
    image /= 255

    return image, label[..., tf.newaxis]

  train_data = mnist_train.map(scale).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)
  return train_data.repeat()

# Define train and eval specs
train_spec = tf.estimator.TrainSpec(input_fn=input_fn,
                                    max_steps=STEPS_PER_EPOCH * NUM_EPOCHS)
eval_spec = tf.estimator.EvalSpec(input_fn=input_fn,
                                  steps=STEPS_PER_EPOCH)

Menggunakan definisi model Keras

Ada beberapa perbedaan dalam cara membuat estimator di TensorFlow 2.x.

Ini direkomendasikan bahwa Anda menentukan model Anda menggunakan Keras, kemudian gunakan tf.keras.estimator.model_to_estimator utilitas untuk mengubah model menjadi estimator. Kode di bawah ini menunjukkan cara menggunakan utilitas ini saat membuat dan melatih estimator.

def make_model():
  return tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation='relu',
                           kernel_regularizer=tf.keras.regularizers.l2(0.02),
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.1),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(10)
  ])
model = make_model()

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

estimator = tf.keras.estimator.model_to_estimator(
  keras_model = model
)

tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
INFO:tensorflow:Using default config.
INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmpbhtumut0
WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmpbhtumut0
INFO:tensorflow:Using the Keras model provided.
INFO:tensorflow:Using the Keras model provided.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow/python/keras/layers/normalization.py:534: _colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
/tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow/python/keras/backend.py:435: UserWarning: `tf.keras.backend.set_learning_phase` is deprecated and will be removed after 2020-10-11. To update it, simply pass a True/False value to the `training` argument of the `__call__` method of your layer or model.
  warnings.warn('`tf.keras.backend.set_learning_phase` is deprecated and '
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow/python/keras/layers/normalization.py:534: _colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpbhtumut0', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
2021-07-19 23:37:36.453946: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:36.454330: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: NVIDIA Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-19 23:37:36.454461: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:36.454737: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:36.454977: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-19 23:37:36.455020: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-19 23:37:36.455027: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-07-19 23:37:36.455033: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-07-19 23:37:36.455126: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:36.455479: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:36.455779: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14646 MB memory) -> physical GPU (device: 0, name: NVIDIA Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0)
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpbhtumut0', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Not using Distribute Coordinator.
INFO:tensorflow:Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Warm-starting with WarmStartSettings: WarmStartSettings(ckpt_to_initialize_from='/tmp/tmpbhtumut0/keras/keras_model.ckpt', vars_to_warm_start='.*', var_name_to_vocab_info={}, var_name_to_prev_var_name={})
INFO:tensorflow:Warm-starting with WarmStartSettings: WarmStartSettings(ckpt_to_initialize_from='/tmp/tmpbhtumut0/keras/keras_model.ckpt', vars_to_warm_start='.*', var_name_to_vocab_info={}, var_name_to_prev_var_name={})
INFO:tensorflow:Warm-starting from: /tmp/tmpbhtumut0/keras/keras_model.ckpt
INFO:tensorflow:Warm-starting from: /tmp/tmpbhtumut0/keras/keras_model.ckpt
INFO:tensorflow:Warm-starting variables only in TRAINABLE_VARIABLES.
INFO:tensorflow:Warm-starting variables only in TRAINABLE_VARIABLES.
INFO:tensorflow:Warm-started 8 variables.
INFO:tensorflow:Warm-started 8 variables.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Graph was finalized.
2021-07-19 23:37:39.175917: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:39.176299: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: NVIDIA Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-19 23:37:39.176424: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:39.176729: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:39.176999: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-19 23:37:39.177042: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-19 23:37:39.177050: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-07-19 23:37:39.177057: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-07-19 23:37:39.177159: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:39.177481: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:39.177761: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14646 MB memory) -> physical GPU (device: 0, name: NVIDIA Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0)
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmpbhtumut0/model.ckpt.
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmpbhtumut0/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 3.1193407, step = 0
INFO:tensorflow:loss = 3.1193407, step = 0
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 25...
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 25...
INFO:tensorflow:Saving checkpoints for 25 into /tmp/tmpbhtumut0/model.ckpt.
INFO:tensorflow:Saving checkpoints for 25 into /tmp/tmpbhtumut0/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 25...
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 25...
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
/tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:2426: UserWarning: `Model.state_updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
  warnings.warn('`Model.state_updates` will be removed in a future version. '
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-07-19T23:37:42
INFO:tensorflow:Starting evaluation at 2021-07-19T23:37:42
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Graph was finalized.
2021-07-19 23:37:42.476830: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:42.477207: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: NVIDIA Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-19 23:37:42.477339: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:42.477648: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:42.477910: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-19 23:37:42.477955: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-19 23:37:42.477963: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-07-19 23:37:42.477969: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-07-19 23:37:42.478058: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:42.478332: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
INFO:tensorflow:Restoring parameters from /tmp/tmpbhtumut0/model.ckpt-25
2021-07-19 23:37:42.478592: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14646 MB memory) -> physical GPU (device: 0, name: NVIDIA Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0)
INFO:tensorflow:Restoring parameters from /tmp/tmpbhtumut0/model.ckpt-25
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [1/5]
INFO:tensorflow:Evaluation [1/5]
INFO:tensorflow:Evaluation [2/5]
INFO:tensorflow:Evaluation [2/5]
INFO:tensorflow:Evaluation [3/5]
INFO:tensorflow:Evaluation [3/5]
INFO:tensorflow:Evaluation [4/5]
INFO:tensorflow:Evaluation [4/5]
INFO:tensorflow:Evaluation [5/5]
INFO:tensorflow:Evaluation [5/5]
INFO:tensorflow:Inference Time : 1.02146s
2021-07-19 23:37:43.437293: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
INFO:tensorflow:Inference Time : 1.02146s
INFO:tensorflow:Finished evaluation at 2021-07-19-23:37:43
INFO:tensorflow:Finished evaluation at 2021-07-19-23:37:43
INFO:tensorflow:Saving dict for global step 25: accuracy = 0.634375, global_step = 25, loss = 1.493957
INFO:tensorflow:Saving dict for global step 25: accuracy = 0.634375, global_step = 25, loss = 1.493957
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 25: /tmp/tmpbhtumut0/model.ckpt-25
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 25: /tmp/tmpbhtumut0/model.ckpt-25
INFO:tensorflow:Loss for final step: 0.37796202.
2021-07-19 23:37:43.510911: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
INFO:tensorflow:Loss for final step: 0.37796202.
({'accuracy': 0.634375, 'loss': 1.493957, 'global_step': 25}, [])

Menggunakan kustom model_fn

Jika Anda memiliki estimator kustom yang ada model_fn bahwa Anda perlu untuk mempertahankan, Anda dapat mengkonversi Anda model_fn menggunakan model Keras.

Namun, untuk alasan kompatibilitas, kebiasaan model_fn akan tetap berjalan dalam modus grafik 1.x-gaya. Ini berarti tidak ada eksekusi yang bersemangat dan tidak ada ketergantungan kontrol otomatis.

Model_fn khusus dengan sedikit perubahan

Untuk membuat kustom Anda model_fn bekerja di TensorFlow 2.x, jika Anda lebih memilih sedikit perubahan pada kode yang ada, tf.compat.v1 simbol seperti optimizers dan metrics dapat digunakan.

Menggunakan model Keras di custom model_fn mirip dengan menggunakannya dalam loop pelatihan kustom:

  • Mengatur training fase tepat, berdasarkan mode argumen.
  • Secara eksplisit lulus model trainable_variables untuk optimizer.

Tapi ada perbedaan penting, relatif terhadap lingkaran kustom :

  • Alih-alih menggunakan Model.losses , ekstrak kerugian menggunakan Model.get_losses_for .
  • Ekstrak update model menggunakan Model.get_updates_for .

Kode berikut menciptakan estimator dari kebiasaan model_fn , menggambarkan semua kekhawatiran ini.

def my_model_fn(features, labels, mode):
  model = make_model()

  optimizer = tf.compat.v1.train.AdamOptimizer()
  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

  training = (mode == tf.estimator.ModeKeys.TRAIN)
  predictions = model(features, training=training)

  if mode == tf.estimator.ModeKeys.PREDICT:
    return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

  reg_losses = model.get_losses_for(None) + model.get_losses_for(features)
  total_loss=loss_fn(labels, predictions) + tf.math.add_n(reg_losses)

  accuracy = tf.compat.v1.metrics.accuracy(labels=labels,
                                           predictions=tf.math.argmax(predictions, axis=1),
                                           name='acc_op')

  update_ops = model.get_updates_for(None) + model.get_updates_for(features)
  minimize_op = optimizer.minimize(
      total_loss,
      var_list=model.trainable_variables,
      global_step=tf.compat.v1.train.get_or_create_global_step())
  train_op = tf.group(minimize_op, update_ops)

  return tf.estimator.EstimatorSpec(
    mode=mode,
    predictions=predictions,
    loss=total_loss,
    train_op=train_op, eval_metric_ops={'accuracy': accuracy})

# Create the Estimator & Train
estimator = tf.estimator.Estimator(model_fn=my_model_fn)
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
INFO:tensorflow:Using default config.
INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmpqiom6a5s
WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmpqiom6a5s
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpqiom6a5s', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpqiom6a5s', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Not using Distribute Coordinator.
INFO:tensorflow:Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Graph was finalized.
2021-07-19 23:37:46.140692: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:46.141065: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: NVIDIA Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-19 23:37:46.141220: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:46.141517: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:46.141765: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-19 23:37:46.141807: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-19 23:37:46.141814: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-07-19 23:37:46.141820: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-07-19 23:37:46.141907: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:46.142234: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:46.142497: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14646 MB memory) -> physical GPU (device: 0, name: NVIDIA Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0)
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmpqiom6a5s/model.ckpt.
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmpqiom6a5s/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 2.9167266, step = 0
INFO:tensorflow:loss = 2.9167266, step = 0
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 25...
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 25...
INFO:tensorflow:Saving checkpoints for 25 into /tmp/tmpqiom6a5s/model.ckpt.
INFO:tensorflow:Saving checkpoints for 25 into /tmp/tmpqiom6a5s/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 25...
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 25...
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-07-19T23:37:49
INFO:tensorflow:Starting evaluation at 2021-07-19T23:37:49
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpqiom6a5s/model.ckpt-25
2021-07-19 23:37:49.640699: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:49.641091: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: NVIDIA Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-19 23:37:49.641238: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:49.641580: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:49.641848: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-19 23:37:49.641893: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-19 23:37:49.641901: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-07-19 23:37:49.641910: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-07-19 23:37:49.642029: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:49.642355: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:49.642657: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14646 MB memory) -> physical GPU (device: 0, name: NVIDIA Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0)
INFO:tensorflow:Restoring parameters from /tmp/tmpqiom6a5s/model.ckpt-25
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [1/5]
INFO:tensorflow:Evaluation [1/5]
INFO:tensorflow:Evaluation [2/5]
INFO:tensorflow:Evaluation [2/5]
INFO:tensorflow:Evaluation [3/5]
INFO:tensorflow:Evaluation [3/5]
INFO:tensorflow:Evaluation [4/5]
INFO:tensorflow:Evaluation [4/5]
INFO:tensorflow:Evaluation [5/5]
INFO:tensorflow:Evaluation [5/5]
INFO:tensorflow:Inference Time : 1.38362s
2021-07-19 23:37:50.924973: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
INFO:tensorflow:Inference Time : 1.38362s
INFO:tensorflow:Finished evaluation at 2021-07-19-23:37:50
INFO:tensorflow:Finished evaluation at 2021-07-19-23:37:50
INFO:tensorflow:Saving dict for global step 25: accuracy = 0.70625, global_step = 25, loss = 1.6135181
INFO:tensorflow:Saving dict for global step 25: accuracy = 0.70625, global_step = 25, loss = 1.6135181
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 25: /tmp/tmpqiom6a5s/model.ckpt-25
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 25: /tmp/tmpqiom6a5s/model.ckpt-25
INFO:tensorflow:Loss for final step: 0.60315084.
2021-07-19 23:37:51.035953: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
INFO:tensorflow:Loss for final step: 0.60315084.
({'accuracy': 0.70625, 'loss': 1.6135181, 'global_step': 25}, [])

Kustom model_fn dengan simbol 2.x TensorFlow

Jika Anda ingin menyingkirkan semua simbol 1.x TensorFlow dan upgrade kustom Anda model_fn untuk TensorFlow 2.x, Anda perlu memperbarui optimizer dan metrik untuk tf.keras.optimizers dan tf.keras.metrics .

Dalam kebiasaan model_fn , selain di atas perubahan , lebih upgrade perlu dibuat:

Untuk contoh di atas dari my_model_fn , kode bermigrasi dengan simbol 2.x TensorFlow ditampilkan sebagai:

def my_model_fn(features, labels, mode):
  model = make_model()

  training = (mode == tf.estimator.ModeKeys.TRAIN)
  loss_obj = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
  predictions = model(features, training=training)

  # Get both the unconditional losses (the None part)
  # and the input-conditional losses (the features part).
  reg_losses = model.get_losses_for(None) + model.get_losses_for(features)
  total_loss=loss_obj(labels, predictions) + tf.math.add_n(reg_losses)

  # Upgrade to tf.keras.metrics.
  accuracy_obj = tf.keras.metrics.Accuracy(name='acc_obj')
  accuracy = accuracy_obj.update_state(
      y_true=labels, y_pred=tf.math.argmax(predictions, axis=1))

  train_op = None
  if training:
    # Upgrade to tf.keras.optimizers.
    optimizer = tf.keras.optimizers.Adam()
    # Manually assign tf.compat.v1.global_step variable to optimizer.iterations
    # to make tf.compat.v1.train.global_step increased correctly.
    # This assignment is a must for any `tf.train.SessionRunHook` specified in
    # estimator, as SessionRunHooks rely on global step.
    optimizer.iterations = tf.compat.v1.train.get_or_create_global_step()
    # Get both the unconditional updates (the None part)
    # and the input-conditional updates (the features part).
    update_ops = model.get_updates_for(None) + model.get_updates_for(features)
    # Compute the minimize_op.
    minimize_op = optimizer.get_updates(
        total_loss,
        model.trainable_variables)[0]
    train_op = tf.group(minimize_op, *update_ops)

  return tf.estimator.EstimatorSpec(
    mode=mode,
    predictions=predictions,
    loss=total_loss,
    train_op=train_op,
    eval_metric_ops={'Accuracy': accuracy_obj})

# Create the Estimator and train.
estimator = tf.estimator.Estimator(model_fn=my_model_fn)
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
INFO:tensorflow:Using default config.
INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmpomveromc
WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmpomveromc
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpomveromc', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpomveromc', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Not using Distribute Coordinator.
INFO:tensorflow:Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Graph was finalized.
2021-07-19 23:37:53.371110: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:53.371633: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: NVIDIA Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-19 23:37:53.371845: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:53.372311: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:53.372679: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-19 23:37:53.372742: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-19 23:37:53.372779: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-07-19 23:37:53.372790: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-07-19 23:37:53.372939: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:53.373380: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:53.373693: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14646 MB memory) -> physical GPU (device: 0, name: NVIDIA Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0)
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmpomveromc/model.ckpt.
INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmpomveromc/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 2.874814, step = 0
INFO:tensorflow:loss = 2.874814, step = 0
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 25...
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 25...
INFO:tensorflow:Saving checkpoints for 25 into /tmp/tmpomveromc/model.ckpt.
INFO:tensorflow:Saving checkpoints for 25 into /tmp/tmpomveromc/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 25...
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 25...
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-07-19T23:37:56
INFO:tensorflow:Starting evaluation at 2021-07-19T23:37:56
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmp/tmpomveromc/model.ckpt-25
2021-07-19 23:37:56.884303: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:56.884746: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:00:05.0 name: NVIDIA Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-07-19 23:37:56.884934: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:56.885330: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:56.885640: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-07-19 23:37:56.885696: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-07-19 23:37:56.885711: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-07-19 23:37:56.885720: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-07-19 23:37:56.885861: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:56.886386: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-07-19 23:37:56.886729: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14646 MB memory) -> physical GPU (device: 0, name: NVIDIA Tesla V100-SXM2-16GB, pci bus id: 0000:00:05.0, compute capability: 7.0)
INFO:tensorflow:Restoring parameters from /tmp/tmpomveromc/model.ckpt-25
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [1/5]
INFO:tensorflow:Evaluation [1/5]
INFO:tensorflow:Evaluation [2/5]
INFO:tensorflow:Evaluation [2/5]
INFO:tensorflow:Evaluation [3/5]
INFO:tensorflow:Evaluation [3/5]
INFO:tensorflow:Evaluation [4/5]
INFO:tensorflow:Evaluation [4/5]
INFO:tensorflow:Evaluation [5/5]
INFO:tensorflow:Evaluation [5/5]
INFO:tensorflow:Inference Time : 1.04574s
2021-07-19 23:37:57.852422: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
INFO:tensorflow:Inference Time : 1.04574s
INFO:tensorflow:Finished evaluation at 2021-07-19-23:37:57
INFO:tensorflow:Finished evaluation at 2021-07-19-23:37:57
INFO:tensorflow:Saving dict for global step 25: Accuracy = 0.790625, global_step = 25, loss = 1.4257433
INFO:tensorflow:Saving dict for global step 25: Accuracy = 0.790625, global_step = 25, loss = 1.4257433
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 25: /tmp/tmpomveromc/model.ckpt-25
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 25: /tmp/tmpomveromc/model.ckpt-25
INFO:tensorflow:Loss for final step: 0.42627147.
2021-07-19 23:37:57.941217: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
INFO:tensorflow:Loss for final step: 0.42627147.
({'Accuracy': 0.790625, 'loss': 1.4257433, 'global_step': 25}, [])

Penaksir Premade

Premade estimator dalam keluarga tf.estimator.DNN* , tf.estimator.Linear* dan tf.estimator.DNNLinearCombined* masih didukung dalam TensorFlow 2.x API. Namun, beberapa argumen telah berubah:

  1. input_layer_partitioner : Dihapus di v2.
  2. loss_reduction : Diperbarui untuk tf.keras.losses.Reduction bukan tf.compat.v1.losses.Reduction . Nilai default juga berubah menjadi tf.keras.losses.Reduction.SUM_OVER_BATCH_SIZE dari tf.compat.v1.losses.Reduction.SUM .
  3. optimizer , dnn_optimizer dan linear_optimizer : argumen ini telah diperbarui untuk tf.keras.optimizers bukan tf.compat.v1.train.Optimizer .

Untuk memigrasikan perubahan di atas:

  1. Tidak ada migrasi diperlukan untuk input_layer_partitioner sejak Distribution Strategy akan menanganinya secara otomatis di TensorFlow 2.x.
  2. Untuk loss_reduction , periksa tf.keras.losses.Reduction untuk pilihan didukung.
  3. Untuk optimizer argumen:
    • Jika Anda tidak: 1) lulus dalam optimizer , dnn_optimizer atau linear_optimizer argumen, atau 2) menentukan optimizer argumen sebagai string dalam kode Anda, maka Anda tidak perlu perubahan apa-apa karena tf.keras.optimizers digunakan secara default .
    • Jika tidak, Anda perlu memperbarui dari tf.compat.v1.train.Optimizer ke yang sesuai tf.keras.optimizers .

Konverter pos pemeriksaan

Migrasi ke keras.optimizers akan merusak pos pemeriksaan disimpan menggunakan TensorFlow 1.x, sebagai tf.keras.optimizers menghasilkan satu set yang berbeda dari variabel yang akan disimpan di pos pemeriksaan. Untuk membuat reusable pos pemeriksaan lama setelah migrasi ke TensorFlow 2.x, coba alat pos pemeriksaan converter .

 curl -O https://raw.githubusercontent.com/tensorflow/estimator/master/tensorflow_estimator/python/estimator/tools/checkpoint_converter.py
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 14889  100 14889    0     0  60771      0 --:--:-- --:--:-- --:--:-- 60771

Alat ini memiliki bantuan bawaan:

 python checkpoint_converter.py -h
2021-07-19 23:37:58.805973: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
usage: checkpoint_converter.py [-h]
                               {dnn,linear,combined} source_checkpoint
                               source_graph target_checkpoint

positional arguments:
  {dnn,linear,combined}
                        The type of estimator to be converted. So far, the
                        checkpoint converter only supports Canned Estimator.
                        So the allowed types include linear, dnn and combined.
  source_checkpoint     Path to source checkpoint file to be read in.
  source_graph          Path to source graph file to be read in.
  target_checkpoint     Path to checkpoint file to be written out.

optional arguments:
  -h, --help            show this help message and exit

Bentuk Tensor

Kelas ini disederhanakan untuk terus int s, bukan tf.compat.v1.Dimension objek. Jadi tidak perlu untuk memanggil .value untuk mendapatkan int .

Individu tf.compat.v1.Dimension benda masih dapat diakses dari tf.TensorShape.dims .

Berikut ini menunjukkan perbedaan antara TensorFlow 1.x dan TensorFlow 2.x.

# Create a shape and choose an index
i = 0
shape = tf.TensorShape([16, None, 256])
shape
TensorShape([16, None, 256])

Jika Anda memiliki ini di TensorFlow 1.x:

value = shape[i].value

Kemudian lakukan ini di TensorFlow 2.x:

value = shape[i]
value
16

Jika Anda memiliki ini di TensorFlow 1.x:

for dim in shape:
    value = dim.value
    print(value)

Kemudian lakukan ini di TensorFlow 2.x:

for value in shape:
  print(value)
16
None
256

Jika Anda memilikinya di TensorFlow 1.x (atau menggunakan metode dimensi lain):

dim = shape[i]
dim.assert_is_compatible_with(other_dim)

Kemudian lakukan ini di TensorFlow 2.x:

other_dim = 16
Dimension = tf.compat.v1.Dimension

if shape.rank is None:
  dim = Dimension(None)
else:
  dim = shape.dims[i]
dim.is_compatible_with(other_dim) # or any other dimension method
True
shape = tf.TensorShape(None)

if shape:
  dim = shape.dims[i]
  dim.is_compatible_with(other_dim) # or any other dimension method

Nilai boolean dari tf.TensorShape adalah True jika rank diketahui, False sebaliknya.

print(bool(tf.TensorShape([])))      # Scalar
print(bool(tf.TensorShape([0])))     # 0-length vector
print(bool(tf.TensorShape([1])))     # 1-length vector
print(bool(tf.TensorShape([None])))  # Unknown-length vector
print(bool(tf.TensorShape([1, 10, 100])))       # 3D tensor
print(bool(tf.TensorShape([None, None, None]))) # 3D tensor with no known dimensions
print()
print(bool(tf.TensorShape(None)))  # A tensor with unknown rank.
True
True
True
True
True
True

False

Perubahan lainnya

  • Hapus tf.colocate_with : algoritma penempatan perangkat TensorFlow ini telah meningkat secara signifikan. Ini seharusnya tidak lagi diperlukan. Jika menghapus itu menyebabkan degradasi kinerja silakan melaporkan bug .

  • Ganti v1.ConfigProto penggunaan dengan fungsi setara dari tf.config .

Kesimpulan

Proses keseluruhannya adalah:

  1. Jalankan skrip peningkatan.
  2. Hapus simbol kontribusi.
  3. Alihkan model Anda ke gaya berorientasi objek (Keras).
  4. Gunakan tf.keras atau tf.estimator pelatihan dan loop evaluasi di mana Anda bisa.
  5. Jika tidak, gunakan loop khusus, tetapi pastikan untuk menghindari sesi & koleksi.

Dibutuhkan sedikit kerja untuk mengonversi kode ke TensorFlow 2.x idiomatik, tetapi setiap perubahan menghasilkan:

  • Lebih sedikit baris kode.
  • Peningkatan kejelasan dan kesederhanaan.
  • Debugging lebih mudah.