Introduction to modules, layers, and models

To do machine learning in TensorFlow, you are likely to need to define, save, and restore a model.

A model is, abstractly:

A function that computes something on tensors (a forward pass)
Some variables that can be updated in response to training

In this guide, you will go below the surface of Keras to see how TensorFlow models are defined. It looks at how TensorFlow collects variables and models, as well as how they are saved and restored.

Setup

import tensorflow as tf
from datetime import datetime

%load_ext tensorboard

Defining models and layers in TensorFlow

Most models are made of layers. Layers are functions with a known mathematical structure that can be reused and that have trainable variables. In TensorFlow, most high-level implementations of layers and models, such as Keras or Sonnet, are built on the same foundational class: tf.Module.

Here is an example of a very simple tf.Module that operates on a scalar tensor:

class SimpleModule(tf.Module):
  def __init__(self, name=None):
    super().__init__(name=name)
    self.a_variable = tf.Variable(5.0, name="train_me")
    self.non_trainable_variable = tf.Variable(5.0, trainable=False, name="do_not_train_me")
  def __call__(self, x):
    return self.a_variable * x + self.non_trainable_variable

simple_module = SimpleModule(name="simple")

simple_module(tf.constant(5.0))

<tf.Tensor: shape=(), dtype=float32, numpy=30.0>

Modules and, by extension, layers are deep-learning terminology for "objects": they have internal state, and methods that use that state.

There is nothing special about __call__ except that it makes the object act like a Python callable; you can invoke your models with whatever functions you wish.

You can set the trainability of variables on and off for any reason, including freezing layers and variables during fine-tuning.
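
As a rough illustration of what trainability means in practice, the sketch below runs one manual gradient step. The ScaleAndShift class, the target value 10.0, and the learning rate 0.1 are hypothetical choices made for this example, not part of the guide. It relies on tf.GradientTape watching only trainable variables by default, so the non-trainable variable is left untouched.

```python
import tensorflow as tf

class ScaleAndShift(tf.Module):  # hypothetical module, for illustration only
  def __init__(self, name=None):
    super().__init__(name=name)
    self.scale = tf.Variable(5.0, name="scale")
    self.shift = tf.Variable(5.0, trainable=False, name="shift")

  def __call__(self, x):
    return self.scale * x + self.shift

module = ScaleAndShift()

# Record the forward pass so gradients can be computed afterwards.
with tf.GradientTape() as tape:
  loss = tf.square(module(tf.constant(2.0)) - 10.0)

# Only the trainable variables are differentiated and updated.
grads = tape.gradient(loss, module.trainable_variables)
for var, grad in zip(module.trainable_variables, grads):
  var.assign_sub(0.1 * grad)

print("scale after one step:", module.scale.numpy())
print("shift after one step:", module.shift.numpy())
```

Because shift was created with trainable=False, it does not appear in module.trainable_variables and keeps its initial value.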

Note: tf.Module is the base class for both tf.keras.layers.Layer and tf.keras.Model, so everything you come across here also applies in Keras. For historical compatibility reasons, Keras layers do not collect variables from modules, so your models should use only modules or only Keras layers. However, the methods shown below for inspecting variables are the same in either case.

By subclassing tf.Module, any tf.Variable or tf.Module instances assigned to this object's properties are automatically collected. This allows you to save and load variables, and also to create collections of tf.Modules.

# All trainable variables
print("trainable variables:", simple_module.trainable_variables)
# Every variable
print("all variables:", simple_module.variables)

trainable variables: (<tf.Variable 'train_me:0' shape=() dtype=float32, numpy=5.0>,)
all variables: (<tf.Variable 'train_me:0' shape=() dtype=float32, numpy=5.0>, <tf.Variable 'do_not_train_me:0' shape=() dtype=float32, numpy=5.0>)
2021-10-26 01:29:45.284549: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.

This is an example of a two-layer linear model made out of modules.

First, a dense (linear) layer:

class Dense(tf.Module):
  def __init__(self, in_features, out_features, name=None):
    super().__init__(name=name)
    self.w = tf.Variable(
      tf.random.normal([in_features, out_features]), name='w')
    self.b = tf.Variable(tf.zeros([out_features]), name='b')
  def __call__(self, x):
    y = tf.matmul(x, self.w) + self.b
    return tf.nn.relu(y)

And then the complete model, which makes two layer instances and applies them:

class SequentialModule(tf.Module):
  def __init__(self, name=None):
    super().__init__(name=name)

    self.dense_1 = Dense(in_features=3, out_features=3)
    self.dense_2 = Dense(in_features=3, out_features=2)

  def __call__(self, x):
    x = self.dense_1(x)
    return self.dense_2(x)

# You have made a model!
my_model = SequentialModule(name="the_model")

# Call it, with random results
print("Model results:", my_model(tf.constant([[2.0, 2.0, 2.0]])))

Model results: tf.Tensor([[7.706234  3.0919805]], shape=(1, 2), dtype=float32)

tf.Module instances automatically and recursively collect any tf.Variable or tf.Module instances assigned to them. This allows you to manage collections of tf.Modules with a single model instance, and to save and load whole models.

print("Submodules:", my_model.submodules)

Submodules: (<__main__.Dense object at 0x7f7ab2391290>, <__main__.Dense object at 0x7f7b6869ea10>)

for var in my_model.variables:
  print(var, "\n")

<tf.Variable 'b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)> 

<tf.Variable 'w:0' shape=(3, 3) dtype=float32, numpy=
array([[ 0.05711935,  0.22440144,  0.6370985 ],
       [ 0.3136791 , -1.7006774 ,  0.7256515 ],
       [ 0.16120772, -0.8412193 ,  0.5250952 ]], dtype=float32)> 

<tf.Variable 'b:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)> 

<tf.Variable 'w:0' shape=(3, 2) dtype=float32, numpy=
array([[-0.5353216 ,  1.2815404 ],
       [ 0.62764466,  0.47087234],
       [ 2.19187   ,  0.45777202]], dtype=float32)>
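
The recursive collection can also be used to count parameters. The following is a minimal sketch under a few assumptions: the Linear and Stack classes are hypothetical names introduced here, and the example relies on tf.Module also finding submodules that are stored inside a plain Python list attribute.

```python
import tensorflow as tf

class Linear(tf.Module):  # hypothetical layer, for illustration only
  def __init__(self, in_features, out_features, name=None):
    super().__init__(name=name)
    self.w = tf.Variable(
        tf.random.normal([in_features, out_features]), name="w")
    self.b = tf.Variable(tf.zeros([out_features]), name="b")

  def __call__(self, x):
    return tf.matmul(x, self.w) + self.b

class Stack(tf.Module):  # hypothetical container, for illustration only
  def __init__(self, sizes, name=None):
    super().__init__(name=name)
    # Submodules kept in a list are still discovered by tf.Module.
    self.layers = [Linear(i, o) for i, o in zip(sizes[:-1], sizes[1:])]

  def __call__(self, x):
    for layer in self.layers:
      x = layer(x)
    return x

stack = Stack([3, 3, 2])
result = stack(tf.constant([[2.0, 2.0, 2.0]]))

# Sum the element counts of every collected variable.
num_params = sum(v.shape.num_elements() for v in stack.variables)
print("submodules:", len(stack.submodules))
print("parameters:", num_params)
```

With sizes [3, 3, 2] the count is (3*3 + 3) + (3*2 + 2) = 20 parameters.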

Waiting to create variables

You may have noticed that you have to define both the input and output sizes of the layer. This is so that the w variable has a known shape and can be allocated.

By deferring variable creation to the first time the module is called with a specific input shape, you do not need to specify the input size up front.

class FlexibleDenseModule(tf.Module):
  # Note: No need for `in_features`
  def __init__(self, out_features, name=None):
    super().__init__(name=name)
    self.is_built = False
    self.out_features = out_features

  def __call__(self, x):
    # Create variables on first call.
    if not self.is_built:
      self.w = tf.Variable(
        tf.random.normal([x.shape[-1], self.out_features]), name='w')
      self.b = tf.Variable(tf.zeros([self.out_features]), name='b')
      self.is_built = True

    y = tf.matmul(x, self.w) + self.b
    return tf.nn.relu(y)

# Used in a module
class MySequentialModule(tf.Module):
  def __init__(self, name=None):
    super().__init__(name=name)

    self.dense_1 = FlexibleDenseModule(out_features=3)
    self.dense_2 = FlexibleDenseModule(out_features=2)

  def __call__(self, x):
    x = self.dense_1(x)
    return self.dense_2(x)

my_model = MySequentialModule(name="the_model")
print("Model results:", my_model(tf.constant([[2.0, 2.0, 2.0]])))

Model results: tf.Tensor([[4.0598335 0.       ]], shape=(1, 2), dtype=float32)

This flexibility is why TensorFlow layers often only need to specify the shape of their outputs, such as in tf.keras.layers.Dense, rather than both the input and output size.
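
To make the deferred creation visible, the sketch below inspects a module before and after its first call. LazyLinear is a hypothetical name used only for this example; it follows the same pattern as the flexible module in this section, without the ReLU.

```python
import tensorflow as tf

class LazyLinear(tf.Module):  # hypothetical module, for illustration only
  def __init__(self, out_features, name=None):
    super().__init__(name=name)
    self.out_features = out_features
    self.is_built = False

  def __call__(self, x):
    # Variables are created on the first call, once the input size is known.
    if not self.is_built:
      in_features = x.shape[-1]
      self.w = tf.Variable(
          tf.random.normal([in_features, self.out_features]), name="w")
      self.b = tf.Variable(tf.zeros([self.out_features]), name="b")
      self.is_built = True
    return tf.matmul(x, self.w) + self.b

layer = LazyLinear(out_features=2)
variables_before = len(layer.variables)

outputs = layer(tf.ones([5, 4]))  # the input size 4 is inferred here
variable_shapes = sorted(tuple(v.shape) for v in layer.variables)

print("variables before first call:", variables_before)
print("variable shapes after first call:", variable_shapes)
```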

Saving weights

You can save a tf.Module as both a checkpoint and a SavedModel.

Checkpoints are just the weights (that is, the values of the set of variables inside the module and its submodules):

chkp_path = "my_checkpoint"
checkpoint = tf.train.Checkpoint(model=my_model)
checkpoint.write(chkp_path)

'my_checkpoint'

Checkpoints consist of two kinds of files: the data itself and an index file for metadata. The index file keeps track of what is actually saved and the numbering of checkpoints, while the checkpoint data contains the variable values and their attribute lookup paths.

ls my_checkpoint*

my_checkpoint.data-00000-of-00001  my_checkpoint.index

You can look inside a checkpoint to be sure the whole collection of variables is saved, sorted by the Python object that contains them.

tf.train.list_variables(chkp_path)

[('_CHECKPOINTABLE_OBJECT_GRAPH', []),
 ('model/dense_1/b/.ATTRIBUTES/VARIABLE_VALUE', [3]),
 ('model/dense_1/w/.ATTRIBUTES/VARIABLE_VALUE', [3, 3]),
 ('model/dense_2/b/.ATTRIBUTES/VARIABLE_VALUE', [2]),
 ('model/dense_2/w/.ATTRIBUTES/VARIABLE_VALUE', [3, 2])]

During distributed (multi-machine) training, checkpoints can be sharded, which is why the data files are numbered (for example, '00000-of-00001'). In this case, though, there is only one shard.

When you load models back in, you overwrite the values in your Python object.

new_model = MySequentialModule()
new_checkpoint = tf.train.Checkpoint(model=new_model)
new_checkpoint.restore("my_checkpoint")

# Should be the same result as above
new_model(tf.constant([[2.0, 2.0, 2.0]]))

<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[4.0598335, 0.       ]], dtype=float32)>
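
The overwrite behaviour can be checked with a small round trip. This is a sketch under assumptions: the Holder class and the temporary directory are hypothetical and exist only for this example; the tf.train.Checkpoint calls are the same ones used in this section.

```python
import os
import tempfile

import tensorflow as tf

class Holder(tf.Module):  # hypothetical module, for illustration only
  def __init__(self, name=None):
    super().__init__(name=name)
    self.value = tf.Variable([1.0, 2.0, 3.0], name="value")

# Write the weights of one object to a temporary location.
source = Holder()
path = os.path.join(tempfile.mkdtemp(), "ckpt")
tf.train.Checkpoint(model=source).write(path)

# A second object starts with different values...
target = Holder()
target.value.assign([0.0, 0.0, 0.0])

# ...which are overwritten when the checkpoint is restored.
tf.train.Checkpoint(model=target).restore(path)
restored = target.value.numpy().tolist()

saved_names = [name for name, _ in tf.train.list_variables(path)]
print("restored values:", restored)
print("saved entries:", saved_names)
```

The variable is stored under the attribute path through which it is reached from the checkpoint root, here model/value.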

Saving functions

TensorFlow can run models without the original Python objects, as demonstrated by TensorFlow Serving and TensorFlow Lite, even when you download a trained model from TensorFlow Hub.

TensorFlow needs to know how to do the computations described in Python, but without the original code. To do this, you can make a graph, which is described in the Introduction to graphs and functions guide.

This graph contains operations, or ops, that implement the function.

You can define a graph in the model above by adding the @tf.function decorator to indicate that this code should run as a graph.

class MySequentialModule(tf.Module):
  def __init__(self, name=None):
    super().__init__(name=name)

    self.dense_1 = Dense(in_features=3, out_features=3)
    self.dense_2 = Dense(in_features=3, out_features=2)

  @tf.function
  def __call__(self, x):
    x = self.dense_1(x)
    return self.dense_2(x)

# You have made a model with a graph!
my_model = MySequentialModule(name="the_model")

The module you have made works exactly the same as before. Each unique signature passed into the function creates a separate graph. Check the Introduction to graphs and functions guide for details.

print(my_model([[2.0, 2.0, 2.0]]))
print(my_model([[[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]]))

tf.Tensor([[0.62891716 0.        ]], shape=(1, 2), dtype=float32)
tf.Tensor(
[[[0.62891716 0.        ]
  [0.62891716 0.        ]]], shape=(1, 2, 2), dtype=float32)
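
One way to observe the "one graph per signature" behaviour is to record a Python side effect, which runs only while a function is being traced. The sketch below assumes the default tf.function settings; the double function and the trace_log list are hypothetical names used only here.

```python
import tensorflow as tf

trace_log = []  # filled only while tf.function is tracing

@tf.function
def double(x):
  # This append is plain Python, so it runs once per trace, not once per call.
  trace_log.append(tuple(x.shape))
  return x * 2

double(tf.constant([1.0, 2.0]))      # first call: traces for shape (2,)
double(tf.constant([3.0, 4.0]))      # same dtype and shape: reuses the graph
double(tf.constant([[1.0], [2.0]]))  # new shape (2, 1): traces again

print("traced input shapes:", trace_log)
```

Three calls produce only two traces, because the second call matches the signature of the first.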

You can visualize the graph by tracing it within a TensorBoard summary.

# Set up logging.
stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
logdir = "logs/func/%s" % stamp
writer = tf.summary.create_file_writer(logdir)

# Create a new model to get a fresh trace
# Otherwise the summary will not see the graph.
new_model = MySequentialModule()

# Bracket the function call with
# tf.summary.trace_on() and tf.summary.trace_export().
tf.summary.trace_on(graph=True)
tf.profiler.experimental.start(logdir)
# Call only one tf.function when tracing.
z = new_model(tf.constant([[2.0, 2.0, 2.0]]))
print(z)
with writer.as_default():
  tf.summary.trace_export(
      name="my_func_trace",
      step=0,
      profiler_outdir=logdir)

tf.Tensor([[0.         0.01750386]], shape=(1, 2), dtype=float32)

Launch TensorBoard to view the resulting trace:

#docs_infra: no_execute
%tensorboard --logdir logs/func

[Figure: a screenshot of the graph in TensorBoard]

Creating a `SavedModel`

The recommended way of sharing completely trained models is to use SavedModel. A SavedModel contains both a collection of functions and a collection of weights.

You can save the model you have just created as follows:

tf.saved_model.save(my_model, "the_saved_model")

INFO:tensorflow:Assets written to: the_saved_model/assets

# Inspect the SavedModel in the directory
ls -l the_saved_model

total 24
drwxr-sr-x 2 kbuilder kokoro  4096 Oct 26 01:29 assets
-rw-rw-r-- 1 kbuilder kokoro 14702 Oct 26 01:29 saved_model.pb
drwxr-sr-x 2 kbuilder kokoro  4096 Oct 26 01:29 variables

# The variables/ directory contains a checkpoint of the variables
ls -l the_saved_model/variables

total 8
-rw-rw-r-- 1 kbuilder kokoro 408 Oct 26 01:29 variables.data-00000-of-00001
-rw-rw-r-- 1 kbuilder kokoro 356 Oct 26 01:29 variables.index

The saved_model.pb file is a protocol buffer describing the functional tf.Graph.

Models and layers can be loaded from this representation without actually making an instance of the class that created it. This is desirable in situations where you do not have (or want) a Python interpreter, such as serving at scale or on an edge device, or in situations where the original Python code is not available or practical to use.

You can load the model as a new object:

new_model = tf.saved_model.load("the_saved_model")

new_model, created from loading a saved model, is an internal TensorFlow user object without any of the class knowledge. It is not of type SequentialModule.

isinstance(new_model, SequentialModule)

False

This new model works on the already-defined input signatures. You can't add more signatures to a model restored like this.

print(new_model([[2.0, 2.0, 2.0]]))
print(new_model([[[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]]))

tf.Tensor([[0.62891716 0.        ]], shape=(1, 2), dtype=float32)
tf.Tensor(
[[[0.62891716 0.        ]
  [0.62891716 0.        ]]], shape=(1, 2, 2), dtype=float32)

Thus, using SavedModel, you are able to save TensorFlow weights and graphs using tf.Module, and then load them again.
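
The full round trip can be sketched in a few lines. Several things here are assumptions made for the example rather than part of the guide: the Affine class, the temporary export directory, and the explicit input_signature, which fixes the accepted input to float32 tensors of shape [None, 3] instead of relying on earlier calls to create the traces.

```python
import os
import tempfile

import tensorflow as tf

class Affine(tf.Module):  # hypothetical module, for illustration only
  def __init__(self, name=None):
    super().__init__(name=name)
    self.w = tf.Variable(tf.ones([3, 2]), name="w")
    self.b = tf.Variable(tf.zeros([2]), name="b")

  # The explicit signature is an assumption of this sketch.
  @tf.function(input_signature=[tf.TensorSpec([None, 3], tf.float32)])
  def __call__(self, x):
    return tf.matmul(x, self.w) + self.b

original = Affine()
export_dir = os.path.join(tempfile.mkdtemp(), "affine_saved_model")
tf.saved_model.save(original, export_dir)

# The loaded object carries the graph and weights, not the Python class.
loaded = tf.saved_model.load(export_dir)
x = tf.constant([[2.0, 2.0, 2.0]])
same_output = bool(tf.reduce_all(tf.equal(original(x), loaded(x))))
is_original_class = isinstance(loaded, Affine)
output_values = loaded(x).numpy().tolist()

print("same output:", same_output)
print("is an Affine instance:", is_original_class)
```

The loaded object reproduces the computation but is not an instance of Affine, which matches the isinstance check shown earlier in this section.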

Keras models and layers

Note that up until this point, there has been no mention of Keras. You can build your own high-level API on top of tf.Module, and people have.

In this section, you will examine how Keras uses tf.Module. A complete user guide to Keras models can be found in the Keras guide.

Keras layers

tf.keras.layers.Layer is the base class of all Keras layers, and it inherits from tf.Module.

You can convert a module into a Keras layer just by swapping out the parent and then changing __call__ to call:

class MyDense(tf.keras.layers.Layer):
  # Adding **kwargs to support base Keras layer arguments
  def __init__(self, in_features, out_features, **kwargs):
    super().__init__(**kwargs)

    # This will soon move to the build step; see below
    self.w = tf.Variable(
      tf.random.normal([in_features, out_features]), name='w')
    self.b = tf.Variable(tf.zeros([out_features]), name='b')
  def call(self, x):
    y = tf.matmul(x, self.w) + self.b
    return tf.nn.relu(y)

simple_layer = MyDense(name="simple", in_features=3, out_features=3)

Keras layers have their own __call__ that does some bookkeeping, described in the next section, and then calls call(). You should notice no change in functionality.

simple_layer([[2.0, 2.0, 2.0]])

<tf.Tensor: shape=(1, 3), dtype=float32, numpy=array([[0.      , 0.179402, 0.      ]], dtype=float32)>

The `build` step

As noted, it is convenient in many cases to wait to create variables until you are sure of the input shape.

Keras layers come with an extra lifecycle step that allows you more flexibility in how you define your layers. This is defined in the build function.

build is called exactly once, and it is called with the shape of the input. It is usually used to create variables (weights).

You can rewrite the MyDense layer above to be flexible to the size of its inputs:

class FlexibleDense(tf.keras.layers.Layer):
  # Note the added `**kwargs`, as Keras supports many arguments
  def __init__(self, out_features, **kwargs):
    super().__init__(**kwargs)
    self.out_features = out_features

  def build(self, input_shape):  # Create the state of the layer (weights)
    self.w = tf.Variable(
      tf.random.normal([input_shape[-1], self.out_features]), name='w')
    self.b = tf.Variable(tf.zeros([self.out_features]), name='b')

  def call(self, inputs):  # Defines the computation from inputs to outputs
    return tf.matmul(inputs, self.w) + self.b

# Create the instance of the layer
flexible_dense = FlexibleDense(out_features=3)

At this point, the model has not been built, so there are no variables:

flexible_dense.variables

[]

Calling the layer allocates appropriately sized variables:

# Call it, with predictably random results
print("Model results:", flexible_dense(tf.constant([[2.0, 2.0, 2.0], [3.0, 3.0, 3.0]])))

Model results: tf.Tensor(
[[-1.6998017  1.6444504 -1.3103955]
 [-2.5497022  2.4666753 -1.9655929]], shape=(2, 3), dtype=float32)

flexible_dense.variables

[<tf.Variable 'flexible_dense/w:0' shape=(3, 3) dtype=float32, numpy=
 array([[ 1.277462  ,  0.5399406 , -0.301957  ],
        [-1.6277349 ,  0.7374014 , -1.7651852 ],
        [-0.49962795, -0.45511687,  1.4119445 ]], dtype=float32)>,
 <tf.Variable 'flexible_dense/b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>]

Since build is only called once, inputs will be rejected if the input shape is not compatible with the layer's variables:

try:
  print("Model results:", flexible_dense(tf.constant([[2.0, 2.0, 2.0, 2.0]])))
except tf.errors.InvalidArgumentError as e:
  print("Failed:", e)

Failed: In[0] mismatch In[1] shape: 4 vs. 3: [1,4] [3,3] 0 0 [Op:MatMul]
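
The build lifecycle can also be observed through the layer's built attribute. The sketch below is a variation, not the guide's own code: LazyDense is a hypothetical name, and it creates its weights with the layer's add_weight method instead of bare tf.Variable objects, on the assumption that this behaves consistently across Keras versions.

```python
import tensorflow as tf

class LazyDense(tf.keras.layers.Layer):  # hypothetical layer, for illustration
  def __init__(self, out_features, **kwargs):
    super().__init__(**kwargs)
    self.out_features = out_features

  def build(self, input_shape):
    # Runs once, on the first call, with the shape of the first input.
    self.w = self.add_weight(
        name="w",
        shape=(input_shape[-1], self.out_features),
        initializer="random_normal",
        trainable=True)
    self.b = self.add_weight(
        name="b",
        shape=(self.out_features,),
        initializer="zeros",
        trainable=True)

  def call(self, inputs):
    return tf.matmul(inputs, self.w) + self.b

layer = LazyDense(out_features=3)
built_before = layer.built
weights_before = len(layer.weights)

outputs = layer(tf.ones([2, 4]))  # triggers build with last dimension 4
built_after = layer.built
weight_shapes = [tuple(w.shape) for w in layer.weights]

print("built before / after:", built_before, built_after)
print("weight shapes:", weight_shapes)
```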

Keras layers have many more extra features, including:

Optional losses
Support for metrics
Built-in support for an optional training argument to differentiate between training and inference use
get_config and from_config methods that allow you to accurately store configurations to allow model cloning in Python

Read about them in the full guide to custom layers and models.
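
Two of these features, the training argument and the configuration methods, fit in a short sketch. ScaleWhenTraining and its factor parameter are hypothetical and chosen only to make the two code paths easy to tell apart; the example relies on the default from_config, which passes the stored configuration back to the constructor.

```python
import tensorflow as tf

class ScaleWhenTraining(tf.keras.layers.Layer):  # hypothetical layer
  def __init__(self, factor, **kwargs):
    super().__init__(**kwargs)
    self.factor = factor

  def call(self, inputs, training=False):
    # Behave differently in training and inference.
    if training:
      return inputs * self.factor
    return inputs

  def get_config(self):
    # Add the constructor argument to the base configuration.
    config = super().get_config()
    config.update({"factor": self.factor})
    return config

layer = ScaleWhenTraining(factor=2.0, name="scale")
x = tf.constant([1.0, 2.0])
train_output = layer(x, training=True).numpy().tolist()
infer_output = layer(x, training=False).numpy().tolist()

# Rebuild an equivalent layer from the stored configuration.
clone = ScaleWhenTraining.from_config(layer.get_config())

print("training output:", train_output)
print("inference output:", infer_output)
print("clone factor:", clone.factor)
```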

Keras models

You can define your model as nested Keras layers.

However, Keras also provides a full-featured model class called tf.keras.Model. It inherits from tf.keras.layers.Layer, so a Keras model can be used, nested, and saved in the same way as Keras layers. Keras models come with extra functionality that makes them easy to train, evaluate, load, save, and even train on multiple machines.

You can define the SequentialModule from above with nearly identical code, again converting __call__ to call() and changing the parent:

class MySequentialModel(tf.keras.Model):
  def __init__(self, name=None, **kwargs):
    super().__init__(**kwargs)

    self.dense_1 = FlexibleDense(out_features=3)
    self.dense_2 = FlexibleDense(out_features=2)
  def call(self, x):
    x = self.dense_1(x)
    return self.dense_2(x)

# You have made a Keras model!
my_sequential_model = MySequentialModel(name="the_model")

# Call it on a tensor, with random results
print("Model results:", my_sequential_model(tf.constant([[2.0, 2.0, 2.0]])))

Model results: tf.Tensor([[5.5604653 3.3511646]], shape=(1, 2), dtype=float32)

All the same features are available, including tracking variables and submodules.

my_sequential_model.variables

[<tf.Variable 'my_sequential_model/flexible_dense_1/w:0' shape=(3, 3) dtype=float32, numpy=
 array([[ 0.05627853, -0.9386015 , -0.77410126],
        [ 0.63149   ,  1.0802224 , -0.37785745],
        [-0.24788402, -1.1076807 , -0.5956209 ]], dtype=float32)>,
 <tf.Variable 'my_sequential_model/flexible_dense_1/b:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>,
 <tf.Variable 'my_sequential_model/flexible_dense_2/w:0' shape=(3, 2) dtype=float32, numpy=
 array([[-0.93912166,  0.77979285],
        [ 1.4049559 , -1.9380962 ],
        [-2.6039495 ,  0.30885765]], dtype=float32)>,
 <tf.Variable 'my_sequential_model/flexible_dense_2/b:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)>]

my_sequential_model.submodules

(<__main__.FlexibleDense at 0x7f7b48525550>,
 <__main__.FlexibleDense at 0x7f7b48508d10>)

Overriding tf.keras.Model is a very Pythonic approach to building TensorFlow models. If you are migrating models from other frameworks, this can be very straightforward.

If you are constructing models that are simple assemblages of existing layers and inputs, you can save time and space by using the functional API, which comes with additional features around model reconstruction and architecture.

Here is the same model with the functional API:

inputs = tf.keras.Input(shape=[3,])

x = FlexibleDense(3)(inputs)
x = FlexibleDense(2)(x)

my_functional_model = tf.keras.Model(inputs=inputs, outputs=x)

my_functional_model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 3)]               0         
_________________________________________________________________
flexible_dense_3 (FlexibleDe (None, 3)                 12        
_________________________________________________________________
flexible_dense_4 (FlexibleDe (None, 2)                 8         
=================================================================
Total params: 20
Trainable params: 20
Non-trainable params: 0
_________________________________________________________________

my_functional_model(tf.constant([[2.0, 2.0, 2.0]]))

<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[8.219393, 4.511119]], dtype=float32)>

The major difference here is that the input shape is specified up front as part of the functional construction process. The shape argument passed to tf.keras.Input does not have to be completely specified in this case; you can leave some dimensions as None.
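
The effect of an unspecified dimension can be seen with built-in layers. This sketch swaps in tf.keras.layers.Dense for the custom layers of this guide, purely to keep the example short; the layer sizes mirror the ones used here. The batch dimension is left as None, so the same model accepts batches of any size.

```python
import tensorflow as tf

# Only the feature dimension is fixed; the batch dimension stays None.
inputs = tf.keras.Input(shape=(3,))
hidden = tf.keras.layers.Dense(3, activation="relu")(inputs)
outputs = tf.keras.layers.Dense(2)(hidden)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

param_count = model.count_params()
small_batch = model(tf.ones([1, 3]))
large_batch = model(tf.ones([5, 3]))

print("parameters:", param_count)
print("output shapes:", small_batch.shape, large_batch.shape)
```

The parameter count is (3*3 + 3) + (3*2 + 2) = 20, the same total reported in the summary above.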

Saving Keras models

Keras models can be checkpointed, and that will look the same as for tf.Module.

Keras models can also be saved with tf.saved_model.save(), as they are modules. However, Keras models have convenience methods and other functionality:

my_sequential_model.save("exname_of_file")

INFO:tensorflow:Assets written to: exname_of_file/assets

Just as easily, they can be loaded back in:

reconstructed_model = tf.keras.models.load_model("exname_of_file")

WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.

Keras SavedModels also save metric, loss, and optimizer states.

This reconstructed model can be used and will produce the same result when called on the same data:

reconstructed_model(tf.constant([[2.0, 2.0, 2.0]]))

<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[5.5604653, 3.3511646]], dtype=float32)>

There is more to know about saving and serialization of Keras models, including providing configuration methods for custom layers for feature support. Check out the guide to saving and serialization.

What's next

If you want to know more details about Keras, you can follow the existing Keras guides.

Another example of a high-level API built on tf.Module is Sonnet from DeepMind, which is covered on their site.