עזרה להגן על שונית המחסום הגדולה עם TensorFlow על Kaggle הצטרפו אתגר

הנדסת תכונות באמצעות צינור TFX ו- Transform TensorFlow

טרנספורמציה של נתוני קלט ואימון מודל עם צינור TFX.

במדריך זה המבוסס על מחברת, ניצור ונפעיל צינור TFX כדי להטמיע נתוני קלט גולמיים ולעבד אותם מראש כראוי לאימון ML. מחברת זו מבוססת על צינור TFX שבנינו אימות נתונים באמצעות צינור TFX ו TensorFlow הדרכת אימות נתונים . אם עדיין לא קראת אותו, עליך לקרוא אותו לפני שתמשיך עם המחברת הזו.

אתה יכול להגדיל את איכות הניבוי של הנתונים שלך ו/או להפחית את הממדיות עם הנדסת תכונות. אחד היתרונות של שימוש ב-TFX הוא שאתה תכתוב את קוד הטרנספורמציה שלך פעם אחת, והטרנספורמציות שיתקבלו יהיו עקביות בין אימון להגשה על מנת למנוע הטיית אימון/הגשה.

אנחנו נוסיף Transform רכיב בצנרת. מרכיב Transform מיושם באמצעות tf.transform הספרייה.

אנא ראה הבנת TFX צנרת כדי ללמוד עוד על מושגים שונים TFX.

להכין

ראשית עלינו להתקין את חבילת TFX Python ולהוריד את מערך הנתונים בו נשתמש עבור המודל שלנו.

שדרוג פיפ

כדי להימנע משדרוג Pip במערכת בעת הפעלה מקומית, בדוק כדי לוודא שאנו פועלים ב-Colab. ניתן כמובן לשדרג מערכות מקומיות בנפרד.

try:
  import colab
  !pip install --upgrade pip
except:
  pass

התקן TFX

pip install -U tfx

הפעלת מחדש את זמן הריצה?

אם אתה משתמש ב-Google Colab, בפעם הראשונה שאתה מפעיל את התא שלמעלה, עליך להפעיל מחדש את זמן הריצה על ידי לחיצה מעל לחצן "התחל ריצה מחדש" או שימוש בתפריט "זמן ריצה > הפעל מחדש זמן ריצה...". זה בגלל האופן שבו קולאב טוען חבילות.

בדוק את גרסאות TensorFlow ו-TFX.

import tensorflow as tf
print('TensorFlow version: {}'.format(tf.__version__))
from tfx import v1 as tfx
print('TFX version: {}'.format(tfx.__version__))
TensorFlow version: 2.6.2
TFX version: 1.4.0

הגדר משתנים

ישנם כמה משתנים המשמשים להגדרת צינור. אתה יכול להתאים אישית את המשתנים האלה כרצונך. כברירת מחדל, כל הפלט מהצינור ייווצר תחת הספרייה הנוכחית.

import os

PIPELINE_NAME = "penguin-transform"

# Output directory to store artifacts generated from the pipeline.
PIPELINE_ROOT = os.path.join('pipelines', PIPELINE_NAME)
# Path to a SQLite DB file to use as an MLMD storage.
METADATA_PATH = os.path.join('metadata', PIPELINE_NAME, 'metadata.db')
# Output directory where created models from the pipeline will be exported.
SERVING_MODEL_DIR = os.path.join('serving_model', PIPELINE_NAME)

from absl import logging
logging.set_verbosity(logging.INFO)  # Set default logging level.

הכן נתונים לדוגמה

נוריד את מערך הנתונים לדוגמה לשימוש בצינור ה-TFX שלנו. בסיס הנתונים שאנו משתמשים הוא במערך הפינגווינים פאלמר .

עם זאת, בניגוד הדרכות קודמות אשר השתמשו במערך מעובד כבר, נשתמש במערך פאלמר הפינגווינים גלם.

מכיוון שרכיב TFX ExampleGen קורא קלט מספריה, עלינו ליצור ספרייה ולהעתיק אליה את מערך הנתונים.

import urllib.request
import tempfile

DATA_ROOT = tempfile.mkdtemp(prefix='tfx-data')  # Create a temporary directory.
_data_path = 'https://storage.googleapis.com/download.tensorflow.org/data/palmer_penguins/penguins_size.csv'
_data_filepath = os.path.join(DATA_ROOT, "data.csv")
urllib.request.urlretrieve(_data_path, _data_filepath)
('/tmp/tfx-dataacmxfq9f/data.csv', <http.client.HTTPMessage at 0x7f5b0ab1bf10>)

תסתכל במהירות איך הנתונים הגולמיים נראים.

head {_data_filepath}
species,island,culmen_length_mm,culmen_depth_mm,flipper_length_mm,body_mass_g,sex
Adelie,Torgersen,39.1,18.7,181,3750,MALE
Adelie,Torgersen,39.5,17.4,186,3800,FEMALE
Adelie,Torgersen,40.3,18,195,3250,FEMALE
Adelie,Torgersen,NA,NA,NA,NA,NA
Adelie,Torgersen,36.7,19.3,193,3450,FEMALE
Adelie,Torgersen,39.3,20.6,190,3650,MALE
Adelie,Torgersen,38.9,17.8,181,3625,FEMALE
Adelie,Torgersen,39.2,19.6,195,4675,MALE
Adelie,Torgersen,34.1,18.1,193,3475,NA

קיימים מספר ערכים עם חסר ערכים אשר מיוצגים NA . אנו פשוט נמחק את הערכים הללו במדריך זה.

sed -i '/\bNA\b/d' {_data_filepath}
head {_data_filepath}
species,island,culmen_length_mm,culmen_depth_mm,flipper_length_mm,body_mass_g,sex
Adelie,Torgersen,39.1,18.7,181,3750,MALE
Adelie,Torgersen,39.5,17.4,186,3800,FEMALE
Adelie,Torgersen,40.3,18,195,3250,FEMALE
Adelie,Torgersen,36.7,19.3,193,3450,FEMALE
Adelie,Torgersen,39.3,20.6,190,3650,MALE
Adelie,Torgersen,38.9,17.8,181,3625,FEMALE
Adelie,Torgersen,39.2,19.6,195,4675,MALE
Adelie,Torgersen,41.1,17.6,182,3200,FEMALE
Adelie,Torgersen,38.6,21.2,191,3800,MALE

אתה אמור להיות מסוגל לראות שבע תכונות המתארות פינגווינים. אנו נשתמש באותה קבוצת תכונות כמו המדריכים הקודמים - 'culmen_length_mm', 'culmen_depth_mm', 'flipper_length_mm', 'body_mass_g' - ונחזה את ה'זן' של פינגווין.

ההבדל היחיד יהיה שנתוני הקלט אינם מעובדים מראש. שימו לב שלא נשתמש בתכונות אחרות כמו 'אי' או 'סקס' במדריך זה.

הכן קובץ סכימה

כפי שתואר אימות נתונים באמצעות TFX צינור ו TensorFlow נתונים אימות הדרכה , אנחנו צריכים קובץ סכימה עבור בסיס הנתונים. מכיוון שמערך הנתונים שונה מהמדריך הקודם, עלינו ליצור אותו שוב. במדריך זה, נדלג על השלבים הללו ופשוט נשתמש בקובץ סכימה מוכן.

import shutil

SCHEMA_PATH = 'schema'

_schema_uri = 'https://raw.githubusercontent.com/tensorflow/tfx/master/tfx/examples/penguin/schema/raw/schema.pbtxt'
_schema_filename = 'schema.pbtxt'
_schema_filepath = os.path.join(SCHEMA_PATH, _schema_filename)

os.makedirs(SCHEMA_PATH, exist_ok=True)
urllib.request.urlretrieve(_schema_uri, _schema_filepath)
('schema/schema.pbtxt', <http.client.HTTPMessage at 0x7f5b0ab20f50>)

קובץ הסכימה הזה נוצר עם אותו צינור כמו במדריך הקודם ללא שינויים ידניים.

צור צינור

צינורות TFX מוגדרים באמצעות ממשקי API של Python. נוסיף Transform רכיב בצנרת יצרנו ב הדרכת אימות נתונים .

מרכיב המרה דורש נתון קלט מתוך ExampleGen רכיב ו סכימה של SchemaGen רכיב, וגם מייצר "להפוך גרף". הפלט יהיה לשמש Trainer רכיב. טרנספורמציה יכולה, באופן אופציונלי, לייצר "נתונים שעברו שינוי" בנוסף, שהם הנתונים שהתממשו לאחר השינוי. עם זאת, אנו נשנה נתונים במהלך האימון במדריך זה מבלי להתממש של הנתונים שעברו טרנספורמציה ביניים.

דבר אחד לציין הוא כי אנו צריכים להגדיר פונקציה Python, preprocessing_fn לתאר כיצד נתוני הקלט צריך להיות טרנספורמציה. זה דומה לרכיב Trainer שגם דורש קוד משתמש להגדרת המודל.

כתוב קוד עיבוד מקדים והדרכה

עלינו להגדיר שתי פונקציות של Python. אחד לטרנספורם ואחד למאמן.

preprocessing_fn

מרכיב Transform ימצא פונקציה בשם preprocessing_fn בקובץ מודול נתון כפי שעשינו עבור Trainer רכיב. ניתן גם לציין תפקיד ספציפי באמצעות preprocessing_fn פרמטר של רכיב המרה.

בדוגמה זו, נעשה שני סוגים של טרנספורמציה. עבור תכונות מספריות רציפות כמו culmen_length_mm ו body_mass_g , נוכל לנרמל את הערכים הללו באמצעות tft.scale_to_z_score הפונקציה. עבור תכונת התווית, עלינו להמיר תוויות מחרוזות לערכי אינדקס מספריים. נשתמש tf.lookup.StaticHashTable לגיור.

כדי לזהות שדות להפוך בקלות, אנחנו לצרף _xf סיומת לשמות תכונה טרנספורמציה.

run_fn

המודל עצמו כמעט זהה למדריכים הקודמים, אך הפעם נמיר את נתוני הקלט באמצעות גרף הטרנספורמציה מרכיב ה-Transform.

הבדל אחד חשוב יותר בהשוואה למדריך הקודם הוא שכעת אנו מייצאים מודל להגשה הכולל לא רק את גרף החישוב של המודל, אלא גם את גרף הטרנספורמציה לעיבוד מקדים, שנוצר ב-Transform component. עלינו להגדיר פונקציה נפרדת שתשמש למתן בקשות נכנסות. ניתן לראות כי אותו תפקיד _apply_preprocessing שימש הן של נתוני האימונים והבקשה ההגשה.

_module_file = 'penguin_utils.py'
%%writefile {_module_file}


from typing import List, Text
from absl import logging
import tensorflow as tf
from tensorflow import keras
from tensorflow_metadata.proto.v0 import schema_pb2
import tensorflow_transform as tft
from tensorflow_transform.tf_metadata import schema_utils

from tfx import v1 as tfx
from tfx_bsl.public import tfxio

# Specify features that we will use.
_FEATURE_KEYS = [
    'culmen_length_mm', 'culmen_depth_mm', 'flipper_length_mm', 'body_mass_g'
]
_LABEL_KEY = 'species'

_TRAIN_BATCH_SIZE = 20
_EVAL_BATCH_SIZE = 10


# NEW: TFX Transform will call this function.
def preprocessing_fn(inputs):
  """tf.transform's callback function for preprocessing inputs.

  Args:
    inputs: map from feature keys to raw not-yet-transformed features.

  Returns:
    Map from string feature key to transformed feature.
  """
  outputs = {}

  # Uses features defined in _FEATURE_KEYS only.
  for key in _FEATURE_KEYS:
    # tft.scale_to_z_score computes the mean and variance of the given feature
    # and scales the output based on the result.
    outputs[key] = tft.scale_to_z_score(inputs[key])

  # For the label column we provide the mapping from string to index.
  # We could instead use `tft.compute_and_apply_vocabulary()` in order to
  # compute the vocabulary dynamically and perform a lookup.
  # Since in this example there are only 3 possible values, we use a hard-coded
  # table for simplicity.
  table_keys = ['Adelie', 'Chinstrap', 'Gentoo']
  initializer = tf.lookup.KeyValueTensorInitializer(
      keys=table_keys,
      values=tf.cast(tf.range(len(table_keys)), tf.int64),
      key_dtype=tf.string,
      value_dtype=tf.int64)
  table = tf.lookup.StaticHashTable(initializer, default_value=-1)
  outputs[_LABEL_KEY] = table.lookup(inputs[_LABEL_KEY])

  return outputs


# NEW: This function will apply the same transform operation to training data
#      and serving requests.
def _apply_preprocessing(raw_features, tft_layer):
  transformed_features = tft_layer(raw_features)
  if _LABEL_KEY in raw_features:
    transformed_label = transformed_features.pop(_LABEL_KEY)
    return transformed_features, transformed_label
  else:
    return transformed_features, None


# NEW: This function will create a handler function which gets a serialized
#      tf.example, preprocess and run an inference with it.
def _get_serve_tf_examples_fn(model, tf_transform_output):
  # We must save the tft_layer to the model to ensure its assets are kept and
  # tracked.
  model.tft_layer = tf_transform_output.transform_features_layer()

  @tf.function(input_signature=[
      tf.TensorSpec(shape=[None], dtype=tf.string, name='examples')
  ])
  def serve_tf_examples_fn(serialized_tf_examples):
    # Expected input is a string which is serialized tf.Example format.
    feature_spec = tf_transform_output.raw_feature_spec()
    # Because input schema includes unnecessary fields like 'species' and
    # 'island', we filter feature_spec to include required keys only.
    required_feature_spec = {
        k: v for k, v in feature_spec.items() if k in _FEATURE_KEYS
    }
    parsed_features = tf.io.parse_example(serialized_tf_examples,
                                          required_feature_spec)

    # Preprocess parsed input with transform operation defined in
    # preprocessing_fn().
    transformed_features, _ = _apply_preprocessing(parsed_features,
                                                   model.tft_layer)
    # Run inference with ML model.
    return model(transformed_features)

  return serve_tf_examples_fn


def _input_fn(file_pattern: List[Text],
              data_accessor: tfx.components.DataAccessor,
              tf_transform_output: tft.TFTransformOutput,
              batch_size: int = 200) -> tf.data.Dataset:
  """Generates features and label for tuning/training.

  Args:
    file_pattern: List of paths or patterns of input tfrecord files.
    data_accessor: DataAccessor for converting input to RecordBatch.
    tf_transform_output: A TFTransformOutput.
    batch_size: representing the number of consecutive elements of returned
      dataset to combine in a single batch

  Returns:
    A dataset that contains (features, indices) tuple where features is a
      dictionary of Tensors, and indices is a single Tensor of label indices.
  """
  dataset = data_accessor.tf_dataset_factory(
      file_pattern,
      tfxio.TensorFlowDatasetOptions(batch_size=batch_size),
      schema=tf_transform_output.raw_metadata.schema)

  transform_layer = tf_transform_output.transform_features_layer()
  def apply_transform(raw_features):
    return _apply_preprocessing(raw_features, transform_layer)

  return dataset.map(apply_transform).repeat()


def _build_keras_model() -> tf.keras.Model:
  """Creates a DNN Keras model for classifying penguin data.

  Returns:
    A Keras Model.
  """
  # The model below is built with Functional API, please refer to
  # https://www.tensorflow.org/guide/keras/overview for all API options.
  inputs = [
      keras.layers.Input(shape=(1,), name=key)
      for key in _FEATURE_KEYS
  ]
  d = keras.layers.concatenate(inputs)
  for _ in range(2):
    d = keras.layers.Dense(8, activation='relu')(d)
  outputs = keras.layers.Dense(3)(d)

  model = keras.Model(inputs=inputs, outputs=outputs)
  model.compile(
      optimizer=keras.optimizers.Adam(1e-2),
      loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
      metrics=[keras.metrics.SparseCategoricalAccuracy()])

  model.summary(print_fn=logging.info)
  return model


# TFX Trainer will call this function.
def run_fn(fn_args: tfx.components.FnArgs):
  """Train the model based on given args.

  Args:
    fn_args: Holds args used to train the model as name/value pairs.
  """
  tf_transform_output = tft.TFTransformOutput(fn_args.transform_output)

  train_dataset = _input_fn(
      fn_args.train_files,
      fn_args.data_accessor,
      tf_transform_output,
      batch_size=_TRAIN_BATCH_SIZE)
  eval_dataset = _input_fn(
      fn_args.eval_files,
      fn_args.data_accessor,
      tf_transform_output,
      batch_size=_EVAL_BATCH_SIZE)

  model = _build_keras_model()
  model.fit(
      train_dataset,
      steps_per_epoch=fn_args.train_steps,
      validation_data=eval_dataset,
      validation_steps=fn_args.eval_steps)

  # NEW: Save a computation graph including transform layer.
  signatures = {
      'serving_default': _get_serve_tf_examples_fn(model, tf_transform_output),
  }
  model.save(fn_args.serving_model_dir, save_format='tf', signatures=signatures)
Writing penguin_utils.py

כעת השלמת את כל שלבי ההכנה לבניית צינור TFX.

כתוב הגדרת צינור

אנו מגדירים פונקציה ליצירת צינור TFX. Pipeline אובייקט מייצג צינור TFX, אשר ניתן להפעיל באמצעות אחת ממערכות תזמור צינור תומך TFX.

def _create_pipeline(pipeline_name: str, pipeline_root: str, data_root: str,
                     schema_path: str, module_file: str, serving_model_dir: str,
                     metadata_path: str) -> tfx.dsl.Pipeline:
  """Implements the penguin pipeline with TFX."""
  # Brings data into the pipeline or otherwise joins/converts training data.
  example_gen = tfx.components.CsvExampleGen(input_base=data_root)

  # Computes statistics over data for visualization and example validation.
  statistics_gen = tfx.components.StatisticsGen(
      examples=example_gen.outputs['examples'])

  # Import the schema.
  schema_importer = tfx.dsl.Importer(
      source_uri=schema_path,
      artifact_type=tfx.types.standard_artifacts.Schema).with_id(
          'schema_importer')

  # Performs anomaly detection based on statistics and data schema.
  example_validator = tfx.components.ExampleValidator(
      statistics=statistics_gen.outputs['statistics'],
      schema=schema_importer.outputs['result'])

  # NEW: Transforms input data using preprocessing_fn in the 'module_file'.
  transform = tfx.components.Transform(
      examples=example_gen.outputs['examples'],
      schema=schema_importer.outputs['result'],
      materialize=False,
      module_file=module_file)

  # Uses user-provided Python function that trains a model.
  trainer = tfx.components.Trainer(
      module_file=module_file,
      examples=example_gen.outputs['examples'],

      # NEW: Pass transform_graph to the trainer.
      transform_graph=transform.outputs['transform_graph'],

      train_args=tfx.proto.TrainArgs(num_steps=100),
      eval_args=tfx.proto.EvalArgs(num_steps=5))

  # Pushes the model to a filesystem destination.
  pusher = tfx.components.Pusher(
      model=trainer.outputs['model'],
      push_destination=tfx.proto.PushDestination(
          filesystem=tfx.proto.PushDestination.Filesystem(
              base_directory=serving_model_dir)))

  components = [
      example_gen,
      statistics_gen,
      schema_importer,
      example_validator,

      transform,  # NEW: Transform component was added to the pipeline.

      trainer,
      pusher,
  ]

  return tfx.dsl.Pipeline(
      pipeline_name=pipeline_name,
      pipeline_root=pipeline_root,
      metadata_connection_config=tfx.orchestration.metadata
      .sqlite_metadata_connection_config(metadata_path),
      components=components)

הפעל את הצינור

נשתמש LocalDagRunner כמו ההדרכה הקודמת.

tfx.orchestration.LocalDagRunner().run(
  _create_pipeline(
      pipeline_name=PIPELINE_NAME,
      pipeline_root=PIPELINE_ROOT,
      data_root=DATA_ROOT,
      schema_path=SCHEMA_PATH,
      module_file=_module_file,
      serving_model_dir=SERVING_MODEL_DIR,
      metadata_path=METADATA_PATH))
INFO:absl:Excluding no splits because exclude_splits is not set.
INFO:absl:Excluding no splits because exclude_splits is not set.
INFO:absl:Generating ephemeral wheel package for '/tmpfs/src/temp/docs/tutorials/tfx/penguin_utils.py' (including modules: ['penguin_utils']).
INFO:absl:User module package has hash fingerprint version a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.
INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '/tmp/tmp_rl2wpg3/_tfx_generated_setup.py', 'bdist_wheel', '--bdist-dir', '/tmp/tmps7emqvj6', '--dist-dir', '/tmp/tmpnvanprdd']
/tmpfs/src/tf_docs_env/lib/python3.7/site-packages/setuptools/command/install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  setuptools.SetuptoolsDeprecationWarning,
listing git files failed - pretending there aren't any
INFO:absl:Successfully built user code wheel distribution at 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl'; target user module is 'penguin_utils'.
INFO:absl:Full user module path is 'penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl'
INFO:absl:Generating ephemeral wheel package for '/tmpfs/src/temp/docs/tutorials/tfx/penguin_utils.py' (including modules: ['penguin_utils']).
INFO:absl:User module package has hash fingerprint version a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.
INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '/tmp/tmpi9sy085o/_tfx_generated_setup.py', 'bdist_wheel', '--bdist-dir', '/tmp/tmpugc_ecw_', '--dist-dir', '/tmp/tmpr1xz5bg6']
running bdist_wheel
running build
running build_py
creating build
creating build/lib
copying penguin_utils.py -> build/lib
installing to /tmp/tmps7emqvj6
running install
running install_lib
copying build/lib/penguin_utils.py -> /tmp/tmps7emqvj6
running install_egg_info
running egg_info
creating tfx_user_code_Transform.egg-info
writing tfx_user_code_Transform.egg-info/PKG-INFO
writing dependency_links to tfx_user_code_Transform.egg-info/dependency_links.txt
writing top-level names to tfx_user_code_Transform.egg-info/top_level.txt
writing manifest file 'tfx_user_code_Transform.egg-info/SOURCES.txt'
reading manifest file 'tfx_user_code_Transform.egg-info/SOURCES.txt'
writing manifest file 'tfx_user_code_Transform.egg-info/SOURCES.txt'
Copying tfx_user_code_Transform.egg-info to /tmp/tmps7emqvj6/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3.7.egg-info
running install_scripts
creating /tmp/tmps7emqvj6/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/WHEEL
creating '/tmp/tmpnvanprdd/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl' and adding '/tmp/tmps7emqvj6' to it
adding 'penguin_utils.py'
adding 'tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/METADATA'
adding 'tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/WHEEL'
adding 'tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/top_level.txt'
adding 'tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/RECORD'
removing /tmp/tmps7emqvj6
/tmpfs/src/tf_docs_env/lib/python3.7/site-packages/setuptools/command/install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
  setuptools.SetuptoolsDeprecationWarning,
listing git files failed - pretending there aren't any
INFO:absl:Successfully built user code wheel distribution at 'pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl'; target user module is 'penguin_utils'.
INFO:absl:Full user module path is 'penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl'
INFO:absl:Using deployment config:
 executor_specs {
  key: "CsvExampleGen"
  value {
    beam_executable_spec {
      python_executor_spec {
        class_path: "tfx.components.example_gen.csv_example_gen.executor.Executor"
      }
    }
  }
}
executor_specs {
  key: "ExampleValidator"
  value {
    python_class_executable_spec {
      class_path: "tfx.components.example_validator.executor.Executor"
    }
  }
}
executor_specs {
  key: "Pusher"
  value {
    python_class_executable_spec {
      class_path: "tfx.components.pusher.executor.Executor"
    }
  }
}
executor_specs {
  key: "StatisticsGen"
  value {
    beam_executable_spec {
      python_executor_spec {
        class_path: "tfx.components.statistics_gen.executor.Executor"
      }
    }
  }
}
executor_specs {
  key: "Trainer"
  value {
    python_class_executable_spec {
      class_path: "tfx.components.trainer.executor.GenericExecutor"
    }
  }
}
executor_specs {
  key: "Transform"
  value {
    beam_executable_spec {
      python_executor_spec {
        class_path: "tfx.components.transform.executor.Executor"
      }
    }
  }
}
custom_driver_specs {
  key: "CsvExampleGen"
  value {
    python_class_executable_spec {
      class_path: "tfx.components.example_gen.driver.FileBasedDriver"
    }
  }
}
metadata_connection_config {
  sqlite {
    filename_uri: "metadata/penguin-transform/metadata.db"
    connection_mode: READWRITE_OPENCREATE
  }
}

INFO:absl:Using connection config:
 sqlite {
  filename_uri: "metadata/penguin-transform/metadata.db"
  connection_mode: READWRITE_OPENCREATE
}

INFO:absl:Component CsvExampleGen is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.example_gen.csv_example_gen.component.CsvExampleGen"
  }
  id: "CsvExampleGen"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-12-05T10:21:51.187624"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.CsvExampleGen"
      }
    }
  }
}
outputs {
  outputs {
    key: "examples"
    value {
      artifact_spec {
        type {
          name: "Examples"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
          properties {
            key: "version"
            value: INT
          }
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "input_base"
    value {
      field_value {
        string_value: "/tmp/tfx-dataacmxfq9f"
      }
    }
  }
  parameters {
    key: "input_config"
    value {
      field_value {
        string_value: "{\n  \"splits\": [\n    {\n      \"name\": \"single_split\",\n      \"pattern\": \"*\"\n    }\n  ]\n}"
      }
    }
  }
  parameters {
    key: "output_config"
    value {
      field_value {
        string_value: "{\n  \"split_config\": {\n    \"splits\": [\n      {\n        \"hash_buckets\": 2,\n        \"name\": \"train\"\n      },\n      {\n        \"hash_buckets\": 1,\n        \"name\": \"eval\"\n      }\n    ]\n  }\n}"
      }
    }
  }
  parameters {
    key: "output_data_format"
    value {
      field_value {
        int_value: 6
      }
    }
  }
  parameters {
    key: "output_file_format"
    value {
      field_value {
        int_value: 5
      }
    }
  }
}
downstream_nodes: "StatisticsGen"
downstream_nodes: "Trainer"
downstream_nodes: "Transform"
execution_options {
  caching_options {
  }
}

INFO:absl:MetadataStore with DB connection initialized
running bdist_wheel
running build
running build_py
creating build
creating build/lib
copying penguin_utils.py -> build/lib
installing to /tmp/tmpugc_ecw_
running install
running install_lib
copying build/lib/penguin_utils.py -> /tmp/tmpugc_ecw_
running install_egg_info
running egg_info
creating tfx_user_code_Trainer.egg-info
writing tfx_user_code_Trainer.egg-info/PKG-INFO
writing dependency_links to tfx_user_code_Trainer.egg-info/dependency_links.txt
writing top-level names to tfx_user_code_Trainer.egg-info/top_level.txt
writing manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt'
reading manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt'
writing manifest file 'tfx_user_code_Trainer.egg-info/SOURCES.txt'
Copying tfx_user_code_Trainer.egg-info to /tmp/tmpugc_ecw_/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3.7.egg-info
running install_scripts
creating /tmp/tmpugc_ecw_/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/WHEEL
creating '/tmp/tmpr1xz5bg6/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl' and adding '/tmp/tmpugc_ecw_' to it
adding 'penguin_utils.py'
adding 'tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/METADATA'
adding 'tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/WHEEL'
adding 'tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/top_level.txt'
adding 'tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9.dist-info/RECORD'
removing /tmp/tmpugc_ecw_
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1205 10:21:51.351922 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type
I1205 10:21:52.158721 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type
I1205 10:21:52.173334 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type
I1205 10:21:52.180279 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type
INFO:absl:select span and version = (0, None)
INFO:absl:latest span and version = (0, None)
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Going to run a new execution 1
I1205 10:21:52.194584 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type
INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=1, input_dict={}, output_dict=defaultdict(<class 'list'>, {'examples': [Artifact(artifact: uri: "pipelines/penguin-transform/CsvExampleGen/examples/1"
custom_properties {
  key: "input_fingerprint"
  value {
    string_value: "split:single_split,num_files:1,total_bytes:13161,xor_checksum:1638699709,sum_checksum:1638699709"
  }
}
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:CsvExampleGen:examples:0"
  }
}
custom_properties {
  key: "span"
  value {
    int_value: 0
  }
}
, artifact_type: name: "Examples"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
properties {
  key: "version"
  value: INT
}
)]}), exec_properties={'output_config': '{\n  "split_config": {\n    "splits": [\n      {\n        "hash_buckets": 2,\n        "name": "train"\n      },\n      {\n        "hash_buckets": 1,\n        "name": "eval"\n      }\n    ]\n  }\n}', 'input_config': '{\n  "splits": [\n    {\n      "name": "single_split",\n      "pattern": "*"\n    }\n  ]\n}', 'output_file_format': 5, 'output_data_format': 6, 'input_base': '/tmp/tfx-dataacmxfq9f', 'span': 0, 'version': None, 'input_fingerprint': 'split:single_split,num_files:1,total_bytes:13161,xor_checksum:1638699709,sum_checksum:1638699709'}, execution_output_uri='pipelines/penguin-transform/CsvExampleGen/.system/executor_execution/1/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/CsvExampleGen/.system/stateful_working_dir/2021-12-05T10:21:51.187624', tmp_dir='pipelines/penguin-transform/CsvExampleGen/.system/executor_execution/1/.temp/', pipeline_node=node_info {
  type {
    name: "tfx.components.example_gen.csv_example_gen.component.CsvExampleGen"
  }
  id: "CsvExampleGen"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-12-05T10:21:51.187624"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.CsvExampleGen"
      }
    }
  }
}
outputs {
  outputs {
    key: "examples"
    value {
      artifact_spec {
        type {
          name: "Examples"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
          properties {
            key: "version"
            value: INT
          }
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "input_base"
    value {
      field_value {
        string_value: "/tmp/tfx-dataacmxfq9f"
      }
    }
  }
  parameters {
    key: "input_config"
    value {
      field_value {
        string_value: "{\n  \"splits\": [\n    {\n      \"name\": \"single_split\",\n      \"pattern\": \"*\"\n    }\n  ]\n}"
      }
    }
  }
  parameters {
    key: "output_config"
    value {
      field_value {
        string_value: "{\n  \"split_config\": {\n    \"splits\": [\n      {\n        \"hash_buckets\": 2,\n        \"name\": \"train\"\n      },\n      {\n        \"hash_buckets\": 1,\n        \"name\": \"eval\"\n      }\n    ]\n  }\n}"
      }
    }
  }
  parameters {
    key: "output_data_format"
    value {
      field_value {
        int_value: 6
      }
    }
  }
  parameters {
    key: "output_file_format"
    value {
      field_value {
        int_value: 5
      }
    }
  }
}
downstream_nodes: "StatisticsGen"
downstream_nodes: "Trainer"
downstream_nodes: "Transform"
execution_options {
  caching_options {
  }
}
, pipeline_info=id: "penguin-transform"
, pipeline_run_id='2021-12-05T10:21:51.187624')
INFO:absl:Generating examples.
WARNING:apache_beam.runners.interactive.interactive_environment:Dependencies required for Interactive Beam PCollection visualization are not available, please use: `pip install apache-beam[interactive]` to install necessary dependencies to enable all data visualization features.
INFO:absl:Processing input csv data /tmp/tfx-dataacmxfq9f/* to TFExample.
WARNING:root:Make sure that locally built Python SDK docker image has Python 3.7 interpreter.
WARNING:apache_beam.io.tfrecordio:Couldn't find python-snappy so the implementation of _TFRecordUtil._masked_crc32c is not as fast as it could be.
INFO:absl:Examples generated.
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 1 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'examples': [Artifact(artifact: uri: "pipelines/penguin-transform/CsvExampleGen/examples/1"
custom_properties {
  key: "input_fingerprint"
  value {
    string_value: "split:single_split,num_files:1,total_bytes:13161,xor_checksum:1638699709,sum_checksum:1638699709"
  }
}
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:CsvExampleGen:examples:0"
  }
}
custom_properties {
  key: "span"
  value {
    int_value: 0
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
, artifact_type: name: "Examples"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
properties {
  key: "version"
  value: INT
}
)]}) for execution 1
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Component CsvExampleGen is finished.
INFO:absl:Component schema_importer is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.dsl.components.common.importer.Importer"
  }
  id: "schema_importer"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-12-05T10:21:51.187624"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.schema_importer"
      }
    }
  }
}
outputs {
  outputs {
    key: "result"
    value {
      artifact_spec {
        type {
          name: "Schema"
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "artifact_uri"
    value {
      field_value {
        string_value: "schema"
      }
    }
  }
  parameters {
    key: "reimport"
    value {
      field_value {
        int_value: 0
      }
    }
  }
}
downstream_nodes: "ExampleValidator"
downstream_nodes: "Transform"
execution_options {
  caching_options {
  }
}

INFO:absl:Running as an importer node.
INFO:absl:MetadataStore with DB connection initialized
I1205 10:21:53.330585 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type
INFO:absl:Processing source uri: schema, properties: {}, custom_properties: {}
I1205 10:21:53.340232 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type
INFO:absl:Component schema_importer is finished.
INFO:absl:Component StatisticsGen is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.statistics_gen.component.StatisticsGen"
  }
  id: "StatisticsGen"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-12-05T10:21:51.187624"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.StatisticsGen"
      }
    }
  }
}
inputs {
  inputs {
    key: "examples"
    value {
      channels {
        producer_node_query {
          id: "CsvExampleGen"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-12-05T10:21:51.187624"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.CsvExampleGen"
            }
          }
        }
        artifact_query {
          type {
            name: "Examples"
          }
        }
        output_key: "examples"
      }
      min_count: 1
    }
  }
}
outputs {
  outputs {
    key: "statistics"
    value {
      artifact_spec {
        type {
          name: "ExampleStatistics"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "exclude_splits"
    value {
      field_value {
        string_value: "[]"
      }
    }
  }
}
upstream_nodes: "CsvExampleGen"
downstream_nodes: "ExampleValidator"
execution_options {
  caching_options {
  }
}

INFO:absl:MetadataStore with DB connection initialized
I1205 10:21:53.360662 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Going to run a new execution 3
INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=3, input_dict={'examples': [Artifact(artifact: id: 1
type_id: 15
uri: "pipelines/penguin-transform/CsvExampleGen/examples/1"
properties {
  key: "split_names"
  value {
    string_value: "[\"train\", \"eval\"]"
  }
}
custom_properties {
  key: "file_format"
  value {
    string_value: "tfrecords_gzip"
  }
}
custom_properties {
  key: "input_fingerprint"
  value {
    string_value: "split:single_split,num_files:1,total_bytes:13161,xor_checksum:1638699709,sum_checksum:1638699709"
  }
}
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:CsvExampleGen:examples:0"
  }
}
custom_properties {
  key: "payload_format"
  value {
    string_value: "FORMAT_TF_EXAMPLE"
  }
}
custom_properties {
  key: "span"
  value {
    int_value: 0
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
state: LIVE
create_time_since_epoch: 1638699713316
last_update_time_since_epoch: 1638699713316
, artifact_type: id: 15
name: "Examples"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
properties {
  key: "version"
  value: INT
}
)]}, output_dict=defaultdict(<class 'list'>, {'statistics': [Artifact(artifact: uri: "pipelines/penguin-transform/StatisticsGen/statistics/3"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:StatisticsGen:statistics:0"
  }
}
, artifact_type: name: "ExampleStatistics"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
)]}), exec_properties={'exclude_splits': '[]'}, execution_output_uri='pipelines/penguin-transform/StatisticsGen/.system/executor_execution/3/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/StatisticsGen/.system/stateful_working_dir/2021-12-05T10:21:51.187624', tmp_dir='pipelines/penguin-transform/StatisticsGen/.system/executor_execution/3/.temp/', pipeline_node=node_info {
  type {
    name: "tfx.components.statistics_gen.component.StatisticsGen"
  }
  id: "StatisticsGen"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-12-05T10:21:51.187624"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.StatisticsGen"
      }
    }
  }
}
inputs {
  inputs {
    key: "examples"
    value {
      channels {
        producer_node_query {
          id: "CsvExampleGen"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-12-05T10:21:51.187624"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.CsvExampleGen"
            }
          }
        }
        artifact_query {
          type {
            name: "Examples"
          }
        }
        output_key: "examples"
      }
      min_count: 1
    }
  }
}
outputs {
  outputs {
    key: "statistics"
    value {
      artifact_spec {
        type {
          name: "ExampleStatistics"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "exclude_splits"
    value {
      field_value {
        string_value: "[]"
      }
    }
  }
}
upstream_nodes: "CsvExampleGen"
downstream_nodes: "ExampleValidator"
execution_options {
  caching_options {
  }
}
, pipeline_info=id: "penguin-transform"
, pipeline_run_id='2021-12-05T10:21:51.187624')
INFO:absl:Generating statistics for split train.
INFO:absl:Statistics for split train written to pipelines/penguin-transform/StatisticsGen/statistics/3/Split-train.
INFO:absl:Generating statistics for split eval.
INFO:absl:Statistics for split eval written to pipelines/penguin-transform/StatisticsGen/statistics/3/Split-eval.
WARNING:root:Make sure that locally built Python SDK docker image has Python 3.7 interpreter.
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 3 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'statistics': [Artifact(artifact: uri: "pipelines/penguin-transform/StatisticsGen/statistics/3"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:StatisticsGen:statistics:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
, artifact_type: name: "ExampleStatistics"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
)]}) for execution 3
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Component StatisticsGen is finished.
INFO:absl:Component Transform is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.transform.component.Transform"
  }
  id: "Transform"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-12-05T10:21:51.187624"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.Transform"
      }
    }
  }
}
inputs {
  inputs {
    key: "examples"
    value {
      channels {
        producer_node_query {
          id: "CsvExampleGen"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-12-05T10:21:51.187624"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.CsvExampleGen"
            }
          }
        }
        artifact_query {
          type {
            name: "Examples"
          }
        }
        output_key: "examples"
      }
      min_count: 1
    }
  }
  inputs {
    key: "schema"
    value {
      channels {
        producer_node_query {
          id: "schema_importer"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-12-05T10:21:51.187624"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.schema_importer"
            }
          }
        }
        artifact_query {
          type {
            name: "Schema"
          }
        }
        output_key: "result"
      }
      min_count: 1
    }
  }
}
outputs {
  outputs {
    key: "post_transform_anomalies"
    value {
      artifact_spec {
        type {
          name: "ExampleAnomalies"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
        }
      }
    }
  }
  outputs {
    key: "post_transform_schema"
    value {
      artifact_spec {
        type {
          name: "Schema"
        }
      }
    }
  }
  outputs {
    key: "post_transform_stats"
    value {
      artifact_spec {
        type {
          name: "ExampleStatistics"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
        }
      }
    }
  }
  outputs {
    key: "pre_transform_schema"
    value {
      artifact_spec {
        type {
          name: "Schema"
        }
      }
    }
  }
  outputs {
    key: "pre_transform_stats"
    value {
      artifact_spec {
        type {
          name: "ExampleStatistics"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
        }
      }
    }
  }
  outputs {
    key: "transform_graph"
    value {
      artifact_spec {
        type {
          name: "TransformGraph"
        }
      }
    }
  }
  outputs {
    key: "updated_analyzer_cache"
    value {
      artifact_spec {
        type {
          name: "TransformCache"
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "custom_config"
    value {
      field_value {
        string_value: "null"
      }
    }
  }
  parameters {
    key: "disable_statistics"
    value {
      field_value {
        int_value: 0
      }
    }
  }
  parameters {
    key: "force_tf_compat_v1"
    value {
      field_value {
        int_value: 0
      }
    }
  }
  parameters {
    key: "module_path"
    value {
      field_value {
        string_value: "penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl"
      }
    }
  }
}
upstream_nodes: "CsvExampleGen"
upstream_nodes: "schema_importer"
downstream_nodes: "Trainer"
execution_options {
  caching_options {
  }
}

INFO:absl:MetadataStore with DB connection initialized
I1205 10:21:56.029392 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Going to run a new execution 4
INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=4, input_dict={'schema': [Artifact(artifact: id: 2
type_id: 17
uri: "schema"
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
state: LIVE
create_time_since_epoch: 1638699713343
last_update_time_since_epoch: 1638699713343
, artifact_type: id: 17
name: "Schema"
)], 'examples': [Artifact(artifact: id: 1
type_id: 15
uri: "pipelines/penguin-transform/CsvExampleGen/examples/1"
properties {
  key: "split_names"
  value {
    string_value: "[\"train\", \"eval\"]"
  }
}
custom_properties {
  key: "file_format"
  value {
    string_value: "tfrecords_gzip"
  }
}
custom_properties {
  key: "input_fingerprint"
  value {
    string_value: "split:single_split,num_files:1,total_bytes:13161,xor_checksum:1638699709,sum_checksum:1638699709"
  }
}
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:CsvExampleGen:examples:0"
  }
}
custom_properties {
  key: "payload_format"
  value {
    string_value: "FORMAT_TF_EXAMPLE"
  }
}
custom_properties {
  key: "span"
  value {
    int_value: 0
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
state: LIVE
create_time_since_epoch: 1638699713316
last_update_time_since_epoch: 1638699713316
, artifact_type: id: 15
name: "Examples"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
properties {
  key: "version"
  value: INT
}
)]}, output_dict=defaultdict(<class 'list'>, {'updated_analyzer_cache': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/updated_analyzer_cache/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Transform:updated_analyzer_cache:0"
  }
}
, artifact_type: name: "TransformCache"
)], 'post_transform_stats': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/post_transform_stats/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Transform:post_transform_stats:0"
  }
}
, artifact_type: name: "ExampleStatistics"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
)], 'pre_transform_stats': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/pre_transform_stats/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Transform:pre_transform_stats:0"
  }
}
, artifact_type: name: "ExampleStatistics"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
)], 'pre_transform_schema': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/pre_transform_schema/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Transform:pre_transform_schema:0"
  }
}
, artifact_type: name: "Schema"
)], 'post_transform_anomalies': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/post_transform_anomalies/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Transform:post_transform_anomalies:0"
  }
}
, artifact_type: name: "ExampleAnomalies"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
)], 'transform_graph': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/transform_graph/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Transform:transform_graph:0"
  }
}
, artifact_type: name: "TransformGraph"
)], 'post_transform_schema': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/post_transform_schema/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Transform:post_transform_schema:0"
  }
}
, artifact_type: name: "Schema"
)]}), exec_properties={'disable_statistics': 0, 'module_path': 'penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl', 'custom_config': 'null', 'force_tf_compat_v1': 0}, execution_output_uri='pipelines/penguin-transform/Transform/.system/executor_execution/4/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/Transform/.system/stateful_working_dir/2021-12-05T10:21:51.187624', tmp_dir='pipelines/penguin-transform/Transform/.system/executor_execution/4/.temp/', pipeline_node=node_info {
  type {
    name: "tfx.components.transform.component.Transform"
  }
  id: "Transform"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-12-05T10:21:51.187624"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.Transform"
      }
    }
  }
}
inputs {
  inputs {
    key: "examples"
    value {
      channels {
        producer_node_query {
          id: "CsvExampleGen"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-12-05T10:21:51.187624"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.CsvExampleGen"
            }
          }
        }
        artifact_query {
          type {
            name: "Examples"
          }
        }
        output_key: "examples"
      }
      min_count: 1
    }
  }
  inputs {
    key: "schema"
    value {
      channels {
        producer_node_query {
          id: "schema_importer"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-12-05T10:21:51.187624"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.schema_importer"
            }
          }
        }
        artifact_query {
          type {
            name: "Schema"
          }
        }
        output_key: "result"
      }
      min_count: 1
    }
  }
}
outputs {
  outputs {
    key: "post_transform_anomalies"
    value {
      artifact_spec {
        type {
          name: "ExampleAnomalies"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
        }
      }
    }
  }
  outputs {
    key: "post_transform_schema"
    value {
      artifact_spec {
        type {
          name: "Schema"
        }
      }
    }
  }
  outputs {
    key: "post_transform_stats"
    value {
      artifact_spec {
        type {
          name: "ExampleStatistics"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
        }
      }
    }
  }
  outputs {
    key: "pre_transform_schema"
    value {
      artifact_spec {
        type {
          name: "Schema"
        }
      }
    }
  }
  outputs {
    key: "pre_transform_stats"
    value {
      artifact_spec {
        type {
          name: "ExampleStatistics"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
        }
      }
    }
  }
  outputs {
    key: "transform_graph"
    value {
      artifact_spec {
        type {
          name: "TransformGraph"
        }
      }
    }
  }
  outputs {
    key: "updated_analyzer_cache"
    value {
      artifact_spec {
        type {
          name: "TransformCache"
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "custom_config"
    value {
      field_value {
        string_value: "null"
      }
    }
  }
  parameters {
    key: "disable_statistics"
    value {
      field_value {
        int_value: 0
      }
    }
  }
  parameters {
    key: "force_tf_compat_v1"
    value {
      field_value {
        int_value: 0
      }
    }
  }
  parameters {
    key: "module_path"
    value {
      field_value {
        string_value: "penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl"
      }
    }
  }
}
upstream_nodes: "CsvExampleGen"
upstream_nodes: "schema_importer"
downstream_nodes: "Trainer"
execution_options {
  caching_options {
  }
}
, pipeline_info=id: "penguin-transform"
, pipeline_run_id='2021-12-05T10:21:51.187624')
INFO:absl:Analyze the 'train' split and transform all splits when splits_config is not set.
INFO:absl:udf_utils.get_fn {'module_file': None, 'module_path': 'penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl', 'preprocessing_fn': None} 'preprocessing_fn'
INFO:absl:Installing 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl' to a temporary directory.
INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '-m', 'pip', 'install', '--target', '/tmp/tmp3elppure', 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl']
Processing ./pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl
INFO:absl:Successfully installed 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl'.
INFO:absl:udf_utils.get_fn {'module_file': None, 'module_path': 'penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl', 'stats_options_updater_fn': None} 'stats_options_updater_fn'
INFO:absl:Installing 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl' to a temporary directory.
INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '-m', 'pip', 'install', '--target', '/tmp/tmpctb52fyz', 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl']
Installing collected packages: tfx-user-code-Transform
Successfully installed tfx-user-code-Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9
Processing ./pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl
INFO:absl:Successfully installed 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl'.
INFO:absl:Installing 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl' to a temporary directory.
INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '-m', 'pip', 'install', '--target', '/tmp/tmpgv9zk7st', 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl']
Installing collected packages: tfx-user-code-Transform
Successfully installed tfx-user-code-Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9
Processing ./pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl
INFO:absl:Successfully installed 'pipelines/penguin-transform/_wheels/tfx_user_code_Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl'.
INFO:absl:Feature body_mass_g has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_depth_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature flipper_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature island has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature sex has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature species has a shape dim {
  size: 1
}
. Setting to DenseTensor.
Installing collected packages: tfx-user-code-Transform
Successfully installed tfx-user-code-Transform-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow_transform/tf_utils.py:289: Tensor.experimental_ref (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use ref() instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.7/site-packages/tensorflow_transform/tf_utils.py:289: Tensor.experimental_ref (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use ref() instead.
INFO:absl:Feature body_mass_g has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_depth_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature flipper_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature island has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature sex has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature species has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature body_mass_g has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_depth_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature flipper_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature island has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature sex has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature species has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature body_mass_g has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_depth_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature flipper_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature island has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature sex has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature species has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature body_mass_g has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_depth_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature flipper_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature island has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature sex has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature species has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature body_mass_g has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_depth_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature flipper_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature island has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature sex has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature species has a shape dim {
  size: 1
}
. Setting to DenseTensor.
WARNING:root:This output type hint will be ignored and not used for type-checking purposes. Typically, output type hints for a PTransform are single (or nested) types wrapped by a PCollection, PDone, or None. Got: Tuple[Dict[str, Union[NoneType, _Dataset]], Union[Dict[str, Dict[str, PCollection]], NoneType], int] instead.
WARNING:absl:Tables initialized inside a tf.function  will be re-initialized on every invocation of the function. This  re-initialization can have significant impact on performance. Consider lifting  them out of the graph context using  `tf.init_scope`.: key_value_init/LookupTableImportV2
WARNING:absl:Tables initialized inside a tf.function  will be re-initialized on every invocation of the function. This  re-initialization can have significant impact on performance. Consider lifting  them out of the graph context using  `tf.init_scope`.: key_value_init/LookupTableImportV2
WARNING:root:This output type hint will be ignored and not used for type-checking purposes. Typically, output type hints for a PTransform are single (or nested) types wrapped by a PCollection, PDone, or None. Got: Tuple[Dict[str, Union[NoneType, _Dataset]], Union[Dict[str, Dict[str, PCollection]], NoneType], int] instead.
WARNING:absl:Tables initialized inside a tf.function  will be re-initialized on every invocation of the function. This  re-initialization can have significant impact on performance. Consider lifting  them out of the graph context using  `tf.init_scope`.: key_value_init/LookupTableImportV2
WARNING:absl:Tables initialized inside a tf.function  will be re-initialized on every invocation of the function. This  re-initialization can have significant impact on performance. Consider lifting  them out of the graph context using  `tf.init_scope`.: key_value_init/LookupTableImportV2
INFO:absl:Feature body_mass_g has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_depth_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature flipper_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature island has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature sex has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature species has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature body_mass_g has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_depth_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature flipper_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature island has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature sex has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature species has a shape dim {
  size: 1
}
. Setting to DenseTensor.
WARNING:root:Make sure that locally built Python SDK docker image has Python 3.7 interpreter.
2021-12-05 10:22:06.547139: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
INFO:tensorflow:Assets written to: pipelines/penguin-transform/Transform/transform_graph/4/.temp_path/tftransform_tmp/167780659a644435abe6c969ed4771de/assets
INFO:tensorflow:Assets written to: pipelines/penguin-transform/Transform/transform_graph/4/.temp_path/tftransform_tmp/167780659a644435abe6c969ed4771de/assets
WARNING:absl:Tables initialized inside a tf.function  will be re-initialized on every invocation of the function. This  re-initialization can have significant impact on performance. Consider lifting  them out of the graph context using  `tf.init_scope`.: key_value_init/LookupTableImportV2
INFO:tensorflow:tensorflow_text is not available.
INFO:tensorflow:tensorflow_text is not available.
INFO:tensorflow:tensorflow_decision_forests is not available.
INFO:tensorflow:tensorflow_decision_forests is not available.
INFO:tensorflow:struct2tensor is not available.
INFO:tensorflow:struct2tensor is not available.
INFO:tensorflow:Assets written to: pipelines/penguin-transform/Transform/transform_graph/4/.temp_path/tftransform_tmp/cbe53dc813ec4d51a99f25099bd3736e/assets
INFO:tensorflow:Assets written to: pipelines/penguin-transform/Transform/transform_graph/4/.temp_path/tftransform_tmp/cbe53dc813ec4d51a99f25099bd3736e/assets
WARNING:absl:Tables initialized inside a tf.function  will be re-initialized on every invocation of the function. This  re-initialization can have significant impact on performance. Consider lifting  them out of the graph context using  `tf.init_scope`.: key_value_init/LookupTableImportV2
INFO:tensorflow:tensorflow_text is not available.
INFO:tensorflow:tensorflow_text is not available.
INFO:tensorflow:tensorflow_decision_forests is not available.
INFO:tensorflow:tensorflow_decision_forests is not available.
INFO:tensorflow:struct2tensor is not available.
INFO:tensorflow:struct2tensor is not available.
INFO:tensorflow:tensorflow_text is not available.
INFO:tensorflow:tensorflow_text is not available.
INFO:tensorflow:tensorflow_decision_forests is not available.
INFO:tensorflow:tensorflow_decision_forests is not available.
INFO:tensorflow:struct2tensor is not available.
INFO:tensorflow:struct2tensor is not available.
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 4 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'updated_analyzer_cache': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/updated_analyzer_cache/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Transform:updated_analyzer_cache:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
, artifact_type: name: "TransformCache"
)], 'post_transform_stats': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/post_transform_stats/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Transform:post_transform_stats:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
, artifact_type: name: "ExampleStatistics"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
)], 'pre_transform_stats': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/pre_transform_stats/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Transform:pre_transform_stats:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
, artifact_type: name: "ExampleStatistics"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
)], 'pre_transform_schema': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/pre_transform_schema/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Transform:pre_transform_schema:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
, artifact_type: name: "Schema"
)], 'post_transform_anomalies': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/post_transform_anomalies/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Transform:post_transform_anomalies:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
, artifact_type: name: "ExampleAnomalies"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
)], 'transform_graph': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/transform_graph/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Transform:transform_graph:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
, artifact_type: name: "TransformGraph"
)], 'post_transform_schema': [Artifact(artifact: uri: "pipelines/penguin-transform/Transform/post_transform_schema/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Transform:post_transform_schema:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
, artifact_type: name: "Schema"
)]}) for execution 4
INFO:absl:MetadataStore with DB connection initialized
I1205 10:22:11.698540 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type
I1205 10:22:11.707963 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type
INFO:absl:Component Transform is finished.
INFO:absl:Component ExampleValidator is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.example_validator.component.ExampleValidator"
  }
  id: "ExampleValidator"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-12-05T10:21:51.187624"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.ExampleValidator"
      }
    }
  }
}
inputs {
  inputs {
    key: "schema"
    value {
      channels {
        producer_node_query {
          id: "schema_importer"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-12-05T10:21:51.187624"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.schema_importer"
            }
          }
        }
        artifact_query {
          type {
            name: "Schema"
          }
        }
        output_key: "result"
      }
      min_count: 1
    }
  }
  inputs {
    key: "statistics"
    value {
      channels {
        producer_node_query {
          id: "StatisticsGen"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-12-05T10:21:51.187624"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.StatisticsGen"
            }
          }
        }
        artifact_query {
          type {
            name: "ExampleStatistics"
          }
        }
        output_key: "statistics"
      }
      min_count: 1
    }
  }
}
outputs {
  outputs {
    key: "anomalies"
    value {
      artifact_spec {
        type {
          name: "ExampleAnomalies"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "exclude_splits"
    value {
      field_value {
        string_value: "[]"
      }
    }
  }
}
upstream_nodes: "StatisticsGen"
upstream_nodes: "schema_importer"
execution_options {
  caching_options {
  }
}

INFO:absl:MetadataStore with DB connection initialized
I1205 10:22:11.732254 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Going to run a new execution 5
INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=5, input_dict={'schema': [Artifact(artifact: id: 2
type_id: 17
uri: "schema"
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
state: LIVE
create_time_since_epoch: 1638699713343
last_update_time_since_epoch: 1638699713343
, artifact_type: id: 17
name: "Schema"
)], 'statistics': [Artifact(artifact: id: 3
type_id: 19
uri: "pipelines/penguin-transform/StatisticsGen/statistics/3"
properties {
  key: "split_names"
  value {
    string_value: "[\"train\", \"eval\"]"
  }
}
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:StatisticsGen:statistics:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
state: LIVE
create_time_since_epoch: 1638699716011
last_update_time_since_epoch: 1638699716011
, artifact_type: id: 19
name: "ExampleStatistics"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
)]}, output_dict=defaultdict(<class 'list'>, {'anomalies': [Artifact(artifact: uri: "pipelines/penguin-transform/ExampleValidator/anomalies/5"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:ExampleValidator:anomalies:0"
  }
}
, artifact_type: name: "ExampleAnomalies"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
)]}), exec_properties={'exclude_splits': '[]'}, execution_output_uri='pipelines/penguin-transform/ExampleValidator/.system/executor_execution/5/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/ExampleValidator/.system/stateful_working_dir/2021-12-05T10:21:51.187624', tmp_dir='pipelines/penguin-transform/ExampleValidator/.system/executor_execution/5/.temp/', pipeline_node=node_info {
  type {
    name: "tfx.components.example_validator.component.ExampleValidator"
  }
  id: "ExampleValidator"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-12-05T10:21:51.187624"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.ExampleValidator"
      }
    }
  }
}
inputs {
  inputs {
    key: "schema"
    value {
      channels {
        producer_node_query {
          id: "schema_importer"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-12-05T10:21:51.187624"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.schema_importer"
            }
          }
        }
        artifact_query {
          type {
            name: "Schema"
          }
        }
        output_key: "result"
      }
      min_count: 1
    }
  }
  inputs {
    key: "statistics"
    value {
      channels {
        producer_node_query {
          id: "StatisticsGen"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-12-05T10:21:51.187624"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.StatisticsGen"
            }
          }
        }
        artifact_query {
          type {
            name: "ExampleStatistics"
          }
        }
        output_key: "statistics"
      }
      min_count: 1
    }
  }
}
outputs {
  outputs {
    key: "anomalies"
    value {
      artifact_spec {
        type {
          name: "ExampleAnomalies"
          properties {
            key: "span"
            value: INT
          }
          properties {
            key: "split_names"
            value: STRING
          }
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "exclude_splits"
    value {
      field_value {
        string_value: "[]"
      }
    }
  }
}
upstream_nodes: "StatisticsGen"
upstream_nodes: "schema_importer"
execution_options {
  caching_options {
  }
}
, pipeline_info=id: "penguin-transform"
, pipeline_run_id='2021-12-05T10:21:51.187624')
INFO:absl:Validating schema against the computed statistics for split train.
INFO:absl:Validation complete for split train. Anomalies written to pipelines/penguin-transform/ExampleValidator/anomalies/5/Split-train.
INFO:absl:Validating schema against the computed statistics for split eval.
INFO:absl:Validation complete for split eval. Anomalies written to pipelines/penguin-transform/ExampleValidator/anomalies/5/Split-eval.
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 5 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'anomalies': [Artifact(artifact: uri: "pipelines/penguin-transform/ExampleValidator/anomalies/5"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:ExampleValidator:anomalies:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
, artifact_type: name: "ExampleAnomalies"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
)]}) for execution 5
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Component ExampleValidator is finished.
INFO:absl:Component Trainer is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.trainer.component.Trainer"
  }
  id: "Trainer"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-12-05T10:21:51.187624"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.Trainer"
      }
    }
  }
}
inputs {
  inputs {
    key: "examples"
    value {
      channels {
        producer_node_query {
          id: "CsvExampleGen"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-12-05T10:21:51.187624"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.CsvExampleGen"
            }
          }
        }
        artifact_query {
          type {
            name: "Examples"
          }
        }
        output_key: "examples"
      }
      min_count: 1
    }
  }
  inputs {
    key: "transform_graph"
    value {
      channels {
        producer_node_query {
          id: "Transform"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-12-05T10:21:51.187624"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.Transform"
            }
          }
        }
        artifact_query {
          type {
            name: "TransformGraph"
          }
        }
        output_key: "transform_graph"
      }
    }
  }
}
outputs {
  outputs {
    key: "model"
    value {
      artifact_spec {
        type {
          name: "Model"
        }
      }
    }
  }
  outputs {
    key: "model_run"
    value {
      artifact_spec {
        type {
          name: "ModelRun"
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "custom_config"
    value {
      field_value {
        string_value: "null"
      }
    }
  }
  parameters {
    key: "eval_args"
    value {
      field_value {
        string_value: "{\n  \"num_steps\": 5\n}"
      }
    }
  }
  parameters {
    key: "module_path"
    value {
      field_value {
        string_value: "penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl"
      }
    }
  }
  parameters {
    key: "train_args"
    value {
      field_value {
        string_value: "{\n  \"num_steps\": 100\n}"
      }
    }
  }
}
upstream_nodes: "CsvExampleGen"
upstream_nodes: "Transform"
downstream_nodes: "Pusher"
execution_options {
  caching_options {
  }
}

INFO:absl:MetadataStore with DB connection initialized
I1205 10:22:11.785852 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Going to run a new execution 6
INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=6, input_dict={'examples': [Artifact(artifact: id: 1
type_id: 15
uri: "pipelines/penguin-transform/CsvExampleGen/examples/1"
properties {
  key: "split_names"
  value {
    string_value: "[\"train\", \"eval\"]"
  }
}
custom_properties {
  key: "file_format"
  value {
    string_value: "tfrecords_gzip"
  }
}
custom_properties {
  key: "input_fingerprint"
  value {
    string_value: "split:single_split,num_files:1,total_bytes:13161,xor_checksum:1638699709,sum_checksum:1638699709"
  }
}
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:CsvExampleGen:examples:0"
  }
}
custom_properties {
  key: "payload_format"
  value {
    string_value: "FORMAT_TF_EXAMPLE"
  }
}
custom_properties {
  key: "span"
  value {
    int_value: 0
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
state: LIVE
create_time_since_epoch: 1638699713316
last_update_time_since_epoch: 1638699713316
, artifact_type: id: 15
name: "Examples"
properties {
  key: "span"
  value: INT
}
properties {
  key: "split_names"
  value: STRING
}
properties {
  key: "version"
  value: INT
}
)], 'transform_graph': [Artifact(artifact: id: 9
type_id: 23
uri: "pipelines/penguin-transform/Transform/transform_graph/4"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Transform:transform_graph:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
state: LIVE
create_time_since_epoch: 1638699731712
last_update_time_since_epoch: 1638699731712
, artifact_type: id: 23
name: "TransformGraph"
)]}, output_dict=defaultdict(<class 'list'>, {'model': [Artifact(artifact: uri: "pipelines/penguin-transform/Trainer/model/6"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Trainer:model:0"
  }
}
, artifact_type: name: "Model"
)], 'model_run': [Artifact(artifact: uri: "pipelines/penguin-transform/Trainer/model_run/6"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Trainer:model_run:0"
  }
}
, artifact_type: name: "ModelRun"
)]}), exec_properties={'custom_config': 'null', 'train_args': '{\n  "num_steps": 100\n}', 'module_path': 'penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl', 'eval_args': '{\n  "num_steps": 5\n}'}, execution_output_uri='pipelines/penguin-transform/Trainer/.system/executor_execution/6/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/Trainer/.system/stateful_working_dir/2021-12-05T10:21:51.187624', tmp_dir='pipelines/penguin-transform/Trainer/.system/executor_execution/6/.temp/', pipeline_node=node_info {
  type {
    name: "tfx.components.trainer.component.Trainer"
  }
  id: "Trainer"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-12-05T10:21:51.187624"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.Trainer"
      }
    }
  }
}
inputs {
  inputs {
    key: "examples"
    value {
      channels {
        producer_node_query {
          id: "CsvExampleGen"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-12-05T10:21:51.187624"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.CsvExampleGen"
            }
          }
        }
        artifact_query {
          type {
            name: "Examples"
          }
        }
        output_key: "examples"
      }
      min_count: 1
    }
  }
  inputs {
    key: "transform_graph"
    value {
      channels {
        producer_node_query {
          id: "Transform"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-12-05T10:21:51.187624"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.Transform"
            }
          }
        }
        artifact_query {
          type {
            name: "TransformGraph"
          }
        }
        output_key: "transform_graph"
      }
    }
  }
}
outputs {
  outputs {
    key: "model"
    value {
      artifact_spec {
        type {
          name: "Model"
        }
      }
    }
  }
  outputs {
    key: "model_run"
    value {
      artifact_spec {
        type {
          name: "ModelRun"
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "custom_config"
    value {
      field_value {
        string_value: "null"
      }
    }
  }
  parameters {
    key: "eval_args"
    value {
      field_value {
        string_value: "{\n  \"num_steps\": 5\n}"
      }
    }
  }
  parameters {
    key: "module_path"
    value {
      field_value {
        string_value: "penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl"
      }
    }
  }
  parameters {
    key: "train_args"
    value {
      field_value {
        string_value: "{\n  \"num_steps\": 100\n}"
      }
    }
  }
}
upstream_nodes: "CsvExampleGen"
upstream_nodes: "Transform"
downstream_nodes: "Pusher"
execution_options {
  caching_options {
  }
}
, pipeline_info=id: "penguin-transform"
, pipeline_run_id='2021-12-05T10:21:51.187624')
INFO:absl:Train on the 'train' split when train_args.splits is not set.
INFO:absl:Evaluate on the 'eval' split when eval_args.splits is not set.
INFO:absl:udf_utils.get_fn {'custom_config': 'null', 'train_args': '{\n  "num_steps": 100\n}', 'module_path': 'penguin_utils@pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl', 'eval_args': '{\n  "num_steps": 5\n}'} 'run_fn'
INFO:absl:Installing 'pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl' to a temporary directory.
INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '-m', 'pip', 'install', '--target', '/tmp/tmpfnmreae0', 'pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl']
Processing ./pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl
INFO:absl:Successfully installed 'pipelines/penguin-transform/_wheels/tfx_user_code_Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9-py3-none-any.whl'.
INFO:absl:Training model.
INFO:absl:Feature body_mass_g has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_depth_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature flipper_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature island has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature sex has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature species has a shape dim {
  size: 1
}
. Setting to DenseTensor.
Installing collected packages: tfx-user-code-Trainer
Successfully installed tfx-user-code-Trainer-0.0+a5e9139bd7facf5026b5306a6aea534f89db0dea58ebe1bb1fb5ebb9df5fdea9
INFO:tensorflow:tensorflow_text is not available.
INFO:tensorflow:tensorflow_text is not available.
INFO:tensorflow:tensorflow_decision_forests is not available.
INFO:tensorflow:tensorflow_decision_forests is not available.
INFO:tensorflow:struct2tensor is not available.
INFO:tensorflow:struct2tensor is not available.
INFO:absl:Feature body_mass_g has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_depth_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature culmen_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature flipper_length_mm has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature island has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature sex has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Feature species has a shape dim {
  size: 1
}
. Setting to DenseTensor.
INFO:absl:Model: "model"
INFO:absl:__________________________________________________________________________________________________
INFO:absl:Layer (type)                    Output Shape         Param #     Connected to                     
INFO:absl:==================================================================================================
INFO:absl:culmen_length_mm (InputLayer)   [(None, 1)]          0                                            
INFO:absl:__________________________________________________________________________________________________
INFO:absl:culmen_depth_mm (InputLayer)    [(None, 1)]          0                                            
INFO:absl:__________________________________________________________________________________________________
INFO:absl:flipper_length_mm (InputLayer)  [(None, 1)]          0                                            
INFO:absl:__________________________________________________________________________________________________
INFO:absl:body_mass_g (InputLayer)        [(None, 1)]          0                                            
INFO:absl:__________________________________________________________________________________________________
INFO:absl:concatenate (Concatenate)       (None, 4)            0           culmen_length_mm[0][0]           
INFO:absl:                                                                 culmen_depth_mm[0][0]            
INFO:absl:                                                                 flipper_length_mm[0][0]          
INFO:absl:                                                                 body_mass_g[0][0]                
INFO:absl:__________________________________________________________________________________________________
INFO:absl:dense (Dense)                   (None, 8)            40          concatenate[0][0]                
INFO:absl:__________________________________________________________________________________________________
INFO:absl:dense_1 (Dense)                 (None, 8)            72          dense[0][0]                      
INFO:absl:__________________________________________________________________________________________________
INFO:absl:dense_2 (Dense)                 (None, 3)            27          dense_1[0][0]                    
INFO:absl:==================================================================================================
INFO:absl:Total params: 139
INFO:absl:Trainable params: 139
INFO:absl:Non-trainable params: 0
INFO:absl:__________________________________________________________________________________________________
100/100 [==============================] - 1s 4ms/step - loss: 0.2132 - sparse_categorical_accuracy: 0.9490 - val_loss: 0.0102 - val_sparse_categorical_accuracy: 1.0000
INFO:tensorflow:Assets written to: pipelines/penguin-transform/Trainer/model/6/Format-Serving/assets
INFO:tensorflow:Assets written to: pipelines/penguin-transform/Trainer/model/6/Format-Serving/assets
INFO:absl:Training complete. Model written to pipelines/penguin-transform/Trainer/model/6/Format-Serving. ModelRun written to pipelines/penguin-transform/Trainer/model_run/6
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 6 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'model': [Artifact(artifact: uri: "pipelines/penguin-transform/Trainer/model/6"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Trainer:model:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
, artifact_type: name: "Model"
)], 'model_run': [Artifact(artifact: uri: "pipelines/penguin-transform/Trainer/model_run/6"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Trainer:model_run:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
, artifact_type: name: "ModelRun"
)]}) for execution 6
INFO:absl:MetadataStore with DB connection initialized
I1205 10:22:18.036643 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type
INFO:absl:Component Trainer is finished.
I1205 10:22:18.041664 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type
INFO:absl:Component Pusher is running.
INFO:absl:Running launcher for node_info {
  type {
    name: "tfx.components.pusher.component.Pusher"
  }
  id: "Pusher"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-12-05T10:21:51.187624"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.Pusher"
      }
    }
  }
}
inputs {
  inputs {
    key: "model"
    value {
      channels {
        producer_node_query {
          id: "Trainer"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-12-05T10:21:51.187624"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.Trainer"
            }
          }
        }
        artifact_query {
          type {
            name: "Model"
          }
        }
        output_key: "model"
      }
    }
  }
}
outputs {
  outputs {
    key: "pushed_model"
    value {
      artifact_spec {
        type {
          name: "PushedModel"
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "custom_config"
    value {
      field_value {
        string_value: "null"
      }
    }
  }
  parameters {
    key: "push_destination"
    value {
      field_value {
        string_value: "{\n  \"filesystem\": {\n    \"base_directory\": \"serving_model/penguin-transform\"\n  }\n}"
      }
    }
  }
}
upstream_nodes: "Trainer"
execution_options {
  caching_options {
  }
}

INFO:absl:MetadataStore with DB connection initialized
I1205 10:22:18.063011 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Going to run a new execution 7
INFO:absl:Going to run a new execution: ExecutionInfo(execution_id=7, input_dict={'model': [Artifact(artifact: id: 12
type_id: 26
uri: "pipelines/penguin-transform/Trainer/model/6"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Trainer:model:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
state: LIVE
create_time_since_epoch: 1638699738045
last_update_time_since_epoch: 1638699738045
, artifact_type: id: 26
name: "Model"
)]}, output_dict=defaultdict(<class 'list'>, {'pushed_model': [Artifact(artifact: uri: "pipelines/penguin-transform/Pusher/pushed_model/7"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Pusher:pushed_model:0"
  }
}
, artifact_type: name: "PushedModel"
)]}), exec_properties={'push_destination': '{\n  "filesystem": {\n    "base_directory": "serving_model/penguin-transform"\n  }\n}', 'custom_config': 'null'}, execution_output_uri='pipelines/penguin-transform/Pusher/.system/executor_execution/7/executor_output.pb', stateful_working_dir='pipelines/penguin-transform/Pusher/.system/stateful_working_dir/2021-12-05T10:21:51.187624', tmp_dir='pipelines/penguin-transform/Pusher/.system/executor_execution/7/.temp/', pipeline_node=node_info {
  type {
    name: "tfx.components.pusher.component.Pusher"
  }
  id: "Pusher"
}
contexts {
  contexts {
    type {
      name: "pipeline"
    }
    name {
      field_value {
        string_value: "penguin-transform"
      }
    }
  }
  contexts {
    type {
      name: "pipeline_run"
    }
    name {
      field_value {
        string_value: "2021-12-05T10:21:51.187624"
      }
    }
  }
  contexts {
    type {
      name: "node"
    }
    name {
      field_value {
        string_value: "penguin-transform.Pusher"
      }
    }
  }
}
inputs {
  inputs {
    key: "model"
    value {
      channels {
        producer_node_query {
          id: "Trainer"
        }
        context_queries {
          type {
            name: "pipeline"
          }
          name {
            field_value {
              string_value: "penguin-transform"
            }
          }
        }
        context_queries {
          type {
            name: "pipeline_run"
          }
          name {
            field_value {
              string_value: "2021-12-05T10:21:51.187624"
            }
          }
        }
        context_queries {
          type {
            name: "node"
          }
          name {
            field_value {
              string_value: "penguin-transform.Trainer"
            }
          }
        }
        artifact_query {
          type {
            name: "Model"
          }
        }
        output_key: "model"
      }
    }
  }
}
outputs {
  outputs {
    key: "pushed_model"
    value {
      artifact_spec {
        type {
          name: "PushedModel"
        }
      }
    }
  }
}
parameters {
  parameters {
    key: "custom_config"
    value {
      field_value {
        string_value: "null"
      }
    }
  }
  parameters {
    key: "push_destination"
    value {
      field_value {
        string_value: "{\n  \"filesystem\": {\n    \"base_directory\": \"serving_model/penguin-transform\"\n  }\n}"
      }
    }
  }
}
upstream_nodes: "Trainer"
execution_options {
  caching_options {
  }
}
, pipeline_info=id: "penguin-transform"
, pipeline_run_id='2021-12-05T10:21:51.187624')
WARNING:absl:Pusher is going to push the model without validation. Consider using Evaluator or InfraValidator in your pipeline.
INFO:absl:Model version: 1638699738
INFO:absl:Model written to serving path serving_model/penguin-transform/1638699738.
INFO:absl:Model pushed to pipelines/penguin-transform/Pusher/pushed_model/7.
INFO:absl:Cleaning up stateless execution info.
INFO:absl:Execution 7 succeeded.
INFO:absl:Cleaning up stateful execution info.
INFO:absl:Publishing output artifacts defaultdict(<class 'list'>, {'pushed_model': [Artifact(artifact: uri: "pipelines/penguin-transform/Pusher/pushed_model/7"
custom_properties {
  key: "name"
  value {
    string_value: "penguin-transform:2021-12-05T10:21:51.187624:Pusher:pushed_model:0"
  }
}
custom_properties {
  key: "tfx_version"
  value {
    string_value: "1.4.0"
  }
}
, artifact_type: name: "PushedModel"
)]}) for execution 7
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Component Pusher is finished.
I1205 10:22:18.092860 24712 rdbms_metadata_access_object.cc:686] No property is defined for the Type

אתה אמור לראות "INFO:absl:Pusher Component הסתיים." אם הצינור הסתיים בהצלחה.

מרכיב דחפן דוחף את המודל המודרך אל SERVING_MODEL_DIR המהווה את serving_model/penguin-transform ספרייה אם לא שינית את משתנה בשלבים הקודמים. אתה יכול לראות את התוצאה מדפדפן הקבצים בחלונית השמאלית ב-Colab, או באמצעות הפקודה הבאה:

# List files in created model directory.
find {SERVING_MODEL_DIR}
serving_model/penguin-transform
serving_model/penguin-transform/1638699738
serving_model/penguin-transform/1638699738/keras_metadata.pb
serving_model/penguin-transform/1638699738/assets
serving_model/penguin-transform/1638699738/variables
serving_model/penguin-transform/1638699738/variables/variables.data-00000-of-00001
serving_model/penguin-transform/1638699738/variables/variables.index
serving_model/penguin-transform/1638699738/saved_model.pb

ניתן גם לבדוק את החתימה של המודל שנוצר באמצעות saved_model_cli הכלי .

saved_model_cli show --dir {SERVING_MODEL_DIR}/$(ls -1 {SERVING_MODEL_DIR} | sort -nr | head -1) --tag_set serve --signature_def serving_default
The given SavedModel SignatureDef contains the following input(s):
  inputs['examples'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: serving_default_examples:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['output_0'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 3)
      name: StatefulPartitionedCall_2:0
Method name is: tensorflow/serving/predict

מכיוון שהגדרנו serving_default עם משלנו serve_tf_examples_fn פונקציה, מראה חתימה שהיא מקבלת מחרוזת אחת. מחרוזת זוהי מחרוזת בהמשכים של tf.Examples ויהיה מנותח עם tf.io.parse_example () פונקציה כפי שהגדרנו קודם לכן (ללמוד עוד על tf.Examples כאן ).

אנחנו יכולים לטעון את המודל המיוצא ולנסות כמה מסקנות עם כמה דוגמאות.

# Find a model with the latest timestamp.
model_dirs = (item for item in os.scandir(SERVING_MODEL_DIR) if item.is_dir())
model_path = max(model_dirs, key=lambda i: int(i.name)).path

loaded_model = tf.keras.models.load_model(model_path)
inference_fn = loaded_model.signatures['serving_default']
WARNING:tensorflow:Inconsistent references when loading the checkpoint into this object graph. Either the Trackable object references in the Python program have changed in an incompatible way, or the checkpoint was generated in an incompatible program.

Two checkpoint references resolved to different objects (<keras.saving.saved_model.load.TensorFlowTransform>TransformFeaturesLayer object at 0x7f5b0836e3d0> and <keras.engine.input_layer.InputLayer object at 0x7f5b091aa550>).
WARNING:tensorflow:Inconsistent references when loading the checkpoint into this object graph. Either the Trackable object references in the Python program have changed in an incompatible way, or the checkpoint was generated in an incompatible program.

Two checkpoint references resolved to different objects (<keras.saving.saved_model.load.TensorFlowTransform>TransformFeaturesLayer object at 0x7f5b0836e3d0> and <keras.engine.input_layer.InputLayer object at 0x7f5b091aa550>).
# Prepare an example and run inference.
features = {
  'culmen_length_mm': tf.train.Feature(float_list=tf.train.FloatList(value=[49.9])),
  'culmen_depth_mm': tf.train.Feature(float_list=tf.train.FloatList(value=[16.1])),
  'flipper_length_mm': tf.train.Feature(int64_list=tf.train.Int64List(value=[213])),
  'body_mass_g': tf.train.Feature(int64_list=tf.train.Int64List(value=[5400])),
}
example_proto = tf.train.Example(features=tf.train.Features(feature=features))
examples = example_proto.SerializeToString()

result = inference_fn(examples=tf.constant([examples]))
print(result['output_0'].numpy())
[[-2.5357873 -3.0600576  3.4993587]]

האלמנט השלישי, המתאים למין 'ג'נטו', צפוי להיות הגדול מבין שלושה.

הצעדים הבאים

אם אתה רוצה ללמוד יותר על Transform רכיב, לראות Transform מדריך רכיב . אתה יכול למצוא עוד משאבים על https://www.tensorflow.org/tfx/tutorials

אנא ראה הבנת TFX צנרת כדי ללמוד עוד על מושגים שונים TFX.