tfx.orchestration.kubeflow.kubeflow_metadata_adapter.KubeflowMetadataAdapter

A Metadata adapter class for pipelines run using KFP.

Inherits From: Metadata

tfx.orchestration.kubeflow.kubeflow_metadata_adapter.KubeflowMetadataAdapter(
    connection_config
)

This is used to add properties to artifacts and executions, such as the Argo pod IDs.

Attributes:

  • store: Returns the underlying MetadataStore.

Methods

__enter__

__enter__()

__exit__

__exit__(
    exc_type, exc_value, exc_tb
)

get_artifacts_by_type

get_artifacts_by_type(
    type_name
)

Fetches artifacts given artifact type name.

get_artifacts_by_uri

get_artifacts_by_uri(
    uri
)

Fetches artifacts given uri.

get_cached_outputs

get_cached_outputs(
    input_artifacts, exec_properties, pipeline_info, component_info
)

Fetches cached output artifacts if any.

Returns the output artifacts of a cached execution, if any. An eligible cached execution must have taken the same input artifacts and execution properties, and must be associated with the same pipeline context.

Args:

  • input_artifacts: inputs used by the run.
  • exec_properties: execution properties used by the run.
  • pipeline_info: info of the current pipeline run.
  • component_info: info of the current component.

Returns:

A dict of cached output artifacts if an eligible cached execution is found; otherwise, None.
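
The eligibility rule described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the TFX implementation: PastExecution and find_cached_outputs are hypothetical names, artifacts are reduced to plain ids and URIs, and the pipeline context is reduced to a string.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class PastExecution:
    """Hypothetical record of a finished execution (not an MLMD type)."""
    pipeline_context: str
    input_artifact_ids: Dict[str, List[int]]
    exec_properties: Dict[str, Any]
    output_uris: Dict[str, List[str]]


def find_cached_outputs(
    history: List[PastExecution],
    pipeline_context: str,
    input_artifact_ids: Dict[str, List[int]],
    exec_properties: Dict[str, Any],
) -> Optional[Dict[str, List[str]]]:
    """Returns the outputs of the first eligible past execution, else None."""
    for past in history:
        # Eligible only if all three conditions hold at once.
        if (
            past.pipeline_context == pipeline_context
            and past.input_artifact_ids == input_artifact_ids
            and past.exec_properties == exec_properties
        ):
            return past.output_uris
    return None


history = [
    PastExecution("my_pipeline", {"examples": [1]}, {"steps": 100},
                  {"model": ["/tmp/model/1"]}),
]
# Same inputs, properties, and pipeline context: a cache hit.
hit = find_cached_outputs(history, "my_pipeline", {"examples": [1]}, {"steps": 100})
# A changed execution property makes the past execution ineligible.
miss = find_cached_outputs(history, "my_pipeline", {"examples": [1]}, {"steps": 200})
```

In this toy model a caller would reuse `hit` directly and run the component only when the lookup returns None.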

get_component_run_context

get_component_run_context(
    component_info
)

Gets the context for the component run.

Args:

  • component_info: component information for the current component run.

Returns:

a matched context or None

get_pipeline_context

get_pipeline_context(
    pipeline_info
)

Gets the context for the pipeline.

Args:

  • pipeline_info: pipeline information for the current pipeline run.

Returns:

a matched context or None

get_pipeline_run_context

get_pipeline_run_context(
    pipeline_info
)

Gets the context for the pipeline run.

Args:

  • pipeline_info: pipeline information for the current pipeline run.

Returns:

a matched context or None

get_published_artifacts_by_type_within_context

get_published_artifacts_by_type_within_context(
    type_names, context_id
)

Fetches artifacts given artifact type names and a context id.

get_qualified_artifacts

get_qualified_artifacts(
    context, type_name, producer_component_id=None, output_key=None
)

Gets qualified artifacts that have the right producer info.

Args:

  • context: context constraint to filter artifacts
  • type_name: type constraint to filter artifacts
  • producer_component_id: producer constraint to filter artifacts
  • output_key: output key constraint to filter artifacts

Returns:

A list of ArtifactAndType, containing qualified artifacts.
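
As a rough illustration of how the four constraints might combine, the sketch below filters a list of toy records. It assumes that a constraint left at its default of None is simply not applied; that reading, along with the names ToyArtifactRecord and filter_qualified, is an assumption for illustration and not taken from the TFX source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyArtifactRecord:
    """Hypothetical stand-in for an artifact plus its producer info."""
    context: str
    type_name: str
    producer_component_id: str
    output_key: str


def filter_qualified(
    records: List[ToyArtifactRecord],
    context: str,
    type_name: str,
    producer_component_id: Optional[str] = None,
    output_key: Optional[str] = None,
) -> List[ToyArtifactRecord]:
    """Keeps records matching every constraint that is not None."""
    qualified = []
    for record in records:
        # Context and type are always required.
        if record.context != context or record.type_name != type_name:
            continue
        # Assumption: optional constraints are skipped when None.
        if (producer_component_id is not None
                and record.producer_component_id != producer_component_id):
            continue
        if output_key is not None and record.output_key != output_key:
            continue
        qualified.append(record)
    return qualified


records = [
    ToyArtifactRecord("run_1", "Model", "Trainer", "model"),
    ToyArtifactRecord("run_1", "Model", "Tuner", "best_model"),
    ToyArtifactRecord("run_2", "Model", "Trainer", "model"),
]
all_models_in_run = filter_qualified(records, "run_1", "Model")
trainer_models = filter_qualified(records, "run_1", "Model",
                                  producer_component_id="Trainer")
```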

publish_artifacts

publish_artifacts(
    tfx_artifact_list
)

Publishes artifacts to MLMD.

This call also updates the original tfx artifact list so that each artifact contains its artifact type info and artifact id.

Args:

  • tfx_artifact_list: A list of tfx.types.Artifact which will be updated
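
The side effect on the input list can be pictured with a toy model: the function returns nothing, and the caller reads the assigned ids from the same objects it passed in. ToyArtifact, publish_toy_artifacts, and the id-numbering scheme are hypothetical and only stand in for what MLMD would assign.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ToyArtifact:
    """Hypothetical stand-in for tfx.types.Artifact."""
    type_name: str
    uri: str
    id: Optional[int] = None
    type_id: Optional[int] = None


def publish_toy_artifacts(
    artifacts: List[ToyArtifact],
    type_ids: Dict[str, int],
    first_id: int = 1,
) -> None:
    """Assigns type ids and artifact ids on the passed-in objects."""
    next_id = first_id
    for artifact in artifacts:
        # Reuse the type id if the type is known, else register a new one.
        if artifact.type_name not in type_ids:
            type_ids[artifact.type_name] = len(type_ids) + 1
        artifact.type_id = type_ids[artifact.type_name]
        artifact.id = next_id
        next_id += 1


known_types: Dict[str, int] = {}
to_publish = [
    ToyArtifact("Examples", "/tmp/examples/1"),
    ToyArtifact("Model", "/tmp/model/1"),
    ToyArtifact("Examples", "/tmp/examples/2"),
]
publish_toy_artifacts(to_publish, known_types)
# The caller's own list now carries the ids; nothing was returned.
assigned_ids = [a.id for a in to_publish]
```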

publish_execution

publish_execution(
    component_info, output_artifacts=None, exec_properties=None
)

Publishes an execution with input and output artifacts info.

This method will publish any execution with non-final states. It will register unseen artifacts and publish events for them.

Args:

  • component_info: component information.
  • output_artifacts: output artifacts produced by the execution.
  • exec_properties: execution properties for the execution to be published.

register_execution

register_execution(
    pipeline_info, component_info, contexts, exec_properties=None,
    input_artifacts=None
)

Registers a new execution in metadata.

Args:

  • pipeline_info: optional pipeline info of the execution.
  • component_info: optional component info of the execution.
  • contexts: contexts for the current run. All contexts will be linked to the execution. In addition, a component run context will be added to the contexts list.
  • exec_properties: the execution properties of the execution.
  • input_artifacts: input artifacts of the execution.

Returns:

execution id of the new execution.

register_pipeline_contexts_if_not_exists

register_pipeline_contexts_if_not_exists(
    pipeline_info
)

Creates or fetches the pipeline contexts needed for the run.

There are two potential contexts:

  • Context for the pipeline.
  • Context for the current pipeline run. This is optional, only available when run_id is specified.

Args:

  • pipeline_info: pipeline information for current run.

Returns:

a list of contexts (of size one or two).
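
The create-or-fetch behavior and the one-or-two result size can be sketched with a plain dict acting as the store. This is a toy model under stated assumptions: the function name, the dict-based store, and the "pipeline_name.run_id" naming scheme are hypothetical and not how TFX names its contexts.

```python
from typing import Dict, List, Optional


def register_toy_contexts_if_not_exists(
    store: Dict[str, dict],
    pipeline_name: str,
    run_id: Optional[str] = None,
) -> List[dict]:
    """Creates or fetches the pipeline context and, if run_id is given,
    the pipeline run context."""
    names = [pipeline_name]
    if run_id is not None:
        # Hypothetical naming scheme for the run-level context.
        names.append(f"{pipeline_name}.{run_id}")

    contexts = []
    for name in names:
        if name not in store:
            # Create the context only on first sight.
            store[name] = {"id": len(store) + 1, "name": name}
        contexts.append(store[name])
    return contexts


store: Dict[str, dict] = {}
without_run = register_toy_contexts_if_not_exists(store, "my_pipeline")
with_run = register_toy_contexts_if_not_exists(store, "my_pipeline", "run_1")
# Calling again fetches the existing contexts instead of creating new ones.
again = register_toy_contexts_if_not_exists(store, "my_pipeline", "run_1")
```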

search_artifacts

search_artifacts(
    artifact_name, pipeline_info, producer_component_id
)

Searches for artifacts that match the given info.

Args:

  • artifact_name: the name of the artifact that is set by the producer component. The name is logged both in the artifacts and in the events when the execution is published.
  • pipeline_info: the information of the current pipeline
  • producer_component_id: the id of the component that produces the artifact

Returns:

A list of Artifacts that match the given info.

Raises:

  • RuntimeError: when no matching execution is found given producer info.

update_artifact_state

update_artifact_state(
    artifact, new_state
)

Updates the state of a given artifact.

update_execution

update_execution(
    execution, component_info, input_artifacts=None, output_artifacts=None,
    exec_properties=None, execution_state=None, artifact_state=None, contexts=None
)

Updates the given execution in MLMD based on given information.

All provided artifacts will be registered if they are not already. The registered ids will be reflected inline.

Args:

  • execution: the execution to be updated. It is required that the execution passed in has an id.
  • component_info: the information of the current running component
  • input_artifacts: artifacts to be declared as inputs of the execution
  • output_artifacts: artifacts to be declared as outputs of the execution
  • exec_properties: execution properties of the execution
  • execution_state: the state to which the execution will be updated
  • artifact_state: the state to which the artifacts will be updated
  • contexts: a list of contexts to which the execution and artifacts will be linked

Raises:

  • RuntimeError: if the execution to be updated has no id.
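
Because an execution without an id is rejected, a caller may want to check for the id before attempting an update. The sketch below models that precondition with a plain dict; require_execution_id is a hypothetical helper, and the dict only stands in for an MLMD execution.

```python
from typing import Optional


def require_execution_id(execution: dict) -> int:
    """Returns the execution id, or raises RuntimeError if it is missing."""
    execution_id: Optional[int] = execution.get("id")
    if execution_id is None:
        raise RuntimeError("Execution has no id; register it before updating.")
    return execution_id


registered = {"id": 7, "state": "new"}
unregistered = {"state": "new"}

checked_id = require_execution_id(registered)

try:
    require_execution_id(unregistered)
    rejected = False
except RuntimeError:
    # Mirrors the documented failure mode for an execution without an id.
    rejected = True
```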