Definition for TFX ImporterNode.
Inherits From: BaseNode
tfx.components.ImporterNode(
instance_name: Text,
source_uri: Text,
artifact_type: Type[tfx.types.Artifact],
reimport: Optional[bool] = False,
properties: Optional[Dict[Text, Union[Text, int]]] = None,
custom_properties: Optional[Dict[Text, Union[Text, int]]] = None
)
ImporterNode is a special TFX node that registers an external resource into ML Metadata (MLMD) so that downstream nodes can use the registered artifact as input.
Here is an example of using ImporterNode:
...
importer = ImporterNode(
    instance_name='import_schema',
    source_uri='uri/to/schema',
    artifact_type=standard_artifacts.Schema,
    reimport=False)
schema_gen = SchemaGen(
    fixed_schema=importer.outputs['result'],
    examples=...)
...
Args | |
---|---|
instance_name | the name of the ImporterNode instance. |
source_uri | the URI of the resource that needs to be registered. |
artifact_type | the type of the artifact to import. |
reimport | whether or not to re-import as a new artifact if the URI has been imported before. |
properties | Dictionary of properties for the imported Artifact. These properties should be ones declared for the given artifact_type (see the PROPERTIES attribute of the definition of the type for details). |
custom_properties | Dictionary of custom properties for the imported Artifact. These properties should be of type Text or int. |
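The `reimport` flag and the `custom_properties` type constraint from the table above can be illustrated with a small toy model. This is only a sketch: `ToyStore` and `ToyArtifact` are hypothetical stand-ins for MLMD and its artifact records, not TFX classes, and the sketch assumes that with `reimport=False` a previously registered artifact for the same URI is reused rather than registered again.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union

PropertyValue = Union[str, int]


@dataclass
class ToyArtifact:
    """Hypothetical artifact record; not a TFX or MLMD class."""
    artifact_id: int
    uri: str
    custom_properties: Dict[str, PropertyValue] = field(default_factory=dict)


class ToyStore:
    """Hypothetical in-memory registry standing in for MLMD."""

    def __init__(self) -> None:
        self._artifacts: List[ToyArtifact] = []

    def import_uri(
        self,
        uri: str,
        reimport: bool = False,
        custom_properties: Optional[Dict[str, PropertyValue]] = None,
    ) -> ToyArtifact:
        props = dict(custom_properties or {})
        # Custom properties are documented as Text or int only.
        for key, value in props.items():
            if not isinstance(value, (str, int)):
                raise TypeError(f"custom property {key!r} must be str or int")
        if not reimport:
            # Assumption: reuse the most recent artifact registered for this URI.
            for artifact in reversed(self._artifacts):
                if artifact.uri == uri:
                    return artifact
        # Otherwise register a new artifact, even if the URI was seen before.
        artifact = ToyArtifact(len(self._artifacts) + 1, uri, props)
        self._artifacts.append(artifact)
        return artifact


store = ToyStore()
first = store.import_uri("uri/to/schema")
again = store.import_uri("uri/to/schema")                   # reuses `first`
fresh = store.import_uri("uri/to/schema", reimport=True)    # new record
```

In this model, `again` is the same record as `first`, while `fresh` receives a new id for the same URI.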
Attributes | |
---|---|
_source_uri | the source uri to import. |
_reimport | whether or not to re-import the URI even if it already exists in MLMD. |
component_id | |
component_type | |
downstream_nodes | |
exec_properties | |
id | Node id, unique across all TFX nodes in a pipeline. |
inputs | |
outputs | |
type | |
upstream_nodes | |
Methods
add_downstream_node
add_downstream_node(
downstream_node
)
Experimental: Add another component that must run after this one.
This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.
Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.
It is symmetric with add_upstream_node.
Args | |
---|---|
downstream_node | a component that must run after this node. |
add_upstream_node
add_upstream_node(
upstream_node
)
Experimental: Add another component that must run before this one.
This method enables task-based dependencies by enforcing execution order for synchronous pipelines on supported platforms. Currently, the supported platforms are Airflow, Beam, and Kubeflow Pipelines.
Note that this API call should be considered experimental, and may not work with asynchronous pipelines, sub-pipelines and pipelines with conditional nodes. We also recommend relying on data for capturing dependencies where possible to ensure data lineage is fully captured within MLMD.
It is symmetric with add_downstream_node.
Args | |
---|---|
upstream_node | a component that must run before this node. |
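One way to read the statement that `add_upstream_node` and `add_downstream_node` are symmetric is that each call records the dependency on both nodes. The sketch below models that reading with a hypothetical `ToyNode` class; it is not the TFX `BaseNode` implementation and does not schedule anything.

```python
class ToyNode:
    """Hypothetical node that only tracks task-based dependency edges."""

    def __init__(self, node_id: str) -> None:
        self.id = node_id
        self.upstream_nodes = set()
        self.downstream_nodes = set()

    def add_upstream_node(self, upstream_node: "ToyNode") -> None:
        # `upstream_node` must run before this node.
        self.upstream_nodes.add(upstream_node)
        upstream_node.downstream_nodes.add(self)

    def add_downstream_node(self, downstream_node: "ToyNode") -> None:
        # `downstream_node` must run after this node.
        self.downstream_nodes.add(downstream_node)
        downstream_node.upstream_nodes.add(self)


importer = ToyNode("importer")
trainer = ToyNode("trainer")
pusher = ToyNode("pusher")

trainer.add_upstream_node(importer)   # importer runs before trainer
trainer.add_downstream_node(pusher)   # pusher runs after trainer
```

Either call leaves both endpoints consistent, so `importer.add_downstream_node(trainer)` would describe the same edge as `trainer.add_upstream_node(importer)`.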
from_json_dict
@classmethod
from_json_dict(
dict_data: Dict[Text, Any]
) -> Any
Convert from dictionary data to an object.
get_id
@classmethod
get_id(
instance_name: Optional[Text] = None
)
Gets the id of a node.
This can be used during pipeline authoring time. For example:
from tfx.components import Trainer
resolver = ResolverNode(
    ...,
    model=Channel(
        type=Model,
        producer_component_id=Trainer.get_id('my_trainer')))
Args | |
---|---|
instance_name | (Optional) instance name of a node. If given, the instance name will be taken into consideration when generating the id. |
Returns | |
---|---|
an id for the node. |
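The reference does not spell out how the instance name is combined into the id. The sketch below assumes a `<ClassName>.<instance_name>` format, falling back to the bare class name when no instance name is given. `ToyBaseNode` and `ToyTrainer` are hypothetical classes used only for illustration.

```python
from typing import Optional


class ToyBaseNode:
    """Hypothetical base class; the id format here is an assumption."""

    @classmethod
    def get_id(cls, instance_name: Optional[str] = None) -> str:
        # Derive the id from the class, so no instance is needed at authoring time.
        if instance_name:
            return f"{cls.__name__}.{instance_name}"
        return cls.__name__


class ToyTrainer(ToyBaseNode):
    pass


default_id = ToyTrainer.get_id()
named_id = ToyTrainer.get_id("my_trainer")
```

Because `get_id` is a class method, the id of a node can be computed before that node is constructed, which is what makes it usable as a `producer_component_id` in the example above.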
to_json_dict
to_json_dict() -> Dict[Text, Any]
Convert from an object to a JSON serializable dictionary.
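`to_json_dict` and `from_json_dict` form a serialization pair. A minimal sketch of the round trip, using a hypothetical `ToyImporterSpec` class rather than a real TFX node, and assuming the dictionary holds only JSON-compatible values:

```python
import json
from typing import Any, Dict


class ToyImporterSpec:
    """Hypothetical JSON-serializable object; not a TFX class."""

    def __init__(self, instance_name: str, source_uri: str, reimport: bool = False) -> None:
        self.instance_name = instance_name
        self.source_uri = source_uri
        self.reimport = reimport

    def to_json_dict(self) -> Dict[str, Any]:
        # Only plain JSON types (str, bool) are emitted.
        return {
            "instance_name": self.instance_name,
            "source_uri": self.source_uri,
            "reimport": self.reimport,
        }

    @classmethod
    def from_json_dict(cls, dict_data: Dict[str, Any]) -> "ToyImporterSpec":
        return cls(**dict_data)


spec = ToyImporterSpec("import_schema", "uri/to/schema")
payload = json.dumps(spec.to_json_dict())
restored = ToyImporterSpec.from_json_dict(json.loads(payload))
```

After the round trip, `restored.to_json_dict()` equals `spec.to_json_dict()`.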
with_id
with_id(
id: Text
) -> "BaseNode"