Join the SIG TFX-Addons community and help make TFX even better!


Airflow-specific TFX Component.

This class wrap a component run into its own PythonOperator in Airflow.

parent_dag An AirflowPipeline instance as the pipeline DAG.
component An instance of base_node.BaseNode that holds all properties of a logical component.
component_launcher_class The class of the launcher to launch the component.
pipeline_info An instance of data_types.PipelineInfo that holds pipeline properties.
enable_cache Whether or not cache is enabled for this component run.
metadata_connection_config A config proto for metadata connection.
beam_pipeline_args Pipeline arguments for Beam powered Components.
additional_pipeline_args Additional pipeline args.
component_config Component config to launch the component.

dag Returns the Operator's DAG if set, otherwise raises an error
dag_id Returns dag id if it has one or an adhoc + owner
downstream_list @property: list of tasks directly downstream
downstream_task_ids @property: set of ids of tasks directly downstream
inherits_from_dummy_operator Used to determine if an Operator is inherited from DummyOperator
leaves Required by TaskMixin
log Returns a logger.
output Returns reference to XCom pushed by current operator
priority_weight_total Total priority weight for the task. It might include all upstream or downstream tasks. depending on the weight rule.

  • WeightRule.ABSOLUTE - only own weight
  • WeightRule.DOWNSTREAM - adds priority weight of all downstream tasks
  • WeightRule.UPSTREAM - adds priority weight of all upstream tasks
roots Required by TaskMixin
task_type @property: type of the task
upstream_list @property: list of tasks directly upstream
upstream_task_ids @property: set of ids of tasks directly upstream



Sets inlets to this operator


Adds only new items to item set


Defines the outlets of this operator


Clears the state of task instances associated with the task, following the parameters specified.


Performs dry run for the operator - just render template fields.


This is the main method to derive when creating an operator. Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.


Calls the python callable with the given arguments.

:return: the return value of the call. :rtype: any


Get set of the direct relative ids to the current task, upstream or downstream.


Get list of the direct relatives to the current task, upstream or downstream.

For an operator, gets the URL that the external links specified in extra_links should point to.

:raise ValueError: The error message of a ValueError will be passed on through to the fronted to show up as a tooltip on the disabled link :param dttm: The datetime parsed execution date for the URL being searched for :param link_name: The name of the link we're looking for the URL for. Should be one of the options specified in extra_links :return: A URL


Get a flat set of relatives' ids, either upstream or downstream.


Get a flat list of relatives, either upstream or downstream.


:return: list of inlets defined for this operator


:return: list of outlets defined for this operator


Stringified DAGs and operators contain exactly these fields.


Get a set of task instance related to this task for a specific date range.


Fetch a Jinja template environment from the DAG or instantiate empty environment if no DAG.


Returns True if the Operator has been assigned to a DAG.


Return if this operator can use smart service. Default False.


Override this method to cleanup subprocesses when a task instance gets killed. Any use of the threading, subprocess or multiprocessing module within an operator needs to be cleaned up or it will leave ghost processes behind.


This hook is triggered right after self.execute() is called. It is passed the execution context and any results returned by the operator.


This hook is triggered right before self.execute() is called.


Lock task for execution to disable custom action in setattr and returns a copy of the task


Hook that is triggered after the templated fields get replaced by their content. If you need your operator to alter the content of the file before the template is rendered, it should override this method to do so.


Render a templated string. The content can be a collection holding multiple templated strings and will be templated recursively.

:param content: Content to template. Only strings can be templated (may be inside collection). :type content: Any :param context: Dict with values to apply on templated content :type context: dict :param jinja_env: Jinja environment. Can be provided to avoid re-creating Jinja environments during recursion. :type jinja_env: jinja2.Environment :param seen_oids: template fields already rendered (to avoid RecursionError on circular dependencies) :type seen_oids: set :return: Templated content


Template all attributes listed in template_fields. Note this operation is irreversible.

:param context: Dict with values to apply on content :type context: dict :param jinja_env: Jinja environment :type jinja_env: jinja2.Environment


Getting the content of files for template_field / template_ext


Run a set of task instances for a date range.


Set a task or a task list to be directly downstream from the current task. Required by TaskMixin.


Set a task or a task list to be directly upstream from the current task. Required by TaskMixin.


Resolves upstream dependencies of a task. In this way passing an XComArg as value for a template field will result in creating upstream relation between two tasks.

Example: ::

with DAG(...):
    generate_content = GenerateContentOperator(task_id="generate_content")
    send_email = EmailOperator(..., html_content=generate_content.output)

# This is equivalent to
with DAG(...):
    generate_content = GenerateContentOperator(task_id="generate_content")
    send_email = EmailOperator(
        ..., html_content="{ { task_instance.xcom_pull('generate_content') } }"
    generate_content >> send_email


Update relationship information about another TaskMixin. Default is no-op. Override if necessary.


Pull XComs that optionally meet certain criteria.

The default value for key limits the search to XComs that were returned by other tasks (as opposed to those that were pushed manually). To remove this filter, pass key=None (or any desired value).

If a single task_id string is provided, the result is the value of the most recent matching XCom from that task_id. If multiple task_ids are provided, a tuple of matching values is returned. None is returned whenever no matches are found.

:param context: Execution Context Dictionary :type: Any :param key: A key for the XCom. If provided, only XComs with matching keys will be returned. The default key is 'return_value', also available as a constant XCOM_RETURN_KEY. This key is automatically given to XComs returned by tasks (as opposed to being pushed manually). To remove the filter, pass key=None. :type key: str :param task_ids: Only XComs from tasks with matching ids will be pulled. Can pass None to remove the filter. :type task_ids: str or iterable of strings (representing task_ids) :param dag_id: If provided, only pulls XComs from this DAG. If None (default), the DAG of the calling task is used. :type dag_id: str :param include_prior_dates: If False, only XComs from the current execution_date are returned. If True, XComs from previous dates are returned as well. :type include_prior_dates: bool


Make an XCom available for tasks to pull.

:param context: Execution Context Dictionary :type: Any :param key: A key for the XCom :type key: str :param value: A value for the XCom. The value is pickled and stored in the database. :type value: any pickleable object :param execution_date: if provided, the XCom will not be visible until this date. This can be used, for example, to send a message to a task on a future date without it being immediately visible. :type execution_date: datetime


Return self==value.


Return a >= b. Computed by @total_ordering from (not a < b).


Called for [Operator] > [Outlet], so that if other is an attr annotated object it is set as an outlet of this Operator.


Return a <= b. Computed by @total_ordering from (a < b) or (a == b).


Called for [Inlet] > [Operator] or [Operator] < [Inlet], so that if other is an attr annotated object it is set as an inlet to this operator


Return self!=value.


Called for [This Operator] | [Operator], The inlets of other will be set to pickup the outlets from this operator. Other will be set as a downstream task of this operator.


 <TIDep(Not In Retry Period)>,
 <TIDep(Not Previously Skipped)>,
 <TIDep(Previous Dagrun State)>,
 <TIDep(Trigger Rule)>

extra_links Instance of cached_property.cached_property

@property: extra links for the task

global_operator_extra_link_dict Instance of cached_property.cached_property

Returns dictionary of all global extra links

operator_extra_link_dict Instance of cached_property.cached_property

Returns dictionary of all extra links for the operator

operator_extra_links ()
pool ''
shallow_copy_attrs ('python_callable', 'op_kwargs')
supports_lineage False
template_ext ()
template_fields ('templates_dict', 'op_args', 'op_kwargs')

 'op_args': 'py',
 'op_kwargs': 'py',
 'templates_dict': 'json'

ui_color '#ffefeb'
ui_fgcolor '#000'