Class AnalyzeDataset
Takes a preprocessing_fn and computes the relevant statistics.
AnalyzeDataset accepts a preprocessing_fn in its constructor. When its expand
method is called on a dataset, it computes all the relevant statistics
required to run the transformation described by the preprocessing_fn, and
returns a TransformFn representing the application of the preprocessing_fn.
Args:
preprocessing_fn: A function that accepts and returns a dictionary from strings to Tensor or SparseTensors.
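The analyze-then-transform pattern described above can be sketched without Beam or TensorFlow. This is a toy illustration only, not the tf.Transform implementation: analyze_dataset is a hypothetical helper, the dataset is a plain list of dicts, and the returned closure merely stands in for a TransformFn.

```python
import math


def analyze_dataset(dataset, column):
    """Toy analysis phase: one full pass over the dataset to compute statistics.

    Hypothetical helper for illustration; it is not part of tf.Transform.
    """
    values = [row[column] for row in dataset]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)

    def transform_fn(row):
        # The statistics are now fixed constants, so a single row can be
        # transformed without looking at the rest of the dataset.
        scaled = dict(row)
        scaled[column] = (row[column] - mean) / std if std else 0.0
        return scaled

    return transform_fn


dataset = [{"x": 1.0}, {"x": 2.0}, {"x": 3.0}]
transform_fn = analyze_dataset(dataset, "x")
transformed = [transform_fn(row) for row in dataset]
```

The point of the split is that the expensive full pass happens once, and the resulting function can then be applied row by row to the same or to new data.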
__init__
__init__(
preprocessing_fn,
pipeline=None
)
Properties
label
Methods
__long__
__long__()
__native__
__native__()
Hook for the future.utils.native() function
__nonzero__
__nonzero__()
__or__
__or__(right)
Used to compose PTransforms, e.g., ptransform1 | ptransform2.
__ror__
__ror__(
left,
label=None
)
Used to apply this PTransform to non-PValues, e.g., a tuple.
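Both operators rely on Python's operator-overloading protocol. The sketch below is a toy model, not Beam's implementation: ToyTransform is a hypothetical class in which `a | b` builds a composite through __or__, and `value | a` falls back to __ror__ because a tuple does not define `|` itself.

```python
class ToyTransform:
    """Hypothetical stand-in for a PTransform; wraps a plain function."""

    def __init__(self, fn, label):
        self.fn = fn
        self.label = label

    def __or__(self, right):
        # transform1 | transform2: compose into a new transform.
        if not isinstance(right, ToyTransform):
            return NotImplemented
        return ToyTransform(
            lambda x: right.fn(self.fn(x)),
            f"{self.label}|{right.label}",
        )

    def __ror__(self, left):
        # value | transform: Python calls this when the left operand
        # (here a tuple) cannot handle the operation itself.
        return self.fn(left)


double = ToyTransform(lambda xs: tuple(x * 2 for x in xs), "double")
total = ToyTransform(sum, "total")

composed = double | total       # uses __or__
result = (1, 2, 3) | composed   # uses __ror__
```

In this toy version the composite is applied eagerly; a real pipeline would typically defer execution until the pipeline is run.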
__rrshift__
__rrshift__(label)
__unicode__
__unicode__()
default_label
default_label()
default_type_hints
default_type_hints()
display_data
display_data()
Returns the display data associated with a pipeline component.
It should be reimplemented in pipeline components that wish to have static display data.
Returns:
Dict[str, Any]: A dictionary containing key:value pairs. The value might be an
integer, float or string value; a DisplayDataItem for values that have more
data (e.g. short value, label, url); or a HasDisplayData instance that has
more display data that should be picked up. For example:
{
  'key1': 'string_value',
  'key2': 1234,
  'key3': 3.14159265,
  'key4': DisplayDataItem('apache.org', url='http://apache.org'),
  'key5': subComponent
}
expand
expand(dataset)
from_runner_api
from_runner_api(
cls,
proto,
context
)
get_type_hints
get_type_hints()
get_windowing
get_windowing(inputs)
Returns the window function to be associated with the transform's output.
By default most transforms just return the windowing function associated with the input PCollection (or the first input if several).
infer_output_type
infer_output_type(unused_input_type)
next
next()
register_urn
register_urn(
cls,
urn,
parameter_type,
constructor=None
)
runner_api_requires_keyed_input
runner_api_requires_keyed_input()
to_runner_api
to_runner_api(
context,
has_parts=False
)
to_runner_api_parameter
to_runner_api_parameter(unused_context)
to_runner_api_pickled
to_runner_api_pickled(unused_context)
type_check_inputs
type_check_inputs(pvalueish)
type_check_inputs_or_outputs
type_check_inputs_or_outputs(
pvalueish,
input_or_output
)
type_check_outputs
type_check_outputs(pvalueish)
with_input_types
with_input_types(input_type_hint)
Annotates the input type of a PTransform with a type-hint.
Args:
input_type_hint (type): An instance of an allowed built-in type, a custom class, or an instance of apache_beam.typehints.typehints.TypeConstraint.
Raises:
TypeError: If input_type_hint is not a valid type-hint. See apache_beam.typehints.typehints.validate_composite_type_param() for further details.
Returns:
PTransform: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
with_output_types
with_output_types(type_hint)
Annotates the output type of a PTransform with a type-hint.
Args:
type_hint (type): An instance of an allowed built-in type, a custom class, or an apache_beam.typehints.typehints.TypeConstraint.
Raises:
TypeError: If type_hint is not a valid type-hint. See apache_beam.typehints.typehints.validate_composite_type_param() for further details.
Returns:
PTransform: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
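Returning the instance is what makes the two type-hinting methods chainable. The following is a minimal sketch of that design, using a hypothetical ToyTypedTransform class rather than Beam's PTransform; its validation is deliberately simplified and accepts only plain Python classes, whereas the real methods also accept TypeConstraint instances.

```python
class ToyTypedTransform:
    """Hypothetical stand-in for a PTransform that records type hints."""

    def __init__(self):
        self.input_type = None
        self.output_type = None

    @staticmethod
    def _validate(type_hint):
        # Simplified check (assumption): only plain classes are accepted here.
        if not isinstance(type_hint, type):
            raise TypeError(f"not a valid type-hint: {type_hint!r}")
        return type_hint

    def with_input_types(self, input_type_hint):
        self.input_type = self._validate(input_type_hint)
        return self  # returning the instance enables chaining

    def with_output_types(self, type_hint):
        self.output_type = self._validate(type_hint)
        return self


# Both annotations are applied in one expression on the same object.
transform = ToyTypedTransform().with_input_types(int).with_output_types(str)
```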