A TFX SchemaGen component to generate a schema from the training data.
tfx.v1.components.SchemaGen(
    statistics: types.BaseChannel,
    infer_feature_shape: Optional[Union[bool, tfx.v1.dsl.experimental.RuntimeParameter]] = True,
    exclude_splits: Optional[List[str]] = None
)
The SchemaGen component uses TensorFlow Data Validation to generate a schema from input statistics. The following TFX libraries use the schema:
- TensorFlow Data Validation
- TensorFlow Transform
- TensorFlow Model Analysis
In a typical TFX pipeline, the SchemaGen component generates a schema which is consumed by the other pipeline components.
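To make the producer/consumer relationship concrete, here is a minimal plain-Python sketch of a downstream check that reads a schema. The dict-based `SCHEMA` and the `find_anomalies` helper are hypothetical stand-ins invented for illustration. In a real pipeline the schema is a Schema artifact, and validation is handled by TensorFlow Data Validation rather than by code like this.

```python
from typing import Any, Dict, List

# Hypothetical, simplified schema: feature name -> expected Python type.
# A real TFX schema is a protocol buffer with far richer constraints.
SCHEMA: Dict[str, type] = {"age": int, "country": str}


def find_anomalies(example: Dict[str, Any], schema: Dict[str, type]) -> List[str]:
    """Return human-readable problems found in one example (illustrative only)."""
    problems = []
    for name, expected in schema.items():
        if name not in example:
            problems.append(f"missing feature: {name}")
        elif not isinstance(example[name], expected):
            problems.append(f"wrong type for {name}")
    # Features the schema does not know about are also reported.
    for name in example:
        if name not in schema:
            problems.append(f"unexpected feature: {name}")
    return problems


print(find_anomalies({"age": 31, "country": "NZ"}, SCHEMA))
print(find_anomalies({"age": "31", "zip": "6011"}, SCHEMA))
```

The first call reports no problems. The second reports a wrong type, a missing feature, and an unexpected feature, which is the kind of feedback a schema lets downstream components produce.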
Example
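from tfx.v1.components import SchemaGen

# Generates a schema based on statistics files.
# `statistics_gen` is an upstream StatisticsGen component defined elsewhere in the pipeline.
infer_schema = SchemaGen(statistics=statistics_gen.outputs['statistics'])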
Component outputs contains:
- schema: Channel of type standard_artifacts.Schema for schema result.
See the SchemaGen guide for more details.
Args | |
---|---|
statistics | A BaseChannel of ExampleStatistics type (required if spec is not passed). This should contain at least a train split. Other splits are currently ignored. Required. |
infer_feature_shape | Boolean (or RuntimeParameter) value indicating whether to infer the shape of features. If the feature shape is not inferred, a downstream TensorFlow Transform component using the schema will parse input as tf.SparseTensor. Defaults to True if not set. |
exclude_splits | Names of splits that will not be taken into consideration when auto-generating a schema. The default (exclude_splits set to None) excludes no splits. |
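The effect of infer_feature_shape and exclude_splits can be illustrated with a toy, plain-Python inference routine. Everything in it is an assumption made for illustration: the per-split statistics layout, the `infer_toy_schema` function, the merging of statistics across splits, and the rule that a fixed shape is recorded only when every example carries the same non-zero number of values. The real component delegates inference to TensorFlow Data Validation and emits a Schema artifact, so treat this only as a sketch of the semantics.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical statistics layout:
# split name -> feature name -> (min, max) number of values per example.
Stats = Dict[str, Dict[str, Tuple[int, int]]]


def infer_toy_schema(stats: Stats,
                     infer_feature_shape: bool = True,
                     exclude_splits: Optional[List[str]] = None) -> Dict[str, dict]:
    """Toy schema inference; not the TFX or TFDV implementation."""
    excluded = set(exclude_splits or [])  # None means no split is excluded
    ranges: Dict[str, Tuple[int, int]] = {}
    for split, features in stats.items():
        if split in excluded:
            continue
        # Widen each feature's value-count range across the remaining splits.
        for name, (lo, hi) in features.items():
            old_lo, old_hi = ranges.get(name, (lo, hi))
            ranges[name] = (min(old_lo, lo), max(old_hi, hi))

    schema: Dict[str, dict] = {}
    for name, (lo, hi) in ranges.items():
        if infer_feature_shape and lo == hi and lo > 0:
            schema[name] = {"shape": [lo]}            # fixed shape
        else:
            schema[name] = {"value_count": (lo, hi)}  # variable length
    return schema


STATS: Stats = {
    "train": {"age": (1, 1), "tags": (0, 3)},
    "eval": {"age": (0, 1), "tags": (1, 2)},
}

print(infer_toy_schema(STATS))
print(infer_toy_schema(STATS, exclude_splits=["eval"]))
print(infer_toy_schema(STATS, infer_feature_shape=False, exclude_splits=["eval"]))
```

In this toy, `age` receives a fixed shape only when the `eval` split (where it is sometimes missing) is excluded and shape inference is on. A feature left without a shape corresponds to the case the table describes, where TensorFlow Transform parses the input as tf.SparseTensor.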
Attributes | |
---|---|
outputs | Component's output channel dict. |