tfds.features.Sequence

Composite FeatureConnector for a dict where each value is a list.

Inherits From: FeatureConnector

A Sequence corresponds to a sequence of tfds.features.FeatureConnector. At generation time, a list is given for each element of the sequence. The output of tf.data.Dataset batches all the elements of the sequence together.

If the length of the sequence is static and known in advance, it should be specified in the constructor using the length param.

Note that Sequence does not support features which are of type tf.io.FixedLenSequenceFeature.

Example:

At construction time:

tfds.features.Sequence(tfds.features.Image(), length=NB_FRAME)

or:

tfds.features.Sequence({
    'frame': tfds.features.Image(shape=(64, 64, 3)),
    'action': tfds.features.ClassLabel(['up', 'down', 'left', 'right'])
}, length=NB_FRAME)

During data generation:

yield {
    'frame': np.ones(shape=(NB_FRAME, 64, 64, 3)),
    'action': ['left', 'left', 'up', ...],
}

Tensor returned by .as_dataset():

{
    'frame': tf.Tensor(shape=(NB_FRAME, 64, 64, 3), dtype=tf.uint8),
    'action': tf.Tensor(shape=(NB_FRAME,), dtype=tf.int64),
}

At generation time, you can specify a list of feature dicts, a dict of list values, or a stacked numpy array. The lists are automatically distributed to their corresponding FeatureConnector.
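The first two input forms carry the same information, which a small pure-Python sketch can illustrate. It does not use TFDS: `to_dict_of_lists` is a hypothetical helper, and the length check is only an assumption about how a static `length` might be validated.

```python
def to_dict_of_lists(steps, length=None):
    """Transpose a list of per-step feature dicts into a dict of lists.

    Hypothetical helper for illustration; not part of the TFDS API.
    """
    if length is not None and len(steps) != length:
        raise ValueError(f"Expected {length} steps, got {len(steps)}")
    keys = steps[0].keys() if steps else []
    return {key: [step[key] for step in steps] for key in keys}


# Three steps given as a list of feature dicts...
steps = [
    {"frame": "f0", "action": "left"},
    {"frame": "f1", "action": "left"},
    {"frame": "f2", "action": "up"},
]
# ...become one dict whose values are lists, one list per inner feature.
columns = to_dict_of_lists(steps, length=3)
```

Each list in `columns` is what would be handed to the matching inner feature.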

Args
feature dict, the features to wrap
length int, length of the sequence if static and known in advance
**kwargs dict, constructor kwargs of tfds.features.FeaturesDict

Attributes
dtype Return the dtype (or dict of dtype) of this FeatureConnector.
feature The inner feature.
shape Return the shape (or dict of shape) of this FeatureConnector.

Methods

decode_batch_example

Decode multiple features batched in a single tf.Tensor.

This function is used to decode features wrapped in tfds.features.Sequence(). By default, this function applies decode_example to each individual element using tf.map_fn. However, as an optimization, features can override this method to apply a custom batch decoding.

Args
tfexample_data Same tf.Tensor inputs as decode_example, but with an additional first dimension for the sequence length.

Returns
tensor_data Tensor or dictionary of tensors, output of the tf.data.Dataset object
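The default-versus-override pattern described for decode_batch_example can be sketched in plain Python. This is an illustration only: the classes below are hypothetical stand-ins, and a list comprehension plays the role that tf.map_fn plays in the real method.

```python
class Feature:
    """Hypothetical base class; not the TFDS FeatureConnector."""

    def decode_example(self, value):
        raise NotImplementedError

    def decode_batch_example(self, values):
        # Default: decode every element of the sequence independently.
        return [self.decode_example(v) for v in values]


class Label(Feature):
    names = ["up", "down"]

    def decode_example(self, value):
        return self.names[value]


class PassThrough(Feature):
    def decode_example(self, value):
        return value

    def decode_batch_example(self, values):
        # Override: no per-element work is needed, so skip the loop
        # over decode_example and return the batch directly.
        return list(values)


labels = Label().decode_batch_example([0, 1, 0])
raw = PassThrough().decode_batch_example([7, 8])
```

`Label` relies on the inherited element-wise default, while `PassThrough` replaces it with a cheaper batch-level implementation.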

decode_example

Decode the serialized examples.

Args
serialized_example Nested dict of tf.Tensor
decoders Nested dict of Decoder objects which allow customizing the decoding. The structure should match the feature structure, but only customized feature keys need to be present. See the guide for more info.

Returns
example Nested dict containing the decoded nested examples.

decode_ragged_example

Decode nested features from a tf.RaggedTensor.

This function is used to decode features wrapped in nested tfds.features.Sequence(). By default, this function applies decode_batch_example to the flat values of the ragged tensor. As an optimization, features can override this method to apply a custom batch decoding.

Args
tfexample_data tf.RaggedTensor inputs containing the nested encoded examples.

Returns
tensor_data The decoded tf.RaggedTensor or dictionary of tensors, output of the tf.data.Dataset object
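The "decode the flat values, then restore the nesting" idea can be sketched without TensorFlow. The sketch below uses nested Python lists in place of a tf.RaggedTensor; both functions are hypothetical stand-ins, not the TFDS implementation.

```python
def decode_batch_example(flat_values):
    """Hypothetical batch decoder: upper-case every encoded string."""
    return [v.upper() for v in flat_values]


def decode_ragged_example(nested):
    """Decode variable-length rows by decoding their flat values once."""
    row_lengths = [len(row) for row in nested]
    flat = [v for row in nested for v in row]
    decoded = decode_batch_example(flat)
    # Rebuild the original row structure from the recorded lengths.
    rows, start = [], 0
    for n in row_lengths:
        rows.append(decoded[start:start + n])
        start += n
    return rows


ragged = [["a", "b"], [], ["c"]]
result = decode_ragged_example(ragged)
```

The batch decoder is called once on all values, regardless of how many rows there are or how long each one is.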

encode_example

Encode the feature dict into tf-example compatible input.

The input example_data can be anything that the user passed at data generation. For example:

For features:

features={
    'image': tfds.features.Image(),
    'custom_feature': tfds.features.CustomFeature(),
}

At data generation (in _generate_examples), if the user yields:

yield {
    'image': 'path/to/img.png',
    'custom_feature': [123, 'str', lambda x: x+1]
}

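Then:

  • For tfds.features.Image: example_data will be 'path/to/img.png'
  • For tfds.features.CustomFeature: example_data will be [123, 'str', lambda x: x+1]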

Args
example_data Value or dictionary of values to convert into tf-example compatible data.

Returns
tfexample_data Data or dictionary of data to write as tf-example. Data can be a list or numpy array. Note that numpy arrays are flattened, so it is the feature connector's responsibility to reshape them in decode_example(). Note that tf.train.Example only supports int64, float32 and string, so the data returned here should be integer, float or string. The user type can be restored in decode_example().
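The flatten-on-encode, reshape-on-decode contract can be sketched with nested lists standing in for numpy arrays. Both functions are hypothetical, and the sketch assumes the connector knows the target shape at decode time.

```python
def encode_example(matrix):
    """Flatten a 2-D nested list into a flat list of ints."""
    return [int(v) for row in matrix for v in row]


def decode_example(flat, shape):
    """Restore the 2-D structure from the flat list and a known shape."""
    rows, cols = shape
    if len(flat) != rows * cols:
        raise ValueError("Flat data does not match the expected shape")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


original = [[1, 2, 3], [4, 5, 6]]
encoded = encode_example(original)          # flat list of ints
decoded = decode_example(encoded, (2, 3))   # structure restored
```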

from_config

Reconstructs the FeatureConnector from the config file.

Usage:

features = FeatureConnector.from_config('path/to/dir')  # directory containing features.json

Args
root_dir Directory containing the features.json file.

Returns
The reconstructed feature instance.

from_json

FeatureConnector factory.

This function should be called from the tfds.features.FeatureConnector base class. Subclasses should implement from_json_content.

Example:

feature = tfds.features.FeatureConnector.from_json(
    {'type': 'Image', 'content': {'shape': [32, 32, 3], 'dtype': 'uint8'} }
)
assert isinstance(feature, tfds.features.Image)

Args
value dict(type=, content=) containing the feature to restore. Matches the dict returned by to_json.

Returns
The reconstructed FeatureConnector.

from_json_content

FeatureConnector factory (to override).

Subclasses should override this method to allow importing the feature connector from the config.

This function should not be called directly. FeatureConnector.from_json should be called instead.

See existing FeatureConnector subclasses for examples of implementation.

Args
value FeatureConnector information. Matches the dict returned by to_json_content.

Returns
The reconstructed FeatureConnector.
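The split between from_json (called on the base class) and from_json_content (implemented by each subclass) can be sketched as a small registry. This is an assumption-laden illustration, not the TFDS implementation: the registry, the bare class names used as `type`, and the minimal `Image` class are all hypothetical.

```python
class FeatureConnector:
    """Hypothetical base class with a name-to-subclass registry."""

    _registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        FeatureConnector._registry[cls.__name__] = cls

    @classmethod
    def from_json(cls, value):
        # Look up the subclass named by "type", then delegate to it.
        subclass = FeatureConnector._registry[value["type"]]
        return subclass.from_json_content(value["content"])

    @classmethod
    def from_json_content(cls, value):
        raise NotImplementedError


class Image(FeatureConnector):
    def __init__(self, shape, dtype):
        self.shape = tuple(shape)
        self.dtype = dtype

    @classmethod
    def from_json_content(cls, value):
        return cls(shape=value["shape"], dtype=value["dtype"])


feature = FeatureConnector.from_json(
    {"type": "Image", "content": {"shape": [32, 32, 3], "dtype": "uint8"}}
)
```

Callers only ever touch `from_json`; each subclass decides how its own `content` is interpreted.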

get_serialized_info

See base class for details.

get_tensor_info

See base class for details.

load_metadata

See base class for details.

repr_html

Returns the HTML str representation of the object.

repr_html_batch

Returns the HTML str representation of the object (Sequence).

repr_html_ragged

Returns the HTML str representation of the object (Nested sequence).

save_config

Exports the FeatureConnector to a file.

Args
root_dir path/to/dir containing the features.json
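The directory-based round trip between save_config and from_config can be sketched with the standard library. Only the features.json file name comes from the text above; the two helper functions and the example payload are hypothetical, and the real on-disk format may contain more than this.

```python
import json
import os
import tempfile


def save_config(feature_json, root_dir):
    """Write the serialized feature to <root_dir>/features.json."""
    with open(os.path.join(root_dir, "features.json"), "w") as f:
        json.dump(feature_json, f)


def load_config(root_dir):
    """Read the serialized feature back from <root_dir>/features.json."""
    with open(os.path.join(root_dir, "features.json")) as f:
        return json.load(f)


feature_json = {
    "type": "Image",
    "content": {"shape": [32, 32, 3], "dtype": "uint8"},
}
with tempfile.TemporaryDirectory() as root_dir:
    save_config(feature_json, root_dir)
    restored = load_config(root_dir)
```

Both helpers take the directory, not the file path, mirroring the root_dir argument.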

save_metadata

See base class for details.

to_json

Exports the FeatureConnector to JSON.

Each feature is serialized as a dict(type=..., content=...).

  • type: The canonical name of the feature (module.FeatureName).
  • content: is specific to each feature connector and defined in to_json_content. Can contain nested sub-features (like for tfds.features.FeaturesDict and tfds.features.Sequence).

For example:

tfds.features.FeaturesDict({
    'input': tfds.features.Image(),
    'target': tfds.features.ClassLabel(num_classes=10),
})

Is serialized as:

{
    "type": "tensorflow_datasets.core.features.features_dict.FeaturesDict",
    "content": {
        "input": {
            "type": "tensorflow_datasets.core.features.image_feature.Image",
            "content": {
                "shape": [null, null, 3],
                "dtype": "uint8",
                "encoding_format": "png"
            }
        },
        "target": {
            "type": "tensorflow_datasets.core.features.class_label_feature.ClassLabel",
            "content": {
                "num_classes": 10
            }
        }
    }
}

Returns
A dict(type=, content=). Will be forwarded to from_json when reconstructing the feature.

to_json_content

Exports the FeatureConnector metadata (to override).

This function should be overridden by the subclass to allow re-importing the feature connector from the config. See existing FeatureConnector subclasses for examples of implementation.

Returns
Dict containing the FeatureConnector metadata. Will be forwarded to from_json_content when reconstructing the feature.
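How to_json wraps the subclass-specific to_json_content, including nested sub-features, can be sketched as follows. The classes are hypothetical stand-ins, the `type` field uses the bare class name rather than the full module.FeatureName, and the content keys are assumptions rather than the actual TFDS schema.

```python
import json


class Feature:
    """Hypothetical base class; not the TFDS FeatureConnector."""

    def to_json(self):
        # Wrap the subclass-specific content in the common envelope.
        return {"type": type(self).__name__, "content": self.to_json_content()}

    def to_json_content(self):
        raise NotImplementedError


class ClassLabel(Feature):
    def __init__(self, num_classes):
        self.num_classes = num_classes

    def to_json_content(self):
        return {"num_classes": self.num_classes}


class Sequence(Feature):
    def __init__(self, feature, length=None):
        self.feature = feature
        self.length = length

    def to_json_content(self):
        # Nested sub-features are serialized recursively via to_json.
        return {"feature": self.feature.to_json(), "length": self.length}


serialized = Sequence(ClassLabel(num_classes=4), length=10).to_json()
text = json.dumps(serialized)  # the result contains only JSON-compatible values
```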

__getitem__

Convenience method to access the underlying features.
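One way such a convenience accessor could work is simple delegation to the wrapped feature. This is a sketch under the assumption that the inner feature is dict-like; the class below is a hypothetical stand-in, and strings take the place of real feature objects.

```python
class Sequence:
    """Hypothetical stand-in showing __getitem__ delegation."""

    def __init__(self, feature, length=None):
        self.feature = feature
        self.length = length

    def __getitem__(self, key):
        # Forward the lookup to the wrapped (dict-like) feature.
        return self.feature[key]


seq = Sequence({"frame": "Image(64, 64, 3)", "action": "ClassLabel(4)"})
action_feature = seq["action"]
```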