tfds.features.Image

FeatureConnector for images.

Inherits From: FeatureConnector

tfds.features.Image(
    *,
    shape: Optional[utils.Shape] = None,
    dtype: Optional[type_utils.TfdsDType] = None,
    encoding_format: Optional[str] = None,
    use_colormap: bool = False,
    doc: feature_lib.DocArg = None
)

During _generate_examples, the feature connector accepts any of the following as input:

str: path to a {bmp,gif,jpeg,png} image (ex: /path/to/img.png).
np.array: 3d np.uint8 array representing an image.
A file object containing the png or jpeg encoded image string (ex: io.BytesIO(encoded_img_bytes))

Output
`tf.Tensor` of type `tf.uint8` and shape `[height, width, num_channels]` for BMP, JPEG, and PNG images and shape `[num_frames, height, width, 3]` for GIF images.

Example

In the tfds.core.DatasetInfo object:

features=features.FeaturesDict({
    'input': features.Image(),
    'target': features.Image(shape=(None, None, 1), encoding_format='png'),
})

During generation:

yield {
    'input': 'path/to/img.jpg',
    'target': np.ones(shape=(64, 64, 1), dtype=np.uint8),
}

Args
`shape`	tuple of ints or None, the shape of the decoded image. For GIF images: (num_frames, height, width, channels=3). num_frames, height and width can be None. For other images: (height, width, channels). height and width can be None. See `tf.image.encode_*` for documentation on the channels parameter. Defaults to (None, None, 3).
`dtype`	`np.uint8` (default), `np.uint16` or `np.float32`. `np.uint16` requires the png `encoding_format`. `np.float32` only supports single-channel images; internally, float images are bitcast to 4-channel `np.uint8` and saved as PNG.
`encoding_format`	'jpeg' or 'png'. Format used to serialize `np.ndarray` images on disk. If None, images are encoded as PNG. If the image is loaded from a {bmp,gif,jpeg,png} file, this parameter is ignored and the file's original encoding is used.
`use_colormap`	Only used for gray-scale images. If `True`, `tfds.as_dataframe` will display each value in the image with a different color.
`doc`	Documentation of this feature (e.g. description).
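
The `np.float32` note in the `dtype` row can be illustrated with a small sketch. It uses only the standard `struct` module to show how one 32-bit float can be reinterpreted as four 8-bit values and recovered without loss. The helper names and the little-endian byte order are assumptions made for this example; this is not the TFDS implementation.

```python
import struct
from typing import Sequence, Tuple


def float_to_uint8_channels(value: float) -> Tuple[int, ...]:
    """Reinterpret one float32 as four uint8 values (little-endian, assumed)."""
    return tuple(struct.pack("<f", value))


def uint8_channels_to_float(channels: Sequence[int]) -> float:
    """Inverse operation: rebuild the float32 from its four bytes."""
    return struct.unpack("<f", bytes(channels))[0]


pixel = 1.5  # exactly representable in float32
channels = float_to_uint8_channels(pixel)
restored = uint8_channels_to_float(channels)

print(channels)   # (0, 0, 192, 63)
print(restored)   # 1.5
```

Because each float pixel occupies four bytes, a single-channel float image maps naturally onto a 4-channel 8-bit image, which a lossless format such as PNG can store.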

Raises
`ValueError`	If the shape is invalid
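
To make the role of `None` in `shape` concrete, here is a minimal sketch of how a shape specification with unknown dimensions could be compared against a concrete image shape. `shape_matches` is a hypothetical helper written for this example and is not part of TFDS; the actual validation that raises `ValueError` may apply different rules.

```python
from typing import Optional, Sequence


def shape_matches(spec: Sequence[Optional[int]], actual: Sequence[int]) -> bool:
    """Return True if `actual` is compatible with `spec`; None matches any size."""
    if len(spec) != len(actual):
        return False
    return all(s is None or s == a for s, a in zip(spec, actual))


default_spec = (None, None, 3)          # documented default for non-GIF images
gif_spec = (None, None, None, 3)        # (num_frames, height, width, channels)

print(shape_matches(default_spec, (64, 64, 3)))     # True
print(shape_matches(default_spec, (64, 64, 1)))     # False: channel mismatch
print(shape_matches(gif_spec, (10, 32, 32, 3)))     # True
```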

Attributes
`doc`
`dtype`	Return the dtype (or dict of dtype) of this FeatureConnector.
`encoding_format`
`np_dtype`
`numpy_dtype`
`shape`	Return the shape (or dict of shape) of this FeatureConnector.
`tf_dtype`
`use_colormap`

Methods

`catalog_documentation`

catalog_documentation() -> List[CatalogFeatureDocumentation]

Returns the feature documentation to be shown in the catalog.

`cls_from_name`

@classmethod
cls_from_name(
    python_class_name: str
) -> Type['FeatureConnector']

Returns the feature class for the given Python class.

`decode_batch_example`

decode_batch_example(
    tfexample_data
)

Decode multiple features batched in a single tf.Tensor.

This function is used to decode features wrapped in tfds.features.Sequence(). By default, this function applies decode_example to each individual element using tf.map_fn. However, for optimization, features can override this method to apply a custom batch decoding.

Args
`tfexample_data`	Same `tf.Tensor` inputs as `decode_example`, but with an additional first dimension for the sequence length.

Returns
`tensor_data`	Tensor or dictionary of tensors, output of the tf.data.Dataset object
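
The relationship between the default element-wise decoding and a custom batch override can be sketched in plain Python. The classes below are hypothetical stand-ins, not TFDS classes, and a list comprehension takes the place of `tf.map_fn`; the point is only the override pattern.

```python
from typing import List


class ToyFeature:
    """Plain-Python stand-in for a feature connector (hypothetical, not TFDS)."""

    def decode_example(self, encoded: bytes) -> str:
        # Decode a single serialized element.
        return encoded.decode("utf-8")

    def decode_batch_example(self, batch: List[bytes]) -> List[str]:
        # Default behavior: decode element by element.
        return [self.decode_example(item) for item in batch]


class BatchedToyFeature(ToyFeature):
    """Overrides the batch method to decode the whole sequence in one pass."""

    def decode_batch_example(self, batch: List[bytes]) -> List[str]:
        if not batch:
            return []
        # Assumes no element contains a newline, so one decode call suffices.
        return b"\n".join(batch).decode("utf-8").split("\n")


sequence = [b"cat", b"dog", b"bird"]
default_result = ToyFeature().decode_batch_example(sequence)
custom_result = BatchedToyFeature().decode_batch_example(sequence)
print(default_result == custom_result)  # True
```

Both versions return the same values; the override only changes how the work is done, which is the contract a custom batch decoder is expected to keep.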

`decode_example`

decode_example(
    example
)

Reconstruct the image with TensorFlow from the tf example.

`decode_example_np`

decode_example_np(
    example: bytes
) -> np.ndarray

Reconstruct the image with OpenCV from bytes, or default to PIL.

`decode_example_np_with_opencv`

decode_example_np_with_opencv(
    example: bytes, num_channels: int
) -> np.ndarray

Reconstruct the image with OpenCV from bytes.

`decode_example_np_with_pil`

decode_example_np_with_pil(
    example: bytes, num_channels: int
) -> np.ndarray

`decode_ragged_example`

decode_ragged_example(
    tfexample_data
)

Decode nested features from a tf.RaggedTensor.

This function is used to decode features wrapped in nested tfds.features.Sequence(). By default, this function applies decode_batch_example to the flat values of the ragged tensor. For optimization, features can override this method to apply a custom batch decoding.

Args
`tfexample_data`	`tf.RaggedTensor` inputs containing the nested encoded examples.

Returns
`tensor_data`	The decoded `tf.RaggedTensor` or dictionary of tensors, output of the tf.data.Dataset object
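
The "decode the flat values, then restore the nesting" idea can be sketched without TensorFlow. In this illustration a list of variable-length lists plays the role of the `tf.RaggedTensor`, and `decode_ragged` is a hypothetical helper written for this example, not a TFDS function.

```python
from typing import Callable, List


def decode_ragged(
    rows: List[List[bytes]],
    decode_batch: Callable[[List[bytes]], List[str]],
) -> List[List[str]]:
    """Decode all values in one batch call, then rebuild the row structure."""
    # Flatten the nested rows into a single batch.
    flat_values = [value for row in rows for value in row]
    decoded = decode_batch(flat_values)

    # Re-split the decoded values using the original row lengths.
    result, start = [], 0
    for row in rows:
        result.append(decoded[start:start + len(row)])
        start += len(row)
    return result


rows = [[b"a", b"b"], [], [b"c"]]
result = decode_ragged(rows, lambda flat: [v.decode("utf-8") for v in flat])
print(result)  # [['a', 'b'], [], ['c']]
```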

`encode_example`

encode_example(
    image_or_path_or_fobj
)

Convert the given image into a dict convertible to tf example.

`from_config`

@classmethod
from_config(
    root_dir: str
) -> FeatureConnector

Reconstructs the FeatureConnector from the config file.

Usage:

features = FeatureConnector.from_config('path/to/dir')

Args
`root_dir`	Directory containing the features.json file.

Returns
The reconstructed feature instance.

`from_json`

@classmethod
from_json(
    value: Json
) -> FeatureConnector

FeatureConnector factory.

This function should be called from the tfds.features.FeatureConnector base class. Subclasses should implement from_json_content.

Example:

feature = tfds.features.FeatureConnector.from_json(
    {'type': 'Image', 'content': {'shape': [32, 32, 3], 'dtype': 'uint8'} }
)
assert isinstance(feature, tfds.features.Image)

Args
`value`	`dict(type=, content=)` containing the feature to restore. Match dict returned by `to_json`.

Returns
The reconstructed FeatureConnector.

`from_json_content`

@classmethod
from_json_content(
    value: Union[Json, feature_pb2.ImageFeature]
) -> 'Image'

FeatureConnector factory (to overwrite).

Subclasses should override this method. This method is used when importing the feature connector from the config.

This function should not be called directly. FeatureConnector.from_json should be called instead.

See existing FeatureConnectors for implementation examples.

Args
`value`	FeatureConnector information represented as either Json or a Feature proto. The content must match what is returned by `to_json_content`.
`doc`	Documentation of this feature (e.g. description).

Returns
The reconstructed FeatureConnector.

`from_proto`

@classmethod
from_proto(
    feature_proto: feature_pb2.Feature
) -> T

Instantiates a feature from its proto representation.

`get_serialized_info`

get_serialized_info()

`get_tensor_info`

get_tensor_info()

`get_tensor_spec`

get_tensor_spec() -> TreeDict[tf.TensorSpec]

Returns the tf.TensorSpec of this feature (not the element spec!).

Note that the output of this method may not correspond to the element spec of the dataset. For example, currently this method does not support RaggedTensorSpec.

`load_metadata`

load_metadata(
    data_dir: epath.PathLike, feature_name: Optional[str]
)

Restore the feature metadata from disk.

If a dataset is re-loaded and generated files exist on disk, this function will restore the feature metadata from the saved file.

Args
`data_dir`	path to the dataset folder in which the info was saved (ex: `~/datasets/cifar10/1.2.0/`)
`feature_name`	the name of the feature (from the FeaturesDict key)

`repr_html`

repr_html(
    ex: np.ndarray
) -> str

Images are displayed as thumbnail.

`repr_html_batch`

repr_html_batch(
    ex: np.ndarray
) -> str

Sequence(Image()) are displayed as <video>.

`repr_html_ragged`

repr_html_ragged(
    ex: np.ndarray
) -> str

Returns the HTML str representation of the object (Nested sequence).

`save_config`

save_config(
    root_dir: str
) -> None

Exports the FeatureConnector to a file.

Args
`root_dir`	`path/to/dir` containing the `features.json`

`save_metadata`

save_metadata(
    data_dir: epath.PathLike, feature_name: Optional[str]
) -> None

Save the feature metadata on disk.

This function is called after the data has been generated (by _download_and_prepare) to save the feature connector info with the generated dataset.

Some datasets/features dynamically compute info during _download_and_prepare. For instance:

Labels are loaded from the downloaded data
Vocabulary is created from the downloaded data
ImageLabelFolder computes the image dtypes/shape from the manual_dir

After the info has been added to the feature, this function allows saving that additional info so that it can be restored the next time the data is loaded.

By default, this function does not save anything, but subclasses can override it.

Args
`data_dir`	path to the dataset folder in which to save the info (ex: `~/datasets/cifar10/1.2.0/`)
`feature_name`	the name of the feature (from the FeaturesDict key)
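
The save/restore cycle described above can be sketched with a toy class that persists dynamically computed label names. Everything here is hypothetical: `ToyLabelFeature` is not a TFDS class, and the `<feature_name>.labels.txt` file name is an assumption chosen for the example, not the layout TFDS uses.

```python
import os
import tempfile
from typing import List, Optional


class ToyLabelFeature:
    """Toy feature whose label names are only known after data generation."""

    def __init__(self, names: Optional[List[str]] = None):
        self.names = list(names or [])

    def _metadata_path(self, data_dir: str, feature_name: Optional[str]) -> str:
        # Hypothetical naming scheme for the metadata file.
        return os.path.join(data_dir, f"{feature_name or 'feature'}.labels.txt")

    def save_metadata(self, data_dir: str, feature_name: Optional[str]) -> None:
        # Persist the computed info next to the generated dataset.
        with open(self._metadata_path(data_dir, feature_name), "w", encoding="utf-8") as f:
            f.write("\n".join(self.names))

    def load_metadata(self, data_dir: str, feature_name: Optional[str]) -> None:
        # Restore the info only if a previous run saved it.
        path = self._metadata_path(data_dir, feature_name)
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                self.names = f.read().splitlines()


with tempfile.TemporaryDirectory() as data_dir:
    ToyLabelFeature(["cat", "dog"]).save_metadata(data_dir, "label")

    restored = ToyLabelFeature()          # fresh instance, no names yet
    restored.load_metadata(data_dir, "label")
    restored_names = restored.names

print(restored_names)  # ['cat', 'dog']
```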

`to_json`

to_json() -> Json

Exports the FeatureConnector to Json.

Each feature is serialized as a dict(type=..., content=...).

type: The canonical name of the feature (module.FeatureName).
content: is specific to each feature connector and defined in to_json_content. Can contain nested sub-features (like for tfds.features.FeaturesDict and tfds.features.Sequence).

For example:

tfds.features.FeaturesDict({
    'input': tfds.features.Image(),
    'target': tfds.features.ClassLabel(num_classes=10),
})

Is serialized as:

{
    "type": "tensorflow_datasets.core.features.features_dict.FeaturesDict",
    "content": {
        "input": {
            "type": "tensorflow_datasets.core.features.image_feature.Image",
            "content": {
                "shape": [null, null, 3],
                "dtype": "uint8",
                "encoding_format": "png"
            }
        },
        "target": {
            "type":
            "tensorflow_datasets.core.features.class_label_feature.ClassLabel",
            "content": {
              "num_classes": 10
            }
        }
    }
}

Returns
A `dict(type=, content=)`. Will be forwarded to `from_json` when reconstructing the feature.
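
As a rough illustration of the `dict(type=..., content=...)` layout, the sketch below parses the serialized example shown above with the standard `json` module and lists each nested feature with its class name. `feature_types` is a hypothetical helper written for this example; it assumes sub-features appear directly under `content`, as in that example, which may not hold for every TFDS version.

```python
import json

serialized = """
{
    "type": "tensorflow_datasets.core.features.features_dict.FeaturesDict",
    "content": {
        "input": {
            "type": "tensorflow_datasets.core.features.image_feature.Image",
            "content": {"shape": [null, null, 3], "dtype": "uint8",
                        "encoding_format": "png"}
        },
        "target": {
            "type": "tensorflow_datasets.core.features.class_label_feature.ClassLabel",
            "content": {"num_classes": 10}
        }
    }
}
"""


def feature_types(node, name="root"):
    """Yield (feature name, short class name) for a serialized feature tree."""
    yield name, node["type"].rsplit(".", 1)[-1]
    for key, child in node.get("content", {}).items():
        # A nested feature is itself a dict with both 'type' and 'content'.
        if isinstance(child, dict) and "type" in child and "content" in child:
            yield from feature_types(child, key)


types = dict(feature_types(json.loads(serialized)))
print(types)  # {'root': 'FeaturesDict', 'input': 'Image', 'target': 'ClassLabel'}
```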

`to_json_content`

to_json_content() -> feature_pb2.ImageFeature

FeatureConnector factory (to overwrite).

This function should be overridden by the subclass to allow re-importing the feature connector from the config. See existing FeatureConnectors for implementation examples.

Returns
The FeatureConnector metadata in either a dict, or a Feature proto. This output is used in `from_json_content` when reconstructing the feature.

`to_proto`

to_proto() -> feature_pb2.Feature

Exports the FeatureConnector to the Feature proto.

For features that have a specific schema defined in a proto, this function needs to be overridden. If there's no specific proto schema, then the feature will be represented using JSON.

Returns
The feature proto describing this feature.

Class Variables
ALIASES	`[]`
