FeatureConnector for images.
Inherits From: FeatureConnector
tfds.features.Image(
*, shape=None, dtype=None, encoding_format=None
)
During _generate_examples, the feature connector accepts as input any of:
- str: path to a {bmp,gif,jpeg,png} image (ex: /path/to/img.png).
- np.array: 3d np.uint8 array representing an image.
- A file object containing the png or jpeg encoded image string (ex: io.BytesIO(encoded_img_bytes)).
Output: tf.Tensor of type tf.uint8 and shape [height, width, num_channels] for BMP, JPEG, and PNG images, and shape [num_frames, height, width, 3] for GIF images.
Example:
- In the tfds.core.DatasetInfo object:
features=features.FeaturesDict({
    'input': features.Image(),
    'target': features.Image(shape=(None, None, 1),
                             encoding_format='png'),
})
- During generation:
yield {
    'input': 'path/to/img.jpg',
    'target': np.ones(shape=(64, 64, 1), dtype=np.uint8),
}
Args | |
---|---|
shape | tuple of ints or None, the shape of the decoded image. For GIF images: (num_frames, height, width, channels=3); num_frames, height and width can be None. For other images: (height, width, channels); height and width can be None. See tf.image.encode_* for doc on the channels parameter. Defaults to (None, None, 3). |
dtype | tf.uint16 or tf.uint8 (default). tf.uint16 can be used only with the png encoding_format. |
encoding_format | 'jpeg' or 'png'. Format used to serialize np.ndarray images on disk. If None, images are encoded as PNG. If the image is loaded from a {bmp,gif,jpeg,png} file, this parameter is ignored and the file's original encoding is used. |
Raises | |
---|---|
ValueError | If the shape is invalid |
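The shape rules listed in the Args table can be sketched as a small standalone validator. This is only an illustration of the documented constraints, not the library's own check: `check_image_shape` and its `is_gif` flag are hypothetical names, and the real validation may accept or reject other cases.

```python
from typing import Optional, Tuple

Shape = Tuple[Optional[int], ...]


def check_image_shape(shape: Shape, is_gif: bool = False) -> Shape:
    """Hypothetical helper mirroring only the documented shape rules."""
    shape = tuple(shape)
    # GIF shapes carry a leading num_frames axis, other formats do not.
    expected_rank = 4 if is_gif else 3
    if len(shape) != expected_rank:
        raise ValueError(
            f"Expected {expected_rank} dimensions, got {len(shape)}: {shape}"
        )
    channels = shape[-1]
    if channels is None:
        # Only num_frames, height and width are documented as optional.
        raise ValueError(f"The channels dimension must be known: {shape}")
    if is_gif and channels != 3:
        raise ValueError(f"GIF shapes are documented with 3 channels: {shape}")
    return shape


print(check_image_shape((None, None, 3)))
print(check_image_shape((None, 64, 64, 3), is_gif=True))
```

A shape such as `(64, 64)` raises `ValueError` here because the channels axis is missing.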
Attributes | |
---|---|
dtype | Return the dtype (or dict of dtype) of this FeatureConnector. |
shape | Return the shape (or dict of shape) of this FeatureConnector. |
Methods
decode_batch_example
decode_batch_example(
tfexample_data
)
Decode multiple features batched in a single tf.Tensor.
This function is used to decode features wrapped in tfds.features.Sequence().
By default, this function applies decode_example to each individual element using tf.map_fn. However, for optimization, features can override this method to apply a custom batch decoding.
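The default-versus-override behavior can be pictured with a pure-Python analogy in which lists stand in for tensors and a list comprehension stands in for tf.map_fn. The classes `ToyFeature` and `ToyBatchedFeature` are hypothetical and exist only for this sketch.

```python
from typing import List


class ToyFeature:
    """Hypothetical stand-in for a feature connector; lists replace tensors."""

    def decode_example(self, encoded: bytes) -> str:
        # Decode a single encoded element.
        return encoded.decode("utf-8")

    def decode_batch_example(self, encoded_batch: List[bytes]) -> List[str]:
        # Default: apply decode_example to each element in turn.
        return [self.decode_example(e) for e in encoded_batch]


class ToyBatchedFeature(ToyFeature):
    """Overrides the batch method to decode the whole batch in one pass."""

    def decode_batch_example(self, encoded_batch: List[bytes]) -> List[str]:
        if not encoded_batch:
            return []
        # Join, decode once, then split again. This assumes that no element
        # contains the NUL separator byte.
        return b"\x00".join(encoded_batch).decode("utf-8").split("\x00")


batch = [b"cat", b"dog"]
print(ToyFeature().decode_batch_example(batch))
print(ToyBatchedFeature().decode_batch_example(batch))
```

Both classes return the same result; the override only changes how the work is done.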
Args | |
---|---|
tfexample_data | Same tf.Tensor inputs as decode_example, but with an additional first dimension for the sequence length. |
Returns | |
---|---|
tensor_data | Tensor or dictionary of tensors, output of the tf.data.Dataset object |
decode_example
decode_example(
example
)
Reconstruct the image from the tf example.
decode_ragged_example
decode_ragged_example(
tfexample_data
)
Decode nested features from a tf.RaggedTensor.
This function is used to decode features wrapped in nested tfds.features.Sequence().
By default, this function applies decode_batch_example to the flat values of the ragged tensor. For optimization, features can override this method to apply a custom batch decoding.
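The idea of decoding the flat values once and then restoring the nesting can be sketched without TensorFlow. Here a ragged value with a single ragged dimension is represented as a flat list plus a list of row lengths; `decode_ragged` and `decode_batch` are hypothetical names used only for this analogy.

```python
from typing import Callable, List, Sequence


def decode_batch(values: List[bytes]) -> List[str]:
    # Stand-in for a feature's batch decoding.
    return [v.decode("utf-8") for v in values]


def decode_ragged(
    flat_values: Sequence[bytes],
    row_lengths: Sequence[int],
    batch_decoder: Callable[[List[bytes]], List[str]],
) -> List[List[str]]:
    """Toy analogy: decode the flat values once, then rebuild the rows."""
    decoded = batch_decoder(list(flat_values))
    rows, start = [], 0
    for length in row_lengths:
        rows.append(decoded[start:start + length])
        start += length
    return rows


nested = decode_ragged([b"a", b"b", b"c"], [2, 0, 1], decode_batch)
print(nested)  # [['a', 'b'], [], ['c']]
```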
Args | |
---|---|
tfexample_data | tf.RaggedTensor inputs containing the nested encoded examples. |
Returns | |
---|---|
tensor_data | The decoded tf.RaggedTensor or dictionary of tensors, output of the tf.data.Dataset object |
encode_example
encode_example(
image_or_path_or_fobj
)
Convert the given image into a dict convertible to tf example.
from_config
@classmethod
from_config( root_dir: str ) -> "FeatureConnector"
Reconstructs the FeatureConnector from the config file.
Usage:
features = FeatureConnector.from_config('path/to/features.json')
Args | |
---|---|
root_dir | Directory containing the features.json file. |
Returns | |
---|---|
The reconstructed feature instance. |
from_json
@classmethod
from_json( value:
tfds.typing.Json
) -> "FeatureConnector"
FeatureConnector factory.
This function should be called from the tfds.features.FeatureConnector base class. Subclasses should implement from_json_content.
Example:
feature = tfds.features.FeatureConnector.from_json(
    {'type': 'Image', 'content': {'shape': [32, 32, 3], 'dtype': 'uint8'}}
)
assert isinstance(feature, tfds.features.Image)
Args | |
---|---|
value | dict(type=, content=) containing the feature to restore. Matches the dict returned by to_json. |
Returns | |
---|---|
The reconstructed FeatureConnector. |
from_json_content
@classmethod
from_json_content( value:
tfds.typing.Json
) -> "Image"
FeatureConnector factory (to override).
Subclasses should override this method to allow re-importing the feature connector from the config.
This function should not be called directly. FeatureConnector.from_json should be called instead.
See existing FeatureConnector subclasses for examples of implementation.
Args | |
---|---|
value | FeatureConnector information. Matches the dict returned by to_json_content. |
Returns | |
---|---|
The reconstructed FeatureConnector. |
get_serialized_info
get_serialized_info()
Return the shape/dtype of features after encoding (for the adapter).
The FileAdapter then uses this information to write data on disk.
This function indicates how this feature is encoded on file internally. The DatasetBuilder examples are written to disk as tf.train.Example protos.
Ex:
return {
    'image': tfds.features.TensorInfo(shape=(None,), dtype=tf.uint8),
    'height': tfds.features.TensorInfo(shape=(), dtype=tf.int32),
    'width': tfds.features.TensorInfo(shape=(), dtype=tf.int32),
}
FeatureConnectors which are not containers should return the feature proto directly:
return tfds.features.TensorInfo(shape=(64, 64), dtype=tf.uint8)
If not defined, the returned values are automatically deduced from the get_tensor_info function.
Returns | |
---|---|
features | Either a dict of feature proto objects, or a feature proto object |
get_tensor_info
get_tensor_info()
Return the tf.Tensor dtype/shape of the feature.
This returns the tensor dtype/shape of the tf.data.Dataset object returned by .as_dataset.
Ex:
return {
    'image': tfds.features.TensorInfo(shape=(None,), dtype=tf.uint8),
    'height': tfds.features.TensorInfo(shape=(), dtype=tf.int32),
    'width': tfds.features.TensorInfo(shape=(), dtype=tf.int32),
}
FeatureConnectors which are not containers should return the feature proto directly:
return tfds.features.TensorInfo(shape=(256, 256), dtype=tf.uint8)
Returns | |
---|---|
tensor_info | Either a dict of tfds.features.TensorInfo objects, or a tfds.features.TensorInfo |
load_metadata
load_metadata(
data_dir, feature_name
)
Restore the feature metadata from disk.
If a dataset is re-loaded and generated files exist on disk, this function will restore the feature metadata from the saved file.
Args | |
---|---|
data_dir | str, path to the dataset folder in which the info was saved (ex: ~/datasets/cifar10/1.2.0/) |
feature_name | str, the name of the feature (from the FeaturesDict key) |
repr_html
repr_html(
ex: np.ndarray
) -> str
Images are displayed as thumbnails.
repr_html_batch
repr_html_batch(
ex: np.ndarray
) -> str
Returns the HTML str representation of the object (Sequence).
repr_html_ragged
repr_html_ragged(
ex: np.ndarray
) -> str
Returns the HTML str representation of the object (Nested sequence).
save_config
save_config(
root_dir: str
) -> None
Exports the FeatureConnector to a file.
Args | |
---|---|
root_dir | path/to/dir containing the features.json |
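As a rough picture of what exporting to a features.json inside root_dir could look like, the sketch below writes a dict(type=, content=) to that file and reads it back using only the standard library. It assumes the file simply holds the JSON form of the feature; `save_config_sketch` and `load_config_sketch` are hypothetical helpers, not TFDS functions.

```python
import json
import pathlib
import tempfile


def save_config_sketch(feature_json: dict, root_dir: str) -> pathlib.Path:
    """Hypothetical helper: writes the serialized feature to features.json."""
    path = pathlib.Path(root_dir) / "features.json"
    path.write_text(json.dumps(feature_json, indent=2))
    return path


def load_config_sketch(root_dir: str) -> dict:
    """Hypothetical helper: reads the serialized feature back."""
    return json.loads((pathlib.Path(root_dir) / "features.json").read_text())


feature_json = {
    "type": "Image",
    "content": {"shape": [32, 32, 3], "dtype": "uint8"},
}
with tempfile.TemporaryDirectory() as root_dir:
    written = save_config_sketch(feature_json, root_dir)
    restored = load_config_sketch(root_dir)

print(written.name, restored == feature_json)
```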
save_metadata
save_metadata(
data_dir, feature_name
)
Save the feature metadata on disk.
This function is called after the data has been generated (by _download_and_prepare) to save the feature connector info with the generated dataset.
Some datasets/features dynamically compute info during _download_and_prepare. For instance:
- Labels are loaded from the downloaded data
- Vocabulary is created from the downloaded data
- ImageLabelFolder computes the image dtypes/shape from the manual_dir
After the info has been added to the feature, this function saves that additional info so it can be restored the next time the data is loaded.
By default, this function does not save anything, but subclasses can override it.
Args | |
---|---|
data_dir | str, path to the dataset folder in which to save the info (ex: ~/datasets/cifar10/1.2.0/) |
feature_name | str, the name of the feature (from the FeaturesDict key) |
to_json
to_json() -> tfds.typing.Json
Exports the FeatureConnector to Json.
Each feature is serialized as a dict(type=..., content=...).
- type: The canonical name of the feature (module.FeatureName).
- content: Specific to each feature connector and defined in to_json_content. Can contain nested sub-features (like for tfds.features.FeaturesDict and tfds.features.Sequence).
For example:
tfds.features.FeaturesDict({
    'input': tfds.features.Image(),
    'target': tfds.features.ClassLabel(num_classes=10),
})
Is serialized as:
{
    "type": "tensorflow_datasets.core.features.features_dict.FeaturesDict",
    "content": {
        "input": {
            "type": "tensorflow_datasets.core.features.image_feature.Image",
            "content": {
                "shape": [null, null, 3],
                "dtype": "uint8",
                "encoding_format": "png"
            }
        },
        "target": {
            "type": "tensorflow_datasets.core.features.class_label_feature.ClassLabel",
            "content": {
                "num_classes": 10
            }
        }
    }
}
Returns | |
---|---|
A dict(type=, content=). Will be forwarded to from_json when reconstructing the feature. |
to_json_content
to_json_content() -> tfds.typing.Json
Exports the FeatureConnector metadata to Json (to override).
This function should be overridden by the subclass to allow re-importing the feature connector from the config. See existing FeatureConnector subclasses for examples of implementation.
Returns | |
---|---|
Dict containing the FeatureConnector metadata. Will be forwarded to from_json_content when reconstructing the feature. |
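How to_json, to_json_content, from_json and from_json_content fit together can be sketched with a toy class hierarchy. This is a minimal illustration of the dict(type=, content=) round trip under simplifying assumptions, not the TFDS implementation: `SketchFeature`, `SketchImage` and the `_REGISTRY` lookup are hypothetical, and the toy uses the bare class name as the type rather than a full module path.

```python
import json
from typing import Any, Dict, Type

# Maps a type name to the class able to rebuild it (hypothetical registry).
_REGISTRY: Dict[str, Type["SketchFeature"]] = {}


class SketchFeature:
    """Toy base class: wraps the subclass content with its type name."""

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        _REGISTRY[cls.__name__] = cls

    def to_json(self) -> Dict[str, Any]:
        return {"type": type(self).__name__, "content": self.to_json_content()}

    @classmethod
    def from_json(cls, value: Dict[str, Any]) -> "SketchFeature":
        # Dispatch on "type", then let the subclass rebuild from "content".
        return _REGISTRY[value["type"]].from_json_content(value["content"])

    def to_json_content(self) -> Dict[str, Any]:
        raise NotImplementedError

    @classmethod
    def from_json_content(cls, value: Dict[str, Any]) -> "SketchFeature":
        raise NotImplementedError


class SketchImage(SketchFeature):
    def __init__(self, shape=(None, None, 3), dtype="uint8"):
        self.shape = tuple(shape)
        self.dtype = dtype

    def to_json_content(self) -> Dict[str, Any]:
        return {"shape": list(self.shape), "dtype": self.dtype}

    @classmethod
    def from_json_content(cls, value: Dict[str, Any]) -> "SketchImage":
        return cls(shape=value["shape"], dtype=value["dtype"])


serialized = json.dumps(SketchImage(shape=(32, 32, 3)).to_json())
restored = SketchFeature.from_json(json.loads(serialized))
print(type(restored).__name__, restored.shape, restored.dtype)
```

The base class owns the type/content envelope, so each subclass only has to describe and rebuild its own content.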