Module: tfds

tensorflow_datasets (tfds) defines a collection of datasets ready-to-use with TensorFlow.

Each dataset is defined as a tfds.core.DatasetBuilder, which encapsulates the logic to download the dataset and construct an input pipeline, and which also holds the dataset documentation (version, splits, number of examples, etc.).
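
The builder lifecycle described above (look up a builder, download and prepare the data, then build an input pipeline) might look like the following minimal sketch. The dataset name `mnist` is an assumption chosen for illustration, and `summarize_builder` and `main` are hypothetical helpers, not part of tfds.

```python
from typing import Any, Dict


def summarize_builder(builder: Any) -> Dict[str, Any]:
    """Collect the documentation a DatasetBuilder exposes through `.info`."""
    info = builder.info
    return {
        "name": info.name,
        "version": str(info.version),
        # Map each split name to its number of examples.
        "splits": {name: split.num_examples for name, split in info.splits.items()},
    }


def main() -> None:
    # Imported lazily so the helper above stays usable without TensorFlow installed.
    import tensorflow_datasets as tfds

    builder = tfds.builder("mnist")          # assumption: "mnist" is a registered name
    builder.download_and_prepare()           # downloads the data and writes it to disk
    ds = builder.as_dataset(split="train")   # input pipeline as a tf.data.Dataset
    print(summarize_builder(builder))
    print(ds.element_spec)
```

Calling `main()` requires network access on first use, since it downloads the dataset. `summarize_builder` only reads `builder.info`, so it also works on a builder whose data has not been prepared yet, provided the dataset metadata is available.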

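The main library entrypoints are:

tfds.builder: Fetches a tfds.core.DatasetBuilder by string name.

tfds.load: Loads the named dataset into a tf.data.Dataset.
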

core module: API to define datasets.

decode module: Decoder public API.

deprecated module: Deprecated symbols.

download module: tfds.download.DownloadManager API.

features module: API defining dataset features (image, text, scalar, ...).

folder_dataset module: Custom Datasets APIs.

testing module: Testing utilities.

units module: Defines convenience constants/functions for converting various units.

visualization module: Visualizer utils.


class GenerateMode: Enum for how to treat pre-existing downloads and data.

class ImageFolder: Generic image classification dataset created from a manual directory.

class ReadConfig: Configures input reading pipeline.

class Split: Enum for dataset splits.

class TranslateFolder: Generic text translation dataset created from a manual directory.


as_dataframe(...): Converts the dataset into a pandas DataFrame.

as_numpy(...): Converts a tf.data.Dataset to an iterable of NumPy arrays.

builder(...): Fetches a tfds.core.DatasetBuilder by string name.

builder_cls(...): Fetches a tfds.core.DatasetBuilder class by string name.

disable_progress_bar(...): Disables the tqdm progress bar.

even_splits(...): Generates a list of sub-splits of the same size.

is_dataset_on_gcs(...): Checks whether the dataset is available on the GCS bucket gs://tfds-data/datasets.

list_builders(...): Returns the string names of all tfds.core.DatasetBuilders.

load(...): Loads the named dataset into a tf.data.Dataset.

show_examples(...): Visualize images (and labels) from an image classification dataset.

show_statistics(...): Displays the dataset's statistics in a Colab/Jupyter notebook.
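
Several of the functions listed above are typically combined. The following is a minimal sketch, assuming the `mnist` dataset (not named in this page) and assuming its examples are dictionaries with a `label` feature. `first_labels` and `main` are hypothetical helpers introduced for illustration.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List


def first_labels(examples: Iterable[Dict[str, Any]], n: int) -> List[int]:
    """Return the `label` feature of the first `n` examples as plain ints."""
    return [int(example["label"]) for example in islice(examples, n)]


def main() -> None:
    # Imported lazily so the helper above stays usable without TensorFlow installed.
    import tensorflow_datasets as tfds

    tfds.disable_progress_bar()                     # silence tqdm output
    print(len(tfds.list_builders()), "builders registered")

    ds = tfds.load("mnist", split="train")          # assumption: "mnist" is available
    # as_numpy turns the tf.data.Dataset into an iterable of NumPy structures.
    print(first_labels(tfds.as_numpy(ds), 5))
```

Calling `main()` downloads the dataset on first use. Because `first_labels` accepts any iterable of feature dictionaries, it can be exercised on small in-memory data without loading a real dataset.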

version '4.0.1'