Module: tfds

tensorflow_datasets (tfds) defines a collection of datasets that are ready to use with TensorFlow.

Each dataset is defined as a tfds.core.DatasetBuilder, which encapsulates the logic to download the dataset and construct an input pipeline, and also holds the dataset documentation (version, splits, number of examples, etc.).
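
The builder workflow described above (look up, download, construct the pipeline, read the documentation) might be sketched as follows. The helper name prepare_dataset is hypothetical and not part of tfds; the calls it makes (tfds.builder, download_and_prepare, as_dataset, info) are the builder API as commonly used, and the import is done lazily only so the sketch can be loaded without TensorFlow installed.

```python
def prepare_dataset(name, split="train", data_dir=None):
    """Return (dataset, info) for `name` using the explicit builder workflow.

    `prepare_dataset` is a hypothetical helper, not a tfds function.
    """
    # Lazy import: lets this sketch be loaded without TensorFlow installed.
    import tensorflow_datasets as tfds

    # Look up the DatasetBuilder registered under this string name.
    builder = tfds.builder(name, data_dir=data_dir)

    # Download the source data and write it to disk in the prepared format.
    builder.download_and_prepare()

    # Construct the tf.data.Dataset input pipeline for the requested split.
    dataset = builder.as_dataset(split=split)

    # builder.info holds the documentation: version, splits, example counts.
    return dataset, builder.info
```

A call such as prepare_dataset("mnist") would download the data on first use, so it needs network access and local disk space.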

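The main library entrypoints are:

  • tfds.builder: fetches a tfds.core.DatasetBuilder by string name.
  • tfds.load: loads the named dataset into a tf.data.Dataset.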

Modules

core module: API to define datasets.

decode module: Decoder public API.

download module: tfds.download.DownloadManager API.

features module: tfds.features.FeatureConnector API defining feature types.

file_adapter module: tfds.file_adapter.FileFormatAdapters for GeneratorBasedBuilder.

testing module: Testing utilities.

units module: Defines convenience constants/functions for converting various units.


Classes

class GenerateMode: Enum for how to treat pre-existing downloads and data.

class ReadConfig: Configures input reading pipeline.

class Split: Enum for dataset splits.

class percent: Syntactic sugar for defining slice subsplits: tfds.percent[75:-5].
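
To make the slice notation concrete, the sketch below works out which example indices a percent slice such as [75:-5] would cover. It is plain Python and does not call tfds; the function name percent_slice_to_indices is hypothetical, and the truncating rounding is an assumption that may differ from how the library assigns examples to a subsplit.

```python
def percent_slice_to_indices(num_examples, start_pct=None, stop_pct=None):
    """Map a percent slice such as [75:-5] to absolute [start, stop) indices.

    Hypothetical helper for illustration; not part of tfds.
    """
    def resolve(pct, default):
        if pct is None:
            return default
        if pct < 0:
            # Negative bounds count back from 100%, as in Python slicing.
            pct += 100
        pct = max(0, min(100, pct))
        # Assumption: round down to a whole example.
        return num_examples * pct // 100

    return resolve(start_pct, 0), resolve(stop_pct, num_examples)


# [75:-5] spans 75% up to 95% of the split.
start, stop = percent_slice_to_indices(1000, 75, -5)
```

Under these assumptions, a 1000-example split sliced with [75:-5] covers indices 750 up to, but not including, 950.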


Functions

as_numpy(...): Converts a tf.data.Dataset to an iterable of NumPy arrays.

builder(...): Fetches a tfds.core.DatasetBuilder by string name.

disable_progress_bar(...): Disables the tqdm progress bar.

is_dataset_on_gcs(...): Returns whether the dataset is available on the GCS bucket gs://tfds-data/datasets.

list_builders(...): Returns the string names of all tfds.core.DatasetBuilders.

load(...): Loads the named dataset into a tf.data.Dataset.

show_examples(...): Visualize images (and labels) from an image classification dataset.
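
As a rough illustration of how load(...) and as_numpy(...) from the list above fit together, the sketch below loads one split and collects the first few examples as NumPy data. The helper name load_as_numpy and its limit parameter are hypothetical; the lazy import is only there so the sketch can be loaded without TensorFlow installed.

```python
def load_as_numpy(name, split="train", limit=3):
    """Return the first `limit` examples of a split as NumPy data.

    `load_as_numpy` is a hypothetical helper, not a tfds function.
    """
    # Lazy import: lets this sketch be loaded without TensorFlow installed.
    import tensorflow_datasets as tfds

    # With an explicit split, tfds.load returns a single tf.data.Dataset.
    dataset = tfds.load(name, split=split)

    # Keep only the first `limit` examples, then convert them to NumPy.
    examples = []
    for example in tfds.as_numpy(dataset.take(limit)):
        examples.append(example)
    return examples
```

Calling load_as_numpy("mnist") would trigger a download the first time, so it assumes network access.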

Other Members

  • __version__ = '2.1.0'