Module: tfds

tensorflow_datasets (tfds) defines a collection of datasets that are ready to use with TensorFlow.

Each dataset is defined as a tfds.core.DatasetBuilder, which encapsulates the logic to download the dataset and construct an input pipeline, and which also holds the dataset documentation (version, splits, number of examples, etc.).

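The main library entrypoints are:

tfds.builder: fetches a tfds.core.DatasetBuilder by string name.

tfds.load: loads the named dataset into a tf.data.Dataset.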

core module: API to define datasets.

decode module: Decoder public API.

download module: tfds.download.DownloadManager API.

features module: tfds.features.FeatureConnector API defining feature types.

file_adapter module: tfds.file_adapter.FileFormatAdapters for GeneratorBasedBuilder.

testing module: Testing utilities.

units module: Defines convenience constants/functions for converting various units.


class GenerateMode: Enum for how to treat pre-existing downloads and data.

class Split: Enum for dataset splits.

class percent: Syntactic sugar for defining slice subsplits: tfds.percent[75:-5].
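
The slice notation in tfds.percent[75:-5] can be read like an ordinary Python slice over the range 0 to 100. The following is a plain-Python sketch of that reading, not the library's implementation: the helper name percent_slice_to_indices is hypothetical, and rounding boundaries down is an assumption.

```python
def percent_slice_to_indices(percent_slice, num_examples):
    """Map a percent slice such as slice(75, -5) to (start, stop) example indices.

    Assumption: boundaries are rounded down; the library may round differently.
    """
    if percent_slice.step not in (None, 1):
        raise ValueError("only contiguous percent ranges are supported in this sketch")
    # slice.indices(100) resolves missing and negative bounds against 100 percent.
    start_pct, stop_pct, _ = percent_slice.indices(100)
    start = num_examples * start_pct // 100
    stop = num_examples * stop_pct // 100
    return start, max(start, stop)


# [75:-5] covers the range from 75% up to 95% of the examples.
print(percent_slice_to_indices(slice(75, -5), 60000))  # (45000, 57000)
```

Under this reading, a negative bound counts back from 100%, so -5 is equivalent to 95.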


as_numpy(...): Converts a tf.data.Dataset to an iterable of NumPy arrays.

builder(...): Fetches a tfds.core.DatasetBuilder by string name.

disable_progress_bar(...): Disables the tqdm progress bar.

is_dataset_on_gcs(...): Returns whether the dataset is available on the GCS bucket gs://tfds-data/datasets.

list_builders(...): Returns the string names of all tfds.core.DatasetBuilders.

load(...): Loads the named dataset into a tf.data.Dataset.
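
The functions listed above can be combined roughly as follows. This is a minimal sketch rather than a canonical recipe: the helper name first_examples and its defaults (the "mnist" dataset, the "train" split) are illustrative assumptions, and the builder methods download_and_prepare and as_dataset are not described in this overview. Calling the helper downloads data on first use.

```python
def first_examples(name="mnist", split="train", count=3):
    """Return the first `count` examples of a dataset as NumPy structures."""
    # Imported lazily so that defining this helper does not require the package.
    import tensorflow_datasets as tfds

    tfds.disable_progress_bar()  # keep logs quiet in scripts and notebooks

    if name not in tfds.list_builders():
        raise ValueError(f"unknown dataset: {name}")

    # Explicit route: fetch the builder, prepare the data, build the pipeline.
    # tfds.load(name, split=split) is the one-call alternative.
    builder = tfds.builder(name)
    builder.download_and_prepare()
    dataset = builder.as_dataset(split=split)

    # as_numpy turns the tf.data.Dataset into an iterable of NumPy structures.
    return list(tfds.as_numpy(dataset.take(count)))
```

The explicit builder route is useful when the builder's metadata is needed before any data is read; tfds.load is the shorter choice when only the tf.data.Dataset is wanted.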