tensorflow_datasets (tfds) defines a collection of datasets ready-to-use with TensorFlow.
Each dataset is defined as a tfds.core.DatasetBuilder, which encapsulates the logic to download the dataset and construct an input pipeline, and also contains the dataset documentation (version, splits, number of examples, etc.).
The main library entrypoints are:
- tfds.builder: fetch a tfds.core.DatasetBuilder by name
- tfds.load: convenience method to construct a builder, download the data, and create an input pipeline, returning a tf.data.Dataset.
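The two entrypoints above can be compared side by side. The following is a minimal sketch, not the library's own example: the helper names `load_with_builder` and `load_with_shortcut` are hypothetical, and the dataset name "mnist" and split "train" are assumed only for illustration. The import is placed inside each function so that the helpers can be defined even where the package is not installed.

```python
def load_with_builder(name="mnist", split="train"):
    """Explicit path: fetch the builder, prepare the data, build the pipeline."""
    # Imported lazily so this sketch can be defined without the package installed.
    import tensorflow_datasets as tfds

    builder = tfds.builder(name)      # fetch the tfds.core.DatasetBuilder by name
    builder.download_and_prepare()    # download the data and write it to disk
    return builder.as_dataset(split=split)  # construct the tf.data.Dataset


def load_with_shortcut(name="mnist", split="train"):
    """Convenience path: tfds.load performs the steps above in one call."""
    import tensorflow_datasets as tfds

    return tfds.load(name, split=split)
```

Both helpers are expected to return a tf.data.Dataset for the requested split. The builder path is useful when the builder object itself is needed, for example to read its documentation before loading any data.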
Documentation:
- These API docs
- Available datasets
- Colab tutorial
- Add a dataset
Modules
core module: API to define datasets.
decode module: Decoder public API.
download module: tfds.download.DownloadManager API.
features module: tfds.features.FeatureConnector API defining feature types.
file_adapter module: tfds.file_adapter.FileFormatAdapters for GeneratorBasedBuilder.
testing module: Testing utilities.
units module: Defines convenience constants/functions for converting various units.
Classes
class GenerateMode: Enum for how to treat pre-existing downloads and data.
class Split: Enum for dataset splits.
class percent: Syntactic sugar for defining slice subsplits: tfds.percent[75:-5].
Functions
as_numpy(...): Converts a tf.data.Dataset to an iterable of NumPy arrays.
builder(...): Fetches a tfds.core.DatasetBuilder by string name.
disable_progress_bar(...): Disables the Tqdm progress bar.
is_dataset_on_gcs(...): Checks whether the dataset is available on the GCS bucket gs://tfds-data/datasets.
list_builders(...): Returns the string names of all tfds.core.DatasetBuilders.
load(...): Loads the named dataset into a tf.data.Dataset.
show_examples(...): Visualizes images (and labels) from an image classification dataset.
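
Several of the functions listed above are often used together. The sketch below is an illustration under stated assumptions rather than a reference implementation: the helper name `first_examples_as_numpy` is hypothetical, and the default dataset name "mnist", split "train", and count of 2 are chosen only as examples. It combines disable_progress_bar, list_builders, load, and as_numpy.

```python
def first_examples_as_numpy(name="mnist", split="train", count=2):
    """Return the first `count` examples of a dataset as NumPy structures."""
    # Imported lazily so this sketch can be defined without the package installed.
    import tensorflow_datasets as tfds

    tfds.disable_progress_bar()  # silence the Tqdm progress bar

    # list_builders() returns the string names of all registered builders.
    if name not in tfds.list_builders():
        raise ValueError("Unknown dataset name: {!r}".format(name))

    dataset = tfds.load(name, split=split)  # tf.data.Dataset

    # as_numpy() turns the tf.data.Dataset into an iterable of NumPy arrays.
    return list(tfds.as_numpy(dataset.take(count)))
```

Checking the name against list_builders first gives an early, readable error for a misspelled dataset name before any download is attempted.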