

Loads the named dataset into a tf.data.Dataset.


tfds.load is a convenience method that:

  1. Fetch the tfds.core.DatasetBuilder by name:

    builder = tfds.builder(name, data_dir=data_dir, **builder_kwargs)
  2. Generate the data (when download=True):

    builder.download_and_prepare(**download_and_prepare_kwargs)
  3. Load the tf.data.Dataset object:

    ds = builder.as_dataset(
        split=split,
        as_supervised=as_supervised,
        shuffle_files=shuffle_files,
        read_config=read_config,
        decoders=decoders,
        **as_dataset_kwargs)
See the TFDS overview guide for more examples.

If you'd like NumPy arrays instead of tf.data.Datasets or tf.Tensors, you can pass the return value to tfds.as_numpy.
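For instance, a minimal end-to-end call might look like the sketch below (it assumes the tensorflow_datasets package is installed and that the 'mnist' dataset can be downloaded):

```python
import tensorflow_datasets as tfds

# Load the train split of MNIST, together with the dataset metadata.
ds, ds_info = tfds.load(
    'mnist',
    split='train',
    as_supervised=True,  # yield (input, label) tuples
    with_info=True,      # also return tfds.core.DatasetInfo
)

# Iterate over the first example as NumPy arrays instead of tf.Tensors.
for image, label in tfds.as_numpy(ds.take(1)):
    print(image.shape, label)
```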

Args:

name str, the registered name of the DatasetBuilder (the snake case version of the class name). This can be either 'dataset_name' or 'dataset_name/config_name' for datasets with BuilderConfigs. As a convenience, this string may contain comma-separated keyword arguments for the builder. For example 'foo_bar/a=True,b=3' would use the FooBar dataset passing the keyword arguments a=True and b=3 (for builders with configs, it would be 'foo_bar/zoo/a=True,b=3' to use the 'zoo' config and pass to the builder keyword arguments a=True and b=3).
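The comma-separated keyword syntax can be illustrated with a small parser. Note that parse_name below is a hypothetical helper, not part of the TFDS API; it only sketches how a string such as 'foo_bar/zoo/a=True,b=3' decomposes:

```python
import ast

def parse_name(name):
    """Split 'dataset/config/kw=v,...' into (dataset, config, kwargs).

    Hypothetical helper for illustration; not the actual TFDS parser.
    """
    parts = name.split('/')
    dataset = parts[0]
    config = None
    kwargs = {}
    for part in parts[1:]:
        if '=' in part:
            for pair in part.split(','):
                key, value = pair.split('=')
                # Interpret True / 3 / 'x' etc. as Python literals.
                kwargs[key] = ast.literal_eval(value)
        else:
            config = part
    return dataset, config, kwargs

print(parse_name('foo_bar/zoo/a=True,b=3'))
# -> ('foo_bar', 'zoo', {'a': True, 'b': 3})
```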
split Which split of the data to load (e.g. 'train', 'test', ['train', 'test'], 'train[80%:]',...). See our split API guide. If None, will return all splits in a Dict[Split, tf.data.Dataset].
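As a rough illustration of how a percent slice such as 'train[80%:]' maps onto example indices, here is a hypothetical helper (it only approximates TFDS, which resolves percent boundaries against the split's actual example count):

```python
def percent_slice_bounds(num_examples, start_pct=0, end_pct=100):
    """Map a percent slice like [80%:] onto example index bounds.

    Hypothetical sketch: TFDS rounds percent boundaries to example
    indices in a similar, but not necessarily identical, way.
    """
    start = round(num_examples * start_pct / 100)
    end = round(num_examples * end_pct / 100)
    return start, end

# 'train[80%:]' over a 50-example split covers examples 40..49.
print(percent_slice_bounds(50, 80, 100))  # -> (40, 50)
```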
data_dir str, directory to read/write data. Defaults to the value of the environment variable TFDS_DATA_DIR, if set, otherwise falls back to '~/tensorflow_datasets'.
batch_size int, if set, add a batch dimension to examples. Note that variable length features will be 0-padded. If batch_size=-1, will return the full dataset as tf.Tensors.
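To see what 0-padding of variable-length features means in practice, here is a small pure-Python sketch, independent of TFDS:

```python
def pad_batch(sequences):
    """Pad variable-length sequences with 0 to a common length,
    mimicking how batching pads variable-length features."""
    max_len = max(len(seq) for seq in sequences)
    return [list(seq) + [0] * (max_len - len(seq)) for seq in sequences]

print(pad_batch([[1, 2, 3], [4], [5, 6]]))
# -> [[1, 2, 3], [4, 0, 0], [5, 6, 0]]
```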
shuffle_files bool, whether to shuffle the input files. Defaults to False.
download bool (optional), whether to call tfds.core.DatasetBuilder.download_and_prepare before calling tfds.core.DatasetBuilder.as_dataset. If False, data is expected to be in data_dir. If True and the data is already in data_dir, download_and_prepare is a no-op.
as_supervised bool, if True, the returned tf.data.Dataset will have a 2-tuple structure (input, label) according to builder.info.supervised_keys. If False, the default, the returned tf.data.Dataset will have a dictionary with all the features.
decoders Nested dict of Decoder objects which allow to customize the decoding. The structure should match the feature structure, but only customized feature keys need to be present. See the guide for more info.
read_config tfds.ReadConfig, additional options to configure the input pipeline (e.g. seed, num parallel reads,...).
with_info bool, if True, tfds.load will return the tuple (tf.data.Dataset, tfds.core.DatasetInfo), the latter containing the info associated with the builder.
builder_kwargs dict (optional), keyword arguments to be passed to the tfds.core.DatasetBuilder constructor. data_dir will be passed through by default.
download_and_prepare_kwargs dict (optional), keyword arguments passed to tfds.core.DatasetBuilder.download_and_prepare if download=True. Allows control over where the data is downloaded and extracted. If not set, cache_dir and manual_dir will automatically be deduced from data_dir.
as_dataset_kwargs dict (optional), keyword arguments passed to tfds.core.DatasetBuilder.as_dataset.
try_gcs bool, if True, tfds.load will see if the dataset exists on the public GCS bucket before building it locally.

Returns:

ds tf.data.Dataset, the dataset requested, or if split is None, a dict<key: tfds.Split, value: tf.data.Dataset>. If batch_size=-1, these will be full datasets as tf.Tensors.
ds_info tfds.core.DatasetInfo, if with_info is True, then tfds.load will return a tuple (ds, ds_info) containing dataset information (version, features, splits, num_examples,...). Note that the ds_info object documents the entire dataset, regardless of the split requested. Split-specific information is available in ds_info.splits.