
Configures the input reading pipeline.

options tf.data.Options(), dataset options to use. Note that when shuffle_files is True and no seed is defined, deterministic is set to False internally, unless it is defined here.
try_autocache If True (default) and the dataset satisfies the right conditions (dataset small enough, files not shuffled, ...), the dataset is cached during the first iteration (through ds = ds.cache()).
add_tfds_id If True, each example dict in the dataset has an additional key 'tfds_id': tf.Tensor(shape=(), dtype=tf.string) containing the example's unique identifier (e.g. 'train.tfrecord-000045-of-001024__123'). Note: IDs might change in future versions of TFDS.
shuffle_seed tf.int64, seed forwarded to tf.data.Dataset.shuffle during file shuffling (which happens when tfds.load(..., shuffle_files=True)).
shuffle_reshuffle_each_iteration bool, forwarded to tf.data.Dataset.shuffle during file shuffling (which happens when tfds.load(..., shuffle_files=True)).
interleave_cycle_length int, forwarded to tf.data.Dataset.interleave.
interleave_block_length int, forwarded to tf.data.Dataset.interleave.
input_context tf.distribute.InputContext. If set, each worker reads a different set of files. For more info, see the distribute_datasets_from_function documentation. Note that each worker always reads the same subset of files (shuffle_files only shuffles files within each worker), and that an error is raised if info.splits[split].num_shards < input_context.num_input_pipelines, as some workers would be empty.
experimental_interleave_sort_fn Function with signature List[FileDict] -> List[FileDict], which takes the list of dict(file: str, take: int, skip: int) and returns the modified version to read. This can be used to sort/shuffle the shards to read in a custom order, instead of relying on shuffle_files=True.
skip_prefetch If False (default), a ds.prefetch() op is added at the end. Setting it to True can help performance in some cases (e.g. if you're already calling ds.prefetch() at the end of your pipeline).
num_parallel_calls_for_decode The number of parallel calls for decoding records. Defaults to tf.data's AUTOTUNE.
num_parallel_calls_for_interleave_files The number of parallel calls for interleaving files. Defaults to tf.data's AUTOTUNE.
enable_ordering_guard When True (default), an exception is raised if shuffling or interleaving is used on an ordered dataset.
assert_cardinality When True (default), an exception is raised if, at the end of an epoch, the number of examples read does not match the number expected from the dataset metadata. A power user would typically set this to False if the input files have been tampered with and they don't mind missing or extra records.
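
The input_context behaviour described above (each worker reads a fixed, distinct subset of files, and an error is raised when there are fewer shards than input pipelines) can be illustrated with a small pure-Python sketch. The helper name shards_for_worker and the round-robin assignment are assumptions made for illustration only; they are not taken from TFDS and may differ from how the library actually assigns files. The parameter names mirror the num_input_pipelines and input_pipeline_id fields of tf.distribute.InputContext.

```python
from typing import List


def shards_for_worker(
    shard_files: List[str],
    num_input_pipelines: int,
    input_pipeline_id: int,
) -> List[str]:
    """Return the fixed subset of shard files read by one worker.

    Hypothetical helper: the round-robin assignment is an assumption,
    not necessarily the rule TFDS applies.
    """
    if len(shard_files) < num_input_pipelines:
        # Mirrors the documented error: some workers would have no file.
        raise ValueError(
            f"Cannot split {len(shard_files)} shards across "
            f"{num_input_pipelines} input pipelines."
        )
    # Take every num_input_pipelines-th file, starting at this worker's index.
    return shard_files[input_pipeline_id::num_input_pipelines]


# Made-up shard names, for illustration only.
files = [f"train.tfrecord-{i:05d}-of-00004" for i in range(4)]
worker_0 = shards_for_worker(files, num_input_pipelines=2, input_pipeline_id=0)
worker_1 = shards_for_worker(files, num_input_pipelines=2, input_pipeline_id=1)
```

Under this assumed scheme the two workers never share a file, and the assignment does not change between epochs, which is why shuffle_files can only reorder files within a worker.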

add_tfds_id False
assert_cardinality True
enable_ordering_guard True
experimental_interleave_sort_fn None
input_context None
interleave_block_length 16
interleave_cycle_length 'missing'
num_parallel_calls_for_decode None
num_parallel_calls_for_interleave_files None
options None
shuffle_reshuffle_each_iteration None
shuffle_seed None
skip_prefetch False
try_autocache True
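
Since experimental_interleave_sort_fn defaults to None, a custom ordering has to be supplied explicitly. The following is a minimal sketch of such a function, assuming the shards arrive as plain dicts with the file, take, and skip keys described above. The function name sort_shards_reversed and the sample values are hypothetical.

```python
from typing import Dict, List, Union

# Assumed shape of one entry: {"file": str, "take": int, "skip": int}.
FileDict = Dict[str, Union[str, int]]


def sort_shards_reversed(file_dicts: List[FileDict]) -> List[FileDict]:
    """Return the shards in reverse filename order, leaving the input untouched."""
    return sorted(file_dicts, key=lambda d: d["file"], reverse=True)


# Made-up shard descriptions, for illustration only.
shards = [
    {"file": "train.tfrecord-00000-of-00003", "take": 100, "skip": 0},
    {"file": "train.tfrecord-00001-of-00003", "take": 100, "skip": 0},
    {"file": "train.tfrecord-00002-of-00003", "take": 50, "skip": 10},
]
reordered = sort_shards_reversed(shards)
```

A function like this would then be passed as the experimental_interleave_sort_fn argument when building the read config, in place of relying on shuffle_files=True.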