tfds.ReadConfig

Configures input reading pipeline.

options: tf.data.Options, dataset options to use. Note that when shuffle_files is True and no seed is defined, experimental_deterministic will be set to False internally, unless it is defined here.
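
For example, a minimal sketch of forcing deterministic ordering while still shuffling files ('mnist' is just a placeholder dataset name; on newer TF versions the attribute is options.deterministic rather than options.experimental_deterministic):

    import tensorflow as tf
    import tensorflow_datasets as tfds

    # Force a deterministic element order even though files are shuffled.
    options = tf.data.Options()
    options.experimental_deterministic = True

    read_config = tfds.ReadConfig(options=options)
    ds = tfds.load('mnist', split='train', shuffle_files=True,
                   read_config=read_config)
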
try_autocache: If True (default) and the dataset satisfies the right conditions (dataset small enough, files not shuffled, ...), the dataset will be cached during the first iteration (through ds = ds.cache()).
shuffle_seed: tf.int64, seed forwarded to tf.data.Dataset.shuffle during file shuffling (which happens when tfds.load(..., shuffle_files=True)).
shuffle_reshuffle_each_iteration: bool, forwarded to tf.data.Dataset.shuffle during file shuffling (which happens when tfds.load(..., shuffle_files=True)).
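
As a sketch, deterministic file shuffling that keeps the same file order across epochs ('mnist' again being a placeholder):

    import tensorflow_datasets as tfds

    # Shuffle files with a fixed seed, and disable reshuffling so the
    # file order stays identical from one epoch to the next.
    read_config = tfds.ReadConfig(
        shuffle_seed=42,
        shuffle_reshuffle_each_iteration=False,
    )
    ds = tfds.load('mnist', split='train', shuffle_files=True,
                   read_config=read_config)
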
interleave_cycle_length: int, forwarded to tf.data.Dataset.interleave.
interleave_block_length: int, forwarded to tf.data.Dataset.interleave.
input_context: tf.distribute.InputContext, if set, each worker will read a different set of files. For more info, see the distribute_datasets_from_function documentation (a usage sketch follows the notes below). Note:

  • Each worker will always read the same subset of files; shuffle_files only shuffles files within each worker.
  • If info.splits[split].num_shards < input_context.num_input_pipelines, an error will be raised, as some workers would otherwise be left with no files to read.
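
A sketch of the per-worker reading pattern, assuming a tf.distribute strategy and a placeholder 'mnist' dataset:

    import tensorflow as tf
    import tensorflow_datasets as tfds

    strategy = tf.distribute.MirroredStrategy()

    def dataset_fn(input_context):
      # Each input pipeline reads a distinct subset of the files.
      read_config = tfds.ReadConfig(input_context=input_context)
      ds = tfds.load('mnist', split='train', shuffle_files=True,
                     read_config=read_config)
      return ds.batch(input_context.get_per_replica_batch_size(64))

    ds = strategy.distribute_datasets_from_function(dataset_fn)
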
experimental_interleave_sort_fn: Function with signature List[FileDict] -> List[FileDict], which takes the list of dict(file: str, take: int, skip: int) and returns a modified version defining the order in which the shards are read. This can be used to sort/shuffle the shards into a custom order, instead of relying on shuffle_files=True.
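
For instance, a sketch of a sort function that reads the shards in reverse order (the exact shape of the FileDict entries follows the description above and may differ across TFDS versions):

    import tensorflow_datasets as tfds

    def _reverse_shards(file_instructions):
      # file_instructions is a list of dicts like
      # {'file': ..., 'take': ..., 'skip': ...}; return them in the
      # order they should be read, here simply reversed.
      return list(reversed(file_instructions))

    read_config = tfds.ReadConfig(
        experimental_interleave_sort_fn=_reverse_shards,
    )
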
skip_prefetch: If False (default), a ds.prefetch() op is added at the end of the pipeline. Setting this to True can be a performance optimization in some cases (e.g. if you're already calling ds.prefetch() at the end of your pipeline).
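
A sketch of moving the prefetch to the very end of a longer pipeline (preprocess is a hypothetical map function):

    import tensorflow as tf
    import tensorflow_datasets as tfds

    def preprocess(example):
      # Hypothetical preprocessing: normalize images to [0, 1].
      image = tf.cast(example['image'], tf.float32) / 255.0
      return image, example['label']

    # Skip the automatic prefetch so the single prefetch op sits at the
    # very end of the pipeline instead.
    read_config = tfds.ReadConfig(skip_prefetch=True)
    ds = tfds.load('mnist', split='train', read_config=read_config)
    ds = ds.map(preprocess).batch(32).prefetch(tf.data.AUTOTUNE)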

Default values:

experimental_interleave_sort_fn: None
input_context: None
interleave_block_length: 16
interleave_cycle_length: 16
shuffle_reshuffle_each_iteration: None
shuffle_seed: None
skip_prefetch: False
try_autocache: True