Configures input reading pipeline.

options tf.data.Options(), dataset options to use. Note that when shuffle_files is True and no seed is defined, experimental_deterministic will be set to False internally, unless it is defined here.
try_autocache If True (default) and the dataset satisfies the right conditions (dataset small enough, files not shuffled, ...), the dataset will be cached during the first iteration (through ds = ds.cache()).
add_tfds_id If True, the examples dict in the tf.data.Dataset will have an additional key 'tfds_id': tf.Tensor(shape=(), dtype=tf.string) containing the example's unique identifier (e.g. 'train.tfrecord-000045-of-001024__123'). Note: IDs might change in future versions of TFDS.
shuffle_seed tf.int64, seed forwarded to tf.data.Dataset.shuffle during file shuffling (which happens when tfds.load(..., shuffle_files=True)).
shuffle_reshuffle_each_iteration bool, forwarded to tf.data.Dataset.shuffle during file shuffling (which happens when tfds.load(..., shuffle_files=True)).
interleave_cycle_length int, forwarded to tf.data.Dataset.interleave.
interleave_block_length int, forwarded to tf.data.Dataset.interleave.
input_context tf.distribute.InputContext, if set, each worker will read a different set of files. For more info, see the distribute_datasets_from_function documentation. Note: * Each worker will always read the same subset of files. shuffle_files only shuffles files within each worker. * If info.splits[split].num_shards < input_context.num_input_pipelines, an error will be raised, as some workers would be empty.
experimental_interleave_sort_fn Function with signature List[FileDict] -> List[FileDict], which takes the list of dict(file: str, take: int, skip: int) and returns the modified version to read. This can be used to sort/shuffle the shards to read in a custom order, instead of relying on shuffle_files=True.
skip_prefetch If False (default), add a ds.prefetch() op at the end. Setting it to True can help performance in some cases (e.g. if you're already calling ds.prefetch() at the end of your pipeline).
num_parallel_calls_for_decode The number of parallel calls for decoding records. Defaults to tf.data's AUTOTUNE.
num_parallel_calls_for_interleave_files The number of parallel calls for interleaving files. Defaults to tf.data's AUTOTUNE.
enable_ordering_guard When True (default), an exception is raised if shuffling or interleaving is used on an ordered dataset.
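
As an illustration of experimental_interleave_sort_fn, the following is a minimal sketch of a function with the List[FileDict] -> List[FileDict] signature described above. It assumes each FileDict is a plain dict with the keys file, take and skip; the function name, the seed, and the example file names and values are hypothetical, and the sketch runs without TFDS installed.

```python
import random
from typing import Dict, List, Union

# Assumption: a FileDict is a plain dict with the keys 'file', 'take' and 'skip',
# as described for experimental_interleave_sort_fn.
FileDict = Dict[str, Union[str, int]]


def shuffle_shards_with_seed(file_dicts: List[FileDict]) -> List[FileDict]:
    """Returns the shards in a fixed pseudo-random order (hypothetical helper)."""
    reordered = list(file_dicts)  # Copy so the input list is left untouched.
    random.Random(42).shuffle(reordered)  # Fixed seed: same order on every call.
    return reordered


# Hypothetical shard descriptions, for illustration only.
example_files = [
    {"file": "train.tfrecord-00000-of-00003", "take": 100, "skip": 0},
    {"file": "train.tfrecord-00001-of-00003", "take": 100, "skip": 0},
    {"file": "train.tfrecord-00002-of-00003", "take": 100, "skip": 0},
]
reordered_files = shuffle_shards_with_seed(example_files)

# Assumed usage, not executed here:
#   read_config = tfds.ReadConfig(
#       experimental_interleave_sort_fn=shuffle_shards_with_seed)
```

Returning a new list rather than reordering in place keeps the function free of side effects on whatever list the caller passes in.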

add_tfds_id False
enable_ordering_guard True
experimental_interleave_sort_fn None
input_context None
interleave_block_length 16
interleave_cycle_length Instance of tensorflow_datasets.core.utils.read_config._MISSING
num_parallel_calls_for_decode -1
num_parallel_calls_for_interleave_files -1
shuffle_reshuffle_each_iteration None
shuffle_seed None
skip_prefetch False
try_autocache True
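
The notes on input_context can be made concrete with a small pure-Python sketch. It is not the TFDS implementation: it assumes a simple strided assignment of shard files to workers, and the helper name and file names are hypothetical. It only illustrates the two documented behaviours: each worker always gets the same subset of files, and an error is raised when there are fewer shards than input pipelines.

```python
from typing import List


def files_for_worker(
    files: List[str], num_input_pipelines: int, input_pipeline_id: int
) -> List[str]:
    """Returns the files read by one worker (hypothetical helper).

    Assumption: files are assigned with a stride of num_input_pipelines.
    The real assignment in TFDS may differ.
    """
    if len(files) < num_input_pipelines:
        # Mirrors the documented error: some workers would have no file to read.
        raise ValueError(
            f"Cannot split {len(files)} shards across "
            f"{num_input_pipelines} input pipelines."
        )
    return files[input_pipeline_id::num_input_pipelines]


# Hypothetical shard names, for illustration only.
shards = [f"train.tfrecord-{i:05d}-of-00004" for i in range(4)]
worker_0 = files_for_worker(shards, num_input_pipelines=2, input_pipeline_id=0)
worker_1 = files_for_worker(shards, num_input_pipelines=2, input_pipeline_id=1)
```

Under this assumption the two workers read disjoint subsets that together cover all four shards, and the assignment does not change between calls.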