tf.raw_ops.ParallelInterleaveDataset

Creates a dataset that applies f to the outputs of input_dataset.

tf.raw_ops.ParallelInterleaveDataset(
    input_dataset, other_arguments, cycle_length, block_length, sloppy,
    buffer_output_elements, prefetch_input_elements, f, output_types, output_shapes,
    name=None
)

The resulting dataset is similar to the InterleaveDataset, with the exception that if retrieving the next value from a dataset would cause the requester to block, it will skip that input dataset. This dataset is especially useful when loading data from a variable-latency datastores (e.g. HDFS, GCS), as it allows the training step to proceed so long as some data is available.

!! WARNING !! If the sloppy parameter is set to True, the operation of this dataset will not be deterministic!

This dataset has been superseded by ParallelInterleaveDatasetV2. New code should use ParallelInterleaveDatasetV2.

The Python API tf.data.experimental.parallel_interleave creates instances of this op. tf.data.experimental.parallel_interleave is a deprecated API.

Args:

  • input_dataset: A Tensor of type variant. Dataset that produces a stream of arguments for the function f.
  • other_arguments: A list of Tensor objects. Additional arguments to pass to f beyond those produced by input_dataset. Evaluated once when the dataset is instantiated.
  • cycle_length: A Tensor of type int64. Number of datasets (each created by applying f to the elements of input_dataset) among which the ParallelInterleaveDataset will cycle in a round-robin fashion.
  • block_length: A Tensor of type int64. Number of elements at a time to produce from each interleaved invocation of a dataset returned by f.
  • sloppy: A Tensor of type bool. If True, return elements as they become available, even if that means returning these elements in a non-deterministic order. Sloppy operation may result in better performance in the presence of stragglers, but the dataset will still block if all of its open streams are blocked. If False, always return elements in a deterministic order.
  • buffer_output_elements: A Tensor of type int64. The number of elements each iterator being interleaved should buffer (similar to the .prefetch() transformation for each interleaved iterator).
  • prefetch_input_elements: A Tensor of type int64. Determines the number of iterators to prefetch, allowing buffers to warm up and data to be pre-fetched without blocking the main thread.
  • f: A function decorated with @Defun. A function mapping elements of input_dataset, concatenated with other_arguments, to a Dataset variant that contains elements matching output_types and output_shapes.
  • output_types: A list of tf.DTypes that has length >= 1.
  • output_shapes: A list of shapes (each a tf.TensorShape or list of ints) that has length >= 1.
  • name: A name for the operation (optional).

Returns:

A Tensor of type variant.