tf.contrib.data.shuffle_and_repeat

tf.contrib.data.shuffle_and_repeat(
    buffer_size,
    count=None,
    seed=None
)

Defined in tensorflow/contrib/data/python/ops/shuffle_ops.py.

See the guide: Dataset Input Pipeline > Transformations on existing datasets

Shuffles and repeats a Dataset, returning a new permutation for each epoch.

dataset.apply(tf.contrib.data.shuffle_and_repeat(buffer_size, count))

is equivalent to

dataset.shuffle(buffer_size, reshuffle_each_iteration=True).repeat(count)

The difference is that the latter dataset is not serializable. So, if you need to checkpoint an input pipeline with reshuffling, you must use this implementation.
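The semantics described above (a bounded shuffle buffer, reshuffled for every epoch, then repeated count times) can be illustrated with a small pure-Python model. This is only a sketch and not the TensorFlow implementation: the names buffered_shuffle and shuffle_and_repeat_model are hypothetical, the input is assumed to be an in-memory list, and Python's random module stands in for TensorFlow's seeding behavior.

```python
import itertools
import random


def buffered_shuffle(elements, buffer_size, rng):
    """Yield elements in random order while holding at most buffer_size of them."""
    buffer = []
    for element in elements:
        if len(buffer) == buffer_size:
            # Buffer is full: emit a random buffered element and
            # put the incoming element in its slot.
            index = rng.randrange(buffer_size)
            yield buffer[index]
            buffer[index] = element
        else:
            buffer.append(element)
    # Input exhausted: drain whatever is left, in random order.
    rng.shuffle(buffer)
    yield from buffer


def shuffle_and_repeat_model(elements, buffer_size, count=None, seed=None):
    """Hypothetical model: reshuffle the input once per epoch, for count epochs."""
    rng = random.Random(seed)
    # count of None or -1 means "repeat indefinitely", as in the Args section.
    epochs = itertools.count() if count is None or count == -1 else range(count)
    for _ in epochs:
        # The same rng carries over, so each epoch draws a fresh permutation.
        yield from buffered_shuffle(elements, buffer_size, rng)


data = list(range(6))
two_epochs = list(shuffle_and_repeat_model(data, buffer_size=6, count=2, seed=0))
first_epoch, second_epoch = two_epochs[:6], two_epochs[6:]
print(first_epoch, second_epoch)
```

In this model each epoch contains every element exactly once, and a buffer_size smaller than the input only permutes elements locally. With buffer_size=1 the original order is preserved.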

Args:

  • buffer_size: A tf.int64 scalar tf.Tensor, representing the maximum number of elements that will be buffered when prefetching.
  • count: (Optional.) A tf.int64 scalar tf.Tensor, representing the number of times the dataset should be repeated. The default behavior (if count is None or -1) is for the dataset to be repeated indefinitely.
  • seed: (Optional.) A tf.int64 scalar tf.Tensor, representing the random seed that will be used to create the distribution. See tf.set_random_seed for behavior.

Returns:

A Dataset transformation function, which can be passed to tf.data.Dataset.apply.
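The "transformation function" pattern can be sketched without TensorFlow: a factory takes the configuration and returns a function that maps one dataset to another, and apply simply calls that function on the dataset. The MiniDataset class and repeat_transformation factory below are hypothetical stand-ins introduced for illustration only. They are not part of tf.data.

```python
class MiniDataset:
    """Hypothetical stand-in for a dataset; wraps a list of elements."""

    def __init__(self, elements):
        self.elements = list(elements)

    def apply(self, transformation_func):
        # Call the transformation on this dataset and return its result.
        return transformation_func(self)


def repeat_transformation(count):
    """Hypothetical factory: returns a function from dataset to dataset."""

    def _apply_fn(dataset):
        # Build a new dataset holding the elements repeated count times.
        return MiniDataset(dataset.elements * count)

    return _apply_fn


result = MiniDataset([1, 2, 3]).apply(repeat_transformation(2))
print(result.elements)  # prints [1, 2, 3, 1, 2, 3]
```

Returning a function rather than a dataset lets the configuration (here count) be fixed up front, while the dataset is supplied later by apply, so transformations can be chained in a single expression.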