

Wraps proto.SplitInfo with an additional property.

name Name of the split (e.g. train, test,...)
shard_lengths List containing the number of examples stored in each file (one entry per file).
num_examples Total number of examples (sum(shard_lengths))
num_shards Number of files (len(shard_lengths))
num_bytes Size of the files
statistics Additional statistics of the split.
file_instructions Returns the list of dict(filename, take, skip).

This allows for creating your own tf.data.Dataset using the low-level TFDS values.

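file_instructions = info.splits['train[75%:]'].file_instructions
# Turn the list of dict(filename, take, skip) into a dataset of instructions.
instruction_ds = tf.data.Dataset.from_generator(
    lambda: file_instructions,
    output_types={
        'filename': tf.string,
        'take': tf.int64,
        'skip': tf.int64,
    },
)
# Read each shard, applying the skip and take from its instruction.
ds = instruction_ds.interleave(
    lambda f: tf.data.TFRecordDataset(
        f['filename']).skip(f['skip']).take(f['take'])
)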

When skip=0 and take=-1, the full shard is read, so the ds.skip and ds.take calls can be omitted.
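To illustrate how the skip and take values select examples from each shard, here is a minimal pure-Python sketch. It does not use TFDS or tf.data: the shards are plain in-memory lists, and the function name `read_with_instructions` and the filenames are hypothetical, chosen only for this example. It assumes that take=-1 means "read to the end of the shard", as described above.

```python
from typing import Dict, Iterator, List


def read_with_instructions(
    shards: Dict[str, List[str]],
    file_instructions: List[Dict[str, object]],
) -> Iterator[str]:
    """Yield examples selected by a list of dict(filename, take, skip).

    `shards` is a hypothetical stand-in for files on disk: it maps a
    filename to the list of examples stored in that file.
    """
    for instruction in file_instructions:
        examples = shards[instruction["filename"]]
        skip = instruction["skip"]
        take = instruction["take"]
        # Drop the first `skip` examples of the shard.
        remaining = examples[skip:]
        # take == -1 means "everything that is left"; otherwise truncate.
        if take != -1:
            remaining = remaining[:take]
        yield from remaining


# Two hypothetical shards with four examples each.
shards = {
    "train-00000": ["a0", "a1", "a2", "a3"],
    "train-00001": ["b0", "b1", "b2", "b3"],
}
# The first shard is read from its fourth example on, the second in full.
instructions = [
    {"filename": "train-00000", "skip": 3, "take": -1},
    {"filename": "train-00001", "skip": 0, "take": -1},
]
selected = list(read_with_instructions(shards, instructions))
```

The second instruction has skip=0 and take=-1, so the whole shard passes through unchanged, which is why both steps are unnecessary in that case.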

filenames Returns the list of filenames.
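The relationship between shard_lengths, num_examples and num_shards listed above can be sketched with a small stand-in class. `SplitSummary` is a hypothetical name used only for this illustration and is not part of TFDS; it simply derives the two counts from the per-file lengths in the way the attribute descriptions state.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SplitSummary:
    """Hypothetical stand-in showing how the derived counts are defined."""

    name: str
    shard_lengths: List[int]

    @property
    def num_examples(self) -> int:
        # Total number of examples: sum(shard_lengths).
        return sum(self.shard_lengths)

    @property
    def num_shards(self) -> int:
        # Number of files: len(shard_lengths).
        return len(self.shard_lengths)


# A made-up split stored in three files.
summary = SplitSummary(name="train", shard_lengths=[100, 100, 50])
```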




Returns a copy of the SplitInfo with updated attributes.
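The copy-with-updated-attributes behavior can be illustrated with the standard library's `dataclasses.replace`. This is only an analogy under stated assumptions: `SplitRecord` is a hypothetical class, and the sketch makes no claim about how SplitInfo implements its own method.

```python
import dataclasses
from typing import Tuple


@dataclasses.dataclass(frozen=True)
class SplitRecord:
    """Hypothetical immutable record, not the TFDS SplitInfo class."""

    name: str
    shard_lengths: Tuple[int, ...]


original = SplitRecord(name="train", shard_lengths=(100, 100, 50))
# Build a new object with one attribute changed; the original is untouched.
updated = dataclasses.replace(original, name="validation")
```

Because the record is frozen, updating through a copy is the only way to change a field, which keeps the original object safe to share.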

