tfds.core.SplitInfo
Wraps proto.SplitInfo with an additional property.
tfds.core.SplitInfo(
    name: str,
    shard_lengths: List[int],
    num_bytes: int,
    filename_template: Optional[naming.ShardedFileTemplate] = None,
    statistics: statistics_pb2.DatasetFeatureStatistics = dataclasses.field(default_factory=statistics_pb2.DatasetFeatureStatistics)
)
Attributes |
name
|
Name of the split (e.g. train, test, ...)
|
shard_lengths
|
List with one entry per file (its length equals num_shards), where each
entry is the number of examples stored in that file.
|
filename_template
|
The template used to create sharded filenames.
|
num_examples
|
Total number of examples (sum(shard_lengths))
|
num_shards
|
Number of files (len(shard_lengths))
|
num_bytes
|
Size of the files (in bytes)
|
statistics
|
Additional statistics of the split.
|
file_instructions
|
Returns the list of dict(filename, take, skip).
This allows for creating your own tf.data.Dataset using the low-level
TFDS values.

    import tensorflow as tf

    # `info` is the DatasetInfo of an already prepared dataset.
    file_instructions = info.splits['train[75%:]'].file_instructions
    # One element per shard file, with the number of records to skip and take.
    instruction_ds = tf.data.Dataset.from_generator(
        lambda: file_instructions,
        output_types={
            'filename': tf.string,
            'take': tf.int64,
            'skip': tf.int64,
        },
    )
    # Read each shard, keeping only the records selected by skip/take.
    ds = instruction_ds.interleave(
        lambda f: tf.data.TFRecordDataset(
            f['filename']).skip(f['skip']).take(f['take'])
    )

When skip=0 and take=-1, the full shard is read, so the ds.skip
and ds.take calls can be omitted.
|
filenames
|
Returns the list of filenames.
|
filepaths
|
All the paths for all the files that are part of this split.
|
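The relationship between shard_lengths, num_examples, num_shards and the skip/take values in file_instructions can be sketched in plain Python. This is an illustrative stand-in only, not the TFDS implementation: SplitInfoSketch, sketch_file_instructions, the filename pattern and the way the 75% boundary is rounded are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class SplitInfoSketch:
    """Hypothetical stand-in for illustration; not the TFDS class."""
    name: str
    shard_lengths: List[int]
    num_bytes: int

    @property
    def num_examples(self) -> int:
        # Total number of examples across all shard files.
        return sum(self.shard_lengths)

    @property
    def num_shards(self) -> int:
        # One shard length per file.
        return len(self.shard_lengths)


def sketch_file_instructions(
    split: SplitInfoSketch, start: int, stop: int
) -> List[Dict[str, Union[str, int]]]:
    """Maps the example range [start, stop) onto per-shard skip/take values."""
    instructions = []
    shard_start = 0
    for index, length in enumerate(split.shard_lengths):
        shard_stop = shard_start + length
        lo = max(start, shard_start)
        hi = min(stop, shard_stop)
        if lo < hi:  # This shard overlaps the requested range.
            instructions.append({
                # Made-up filename pattern, for illustration only.
                'filename': f'{split.name}-{index:05d}-of-{split.num_shards:05d}',
                'skip': lo - shard_start,
                # Assumption: -1 means "read to the end of the shard".
                'take': -1 if hi == shard_stop else hi - lo,
            })
        shard_start = shard_stop
    return instructions


split = SplitInfoSketch(name='train', shard_lengths=[100, 100, 100, 50], num_bytes=4096)
# Rough equivalent of 'train[75%:]', using floor rounding (an assumption).
start = split.num_examples * 75 // 100
instructions = sketch_file_instructions(split, start, split.num_examples)
for instruction in instructions:
    print(instruction)
```

In this sketch the first two shards lie entirely before the 75% boundary and produce no instruction, the third shard is read after skipping its leading examples, and the last shard is read in full.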
Methods
from_proto
@classmethod
from_proto(
proto: proto_lib.SplitInfo, filename_template: naming.ShardedFileTemplate
) -> 'SplitInfo'
Returns a SplitInfo class instance from a SplitInfo proto.
replace
replace(
**kwargs
) -> 'SplitInfo'
Returns a copy of the SplitInfo with updated attributes.
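The copy-with-updates behaviour described for replace can be illustrated with a small frozen dataclass. This is a sketch under the assumption that replace behaves like dataclasses.replace; SplitInfoSketch is a hypothetical stand-in, not the TFDS class.

```python
import dataclasses
from typing import List


@dataclasses.dataclass(frozen=True)
class SplitInfoSketch:
    """Hypothetical stand-in for illustration; not the TFDS class."""
    name: str
    shard_lengths: List[int]
    num_bytes: int

    def replace(self, **kwargs) -> 'SplitInfoSketch':
        # Build a new instance; fields not named in kwargs are carried over.
        return dataclasses.replace(self, **kwargs)


original = SplitInfoSketch(name='train', shard_lengths=[10, 20], num_bytes=1024)
updated = original.replace(shard_lengths=[10, 20, 30])

print(original.shard_lengths)  # The original instance is left unchanged.
print(updated.shard_lengths)
```

Returning a new object rather than mutating in place keeps the original split description safe to share between callers.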
to_proto
to_proto() -> proto_lib.SplitInfo
Class Variables |
filename_template
|
None
|
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-04-26 UTC.