Information about a dataset.
tfds.core.DatasetInfo(
*, builder=None, description=None, features=None, supervised_keys=None,
homepage=None, citation=None, metadata=None, redistribution_info=None
)
DatasetInfo documents a dataset, including its name, version, and features.
See the constructor arguments and properties for a full list.
Args | |
---|---|
builder | DatasetBuilder, the dataset builder for this info. |
description | str, description of this dataset. |
features | tfds.features.FeaturesDict, information on the feature dict of the tf.data.Dataset object returned by the builder.as_dataset() method. |
supervised_keys | Tuple of (input_key, target_key). Specifies the input feature and the label for supervised learning, if applicable for the dataset. The keys correspond to the feature names to select in info.features. When calling tfds.core.DatasetBuilder.as_dataset() with as_supervised=True, the tf.data.Dataset object will yield the (input, target) pair defined here. |
homepage | str, optional, the homepage for this dataset. |
citation | str, optional, the citation to use for this dataset. |
metadata | tfds.core.Metadata, additional object which will be stored/restored with the dataset. This allows for storing additional information with the dataset. |
redistribution_info | dict, optional, information needed for redistribution, as specified in dataset_info_pb2.RedistributionInfo. The content of the license subfield will automatically be written to a LICENSE file stored with the dataset. |
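The effect of supervised_keys can be pictured with plain dictionaries. The sketch below is not TFDS code: select_supervised is a hypothetical helper, and the example features ("image", "label", "id") are made up. It only illustrates the selection described for as_supervised=True, where the two named features are picked out of the full feature dict and returned as an (input, target) pair.

```python
from typing import Any, Mapping, Optional, Tuple


def select_supervised(
    example: Mapping[str, Any],
    supervised_keys: Optional[Tuple[str, str]],
) -> Tuple[Any, Any]:
    """Hypothetical helper: pick the (input, target) pair out of a feature dict."""
    if supervised_keys is None:
        # A dataset without supervised_keys has no (input, target) view.
        raise ValueError("This dataset does not define supervised_keys.")
    input_key, target_key = supervised_keys
    return example[input_key], example[target_key]


# A made-up example with three features; only the two named keys are selected.
example = {"image": "pixels", "label": 3, "id": 42}
pair = select_supervised(example, ("image", "label"))
print(pair)
```

Features that are not named in supervised_keys (here, "id") are simply absent from the pair, which is why the full feature dict remains the default view.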
Attributes | |
---|---|
as_json | |
as_proto | |
citation | |
data_dir | |
dataset_size | Generated dataset files size, in bytes. |
description | |
download_size | Downloaded files size, in bytes. |
features | |
full_name | Full canonical name: ( |
homepage | |
initialized | Whether DatasetInfo has been fully initialized. |
metadata | |
module_name | |
name | |
redistribution_info | |
splits | |
supervised_keys | |
version | |
Methods
compute_dynamic_properties
compute_dynamic_properties()
initialize_from_bucket
initialize_from_bucket()
Initialize DatasetInfo from GCS bucket info files.
read_from_directory
read_from_directory(
dataset_info_dir
)
Update DatasetInfo from the JSON file in dataset_info_dir.
This function updates all the dynamically generated fields (num_examples, hash, time of creation, ...) of the DatasetInfo.
This will overwrite all previous metadata.
Args | |
---|---|
dataset_info_dir | str, the directory containing the metadata file. This should be the root directory of a specific dataset version. |
Raises | |
---|---|
FileNotFoundError | If the file can't be found. |
set_splits
set_splits(
split_dict: tfds.core.SplitDict
) -> None
Split setter (private method).
write_to_directory
write_to_directory(
dataset_info_dir
)
Write DatasetInfo as JSON to dataset_info_dir.
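The write/read contract described for write_to_directory and read_from_directory can be sketched with the standard library alone. This is a stand-in under stated assumptions, not the TFDS implementation: write_info and read_info are hypothetical helpers, the file name dataset_info.json is assumed for illustration, and the metadata dict is made up. It shows a JSON round trip through a dataset version directory and the FileNotFoundError raised when the metadata file is missing.

```python
import json
import os
import tempfile

INFO_FILENAME = "dataset_info.json"  # assumed file name, for illustration only


def write_info(info: dict, dataset_info_dir: str) -> None:
    """Hypothetical helper: serialize the metadata dict as JSON in the directory."""
    os.makedirs(dataset_info_dir, exist_ok=True)
    with open(os.path.join(dataset_info_dir, INFO_FILENAME), "w") as f:
        json.dump(info, f, indent=2, sort_keys=True)


def read_info(dataset_info_dir: str) -> dict:
    """Hypothetical helper: load the metadata dict back from the directory."""
    path = os.path.join(dataset_info_dir, INFO_FILENAME)
    with open(path) as f:  # open() raises FileNotFoundError if the file is missing
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    # The directory plays the role of the root of one dataset version.
    version_dir = os.path.join(tmp, "my_dataset", "1.0.0")
    write_info({"name": "my_dataset", "version": "1.0.0"}, version_dir)
    restored = read_info(version_dir)

    try:
        read_info(os.path.join(tmp, "missing"))
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True
```

Because the loaded values replace whatever was held before, a caller that needs the previous metadata should keep its own copy before reading.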