The TFDS CLI is a command-line tool that provides various commands for working with TensorFlow Datasets.

Disable TF logs on import
%env TF_CPP_MIN_LOG_LEVEL=1  # Disable logs on TF import


The CLI tool is installed with tensorflow-datasets (or tfds-nightly).

pip install -q tfds-nightly
tfds --version
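
After installing, it can be useful to confirm from Python that the `tfds` executable is actually reachable on the `PATH`. The following is a minimal sketch using only the standard library; `find_tfds_cli` is a hypothetical helper name, not part of TFDS.

```python
import shutil


def find_tfds_cli():
    """Return the full path of the `tfds` executable, or None if it is not on PATH.

    Hypothetical helper for illustration; it only inspects PATH and does not
    import or run TensorFlow Datasets.
    """
    return shutil.which("tfds")


location = find_tfds_cli()
print(location or "tfds not found on PATH; try `pip install tensorflow-datasets`")
```

If the helper returns `None`, the package is probably installed in a different environment than the one the shell is using.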

To list all CLI commands:

tfds --help
usage: tfds [-h] [--helpfull] [--version] {build,new} ...

Tensorflow Datasets CLI tool

optional arguments:
  -h, --help   show this help message and exit
  --helpfull   show full help message and exit
  --version    show program's version number and exit

    build      Commands for downloading and preparing datasets.
    new        Creates a new dataset directory from the template.

tfds new: Implementing a new dataset

This command helps you get started writing a new Python dataset by creating a <dataset_name>/ directory containing the default implementation files.


tfds new my_dataset
2022-02-07 04:04:10.397902: E tensorflow/stream_executor/cuda/] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
Dataset generated at /tmpfs/src/temp/docs/my_dataset
You can start searching `TODO(my_dataset)` to complete the implementation.
Please check for additional details.


ls -1 my_dataset/

See our guide to writing datasets for more information.

Available options:

tfds new --help
usage: tfds new [-h] [--helpfull] [--dir DIR] dataset_name

positional arguments:
  dataset_name  Name of the dataset to be created (in snake_case)

optional arguments:
  -h, --help    show this help message and exit
  --helpfull    show full help message and exit
  --dir DIR     Path where the dataset directory will be created. Defaults to
                current directory.
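
The help text above asks for `dataset_name` in snake_case. As a rough sketch, a name can be checked before calling `tfds new`. The regular expression below is an assumption about what counts as snake_case (lowercase letters and digits, single underscores between words); it is not the validation TFDS itself performs, and `is_snake_case` is a hypothetical helper.

```python
import re

# Assumed definition of snake_case: starts with a lowercase letter, then
# lowercase letters or digits, with single underscores separating words.
_SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def is_snake_case(name):
    """Return True if `name` matches the assumed snake_case pattern."""
    return _SNAKE_CASE.fullmatch(name) is not None


for candidate in ["my_dataset", "MyDataset", "my-dataset"]:
    print(candidate, is_snake_case(candidate))
```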

tfds build: Download and prepare a dataset

Use tfds build <my_dataset> to generate a new dataset. <my_dataset> can be:

  • A path to a dataset/ folder or file (leave empty for the current directory):

    • tfds build datasets/my_dataset/
    • cd datasets/my_dataset/ && tfds build
    • cd datasets/my_dataset/ && tfds build my_dataset
  • A registered dataset:

    • tfds build mnist
    • tfds build my_dataset --imports my_project.datasets
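
When the build is driven from a script rather than typed by hand, the invocations listed above can be assembled as an argument list. This is a minimal sketch: `tfds_build_argv` is a hypothetical helper, and it only builds the command; it does not run it. Joining modules with commas follows the `--imports` description in the help output.

```python
import shlex


def tfds_build_argv(target=None, imports=None):
    """Build the argv list for a `tfds build` call (hypothetical helper).

    target:  dataset path or registered name; None means "current directory".
    imports: optional list of module names, passed as a comma-separated
             value to --imports.
    """
    argv = ["tfds", "build"]
    if target:
        argv.append(target)
    if imports:
        argv += ["--imports", ",".join(imports)]
    return argv


print(shlex.join(tfds_build_argv("mnist")))
print(shlex.join(tfds_build_argv("my_dataset", imports=["my_project.datasets"])))
```

The resulting list can be handed to `subprocess.run(argv, cwd=...)`, where `cwd` plays the role of the `cd datasets/my_dataset/` step.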

Available options:

tfds build --help
usage: tfds build [-h] [--helpfull]
                  [--datasets DATASETS_KEYWORD [DATASETS_KEYWORD ...]]
                  [--max_examples_per_split [MAX_EXAMPLES_PER_SPLIT]]
                  [--data_dir DATA_DIR] [--download_dir DOWNLOAD_DIR]
                  [--extract_dir EXTRACT_DIR] [--manual_dir MANUAL_DIR]
                  [--add_name_to_manual_dir] [--config CONFIG]
                  [--config_idx CONFIG_IDX] [--imports IMPORTS]
                  [--register_checksums] [--force_checksums_validation]
                  [--beam_pipeline_options BEAM_PIPELINE_OPTIONS]
                  [--file_format FILE_FORMAT]
                  [--exclude_datasets EXCLUDE_DATASETS]
                  [datasets [datasets ...]]

positional arguments:
  datasets              Name(s) of the dataset(s) to build. Default to current
                        dir. See for
                        accepted values.

optional arguments:
  -h, --help            show this help message and exit
  --helpfull            show full help message and exit
  --datasets DATASETS_KEYWORD [DATASETS_KEYWORD ...]
                        Datasets can also be provided as keyword argument.

Debug & tests:
  --pdb Enter post-mortem debugging mode if an exception is raised.

  --overwrite           Delete pre-existing dataset if it exists.
  --max_examples_per_split [MAX_EXAMPLES_PER_SPLIT]
                        When set, only generate the first X examples (default
                        to 1), rather than the full dataset.If set to 0, only
                        execute the `_split_generators` (which download the
                        original data), but skip `_generator_examples`

  --data_dir DATA_DIR   Where to place datasets. Default to
                        `~/tensorflow_datasets/` or `TFDS_DATA_DIR`
                        environement variable.
  --download_dir DOWNLOAD_DIR
                        Where to place downloads. Default to
  --extract_dir EXTRACT_DIR
                        Where to extract files. Default to
  --manual_dir MANUAL_DIR
                        Where to manually download data (required for some
                        datasets). Default to `<download_dir>/manual/`.
  --add_name_to_manual_dir
                        If true, append the dataset name to the `manual_dir`
                        (e.g. `<download_dir>/manual/<dataset_name>/`. Useful
                        to avoid collisions if many datasets are generated.

  --config CONFIG, -c CONFIG
                        Config name to build. Build all configs if not set.
  --config_idx CONFIG_IDX
                        Config id to build
                        (`builder_cls.BUILDER_CONFIGS[config_idx]`). Mutually
                        exclusive with `--config`.
  --imports IMPORTS, -i IMPORTS
                        Comma separated list of module to import to register
  --register_checksums  If True, store size and checksum of downloaded files.
  --force_checksums_validation
                        If True, raise an error if the checksums are not
  --beam_pipeline_options BEAM_PIPELINE_OPTIONS
                        A (comma-separated) list of flags to pass to
                        `PipelineOptions` when preparing with Apache Beam.
                        Example: `--beam_pipeline_options=job_name=my-
  --file_format FILE_FORMAT
                        File format to which generate the tf-examples.
                        Available values: ['tfrecord', 'riegeli'] (see

  Used by automated scripts.

  --exclude_datasets EXCLUDE_DATASETS
                        If set, generate all datasets except the one defined
                        here. Comma separated list of datasets to exclude.
  --experimental_latest_version
                        Build the latest Version(experiments=...) available
                        rather than default version.
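
The path options in the help output can be summarized in a short sketch. It covers only the defaults stated there: `--data_dir` falls back to `TFDS_DATA_DIR` or `~/tensorflow_datasets/`, and `--manual_dir` falls back to `<download_dir>/manual/`, optionally with the dataset name appended. Both function names are hypothetical, and checking the environment variable before the home-directory default is an assumption about precedence, not confirmed TFDS behavior.

```python
import os
from pathlib import Path


def resolve_data_dir(data_dir=None, environ=None):
    """Pick the dataset directory (hypothetical helper).

    Assumed precedence: explicit --data_dir, then the TFDS_DATA_DIR
    environment variable, then ~/tensorflow_datasets/.
    """
    environ = os.environ if environ is None else environ
    if data_dir:
        return Path(data_dir).expanduser()
    env_value = environ.get("TFDS_DATA_DIR")
    if env_value:
        return Path(env_value).expanduser()
    return Path("~/tensorflow_datasets").expanduser()


def resolve_manual_dir(download_dir, dataset_name, manual_dir=None,
                       add_name_to_manual_dir=False):
    """Pick the manual-download directory (hypothetical helper).

    Defaults to <download_dir>/manual/; with add_name_to_manual_dir the
    dataset name is appended to avoid collisions between datasets.
    """
    base = Path(manual_dir) if manual_dir else Path(download_dir) / "manual"
    return base / dataset_name if add_name_to_manual_dir else base


print(resolve_data_dir())
print(resolve_manual_dir("downloads", "my_dataset", add_name_to_manual_dir=True))
```

The download directory is taken as a required argument because its own default is not spelled out in the help text above.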