Datasets

Usage

import tensorflow as tf
import tensorflow_datasets as tfds

# See all registered datasets
tfds.list_builders()

# Load a given dataset by name, along with the DatasetInfo
data, info = tfds.load("mnist", with_info=True)
train_data, test_data = data['train'], data['test']
assert isinstance(train_data, tf.data.Dataset)
assert info.features['label'].num_classes == 10
assert info.splits['train'].num_examples == 60000

# You can also access a builder directly
builder = tfds.builder("mnist")
assert builder.info.splits['train'].num_examples == 60000
builder.download_and_prepare()
datasets = builder.as_dataset()

# If you need NumPy arrays
np_datasets = tfds.as_numpy(datasets)
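
The objects returned by tfds.load are plain tf.data.Dataset instances, so the usual shuffle/batch/prefetch transformations apply. Below is a minimal input-pipeline sketch on the MNIST split loaded above; the batch size and shuffle buffer are arbitrary choices, not values prescribed by the library.

# Minimal input-pipeline sketch (batch size and buffer size are arbitrary)
train_data = train_data.shuffle(buffer_size=1024).batch(32).prefetch(1)

for batch in tfds.as_numpy(train_data.take(1)):
    # Each element is a dict of NumPy arrays keyed by feature name
    print(batch['image'].shape, batch['label'].shape)  # e.g. (32, 28, 28, 1) (32,)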

Datasets


audio

"nsynth"

The NSynth Dataset is an audio dataset containing ~300k musical notes, each with a unique pitch, timbre, and envelope. Each note is annotated with three additional pieces of information based on a combination of human evaluation and heuristic algorithms:

  • Source: the method of sound production for the note's instrument.

  • Family: the high-level family of which the note's instrument is a member.

  • Qualities: sonic qualities of the note.

The dataset is split into train, valid, and test sets, with no instruments overlapping between the train set and the valid/test sets.

Features

FeaturesDict({
    'audio': Tensor(shape=(64000,), dtype=tf.float32),
    'id': Tensor(shape=(), dtype=tf.string),
    'instrument': FeaturesDict({
        'family': ClassLabel(shape=(), dtype=tf.int64, num_classes=11),
        'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=1006),
        'source': ClassLabel(shape=(), dtype=tf.int64, num_classes=3),
    }),
    'pitch': ClassLabel(shape=(), dtype=tf.int64, num_classes=128),
    'qualities': FeaturesDict({
        'bright': Tensor(shape=(), dtype=tf.bool),
        'dark': Tensor(shape=(), dtype=tf.bool),
        'distortion': Tensor(shape=(), dtype=tf.bool),
        'fast_decay': Tensor(shape=(), dtype=tf.bool),
        'long_release': Tensor(shape=(), dtype=tf.bool),
        'multiphonic': Tensor(shape=(), dtype=tf.bool),
        'nonlinear_env': Tensor(shape=(), dtype=tf.bool),
        'percussive': Tensor(shape=(), dtype=tf.bool),
        'reverb': Tensor(shape=(), dtype=tf.bool),
        'tempo-synced': Tensor(shape=(), dtype=tf.bool),
    }),
    'velocity': ClassLabel(shape=(), dtype=tf.int64, num_classes=128),
})

Statistics

Split Examples
ALL 305,979
TRAIN 289,205
VALID 12,678
TEST 4,096

Urls

Supervised keys (for as_supervised=True)

(u'', u'')
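
Since nsynth defines no supervised keys, examples come back as nested dictionaries that mirror the FeaturesDict above. A short sketch of reading the nested fields, assuming the (large) dataset has already been downloaded and prepared:

import tensorflow_datasets as tfds

ds = tfds.load("nsynth", split="train")
for ex in tfds.as_numpy(ds.take(1)):
    print(ex['audio'].shape)                        # (64000,) float32 waveform
    print(ex['pitch'], ex['instrument']['family'])  # integer class ids
    print(ex['qualities']['bright'])                # boolean quality flag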

Citation

@InProceedings{pmlr-v70-engel17a,
  title =    {Neural Audio Synthesis of Musical Notes with {W}ave{N}et Autoencoders},
  author =   {Jesse Engel and Cinjon Resnick and Adam Roberts and Sander Dieleman and Mohammad Norouzi and Douglas Eck and Karen Simonyan},
  booktitle =    {Proceedings of the 34th International Conference on Machine Learning},
  pages =    {1068--1077},
  year =     {2017},
  editor =   {Doina Precup and Yee Whye Teh},
  volume =   {70},
  series =   {Proceedings of Machine Learning Research},
  address =      {International Convention Centre, Sydney, Australia},
  month =    {06--11 Aug},
  publisher =    {PMLR},
  pdf =      {http://proceedings.mlr.press/v70/engel17a/engel17a.pdf},
  url =      {http://proceedings.mlr.press/v70/engel17a.html},
}


image

"cats_vs_dogs"

A large set of images of cats and dogs. There are 1,738 corrupted images that are dropped.

Features

FeaturesDict({
    'image': Image(shape=(None, None, 3), dtype=tf.uint8),
    'image/filename': Text(shape=(), dtype=tf.string, encoder=None),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=2),
})

Statistics

Split Examples
TRAIN 23,262
ALL 23,262

Urls

Supervised keys (for as_supervised=True)

(u'image', u'label')
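
Because cats_vs_dogs defines supervised keys, passing as_supervised=True yields (image, label) tuples instead of feature dictionaries. A short sketch:

import tensorflow_datasets as tfds

# With as_supervised=True each element is an (image, label) tuple
ds = tfds.load("cats_vs_dogs", split="train", as_supervised=True)
for image, label in tfds.as_numpy(ds.take(3)):
    print(image.shape, label)  # e.g. (375, 500, 3) 1; image sizes vary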

Citation

@Inproceedings (Conference){asirra-a-captcha-that-exploits-interest-aligned-manual-image-categorization,
author = {Elson, Jeremy and Douceur, John (JD) and Howell, Jon and Saul, Jared},
title = {Asirra: A CAPTCHA that Exploits Interest-Aligned Manual Image Categorization},
booktitle = {Proceedings of 14th ACM Conference on Computer and Communications Security (CCS)},
year = {2007},
month = {October},
publisher = {Association for Computing Machinery, Inc.},
url = {https://www.microsoft.com/en-us/research/publication/asirra-a-captcha-that-exploits-interest-aligned-manual-image-categorization/},
edition = {Proceedings of 14th ACM Conference on Computer and Communications Security (CCS)},
}


"celeb_a"

CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter. CelebA has large diversities, large quantities, and rich annotations, including:

  • 10,177 identities,

  • 202,599 face images, and

  • 5 landmark locations and 40 binary attribute annotations per image.

The dataset can be employed as the training and test sets for the following computer vision tasks: face attribute recognition, face detection, and landmark (or facial part) localization.

Features

FeaturesDict({
    'attributes': FeaturesDict({
        '5_o_Clock_Shadow': Tensor(shape=(), dtype=tf.bool),
        'Arched_Eyebrows': Tensor(shape=(), dtype=tf.bool),
        'Attractive': Tensor(shape=(), dtype=tf.bool),
        'Bags_Under_Eyes': Tensor(shape=(), dtype=tf.bool),
        'Bald': Tensor(shape=(), dtype=tf.bool),
        'Bangs': Tensor(shape=(), dtype=tf.bool),
        'Big_Lips': Tensor(shape=(), dtype=tf.bool),
        'Big_Nose': Tensor(shape=(), dtype=tf.bool),
        'Black_Hair': Tensor(shape=(), dtype=tf.bool),
        'Blond_Hair': Tensor(shape=(), dtype=tf.bool),
        'Blurry': Tensor(shape=(), dtype=tf.bool),
        'Brown_Hair': Tensor(shape=(), dtype=tf.bool),
        'Bushy_Eyebrows': Tensor(shape=(), dtype=tf.bool),
        'Chubby': Tensor(shape=(), dtype=tf.bool),
        'Double_Chin': Tensor(shape=(), dtype=tf.bool),
        'Eyeglasses': Tensor(shape=(), dtype=tf.bool),
        'Goatee': Tensor(shape=(), dtype=tf.bool),
        'Gray_Hair': Tensor(shape=(), dtype=tf.bool),
        'Heavy_Makeup': Tensor(shape=(), dtype=tf.bool),
        'High_Cheekbones': Tensor(shape=(), dtype=tf.bool),
        'Male': Tensor(shape=(), dtype=tf.bool),
        'Mouth_Slightly_Open': Tensor(shape=(), dtype=tf.bool),
        'Mustache': Tensor(shape=(), dtype=tf.bool),
        'Narrow_Eyes': Tensor(shape=(), dtype=tf.bool),
        'No_Beard': Tensor(shape=(), dtype=tf.bool),
        'Oval_Face': Tensor(shape=(), dtype=tf.bool),
        'Pale_Skin': Tensor(shape=(), dtype=tf.bool),
        'Pointy_Nose': Tensor(shape=(), dtype=tf.bool),
        'Receding_Hairline': Tensor(shape=(), dtype=tf.bool),
        'Rosy_Cheeks': Tensor(shape=(), dtype=tf.bool),
        'Sideburns': Tensor(shape=(), dtype=tf.bool),
        'Smiling': Tensor(shape=(), dtype=tf.bool),
        'Straight_Hair': Tensor(shape=(), dtype=tf.bool),
        'Wavy_Hair': Tensor(shape=(), dtype=tf.bool),
        'Wearing_Earrings': Tensor(shape=(), dtype=tf.bool),
        'Wearing_Hat': Tensor(shape=(), dtype=tf.bool),
        'Wearing_Lipstick': Tensor(shape=(), dtype=tf.bool),
        'Wearing_Necklace': Tensor(shape=(), dtype=tf.bool),
        'Wearing_Necktie': Tensor(shape=(), dtype=tf.bool),
        'Young': Tensor(shape=(), dtype=tf.bool),
    }),
    'image': Image(shape=(218, 178, 3), dtype=tf.uint8),
    'landmarks': FeaturesDict({
        'lefteye_x': Tensor(shape=(), dtype=tf.int64),
        'lefteye_y': Tensor(shape=(), dtype=tf.int64),
        'leftmouth_x': Tensor(shape=(), dtype=tf.int64),
        'leftmouth_y': Tensor(shape=(), dtype=tf.int64),
        'nose_x': Tensor(shape=(), dtype=tf.int64),
        'nose_y': Tensor(shape=(), dtype=tf.int64),
        'righteye_x': Tensor(shape=(), dtype=tf.int64),
        'righteye_y': Tensor(shape=(), dtype=tf.int64),
        'rightmouth_x': Tensor(shape=(), dtype=tf.int64),
        'rightmouth_y': Tensor(shape=(), dtype=tf.int64),
    }),
})

Statistics

Split Examples
ALL 202,599
TRAIN 162,770
TEST 19,962
VALIDATION 19,867

Urls

Supervised keys (for as_supervised=True)

(u'', u'')
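
celeb_a has no supervised keys, but the boolean attributes make it easy to build filtered subsets with tf.data. A sketch that keeps only images annotated as 'Smiling' (the attribute name comes from the FeaturesDict above):

import tensorflow_datasets as tfds

ds = tfds.load("celeb_a", split="train")

# Keep only examples whose 'Smiling' attribute is True
smiling = ds.filter(lambda ex: ex['attributes']['Smiling'])
for ex in tfds.as_numpy(smiling.take(1)):
    print(ex['image'].shape)  # (218, 178, 3)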

Citation

@inproceedings{conf/iccv/LiuLWT15,
  added-at = {2018-10-09T00:00:00.000+0200},
  author = {Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou},
  biburl = {https://www.bibsonomy.org/bibtex/250e4959be61db325d2f02c1d8cd7bfbb/dblp},
  booktitle = {ICCV},
  crossref = {conf/iccv/2015},
  ee = {http://doi.ieeecomputersociety.org/10.1109/ICCV.2015.425},
  interhash = {3f735aaa11957e73914bbe2ca9d5e702},
  intrahash = {50e4959be61db325d2f02c1d8cd7bfbb},
  isbn = {978-1-4673-8391-2},
  keywords = {dblp},
  pages = {3730-3738},
  publisher = {IEEE Computer Society},
  timestamp = {2018-10-11T11:43:28.000+0200},
  title = {Deep Learning Face Attributes in the Wild.},
  url = {http://dblp.uni-trier.de/db/conf/iccv/iccv2015.html#LiuLWT15},
  year = 2015
}


"celeb_a_hq"

High-quality version of the CelebA dataset, consisting of 30,000 images at 1024 x 1024 resolution.

WARNING: this dataset currently requires you to prepare images on your own.

celeb_a_hq is configured with tfds.image.celebahq.CelebaHQConfig and has the following configurations predefined (defaults to the first one):

  • "1024" (v0.1.0): CelebaHQ images in 1024 x 1024 resolution

  • "512" (v0.1.0): CelebaHQ images in 512 x 512 resolution

  • "256" (v0.1.0): CelebaHQ images in 256 x 256 resolution

  • "128" (v0.1.0): CelebaHQ images in 128 x 128 resolution

  • "64" (v0.1.0): CelebaHQ images in 64 x 64 resolution

  • "32" (v0.1.0): CelebaHQ images in 32 x 32 resolution

  • "16" (v0.1.0): CelebaHQ images in 16 x 16 resolution

  • "8" (v0.1.0): CelebaHQ images in 8 x 8 resolution

  • "4" (v0.1.0): CelebaHQ images in 4 x 4 resolution

  • "2" (v0.1.0): CelebaHQ images in 2 x 2 resolution

  • "1" (v0.1.0): CelebaHQ images in 1 x 1 resolution

"celeb_a_hq/1024"

FeaturesDict({
    'image': Image(shape=(1024, 1024, 3), dtype=tf.uint8),
    'image/filename': Text(shape=(), dtype=tf.string, encoder=None),
})

"celeb_a_hq/512"

FeaturesDict({
    'image': Image(shape=(512, 512, 3), dtype=tf.uint8),
    'image/filename': Text(shape=(), dtype=tf.string, encoder=None),
})

"celeb_a_hq/256"

FeaturesDict({
    'image': Image(shape=(256, 256, 3), dtype=tf.uint8),
    'image/filename': Text(shape=(), dtype=tf.string, encoder=None),
})

"celeb_a_hq/128"

FeaturesDict({
    'image': Image(shape=(128, 128, 3), dtype=tf.uint8),
    'image/filename': Text(shape=(), dtype=tf.string, encoder=None),
})

"celeb_a_hq/64"

FeaturesDict({
    'image': Image(shape=(64, 64, 3), dtype=tf.uint8),
    'image/filename': Text(shape=(), dtype=tf.string, encoder=None),
})

"celeb_a_hq/32"

FeaturesDict({
    'image': Image(shape=(32, 32, 3), dtype=tf.uint8),
    'image/filename': Text(shape=(), dtype=tf.string, encoder=None),
})

"celeb_a_hq/16"

FeaturesDict({
    'image': Image(shape=(16, 16, 3), dtype=tf.uint8),
    'image/filename': Text(shape=(), dtype=tf.string, encoder=None),
})

"celeb_a_hq/8"

FeaturesDict({
    'image': Image(shape=(8, 8, 3), dtype=tf.uint8),
    'image/filename': Text(shape=(), dtype=tf.string, encoder=None),
})

"celeb_a_hq/4"

FeaturesDict({
    'image': Image(shape=(4, 4, 3), dtype=tf.uint8),
    'image/filename': Text(shape=(), dtype=tf.string, encoder=None),
})

"celeb_a_hq/2"

FeaturesDict({
    'image': Image(shape=(2, 2, 3), dtype=tf.uint8),
    'image/filename': Text(shape=(), dtype=tf.string, encoder=None),
})

"celeb_a_hq/1"

FeaturesDict({
    'image': Image(shape=(1, 1, 3), dtype=tf.uint8),
    'image/filename': Text(shape=(), dtype=tf.string, encoder=None),
})

Statistics

Split Examples
TRAIN 30,000
ALL 30,000

Urls

Supervised keys (for as_supervised=True)

(u'', u'')

Citation

@article{DBLP:journals/corr/abs-1710-10196,
  author    = {Tero Karras and
               Timo Aila and
               Samuli Laine and
               Jaakko Lehtinen},
  title     = {Progressive Growing of GANs for Improved Quality, Stability, and Variation},
  journal   = {CoRR},
  volume    = {abs/1710.10196},
  year      = {2017},
  url       = {http://arxiv.org/abs/1710.10196},
  archivePrefix = {arXiv},
  eprint    = {1710.10196},
  timestamp = {Mon, 13 Aug 2018 16:46:42 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/abs-1710-10196},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}


"cifar10"

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

Features

FeaturesDict({
    'image': Image(shape=(32, 32, 3), dtype=tf.uint8),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10),
})

Statistics

Split Examples
ALL 60,000
TRAIN 50,000
TEST 10,000

Urls

Supervised keys (for as_supervised=True)

(u'image', u'label')

Citation

@TECHREPORT{Krizhevsky09learningmultiple,
    author = {Alex Krizhevsky},
    title = {Learning multiple layers of features from tiny images},
    institution = {},
    year = {2009}
}


"cifar100"

This dataset is just like the CIFAR-10, except it has 100 classes containing 600 images each. There are 500 training images and 100 testing images per class. The 100 classes in the CIFAR-100 are grouped into 20 superclasses. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs).

Features

FeaturesDict({
    'coarse_label': ClassLabel(shape=(), dtype=tf.int64, num_classes=20),
    'image': Image(shape=(32, 32, 3), dtype=tf.uint8),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=100),
})

Statistics

Split Examples
ALL 60,000
TRAIN 50,000
TEST 10,000

Urls

Supervised keys (for as_supervised=True)

(u'image', u'label')
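
The supervised keys pair each image with the fine 100-way label. To train on the 20 superclasses instead, load the full feature dictionaries and map out coarse_label yourself, as in this sketch:

import tensorflow_datasets as tfds

ds = tfds.load("cifar100", split="train")

# Build (image, coarse_label) pairs for 20-way superclass classification
coarse_ds = ds.map(lambda ex: (ex['image'], ex['coarse_label']))
for image, coarse in tfds.as_numpy(coarse_ds.take(1)):
    print(image.shape, coarse)  # (32, 32, 3) and an int in [0, 20)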

Citation

@TECHREPORT{Krizhevsky09learningmultiple,
    author = {Alex Krizhevsky},
    title = {Learning multiple layers of features from tiny images},
    institution = {},
    year = {2009}
}


"coco2014"

COCO is a large-scale object detection, segmentation, and captioning dataset. This version contains images, bounding boxes and labels for the 2014 version.

Note:

  • Some images from the train and validation sets don't have annotations.

  • The test split doesn't have any annotations (only images).

  • COCO defines 91 classes, but the data only uses 80 of them.

Features

FeaturesDict({
    'image': Image(shape=(None, None, 3), dtype=tf.uint8),
    'image/filename': Text(shape=(), dtype=tf.string, encoder=None),
    'objects': SequenceDict({
        'bbox': BBoxFeature(shape=(4,), dtype=tf.float32),
        'is_crowd': Tensor(shape=(), dtype=tf.bool),
        'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=80),
    }),
})

Statistics

Split Examples
ALL 245,496
TRAIN 82,783
TEST2015 81,434
TEST 40,775
VALIDATION 40,504

Urls

Supervised keys (for as_supervised=True)

(u'', u'')
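
coco2014 has no supervised keys; each example carries a variable-length objects sequence, and each bbox is stored as normalized [ymin, xmin, ymax, xmax] coordinates. A sketch of reading one example:

import tensorflow_datasets as tfds

ds = tfds.load("coco2014", split="train")
for ex in tfds.as_numpy(ds.take(1)):
    # 'objects' is a sequence: every field has a leading per-object dimension
    print(ex['image'].shape)            # (height, width, 3)
    print(ex['objects']['bbox'].shape)  # (num_objects, 4), normalized coordinates
    print(ex['objects']['label'])       # one class id per object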

Citation

@article{DBLP:journals/corr/LinMBHPRDZ14,
  author    = {Tsung{-}Yi Lin and
               Michael Maire and
               Serge J. Belongie and
               Lubomir D. Bourdev and
               Ross B. Girshick and
               James Hays and
               Pietro Perona and
               Deva Ramanan and
               Piotr Doll{\'{a}}r and
               C. Lawrence Zitnick},
  title     = {Microsoft {COCO:} Common Objects in Context},
  journal   = {CoRR},
  volume    = {abs/1405.0312},
  year      = {2014},
  url       = {http://arxiv.org/abs/1405.0312},
  archivePrefix = {arXiv},
  eprint    = {1405.0312},
  timestamp = {Mon, 13 Aug 2018 16:48:13 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/LinMBHPRDZ14},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}


"diabetic_retinopathy_detection"

A large set of high-resolution retina images taken under a variety of imaging conditions.

Features

FeaturesDict({
    'image': Image(shape=(None, None, 3), dtype=tf.uint8),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=5),
    'name': Text(shape=(), dtype=tf.string, encoder=None),
})

Statistics

Split Examples
ALL 88,712
TEST 53,576
TRAIN 35,126
SAMPLE 10

Urls

Supervised keys (for as_supervised=True)

(u'', u'')

Citation

@ONLINE {kaggle-diabetic-retinopathy,
    author = "Kaggle and EyePacs",
    title  = "Kaggle Diabetic Retinopathy Detection",
    month  = "jul",
    year   = "2015",
    url    = "https://www.kaggle.com/c/diabetic-retinopathy-detection/data"
}


"fashion_mnist"

Fashion-MNIST is a dataset of Zalando's article images consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.

Features

FeaturesDict({
    'image': Image(shape=(28, 28, 1), dtype=tf.uint8),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10),
})

Statistics

Split Examples
ALL 70,000
TRAIN 60,000
TEST 10,000

Urls

Supervised keys (for as_supervised=True)

(u'image', u'label')

Citation

@article{DBLP:journals/corr/abs-1708-07747,
  author    = {Han Xiao and
               Kashif Rasul and
               Roland Vollgraf},
  title     = {Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning
               Algorithms},
  journal   = {CoRR},
  volume    = {abs/1708.07747},
  year      = {2017},
  url       = {http://arxiv.org/abs/1708.07747},
  archivePrefix = {arXiv},
  eprint    = {1708.07747},
  timestamp = {Mon, 13 Aug 2018 16:47:27 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/abs-1708-07747},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}


"image_label_folder"

Generic image classification dataset.

Features

FeaturesDict({
    'image': Image(shape=(None, None, 3), dtype=tf.uint8),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=None),
})

Statistics

None computed

Urls

Supervised keys (for as_supervised=True)

(u'image', u'label')

Citation



"imagenet2012"

ILSVRC 2012, aka ImageNet, is an image dataset organized according to the WordNet hierarchy. Each meaningful concept in WordNet, possibly described by multiple words or word phrases, is called a "synonym set" or "synset". There are more than 100,000 synsets in WordNet, the majority of them nouns (80,000+). In ImageNet, we aim to provide on average 1,000 images to illustrate each synset. Images of each concept are quality-controlled and human-annotated. In its completion, we hope ImageNet will offer tens of millions of cleanly sorted images for most of the concepts in the WordNet hierarchy.

Features

FeaturesDict({
    'file_name': Text(shape=(), dtype=tf.string, encoder=None),
    'image': Image(shape=(None, None, 3), dtype=tf.uint8),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=1000),
})

Statistics

Split Examples
ALL 1,331,167
TRAIN 1,281,167
VALIDATION 50,000

Urls

Supervised keys (for as_supervised=True)

(u'image', u'label')

Citation

@article{ILSVRC15,
Author = {Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Li Fei-Fei},
Title = { {ImageNet Large Scale Visual Recognition Challenge}},
Year = {2015},
journal   = {International Journal of Computer Vision (IJCV)},
doi = {10.1007/s11263-015-0816-y},
volume={115},
number={3},
pages={211-252}
}


"lsun"

Large-scale images showing different objects from given categories like bedroom, tower, etc.

lsun is configured with tfds.image.lsun.BuilderConfig and has the following configurations predefined (defaults to the first one):

  • "classroom" (v0.1.1): Classroom images.

  • "bedroom" (v0.1.1): Bedroom images.

"lsun/classroom"

FeaturesDict({
    'image': Image(shape=(None, None, 3), dtype=tf.uint8),
})

"lsun/bedroom"

FeaturesDict({
    'image': Image(shape=(None, None, 3), dtype=tf.uint8),
})

Statistics

Split Examples
ALL 3,033,342
TRAIN 3,033,042
VALIDATION 300

Urls

Supervised keys (for as_supervised=True)

(u'', u'')

Citation

@article{journals/corr/YuZSSX15,
  added-at = {2018-08-13T00:00:00.000+0200},
  author = {Yu, Fisher and Zhang, Yinda and Song, Shuran and Seff, Ari and Xiao, Jianxiong},
  biburl = {https://www.bibsonomy.org/bibtex/2446d4ffb99a5d7d2ab6e5417a12e195f/dblp},
  ee = {http://arxiv.org/abs/1506.03365},
  interhash = {3e9306c4ce2ead125f3b2ab0e25adc85},
  intrahash = {446d4ffb99a5d7d2ab6e5417a12e195f},
  journal = {CoRR},
  keywords = {dblp},
  timestamp = {2018-08-14T15:08:59.000+0200},
  title = {LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop.},
  url = {http://dblp.uni-trier.de/db/journals/corr/corr1506.html#YuZSSX15},
  volume = {abs/1506.03365},
  year = 2015
}


"mnist"

The MNIST database of handwritten digits.

Features

FeaturesDict({
    'image': Image(shape=(28, 28, 1), dtype=tf.uint8),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10),
})

Statistics

Split Examples
ALL 70,000
TRAIN 60,000
TEST 10,000

Urls

Supervised keys (for as_supervised=True)

(u'image', u'label')

Citation

@article{lecun2010mnist,
  title={MNIST handwritten digit database},
  author={LeCun, Yann and Cortes, Corinna and Burges, CJ},
  journal={ATT Labs [Online]. Available: http://yann. lecun. com/exdb/mnist},
  volume={2},
  year={2010}
}


"omniglot"

Omniglot data set for one-shot learning. This dataset contains 1623 different handwritten characters from 50 different alphabets.

Features

FeaturesDict({
    'alphabet': ClassLabel(shape=(), dtype=tf.int64, num_classes=50),
    'alphabet_char_id': Tensor(shape=(), dtype=tf.int64),
    'image': Image(shape=(105, 105, 3), dtype=tf.uint8),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=1623),
})

Statistics

Split Examples
ALL 38,300
TRAIN 19,280
TEST 13,180
SMALL2 3,120
SMALL1 2,720

Urls

Supervised keys (for as_supervised=True)

(u'image', u'label')

Citation

@article{lake2015human,
  title={Human-level concept learning through probabilistic program induction},
  author={Lake, Brenden M and Salakhutdinov, Ruslan and Tenenbaum, Joshua B},
  journal={Science},
  volume={350},
  number={6266},
  pages={1332--1338},
  year={2015},
  publisher={American Association for the Advancement of Science}
}


"open_images_v4"

Open Images is a dataset of ~9M images that have been annotated with image-level labels and object bounding boxes.

The training set of V4 contains 14.6M bounding boxes for 600 object classes on 1.74M images, making it the largest existing dataset with object location annotations. The boxes have been largely manually drawn by professional annotators to ensure accuracy and consistency. The images are very diverse and often contain complex scenes with several objects (8.4 per image on average). Moreover, the dataset is annotated with image-level labels spanning thousands of classes.

Features

FeaturesDict({
    'bobjects': SequenceDict({
        'bbox': BBoxFeature(shape=(4,), dtype=tf.float32),
        'is_depiction': Tensor(shape=(), dtype=tf.int8),
        'is_group_of': Tensor(shape=(), dtype=tf.int8),
        'is_inside': Tensor(shape=(), dtype=tf.int8),
        'is_occluded': Tensor(shape=(), dtype=tf.int8),
        'is_truncated': Tensor(shape=(), dtype=tf.int8),
        'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=19995),
        'source': ClassLabel(shape=(), dtype=tf.int64, num_classes=6),
    }),
    'image': Image(shape=(None, None, 3), dtype=tf.uint8),
    'image/filename': Text(shape=(), dtype=tf.string, encoder=None),
    'objects': SequenceDict({
        'confidence': Tensor(shape=(), dtype=tf.int32),
        'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=19995),
        'source': ClassLabel(shape=(), dtype=tf.int64, num_classes=6),
    }),
})

Statistics

Split Examples
ALL 1,910,098
TRAIN 1,743,042
TEST 125,436
VALIDATION 41,620

Urls

Supervised keys (for as_supervised=True)

(u'', u'')

Citation

@article{OpenImages,
  author = {Alina Kuznetsova and
            Hassan Rom and
            Neil Alldrin and
            Jasper Uijlings and
            Ivan Krasin and
            Jordi Pont-Tuset and
            Shahab Kamali and
            Stefan Popov and
            Matteo Malloci and
            Tom Duerig and
            Vittorio Ferrari},
  title = {The Open Images Dataset V4: Unified image classification,
           object detection, and visual relationship detection at scale},
  year = {2018},
  journal = {arXiv:1811.00982}
}
@article{OpenImages2,
  author = {Krasin, Ivan and
            Duerig, Tom and
            Alldrin, Neil and
            Ferrari, Vittorio
            and Abu-El-Haija, Sami and
            Kuznetsova, Alina and
            Rom, Hassan and
            Uijlings, Jasper and
            Popov, Stefan and
            Kamali, Shahab and
            Malloci, Matteo and
            Pont-Tuset, Jordi and
            Veit, Andreas and
            Belongie, Serge and
            Gomes, Victor and
            Gupta, Abhinav and
            Sun, Chen and
            Chechik, Gal and
            Cai, David and
            Feng, Zheyun and
            Narayanan, Dhyanesh and
            Murphy, Kevin},
  title = {OpenImages: A public dataset for large-scale multi-label and
           multi-class image classification.},
  journal = {Dataset available from
             https://storage.googleapis.com/openimages/web/index.html},
  year={2017}
}


"quickdraw_bitmap"

The Quick Draw Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game Quick, Draw!. The bitmap dataset contains these drawings converted from vector format into 28x28 grayscale images.

Features

FeaturesDict({
    'image': Image(shape=(28, 28, 1), dtype=tf.uint8),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=345),
})

Statistics

Split Examples
TRAIN 50,426,266
ALL 50,426,266

Urls

Supervised keys (for as_supervised=True)

(u'image', u'label')

Citation

A Neural Representation of Sketch Drawings, D. Ha and D. Eck, arXiv:1704.03477v4, 2017.

"svhn_cropped"

The Street View House Numbers (SVHN) Dataset is an image digit recognition dataset of over 600,000 digit images coming from real-world data. Images are cropped to 32x32.

Features

FeaturesDict({
    'image': Image(shape=(32, 32, 3), dtype=tf.uint8),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=10),
})

Statistics

Split Examples
ALL 630,420
EXTRA 531,131
TRAIN 73,257
TEST 26,032

Urls

Supervised keys (for as_supervised=True)

(u'image', u'label')

Citation

@article{Netzer2011,
author = {Netzer, Yuval and Wang, Tao and Coates, Adam and Bissacco, Alessandro and Wu, Bo and Ng, Andrew Y},
booktitle = {Advances in Neural Information Processing Systems ({NIPS})},
title = {Reading Digits in Natural Images with Unsupervised Feature Learning},
year = {2011}
}


"tf_flowers"

A large set of images of flowers.

Features

FeaturesDict({
    'image': Image(shape=(None, None, 3), dtype=tf.uint8),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=5),
})

Statistics

Split Examples
TRAIN 3,670
ALL 3,670

Urls

Supervised keys (for as_supervised=True)

(u'image', u'label')

Citation

@ONLINE {tfflowers,
author = "The TensorFlow Team",
title = "Flowers",
month = "jan",
year = "2019",
url = "http://download.tensorflow.org/example_images/flower_photos.tgz" }


text

"imdb_reviews"

Large Movie Review Dataset. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing.

imdb_reviews is configured with tfds.text.imdb.IMDBReviewsConfig and has the following configurations predefined (defaults to the first one):

  • "plain_text" (v0.0.1): Plain text

  • "bytes" (v0.0.1): Uses byte-level text encoding with tfds.features.text.ByteTextEncoder

  • "subwords8k" (v0.0.1): Uses tfds.features.text.SubwordTextEncoder with 8k vocab size

  • "subwords32k" (v0.0.1): Uses tfds.features.text.SubwordTextEncoder with 32k vocab size

"imdb_reviews/plain_text"

FeaturesDict({
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=2),
    'text': Text(shape=(), dtype=tf.string, encoder=None),
})

"imdb_reviews/bytes"

FeaturesDict({
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=2),
    'text': Text(shape=(None,), dtype=tf.int64, encoder=<ByteTextEncoder vocab_size=257>),
})

"imdb_reviews/subwords8k"

FeaturesDict({
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=2),
    'text': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=8185>),
})

"imdb_reviews/subwords32k"

FeaturesDict({
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=2),
    'text': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=32650>),
})

Statistics

Split Examples
ALL 50,000
TRAIN 25,000
TEST 25,000

Urls

Supervised keys (for as_supervised=True)

(u'text', u'label')
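
For the bytes and subwords configs the text feature is stored as integer ids, and the matching encoder is exposed on the DatasetInfo so ids can be turned back into strings. A sketch with the subwords8k config:

import tensorflow_datasets as tfds

ds, info = tfds.load("imdb_reviews/subwords8k", split="train", with_info=True)
encoder = info.features['text'].encoder  # SubwordTextEncoder, vocab_size=8185

for ex in tfds.as_numpy(ds.take(1)):
    ids = ex['text']                 # 1-D array of subword ids
    print(encoder.decode(ids)[:80])  # start of the decoded review text
    print(encoder.vocab_size)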

Citation

@InProceedings{maas-EtAl:2011:ACL-HLT2011,
  author    = {Maas, Andrew L.  and  Daly, Raymond E.  and  Pham, Peter T.  and  Huang, Dan  and  Ng, Andrew Y.  and  Potts, Christopher},
  title     = {Learning Word Vectors for Sentiment Analysis},
  booktitle = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
  month     = {June},
  year      = {2011},
  address   = {Portland, Oregon, USA},
  publisher = {Association for Computational Linguistics},
  pages     = {142--150},
  url       = {http://www.aclweb.org/anthology/P11-1015}
}


"lm1b"

A benchmark corpus to be used for measuring progress in statistical language modeling. This has almost one billion words in the training data.

lm1b is configured with tfds.text.lm1b.Lm1bConfig and has the following configurations predefined (defaults to the first one):

  • "plain_text" (v0.0.1): Plain text

  • "bytes" (v0.0.1): Uses byte-level text encoding with tfds.features.text.ByteTextEncoder

  • "subwords8k" (v0.0.1): Uses tfds.features.text.SubwordTextEncoder with 8k vocab size

  • "subwords32k" (v0.0.1): Uses tfds.features.text.SubwordTextEncoder with 32k vocab size

"lm1b/plain_text"

FeaturesDict({
    'text': Text(shape=(), dtype=tf.string, encoder=None),
})

"lm1b/bytes"

FeaturesDict({
    'text': Text(shape=(None,), dtype=tf.int64, encoder=<ByteTextEncoder vocab_size=257>),
})

"lm1b/subwords8k"

FeaturesDict({
    'text': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=8189>),
})

"lm1b/subwords32k"

FeaturesDict({
    'text': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=32711>),
})

Statistics

Split Examples
ALL 30,607,716
TRAIN 30,301,028
TEST 306,688

Urls

Supervised keys (for as_supervised=True)

(u'text', u'text')
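
The supervised keys simply return the same text twice, so a language-modeling pipeline typically builds (input, target) pairs itself by shifting the token ids by one position. A sketch using the subwords8k config:

import tensorflow_datasets as tfds

ds = tfds.load("lm1b/subwords8k", split="train")

# Shift the subword ids by one token to form (input, target) pairs
lm_ds = ds.map(lambda ex: (ex['text'][:-1], ex['text'][1:]))
for inp, tgt in tfds.as_numpy(lm_ds.take(1)):
    print(inp.shape, tgt.shape)  # both one token shorter than the sentence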

Citation

@article{DBLP:journals/corr/ChelbaMSGBK13,
  author    = {Ciprian Chelba and
               Tomas Mikolov and
               Mike Schuster and
               Qi Ge and
               Thorsten Brants and
               Phillipp Koehn},
  title     = {One Billion Word Benchmark for Measuring Progress in Statistical Language
               Modeling},
  journal   = {CoRR},
  volume    = {abs/1312.3005},
  year      = {2013},
  url       = {http://arxiv.org/abs/1312.3005},
  archivePrefix = {arXiv},
  eprint    = {1312.3005},
  timestamp = {Mon, 13 Aug 2018 16:46:16 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/ChelbaMSGBK13},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}


"squad"

Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage, or the question might be unanswerable.

squad is configured with tfds.text.squad.SquadConfig and has the following configurations predefined (defaults to the first one):

  • "plain_text" (v0.0.1): Plain text

  • "bytes" (v0.0.1): Uses byte-level text encoding with tfds.features.text.ByteTextEncoder

  • "subwords8k" (v0.0.1): Uses tfds.features.text.SubwordTextEncoder with 8k vocab size

  • "subwords32k" (v0.0.2): Uses tfds.features.text.SubwordTextEncoder with 32k vocab size

"squad/plain_text"

FeaturesDict({
    'context': Text(shape=(), dtype=tf.string, encoder=None),
    'first_answer': Text(shape=(), dtype=tf.string, encoder=None),
    'question': Text(shape=(), dtype=tf.string, encoder=None),
})

"squad/bytes"

FeaturesDict({
    'context': Text(shape=(None,), dtype=tf.int64, encoder=<ByteTextEncoder vocab_size=257>),
    'first_answer': Text(shape=(None,), dtype=tf.int64, encoder=<ByteTextEncoder vocab_size=257>),
    'question': Text(shape=(None,), dtype=tf.int64, encoder=<ByteTextEncoder vocab_size=257>),
})

"squad/subwords8k"

FeaturesDict({
    'context': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=8190>),
    'first_answer': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=8190>),
    'question': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=8190>),
})

"squad/subwords32k"

FeaturesDict({
    'context': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=32953>),
    'first_answer': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=32953>),
    'question': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=32953>),
})

Statistics

Split Examples
ALL 98,169
TRAIN 87,599
VALIDATION 10,570

Urls

Supervised keys (for as_supervised=True)

(u'', u'')

Citation

@article{2016arXiv160605250R,
       author = { {Rajpurkar}, Pranav and {Zhang}, Jian and {Lopyrev},
                 Konstantin and {Liang}, Percy},
        title = "{SQuAD: 100,000+ Questions for Machine Comprehension of Text}",
      journal = {arXiv e-prints},
         year = 2016,
          eid = {arXiv:1606.05250},
        pages = {arXiv:1606.05250},
archivePrefix = {arXiv},
       eprint = {1606.05250},
}


translate

"wmt_translate_ende"

Translate dataset based on the data from statmt.org.

wmt_translate_ende is configured with tfds.translate.wmt_ende.WMTConfig and has the following configurations predefined (defaults to the first one):

  • "ende_plain_text_t2t" (v0.0.1): Translation dataset from en to de, uses encoder plain_text. It uses the following data files (see the code for exact contents): {"dev": ["wmt17_newstest13"], "train": ["wmt18_news_commentary_ende", "wmt13_commoncrawl_ende", "wmt13_europarl_ende"]}.

  • "ende_subwords8k_t2t" (v0.0.1): Translation dataset from en to de, uses encoder subwords8k. It uses the following data files (see the code for exact contents): {"dev": ["wmt17_newstest13"], "train": ["wmt18_news_commentary_ende", "wmt13_commoncrawl_ende", "wmt13_europarl_ende"]}.

"wmt_translate_ende/ende_plain_text_t2t"

FeaturesDict({
    'de': Text(shape=(), dtype=tf.string, encoder=None),
    'en': Text(shape=(), dtype=tf.string, encoder=None),
})

"wmt_translate_ende/ende_subwords8k_t2t"

FeaturesDict({
    'de': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=8267>),
    'en': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=8216>),
})

Statistics

Split Examples
ALL 4,595,289
TRAIN 4,592,289
VALIDATION 3,000

Urls

Supervised keys (for as_supervised=True)

(u'en', u'de')

Citation

@InProceedings{bojar-EtAl:2018:WMT1,
  author    = {Bojar, Ond{\v{r}}ej and Federmann, Christian and Fishel, Mark
               and Graham, Yvette and Haddow, Barry and Huck, Matthias and
               Koehn, Philipp and Monz, Christof},
  title     = {Findings of the 2018 Conference on Machine Translation (WMT18)},
  booktitle = {Proceedings of the Third Conference on Machine Translation,
    Volume 2: Shared Task Papers},
  month     = {October},
  year      = {2018},
  address   = {Belgium, Brussels},
  publisher = {Association for Computational Linguistics},
  pages     = {272--307},
  url       = {http://www.aclweb.org/anthology/W18-6401}
}


"wmt_translate_enfr"

Translate dataset based on the data from statmt.org.

wmt_translate_enfr is configured with tfds.translate.wmt_enfr.WMTConfig and has the following configurations predefined (defaults to the first one):

  • "enfr_plain_text_t2t_small" (v0.0.1): Translation dataset from en to fr, uses encoder plain_text. It uses the following data files (see the code for exact contents): {"dev": ["opennmt_1M_enfr_valid"], "train": ["opennmt_1M_enfr_train"]}.

  • "enfr_subwords8k_t2t_small" (v0.0.1): Translation dataset from en to fr, uses encoder subwords8k. It uses the following data files (see the code for exact contents): {"dev": ["opennmt_1M_enfr_valid"], "train": ["opennmt_1M_enfr_train"]}.

  • "enfr_plain_text_t2t_large" (v0.0.1): Translation dataset from en to fr, uses encoder plain_text. It uses the following data files (see the code for exact contents): {"dev": ["wmt17_newstest13"], "train": ["wmt13_commoncrawl_enfr", "wmt13_europarl_enfr", "wmt14_news_commentary_enfr", "wmt13_undoc_enfr"]}.

  • "enfr_subwords8k_t2t_large" (v0.0.1): Translation dataset from en to fr, uses encoder subwords8k. It uses the following data files (see the code for exact contents): {"dev": ["wmt17_newstest13"], "train": ["wmt13_commoncrawl_enfr", "wmt13_europarl_enfr", "wmt14_news_commentary_enfr", "wmt13_undoc_enfr"]}.

"wmt_translate_enfr/enfr_plain_text_t2t_small"

FeaturesDict({
    'en': Text(shape=(), dtype=tf.string, encoder=None),
    'fr': Text(shape=(), dtype=tf.string, encoder=None),
})

"wmt_translate_enfr/enfr_subwords8k_t2t_small"

FeaturesDict({
    'en': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=8215>),
    'fr': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=8150>),
})

"wmt_translate_enfr/enfr_plain_text_t2t_large"

FeaturesDict({
    'en': Text(shape=(), dtype=tf.string, encoder=None),
    'fr': Text(shape=(), dtype=tf.string, encoder=None),
})

"wmt_translate_enfr/enfr_subwords8k_t2t_large"

FeaturesDict({
    'en': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=8222>),
    'fr': Text(shape=(None,), dtype=tf.int64, encoder=<SubwordTextEncoder vocab_size=8181>),
})

Statistics

Split Examples
ALL 18,319,500
TRAIN 18,316,500
VALIDATION 3,000

Urls

Supervised keys (for as_supervised=True)

(u'en', u'fr')

Citation

@InProceedings{bojar-EtAl:2018:WMT1,
  author    = {Bojar, Ond{\v{r}}ej and Federmann, Christian and Fishel, Mark
               and Graham, Yvette and Haddow, Barry and Huck, Matthias and
               Koehn, Philipp and Monz, Christof},
  title     = {Findings of the 2018 Conference on Machine Translation (WMT18)},
  booktitle = {Proceedings of the Third Conference on Machine Translation,
    Volume 2: Shared Task Papers},
  month     = {October},
  year      = {2018},
  address   = {Belgium, Brussels},
  publisher = {Association for Computational Linguistics},
  pages     = {272--307},
  url       = {http://www.aclweb.org/anthology/W18-6401}
}


video

"bair_robot_pushing_small"

This data set contains roughly 44,000 examples of robot pushing motions, including one training set (train) and two test sets of previously seen (testseen) and unseen (testnovel) objects. This is the small 64x64 version.

Features

SequenceDict({
    'action': Tensor(shape=(4,), dtype=tf.float32),
    'endeffector_pos': Tensor(shape=(3,), dtype=tf.float32),
    'image_aux1': Image(shape=(64, 64, 3), dtype=tf.uint8),
    'image_main': Image(shape=(64, 64, 3), dtype=tf.uint8),
})

Statistics

Split Examples
ALL 43,520
TRAIN 43,264
TEST 256

Urls

Supervised keys (for as_supervised=True)

(u'', u'')

Citation

@inproceedings{conf/nips/FinnGL16,
  added-at = {2016-12-16T00:00:00.000+0100},
  author = {Finn, Chelsea and Goodfellow, Ian J. and Levine, Sergey},
  biburl = {https://www.bibsonomy.org/bibtex/230073873b4fe43b314724b772d0f9256/dblp},
  booktitle = {NIPS},
  crossref = {conf/nips/2016},
  editor = {Lee, Daniel D. and Sugiyama, Masashi and Luxburg, Ulrike V. and Guyon, Isabelle and Garnett, Roman},
  ee = {http://papers.nips.cc/paper/6161-unsupervised-learning-for-physical-interaction-through-video-prediction},
  interhash = {2e6b416723704f4aa5ad0686ce5a3593},
  intrahash = {30073873b4fe43b314724b772d0f9256},
  keywords = {dblp},
  pages = {64-72},
  timestamp = {2016-12-17T11:33:40.000+0100},
  title = {Unsupervised Learning for Physical Interaction through Video Prediction.},
  url = {http://dblp.uni-trier.de/db/conf/nips/nips2016.html#FinnGL16},
  year = 2016
}


"moving_mnist"

Moving variant of MNIST database of handwritten digits. This is the data used by the authors for reporting model performance. See tfds.video.moving_mnist.image_as_moving_sequence for generating training/validation data from the MNIST dataset.
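
A sketch of generating training sequences from MNIST with the helper mentioned above. The exact signature is an assumption here: it is taken to accept a single MNIST image plus a sequence_length and to return an object with an image_sequence attribute.

import tensorflow_datasets as tfds

mnist = tfds.load("mnist", split="train", as_supervised=True)

def to_sequence(image, label):
    # Assumed signature: one MNIST image in, a 20-frame moving sequence out
    seq = tfds.video.moving_mnist.image_as_moving_sequence(image, sequence_length=20)
    return seq.image_sequence

train_sequences = mnist.map(to_sequence)  # each element: (20, 64, 64, 1) frames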

Features

FeaturesDict({
    'image_sequence': Video(shape=(20, 64, 64, 1), dtype=tf.uint8, feature=Image(shape=(64, 64, 1), dtype=tf.uint8)),
})

Statistics

Split Examples
TEST 10,000
ALL 10,000

Urls

Supervised keys (for as_supervised=True)

(u'', u'')

Citation

@article{DBLP:journals/corr/SrivastavaMS15,
  author    = {Nitish Srivastava and
               Elman Mansimov and
               Ruslan Salakhutdinov},
  title     = {Unsupervised Learning of Video Representations using LSTMs},
  journal   = {CoRR},
  volume    = {abs/1502.04681},
  year      = {2015},
  url       = {http://arxiv.org/abs/1502.04681},
  archivePrefix = {arXiv},
  eprint    = {1502.04681},
  timestamp = {Mon, 13 Aug 2018 16:47:05 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/SrivastavaMS15},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}


"starcraft_video"

This data set contains videos generated from Starcraft.

starcraft_video is configured with tfds.video.starcraft.StarcraftVideoConfig and has the following configurations predefined (defaults to the first one):

  • "brawl_64" (v0.1.2): Brawl map with 64x64 resolution.

  • "brawl_128" (v0.1.2): Brawl map with 128x128 resolution.

  • "collect_mineral_shards_64" (v0.1.2): CollectMineralShards map with 64x64 resolution.

  • "collect_mineral_shards_128" (v0.1.2): CollectMineralShards map with 128x128 resolution.

  • "move_unit_to_border_64" (v0.1.2): MoveUnitToBorder map with 64x64 resolution.

  • "move_unit_to_border_128" (v0.1.2): MoveUnitToBorder map with 128x128 resolution.

  • "road_trip_with_medivac_64" (v0.1.2): RoadTripWithMedivac map with 64x64 resolution.

  • "road_trip_with_medivac_128" (v0.1.2): RoadTripWithMedivac map with 128x128 resolution.

"starcraft_video/brawl_64"

FeaturesDict({
    'rgb_screen': Video(shape=(None, 64, 64, 3), dtype=tf.uint8, feature=Image(shape=(64, 64, 3), dtype=tf.uint8)),
})

"starcraft_video/brawl_128"

FeaturesDict({
    'rgb_screen': Video(shape=(None, 128, 128, 3), dtype=tf.uint8, feature=Image(shape=(128, 128, 3), dtype=tf.uint8)),
})

"starcraft_video/collect_mineral_shards_64"

FeaturesDict({
    'rgb_screen': Video(shape=(None, 64, 64, 3), dtype=tf.uint8, feature=Image(shape=(64, 64, 3), dtype=tf.uint8)),
})

"starcraft_video/collect_mineral_shards_128"

FeaturesDict({
    'rgb_screen': Video(shape=(None, 128, 128, 3), dtype=tf.uint8, feature=Image(shape=(128, 128, 3), dtype=tf.uint8)),
})

"starcraft_video/move_unit_to_border_64"

FeaturesDict({
    'rgb_screen': Video(shape=(None, 64, 64, 3), dtype=tf.uint8, feature=Image(shape=(64, 64, 3), dtype=tf.uint8)),
})

"starcraft_video/move_unit_to_border_128"

FeaturesDict({
    'rgb_screen': Video(shape=(None, 128, 128, 3), dtype=tf.uint8, feature=Image(shape=(128, 128, 3), dtype=tf.uint8)),
})

"starcraft_video/road_trip_with_medivac_64"

FeaturesDict({
    'rgb_screen': Video(shape=(None, 64, 64, 3), dtype=tf.uint8, feature=Image(shape=(64, 64, 3), dtype=tf.uint8)),
})

"starcraft_video/road_trip_with_medivac_128"

FeaturesDict({
    'rgb_screen': Video(shape=(None, 128, 128, 3), dtype=tf.uint8, feature=Image(shape=(128, 128, 3), dtype=tf.uint8)),
})

Statistics

Split Examples
ALL 14,000
TRAIN 10,000
VALIDATION 2,000
TEST 2,000

Urls

Supervised keys (for as_supervised=True)

(u'', u'')

Citation

@article{DBLP:journals/corr/abs-1812-01717,
  author    = {Thomas Unterthiner and
               Sjoerd van Steenkiste and
               Karol Kurach and
               Rapha{"{e}}l Marinier and
               Marcin Michalski and
               Sylvain Gelly},
  title     = {Towards Accurate Generative Models of Video: {A} New Metric and
               Challenges},
  journal   = {CoRR},
  volume    = {abs/1812.01717},
  year      = {2018},
  url       = {http://arxiv.org/abs/1812.01717},
  archivePrefix = {arXiv},
  eprint    = {1812.01717},
  timestamp = {Tue, 01 Jan 2019 15:01:25 +0100},
  biburl    = {https://dblp.org/rec/bib/journals/corr/abs-1812-01717},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}