imagenet2012_subset

  • Description:

Imagenet2012Subset is a subset of the original ImageNet ILSVRC 2012 dataset. It shares the same validation set as the original ImageNet ILSVRC 2012 dataset, but its training set is subsampled in a label-balanced fashion. In the 1pct configuration, 1% of the training images (12,811) are sampled: most classes have the same number of images (12.8 on average), and some classes randomly have one more example than others. In the 10pct configuration, ~10% of the training images (128,116) are sampled: most classes have the same number of images (128 on average), and some classes randomly have one more example than others.

It is intended as a benchmark for semi-supervised learning and was originally used in the SimCLR paper (https://arxiv.org/abs/2002.05709).
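
The per-class counts implied by the description can be checked with a little arithmetic. The sketch below assumes that every class holds either the floor or the ceiling of the average count, which the description suggests but does not state outright. The full training-set size of 1,281,167 images is also an assumption (the commonly cited ILSVRC 2012 figure, not taken from this page), and `per_class_breakdown` is a hypothetical helper name.

```python
NUM_CLASSES = 1000            # num_classes of the 'label' feature
FULL_TRAIN_SIZE = 1_281_167   # assumption: commonly cited ILSVRC 2012 train size, not from this page
SUBSET_SIZES = {"1pct": 12_811, "10pct": 128_116}  # 'train' examples per config


def per_class_breakdown(total_images, num_classes=NUM_CLASSES):
    """Spread total_images over num_classes so that counts differ by at most one.

    Returns (base_count, classes_with_base_count, classes_with_one_more).
    """
    base, extra = divmod(total_images, num_classes)
    return base, num_classes - extra, extra


for config, total in SUBSET_SIZES.items():
    base, n_base, n_plus = per_class_breakdown(total)
    share = total / FULL_TRAIN_SIZE
    print(f"{config}: {share:.2%} of train; "
          f"{n_base} classes with {base} images, {n_plus} classes with {base + 1}")
```

Under these assumptions the two per-class counts in each config differ by exactly one image, which matches the "one more example than others" wording.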

  • Features:

FeaturesDict({
    'file_name': Text(shape=(), dtype=tf.string),
    'image': Image(shape=(None, None, 3), dtype=tf.uint8),
    'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=1000),
})

  • Citation:

@article{chen2020simple,
  title={A Simple Framework for Contrastive Learning of Visual Representations},
  author={Chen, Ting and Kornblith, Simon and Norouzi, Mohammad and Hinton, Geoffrey},
  journal={arXiv preprint arXiv:2002.05709},
  year={2020}
}
@article{ILSVRC15,
  title={{ImageNet Large Scale Visual Recognition Challenge}},
  author={Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Li Fei-Fei},
  journal={International Journal of Computer Vision (IJCV)},
  year={2015},
  doi={10.1007/s11263-015-0816-y},
  volume={115},
  number={3},
  pages={211-252}
}

imagenet2012_subset/1pct (default config)

  • Config description: 1% of the total ImageNet training set.
  • Download size: 254.22 KiB
  • Dataset size: 7.61 GiB
  • Splits:
Split          Examples
'train'        12,811
'validation'   50,000

imagenet2012_subset/10pct

  • Config description: 10% of the total ImageNet training set.
  • Download size: 2.48 MiB
  • Dataset size: 19.91 GiB
  • Splits:
Split          Examples
'train'        128,116
'validation'   50,000
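
The configs and splits listed above can be loaded with `tfds.load`, as in the minimal sketch below. It assumes `tensorflow_datasets` is installed and that the underlying ImageNet data has already been prepared for TFDS, which this page does not cover. The helpers `dataset_name`, `load_subset`, and `show_first_example` are hypothetical names introduced for illustration only.

```python
DATASET = "imagenet2012_subset"
CONFIGS = ("1pct", "10pct")  # the two configs listed on this page


def dataset_name(config="1pct"):
    """Build the 'dataset/config' string that tfds.load accepts."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    return f"{DATASET}/{config}"


def load_subset(config="1pct", split="train"):
    """Load one split ('train' or 'validation') as a tf.data.Dataset of feature dicts."""
    # Imported lazily so the name helper above works without TFDS installed.
    import tensorflow_datasets as tfds
    return tfds.load(dataset_name(config), split=split)


def show_first_example(config="1pct", split="train"):
    """Print the three features of the first example in the chosen split."""
    ds = load_subset(config, split)
    for example in ds.take(1):
        print("file_name:", example["file_name"].numpy())
        print("image shape:", example["image"].shape)  # (height, width, 3), uint8
        print("label:", example["label"].numpy())      # integer class id, 0..999
```

Calling `show_first_example("10pct", "validation")` would inspect the 10pct config instead. Because the image shape is `(None, None, 3)`, images vary in size and would typically need resizing before batching.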