
stl10

Description:

The STL-10 dataset is an image recognition dataset for developing unsupervised feature learning, deep learning, and self-taught learning algorithms. It is inspired by the CIFAR-10 dataset, with some modifications. In particular, each class has fewer labeled training examples than in CIFAR-10, but a very large set of unlabeled examples is provided for learning image models prior to supervised training. The primary challenge is to use the unlabeled data, which comes from a similar but not identical distribution to the labeled data, to build a useful prior. All images were acquired from labeled examples on ImageNet.

Additional Documentation: Explore on Papers With Code
Homepage: http://ai.stanford.edu/~acoates/stl10/
Source code: tfds.datasets.stl10.Builder
Versions:
- 1.0.0 (default): No release notes.
Download size: 2.46 GiB
Dataset size: 1.86 GiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'test'`	8,000
`'train'`	5,000
`'unlabelled'`	100,000
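
With only 5,000 labeled training examples, one option is to hold part of `'train'` out for validation. The sketch below shows one way to do that with the TFDS split-slicing syntax. The helper names `train_val_split_specs` and `load_stl10_train_val` and the 20% default are illustrative assumptions, not part of the dataset builder.

```python
# Example counts copied from the split table above.
SPLIT_SIZES = {"test": 8_000, "train": 5_000, "unlabelled": 100_000}


def train_val_split_specs(val_percent=20):
    """Return TFDS split strings that carve a validation slice out of 'train'."""
    if not 0 < val_percent < 100:
        raise ValueError("val_percent must be strictly between 0 and 100")
    cut = 100 - val_percent
    return f"train[:{cut}%]", f"train[{cut}%:]"


def load_stl10_train_val(val_percent=20):
    """Load the two slices as tf.data.Dataset objects.

    The first call downloads the dataset (about 2.46 GiB).
    """
    # Imported lazily so the helpers above work without TFDS installed.
    import tensorflow_datasets as tfds

    train_spec, val_spec = train_val_split_specs(val_percent)
    # Passing a list of split strings returns a list of datasets.
    return tfds.load("stl10", split=[train_spec, val_spec])
```

Calling `train_ds, val_ds = load_stl10_train_val()` would then give two datasets drawn from the labeled training split, leaving `'test'` and `'unlabelled'` untouched.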

Feature structure:

FeaturesDict({
    'image': Image(shape=(96, 96, 3), dtype=uint8),
    'label': ClassLabel(shape=(), dtype=int64, num_classes=10),
})
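
To make the feature structure concrete, the following sketch builds a synthetic example with the documented shape and dtypes and converts it into a typical model input. It uses NumPy only and a random stand-in image rather than real STL-10 data. The function name `to_model_input`, the [0, 1] scaling, and the one-hot encoding are assumptions chosen for illustration, and the conversion is meant for examples from the labeled splits.

```python
import numpy as np

IMAGE_SHAPE = (96, 96, 3)  # from the feature structure above
NUM_CLASSES = 10           # from ClassLabel(num_classes=10)


def to_model_input(example):
    """Scale a uint8 image to float32 in [0, 1] and one-hot encode its label."""
    image, label = example["image"], int(example["label"])
    if image.shape != IMAGE_SHAPE or image.dtype != np.uint8:
        raise ValueError("image must be uint8 with shape (96, 96, 3)")
    if not 0 <= label < NUM_CLASSES:
        raise ValueError("label must be in the range [0, 10)")
    scaled = image.astype(np.float32) / 255.0
    one_hot = np.zeros(NUM_CLASSES, dtype=np.float32)
    one_hot[label] = 1.0
    return scaled, one_hot


# Synthetic stand-in with the documented structure (not real STL-10 data).
rng = np.random.default_rng(0)
fake = {
    "image": rng.integers(0, 256, size=IMAGE_SHAPE, dtype=np.uint8),
    "label": np.int64(3),
}
image, target = to_model_input(fake)
print(image.shape, image.dtype, target)
```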

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
image	Image	(96, 96, 3)	uint8
label	ClassLabel		int64

Supervised keys (See as_supervised doc): ('image', 'label')
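
The supervised keys determine what `as_supervised=True` yields: a 2-tuple of `(image, label)` instead of a feature dictionary. The sketch below mimics that mapping on a plain dict and shows the corresponding `tfds.load` call. The helper names `as_supervised_tuple` and `load_supervised` are hypothetical, introduced only for this illustration.

```python
SUPERVISED_KEYS = ("image", "label")  # from the supervised keys listed above


def as_supervised_tuple(example, keys=SUPERVISED_KEYS):
    """Mimic as_supervised=True: map a feature dict to an (input, target) pair."""
    input_key, target_key = keys
    return example[input_key], example[target_key]


def load_supervised(split="train"):
    """Load a split as (image, label) tuples; downloads the data on first use."""
    # Imported lazily so the helper above works without TFDS installed.
    import tensorflow_datasets as tfds

    return tfds.load("stl10", split=split, as_supervised=True)
```

With the tuple form, each element can be unpacked directly, for example `for image, label in load_supervised("test").take(1): ...`.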

Citation:

@inproceedings{coates2011stl10,
  title={{An Analysis of Single Layer Networks in Unsupervised Feature Learning}},
  author={Coates, Adam and Ng, Andrew and Lee, Honglak},
  booktitle={AISTATS},
  year={2011},
  note = {\url{https://cs.stanford.edu/~acoates/papers/coatesleeng_aistats_2011.pdf}},
}