emnist

Description:

The EMNIST dataset is a set of handwritten characters and digits derived from NIST Special Database 19. The images are converted to a 28x28 pixel format and a dataset structure that directly matches the MNIST dataset.

Additional Documentation: Explore on Papers With Code
Homepage: https://www.nist.gov/itl/products-and-services/emnist-dataset
Source code: tfds.image_classification.EMNIST
Versions:
- 3.0.0: New split API (https://tensorflow.org/datasets/splits)
- 3.1.0 (default): Updated broken download URL
Download size: 535.73 MiB
Supervised keys (See as_supervised doc): ('image', 'label')
Citation:

@article{cohen_afshar_tapson_schaik_2017,
    title={EMNIST: Extending MNIST to handwritten letters},
    DOI={10.1109/ijcnn.2017.7966217},
    journal={2017 International Joint Conference on Neural Networks (IJCNN)},
    author={Cohen, Gregory and Afshar, Saeed and Tapson, Jonathan and Schaik, Andre Van},
    year={2017}
}
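A minimal loading sketch, assuming the `tensorflow_datasets` package is installed. It relies on the supervised keys listed above, so each element comes back as an `(image, label)` pair. The helper names `emnist_name` and `load_emnist` are illustrative and not part of TFDS; the config names are the ones documented on this page.

```python
# Config names as documented on this page.
EMNIST_CONFIGS = ("byclass", "bymerge", "balanced", "letters", "digits", "mnist")


def emnist_name(config="byclass"):
    """Build the TFDS dataset name for a config, e.g. 'emnist/letters'."""
    if config not in EMNIST_CONFIGS:
        raise ValueError(f"unknown EMNIST config: {config!r}")
    return f"emnist/{config}"


def load_emnist(config="byclass", split="train"):
    """Load one split as (image, label) pairs (hypothetical helper)."""
    # Imported lazily so emnist_name() can be used without TensorFlow installed.
    import tensorflow_datasets as tfds

    # as_supervised=True yields tuples ordered by the supervised keys
    # ('image', 'label') instead of feature dictionaries.
    return tfds.load(emnist_name(config), split=split, as_supervised=True)
```

For example, `load_emnist("balanced", split="test")` would return the test split of the Balanced config. The first call downloads and prepares the data, so it needs network access and disk space.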

emnist/byclass (default config)

Config description: EMNIST ByClass
Dataset size: 349.16 MiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'test'`	116,323
`'train'`	697,932
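Only `'train'` and `'test'` splits are provided, so a validation set has to be carved out of `'train'`. The sketch below does this with the slicing syntax of the TFDS split API mentioned under version 3.0.0. The helper names and the 10% holdout are assumptions chosen for illustration.

```python
def holdout_splits(split="train", holdout_percent=10):
    """Return (main, holdout) split strings in TFDS slicing syntax."""
    if not isinstance(holdout_percent, int) or not 0 < holdout_percent < 100:
        raise ValueError("holdout_percent must be an integer from 1 to 99")
    cut = 100 - holdout_percent
    return f"{split}[:{cut}%]", f"{split}[{cut}%:]"


def load_train_and_validation(name="emnist/byclass", holdout_percent=10):
    """Load two disjoint slices of 'train' (hypothetical helper)."""
    # Imported lazily so holdout_splits() works without TensorFlow installed.
    import tensorflow_datasets as tfds

    train_split, val_split = holdout_splits("train", holdout_percent)
    # Passing a list of splits returns one dataset per entry, in order.
    train_ds, val_ds = tfds.load(
        name, split=[train_split, val_split], as_supervised=True
    )
    return train_ds, val_ds
```

With the defaults, `holdout_splits()` produces `'train[:90%]'` and `'train[90%:]'`, which do not overlap.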

Feature structure:

FeaturesDict({
    'image': Image(shape=(28, 28, 1), dtype=uint8),
    'label': ClassLabel(shape=(), dtype=int64, num_classes=62),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
image	Image	(28, 28, 1)	uint8
label	ClassLabel		int64
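The `label` feature is an integer in the range 0 to 61, and this page does not say which character each integer stands for. The sketch below assumes the ordering used by the upstream EMNIST ByClass mapping (digits, then uppercase letters, then lowercase letters). That ordering is an assumption here and is worth checking against a few decoded examples.

```python
import string

# Assumed class order: '0'-'9', then 'A'-'Z', then 'a'-'z' (62 classes).
BYCLASS_CHARS = string.digits + string.ascii_uppercase + string.ascii_lowercase


def byclass_label_to_char(label):
    """Map an emnist/byclass integer label to its character."""
    if not 0 <= label < len(BYCLASS_CHARS):
        raise ValueError(f"label out of range: {label}")
    return BYCLASS_CHARS[label]
```

Under this assumption, label 10 decodes to `'A'` and label 36 to `'a'`.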

Figure (tfds.show_examples):

Visualization

Examples (tfds.as_dataframe):
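The figure and the example table are not reproduced here, but both can be regenerated locally. A sketch using `tfds.show_examples` and `tfds.as_dataframe`, assuming `tensorflow_datasets` and its plotting dependencies are installed; the `preview` helper and its defaults are illustrative.

```python
def preview(name="emnist/byclass", split="train", num_rows=4):
    """Return an image-grid figure and a small DataFrame of examples."""
    # Imported lazily: the import and the load below need TensorFlow
    # and trigger a download on first use.
    import tensorflow_datasets as tfds

    # with_info=True also returns the DatasetInfo both helpers rely on.
    # The dataset is kept as feature dictionaries (no as_supervised).
    ds, info = tfds.load(name, split=split, with_info=True)
    fig = tfds.show_examples(ds, info)
    df = tfds.as_dataframe(ds.take(num_rows), info)
    return fig, df
```

The same call works for the other configs by passing their names, for instance `preview("emnist/digits")`.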

emnist/bymerge

Config description: EMNIST ByMerge
Dataset size: 349.16 MiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'test'`	116,323
`'train'`	697,932

Feature structure:

FeaturesDict({
    'image': Image(shape=(28, 28, 1), dtype=uint8),
    'label': ClassLabel(shape=(), dtype=int64, num_classes=47),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
image	Image	(28, 28, 1)	uint8
label	ClassLabel		int64
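ByMerge has 47 classes rather than 62 because lowercase letters that look like their uppercase forms are folded into the uppercase class. The sketch below assumes the upstream EMNIST mapping, in which only eleven lowercase letters keep their own class. The exact set and order are assumptions not stated on this page. The Balanced config also declares 47 classes and is assumed to share this mapping.

```python
import string

# Assumed: lowercase letters that remain separate classes after merging.
UNMERGED_LOWERCASE = "abdefghnqrt"

# Assumed class order: digits, uppercase letters, then unmerged lowercase.
BYMERGE_CHARS = string.digits + string.ascii_uppercase + UNMERGED_LOWERCASE


def bymerge_label_to_char(label):
    """Map an emnist/bymerge (or balanced) integer label to a character."""
    if not 0 <= label < len(BYMERGE_CHARS):
        raise ValueError(f"label out of range: {label}")
    return BYMERGE_CHARS[label]
```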

emnist/balanced

Config description: EMNIST Balanced
Dataset size: 56.63 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	18,800
`'train'`	112,800

Feature structure:

FeaturesDict({
    'image': Image(shape=(28, 28, 1), dtype=uint8),
    'label': ClassLabel(shape=(), dtype=int64, num_classes=47),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
image	Image	(28, 28, 1)	uint8
label	ClassLabel		int64

emnist/letters

Config description: EMNIST Letters
Dataset size: 44.14 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	14,800
`'train'`	88,800

Feature structure:

FeaturesDict({
    'image': Image(shape=(28, 28, 1), dtype=uint8),
    'label': ClassLabel(shape=(), dtype=int64, num_classes=37),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
image	Image	(28, 28, 1)	uint8
label	ClassLabel		int64
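The feature structure declares 37 classes, although the Letters config covers the 26 letters of the alphabet with upper and lower case merged. The sketch below assumes labels run from 1 to 26, as in the upstream EMNIST Letters mapping. This is an assumption, not something this page states, so it is worth checking the label range in the loaded data before relying on it.

```python
import string


def letters_label_to_char(label):
    """Map an emnist/letters label to a letter, assuming 1 -> 'a', 26 -> 'z'."""
    if not 1 <= label <= 26:
        raise ValueError(f"label outside the assumed 1..26 range: {label}")
    # Labels are assumed to be 1-based, so shift by one to index the alphabet.
    return string.ascii_lowercase[label - 1]
```

If the assumption holds, a classifier head sized to `num_classes=37` still works, but most of the extra output units never receive a target.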

emnist/digits

Config description: EMNIST Digits
Dataset size: 120.32 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	40,000
`'train'`	240,000

Feature structure:

FeaturesDict({
    'image': Image(shape=(28, 28, 1), dtype=uint8),
    'label': ClassLabel(shape=(), dtype=int64, num_classes=10),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
image	Image	(28, 28, 1)	uint8
label	ClassLabel		int64

emnist/mnist

Config description: EMNIST MNIST
Dataset size: 30.09 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	10,000
`'train'`	60,000

Feature structure:

FeaturesDict({
    'image': Image(shape=(28, 28, 1), dtype=uint8),
    'label': ClassLabel(shape=(), dtype=int64, num_classes=10),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
image	Image	(28, 28, 1)	uint8
label	ClassLabel		int64
