voc

Description:

This dataset contains the data from the PASCAL Visual Object Classes Challenge, corresponding to the Classification and Detection competitions.

In the Classification competition, the goal is to predict the set of labels contained in the image, while in the Detection competition the goal is to predict the bounding box and label of each individual object.

WARNING: As per the official dataset, the test set of VOC2012 does not contain annotations.
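The relationship between the two tasks can be illustrated with a minimal sketch: the image-level `labels` used for classification are the distinct classes of the per-object annotations used for detection. The helper name `image_level_labels` is hypothetical, and the rule that `labels_no_difficult` drops objects flagged `is_difficult` is an assumption inferred from the feature names, not something this page states.

```python
from typing import Dict, List, Sequence


def image_level_labels(objects: Sequence[Dict]) -> Dict[str, List[int]]:
    """Collapse per-object class ids into image-level label sets.

    Assumption: `labels_no_difficult` ignores objects whose
    `is_difficult` flag is set.
    """
    all_labels = sorted({obj["label"] for obj in objects})
    easy_labels = sorted(
        {obj["label"] for obj in objects if not obj["is_difficult"]}
    )
    return {"labels": all_labels, "labels_no_difficult": easy_labels}


# Hypothetical image: two objects of class 14 (one difficult)
# and one difficult object of class 11.
example_objects = [
    {"label": 14, "is_difficult": False},
    {"label": 14, "is_difficult": True},
    {"label": 11, "is_difficult": True},
]

print(image_level_labels(example_objects))
# {'labels': [11, 14], 'labels_no_difficult': [14]}
```

Class 11 appears only on a difficult object, so it is kept in `labels` but dropped from `labels_no_difficult` under this assumption.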

Additional Documentation: Explore on Papers With Code
Source code: tfds.object_detection.Voc
Versions:
- 4.0.0 (default): No release notes.
Auto-cached (documentation): No
Feature structure:

FeaturesDict({
    'image': Image(shape=(None, None, 3), dtype=uint8),
    'image/filename': Text(shape=(), dtype=string),
    'labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=20)),
    'labels_no_difficult': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=20)),
    'objects': Sequence({
        'bbox': BBoxFeature(shape=(4,), dtype=float32),
        'is_difficult': bool,
        'is_truncated': bool,
        'label': ClassLabel(shape=(), dtype=int64, num_classes=20),
        'pose': ClassLabel(shape=(), dtype=int64, num_classes=5),
    }),
})
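The `bbox` feature holds four floats per object. As a sketch, assuming the usual TFDS `BBoxFeature` convention of `[ymin, xmin, ymax, xmax]` normalized to the range [0, 1] (worth confirming against the `BBoxFeature` documentation), a box can be converted to pixel coordinates as follows. The function name `bbox_to_pixels` is hypothetical.

```python
from typing import Sequence, Tuple


def bbox_to_pixels(
    bbox: Sequence[float], height: int, width: int
) -> Tuple[int, int, int, int]:
    """Convert a normalized box to pixel (xmin, ymin, xmax, ymax).

    Assumption: `bbox` is ordered [ymin, xmin, ymax, xmax] with
    values in [0, 1], relative to the image height and width.
    """
    ymin, xmin, ymax, xmax = bbox
    return (
        round(xmin * width),
        round(ymin * height),
        round(xmax * width),
        round(ymax * height),
    )


# Hypothetical 400x500 (height x width) image.
print(bbox_to_pixels((0.25, 0.25, 0.75, 0.5), height=400, width=500))
# (125, 100, 250, 300)
```

The output order (x first, then y) is a choice made here for plotting libraries; keep the original order if downstream code expects it.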

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
image	Image	(None, None, 3)	uint8
image/filename	Text		string
labels	Sequence(ClassLabel)	(None,)	int64
labels_no_difficult	Sequence(ClassLabel)	(None,)	int64
objects	Sequence
objects/bbox	BBoxFeature	(4,)	float32
objects/is_difficult	Tensor		bool
objects/is_truncated	Tensor		bool
objects/label	ClassLabel		int64
objects/pose	ClassLabel		int64

Supervised keys (See as_supervised doc): None
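Because no supervised keys are defined, examples are read as feature dictionaries rather than `(input, label)` tuples. The following is a minimal loading sketch, assuming `tensorflow_datasets` is installed and the data can be downloaded. The helpers `voc_config_name` and `load_voc` are hypothetical wrappers around `tfds.load`; the import is deferred so the naming helper works without TFDS.

```python
def voc_config_name(year: int) -> str:
    """Build the TFDS name for one of the two configs listed on this page."""
    if year not in (2007, 2012):
        raise ValueError(f"Unsupported VOC year: {year}")
    return f"voc/{year}"


def load_voc(year: int = 2007, split: str = "train"):
    """Load one split as a dataset of feature dicts, plus its DatasetInfo.

    `as_supervised` is left at its default (False) because this
    dataset declares no supervised keys.
    """
    import tensorflow_datasets as tfds  # deferred: only needed when loading

    return tfds.load(voc_config_name(year), split=split, with_info=True)


print(voc_config_name(2007))
# voc/2007
```

With the data available, `ds, info = load_voc(2007, "validation")` would yield dictionaries keyed as in the feature structure above, for example `example["objects"]["bbox"]`, and `info.features["labels"].feature.names` should list the class names.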

voc/2007 (default config)

Config description: This dataset contains the data from the PASCAL Visual Object Classes Challenge 2007, a.k.a. VOC2007.

This config contains 9,963 images in total. Each image is annotated with objects from 20 classes, for 24,640 annotated objects overall.

Homepage: http://host.robots.ox.ac.uk/pascal/VOC/voc2007/
Download size: 868.85 MiB
Dataset size: 837.73 MiB
Splits:

Split	Examples
`'test'`	4,952
`'train'`	2,501
`'validation'`	2,510

Citation:

@misc{pascal-voc-2007,
    author = "Everingham, M. and Van~Gool, L. and Williams, C. K. I. and Winn, J. and Zisserman, A.",
    title = "The {PASCAL} {V}isual {O}bject {C}lasses {C}hallenge 2007 {(VOC2007)} {R}esults",
    howpublished = "http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html"}

voc/2012

Config description: This dataset contains the data from the PASCAL Visual Object Classes Challenge 2012, a.k.a. VOC2012.

The train and validation splits together contain 11,540 images. Each image is annotated with objects from 20 classes, for 27,450 annotated objects overall. The test split adds a further 10,991 images, which carry no annotations.

Homepage: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/
Download size: 3.59 GiB
Dataset size: 2.44 GiB
Splits:

Split	Examples
`'test'`	10,991
`'train'`	5,717
`'validation'`	5,823
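As a quick arithmetic check on the split tables, the sketch below sums the example counts listed on this page. It shows that the 9,963 figure for VOC2007 covers all three splits, while the 11,540 figure for VOC2012 matches train plus validation only. The dictionary and function names are hypothetical; the counts are copied from the tables.

```python
# Example counts copied from the split tables on this page.
VOC_SPLITS = {
    "voc/2007": {"test": 4952, "train": 2501, "validation": 2510},
    "voc/2012": {"test": 10991, "train": 5717, "validation": 5823},
}


def trainval_size(config: str) -> int:
    """Number of examples in train + validation."""
    splits = VOC_SPLITS[config]
    return splits["train"] + splits["validation"]


def total_size(config: str) -> int:
    """Number of examples across all three splits."""
    return sum(VOC_SPLITS[config].values())


print(total_size("voc/2007"))     # 9963
print(trainval_size("voc/2012"))  # 11540
print(total_size("voc/2012"))     # 22531
```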

Citation:

@misc{pascal-voc-2012,
    author = "Everingham, M. and Van~Gool, L. and Williams, C. K. I. and Winn, J. and Zisserman, A.",
    title = "The {PASCAL} {V}isual {O}bject {C}lasses {C}hallenge 2012 {(VOC2012)} {R}esults",
    howpublished = "http://www.pascal-network.org/challenges/VOC/voc2012/workshop/index.html"}