
segment_anything

Description:

SA-1B Download

Segment Anything 1 Billion (SA-1B) is a dataset designed for training general-purpose object segmentation models from open-world images. The dataset was introduced in the paper "Segment Anything".

The SA-1B dataset consists of 11M diverse, high-resolution, licensed, and privacy-protecting images and 1.1B mask annotations. Masks are given in the COCO run-length encoding (RLE) format, and do not have classes.
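
To make the RLE idea concrete, here is a minimal sketch of decoding the uncompressed variant of COCO RLE, in which `counts` is a list of integers giving alternating run lengths of 0s and 1s, scanned column by column. This is an illustration only: the dataset stores `counts` as a compressed string, which is what pycocotools decodes, and the function name `decode_uncompressed_rle` is hypothetical.

```python
import numpy as np


def decode_uncompressed_rle(counts, size):
    """Decode an uncompressed COCO-style RLE into a binary mask.

    Assumes `counts` is a list of run lengths that alternate between
    background (0) and foreground (1), starting with background, and
    `size` is (height, width). This does NOT handle the compressed
    string form stored in the dataset.
    """
    height, width = size
    flat = np.zeros(height * width, dtype=np.uint8)
    position = 0
    value = 0  # The first run always describes background pixels.
    for run in counts:
        if value:
            flat[position:position + run] = 1
        position += run
        value = 1 - value
    # COCO RLE scans the mask column by column (Fortran order).
    return flat.reshape((height, width), order="F")


mask = decode_uncompressed_rle([1, 2, 3], (2, 3))
print(mask)
```

The column-major reshape is the step that is easiest to get wrong: a row-major reshape would yield a mask of the right shape but with the pixels in the wrong places.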

The dataset is released under a custom license. Read the full terms and conditions at https://ai.facebook.com/datasets/segment-anything-downloads

All features come from the original dataset except image.content (the content of the image).

You can decode segmentation masks with:

import tensorflow_datasets as tfds

pycocotools = tfds.core.lazy_imports.pycocotools

ds = tfds.load('segment_anything', split='train')
for example in tfds.as_numpy(ds):
  segmentation = example['annotations']['segmentation']
  for counts, size in zip(segmentation['counts'], segmentation['size']):
    encoded_mask = {'size': size, 'counts': counts}
    mask = pycocotools.decode(encoded_mask)  # np.array(dtype=uint8) mask
    ...

Homepage: https://ai.facebook.com/datasets/segment-anything-downloads
Source code: tfds.datasets.segment_anything.Builder
Versions:
- 1.0.0 (default): Initial release.
Download size: 10.28 TiB
Dataset size: 10.59 TiB
Manual download instructions: This dataset requires you to download the source data manually into download_config.manual_dir (defaults to ~/tensorflow_datasets/downloads/manual/):
Download the links file from https://ai.facebook.com/datasets/segment-anything-downloads. manual_dir should contain the links file saved as segment_anything_links.txt.
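
The manual download step can be wired up roughly as follows. This is a sketch, assuming the links file has already been saved under the name given above; the helper names `links_file_path` and `load_sa1b` are hypothetical, while `tfds.download.DownloadConfig(manual_dir=...)` and the `download_and_prepare_kwargs` argument of `tfds.load` are the standard TFDS way to pass a manual directory.

```python
import os

# File name required by the manual download instructions.
LINKS_FILE_NAME = "segment_anything_links.txt"


def links_file_path(manual_dir):
    """Return the path where the links file is expected (hypothetical helper)."""
    return os.path.join(os.path.expanduser(manual_dir), LINKS_FILE_NAME)


def load_sa1b(manual_dir="~/tensorflow_datasets/downloads/manual/"):
    """Load the train split, pointing TFDS at `manual_dir` (hypothetical helper)."""
    path = links_file_path(manual_dir)
    # Fail early with a clear message instead of partway through preparation.
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Expected the links file at {path}")

    # Imported here so the path check works without TFDS installed.
    import tensorflow_datasets as tfds

    config = tfds.download.DownloadConfig(
        manual_dir=os.path.expanduser(manual_dir)
    )
    return tfds.load(
        "segment_anything",
        split="train",
        download_and_prepare_kwargs={"download_config": config},
    )
```

Note that preparing the dataset downloads roughly 10 TiB, so it is worth checking available disk space before calling `load_sa1b()`.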
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	11,185,362

Feature structure:

FeaturesDict({
    'annotations': Sequence({
        'area': Scalar(shape=(), dtype=uint64, description=The area in pixels of the mask.),
        'bbox': BBoxFeature(shape=(4,), dtype=float32, description=The box around the mask, in TFDS format.),
        'crop_box': BBoxFeature(shape=(4,), dtype=float32, description=The crop of the image used to generate the mask, in TFDS format.),
        'id': Scalar(shape=(), dtype=uint64, description=Identifier for the annotation.),
        'point_coords': Tensor(shape=(1, 2), dtype=float64, description=The point coordinates input to the model to generate the mask.),
        'predicted_iou': Scalar(shape=(), dtype=float64, description=The model's own prediction of the mask's quality.),
        'segmentation': FeaturesDict({
            'counts': string,
            'size': Tensor(shape=(2,), dtype=uint64),
        }),
        'stability_score': Scalar(shape=(), dtype=float64, description=A measure of the mask's quality.),
    }),
    'image': FeaturesDict({
        'content': Image(shape=(None, None, 3), dtype=uint8, description=Content of the image.),
        'file_name': string,
        'height': uint64,
        'image_id': uint64,
        'width': uint64,
    }),
})
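
The `bbox` and `crop_box` features use the TFDS bounding-box format, which stores `[ymin, xmin, ymax, xmax]` normalized to the range [0, 1]. A minimal sketch of converting such a box back to pixel `[x, y, w, h]` values, using `image/height` and `image/width` from the same example, might look like this (the function name `tfds_bbox_to_xywh` is hypothetical):

```python
def tfds_bbox_to_xywh(bbox, height, width):
    """Convert a normalized TFDS box to pixel (x, y, w, h).

    Assumes `bbox` is [ymin, xmin, ymax, xmax] with values in [0, 1],
    and `height` / `width` are the image dimensions in pixels.
    """
    ymin, xmin, ymax, xmax = (float(v) for v in bbox)
    x = xmin * width
    y = ymin * height
    w = (xmax - xmin) * width
    h = (ymax - ymin) * height
    return x, y, w, h


# Example with a 400x200 (height x width) image.
print(tfds_bbox_to_xywh([0.25, 0.5, 0.75, 1.0], height=400, width=200))
```

The results are floats; round or cast them only at the point where integer pixel indices are actually needed, to avoid accumulating off-by-one errors.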

Feature documentation:

Feature	Class	Shape	Dtype	Description
	FeaturesDict
annotations	Sequence
annotations/area	Scalar		uint64	The area in pixels of the mask.
annotations/bbox	BBoxFeature	(4,)	float32	The box around the mask, in TFDS format.
annotations/crop_box	BBoxFeature	(4,)	float32	The crop of the image used to generate the mask, in TFDS format.
annotations/id	Scalar		uint64	Identifier for the annotation.
annotations/point_coords	Tensor	(1, 2)	float64	The point coordinates input to the model to generate the mask.
annotations/predicted_iou	Scalar		float64	The model's own prediction of the mask's quality.
annotations/segmentation	FeaturesDict			Encoded segmentation mask in COCO RLE format (dict with keys `size` and `counts`).
annotations/segmentation/counts	Tensor		string
annotations/segmentation/size	Tensor	(2,)	uint64
annotations/stability_score	Scalar		float64	A measure of the mask's quality.
image	FeaturesDict
image/content	Image	(None, None, 3)	uint8	Content of the image.
image/file_name	Tensor		string
image/height	Tensor		uint64
image/image_id	Tensor		uint64
image/width	Tensor		uint64

Supervised keys (See as_supervised doc): None
Figure (tfds.show_examples): Not supported.

Citation:

@misc{kirillov2023segment,
  title={Segment Anything},
  author={Alexander Kirillov and Eric Mintun and Nikhila Ravi and Hanzi Mao and Chloe Rolland and Laura Gustafson and Tete Xiao and Spencer Whitehead and Alexander C. Berg and Wan-Yen Lo and Piotr Dollár and Ross Girshick},
  year={2023},
  eprint={2304.02643},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}