

  • Description:

SA-1B Download

Segment Anything 1 Billion (SA-1B) is a dataset designed for training general-purpose object segmentation models from open world images. The dataset was introduced in the paper "Segment Anything".

The SA-1B dataset consists of 11M diverse, high-resolution, licensed, and privacy-protecting images and 1.1B mask annotations. Masks are given in the COCO run-length encoding (RLE) format, and do not have classes.
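To make the RLE idea concrete, here is a minimal pure-Python sketch of decoding an uncompressed COCO-style run-length encoding. It assumes the usual COCO convention: runs alternate between background and foreground, starting with background, and pixels are laid out column by column. The function name `decode_uncompressed_rle` is hypothetical, and the sketch does not handle the compressed string form in which this dataset stores `counts`; use pycocotools for that.

```python
def decode_uncompressed_rle(counts, size):
    """Decode an uncompressed COCO-style RLE into a list of 0/1 rows.

    Illustrative helper (not part of TFDS or pycocotools).
    counts: run lengths, alternating background (0) and foreground (1),
            starting with background.
    size:   (height, width) of the mask.
    """
    height, width = size
    flat = []
    value = 0  # the first run always describes background pixels
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("run lengths do not sum to height * width")
    # Pixels are stored column by column, so pixel (row, col) sits at
    # index col * height + row in the flat sequence.
    return [
        [flat[col * height + row] for col in range(width)]
        for row in range(height)
    ]


# A 2x3 mask: 1 background pixel, 2 foreground pixels, 3 background pixels.
mask = decode_uncompressed_rle([1, 2, 3], size=(2, 3))
print(mask)  # [[0, 1, 0], [1, 0, 0]]
```

Because the runs carry no class label, each decoded mask is a plain binary image.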

The license is custom. Please read the full terms and conditions on

All features come from the original dataset except image.content, which holds the content of the image.

You can decode segmentation masks with:

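import tensorflow_datasets as tfds

# Import pycocotools lazily so it is only required when masks are decoded.
pycocotools = tfds.core.lazy_imports.pycocotools

ds = tfds.load('segment_anything', split='train')
for example in tfds.as_numpy(ds):
  # Each image carries a sequence of annotations; the RLE fields are parallel arrays.
  segmentation = example['annotations']['segmentation']
  for counts, size in zip(segmentation['counts'], segmentation['size']):
    # Rebuild the COCO RLE dict expected by the decoder.
    encoded_mask = {'size': size, 'counts': counts}
    mask = pycocotools.decode(encoded_mask)  # np.array(dtype=uint8) mask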
  • Feature structure:
FeaturesDict({
    'annotations': Sequence({
        'area': Scalar(shape=(), dtype=uint64),
        'bbox': BBoxFeature(shape=(4,), dtype=float32),
        'crop_box': BBoxFeature(shape=(4,), dtype=float32),
        'id': Scalar(shape=(), dtype=uint64),
        'point_coords': Tensor(shape=(1, 2), dtype=float64),
        'predicted_iou': Scalar(shape=(), dtype=float64),
        'segmentation': FeaturesDict({
            'counts': string,
            'size': Tensor(shape=(2,), dtype=uint64),
        }),
        'stability_score': Scalar(shape=(), dtype=float64),
    }),
    'image': FeaturesDict({
        'content': Image(shape=(None, None, 3), dtype=uint8),
        'file_name': string,
        'height': uint64,
        'image_id': uint64,
        'width': uint64,
    }),
})
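
The `bbox` and `crop_box` entries are `BBoxFeature` values, which TFDS stores as normalized `[ymin, xmin, ymax, xmax]` floats in the range 0 to 1. A minimal sketch of converting such a box to pixel `(x, y, width, height)` form follows; the helper name `tfds_bbox_to_pixels` is hypothetical, and the image height and width are assumed to come from the `image/height` and `image/width` features.

```python
def tfds_bbox_to_pixels(bbox, height, width):
    """Convert a normalized TFDS box to pixel (x, y, w, h).

    Illustrative helper (not part of TFDS).
    bbox:   sequence of four floats, [ymin, xmin, ymax, xmax], each in [0, 1].
    height: image height in pixels.
    width:  image width in pixels.
    """
    ymin, xmin, ymax, xmax = (float(v) for v in bbox)
    x = xmin * width
    y = ymin * height
    box_width = (xmax - xmin) * width
    box_height = (ymax - ymin) * height
    return (x, y, box_width, box_height)


# A box covering the right half of the middle band of a 600x400 image.
print(tfds_bbox_to_pixels([0.25, 0.5, 0.75, 1.0], height=400, width=600))
# (300.0, 100.0, 300.0, 200.0)
```

The pixel `(x, y, w, h)` layout is chosen here because it matches the box convention used by COCO-style tooling; keep the normalized form if downstream code expects TFDS boxes.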
  • Feature documentation:
| Feature | Class | Shape | Dtype | Description |
|---|---|---|---|---|
| annotations | Sequence | | | |
| annotations/area | Scalar | | uint64 | The area in pixels of the mask. |
| annotations/bbox | BBoxFeature | (4,) | float32 | The box around the mask, in TFDS format. |
| annotations/crop_box | BBoxFeature | (4,) | float32 | The crop of the image used to generate the mask, in TFDS format. |
| annotations/id | Scalar | | uint64 | Identifier for the annotation. |
| annotations/point_coords | Tensor | (1, 2) | float64 | The point coordinates input to the model to generate the mask. |
| annotations/predicted_iou | Scalar | | float64 | The model's own prediction of the mask's quality. |
| annotations/segmentation | FeaturesDict | | | Encoded segmentation mask in COCO RLE format (dict with keys size and counts). |
| annotations/segmentation/counts | Tensor | | string | |
| annotations/segmentation/size | Tensor | (2,) | uint64 | |
| annotations/stability_score | Scalar | | float64 | A measure of the mask's quality. |
| image | FeaturesDict | | | |
| image/content | Image | (None, None, 3) | uint8 | Content of the image. |
| image/file_name | Tensor | | string | |
| image/height | Tensor | | uint64 | |
| image/image_id | Tensor | | uint64 | |
| image/width | Tensor | | uint64 | |
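
Since `predicted_iou` and `stability_score` both describe mask quality, one plausible use is to keep only the annotations that pass a threshold on each. The sketch below assumes the annotations arrive as a dict of parallel sequences, as in a `tfds.as_numpy` example; the function name `select_mask_indices` and the default thresholds are illustrative choices, not values taken from the dataset documentation.

```python
def select_mask_indices(annotations, min_predicted_iou=0.9,
                        min_stability_score=0.95):
    """Return indices of annotations that pass both quality thresholds.

    Illustrative helper; the thresholds are arbitrary example values.
    annotations: dict with parallel 'predicted_iou' and 'stability_score'
                 sequences (lists or 1-D NumPy arrays).
    """
    keep = []
    scores = zip(annotations['predicted_iou'], annotations['stability_score'])
    for index, (iou, stability) in enumerate(scores):
        if iou >= min_predicted_iou and stability >= min_stability_score:
            keep.append(index)
    return keep


# Synthetic scores for three masks; only the first passes both thresholds.
example_annotations = {
    'predicted_iou': [0.95, 0.80, 0.99],
    'stability_score': [0.97, 0.99, 0.90],
}
print(select_mask_indices(example_annotations))  # [0]
```

The returned indices can then be used to pick the matching `segmentation`, `bbox`, and `area` entries, since every field under `annotations` shares the same ordering.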
  title={Segment Anything},
  author={Alexander Kirillov and Eric Mintun and Nikhila Ravi and Hanzi Mao and Chloe Rolland and Laura Gustafson and Tete Xiao and Spencer Whitehead and Alexander C. Berg and Wan-Yen Lo and Piotr Dollár and Ross Girshick},