segment_anything

Description:

SA-1B Download

Segment Anything 1 Billion (SA-1B) is a dataset designed for training general-purpose object segmentation models from open-world images. The dataset was introduced in the paper "Segment Anything".

The SA-1B dataset consists of 11M diverse, high-resolution, licensed, and privacy-protecting images and 1.1B mask annotations. Masks are given in the COCO run-length encoding (RLE) format and do not have classes.
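To illustrate the idea behind run-length encoding, here is a minimal sketch that decodes an uncompressed COCO-style RLE (a list of integer run lengths) in pure Python. The function name `decode_uncompressed_rle` is hypothetical. The dataset itself stores `counts` as a compressed string, which this sketch does not handle; use pycocotools for real data.

```python
def decode_uncompressed_rle(counts, size):
    """Decode uncompressed COCO-style RLE into a 2D list of 0/1 values.

    counts: run lengths, alternating background (0) and foreground (1),
            starting with background.
    size:   (height, width); pixels are laid out column by column.
    """
    height, width = size
    flat = []
    value = 0  # COCO RLE always starts with a run of zeros
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("run lengths do not cover the mask exactly")
    # Convert the column-major flat list into row-major rows.
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]


# Toy input made up for illustration: a 2x3 mask with three foreground pixels.
mask = decode_uncompressed_rle(counts=[2, 3, 1], size=(2, 3))
```

The column-major layout is why the flat index is `col * height + row` rather than `row * width + col`.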

The license is custom. Please read the full terms and conditions at https://ai.facebook.com/datasets/segment-anything-downloads.

All features are present in the original dataset except image.content (the content of the image).

You can decode segmentation masks with:

import tensorflow_datasets as tfds

# TFDS imports pycocotools lazily; the pycocotools package must be installed.
pycocotools = tfds.core.lazy_imports.pycocotools

ds = tfds.load('segment_anything', split='train')
for example in tfds.as_numpy(ds):
  # Per-annotation RLE fields for every mask in this image.
  segmentation = example['annotations']['segmentation']
  for counts, size in zip(segmentation['counts'], segmentation['size']):
    encoded_mask = {'size': size, 'counts': counts}
    mask = pycocotools.decode(encoded_mask)  # np.array(dtype=uint8) mask
    ...

Homepage: https://ai.facebook.com/datasets/segment-anything-downloads
Source code: tfds.datasets.segment_anything.Builder
Versions:
- 1.0.0 (default): Initial release.
Download size: 10.28 TiB
Dataset size: 10.59 TiB
Manual download instructions: This dataset requires you to download the source data manually into download_config.manual_dir (defaults to ~/tensorflow_datasets/downloads/manual/):
Download the links file from https://ai.facebook.com/datasets/segment-anything-downloads. manual_dir must contain the links file, saved as segment_anything_links.txt.
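The manual download step can be checked before starting a multi-terabyte preparation run. The following is a minimal sketch; the helper names `has_links_file` and `load_sa1b` are hypothetical, and the TFDS call assumes the standard `tfds.download.DownloadConfig(manual_dir=...)` and `download_and_prepare_kwargs` options.

```python
from pathlib import Path

# File name stated in the manual download instructions.
LINKS_FILE_NAME = "segment_anything_links.txt"


def has_links_file(manual_dir):
    """Return True if manual_dir contains the SA-1B links file."""
    return (Path(manual_dir).expanduser() / LINKS_FILE_NAME).is_file()


def load_sa1b(manual_dir):
    """Hypothetical helper: load the train split using a custom manual_dir."""
    import tensorflow_datasets as tfds  # imported lazily; requires TFDS

    if not has_links_file(manual_dir):
        raise FileNotFoundError(f"{LINKS_FILE_NAME} not found in {manual_dir}")
    config = tfds.download.DownloadConfig(manual_dir=str(manual_dir))
    return tfds.load(
        "segment_anything",
        split="train",
        download_and_prepare_kwargs={"download_config": config},
    )
```

Calling `has_links_file("~/tensorflow_datasets/downloads/manual/")` checks the default location without importing TFDS.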
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	11,185,362

Feature structure:

FeaturesDict({
    'annotations': Sequence({
        'area': Scalar(shape=(), dtype=uint64),
        'bbox': BBoxFeature(shape=(4,), dtype=float32),
        'crop_box': BBoxFeature(shape=(4,), dtype=float32),
        'id': Scalar(shape=(), dtype=uint64),
        'point_coords': Tensor(shape=(1, 2), dtype=float64),
        'predicted_iou': Scalar(shape=(), dtype=float64),
        'segmentation': FeaturesDict({
            'counts': string,
            'size': Tensor(shape=(2,), dtype=uint64),
        }),
        'stability_score': Scalar(shape=(), dtype=float64),
    }),
    'image': FeaturesDict({
        'content': Image(shape=(None, None, 3), dtype=uint8),
        'file_name': string,
        'height': uint64,
        'image_id': uint64,
        'width': uint64,
    }),
})
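Because `annotations` is a `Sequence` of dict features, TFDS exposes it as a dict of per-annotation arrays, so the i-th entry of each field describes the same mask. A minimal sketch of selecting masks by the two score fields follows; the function name `select_confident` and the threshold defaults are illustrative assumptions, not values taken from the dataset.

```python
def select_confident(annotations, min_iou=0.9, min_stability=0.9):
    """Return indices of annotations that pass both score thresholds.

    The threshold defaults are illustrative assumptions.
    """
    scores = zip(annotations["predicted_iou"], annotations["stability_score"])
    return [i for i, (iou, stability) in enumerate(scores)
            if iou >= min_iou and stability >= min_stability]


# Toy stand-in for example['annotations'] (values made up for illustration).
toy_annotations = {
    "predicted_iou": [0.95, 0.70, 0.99],
    "stability_score": [0.97, 0.98, 0.80],
}
kept = select_confident(toy_annotations)
```

The returned indices can then be used to pick the matching entries of `segmentation['counts']` and `segmentation['size']`.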

Feature documentation:

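Feature	Class	Shape	Dtype	Description
	FeaturesDict
annotations	Sequence
annotations/area	Scalar		uint64	The area of the mask, in pixels.
annotations/bbox	BBoxFeature	(4,)	float32	The box around the mask, in TFDS format.
annotations/crop_box	BBoxFeature	(4,)	float32	The crop of the image used to generate the mask, in TFDS format.
annotations/id	Scalar		uint64	Identifier for the annotation.
annotations/point_coords	Tensor	(1, 2)	float64	The point coordinates input to the model to generate the mask.
annotations/predicted_iou	Scalar		float64	The model's own prediction of the mask's quality.
annotations/segmentation	FeaturesDict			Encoded segmentation mask in COCO RLE format (dict with keys `size` and `counts`).
annotations/segmentation/counts	Tensor		string
annotations/segmentation/size	Tensor	(2,)	uint64
annotations/stability_score	Scalar		float64	A measure of the mask's quality.
image	FeaturesDict
image/content	Image	(None, None, 3)	uint8	Content of the image.
image/file_name	Tensor		string
image/height	Tensor		uint64
image/image_id	Tensor		uint64
image/width	Tensor		uint64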
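The `bbox` and `crop_box` features use the TFDS box format. Assuming the usual TFDS `BBoxFeature` convention of normalized `(ymin, xmin, ymax, xmax)` values in [0, 1], a box can be mapped back to pixel coordinates with the image height and width. The helper name `bbox_to_pixels` is hypothetical.

```python
def bbox_to_pixels(bbox, height, width):
    """Convert a normalized (ymin, xmin, ymax, xmax) box to pixel coordinates."""
    ymin, xmin, ymax, xmax = bbox
    return (ymin * height, xmin * width, ymax * height, xmax * width)


# Toy values made up for illustration: a box on a 400x600 (height x width) image.
pixel_box = bbox_to_pixels((0.25, 0.5, 0.75, 1.0), height=400, width=600)
```

In a real example, `height` and `width` would come from `example['image']['height']` and `example['image']['width']`.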

Supervised keys (see as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe): Missing.
Citation:

@misc{kirillov2023segment,
  title={Segment Anything},
  author={Alexander Kirillov and Eric Mintun and Nikhila Ravi and Hanzi Mao and Chloe Rolland and Laura Gustafson and Tete Xiao and Spencer Whitehead and Alexander C. Berg and Wan-Yen Lo and Piotr Dollár and Ross Girshick},
  year={2023},
  eprint={2304.02643},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}