imagenet2012_multilabel

Description:

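This dataset contains ILSVRC-2012 (ImageNet) validation images annotated with multi-class labels from "Evaluating Machine Accuracy on ImageNet", ICML, 2020. The multi-class labels were reviewed by a panel of experts trained in the fine-grained class distinctions of the ImageNet class hierarchy (see the paper for details). Compared to the original labels, these expert-reviewed multi-class labels enable a more semantically coherent evaluation of accuracy.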

Version 3.0.0 of this dataset contains more corrected labels from "When does dough become a bagel?".

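Only 20,000 of the 50,000 ImageNet validation images have multi-label annotations. The set of multi-labels was first generated by a testbed of 67 trained ImageNet models. Each individual model prediction was then manually annotated by the experts as correct (the label is correct for the image), wrong (the label is incorrect for the image), or unclear (no consensus was reached among the experts).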

Additionally, during annotation, the expert panel identified a set of problematic images. An image was considered problematic if it met any of the following criteria:

- The original ImageNet label (top-1 label) was incorrect or unclear
- The image was a drawing, painting, sketch, cartoon, or computer-rendered
- The image was excessively edited
- The image had inappropriate content

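The problematic images are included in this dataset but should be ignored when computing multi-label accuracy. Additionally, since the initial set of 20,000 annotations is class-balanced but the set of problematic images is not, we recommend computing the per-class accuracies and then averaging them. We also recommend counting a prediction as correct if it is marked as correct or unclear (i.e., being lenient with the unclear labels).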

One possible way of doing this is with the following NumPy code:

import tensorflow_datasets as tfds

ds = tfds.load('imagenet2012_multilabel', split='validation')

# We assume that predictions is a dictionary from file_name to a class index between 0 and 999

num_correct_per_class = {}
num_images_per_class = {}

for example in ds:
    # We ignore all problematic images
    if example['is_problematic'].numpy():
        continue

    # The label of the image in ImageNet
    cur_class = example['original_label'].numpy()

    # If we haven't processed this class yet, set the counters to 0
    if cur_class not in num_correct_per_class:
        num_correct_per_class[cur_class] = 0
        assert cur_class not in num_images_per_class
        num_images_per_class[cur_class] = 0

    num_images_per_class[cur_class] += 1

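    # Get the predictions for this image. file_name is returned as bytes,
    # so decode it (assuming predictions is keyed by str file names)
    cur_pred = predictions[example['file_name'].numpy().decode('utf-8')]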

    # We count a prediction as correct if it is marked as correct or unclear
    # (i.e., we are lenient with the unclear labels)
    if cur_pred in example['correct_multi_labels'].numpy() or cur_pred in example['unclear_multi_labels'].numpy():
        num_correct_per_class[cur_class] += 1

# Check that we have collected accuracy data for each of the 1,000 classes
num_classes = 1000
assert len(num_correct_per_class) == num_classes
assert len(num_images_per_class) == num_classes

# Compute the per-class accuracies and then average them
final_avg = 0
for cid in range(num_classes):
    assert cid in num_correct_per_class
    assert cid in num_images_per_class
    final_avg += num_correct_per_class[cid] / num_images_per_class[cid]
final_avg /= num_classes
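
To check the aggregation logic without downloading the dataset, the same protocol can be sketched on plain Python records. This is only an illustrative sketch: the function name `mean_per_class_accuracy` and the record layout (dictionaries that mirror the dataset's feature names but hold plain Python values) are assumptions made for this example and are not part of TFDS.

```python
from collections import defaultdict


def mean_per_class_accuracy(records, predictions):
    """Average of per-class accuracies, skipping problematic images.

    records: iterable of dicts with keys 'file_name', 'original_label',
        'is_problematic', 'correct_multi_labels', 'unclear_multi_labels'
        (a hypothetical plain-Python mirror of the dataset features).
    predictions: dict mapping file_name to a predicted class index.
    """
    num_correct = defaultdict(int)
    num_images = defaultdict(int)

    for rec in records:
        # Problematic images are excluded from the accuracy computation.
        if rec['is_problematic']:
            continue
        cls = rec['original_label']
        num_images[cls] += 1
        pred = predictions[rec['file_name']]
        # Lenient protocol: 'correct' and 'unclear' labels both count as hits.
        accepted = set(rec['correct_multi_labels']) | set(rec['unclear_multi_labels'])
        if pred in accepted:
            num_correct[cls] += 1

    if not num_images:
        raise ValueError('no non-problematic records to evaluate')
    # Compute one accuracy per class, then average over the classes seen.
    per_class = [num_correct[c] / num_images[c] for c in num_images]
    return sum(per_class) / len(per_class)


# Toy data (made up for illustration only).
records = [
    {'file_name': 'a.JPEG', 'original_label': 0, 'is_problematic': False,
     'correct_multi_labels': [0, 1], 'unclear_multi_labels': []},
    {'file_name': 'b.JPEG', 'original_label': 0, 'is_problematic': False,
     'correct_multi_labels': [0], 'unclear_multi_labels': [2]},
    {'file_name': 'c.JPEG', 'original_label': 5, 'is_problematic': False,
     'correct_multi_labels': [5], 'unclear_multi_labels': []},
    {'file_name': 'd.JPEG', 'original_label': 5, 'is_problematic': True,
     'correct_multi_labels': [5], 'unclear_multi_labels': []},
]
predictions = {'a.JPEG': 1, 'b.JPEG': 3, 'c.JPEG': 5, 'd.JPEG': 7}
accuracy = mean_per_class_accuracy(records, predictions)
```

Unlike the loop above, this sketch averages over the classes that actually occur in `records` rather than asserting that all 1,000 classes are present, which keeps it usable on small subsets.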

Homepage: https://github.com/modestyachts/evaluating_machine_accuracy_on_imagenet
Source code: tfds.datasets.imagenet2012_multilabel.Builder
Versions:
- 1.0.0: Initial release.
- 2.0.0: Fixed the ILSVRC2012_img_val.tar file.
- 3.0.0 (default): Corrected labels and ImageNet-M split.
Download size: 191.13 MiB
Dataset size: 2.50 GiB
Manual download instructions: This dataset requires you to download the source data manually into download_config.manual_dir (defaults to ~/tensorflow_datasets/downloads/manual/).
manual_dir should contain the ILSVRC2012_img_val.tar file. You need to register at http://www.image-net.org/download-images to get the link to download the dataset.
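
A small pre-flight check can catch a missing archive before TFDS starts preparing the dataset. The sketch below is illustrative only: the helper names `has_validation_archive` and `load_multilabel` are hypothetical, and the `tfds.load` call assumes that the `download_and_prepare_kwargs` argument and `tfds.download.DownloadConfig(manual_dir=...)` behave as in recent TFDS releases.

```python
from pathlib import Path

ARCHIVE_NAME = 'ILSVRC2012_img_val.tar'
# Default manual_dir stated in the instructions above.
DEFAULT_MANUAL_DIR = Path.home() / 'tensorflow_datasets' / 'downloads' / 'manual'


def has_validation_archive(manual_dir=DEFAULT_MANUAL_DIR):
    """Return True if the manually downloaded archive is in manual_dir."""
    return (Path(manual_dir) / ARCHIVE_NAME).is_file()


def load_multilabel(manual_dir=DEFAULT_MANUAL_DIR, split='validation'):
    """Load the dataset, pointing TFDS at a custom manual_dir (sketch)."""
    if not has_validation_archive(manual_dir):
        raise FileNotFoundError(f'{ARCHIVE_NAME} not found in {manual_dir}')
    # Imported lazily so the check above works without TFDS installed.
    import tensorflow_datasets as tfds
    config = tfds.download.DownloadConfig(manual_dir=str(manual_dir))
    return tfds.load(
        'imagenet2012_multilabel',
        split=split,
        download_and_prepare_kwargs={'download_config': config},
    )
```

Calling `load_multilabel()` with no arguments uses the default location; passing another directory is useful when the archive lives on a shared or external disk.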
Auto-cached (documentation): No
Splits:

Split	Examples
`'imagenet_m'`	68
`'validation'`	20,000

Feature structure:

FeaturesDict({
    'correct_multi_labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=1000)),
    'file_name': Text(shape=(), dtype=string),
    'image': Image(shape=(None, None, 3), dtype=uint8),
    'is_problematic': bool,
    'original_label': ClassLabel(shape=(), dtype=int64, num_classes=1000),
    'unclear_multi_labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=1000)),
    'wrong_multi_labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=1000)),
})
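
When examples are iterated as NumPy values (for instance through `tfds.as_numpy`), `file_name` arrives as bytes and the three label sequences as variable-length integer arrays. The following sketch converts the annotation fields of one such example into plain Python values and checks a predicted class against the three label sets. The helper names `to_plain` and `judge_prediction`, and the `'unreviewed'` outcome for a class that appears in none of the sets, are assumptions made for illustration.

```python
def to_plain(example):
    """Convert the annotation fields of a NumPy-style example to plain Python.

    The image itself is left out; only the labels and metadata are converted.
    """
    name = example['file_name']
    if isinstance(name, bytes):
        name = name.decode('utf-8')
    return {
        'file_name': name,
        'is_problematic': bool(example['is_problematic']),
        'original_label': int(example['original_label']),
        'correct_multi_labels': [int(x) for x in example['correct_multi_labels']],
        'unclear_multi_labels': [int(x) for x in example['unclear_multi_labels']],
        'wrong_multi_labels': [int(x) for x in example['wrong_multi_labels']],
    }


def judge_prediction(plain, predicted_class):
    """Return 'correct', 'unclear', 'wrong', or 'unreviewed' (assumed name)."""
    for verdict in ('correct', 'unclear', 'wrong'):
        if predicted_class in plain[f'{verdict}_multi_labels']:
            return verdict
    # The class is in none of the expert-reviewed sets.
    return 'unreviewed'


# Made-up example values, using the feature names from the structure above.
sample = {
    'file_name': b'ILSVRC2012_val_00000001.JPEG',
    'is_problematic': False,
    'original_label': 65,
    'correct_multi_labels': [65],
    'unclear_multi_labels': [58],
    'wrong_multi_labels': [62, 54],
}
plain = to_plain(sample)
verdicts = [judge_prediction(plain, c) for c in (65, 58, 62, 7)]
```

Under the lenient protocol recommended in the description, both `'correct'` and `'unclear'` would count as hits, while `'wrong'` and `'unreviewed'` would not.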

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
correct_multi_labels	Sequence(ClassLabel)	(None,)	int64
file_name	Text		string
image	Image	(None, None, 3)	uint8
is_problematic	Tensor		bool
original_label	ClassLabel		int64
unclear_multi_labels	Sequence(ClassLabel)	(None,)	int64
wrong_multi_labels	Sequence(ClassLabel)	(None,)	int64

Supervised keys (see as_supervised doc): ('image', 'correct_multi_labels')
Figure (tfds.show_examples):

Examples (tfds.as_dataframe):

Citation:

@article{shankar2019evaluating,
  title={Evaluating Machine Accuracy on ImageNet},
  author={Vaishaal Shankar* and Rebecca Roelofs* and Horia Mania and Alex Fang and Benjamin Recht and Ludwig Schmidt},
  journal={ICML},
  year={2020},
  note={\url{http://proceedings.mlr.press/v119/shankar20c.html} }
}
@article{ImageNetChallenge,
  title={ {ImageNet} large scale visual recognition challenge},
  author={Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause
   and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and
   Alexander C. Berg and Fei-Fei Li},
  journal={International Journal of Computer Vision},
  year={2015},
  note={\url{https://arxiv.org/abs/1409.0575} }
}
@inproceedings{ImageNet,
   author={Jia Deng and Wei Dong and Richard Socher and Li-Jia Li and Kai Li and Li Fei-Fei},
   booktitle={Conference on Computer Vision and Pattern Recognition (CVPR)},
   title={ {ImageNet}: A large-scale hierarchical image database},
   year={2009},
   note={\url{http://www.image-net.org/papers/imagenet_cvpr09.pdf} }
}
@article{vasudevan2022does,
  title={When does dough become a bagel? Analyzing the remaining mistakes on ImageNet},
  author={Vasudevan, Vijay and Caine, Benjamin and Gontijo-Lopes, Raphael and Fridovich-Keil, Sara and Roelofs, Rebecca},
  journal={arXiv preprint arXiv:2205.04596},
  year={2022}
}