imagenet2012_multilabel

Description:

This dataset contains ILSVRC-2012 (ImageNet) validation images annotated with multi-class labels from "Evaluating Machine Accuracy on ImageNet", ICML, 2020. The multi-class labels were reviewed by a panel of experts extensively trained in the intricacies of fine-grained class distinctions in the ImageNet class hierarchy (see the paper for more details). Compared to the original labels, these expert-reviewed multi-class labels enable a more semantically coherent evaluation of accuracy.

Version 3.0.0 of this dataset contains more corrected labels from "When does dough become a bagel? Analyzing the remaining mistakes on ImageNet", as well as the 68-example ImageNet-Major (ImageNet-M) split under 'imagenet_m'.

Only 20,000 of the 50,000 ImageNet validation images have multi-label annotations. The set of multi-labels was first generated by a testbed of 67 trained ImageNet models, and then each individual model prediction was manually annotated by the experts as either correct (the label is correct for the image), wrong (the label is incorrect for the image), or unclear (no consensus was reached among the experts).
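As an illustration of the three annotation outcomes, here is a minimal sketch of how a predicted class index could be looked up against one image's label sets. The helper name `annotation_status` and the `"unreviewed"` outcome are assumptions made for this example and are not part of the dataset; a class that none of the testbed models predicted would simply not appear in any of the three sets.

```python
def annotation_status(pred, correct, wrong, unclear):
    """Return how class index `pred` was annotated for a single image.

    `correct`, `wrong` and `unclear` are the image's label collections
    (for example, lists of class indices). Hypothetical helper.
    """
    if pred in correct:
        return "correct"
    if pred in unclear:
        return "unclear"
    if pred in wrong:
        return "wrong"
    # The class was never proposed for this image, so it was never annotated.
    return "unreviewed"


# Toy label sets for one hypothetical image.
status = annotation_status(40, correct=[3, 7], wrong=[12], unclear=[40])
```

Under the lenient rule recommended further down, only the `"correct"` and `"unclear"` outcomes would count toward accuracy.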

Additionally, during annotation, the expert panel identified a set of problematic images. An image was problematic if it met any of the following criteria:

- The original ImageNet label (top-1 label) was incorrect or unclear
- The image was a drawing, painting, sketch, cartoon, or computer-rendered
- The image was excessively edited
- The image had inappropriate content

The problematic images are included in this dataset but should be ignored when computing multi-label accuracy. Additionally, since the initial set of 20,000 annotations is class-balanced, but the set of problematic images is not, we recommend computing the per-class accuracies and then averaging them. We also recommend counting a prediction as correct if it is marked as correct or unclear (i.e., being lenient with the unclear labels).
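Written as a formula, the recommended metric could look as follows. The notation is an assumption introduced here for illustration: $C$ is the set of classes, $I_c$ is the set of non-problematic images whose original ImageNet label is $c$, $\hat{y}_i$ is the predicted class for image $i$, and $\mathcal{L}^{\mathrm{correct}}_i$ and $\mathcal{L}^{\mathrm{unclear}}_i$ are the image's correct and unclear label sets.

```latex
\[
\mathrm{Acc}
  \;=\; \frac{1}{|C|} \sum_{c \in C} \frac{1}{|I_c|} \sum_{i \in I_c}
  \mathbf{1}\!\left[\hat{y}_i \in
    \mathcal{L}^{\mathrm{correct}}_i \cup \mathcal{L}^{\mathrm{unclear}}_i\right]
\]
```

The inner average is the per-class accuracy; the outer average gives every class equal weight, regardless of how many of its images were flagged as problematic.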

One possible way of doing this is with the following NumPy code:

import tensorflow_datasets as tfds

ds = tfds.load('imagenet2012_multilabel', split='validation')

# We assume that predictions is a dictionary from file_name to a class index between 0 and 999

num_correct_per_class = {}
num_images_per_class = {}

for example in ds:
    # We ignore all problematic images
    if example['is_problematic'].numpy():
        continue

    # The label of the image in ImageNet
    cur_class = example['original_label'].numpy()

    # If we haven't processed this class yet, set the counters to 0
    if cur_class not in num_correct_per_class:
        num_correct_per_class[cur_class] = 0
        assert cur_class not in num_images_per_class
        num_images_per_class[cur_class] = 0

    num_images_per_class[cur_class] += 1

    # Get the predictions for this image
    # (.numpy() on a string feature returns bytes, so the keys of
    # predictions must be bytes as well)
    cur_pred = predictions[example['file_name'].numpy()]

    # We count a prediction as correct if it is marked as correct or unclear
    # (i.e., we are lenient with the unclear labels)
    if (cur_pred in example['correct_multi_labels'].numpy()
            or cur_pred in example['unclear_multi_labels'].numpy()):
        num_correct_per_class[cur_class] += 1

# Check that we have collected accuracy data for each of the 1,000 classes
num_classes = 1000
assert len(num_correct_per_class) == num_classes
assert len(num_images_per_class) == num_classes

# Compute the per-class accuracies and then average them
final_avg = 0
for cid in range(num_classes):
    assert cid in num_correct_per_class
    assert cid in num_images_per_class
    final_avg += num_correct_per_class[cid] / num_images_per_class[cid]
final_avg /= num_classes
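The same procedure can be tried without downloading the dataset. The following is a minimal sketch that assumes the examples have already been converted to plain Python dictionaries with the same keys as the dataset features; the function name `balanced_lenient_accuracy` and the toy data are hypothetical. Unlike the code above, it averages over whichever classes are present instead of asserting that all 1,000 are covered.

```python
from collections import defaultdict


def balanced_lenient_accuracy(examples, predictions):
    """Average of per-class accuracies, skipping problematic images.

    A prediction counts as correct if it appears in the image's correct
    or unclear labels. Hypothetical helper for illustration only.
    """
    num_correct = defaultdict(int)
    num_images = defaultdict(int)
    for ex in examples:
        if ex["is_problematic"]:
            continue  # problematic images are ignored entirely
        cls = ex["original_label"]
        num_images[cls] += 1
        pred = predictions[ex["file_name"]]
        if pred in ex["correct_multi_labels"] or pred in ex["unclear_multi_labels"]:
            num_correct[cls] += 1
    if not num_images:
        raise ValueError("no non-problematic examples to evaluate")
    per_class = [num_correct[c] / num_images[c] for c in num_images]
    return sum(per_class) / len(per_class)


# Toy data: two classes, one problematic image (d.JPEG) that is skipped.
toy_examples = [
    {"file_name": "a.JPEG", "original_label": 0, "is_problematic": False,
     "correct_multi_labels": [0], "unclear_multi_labels": []},
    {"file_name": "b.JPEG", "original_label": 0, "is_problematic": False,
     "correct_multi_labels": [0, 5], "unclear_multi_labels": [9]},
    {"file_name": "c.JPEG", "original_label": 1, "is_problematic": False,
     "correct_multi_labels": [1], "unclear_multi_labels": [2]},
    {"file_name": "d.JPEG", "original_label": 1, "is_problematic": True,
     "correct_multi_labels": [1], "unclear_multi_labels": []},
]
toy_predictions = {"a.JPEG": 0, "b.JPEG": 3, "c.JPEG": 2, "d.JPEG": 7}
toy_accuracy = balanced_lenient_accuracy(toy_examples, toy_predictions)
```

In the toy data, class 0 scores 1 of 2 and class 1 scores 1 of 1 (the unclear label 2 is accepted), so the class-balanced result is 0.75.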

Homepage: https://github.com/modestyachts/evaluating_machine_accuracy_on_imagenet
Source code: tfds.datasets.imagenet2012_multilabel.Builder
Versions:
- 1.0.0: Initial release.
- 2.0.0: Fixed the ILSVRC2012_img_val.tar file.
- 3.0.0 (default): Corrected labels and ImageNet-M split.
Download size: 191.13 MiB
Dataset size: 2.50 GiB
Manual download instructions: This dataset requires you to download the source data manually into download_config.manual_dir (defaults to ~/tensorflow_datasets/downloads/manual/):
manual_dir should contain the ILSVRC2012_img_val.tar file. You need to register at http://www.image-net.org/download-images in order to get the link to download the dataset.
Auto-cached (documentation): No
Splits:

Split	Examples
`'imagenet_m'`	68
`'validation'`	20,000
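Because the source archive has to be placed manually, it can help to check for it before preparing the splits. The following is a minimal sketch using only the standard library; the helper name `find_validation_archive` is hypothetical, and the default directory is the one stated in the manual download instructions above.

```python
import os

ARCHIVE_NAME = "ILSVRC2012_img_val.tar"
# Default manual directory as stated in the download instructions.
DEFAULT_MANUAL_DIR = os.path.join("~", "tensorflow_datasets", "downloads", "manual")


def find_validation_archive(manual_dir=DEFAULT_MANUAL_DIR):
    """Return the path to the validation archive, or None if it is missing.

    Hypothetical helper: it only checks that the file exists and does not
    verify its contents.
    """
    path = os.path.join(os.path.expanduser(manual_dir), ARCHIVE_NAME)
    return path if os.path.isfile(path) else None
```

If the helper returns None, the archive still needs to be downloaded and copied into the manual directory.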

Feature structure:

FeaturesDict({
    'correct_multi_labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=1000)),
    'file_name': Text(shape=(), dtype=string),
    'image': Image(shape=(None, None, 3), dtype=uint8),
    'is_problematic': bool,
    'original_label': ClassLabel(shape=(), dtype=int64, num_classes=1000),
    'unclear_multi_labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=1000)),
    'wrong_multi_labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=1000)),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
correct_multi_labels	Sequence(ClassLabel)	(None,)	int64
file_name	Text		string
image	Image	(None, None, 3)	uint8
is_problematic	Tensor		bool
original_label	ClassLabel		int64
unclear_multi_labels	Sequence(ClassLabel)	(None,)	int64
wrong_multi_labels	Sequence(ClassLabel)	(None,)	int64

Supervised keys (see the as_supervised doc): ('image', 'correct_multi_labels')
Figure (tfds.show_examples):


Examples (tfds.as_dataframe):

Citation:

@article{shankar2019evaluating,
  title={Evaluating Machine Accuracy on ImageNet},
  author={Vaishaal Shankar* and Rebecca Roelofs* and Horia Mania and Alex Fang and Benjamin Recht and Ludwig Schmidt},
  journal={ICML},
  year={2020},
  note={\url{http://proceedings.mlr.press/v119/shankar20c.html} }
}
@article{ImageNetChallenge,
  title={ {ImageNet} large scale visual recognition challenge},
  author={Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause
   and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and
   Alexander C. Berg and Fei-Fei Li},
  journal={International Journal of Computer Vision},
  year={2015},
  note={\url{https://arxiv.org/abs/1409.0575} }
}
@inproceedings{ImageNet,
   author={Jia Deng and Wei Dong and Richard Socher and Li-Jia Li and Kai Li and Li Fei-Fei},
   booktitle={Conference on Computer Vision and Pattern Recognition (CVPR)},
   title={ {ImageNet}: A large-scale hierarchical image database},
   year={2009},
   note={\url{http://www.image-net.org/papers/imagenet_cvpr09.pdf} }
}
@article{vasudevan2022does,
  title={When does dough become a bagel? Analyzing the remaining mistakes on ImageNet},
  author={Vasudevan, Vijay and Caine, Benjamin and Gontijo-Lopes, Raphael and Fridovich-Keil, Sara and Roelofs, Rebecca},
  journal={arXiv preprint arXiv:2205.04596},
  year={2022}
}