coco_captions

Description:

COCO is a large-scale object detection, segmentation, and captioning dataset. This version contains images, bounding boxes, labels, and captions from COCO 2014, split into the subsets defined by Karpathy and Li (2015). This effectively divides the original COCO 2014 validation data into new validation and test sets of 5,000 images each, plus a "restval" set containing the remaining roughly 30,000 images. All splits have caption annotations.

Additional documentation: Explore on Papers With Code
Config description: This version contains images, bounding boxes, and labels for the 2014 version.
Homepage: http://cocodataset.org/#home
Source code: tfds.object_detection.CocoCaptions
Versions:
- 1.1.0 (default): No release notes.
Download size: 37.61 GiB
Dataset size: 18.83 GiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'restval'`	30,504
`'test'`	5,000
`'train'`	82,783
`'val'`	5,000
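
The split sizes above can be checked against the description: `val`, `test`, and `restval` together should account for the original COCO 2014 validation data. The sketch below does that arithmetic and builds a split string that merges `train` with `restval`. Merging the two for training is a common convention with the Karpathy split, but it is an assumption here, not something this page prescribes. The `tfds.load` call is left as a comment because running it triggers the full download.

```python
# Split sizes copied from the table above.
SPLIT_SIZES = {"restval": 30_504, "test": 5_000, "train": 82_783, "val": 5_000}

# val + test + restval are carved out of the original COCO 2014 validation data.
original_val_size = sum(SPLIT_SIZES[name] for name in ("val", "test", "restval"))

# Assumption: train and restval are merged for training, written with the
# TFDS "a+b" split syntax.
TRAIN_SPLIT = "train+restval"
merged_train_size = sum(SPLIT_SIZES[name] for name in TRAIN_SPLIT.split("+"))

print(original_val_size)   # 40504
print(merged_train_size)   # 113287

# Usage (downloads about 37.61 GiB on first run):
# import tensorflow_datasets as tfds
# train_ds = tfds.load("coco_captions", split=TRAIN_SPLIT)
```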

Feature structure:

FeaturesDict({
    'captions': Sequence({
        'id': int64,
        'text': string,
    }),
    'image': Image(shape=(None, None, 3), dtype=uint8),
    'image/filename': Text(shape=(), dtype=string),
    'image/id': int64,
    'objects': Sequence({
        'area': int64,
        'bbox': BBoxFeature(shape=(4,), dtype=float32),
        'id': int64,
        'is_crowd': bool,
        'label': ClassLabel(shape=(), dtype=int64, num_classes=80),
    }),
})
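
Each example carries a variable-length sequence of captions, so captioning pipelines often flatten an example into one record per caption. The sketch below shows one way to do that. It assumes examples have been converted with `tfds.as_numpy`, so that caption text arrives as `bytes`. The helper name `caption_pairs` is hypothetical, and the sample dictionary is hand-made for illustration rather than taken from the dataset.

```python
def caption_pairs(example):
    """Return a list of (image_id, caption_text) pairs for one example.

    `example` is expected to follow the feature structure above, with
    string features delivered as bytes (as tfds.as_numpy does).
    """
    image_id = int(example["image/id"])
    pairs = []
    for raw in example["captions"]["text"]:
        # Decode bytes to str; leave already-decoded strings unchanged.
        text = raw.decode("utf-8") if isinstance(raw, bytes) else str(raw)
        pairs.append((image_id, text))
    return pairs


# Hand-made example with the same keys as the feature structure (not real data).
fake_example = {
    "image/id": 42,
    "captions": {"id": [1, 2], "text": [b"a dog on a beach", b"a dog running"]},
}
pairs = caption_pairs(fake_example)
print(pairs)

# Usage with the real dataset (triggers the download on first run):
# import tensorflow_datasets as tfds
# for ex in tfds.as_numpy(tfds.load("coco_captions", split="val").take(1)):
#     print(caption_pairs(ex))
```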

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
captions	Sequence
captions/id	Tensor		int64
captions/text	Tensor		string
image	Image	(None, None, 3)	uint8
image/filename	Text		string
image/id	Tensor		int64
objects	Sequence
objects/area	Tensor		int64
objects/bbox	BBoxFeature	(4,)	float32
objects/id	Tensor		int64
objects/is_crowd	Tensor		bool
objects/label	ClassLabel		int64
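
The `objects/bbox` feature has shape `(4,)` and dtype float32. This page does not spell out the coordinate convention. The sketch below relies on the general TFDS `BBoxFeature` convention of normalized `[ymin, xmin, ymax, xmax]` values in the range 0 to 1, and converts a box to pixel coordinates. The helper name `to_pixel_box` and the sample values are hypothetical.

```python
def to_pixel_box(bbox, height, width):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel units.

    Returns (x_min, y_min, x_max, y_max) as floats.
    Assumption: the box follows the TFDS BBoxFeature ordering and scaling.
    """
    ymin, xmin, ymax, xmax = (float(v) for v in bbox)
    return (xmin * width, ymin * height, xmax * width, ymax * height)


# Made-up box on a 200 x 400 (height x width) image.
box = to_pixel_box([0.25, 0.125, 0.75, 0.5], height=200, width=400)
print(box)  # (50.0, 50.0, 200.0, 150.0)
```

With a real example, `height` and `width` would come from the first two dimensions of the `image` array.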

Supervised keys (see as_supervised doc): None
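
Because no supervised keys are defined, the dataset has no built-in `(input, target)` pairing, and any such pairing has to be built by the caller. The sketch below shows one possible mapping from an example dictionary to an `(image, caption)` tuple. The helper name `to_image_caption` and the choice of the first caption as the target are assumptions made for illustration.

```python
def to_image_caption(example, caption_index=0):
    """Map one example dict to an (image, caption_text) tuple.

    Assumption: one caption per image is enough for the task at hand;
    `caption_index` selects which one to keep.
    """
    return example["image"], example["captions"]["text"][caption_index]


# Hand-made stand-in for a dataset example (not real data).
fake_example = {"image": "IMAGE_PLACEHOLDER", "captions": {"text": [b"first", b"second"]}}
image, caption = to_image_caption(fake_example)
print(caption)

# Usage with the real dataset (triggers the download on first run):
# import tensorflow_datasets as tfds
# ds = tfds.load("coco_captions", split="val").map(to_image_caption)
```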
Figure (tfds.show_examples):


Examples (tfds.as_dataframe):

Citation:

@article{DBLP:journals/corr/LinMBHPRDZ14,
  author    = {Tsung{-}Yi Lin and
               Michael Maire and
               Serge J. Belongie and
               Lubomir D. Bourdev and
               Ross B. Girshick and
               James Hays and
               Pietro Perona and
               Deva Ramanan and
               Piotr Doll{\'{a}}r and
               C. Lawrence Zitnick},
  title     = {Microsoft {COCO:} Common Objects in Context},
  journal   = {CoRR},
  volume    = {abs/1405.0312},
  year      = {2014},
  url       = {http://arxiv.org/abs/1405.0312},
  archivePrefix = {arXiv},
  eprint    = {1405.0312},
  timestamp = {Mon, 13 Aug 2018 16:48:13 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/LinMBHPRDZ14},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

@inproceedings{DBLP:conf/cvpr/KarpathyL15,
  author    = {Andrej Karpathy and
               Fei{-}Fei Li},
  title     = {Deep visual-semantic alignments for generating image
               descriptions},
  booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition,
               {CVPR} 2015, Boston, MA, USA, June 7-12, 2015},
  pages     = {3128--3137},
  publisher = {{IEEE} Computer Society},
  year      = {2015},
  url       = {https://doi.org/10.1109/CVPR.2015.7298932},
  doi       = {10.1109/CVPR.2015.7298932},
  timestamp = {Wed, 16 Oct 2019 14:14:50 +0200},
  biburl    = {https://dblp.org/rec/conf/cvpr/KarpathyL15.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}


coco_captions/2014 (default config)
