coco_captions

Description:

COCO is a large-scale object detection, segmentation, and captioning dataset. This version contains images, bounding boxes, labels, and captions from COCO 2014, split into the subsets defined by Karpathy and Li (2015). This effectively divides the original COCO 2014 validation data into new validation and test sets of 5,000 images each, plus a "restval" set containing the remaining ~30k images. All splits have caption annotations.

Additional documentation: Explore on Papers With Code
Config description: This version contains images, bounding boxes and labels for the 2014 version.
Homepage: http://cocodataset.org/#home
Source code: tfds.object_detection.CocoCaptions
Versions:
- 1.1.0 (default): No release notes.
Download size: 37.61 GiB
Dataset size: 18.83 GiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'restval'`	30,504
`'test'`	5,000
`'train'`	82,783
`'val'`	5,000
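
The split sizes can be cross-checked with a little arithmetic. The sketch below copies the counts from the table and confirms that 'val', 'test' and 'restval' together account for the original COCO 2014 validation set. Merging 'train' with 'restval' for training is an assumption here (a common convention for the Karpathy splits, not something this page prescribes), and `load_training_data` is a hypothetical helper name.

```python
# Split sizes copied from the table above.
SPLIT_SIZES = {"restval": 30_504, "test": 5_000, "train": 82_783, "val": 5_000}

# 'val', 'test' and 'restval' together partition the original COCO 2014
# validation set, so their sizes add up to its image count.
original_val_size = sum(SPLIT_SIZES[name] for name in ("val", "test", "restval"))

# Assumption: train on 'train' plus 'restval'. TFDS split strings can be
# joined with '+' to merge splits.
TRAIN_SPLIT = "train+restval"
train_size = sum(SPLIT_SIZES[name] for name in TRAIN_SPLIT.split("+"))


def load_training_data():
    """Load the merged training split (downloads the dataset on first use)."""
    import tensorflow_datasets as tfds  # imported lazily: only needed here
    return tfds.load("coco_captions", split=TRAIN_SPLIT)


print(original_val_size)  # 40504
print(train_size)         # 113287
```

Calling `load_training_data()` triggers the full download listed above, so the arithmetic is kept separate from the loader.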

Feature structure:

FeaturesDict({
    'captions': Sequence({
        'id': int64,
        'text': string,
    }),
    'image': Image(shape=(None, None, 3), dtype=uint8),
    'image/filename': Text(shape=(), dtype=string),
    'image/id': int64,
    'objects': Sequence({
        'area': int64,
        'bbox': BBoxFeature(shape=(4,), dtype=float32),
        'id': int64,
        'is_crowd': bool,
        'label': ClassLabel(shape=(), dtype=int64, num_classes=80),
    }),
})
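
As a minimal sketch of how this structure looks in practice, the helper below summarizes one example after conversion with `tfds.as_numpy`. It assumes the usual TFDS behaviour that a `Sequence` of a dict becomes a dict of per-field arrays and that `Text` features arrive as bytes. `describe_example`, `iter_examples` and the hand-made `fake_example` are hypothetical and exist only for illustration; the values in `fake_example` are placeholders, not real COCO data.

```python
def describe_example(example):
    """Summarize one example shaped like the FeaturesDict above."""
    filename = example["image/filename"]
    if isinstance(filename, bytes):  # Text features are returned as bytes
        filename = filename.decode("utf-8")
    is_crowd = list(example["objects"]["is_crowd"])
    return {
        "image_id": int(example["image/id"]),
        "filename": filename,
        "num_captions": len(example["captions"]["text"]),
        "num_objects": len(is_crowd),
        "num_crowd_objects": sum(bool(flag) for flag in is_crowd),
    }


def iter_examples(split="val"):
    """Yield NumPy-style examples from TFDS (downloads data on first use)."""
    import tensorflow_datasets as tfds  # imported lazily: only needed here
    yield from tfds.as_numpy(tfds.load("coco_captions", split=split))


# Hand-made stand-in with placeholder values, so the helper can be tried
# without downloading the dataset.
fake_example = {
    "image/id": 0,
    "image/filename": b"placeholder.jpg",
    "captions": {"id": [1, 2], "text": [b"first placeholder", b"second placeholder"]},
    "objects": {
        "area": [100, 200, 300],
        "bbox": [[0.0, 0.0, 0.5, 0.5]] * 3,
        "id": [10, 11, 12],
        "is_crowd": [False, True, False],
        "label": [0, 1, 2],
    },
}

summary = describe_example(fake_example)
print(summary)
```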

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
captions	Sequence
captions/id	Tensor		int64
captions/text	Tensor		string
image	Image	(None, None, 3)	uint8
image/filename	Text		string
image/id	Tensor		int64
objects	Sequence
objects/area	Tensor		int64
objects/bbox	BBoxFeature	(4,)	float32
objects/id	Tensor		int64
objects/is_crowd	Tensor		bool
objects/label	ClassLabel		int64
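
The `objects/bbox` entry holds four float32 values per object. TFDS documents `BBoxFeature` coordinates as normalized to the image size and ordered `[ymin, xmin, ymax, xmax]`; the sketch below relies on that convention to recover pixel coordinates. `bbox_to_pixels` is a hypothetical helper, and the sample box and image size are made up for illustration.

```python
def bbox_to_pixels(bbox, height, width):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel corners.

    Returns (left, top, right, bottom) as integers, the order most drawing
    libraries expect.
    """
    ymin, xmin, ymax, xmax = (float(v) for v in bbox)
    return (
        round(xmin * width),
        round(ymin * height),
        round(xmax * width),
        round(ymax * height),
    )


# Made-up box on a made-up 480x640 (height x width) image.
pixel_box = bbox_to_pixels([0.25, 0.5, 0.75, 1.0], height=480, width=640)
print(pixel_box)  # (320, 120, 640, 360)
```

The image height and width are not stored as separate features, so in real use they would come from the decoded image's shape.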

Supervised keys (see as_supervised doc): None
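
Because no supervised keys are defined, `as_supervised=True` is not expected to yield `(input, label)` tuples for this dataset, so any pairing has to be built by hand. One possible approach, sketched here under the assumption that each caption should become its own training pair, is to expand an example into one `(image, caption)` tuple per caption. `to_caption_pairs` is a hypothetical helper and the sample values are placeholders.

```python
def to_caption_pairs(example):
    """Expand one example into a list of (image, caption_text) pairs."""
    image = example["image"]
    pairs = []
    for text in example["captions"]["text"]:
        if isinstance(text, bytes):  # string features are returned as bytes
            text = text.decode("utf-8")
        pairs.append((image, text))
    return pairs


# Placeholder example: a string stands in for the uint8 image array.
sample = {
    "image": "IMAGE_ARRAY_PLACEHOLDER",
    "captions": {"id": [1, 2], "text": [b"caption one", b"caption two"]},
}

pairs = to_caption_pairs(sample)
print(len(pairs))   # 2
print(pairs[0][1])  # caption one
```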
Figure (tfds.show_examples): Visualization

Examples (tfds.as_dataframe):

Citation:

@article{DBLP:journals/corr/LinMBHPRDZ14,
  author    = {Tsung{-}Yi Lin and
               Michael Maire and
               Serge J. Belongie and
               Lubomir D. Bourdev and
               Ross B. Girshick and
               James Hays and
               Pietro Perona and
               Deva Ramanan and
               Piotr Doll{\'{a}}r and
               C. Lawrence Zitnick},
  title     = {Microsoft {COCO:} Common Objects in Context},
  journal   = {CoRR},
  volume    = {abs/1405.0312},
  year      = {2014},
  url       = {http://arxiv.org/abs/1405.0312},
  archivePrefix = {arXiv},
  eprint    = {1405.0312},
  timestamp = {Mon, 13 Aug 2018 16:48:13 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/LinMBHPRDZ14},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

@inproceedings{DBLP:conf/cvpr/KarpathyL15,
  author    = {Andrej Karpathy and
               Fei{-}Fei Li},
  title     = {Deep visual-semantic alignments for generating image
               descriptions},
  booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition,
               {CVPR} 2015, Boston, MA, USA, June 7-12, 2015},
  pages     = {3128--3137},
  publisher = {{IEEE} Computer Society},
  year      = {2015},
  url       = {https://doi.org/10.1109/CVPR.2015.7298932},
  doi       = {10.1109/CVPR.2015.7298932},
  timestamp = {Wed, 16 Oct 2019 14:14:50 +0200},
  biburl    = {https://dblp.org/rec/conf/cvpr/KarpathyL15.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

coco_captions/2014 (default config)