coco_captions

COCO is a large-scale object detection, segmentation, and captioning dataset. This version contains the images, bounding boxes, labels, and captions from COCO 2014, split into the subsets defined by Karpathy and Li (2015). In effect, the original COCO 2014 validation data is divided into a new 5,000-image validation set and a new 5,000-image test set, plus a "restval" set containing the remaining ~30k images. All splits have caption annotations.

| Split | Examples |
|---|---|
| 'restval' | 30,504 |
| 'test' | 5,000 |
| 'train' | 82,783 |
| 'val' | 5,000 |
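
As a quick sanity check on the split sizes, the sketch below adds up the three splits carved out of the original COCO 2014 validation data (val, test, and restval) and builds a combined split string. The helper names `combined_split`, `expected_examples`, and `load_captions` are hypothetical and only for illustration. Training on `train` plus `restval` is shown as one possible combination, not something this page prescribes. `load_captions` assumes `tensorflow_datasets` is installed and will download the data on first use, so it is defined here but not called.

```python
# Split sizes copied from the split table on this page.
SPLIT_SIZES = {"restval": 30_504, "test": 5_000, "train": 82_783, "val": 5_000}


def combined_split(*names):
    """Build a split string such as 'train+restval' (hypothetical helper)."""
    unknown = [n for n in names if n not in SPLIT_SIZES]
    if unknown:
        raise ValueError(f"unknown split(s): {unknown}")
    return "+".join(names)


def expected_examples(*names):
    """Total number of examples across the named splits."""
    return sum(SPLIT_SIZES[n] for n in names)


def load_captions(*names):
    """Load the requested splits; requires tensorflow_datasets (assumption)."""
    # Imported lazily so the helpers above work without the library installed.
    import tensorflow_datasets as tfds

    return tfds.load("coco_captions", split=combined_split(*names))


# Splits derived from the original COCO 2014 validation data.
print(expected_examples("val", "test", "restval"))  # 40504

# One possible training combination and its size.
print(combined_split("train", "restval"), expected_examples("train", "restval"))
```

The first figure, 40,504, is the number of images that the Karpathy and Li partition redistributes from the original validation data.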
  • Feature structure:
FeaturesDict({
    'captions': Sequence({
        'id': int64,
        'text': string,
    }),
    'image': Image(shape=(None, None, 3), dtype=uint8),
    'image/filename': Text(shape=(), dtype=string),
    'image/id': int64,
    'objects': Sequence({
        'area': int64,
        'bbox': BBoxFeature(shape=(4,), dtype=float32),
        'id': int64,
        'is_crowd': bool,
        'label': ClassLabel(shape=(), dtype=int64, num_classes=80),
    }),
})
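
To make the nesting above more concrete, here is a minimal sketch that works on a hand-written stand-in for one example. All values in `example` are made up for illustration, and the image itself is omitted. The sketch assumes that a `Sequence` of a dict arrives as a dict of parallel lists, and that `bbox` holds normalized `(ymin, xmin, ymax, xmax)` coordinates in the range 0 to 1. Both are assumptions about how the library decodes these features, so check them against real data before relying on them. The helpers `caption_pairs` and `bbox_to_pixels` are hypothetical.

```python
# A made-up example mirroring the feature structure (image omitted).
example = {
    "captions": {"id": [101, 102], "text": ["a dog on a sofa", "a brown dog resting"]},
    "image/filename": "example.jpg",  # hypothetical file name
    "image/id": 1,
    "objects": {
        "area": [76800],
        "bbox": [(0.25, 0.5, 0.75, 1.0)],  # assumed normalized (ymin, xmin, ymax, xmax)
        "id": [5],
        "is_crowd": [False],
        "label": [17],
    },
}


def caption_pairs(ex):
    """Pair each caption id with its text."""
    return list(zip(ex["captions"]["id"], ex["captions"]["text"]))


def bbox_to_pixels(bbox, height, width):
    """Scale a normalized (ymin, xmin, ymax, xmax) box to integer pixel coordinates."""
    ymin, xmin, ymax, xmax = bbox
    return (
        round(ymin * height),
        round(xmin * width),
        round(ymax * height),
        round(xmax * width),
    )


print(caption_pairs(example))
# Pretend the image is 480 pixels high and 640 pixels wide.
print(bbox_to_pixels(example["objects"]["bbox"][0], height=480, width=640))
```

With real data the image height and width would come from the shape of the decoded `image` tensor rather than from hard-coded numbers.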
  • Feature documentation:

| Feature | Class | Shape | Dtype | Description |
|---|---|---|---|---|
| | FeaturesDict | | | |
| captions | Sequence | | | |
| captions/id | Tensor | | int64 | |
| captions/text | Tensor | | string | |
| image | Image | (None, None, 3) | uint8 | |
| image/filename | Text | | string | |
| image/id | Tensor | | int64 | |
| objects | Sequence | | | |
| objects/area | Tensor | | int64 | |
| objects/bbox | BBoxFeature | (4,) | float32 | |
| objects/id | Tensor | | int64 | |
| objects/is_crowd | Tensor | | bool | |
| objects/label | ClassLabel | | int64 | |

  • Citation:
@article{DBLP:journals/corr/LinMBHPRDZ14,
  author    = {Tsung{-}Yi Lin and
               Michael Maire and
               Serge J. Belongie and
               Lubomir D. Bourdev and
               Ross B. Girshick and
               James Hays and
               Pietro Perona and
               Deva Ramanan and
               Piotr Doll{\'{a}}r and
               C. Lawrence Zitnick},
  title     = {Microsoft {COCO:} Common Objects in Context},
  journal   = {CoRR},
  volume    = {abs/1405.0312},
  year      = {2014},
  url       = {http://arxiv.org/abs/1405.0312},
  archivePrefix = {arXiv},
  eprint    = {1405.0312},
  timestamp = {Mon, 13 Aug 2018 16:48:13 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/LinMBHPRDZ14},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

@inproceedings{DBLP:conf/cvpr/KarpathyL15,
  author    = {Andrej Karpathy and
               Fei{-}Fei Li},
  title     = {Deep visual-semantic alignments for generating image
               descriptions},
  booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition,
               {CVPR} 2015, Boston, MA, USA, June 7-12, 2015},
  pages     = {3128--3137},
  publisher = {{IEEE} Computer Society},
  year      = {2015},
  url       = {https://doi.org/10.1109/CVPR.2015.7298932},
  doi       = {10.1109/CVPR.2015.7298932},
  timestamp = {Wed, 16 Oct 2019 14:14:50 +0200},
  biburl    = {https://dblp.org/rec/conf/cvpr/KarpathyL15.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

coco_captions/2014 (default config)