youtube_vis

Description:

YouTube-VIS is a video instance segmentation dataset. It contains 2,883 high-resolution YouTube videos, a per-pixel category label set covering 40 common objects such as people, animals, and vehicles, 4,883 unique video instances, and 131k high-quality manual annotations.

The YouTube-VIS dataset is split into 2,238 training videos, 302 validation videos, and 343 test videos.

No files were removed or altered during preprocessing.

Additional documentation: Explore on Papers With Code
Homepage: https://youtube-vos.org/dataset/vis/
Source code: tfds.video.youtube_vis.YoutubeVis
Versions:
- 1.0.0 (default): Initial release.
Download size: Unknown size
Manual download instructions: This dataset requires you to download the source data manually into download_config.manual_dir (defaults to ~/tensorflow_datasets/downloads/manual/):
Download all files for the 2019 version of the dataset (test_all_frames.zip, test.json, train_all_frames.zip, train.json, valid_all_frames.zip, valid.json) from the youtube-vis website and move them to ~/tensorflow_datasets/downloads/manual/.

Note that the dataset landing page is located at https://youtube-vos.org/dataset/vis/, and that it redirects you to a page on https://competitions.codalab.org where you can download the 2019 version of the dataset. You will need to create a codalab account to download the data. At the time of writing, you also need to bypass a "Connection not secure" warning when accessing codalab.
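
Before building the dataset, it can help to confirm that all six source files are in the manual directory. The following is a minimal sketch using only the Python standard library; the helper name `missing_manual_files` and the constants are illustrative and not part of TFDS.

```python
from pathlib import Path

# File names listed in the manual download instructions (2019 version).
REQUIRED_FILES = (
    "test_all_frames.zip",
    "test.json",
    "train_all_frames.zip",
    "train.json",
    "valid_all_frames.zip",
    "valid.json",
)

# Default manual directory stated in the instructions above.
DEFAULT_MANUAL_DIR = Path("~/tensorflow_datasets/downloads/manual").expanduser()


def missing_manual_files(manual_dir=DEFAULT_MANUAL_DIR):
    """Return the required file names that are not present in manual_dir."""
    manual_dir = Path(manual_dir)
    return [name for name in REQUIRED_FILES if not (manual_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_manual_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All six source files are in place.")
```

If you keep the files somewhere else, pass that folder to `missing_manual_files` and use the same folder as `manual_dir` when preparing the dataset.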

Auto-cached (documentation): No
Supervised keys (see as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Citation:

@article{DBLP:journals/corr/abs-1905-04804,
  author    = {Linjie Yang and
               Yuchen Fan and
               Ning Xu},
  title     = {Video Instance Segmentation},
  journal   = {CoRR},
  volume    = {abs/1905.04804},
  year      = {2019},
  url       = {http://arxiv.org/abs/1905.04804},
  archivePrefix = {arXiv},
  eprint    = {1905.04804},
  timestamp = {Tue, 28 May 2019 12:48:08 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-1905-04804.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

youtube_vis/full (default config)

Config description: The full-resolution version of the dataset, with all frames included, including those without labels.
Dataset size: 33.31 GiB
Splits:

Split	Examples
`'test'`	343
`'train'`	2,238
`'validation'`	302

Feature structure:

FeaturesDict({
    'metadata': FeaturesDict({
        'height': int32,
        'num_frames': int32,
        'video_name': string,
        'width': int32,
    }),
    'tracks': Sequence({
        'areas': Sequence(float32),
        'bboxes': Sequence(BBoxFeature(shape=(4,), dtype=float32)),
        'category': ClassLabel(shape=(), dtype=int64, num_classes=40),
        'frames': Sequence(int32),
        'is_crowd': bool,
        'segmentations': Video(Image(shape=(None, None, 1), dtype=uint8)),
    }),
    'video': Video(Image(shape=(None, None, 3), dtype=uint8)),
})
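
The structure above can be explored with a short loading sketch. It assumes `tensorflow_datasets` is installed and the manual files are already in place; the helper names `load_youtube_vis` and `describe_first_example` are illustrative, and only `tfds.load` and `tfds.download.DownloadConfig` come from TFDS itself.

```python
def load_youtube_vis(config="full", split="train", manual_dir=None):
    """Load one split of a youtube_vis config as a tf.data.Dataset."""
    # Imported lazily so this module can be imported without TFDS installed.
    import tensorflow_datasets as tfds

    kwargs = {}
    if manual_dir is not None:
        # Point TFDS at a non-default folder holding the manually downloaded files.
        kwargs["download_and_prepare_kwargs"] = {
            "download_config": tfds.download.DownloadConfig(manual_dir=manual_dir)
        }
    return tfds.load(f"youtube_vis/{config}", split=split, **kwargs)


def describe_first_example(dataset):
    """Print a few fields of the first example, following the feature structure."""
    for example in dataset.take(1):
        meta = example["metadata"]
        print("video_name:", meta["video_name"].numpy().decode("utf-8"))
        print("num_frames:", int(meta["num_frames"]))
        # Video features are laid out as (frames, height, width, channels).
        print("video shape:", example["video"].shape)
        print("track categories:", example["tracks"]["category"].numpy())
```

A typical call would be `describe_first_example(load_youtube_vis("full", "validation"))`. Note that the first call prepares the dataset, which can take a long time given the dataset sizes listed on this page.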

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
metadata	FeaturesDict
metadata/height	Tensor		int32
metadata/num_frames	Tensor		int32
metadata/video_name	Tensor		string
metadata/width	Tensor		int32
tracks	Sequence
tracks/areas	Sequence(Tensor)	(None,)	float32
tracks/bboxes	Sequence(BBoxFeature)	(None, 4)	float32
tracks/category	ClassLabel		int64
tracks/frames	Sequence(Tensor)	(None,)	int32
tracks/is_crowd	Tensor		bool
tracks/segmentations	Video(Image)	(None, None, None, 1)	uint8
video	Video(Image)	(None, None, None, 3)	uint8

youtube_vis/480_640_full

Config description: All images are bilinearly resized to 480 X 640, with all frames included.
Dataset size: 130.02 GiB
Splits:

Split	Examples
`'test'`	343
`'train'`	2,238
`'validation'`	302

Feature structure:

FeaturesDict({
    'metadata': FeaturesDict({
        'height': int32,
        'num_frames': int32,
        'video_name': string,
        'width': int32,
    }),
    'tracks': Sequence({
        'areas': Sequence(float32),
        'bboxes': Sequence(BBoxFeature(shape=(4,), dtype=float32)),
        'category': ClassLabel(shape=(), dtype=int64, num_classes=40),
        'frames': Sequence(int32),
        'is_crowd': bool,
        'segmentations': Video(Image(shape=(480, 640, 1), dtype=uint8)),
    }),
    'video': Video(Image(shape=(480, 640, 3), dtype=uint8)),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
metadata	FeaturesDict
metadata/height	Tensor		int32
metadata/num_frames	Tensor		int32
metadata/video_name	Tensor		string
metadata/width	Tensor		int32
tracks	Sequence
tracks/areas	Sequence(Tensor)	(None,)	float32
tracks/bboxes	Sequence(BBoxFeature)	(None, 4)	float32
tracks/category	ClassLabel		int64
tracks/frames	Sequence(Tensor)	(None,)	int32
tracks/is_crowd	Tensor		bool
tracks/segmentations	Video(Image)	(None, 480, 640, 1)	uint8
video	Video(Image)	(None, 480, 640, 3)	uint8

youtube_vis/480_640_only_frames_with_labels

Config description: All images are bilinearly resized to 480 X 640, with only labeled frames included.
Dataset size: 26.27 GiB
Splits:

Split	Examples
`'test'`	343
`'train'`	2,238
`'validation'`	302

Feature structure:

FeaturesDict({
    'metadata': FeaturesDict({
        'height': int32,
        'num_frames': int32,
        'video_name': string,
        'width': int32,
    }),
    'tracks': Sequence({
        'areas': Sequence(float32),
        'bboxes': Sequence(BBoxFeature(shape=(4,), dtype=float32)),
        'category': ClassLabel(shape=(), dtype=int64, num_classes=40),
        'frames': Sequence(int32),
        'is_crowd': bool,
        'segmentations': Video(Image(shape=(480, 640, 1), dtype=uint8)),
    }),
    'video': Video(Image(shape=(480, 640, 3), dtype=uint8)),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
metadata	FeaturesDict
metadata/height	Tensor		int32
metadata/num_frames	Tensor		int32
metadata/video_name	Tensor		string
metadata/width	Tensor		int32
tracks	Sequence
tracks/areas	Sequence(Tensor)	(None,)	float32
tracks/bboxes	Sequence(BBoxFeature)	(None, 4)	float32
tracks/category	ClassLabel		int64
tracks/frames	Sequence(Tensor)	(None,)	int32
tracks/is_crowd	Tensor		bool
tracks/segmentations	Video(Image)	(None, 480, 640, 1)	uint8
video	Video(Image)	(None, 480, 640, 3)	uint8

youtube_vis/only_frames_with_labels

Config description: Only frames with labels are included, at their native resolution.
Dataset size: 6.91 GiB
Splits:

Split	Examples
`'test'`	343
`'train'`	2,238
`'validation'`	302

Feature structure:

FeaturesDict({
    'metadata': FeaturesDict({
        'height': int32,
        'num_frames': int32,
        'video_name': string,
        'width': int32,
    }),
    'tracks': Sequence({
        'areas': Sequence(float32),
        'bboxes': Sequence(BBoxFeature(shape=(4,), dtype=float32)),
        'category': ClassLabel(shape=(), dtype=int64, num_classes=40),
        'frames': Sequence(int32),
        'is_crowd': bool,
        'segmentations': Video(Image(shape=(None, None, 1), dtype=uint8)),
    }),
    'video': Video(Image(shape=(None, None, 3), dtype=uint8)),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
metadata	FeaturesDict
metadata/height	Tensor		int32
metadata/num_frames	Tensor		int32
metadata/video_name	Tensor		string
metadata/width	Tensor		int32
tracks	Sequence
tracks/areas	Sequence(Tensor)	(None,)	float32
tracks/bboxes	Sequence(BBoxFeature)	(None, 4)	float32
tracks/category	ClassLabel		int64
tracks/frames	Sequence(Tensor)	(None,)	int32
tracks/is_crowd	Tensor		bool
tracks/segmentations	Video(Image)	(None, None, None, 1)	uint8
video	Video(Image)	(None, None, None, 3)	uint8

youtube_vis/full_train_split

Config description: The full-resolution version of the dataset, with all frames included, including those without labels. The validation and test splits are manufactured from the training data.
Dataset size: 26.09 GiB
Splits:

Split	Examples
`'test'`	200
`'train'`	1,838
`'validation'`	200
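
As a quick consistency check, the manufactured splits in this table should add up to the 2,238 training videos reported for the full dataset. The variable names below are illustrative; the numbers are copied from this page.

```python
# Split sizes copied from the tables in this document.
ORIGINAL_TRAIN_VIDEOS = 2238
TRAIN_SPLIT_CONFIG = {"train": 1838, "validation": 200, "test": 200}

# The manufactured splits partition the original training set.
total = sum(TRAIN_SPLIT_CONFIG.values())
assert total == ORIGINAL_TRAIN_VIDEOS
print(total)  # 2238
```

The same split sizes apply to every config whose name ends in `_train_split`.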

Feature structure:

FeaturesDict({
    'metadata': FeaturesDict({
        'height': int32,
        'num_frames': int32,
        'video_name': string,
        'width': int32,
    }),
    'tracks': Sequence({
        'areas': Sequence(float32),
        'bboxes': Sequence(BBoxFeature(shape=(4,), dtype=float32)),
        'category': ClassLabel(shape=(), dtype=int64, num_classes=40),
        'frames': Sequence(int32),
        'is_crowd': bool,
        'segmentations': Video(Image(shape=(None, None, 1), dtype=uint8)),
    }),
    'video': Video(Image(shape=(None, None, 3), dtype=uint8)),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
metadata	FeaturesDict
metadata/height	Tensor		int32
metadata/num_frames	Tensor		int32
metadata/video_name	Tensor		string
metadata/width	Tensor		int32
tracks	Sequence
tracks/areas	Sequence(Tensor)	(None,)	float32
tracks/bboxes	Sequence(BBoxFeature)	(None, 4)	float32
tracks/category	ClassLabel		int64
tracks/frames	Sequence(Tensor)	(None,)	int32
tracks/is_crowd	Tensor		bool
tracks/segmentations	Video(Image)	(None, None, None, 1)	uint8
video	Video(Image)	(None, None, None, 3)	uint8

youtube_vis/480_640_full_train_split

Config description: All images are bilinearly resized to 480 X 640, with all frames included. The validation and test splits are manufactured from the training data.
Dataset size: 101.57 GiB
Splits:

Split	Examples
`'test'`	200
`'train'`	1,838
`'validation'`	200

Feature structure:

FeaturesDict({
    'metadata': FeaturesDict({
        'height': int32,
        'num_frames': int32,
        'video_name': string,
        'width': int32,
    }),
    'tracks': Sequence({
        'areas': Sequence(float32),
        'bboxes': Sequence(BBoxFeature(shape=(4,), dtype=float32)),
        'category': ClassLabel(shape=(), dtype=int64, num_classes=40),
        'frames': Sequence(int32),
        'is_crowd': bool,
        'segmentations': Video(Image(shape=(480, 640, 1), dtype=uint8)),
    }),
    'video': Video(Image(shape=(480, 640, 3), dtype=uint8)),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
metadata	FeaturesDict
metadata/height	Tensor		int32
metadata/num_frames	Tensor		int32
metadata/video_name	Tensor		string
metadata/width	Tensor		int32
tracks	Sequence
tracks/areas	Sequence(Tensor)	(None,)	float32
tracks/bboxes	Sequence(BBoxFeature)	(None, 4)	float32
tracks/category	ClassLabel		int64
tracks/frames	Sequence(Tensor)	(None,)	int32
tracks/is_crowd	Tensor		bool
tracks/segmentations	Video(Image)	(None, 480, 640, 1)	uint8
video	Video(Image)	(None, 480, 640, 3)	uint8

youtube_vis/480_640_only_frames_with_labels_train_split

Config description: All images are bilinearly resized to 480 X 640, with only labeled frames included. The validation and test splits are manufactured from the training data.
Dataset size: 20.55 GiB
Splits:

Split	Examples
`'test'`	200
`'train'`	1,838
`'validation'`	200

Feature structure:

FeaturesDict({
    'metadata': FeaturesDict({
        'height': int32,
        'num_frames': int32,
        'video_name': string,
        'width': int32,
    }),
    'tracks': Sequence({
        'areas': Sequence(float32),
        'bboxes': Sequence(BBoxFeature(shape=(4,), dtype=float32)),
        'category': ClassLabel(shape=(), dtype=int64, num_classes=40),
        'frames': Sequence(int32),
        'is_crowd': bool,
        'segmentations': Video(Image(shape=(480, 640, 1), dtype=uint8)),
    }),
    'video': Video(Image(shape=(480, 640, 3), dtype=uint8)),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
metadata	FeaturesDict
metadata/height	Tensor		int32
metadata/num_frames	Tensor		int32
metadata/video_name	Tensor		string
metadata/width	Tensor		int32
tracks	Sequence
tracks/areas	Sequence(Tensor)	(None,)	float32
tracks/bboxes	Sequence(BBoxFeature)	(None, 4)	float32
tracks/category	ClassLabel		int64
tracks/frames	Sequence(Tensor)	(None,)	int32
tracks/is_crowd	Tensor		bool
tracks/segmentations	Video(Image)	(None, 480, 640, 1)	uint8
video	Video(Image)	(None, 480, 640, 3)	uint8

youtube_vis/only_frames_with_labels_train_split

Config description: Only frames with labels are included, at their native resolution. The validation and test splits are manufactured from the training data.
Dataset size: 5.46 GiB
Splits:

Split	Examples
`'test'`	200
`'train'`	1,838
`'validation'`	200

Feature structure:

FeaturesDict({
    'metadata': FeaturesDict({
        'height': int32,
        'num_frames': int32,
        'video_name': string,
        'width': int32,
    }),
    'tracks': Sequence({
        'areas': Sequence(float32),
        'bboxes': Sequence(BBoxFeature(shape=(4,), dtype=float32)),
        'category': ClassLabel(shape=(), dtype=int64, num_classes=40),
        'frames': Sequence(int32),
        'is_crowd': bool,
        'segmentations': Video(Image(shape=(None, None, 1), dtype=uint8)),
    }),
    'video': Video(Image(shape=(None, None, 3), dtype=uint8)),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
metadata	FeaturesDict
metadata/height	Tensor		int32
metadata/num_frames	Tensor		int32
metadata/video_name	Tensor		string
metadata/width	Tensor		int32
tracks	Sequence
tracks/areas	Sequence(Tensor)	(None,)	float32
tracks/bboxes	Sequence(BBoxFeature)	(None, 4)	float32
tracks/category	ClassLabel		int64
tracks/frames	Sequence(Tensor)	(None,)	int32
tracks/is_crowd	Tensor		bool
tracks/segmentations	Video(Image)	(None, None, None, 1)	uint8
video	Video(Image)	(None, None, None, 3)	uint8
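
The eight config names on this page follow a regular pattern: an optional `480_640_` prefix for resized frames, a base of `full` or `only_frames_with_labels`, and an optional `_train_split` suffix. A minimal sketch of that naming scheme follows; the helper `config_name` and its parameters are hypothetical conveniences, not part of TFDS.

```python
from itertools import product


def config_name(resized=False, only_labeled_frames=False, train_split=False):
    """Build a youtube_vis config name from the three options seen on this page."""
    prefix = "480_640_" if resized else ""
    base = "only_frames_with_labels" if only_labeled_frames else "full"
    suffix = "_train_split" if train_split else ""
    return f"youtube_vis/{prefix}{base}{suffix}"


# Enumerate every combination of the three options.
ALL_CONFIGS = sorted(
    config_name(resized, only_labeled, train_split)
    for resized, only_labeled, train_split in product((False, True), repeat=3)
)

if __name__ == "__main__":
    for name in ALL_CONFIGS:
        print(name)
```

The resulting string can be passed directly as the dataset name when loading, for example `config_name(resized=True, only_labeled_frames=True)` for the smaller fixed-size variant.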
