d4rl_mujoco_walker2d

Description:

D4RL is an open-source benchmark for offline reinforcement learning. It provides standardized environments and datasets for training and benchmarking algorithms.

The datasets follow the RLDS format to represent steps and episodes.

Additional documentation: Explore on Papers With Code
Config description: See more details about the task and its versions at https://github.com/rail-berkeley/d4rl/wiki/Tasks#gym
Homepage: https://sites.google.com/view/d4rl-anonymous
Source code: tfds.d4rl.d4rl_mujoco_walker2d.D4rlMujocoWalker2d
Versions:
- 1.0.0: Initial release.
- 1.1.0: Added is_last.
- 1.2.0 (default): Updated to take into account the next observation.
Supervised keys (see the as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Citation:

@misc{fu2020d4rl,
    title={D4RL: Datasets for Deep Data-Driven Reinforcement Learning},
    author={Justin Fu and Aviral Kumar and Ofir Nachum and George Tucker and Sergey Levine},
    year={2020},
    eprint={2004.07219},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
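
As a sketch of how the configs listed on this page might be loaded, the snippet below builds a config name and passes it to `tfds.load`. It assumes `tensorflow_datasets` is installed; `config_name` and `load_episodes` are hypothetical helper names, not part of TFDS. The import is deferred so that the name helper works without TensorFlow.

```python
DATASET = "d4rl_mujoco_walker2d"


def config_name(version: str, variant: str) -> str:
    """Build a TFDS name such as 'd4rl_mujoco_walker2d/v2-expert'."""
    return f"{DATASET}/{version}-{variant}"


def load_episodes(name: str, shuffle_files: bool = False):
    """Return the 'train' split as a dataset of episodes.

    Hypothetical wrapper around tfds.load. The first call downloads and
    prepares the data, so it needs network access and disk space.
    """
    import tensorflow_datasets as tfds  # deferred: only needed when loading

    return tfds.load(name, split="train", shuffle_files=shuffle_files)


print(config_name("v0", "expert"))
```

Calling `load_episodes(config_name("v0", "expert"))` would then return episodes whose `steps` field is a nested dataset of the per-step features documented below.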

d4rl_mujoco_walker2d/v0-expert (default config)

Download size: 78.41 MiB
Dataset size: 98.64 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'train'`	1,628

Feature structure:

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float32),
        'discount': float32,
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float32),
        'reward': float32,
    }),
})
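
To illustrate how the per-step fields above fit together, here is a minimal pure-Python sketch that treats an episode as a list of step dictionaries. The episode values are made up, `discounted_return` and `ended_by_termination` are hypothetical helpers, and summing every step's reward (including the final one) is an assumption that should be checked against the RLDS conventions for the last step.

```python
from typing import Dict, List

Step = Dict[str, object]


def discounted_return(steps: List[Step]) -> float:
    """Sum rewards weighted by the running product of earlier discounts."""
    total, weight = 0.0, 1.0
    for step in steps:
        total += weight * float(step["reward"])
        weight *= float(step["discount"])
    return total


def ended_by_termination(steps: List[Step]) -> bool:
    """True if the final step is flagged both is_last and is_terminal."""
    last = steps[-1]
    return bool(last["is_last"]) and bool(last["is_terminal"])


# Toy three-step episode; 'action' and 'observation' are omitted for brevity.
episode = [
    {"reward": 1.0, "discount": 0.5, "is_first": True, "is_last": False, "is_terminal": False},
    {"reward": 2.0, "discount": 0.5, "is_first": False, "is_last": False, "is_terminal": False},
    {"reward": 4.0, "discount": 0.0, "is_first": False, "is_last": True, "is_terminal": True},
]

print(discounted_return(episode))     # 1.0 + 0.5 * 2.0 + 0.25 * 4.0 = 3.0
print(ended_by_termination(episode))  # True
```

An episode whose last step has `is_last` set but `is_terminal` unset would be read here as truncated rather than terminated.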

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(6,)	float32
steps/discount	Tensor		float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float32
steps/reward	Tensor		float32

Examples (tfds.as_dataframe):

d4rl_mujoco_walker2d/v0-medium

Download size: 80.83 MiB
Dataset size: 99.72 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'train'`	5,315

Feature structure:

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float32),
        'discount': float32,
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(6,)	float32
steps/discount	Tensor		float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float32
steps/reward	Tensor		float32

Examples (tfds.as_dataframe):

d4rl_mujoco_walker2d/v0-medium-expert

Download size: 159.24 MiB
Dataset size: 198.36 MiB
Auto-cached (documentation): Only when shuffle_files=False (train)
Splits:

Split	Examples
`'train'`	6,943

Feature structure:

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float32),
        'discount': float32,
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(6,)	float32
steps/discount	Tensor		float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float32
steps/reward	Tensor		float32

Examples (tfds.as_dataframe):

d4rl_mujoco_walker2d/v0-mixed

Download size: 8.42 MiB
Dataset size: 10.06 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'train'`	501

Feature structure:

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float32),
        'discount': float32,
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(6,)	float32
steps/discount	Tensor		float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float32
steps/reward	Tensor		float32

Examples (tfds.as_dataframe):

d4rl_mujoco_walker2d/v0-random

Download size: 78.41 MiB
Dataset size: 112.04 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'train'`	50,988

Feature structure:

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float32),
        'discount': float32,
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(6,)	float32
steps/discount	Tensor		float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float32
steps/reward	Tensor		float32

Examples (tfds.as_dataframe):

d4rl_mujoco_walker2d/v1-expert

Download size: 143.06 MiB
Dataset size: 452.72 MiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	1,003

Feature structure:

FeaturesDict({
    'algorithm': string,
    'iteration': int32,
    'policy': FeaturesDict({
        'fc0': FeaturesDict({
            'bias': Tensor(shape=(256,), dtype=float32),
            'weight': Tensor(shape=(256, 17), dtype=float32),
        }),
        'fc1': FeaturesDict({
            'bias': Tensor(shape=(256,), dtype=float32),
            'weight': Tensor(shape=(256, 256), dtype=float32),
        }),
        'last_fc': FeaturesDict({
            'bias': Tensor(shape=(6,), dtype=float32),
            'weight': Tensor(shape=(6, 256), dtype=float32),
        }),
        'last_fc_log_std': FeaturesDict({
            'bias': Tensor(shape=(6,), dtype=float32),
            'weight': Tensor(shape=(6, 256), dtype=float32),
        }),
        'nonlinearity': string,
        'output_distribution': string,
    }),
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float32,
            'qpos': Tensor(shape=(9,), dtype=float32),
            'qvel': Tensor(shape=(9,), dtype=float32),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float32),
        'reward': float32,
    }),
})
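
The `policy` features above store the weights of a small fully connected network. The sketch below shows how such weights could be applied to a single observation in pure Python. It makes several assumptions the page does not confirm: weights are laid out as (output, input), the `nonlinearity` string denotes ReLU, and `last_fc` and `last_fc_log_std` produce the mean and log standard deviation of the action distribution. The weights used here are random placeholders with the documented shapes, not real policy weights.

```python
import random
from typing import Dict, List

Layer = Dict[str, list]


def linear(layer: Layer, x: List[float]) -> List[float]:
    """Compute weight @ x + bias for a weight of shape (out, in)."""
    return [
        sum(w * v for w, v in zip(row, x)) + b
        for row, b in zip(layer["weight"], layer["bias"])
    ]


def relu(x: List[float]) -> List[float]:
    return [max(0.0, v) for v in x]


def policy_forward(policy: Dict[str, Layer], observation: List[float]):
    """Return (mean, log_std) for one 17-dimensional observation."""
    hidden = relu(linear(policy["fc0"], observation))
    hidden = relu(linear(policy["fc1"], hidden))
    mean = linear(policy["last_fc"], hidden)
    log_std = linear(policy["last_fc_log_std"], hidden)
    return mean, log_std


def random_layer(rng: random.Random, out_dim: int, in_dim: int) -> Layer:
    """Placeholder layer with small random weights and zero bias."""
    return {
        "weight": [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)],
        "bias": [0.0] * out_dim,
    }


rng = random.Random(0)
policy = {
    "fc0": random_layer(rng, 256, 17),
    "fc1": random_layer(rng, 256, 256),
    "last_fc": random_layer(rng, 6, 256),
    "last_fc_log_std": random_layer(rng, 6, 256),
}
mean, log_std = policy_forward(policy, [0.0] * 17)
print(len(mean), len(log_std))
```

Before relying on a forward pass like this, read the stored `nonlinearity` and `output_distribution` strings from the data and adapt the activation and output handling accordingly.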

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
algorithm	Tensor		string
iteration	Tensor		int32
policy	FeaturesDict
policy/fc0	FeaturesDict
policy/fc0/bias	Tensor	(256,)	float32
policy/fc0/weight	Tensor	(256, 17)	float32
policy/fc1	FeaturesDict
policy/fc1/bias	Tensor	(256,)	float32
policy/fc1/weight	Tensor	(256, 256)	float32
policy/last_fc	FeaturesDict
policy/last_fc/bias	Tensor	(6,)	float32
policy/last_fc/weight	Tensor	(6, 256)	float32
policy/last_fc_log_std	FeaturesDict
policy/last_fc_log_std/bias	Tensor	(6,)	float32
policy/last_fc_log_std/weight	Tensor	(6, 256)	float32
policy/nonlinearity	Tensor		string
policy/output_distribution	Tensor		string
steps	Dataset
steps/action	Tensor	(6,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float32
steps/infos/qpos	Tensor	(9,)	float32
steps/infos/qvel	Tensor	(9,)	float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float32
steps/reward	Tensor		float32

Examples (tfds.as_dataframe):

d4rl_mujoco_walker2d/v1-medium

Download size: 144.23 MiB
Dataset size: 510.08 MiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	1,207

Feature structure:

FeaturesDict({
    'algorithm': string,
    'iteration': int32,
    'policy': FeaturesDict({
        'fc0': FeaturesDict({
            'bias': Tensor(shape=(256,), dtype=float32),
            'weight': Tensor(shape=(256, 17), dtype=float32),
        }),
        'fc1': FeaturesDict({
            'bias': Tensor(shape=(256,), dtype=float32),
            'weight': Tensor(shape=(256, 256), dtype=float32),
        }),
        'last_fc': FeaturesDict({
            'bias': Tensor(shape=(6,), dtype=float32),
            'weight': Tensor(shape=(6, 256), dtype=float32),
        }),
        'last_fc_log_std': FeaturesDict({
            'bias': Tensor(shape=(6,), dtype=float32),
            'weight': Tensor(shape=(6, 256), dtype=float32),
        }),
        'nonlinearity': string,
        'output_distribution': string,
    }),
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float32,
            'qpos': Tensor(shape=(9,), dtype=float32),
            'qvel': Tensor(shape=(9,), dtype=float32),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
algorithm	Tensor		string
iteration	Tensor		int32
policy	FeaturesDict
policy/fc0	FeaturesDict
policy/fc0/bias	Tensor	(256,)	float32
policy/fc0/weight	Tensor	(256, 17)	float32
policy/fc1	FeaturesDict
policy/fc1/bias	Tensor	(256,)	float32
policy/fc1/weight	Tensor	(256, 256)	float32
policy/last_fc	FeaturesDict
policy/last_fc/bias	Tensor	(6,)	float32
policy/last_fc/weight	Tensor	(6, 256)	float32
policy/last_fc_log_std	FeaturesDict
policy/last_fc_log_std/bias	Tensor	(6,)	float32
policy/last_fc_log_std/weight	Tensor	(6, 256)	float32
policy/nonlinearity	Tensor		string
policy/output_distribution	Tensor		string
steps	Dataset
steps/action	Tensor	(6,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float32
steps/infos/qpos	Tensor	(9,)	float32
steps/infos/qvel	Tensor	(9,)	float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float32
steps/reward	Tensor		float32

Examples (tfds.as_dataframe):

d4rl_mujoco_walker2d/v1-medium-expert

Download size: 286.69 MiB
Dataset size: 342.46 MiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	2,209

Feature structure:

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float32,
            'qpos': Tensor(shape=(9,), dtype=float32),
            'qvel': Tensor(shape=(9,), dtype=float32),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float32),
        'reward': float32,
    }),
})
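
The `infos` fields carry the simulator state (`qpos` and `qvel`, 9 values each) alongside the 17-dimensional `observation`. This page does not state how the two relate. The sketch below follows the layout used by the standard Gym Walker2d environment (positions without the root x coordinate, followed by velocities clipped to [-10, 10]), which is an assumption to verify against the data. `walker2d_observation` is a hypothetical helper and the input values are made up.

```python
from typing import List, Sequence


def walker2d_observation(
    qpos: Sequence[float], qvel: Sequence[float], vel_limit: float = 10.0
) -> List[float]:
    """Assumed layout: qpos[1:] (8 values) followed by clipped qvel (9 values)."""
    if len(qpos) != 9 or len(qvel) != 9:
        raise ValueError("expected 9 positions and 9 velocities")
    clipped = [max(-vel_limit, min(vel_limit, v)) for v in qvel]
    return list(qpos[1:]) + clipped


# Made-up state: the first two velocities fall outside the clipping range.
qpos = [0.5, 1.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
qvel = [12.0, -11.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
obs = walker2d_observation(qpos, qvel)
print(len(obs))
```

Comparing the output of such a helper with the stored `observation` for a few steps is a quick way to confirm or reject the assumed layout.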

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(6,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float32
steps/infos/qpos	Tensor	(9,)	float32
steps/infos/qvel	Tensor	(9,)	float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float32
steps/reward	Tensor		float32

Examples (tfds.as_dataframe):

d4rl_mujoco_walker2d/v1-medium-replay

Download size: 84.37 MiB
Dataset size: 52.10 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'train'`	1,093

Feature structure:

FeaturesDict({
    'algorithm': string,
    'iteration': int32,
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float64),
        'discount': float64,
        'infos': FeaturesDict({
            'action_log_probs': float64,
            'qpos': Tensor(shape=(9,), dtype=float64),
            'qvel': Tensor(shape=(9,), dtype=float64),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float64),
        'reward': float64,
    }),
})
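
Unlike most other configs on this page, this one stores its step features as float64. When mixing it with float32 configs, one option is to round the values to float32 precision first. The sketch below does this for a nested step dictionary using only the standard library; `to_float32` and `cast_step` are hypothetical helpers, and representing tensors as plain Python lists is an assumption made for illustration.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def cast_step(step: dict) -> dict:
    """Recursively round floats in a nested step dict; other types pass through."""
    out = {}
    for key, value in step.items():
        if isinstance(value, dict):
            out[key] = cast_step(value)
        elif isinstance(value, float):
            out[key] = to_float32(value)
        elif isinstance(value, list):
            out[key] = [to_float32(v) for v in value]
        else:
            out[key] = value  # e.g. the boolean is_first / is_last flags
    return out


# Made-up step with the nested 'infos' structure shown above.
step = {
    "reward": 0.1,
    "discount": 1.0,
    "is_last": False,
    "infos": {"action_log_probs": -0.3, "qpos": [0.1] * 9, "qvel": [0.0] * 9},
}
converted = cast_step(step)
print(converted["reward"] == 0.1)
```

With real tensors, a dtype cast in the array library in use would be the more natural choice; the standard-library version only demonstrates the precision change.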

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
algorithm	Tensor		string
iteration	Tensor		int32
steps	Dataset
steps/action	Tensor	(6,)	float64
steps/discount	Tensor		float64
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float64
steps/infos/qpos	Tensor	(9,)	float64
steps/infos/qvel	Tensor	(9,)	float64
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float64
steps/reward	Tensor		float64

Examples (tfds.as_dataframe):

d4rl_mujoco_walker2d/v1-full-replay

Download size: 278.95 MiB
Dataset size: 171.66 MiB
Auto-cached (documentation): Only when shuffle_files=False (train)
Splits:

Split	Examples
`'train'`	1,888

Feature structure:

FeaturesDict({
    'algorithm': string,
    'iteration': int32,
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float64),
        'discount': float64,
        'infos': FeaturesDict({
            'action_log_probs': float64,
            'qpos': Tensor(shape=(9,), dtype=float64),
            'qvel': Tensor(shape=(9,), dtype=float64),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float64),
        'reward': float64,
    }),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
algorithm	Tensor		string
iteration	Tensor		int32
steps	Dataset
steps/action	Tensor	(6,)	float64
steps/discount	Tensor		float64
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float64
steps/infos/qpos	Tensor	(9,)	float64
steps/infos/qvel	Tensor	(9,)	float64
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float64
steps/reward	Tensor		float64

Examples (tfds.as_dataframe):

d4rl_mujoco_walker2d/v1-random

Download size: 132.36 MiB
Dataset size: 192.06 MiB
Auto-cached (documentation): Only when shuffle_files=False (train)
Splits:

Split	Examples
`'train'`	48,790

Feature structure:

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float32,
            'qpos': Tensor(shape=(9,), dtype=float32),
            'qvel': Tensor(shape=(9,), dtype=float32),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(6,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float32
steps/infos/qpos	Tensor	(9,)	float32
steps/infos/qvel	Tensor	(9,)	float32
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float32
steps/reward	Tensor		float32

Examples (tfds.as_dataframe):

d4rl_mujoco_walker2d/v2-expert

Download size: 219.89 MiB
Dataset size: 452.16 MiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	1,001

Feature structure:

FeaturesDict({
    'algorithm': string,
    'iteration': int32,
    'policy': FeaturesDict({
        'fc0': FeaturesDict({
            'bias': Tensor(shape=(256,), dtype=float32),
            'weight': Tensor(shape=(256, 17), dtype=float32),
        }),
        'fc1': FeaturesDict({
            'bias': Tensor(shape=(256,), dtype=float32),
            'weight': Tensor(shape=(256, 256), dtype=float32),
        }),
        'last_fc': FeaturesDict({
            'bias': Tensor(shape=(6,), dtype=float32),
            'weight': Tensor(shape=(6, 256), dtype=float32),
        }),
        'last_fc_log_std': FeaturesDict({
            'bias': Tensor(shape=(6,), dtype=float32),
            'weight': Tensor(shape=(6, 256), dtype=float32),
        }),
        'nonlinearity': string,
        'output_distribution': string,
    }),
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float64,
            'qpos': Tensor(shape=(9,), dtype=float64),
            'qvel': Tensor(shape=(9,), dtype=float64),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
algorithm	Tensor		string
iteration	Tensor		int32
policy	FeaturesDict
policy/fc0	FeaturesDict
policy/fc0/bias	Tensor	(256,)	float32
policy/fc0/weight	Tensor	(256, 17)	float32
policy/fc1	FeaturesDict
policy/fc1/bias	Tensor	(256,)	float32
policy/fc1/weight	Tensor	(256, 256)	float32
policy/last_fc	FeaturesDict
policy/last_fc/bias	Tensor	(6,)	float32
policy/last_fc/weight	Tensor	(6, 256)	float32
policy/last_fc_log_std	FeaturesDict
policy/last_fc_log_std/bias	Tensor	(6,)	float32
policy/last_fc_log_std/weight	Tensor	(6, 256)	float32
policy/nonlinearity	Tensor		string
policy/output_distribution	Tensor		string
steps	Dataset
steps/action	Tensor	(6,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float64
steps/infos/qpos	Tensor	(9,)	float64
steps/infos/qvel	Tensor	(9,)	float64
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float32
steps/reward	Tensor		float32

Examples (tfds.as_dataframe):

d4rl_mujoco_walker2d/v2-full-replay

Download size: 271.91 MiB
Dataset size: 171.66 MiB
Auto-cached (documentation): Only when shuffle_files=False (train)
Splits:

Split	Examples
`'train'`	1,888

Feature structure:

FeaturesDict({
    'algorithm': string,
    'iteration': int32,
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float64,
            'qpos': Tensor(shape=(9,), dtype=float64),
            'qvel': Tensor(shape=(9,), dtype=float64),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
algorithm	Tensor		string
iteration	Tensor		int32
steps	Dataset
steps/action	Tensor	(6,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float64
steps/infos/qpos	Tensor	(9,)	float64
steps/infos/qvel	Tensor	(9,)	float64
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float32
steps/reward	Tensor		float32

Examples (tfds.as_dataframe):

d4rl_mujoco_walker2d/v2-medium

Download size: 221.50 MiB
Dataset size: 505.58 MiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	1,191

Feature structure:

FeaturesDict({
    'algorithm': string,
    'iteration': int32,
    'policy': FeaturesDict({
        'fc0': FeaturesDict({
            'bias': Tensor(shape=(256,), dtype=float32),
            'weight': Tensor(shape=(256, 17), dtype=float32),
        }),
        'fc1': FeaturesDict({
            'bias': Tensor(shape=(256,), dtype=float32),
            'weight': Tensor(shape=(256, 256), dtype=float32),
        }),
        'last_fc': FeaturesDict({
            'bias': Tensor(shape=(6,), dtype=float32),
            'weight': Tensor(shape=(6, 256), dtype=float32),
        }),
        'last_fc_log_std': FeaturesDict({
            'bias': Tensor(shape=(6,), dtype=float32),
            'weight': Tensor(shape=(6, 256), dtype=float32),
        }),
        'nonlinearity': string,
        'output_distribution': string,
    }),
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float64,
            'qpos': Tensor(shape=(9,), dtype=float64),
            'qvel': Tensor(shape=(9,), dtype=float64),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
algorithm	Tensor		string
iteration	Tensor		int32
policy	FeaturesDict
policy/fc0	FeaturesDict
policy/fc0/bias	Tensor	(256,)	float32
policy/fc0/weight	Tensor	(256, 17)	float32
policy/fc1	FeaturesDict
policy/fc1/bias	Tensor	(256,)	float32
policy/fc1/weight	Tensor	(256, 256)	float32
policy/last_fc	FeaturesDict
policy/last_fc/bias	Tensor	(6,)	float32
policy/last_fc/weight	Tensor	(6, 256)	float32
policy/last_fc_log_std	FeaturesDict
policy/last_fc_log_std/bias	Tensor	(6,)	float32
policy/last_fc_log_std/weight	Tensor	(6, 256)	float32
policy/nonlinearity	Tensor		string
policy/output_distribution	Tensor		string
steps	Dataset
steps/action	Tensor	(6,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float64
steps/infos/qpos	Tensor	(9,)	float64
steps/infos/qvel	Tensor	(9,)	float64
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float32
steps/reward	Tensor		float32

Examples (tfds.as_dataframe):

d4rl_mujoco_walker2d/v2-medium-expert

Download size: 440.79 MiB
Dataset size: 342.45 MiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	2,191

Feature structure:

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float64,
            'qpos': Tensor(shape=(9,), dtype=float64),
            'qvel': Tensor(shape=(9,), dtype=float64),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(6,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float64
steps/infos/qpos	Tensor	(9,)	float64
steps/infos/qvel	Tensor	(9,)	float64
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float32
steps/reward	Tensor		float32

Examples (tfds.as_dataframe):

d4rl_mujoco_walker2d/v2-medium-replay

Download size: 82.32 MiB
Dataset size: 52.10 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'train'`	1,093

Feature structure:

FeaturesDict({
    'algorithm': string,
    'iteration': int32,
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float64,
            'qpos': Tensor(shape=(9,), dtype=float64),
            'qvel': Tensor(shape=(9,), dtype=float64),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
algorithm	Tensor		string
iteration	Tensor		int32
steps	Dataset
steps/action	Tensor	(6,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float64
steps/infos/qpos	Tensor	(9,)	float64
steps/infos/qvel	Tensor	(9,)	float64
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float32
steps/reward	Tensor		float32

Examples (tfds.as_dataframe):

d4rl_mujoco_walker2d/v2-random

Download size: 206.10 MiB
Dataset size: 192.11 MiB
Auto-cached (documentation): Only when shuffle_files=False (train)
Splits:

Split	Examples
`'train'`	48,908

Feature structure:

FeaturesDict({
    'steps': Dataset({
        'action': Tensor(shape=(6,), dtype=float32),
        'discount': float32,
        'infos': FeaturesDict({
            'action_log_probs': float64,
            'qpos': Tensor(shape=(9,), dtype=float64),
            'qvel': Tensor(shape=(9,), dtype=float64),
        }),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'observation': Tensor(shape=(17,), dtype=float32),
        'reward': float32,
    }),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
steps	Dataset
steps/action	Tensor	(6,)	float32
steps/discount	Tensor		float32
steps/infos	FeaturesDict
steps/infos/action_log_probs	Tensor		float64
steps/infos/qpos	Tensor	(9,)	float64
steps/infos/qvel	Tensor	(9,)	float64
steps/is_first	Tensor		bool
steps/is_last	Tensor		bool
steps/is_terminal	Tensor		bool
steps/observation	Tensor	(17,)	float32
steps/reward	Tensor		float32

Examples (tfds.as_dataframe):
