- Description:
The datasets were created with a SAC agent trained on the environment reward of MuJoCo locomotion tasks. These datasets are used in "What Matters for Adversarial Imitation Learning?" (Orsini et al., 2021).
The datasets follow the RLDS format to represent steps and episodes.
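As a minimal sketch of what the RLDS layout means in practice: each element of the `train` split is one episode, and its `steps` field is a nested sequence of step dictionaries. The helper names below (`dataset_name`, `load_episodes`, `episode_lengths`) are illustrative and not part of TFDS or RLDS; the only library call assumed is `tfds.load`, imported lazily so the rest of the snippet runs without TensorFlow Datasets installed.

```python
from typing import Any, Dict, Iterable, List

BUILDER = "locomotion"
DEFAULT_CONFIG = "ant_sac_1M_single_policy_stochastic"  # default config of this entry


def dataset_name(config: str = DEFAULT_CONFIG) -> str:
    """Builds the 'builder/config' name string passed to tfds.load."""
    return f"{BUILDER}/{config}"


def load_episodes(config: str = DEFAULT_CONFIG):
    """Loads the 'train' split; each element is one episode (hypothetical helper)."""
    import tensorflow_datasets as tfds  # lazy import: requires TFDS to be installed

    return tfds.load(dataset_name(config), split="train")


def episode_lengths(episodes: Iterable[Dict[str, Any]]) -> List[int]:
    """Counts the steps in each episode by iterating its nested 'steps' sequence."""
    return [sum(1 for _ in episode["steps"]) for episode in episodes]
```

With TFDS installed and eager execution enabled, `episode_lengths(load_episodes())` would download the default config on first use and return one step count per episode.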
Homepage: https://github.com/google-research/rlds
Source code: tfds.rlds.datasets.locomotion.Locomotion
Versions:
1.0.0 (default): Initial release.
Supervised keys (See as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Citation:
@article{orsini2021matters,
title={What Matters for Adversarial Imitation Learning?},
author={Orsini, Manu and Raichuk, Anton and Hussenot, L{\'e}onard and Vincent, Damien and Dadashi, Robert and Girgin, Sertan and Geist, Matthieu and Bachem, Olivier and Pietquin, Olivier and Andrychowicz, Marcin},
journal={International Conference on Machine Learning},
year={2021}
}
locomotion/ant_sac_1M_single_policy_stochastic (default config)
Config description: Dataset generated by a SAC agent trained for 1M steps for Ant.
Download size: 6.49 MiB
Dataset size: 23.02 MiB
Auto-cached (documentation): Yes
Splits:
Split | Examples |
---|---|
'train' | 50 |
- Feature structure:
FeaturesDict({
'steps': Dataset({
'action': Tensor(shape=(8,), dtype=float32),
'discount': float32,
'is_first': bool,
'is_last': bool,
'is_terminal': bool,
'observation': Tensor(shape=(111,), dtype=float32),
'reward': float32,
}),
})
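The step fields above can be combined to summarize an episode. The sketch below sums `reward` and uses `is_last` and `is_terminal` to tell a terminal ending from a truncated one. It assumes the RLDS convention that the reward on a terminal step is not meaningful, so that reward is skipped; whether the reward on a truncated last step is valid depends on how the data was recorded, and including it here is an assumption. `summarize_episode` is a hypothetical helper, not a TFDS or RLDS function.

```python
from typing import Any, Dict, Iterable, Tuple


def summarize_episode(steps: Iterable[Dict[str, Any]]) -> Tuple[float, int, str]:
    """Returns (undiscounted return, number of steps, ending type) for one episode."""
    total_reward = 0.0
    num_steps = 0
    ending = "incomplete"  # no step with is_last=True was seen
    for step in steps:
        num_steps += 1
        if bool(step["is_terminal"]):
            # Assumption: the reward stored on a terminal step is not meaningful.
            ending = "terminal"
            continue
        total_reward += float(step["reward"])
        if bool(step["is_last"]):
            # Last step without a terminal state: the episode was truncated.
            ending = "truncated"
    return total_reward, num_steps, ending
```

The function only needs an iterable of dictionaries, so it should work on `episode['steps'].as_numpy_iterator()` as well as on plain Python lists.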
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
steps | Dataset | | | |
steps/action | Tensor | (8,) | float32 | |
steps/discount | Tensor | | float32 | |
steps/is_first | Tensor | | bool | |
steps/is_last | Tensor | | bool | |
steps/is_terminal | Tensor | | bool | |
steps/observation | Tensor | (111,) | float32 | |
steps/reward | Tensor | | float32 | |
locomotion/hopper_sac_1M_single_policy_stochastic
Config description: Dataset generated by a SAC agent trained for 1M steps for Hopper.
Download size: 2.26 MiB
Dataset size: 2.62 MiB
Auto-cached (documentation): Yes
Splits:
Split | Examples |
---|---|
'train' | 50 |
- Feature structure:
FeaturesDict({
'steps': Dataset({
'action': Tensor(shape=(3,), dtype=float32),
'discount': float32,
'is_first': bool,
'is_last': bool,
'is_terminal': bool,
'observation': Tensor(shape=(11,), dtype=float32),
'reward': float32,
}),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
steps | Dataset | | | |
steps/action | Tensor | (3,) | float32 | |
steps/discount | Tensor | | float32 | |
steps/is_first | Tensor | | bool | |
steps/is_last | Tensor | | bool | |
steps/is_terminal | Tensor | | bool | |
steps/observation | Tensor | (11,) | float32 | |
steps/reward | Tensor | | float32 | |
locomotion/halfcheetah_sac_1M_single_policy_stochastic
Config description: Dataset generated by a SAC agent trained for 1M steps for HalfCheetah.
Download size: 4.49 MiB
Dataset size: 4.93 MiB
Auto-cached (documentation): Yes
Splits:
Split | Examples |
---|---|
'train' | 50 |
- Feature structure:
FeaturesDict({
'steps': Dataset({
'action': Tensor(shape=(6,), dtype=float32),
'discount': float32,
'is_first': bool,
'is_last': bool,
'is_terminal': bool,
'observation': Tensor(shape=(17,), dtype=float32),
'reward': float32,
}),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
steps | Dataset | | | |
steps/action | Tensor | (6,) | float32 | |
steps/discount | Tensor | | float32 | |
steps/is_first | Tensor | | bool | |
steps/is_last | Tensor | | bool | |
steps/is_terminal | Tensor | | bool | |
steps/observation | Tensor | (17,) | float32 | |
steps/reward | Tensor | | float32 | |
locomotion/walker2d_sac_1M_single_policy_stochastic
Config description: Dataset generated by a SAC agent trained for 1M steps for Walker2d.
Download size: 4.35 MiB
Dataset size: 4.91 MiB
Auto-cached (documentation): Yes
Splits:
Split | Examples |
---|---|
'train' | 50 |
- Feature structure:
FeaturesDict({
'steps': Dataset({
'action': Tensor(shape=(6,), dtype=float32),
'discount': float32,
'is_first': bool,
'is_last': bool,
'is_terminal': bool,
'observation': Tensor(shape=(17,), dtype=float32),
'reward': float32,
}),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
steps | Dataset | | | |
steps/action | Tensor | (6,) | float32 | |
steps/discount | Tensor | | float32 | |
steps/is_first | Tensor | | bool | |
steps/is_last | Tensor | | bool | |
steps/is_terminal | Tensor | | bool | |
steps/observation | Tensor | (17,) | float32 | |
steps/reward | Tensor | | float32 | |
locomotion/humanoid_sac_15M_single_policy_stochastic
Config description: Dataset generated by a SAC agent trained for 15M steps for Humanoid.
Download size: 192.78 MiB
Dataset size: 300.94 MiB
Auto-cached (documentation): No
Splits:
Split | Examples |
---|---|
'train' | 200 |
- Feature structure:
FeaturesDict({
'steps': Dataset({
'action': Tensor(shape=(17,), dtype=float32),
'discount': float32,
'is_first': bool,
'is_last': bool,
'is_terminal': bool,
'observation': Tensor(shape=(376,), dtype=float32),
'reward': float32,
}),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
steps | Dataset | | | |
steps/action | Tensor | (17,) | float32 | |
steps/discount | Tensor | | float32 | |
steps/is_first | Tensor | | bool | |
steps/is_last | Tensor | | bool | |
steps/is_terminal | Tensor | | bool | |
steps/observation | Tensor | (376,) | float32 | |
steps/reward | Tensor | | float32 | |
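The action and observation shapes listed for the five configs can be collected into one lookup table, which is handy for sizing a policy network or sanity-checking a loaded step. The sketch below copies the dimensions from the feature structures in this entry; `CONFIG_DIMS` and `step_matches_config` are hypothetical names introduced for illustration only.

```python
from typing import Any, Dict, Tuple

# (action_dim, observation_dim) per config, as listed in the feature structures.
CONFIG_DIMS: Dict[str, Tuple[int, int]] = {
    "ant_sac_1M_single_policy_stochastic": (8, 111),
    "hopper_sac_1M_single_policy_stochastic": (3, 11),
    "halfcheetah_sac_1M_single_policy_stochastic": (6, 17),
    "walker2d_sac_1M_single_policy_stochastic": (6, 17),
    "humanoid_sac_15M_single_policy_stochastic": (17, 376),
}


def step_matches_config(config: str, step: Dict[str, Any]) -> bool:
    """Checks that a step's action and observation lengths match the config."""
    action_dim, observation_dim = CONFIG_DIMS[config]
    return (
        len(step["action"]) == action_dim
        and len(step["observation"]) == observation_dim
    )
```

For example, a Hopper step should carry a 3-dimensional action and an 11-dimensional observation, so a step with any other lengths would be rejected by `step_matches_config`.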