• Description:

A real-world dataset of imitation-learning demonstrations for mobile manipulation tasks that are bimanual and require whole-body control. It contains 50 demonstrations for each task.

Split      Examples
'train'    276
• Feature structure:

FeaturesDict({
    'episode_metadata': FeaturesDict({
        'file_path': string,
    }),
    'steps': Dataset({
        'action': Tensor(shape=(16,), dtype=float32),
        'discount': Scalar(shape=(), dtype=float32),
        'is_first': bool,
        'is_last': bool,
        'is_terminal': bool,
        'language_instruction': string,
        'observation': FeaturesDict({
            'cam_high': Image(shape=(480, 640, 3), dtype=uint8),
            'cam_left_wrist': Image(shape=(480, 640, 3), dtype=uint8),
            'cam_right_wrist': Image(shape=(480, 640, 3), dtype=uint8),
            'state': Tensor(shape=(14,), dtype=float32),
        }),
        'reward': Scalar(shape=(), dtype=float32),
    }),
})
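The nesting above (one episode holding metadata plus a sequence of steps) can be walked with a small helper. The following is a minimal sketch, not the dataset's official loader: it assumes each episode behaves like a dict whose 'steps' entry is iterable, and that 'is_first' and 'is_last' mark the first and last step as in the usual RLDS convention. The name summarize_episode and the toy episode, including its file path, are hypothetical.

```python
def summarize_episode(episode):
    """Stream through one episode, counting steps and recording boundary flags."""
    num_steps = 0
    first_flag = None  # value of 'is_first' on the first step
    last_flag = None   # value of 'is_last' on the final step
    for step in episode["steps"]:
        if num_steps == 0:
            first_flag = bool(step["is_first"])
        last_flag = bool(step["is_last"])
        num_steps += 1
    return {
        "file_path": episode["episode_metadata"]["file_path"],
        "num_steps": num_steps,
        "starts_with_is_first": first_flag,
        "ends_with_is_last": last_flag,
    }


# Toy episode (hypothetical values) mirroring only the keys used above.
toy_episode = {
    "episode_metadata": {"file_path": "demo/episode_0"},
    "steps": [
        {"is_first": True, "is_last": False},
        {"is_first": False, "is_last": False},
        {"is_first": False, "is_last": True},
    ],
}

summary = summarize_episode(toy_episode)
print(summary)
```

With TensorFlow Datasets, the episodes would instead come from tfds.load called with this dataset's catalog name (not stated in this section) and split='train'. The helper iterates rather than materializing the step list because every step carries three 480x640 images.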
• Feature documentation:

Feature                              Class         Shape          Dtype    Description
episode_metadata                     FeaturesDict
episode_metadata/file_path           Tensor                       string
steps                                Dataset
steps/action                         Tensor        (16,)          float32
steps/discount                       Scalar                       float32
steps/is_first                       Tensor                       bool
steps/is_last                        Tensor                       bool
steps/is_terminal                    Tensor                       bool
steps/language_instruction           Tensor                       string
steps/observation                    FeaturesDict
steps/observation/cam_high           Image         (480, 640, 3)  uint8
steps/observation/cam_left_wrist     Image         (480, 640, 3)  uint8
steps/observation/cam_right_wrist    Image         (480, 640, 3)  uint8
steps/observation/state              Tensor        (14,)          float32
steps/reward                         Scalar                       float32
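The shapes and dtypes listed above determine how much memory one decoded step occupies, which matters when choosing batch sizes. The following is a rough sketch of that arithmetic. It counts only the numeric features, ignores the bool flags and the variable-length string, and assumes images are held uncompressed in memory. The names FEATURES, DTYPE_BYTES and nbytes are illustrative.

```python
from math import prod

# Bytes per element for the dtypes that appear in the feature table.
DTYPE_BYTES = {"uint8": 1, "float32": 4}

# (shape, dtype) pairs copied from the feature documentation.
FEATURES = {
    "steps/action": ((16,), "float32"),
    "steps/discount": ((), "float32"),
    "steps/observation/cam_high": ((480, 640, 3), "uint8"),
    "steps/observation/cam_left_wrist": ((480, 640, 3), "uint8"),
    "steps/observation/cam_right_wrist": ((480, 640, 3), "uint8"),
    "steps/observation/state": ((14,), "float32"),
    "steps/reward": ((), "float32"),
}


def nbytes(shape, dtype):
    """Size in bytes of a dense array; prod(()) is 1, so scalars count as one element."""
    return prod(shape) * DTYPE_BYTES[dtype]


per_feature = {name: nbytes(shape, dtype) for name, (shape, dtype) in FEATURES.items()}
step_bytes = sum(per_feature.values())

for name, size in per_feature.items():
    print(f"{name:36s} {size:>9,d} bytes")
print(f"{'total per step':36s} {step_bytes:>9,d} bytes")  # 2,764,928 bytes
```

Each camera frame is 480 x 640 x 3 = 921,600 bytes, so the three images account for 2,764,800 of the 2,764,928 bytes per step. The action, state, reward and discount together add only 128 bytes.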
• Citation:

@inproceedings{fu2024mobile,
  author = {Fu, Zipeng and Zhao, Tony Z. and Finn, Chelsea},
  title = {Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation},
  booktitle = {arXiv},
  year = {2024},
}