Mingyo Seo, H. Andy Park, Shenli Yuan, Yuke Zhu†, Luis Sentis†
Cross-embodiment imitation learning enables policies trained on specific embodiments to transfer across different robots, unlocking the potential for large-scale imitation learning that is both cost-effective and highly reusable. This paper presents LEGATO, a cross-embodiment imitation learning framework for visuomotor skill transfer across varied kinematic morphologies. We introduce a handheld gripper that unifies action and observation spaces, allowing tasks to be defined consistently across robots. Using this gripper, we train visuomotor policies via imitation learning, applying a motion-invariant transformation to compute the training loss. Gripper motions are then retargeted into high-degree-of-freedom whole-body motions using inverse kinematics for deployment across diverse embodiments. Our evaluations in simulation and real-robot experiments highlight the framework’s effectiveness in learning and transferring visuomotor skills across various robots.
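To give a rough sense of what a "motion-invariant transformation" for the training loss can look like, the sketch below compares predicted and demonstrated gripper targets after expressing both relative to the current gripper pose, so the comparison does not depend on the world frame. This is a simplified illustration in PyTorch, not the exact loss used in LEGATO, and the function names are placeholders.

```python
import torch

def relative_pose(T_ref, T):
    # Express pose T in the frame of T_ref (T_ref^-1 @ T); the result is
    # independent of the world frame. Inputs are (..., 4, 4) homogeneous
    # transforms.
    return torch.linalg.inv(T_ref) @ T

def motion_invariant_loss(T_cur, T_pred, T_demo):
    # Compare predicted and demonstrated gripper targets after expressing
    # both relative to the current gripper pose (illustrative only).
    rel_pred = relative_pose(T_cur, T_pred)
    rel_demo = relative_pose(T_cur, T_demo)
    pos_err = torch.linalg.norm(rel_pred[..., :3, 3] - rel_demo[..., :3, 3], dim=-1)
    rot_err = torch.linalg.norm(rel_pred[..., :3, :3] - rel_demo[..., :3, :3], dim=(-2, -1))
    return (pos_err + rot_err).mean()
```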
If you find our work useful in your research, please consider citing our paper (BibTeX entry below).
- Python 3.9.2 (recommended)
- Robosuite 1.4.1
- Robomimic 0.3.0
- PyTorch
- pytorch3d (we use functions from the original PyTorch3D repository)
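As a quick sanity check that the dependencies above are installed, you can query their versions; this snippet is not part of the codebase, and the distribution names are assumed to match the packages listed above.

```python
from importlib.metadata import version

# Assumed distribution names for the dependencies listed above.
for pkg in ("robosuite", "robomimic", "torch", "pytorch3d"):
    print(f"{pkg}: {version(pkg)}")
```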
We provide our demonstration dataset collected in simulation (link) and trained visuomotor policy models (link).
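A minimal way to inspect the downloaded dataset is sketched below, assuming it follows the robomimic HDF5 convention (a top-level `data` group with one subgroup per demonstration); the filename is a placeholder.

```python
import h5py

# Placeholder filename; replace with the path to the downloaded dataset.
with h5py.File("legato_demos.hdf5", "r") as f:
    demos = sorted(f["data"].keys())
    print(f"{len(demos)} demonstrations")
    first = f["data"][demos[0]]
    print("actions:", first["actions"].shape)
    for key in first["obs"]:
        print(f"obs/{key}:", first["obs"][key].shape)
```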
We are currently working on open-sourcing the scripts for deploying on real robots.
LEGATO is released under the MIT License. The flex_ik_solver component of this code was developed during Mingyo Seo's internship at the Boston Dynamics AI Institute in 2023 and 2024 and is provided "as is" without active maintenance. For questions, please contact Mingyo Seo or H. Andy Park.
@misc{seo2024legato,
title={LEGATO: Cross-Embodiment Imitation Using a Grasping Tool},
author={Seo, Mingyo and Park, H. Andy and Yuan, Shenli and Zhu, Yuke and Sentis, Luis},
year={2024},
eprint={2411.03682},
archivePrefix={arXiv},
primaryClass={cs.RO}
}