Thanks for releasing the code and checkpoints! We spent considerable effort evaluating the released RoboTwin-MeM checkpoint and would like to report our findings and ask a few questions.
Setup issues we had to fix first
(sharing in case they help others)
Assets: as in issue #2 (missing 003_cover asset), the download script pulls RoboTwin 2.0 base objects whose indices collide with RoboTwin-MeM objects (e.g. 009_kettle vs 009_toycar, 010_pen vs 010_mouse). We fixed this by overlaying the full assets from the HF dataset repo.
Embodiment: the training data metadata (lerobot_2.1/*/meta/info.json) says robot_type: "aloha", so we evaluated with embodiment: [aloha-agilex]. Note that assets/embodiments/aloha-agilex/curobo_left.yml and curobo_right.yml contain hardcoded absolute paths (/mnt/workspace/yangganlin/code/RMBench/...) that must be fixed manually.
Checkpoint config: the released config.yaml has framework.name: QwenOFT, which build_framework rejects; we changed it to EventVLA. The weights then load with zero missing/unexpected/shape-mismatched keys.
Websocket: the client's ping_interval=20 drops the connection when a single inference takes longer than 20 s; we set it to None, which disables keepalive pings.
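In case it saves someone time, here is a minimal sketch of how the asset-collision check and the path fix could be scripted. The function names are ours and hypothetical, not part of the repo; the sketch only assumes that asset folders are named NNN_name and that the curobo YAML files contain the absolute prefix quoted above as plain text.

```python
import re
from collections import defaultdict
from pathlib import Path

# Hypothetical helpers: only the NNN_name folder layout and the curobo_*.yml
# file names come from the report; everything else is illustrative.


def find_index_collisions(*asset_roots):
    """Return {index: sorted folder names} for indices shared by different object names."""
    names_by_index = defaultdict(set)
    for root in asset_roots:
        for entry in Path(root).iterdir():
            match = re.fullmatch(r"(\d{3})_(.+)", entry.name)
            if entry.is_dir() and match:
                names_by_index[match.group(1)].add(entry.name)
    # An index is a collision only if two *different* names claim it.
    return {idx: sorted(names) for idx, names in names_by_index.items() if len(names) > 1}


def rewrite_path_prefix(yml_path, old_prefix, new_prefix):
    """Replace a hardcoded absolute prefix in one file; return the number of replacements."""
    path = Path(yml_path)
    text = path.read_text()
    count = text.count(old_prefix)
    if count:
        path.write_text(text.replace(old_prefix, new_prefix))
    return count
```

We would call find_index_collisions on the RoboTwin 2.0 object folder and the RoboTwin-MeM object folder before overlaying, and rewrite_path_prefix on curobo_left.yml and curobo_right.yml with old_prefix="/mnt/workspace/yangganlin/code/RMBench" and new_prefix set to the local checkout root.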
Results
After these fixes the policy behaves qualitatively correctly: the ALOHA arms reach and press buttons, and keyframe events fire (conf≈1.0) exactly at press moments, with the scene matching the training videos. However:
All runs are clean (no crashes/disconnects). A typical failure: in press_button_keyframe the policy presses 2–5 times and then idles until timeout, i.e. it fails the memory-dependent counting — even though keyframes are being committed.
Questions
Is RoboTwin-MeM/final_model/pytorch_model.pt the exact model used for Table 2?
Could you share the exact evaluation configuration (task_config yaml, instruction_type, step limits, seed range)?
Is there any known issue where committed keyframes fail to be injected into the VLM at inference in the released code path (examples/RoboTwin-Mem/eval_files/)?
For reference, the runs above used demo_clean, instruction_type: unseen, and a step limit of 1000.
Happy to provide full logs. Thanks!