TeleOpBench from Shanghai AI Laboratory introduces a simulator-centric benchmark for dual-arm dexterous teleoperation, comparing four teleop pipelines side by side. In their evaluation, the motion capture pipeline using Xsens Link and Xsens Metagloves by Manus delivered the highest task precision in the shortest time.
Teleoperation tools are hard to compare fairly. Different hardware, operators, and task setups make it difficult to know which interface delivers the best combination of speed and precision for dual-arm dexterous work.
The team at Shanghai AI Laboratory built TeleOpBench in NVIDIA Isaac Sim, running 30 manipulation tasks across three commercial robots (Unitree Robotics H1-2, Fourier GR1-T2, Unitree G1) under one protocol and metric suite, then mirrored a representative subset in a user study and on a physical dual-arm robot.
When building bimanual manipulation for embodied robot learning, teleoperation often provides the fastest path to high-quality demonstrations. The hard part is not only collecting data but also choosing a teleoperation interface you can trust and comparing the alternatives fairly.
TeleOpBench tackles exactly that gap by introducing a simulator-centric benchmark for dual-arm dexterous teleoperation, built to enable rigorous and reproducible comparisons across competing teleoperation pipelines.
Comparing teleoperation systems in the real world is messy. Hardware differences, environment setup, and task variability can easily dominate the result. TeleOpBench uses NVIDIA Isaac Sim to fix the environment and the robot embodiment, then evaluates teleoperation interfaces using consistent success criteria, with task success and completion time as the primary metrics.
To test whether the simulator results actually transfer, the authors mirror the experiments on a physical dual-arm platform and report strong alignment between simulation and real-world outcomes.
Robots: three commercially available humanoids spanning different scales and hand designs: Unitree H1-2, Fourier GR1-T2, and Unitree G1
Tasks: 30 bimanual manipulation environments, spanning pick and place, tool use, and collaborative manipulation, organized by complexity so that low-fidelity and high-fidelity interfaces can be compared meaningfully.
Interfaces: four representative pipelines implemented under one protocol:
Motion capture (Xsens Link and Xsens Metagloves by Manus)
VR device
Arm and hand exoskeleton
Monocular vision tracking
Evaluation protocol: from the full suite, TeleOpBench selects 10 representative tasks for a user study with four participants, recording success rates and completion times.
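The two primary metrics named above, success rate and completion time, can be aggregated per interface from a simple log of trials. The following is a minimal sketch, not TeleOpBench's own tooling: the `Trial` record, its field names, and the choice to average time over successful trials only are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trial:
    """One teleoperation attempt (hypothetical record layout)."""
    interface: str   # e.g. "mocap", "vr", "exoskeleton", "vision"
    task: str        # task identifier
    success: bool    # whether the task's success criterion was met
    seconds: float   # time from start to completion or abort


def summarize(trials):
    """Return success rate and mean completion time per interface.

    Assumption: completion time is averaged over successful trials only,
    so failed or aborted runs do not distort the time metric.
    """
    grouped = defaultdict(list)
    for trial in trials:
        grouped[trial.interface].append(trial)

    summary = {}
    for interface, rows in grouped.items():
        successes = [r for r in rows if r.success]
        summary[interface] = {
            "success_rate": len(successes) / len(rows),
            "mean_time_s": mean(r.seconds for r in successes) if successes else None,
        }
    return summary
```

Averaging time over successes only is one design choice among several; a study could instead cap failed runs at a timeout and include them.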
TeleOpBench is useful because it makes the trade space tangible.
Vision tracking reduces hardware requirements but can be sensitive to occlusion and frame rate limits.
VR improves wrist and hand tracking fidelity compared to monocular vision, and tends to land between vision and higher-fidelity systems on performance.
Exoskeletons offer kinematic alignment and direct mapping advantages, but have their own mobility limits depending on the design.
Inertial motion capture focuses on capturing body segment motion precisely and mapping it robustly to the robot.
TeleOpBench’s motion capture pipeline is built around Xsens Link and Xsens Metagloves by Manus.
Xsens Link uses 17 IMUs attached to human body segments for limb movement tracking.
Xsens Metagloves by Manus provide 20 degrees of freedom per hand, capturing detailed finger joint motion.
Across the reported evaluation, the paper highlights that the Xsens-based method excels in smoothness and motion precision, completing tasks accurately and typically with the least time cost.
TeleOpBench also reports that, when plotting completion-time curves, the inertial MoCap pipeline is the fastest, and that simulation and real-world performance show a strong positive correlation.
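One common way to quantify a "strong positive correlation" between simulated and real-world results is the Pearson coefficient over paired measurements, for example the completion time of each task in simulation and on the physical robot. The sketch below is an illustration only; the source does not state which correlation measure the authors used, and the function name and inputs are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two paired sequences.

    xs and ys might hold, for instance, per-task completion times measured
    in simulation and on the real robot (an assumed pairing, not the
    paper's documented procedure).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sums of centered products and centered squares.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

A value near 1 would mean that tasks and interfaces that are slow in simulation are also slow on hardware, which is the property that makes a simulator-centric benchmark useful for ranking interfaces.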
If you are choosing a teleoperation setup to scale demonstration collection, TeleOpBench offers a concrete way to reason about tradeoffs.
Throughput: faster completion times can translate to more demonstrations per operator hour.
Reliability: higher success rates reduce wasted runs and cleanup time in dataset building.
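The two points above combine into a single back-of-the-envelope figure: usable demonstrations per operator hour. The sketch below assumes a simple model in which every attempt takes the mean attempt time plus a fixed reset time, and only successful attempts count. The function name, the `reset_s` parameter, and the model itself are illustrative assumptions, not something reported by TeleOpBench.

```python
def successful_demos_per_hour(mean_attempt_s, success_rate, reset_s=0.0):
    """Estimate usable demonstrations collected per operator hour.

    mean_attempt_s: average duration of one attempt, in seconds
    success_rate:   fraction of attempts that succeed, between 0 and 1
    reset_s:        assumed fixed time to reset the scene between attempts
    """
    if mean_attempt_s <= 0:
        raise ValueError("mean_attempt_s must be positive")
    if not 0.0 <= success_rate <= 1.0:
        raise ValueError("success_rate must be between 0 and 1")
    if reset_s < 0:
        raise ValueError("reset_s must be non-negative")

    attempts_per_hour = 3600.0 / (mean_attempt_s + reset_s)
    return attempts_per_hour * success_rate


# Placeholder inputs, not measured values: 30 s attempts at 50% success.
print(successful_demos_per_hour(30.0, 0.5))  # 60.0
```

Under this model, halving completion time and doubling success rate have the same effect on yield when reset time is zero, so both metrics deserve equal attention when comparing interfaces.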
Project page: https://gorgeous2002.github.io/TeleOpBench/
Learn more about robot motion training with Xsens.