ICRA 2026 Workshop

Beyond Teleoperation

Learning from Diverse Human and Simulation Data

Date: 5th June  ·  Location: Strauss 3
Sponsored by Lightwheel and Toyota Research Institute

Introduction

Teleoperation has driven progress in robot learning, but it is fundamentally limiting: control interfaces introduce latency and reduce degrees of freedom, producing demonstrations that lack the natural dexterity and diverse strategies humans exhibit in everyday manipulation. Human video and simulation each offer qualitatively different strengths. Unscripted human video captures whole-body coordination, tool use, and manipulation strategies that no teleoperation interface can reproduce. Simulation enables exploration of contact-rich dynamics, failure recovery, and edge cases at scale. Yet each carries its own challenge: embodiment mismatch for human video and the sim-to-real gap for simulation. The central question is not just how to get more data, but how to extract what makes these sources qualitatively richer and bridge the gaps that separate them from deployable robot skills.

This workshop focuses on methods that:

We bring together researchers working on learning from human data, simulation, and robot teleoperation to share insights and collaborate on building more general-purpose manipulation systems.

Confirmed Speakers

Panel Discussion

Our panel with the invited speakers (16:20–17:20) will cover questions such as:

Workshop Schedule

Start Time  End Time  Event
08:30       09:00     Welcome (Organizers)
09:00       09:30     Talk 1: Jitendra Malik
09:30       10:00     Talk 2: Edward Johns
10:00       11:00     Break + Poster Session
11:00       11:30     Talk 3: Danfei Xu
11:30       12:00     Talk 4: Karen Liu
12:00       12:30     Talk 5: Katerina Fragkiadaki
12:30       13:30     Break
13:30       14:00     Talk 6: Yue Wang
14:00       14:30     Talk 7: Roberto Martín-Martín
14:30       15:00     Spotlight Talks (4 papers)
15:00       16:00     Break + Poster Session
16:00       16:20     Sponsor Talk: Steve Xie (Lightwheel)
16:20       17:20     Panel Discussion
17:20       17:30     Closing Remarks

Accepted Papers

All accepted papers will be presented as posters across two sessions. Four papers were also selected for spotlight talks.

Spotlight Talks — 14:30–15:00

  1. MolmoB0T: Large-Scale Simulation Enables Zero-Shot Manipulation
  2. UniDex-ViTac: Learning Unified Visuo-Tactile Dexterous Manipulation Policy from Human Video Data
  3. DemoDiffusion: One-Shot Human Imitation using Pre-trained Diffusion Policy
  4. Humanoid Bimanual Dexterous Manipulation Driven by Egocentric Video

Poster Session 1 — 10:00–11:00

  1. Reconstructing Hand-Held Objects in 3D from Images and Videos
  2. MotionTrans: Human VR Data Enable Motion-Level Learning for Robotic Manipulation Policies
  3. One-Shot Learning of Manipulation from RGB-D Videos via Object-Centric Interaction Reasoning
  4. Humanoid Bimanual Dexterous Manipulation Driven by Egocentric Video
  5. YUBI: Yielding Universal Bidigital Interface for Bimanual Dexterous Manipulation at Scale
  6. Dex4D: Task-Agnostic Point Track Policy for Sim-to-Real Dexterous Manipulation
  7. Tune to Learn: How Controller Gains Affect Robot Policy Learning
  8. CRAFT: Video Diffusion for Bimanual Robot Data Generation
  9. Whole-Body Mobile Manipulation using Offline Reinforcement Learning on Sub-optimal Controllers
  10. Point Bridge: 3D Representations for Cross Domain Policy Learning
  11. IFG: Internet-Scale Guidance for Functional Grasping Generation
  12. MolmoB0T: Large-Scale Simulation Enables Zero-Shot Manipulation
  13. HumanoidMimicGen: Data Generation for Loco-Manipulation via Whole-Body Planning and Adaptation
  14. X-Diffusion: Training Diffusion Policies on Cross-Embodiment Human Demonstrations

Poster Session 2 — 15:00–16:00

  1. UniDex-ViTac: Learning Unified Visuo-Tactile Dexterous Manipulation Policy from Human Video Data
  2. PHABS: A Handheld Haptic Device for Force-Annotated Bimanual Demonstration Data
  3. Learning Whole-Body Humanoid Locomotion via Motion Generation and Motion Tracking
  4. Learning Quadruped Locomotion from Casual Videos
  5. HOMimic: Distilling Manipulation Trajectories from Human Videos via Multi-Stage Interaction Reasoning and Taxonomy-Aware Retargeting
  6. Few-Shot Learning of Tool-Use Skills with Proximity and Tactile Sensing
  7. Object-Centric Reward Learning from Action-Free Videos for Long-Horizon Manipulation Beyond Teleoperation
  8. Semantic–Geometric Task Representations for Bimanual Manipulation from Human Demonstrations to Robot Action Planning
  9. MobileEgo Anywhere: Open Infrastructure for Long-Horizon Egocentric Data on Commodity Hardware
  10. Learning Sim-Grounded Policies for Bimanual Rope Manipulation from Human Teleoperation Data
  11. DemoDiffusion: One-Shot Human Imitation using Pre-trained Diffusion Policy
  12. Overcoming Distribution Shifts with Autonomous Embodied Data Collection
  13. Dream2Flow: Bridging Video Generation and Open-World Manipulation with 3D Object Flow
  14. UniLatent: Cross-Embodiment Transfer via Latent Observation Alignment
  15. Learning Structured Policies for General Humanoid Loco-Manipulation

Organizers

Contact

For inquiries, reach us at icra-beyond-teleop@googlegroups.com.