Show HN: Scaling up robotic data collection with AI enhanced teleoperation

TLDR: I am using AI (and more) to make robotic teleoperation faster and sustainable over long periods, enabling large-scale collection of real robotic data for robotic foundation models.

We are probably 5-6 orders of magnitude short of the real robotic data we will need to train a foundation model for robotics, so how do we get it? I believe simulation or video can be a complement, but there is no substitute for a large amount of real robotic data.

I’ve been exploring approaches to scale robotic teleoperation, which has traditionally been relegated to slow, high-value use cases (nuclear decommissioning, healthcare). Here’s a short video from a raw testing session (it requires a lot of explanation!):

https://youtu.be/QYJNJj8m8Hg

What is happening here?

First of all, this is true robotic teleoperation (people often confuse controlling a robot within line of sight with teleoperation): I am controlling a robotic arm via a VR teleoperation setup without wearing the headset, to improve ergonomics, while watching camera feeds instead. It runs over wifi, with a simulated 300 ms latency plus 10 ms jitter (an international round-trip latency, say UK to Australia).
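
To make the latency numbers concrete, here is a minimal sketch (my own illustration in Python, not the setup used in the video) of how one might inject a fixed base latency plus random jitter into the operator’s command stream; the class and parameter names are assumptions.

```python
import heapq
import itertools
import random
import time

class LatencySimulator:
    """Delays each outgoing teleoperation command by a base latency plus
    random jitter, roughly emulating a long international round trip."""

    def __init__(self, base_latency_s=0.300, jitter_s=0.010):
        self.base_latency_s = base_latency_s
        self.jitter_s = jitter_s
        self._queue = []               # min-heap of (release_time, seq, command)
        self._seq = itertools.count()  # tiebreaker so commands never compare directly

    def send(self, command):
        # Schedule the command for delivery after base latency + jitter.
        delay = self.base_latency_s + random.uniform(0.0, self.jitter_s)
        heapq.heappush(self._queue, (time.monotonic() + delay, next(self._seq), command))

    def poll(self):
        # Return all commands whose simulated arrival time has passed.
        now = time.monotonic()
        ready = []
        while self._queue and self._queue[0][0] <= now:
            ready.append(heapq.heappop(self._queue)[2])
        return ready
```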

On the right, a pure teleoperation run is shown. Disregard the weird “dragging” movements; they come from a drag-and-drop mechanism I built to let the operator move their own arm to a more favorable position without moving the robotic arm. Some of the core issues with affordable remote teleoperation are reduced spatial 3D awareness, the human-robot embodiment gap, and poor force-tactile feedback. Combined with network latency and limited robotic hardware dexterity, these result in slow and mentally draining operation. Teleoperators often employ a “wait and see” strategy, as in the video, to reduce the effects of latency and reduced 3D awareness. It’s impractical to teleoperate a robot like this for hour-long sessions.
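
For readers curious what that drag-and-drop repositioning could look like, here is a minimal clutch-style sketch, assuming a simple position-only mapping from the operator’s hand to the robot target; the names and the 3-DoF simplification are mine, not the actual implementation.

```python
import numpy as np

class ClutchMapper:
    """Maps operator hand positions to robot targets, with a 'clutch' that
    lets the operator reposition their arm without moving the robot."""

    def __init__(self):
        self.offset = np.zeros(3)       # robot_target = hand_pos + offset
        self.clutched = False
        self.last_target = np.zeros(3)

    def update(self, hand_pos, clutch_pressed):
        hand_pos = np.asarray(hand_pos, dtype=float)
        if clutch_pressed:
            # Clutch held: freeze the robot target while the hand moves freely.
            self.clutched = True
            return self.last_target
        if self.clutched:
            # Clutch released: re-anchor the offset so the robot resumes
            # exactly where it stopped, from the hand's new position.
            self.offset = self.last_target - hand_pos
            self.clutched = False
        self.last_target = hand_pos + self.offset
        return self.last_target
```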

On the left, AI helps the operator in two ways to sustain long sessions at a higher pace. There is an "action AI" executing individual actions such as picking (right now it is a mixture of VLAs [Vision Language Action models], computer vision, motion planning, and dynamic motion primitives; in the future it will be VLAs only), and a "human-in-the-loop AI" that dynamically arbitrates when to give control to the teleoperator and when to the action AI. The final movement is a fusion of the AI and operator movements, with dynamic weighting based on environmental and contextual factors. In this way the operator is always in control and can handle all the edge cases the AI cannot, while the AI does the lion's share of the work in subtasks where enough data is already available.
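
As a rough illustration of the fusion idea (not the actual arbitration policy described above), operator and AI end-effector commands could be blended with a context-dependent weight; the confidence signal and the near-contact heuristic below are assumed placeholders.

```python
import numpy as np

def blend_commands(human_cmd, ai_cmd, ai_confidence, near_contact):
    """Fuse operator and AI end-effector velocity commands.

    Toy weighting: trust the AI more when its confidence is high, and hand
    authority back to the human near contact or when confidence drops. The
    real arbitration is contextual; this only shows the shape of the fusion."""
    w = float(np.clip(ai_confidence, 0.0, 1.0))
    if near_contact:
        w = min(w, 0.3)  # cap AI authority in delicate phases (assumed heuristic)
    return w * np.asarray(ai_cmd, dtype=float) + (1.0 - w) * np.asarray(human_cmd, dtype=float)
```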

Currently it speeds up experienced teleoperators by 100-150%, and much more for inexperienced teleoperators. The reduction in mental workload is noticeable from the first few sessions. An important challenge is pushing the speed-up further, relative to an unassisted human, over long sessions. Technically, besides AI, that means improving robotic hardware, 3D telepresence, network optimisation, and teleoperation design and ergonomics.

I see this effort as part of a larger vision to improve teleoperation infrastructure, scale up robotic data collection, and deploy general-purpose robots everywhere.

About me: I am currently head of AI at Createc, a UK applied robotics R&D lab, where I have built hybrid AI systems. I am also a 2x startup founder (the last one was an AI-robotics exit).

I posted this to gather feedback early. I am keen to connect if you find this exciting or useful! I am also open to early-stage partnerships.
