Show HN: Velda – Run any command directly on cloud compute
To get instant access to a GPU for interactive debugging or rapid prototyping, many ML teams rely on expensive, always-on dev machines—either a powerful local workstation or a cloud VM with a dedicated GPU. This setup is great for speed, but it's costly and the GPU often sits idle.
The usual alternative is a slow remote ML workflow, where you wait 10 minutes for a docker build/push loop just to test a small change.
We built a tool to give you the interactive speed of a local GPU with the power and flexibility of the cloud, skipping both the high cost and the slow workflow.
With our CLI, vrun, you can run any command on remote compute, with output streamed back as if it were running locally.
For example, to run a training script on an H100:
$ vrun -P gpu-h100-1 python train.py --epochs 100
Jobs start in ~1 second on a warm worker. No git commits, no container builds, no CI pipelines.
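The same pattern works for quick sanity checks. For instance (assuming only the -P pool flag shown above), you can inspect the remote GPU before kicking off a long run:

$ vrun -P gpu-h100-1 nvidia-smi    # runs on a pool worker; output streams back to your terminal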
Video demo: https://www.youtube.com/watch?v=fr58LREZ6vQ&t=2s
How it works:
- Like most cloud IDEs, you get a personal, persistent dev container. This is your single source of truth for dependencies: you can use all the familiar tools (apt, pip, uv, conda, whatever) to manage your environment.
- When you scale out with vrun, the worker mounts your dev container, and only the necessary files are sent over, which keeps launches fast (see the short sketch after this list).
- On the backend, you can plug in any compute source (EC2, GCE, Kubernetes, an HPC cluster), and developers access it through a simple pool name (gpu-h100-1).
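To make that concrete, here's a minimal sketch of the loop (pip and train.py are just stand-ins; the only Velda-specific piece is the vrun -P invocation from the example above):

$ pip install transformers              # once, inside your persistent dev container
$ vrun -P gpu-h100-1 python train.py    # worker mounts that same container, so the package is already there

No image rebuild or push in between: the new dependency is simply available on the next vrun.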
We focus on improving the iterative part of the machine learning workflow. For developers, this means faster research cycles; for companies, it can mean lower costs, since you no longer need a dedicated GPU per developer. Velda complements your existing CI/CD and production deployment systems rather than replacing them.
The core is open source: https://github.com/velda-io/velda. More examples: https://velda.io/blog/vrun-is-all-you-need
You can explore our playground at https://novahub.dev, which runs training and inference on NVIDIA T4s through a pool named GPU.
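If you want to try the same flow there, the pool name is GPU, so (assuming the playground exposes the same CLI, with train.py again a stand-in) a run looks like:

$ vrun -P GPU python train.py    # dispatches to the T4-backed playground pool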
On our roadmap: we're exploring hosted options, deeper integration with other job frameworks, and lightweight serving and workflows.
We'd love to hear your thoughts and questions. What's the biggest point of friction in your team's remote development workflow today?