Ask HN: Go libraries for managing Docker container pools and executing commands?

3 points by magundu | 4/2/2025, 6:31:05 PM | 3 comments
I’m developing a system in Go that maintains a fixed pool of Docker containers (e.g., 10) running a specific image (like ‘node’), where each container remains alive (using a command like tail -f) to be ready for executing arbitrary commands via docker exec. The system tracks the workload of each container, distributes commands to the least loaded one, and monitors container health to automatically restart or replace unhealthy instances.
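The load-tracking and least-loaded dispatch described above can be sketched in plain Go. This is a minimal, hypothetical illustration: the container IDs are placeholders, and a real implementation would wire Acquire/Release around docker exec calls made with the Docker SDK.

```go
package main

import (
	"fmt"
	"sync"
)

// Pool tracks per-container in-flight command counts and always
// dispatches to the least loaded container. IDs here are placeholders;
// in a real setup they would come from the Docker SDK.
type Pool struct {
	mu   sync.Mutex
	load map[string]int
	ids  []string // stable order; ties go to the earliest ID
}

func NewPool(ids []string) *Pool {
	p := &Pool{load: make(map[string]int), ids: ids}
	for _, id := range ids {
		p.load[id] = 0
	}
	return p
}

// Acquire returns the least loaded container and bumps its count.
func (p *Pool) Acquire() string {
	p.mu.Lock()
	defer p.mu.Unlock()
	best := p.ids[0]
	for _, id := range p.ids[1:] {
		if p.load[id] < p.load[best] {
			best = id
		}
	}
	p.load[best]++
	return best
}

// Release marks a command on the container as finished.
func (p *Pool) Release(id string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.load[id]--
}

func main() {
	p := NewPool([]string{"c1", "c2", "c3"})
	a := p.Acquire() // all tied, earliest wins: c1
	b := p.Acquire() // c2
	p.Release(a)     // c1 is free again
	fmt.Println(a, b, p.Acquire()) // prints "c1 c2 c1"
}
```

A mutex-guarded map is enough for a single-host pool of ~10 containers; at larger scale the same counters could live in Redis or similar.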

I’m aware of the official Docker Go SDK (github.com/docker/docker/client) for managing containers, but I’m curious if there are any higher-level tools or libraries in Go that provide additional support for scheduling, load balancing, or enhanced health monitoring of containers in such a setup. Has anyone built or used libraries that streamline this kind of container orchestration and command execution?

Any insights, recommendations, or experiences would be greatly appreciated!

Comments (3)

SonuSitebot · 25d ago
I've worked on a similar setup in Go — managing a pool of "always-on" containers for isolated task execution via docker exec. The official Docker SDK is solid but pretty low-level, so I get the desire for something more ergonomic. In my experience, there aren't many off-the-shelf Go libraries that give you full orchestration primitives (load balancing, health checks, scheduling) out of the box like you'd find in Nomad or K8s. But here are a few options worth exploring:

gofiber/fiber – not container-specific, but useful for building lightweight async schedulers if you're rolling your own orchestration logic.

dockertest – primarily for testing, but you can adapt its logic for simplified lifecycle management.

hashicorp/go-plugin – good for decoupling workloads, especially if you're considering container-based isolation per plugin/command.

That said, most teams I’ve seen build their own lightweight layer on top of the Docker SDK with Redis or internal queues for tracking load/health. Curious if you're doing multi-host management or keeping this local? Also, make sure to aggressively timeout and clean up zombie exec sessions — they sneak up fast when you're doing docker exec a lot.

Would love to hear more if you open source anything from this!

magundu · 23d ago
Our use case is executing test scripts in a sandbox. This is a multi-host, multi-region setup, and we might run millions of test scripts per day.

One of our engineers found https://testcontainers.com. We find it interesting, but it seems it doesn’t keep containers alive; instead, it starts and removes a container for each test. We might need to implement a locking mechanism to cap the number of containers running at any one time. I don’t know whether it fits highly scalable test cases.

SonuSitebot · 15d ago
That’s a super exciting use case — running millions of test scripts across a multi-host, multi-region setup is no small feat. You're spot on about Testcontainers — it's elegant for one-off, isolated runs (like in CI), but when you're pushing at scale, the overhead of spinning up and tearing down containers for every single test can start to hurt.

In high-throughput environments, most scalable setups I’ve seen shift towards a pre-warmed pool of sandbox containers — essentially keeping a fleet of "hot" containers alive and routing tasks into them via docker exec. You lose a bit of isolation granularity but gain massively in performance. You could even layer in a custom scheduler (Redis- or NATS-backed maybe?) that tracks container load and availability across hosts. Pair that with a smart TTL+health checker, and you can recycle containers efficiently without zombie buildup.

Also — curious if you've explored running lighter-weight VMs (like Firecracker or Kata Containers) instead of full Docker containers? They can offer tighter isolation with decent spin-up times, and could be a better fit for multi-tenant test runs at this scale.

Would love to nerd out more on this — are you planning to open source anything from your infra? Or even just blog about the journey? I feel like this would resonate with a lot of folks in the testing/devops space.