Async Ruby Is the Future of AI Apps (and It's Already Here)

67 points by doppp · 7/10/2025, 2:12:35 AM · paolino.me

Comments (10)

horsawlarway · 1d ago
Mmmm...

I find it somewhat ironic that you pitch this as "No callbacks. No promises. No async/await keywords. Just Ruby code that scales."

When you literally show in the example right above that you need both an "async do" and a "end.map(&:wait)".

I'll add - the one compelling argument you make about needing a db connection per worker is mitigated with something like pgbouncer without much work. The OS overhead per thread (or hell, even per process: https://jacob.gold/posts/serving-200-million-requests-with-c...) isn't an argument I really buy, especially given your use case is long running llm chat tasks as stated above.

Personally - if I really want to be fast and efficient I'm not picking Ruby anyways (or python for that matter - but at least python has the huge ecosystem for the LLM/AI space right now).

earcar · 1d ago
Fair point on the syntax, I should have been clearer. What I meant is that your existing Ruby code doesn't need modifications. In Python you'd need to use a different HTTP library, add `async def` and `await` everywhere, etc. In Ruby the same `Net::HTTP` call works in both sync and async context.

The `Async do` wrapper sits at just the orchestration level, not throughout your codebase. That's a huge difference in practice.

Regarding pgbouncer - yes, it helps with connection pooling, but you still have the fundamental issue of 25 workers = 25 max concurrent LLM streams. Your 26th user waits. With fibers, you can handle thousands on the same hardware because they yield during the 30-60s of waiting for tokens.
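As a rough stdlib-only illustration of why fibers change the math, each fiber below stands in for one parked chat stream waiting on tokens:

```ruby
# Stdlib-only sketch: creating 10,000 fibers is cheap, whereas 10,000
# OS threads (let alone 10,000 forked workers) would not be.
streams = 10_000.times.map do |i|
  Fiber.new do
    Fiber.yield           # parked here, "waiting for tokens"
    "stream-#{i} done"
  end
end

streams.each(&:resume)            # start every stream; all park at yield
finished = streams.map(&:resume)  # "tokens arrived": resume to completion

finished.size # => 10000
```

A real event loop decides which fiber to resume based on IO readiness; this sketch only shows that the per-stream bookkeeping is a Ruby object, not an OS resource.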

Sure, for pure performance you'd pick another language. But that's not the point - the point is that you can get much better performance for IO-bound workloads in Ruby today, without switching languages or rewriting everything.

It's about making Ruby better at what it's already being used for, not competing with system languages.

horsawlarway · 12h ago
> Regarding pgbouncer - yes, it helps with connection pooling, but you still have the fundamental issue of 25 workers = 25 max concurrent LLM streams.

I guess my point is why are you picking an arbitrarily low number like 25? If you know that workers are going to be "waiting for tokens" most of the time, why not bump that number way, WAY up?

And I guess I should clarify - I'm coming into this from outside the Python space (I touch Python because it's hard to avoid when doing AI work right now, but it's hardly my favorite language). Basically, having done a lot of Go, which uses goroutines in much the same way Ruby uses fibers (lightweight, runtime-managed thread replacements), I'll tell you up front: the orchestration level still matters a LOT, and you're going to be dealing with a lot of complexity there to make things work, even if it does mean that some lower-level code can remain unaware (colorless).

Even good ol' fashioned c++ has had this concept bouncing around for a long time ( https://github.com/boostorg/fiber ). It's good at some things, but it's absolutely not the silver bullet I feel like you're trying to pitch it as here.

hakunin · 1d ago
> Personally - if I really want to be fast and efficient I'm not picking Ruby anyways (or python for that matter - but at least python has the huge ecosystem for the LLM/AI space right now

"Fast and efficient" can mean almost anything. You can be fast and efficient in Ruby at handling thousands of concurrent llm chats (or other IO-bound work), as per the article. You can also be fast and efficient at CPU-bound work (it's possible to enjoy Ruby while keeping in mind how it will translate into C). You probably cannot be fast and efficient at micro-managing memory allocations in Ruby. If you're ok to brush ruby aside over a vague generalization, maybe you just don't see its appeal in the first place, which is fair, but that makes the other reasons you provide kind of moot.

jufter · 23h ago
Aren't threads overkill for an IO workload? You can do a lot with 1 thread and epoll(7).
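The single-thread multiplexing idea jufter describes can be sketched in stdlib Ruby with `IO.select`, a portable stand-in for epoll(7):

```ruby
# One thread watching many file descriptors at once; no extra threads.
reader_a, writer_a = IO.pipe
reader_b, writer_b = IO.pipe

writer_a.write("hello")  # make only pipe A readable

# Block (up to 1s) until at least one watched descriptor is readable.
ready, = IO.select([reader_a, reader_b], nil, nil, 1)

ready # => [reader_a] — only the readable descriptor is reported
```

Fiber-based reactors like the `async` gem are essentially this loop plus bookkeeping for which fiber to resume when each descriptor becomes ready.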
hakunin · 1d ago
Gotta give credit for wonderfully clear writing. You can tell a person understands what they're saying by how well they express it. Reads smooth, and makes me see the author's mental model.

As far as substance: I love ruby libraries that allow you to simply "insert any ruby code". Many libraries tell you to call specific declarative functions, but I think Ruby shines at letting you use Ruby, instead of some limited subset of it. Examples of not-great approaches (imo) are libraries that try to take over how you write code, and give you a special declarative syntax for runtime type checking, building services out of lambdas, composing functions. Ruby's async is an example of "just insert any ruby in here". You can build runtime type checking the same way — allow people to check the value with any ruby code they like. Essentially, I agree with author's sentiment, and wish more people appreciated the beauty of this approach.

earcar · 1d ago
Author here. Thank you, that means a lot!

Happy to answer any questions.

knowitnone · 1d ago
"these microseconds add up to real latency"

While I love Ruby, if performance is your main motivation, you would not be using a scripting language.

Alifatisk · 1d ago
What an interesting perspective on Ruby async; the I/O multiplexing example was quite fascinating to see as well.
moralestapia · 1d ago
Python and Ruby developers discovering what was standard on Javascript a decade ago.

*yawn*

Imitation is the sincerest form of flattery, at least.