Java Virtual Threads Ate My Memory: A Web Crawler's Tale of Speed vs. Memory

38 dariobalinzo 14 5/29/2025, 1:08:17 PM dariobalinzo.medium.com ↗

Comments (14)

koito17 · 4h ago
> Without any back-pressure, the program kept stuffing memory with pending results.

The key phrase in the whole article.

This is the reason I am not fully confident in the "pervasive async" mindset overtaking web development. In JavaScript, pervasive async makes sense, because all functions eventually return control to a central event loop anyway. In environments with real threading, the move from OS threads to "green threads and async everywhere" implicitly gives up the tried-and-tested scheduler of your OS and its mechanisms to handle back-pressure. Developers must now take full responsibility for avoiding buffer bloat.

> they fundamentally change how we think about concurrency limits and resource management.

Virtual threads make concurrency cheaper, but nothing about them (or any other concurrency abstraction) eliminates the need for flow control. By switching to virtual threads, you traded off the safeguards once provided by your (bounded!) thread pool and OS scheduler. It's now your responsibility to put safeguards back in.

As an aside: in Clojure's core.async, channel limits (like 1024 pending puts) are intentionally there as a warning sign that your async pipeline isn't handling back-pressure correctly. This is exactly the "hyperactive downloader with no brakes" problem described in the article. With core.async, you would have eventually hit an exception related to excessive puts on a channel, rather than silently expanding buffers until resource exhaustion.

pron · 4h ago
> By switching to virtual threads, you traded off the safeguards once provided by your (bounded!) thread pool and OS scheduler.

The OS scheduler provides no safeguards here (you can use the thread-per-task executor with OS threads - you'll just get an OOME much sooner), and as the virtual thread adoption guide says, you should replace bounded thread pools with a semaphore when needed because it's just a better construct. It allows you to limit concurrency for different resources in an easy and efficient way even when different threads make use of different resources.

In this case, by switching to virtual threads you get to use more appropriate concurrency constructs without needing to manage threads directly, and without sharing them among tasks.
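A minimal sketch of the semaphore approach described here (class and method names are mine, assuming Java 21): each virtual thread acquires a permit before touching the external service, so no more than ten calls are in flight regardless of how many threads exist.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class SemaphoreLimit {
    static final AtomicInteger inFlight = new AtomicInteger();
    static final AtomicInteger maxInFlight = new AtomicInteger();

    public static void main(String[] args) throws Exception {
        // At most 10 tasks may talk to the external service at once.
        Semaphore permits = new Semaphore(10);

        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 500; i++) {
                executor.submit(() -> {
                    permits.acquire();               // blocks the (cheap) virtual thread
                    try {
                        int now = inFlight.incrementAndGet();
                        maxInFlight.accumulateAndGet(now, Math::max);
                        Thread.sleep(2);             // stand-in for a real I/O call
                    } finally {
                        inFlight.decrementAndGet();
                        permits.release();
                    }
                    return null;
                });
            }
        } // close() waits for all submitted tasks to finish

        System.out.println("peak concurrency = " + maxInFlight.get());
    }
}
```

Note that the thread-per-task executor is unbounded; the semaphore alone caps concurrent access to the resource, which is exactly the decoupling pron describes.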

ndriscoll · 2h ago
Wouldn't there be significant overhead in waking tasks from IO just to put them back to sleep on a semaphore vs. there being fewer threads servicing the IO ring for a given type of task?
mdavidn · 1h ago
Presumably the semaphore would be used to limit the number of concurrent virtual threads.

Incoming tasks can be rejected proactively if a system would overflow its limit on concurrent tasks. This approach uses a semaphore’s non-waiting acquire and release operations. See the “bulkhead pattern.”

Alternatively, if a system has a fixed number of threads ingesting tasks, these would respond to backpressure by waiting for available permits before creating more virtual threads.
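A hedged sketch of the bulkhead variant (names are mine, not from any library): `tryAcquire` rejects work outright when the limit is hit, instead of queueing it.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class Bulkhead {
    static final Semaphore permits = new Semaphore(10);
    static final AtomicInteger accepted = new AtomicInteger();
    static final AtomicInteger rejected = new AtomicInteger();

    // Reject instead of queueing when the bulkhead is full.
    static boolean trySubmit(ExecutorService executor, Runnable task) {
        if (!permits.tryAcquire()) {       // non-waiting acquire
            rejected.incrementAndGet();
            return false;                  // caller can shed load or signal upstream
        }
        accepted.incrementAndGet();
        executor.submit(() -> {
            try {
                task.run();
            } finally {
                permits.release();
            }
        });
        return true;
    }

    public static void main(String[] args) {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 100; i++) {
                trySubmit(executor, () -> {
                    try { Thread.sleep(200); } catch (InterruptedException e) { }
                });
            }
        } // waits for the accepted tasks
        System.out.println("accepted=" + accepted + " rejected=" + rejected);
    }
}
```

Here the burst of 100 submissions mostly gets rejected because only ten permits exist and each task holds its permit for 200 ms; the rejections are the proactive overflow signal described above.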

tombert · 4h ago
At my last job I made a thing that took in tens of millions of records from Kafka, did some processing, plopped stuff into Postgres, and moved along. It wasn't hard, but it was difficult to achieve my throughput goals with regular blocking JDBC.

Eventually I moved to an async-non-blocking model, and for my initial tests it went great, stuff returned almost immediately and I could grab the next one. The problem is that, sort of by definition, non-blocking stuff doesn't really know when it's going to complete right away, so it just kept enqueuing stuff until it ate all my memory.

I figured out pretty early on what was happening, but the other people on my team really seemed to struggle with the concept of backpressure, I think because they weren't really used to a non-blocking model. You kind of get backpressure for free with blocking code.

Eventually I was able to solve my problem using a BlockingQueue.
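A minimal sketch of that pattern (names are mine, not the commenter's actual code): a bounded queue sits between the Kafka side and the writer, so `put()` blocks the producer whenever the writer falls behind, which is the "free" backpressure of blocking code restored.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class QueueBackpressure {
    static final AtomicInteger written = new AtomicInteger();

    public static void main(String[] args) throws InterruptedException {
        // Bounded capacity: pending work can never exceed 100 records in memory.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(100);

        Thread consumer = Thread.ofVirtual().start(() -> {
            try {
                while (true) {
                    String record = queue.take();
                    if (record.equals("POISON")) break;  // shutdown sentinel
                    Thread.sleep(1);                     // stand-in for the JDBC write
                    written.incrementAndGet();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        for (int i = 0; i < 1_000; i++) {
            queue.put("record-" + i);   // blocks when full: backpressure on the producer
        }
        queue.put("POISON");
        consumer.join();
        System.out.println("written = " + written.get());
    }
}
```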

wahern · 42m ago
The OOM error is the back pressure. This is why some engineers working on highly concurrent frameworks insist on pervasive OOM recovery.

I haven't done any Java in years, but I always thought OOM errors were recoverable in the JVM. Or is this just a case where the author never thought to catch OOM exceptions? My instinct would be to catch OOM early (i.e. in entry function) in a virtual thread and re-queue the task. In theory re-queueing might fail, too, I guess, but in practice probably not.

This is why I like to code in C (and Lua) for this kind of thing, as in C my core data structures like lists and trees don't require additional allocations for insert. My normal pattern here would be to have, say, three job lists--todo, pending, complete; a job would always exist in one of those lists (presuming it could be allocated and initialized, of course), and no allocations would be required to move a job back from pending to todo on an OOM error during processing.
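A toy Java sketch of the catch-and-requeue instinct described above (all names are mine; the OOME is triggered deterministically by exhausting the heap, and as the replies below argue, this is unreliable once other threads are allocating on the same heap):

```java
import java.util.ArrayList;
import java.util.List;

public class OomRequeue {
    static int retriesQueued = 0;

    // Catch OOME at the task's entry point and re-queue instead of crashing.
    static boolean runTask(Runnable task) {
        try {
            task.run();
            return true;
        } catch (OutOfMemoryError e) {
            retriesQueued++;   // in a real crawler: push the job back onto the todo list
            return false;
        }
    }

    public static void main(String[] args) {
        boolean ok = runTask(() -> {
            // Deliberately exhaust the heap to force an OOME; the list becomes
            // unreachable (and collectable) as soon as the stack unwinds.
            List<long[]> hog = new ArrayList<>();
            while (true) hog.add(new long[64 * 1024 * 1024]);  // ~512 MB chunks
        });
        System.out.println("task ok=" + ok + ", retries queued=" + retriesQueued);
    }
}
```

This works in the toy case precisely because the failing allocation happens inside the guarded task; nothing guarantees the OOME lands there in a real multi-threaded program.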

koito17 · 22m ago
In Java, there are two kinds of Throwable instances[1]: Error and Exception. As the name suggests, OutOfMemoryError is a subclass of Error. In contrast to Exception, an Error "indicates serious problems that a reasonable application should not try to catch"[2]. For this reason, it's considered bad practice in Java to catch all Throwable instances (or catch Error instances explicitly).

> My instinct would be to catch OOM early

OutOfMemoryError subclasses VirtualMachineError, and when a VirtualMachineError is thrown, your program seems to be firmly in undefined behavior territory. Quoting the JVM specification [3]:

  A Java Virtual Machine implementation throws an object that is an instance of a subclass of the class VirtualMachineError when an internal error or resource limitation prevents it from implementing the semantics described in this chapter. This specification cannot predict where internal errors or resource limitations may be encountered and does not mandate precisely when they can be reported. 
For context: "this chapter" refers to the whole chapter describing the behavior of each JVM instruction. The specification seems to suggest that all bets are off on any guarantees the JVM makes by the time a VirtualMachineError is thrown.

[1] https://docs.oracle.com/en/java/javase/21/docs/api/java.base...

[2] https://docs.oracle.com/en/java/javase/21/docs/api/java.base...

[3] https://docs.oracle.com/javase/specs/jvms/se21/html/jvms-6.h...

unscaled · 8m ago
You can catch OutOfMemoryErrors in Java, but it wouldn't be as simple as you describe. In a real-world program, you may be doing other things besides downloading URLs and feeding the URLs to the download scheduler. Even though the URL downloader is what's causing the memory pressure, an OutOfMemoryError may very well be thrown by an allocation anywhere else. _You only have one shared heap_, so unless there is only one single thing being allocated on the heap, you cannot use an OutOfMemoryError as a reliable backpressure signal.

Coming from C, you might think you can just avoid heap allocations for anything that you don't manage with backpressure, but that doesn't work in Java, since Java can only store primitives and references on the stack (at least until project Valhalla is delivered[1]). Almost everything you do triggers a heap allocation.

Virtual threads make this even worse, since their "stack" is also stored on the heap; it can grow dynamically and require more heap allocations even if you just push a primitive value onto the stack!

tl;dr: No, you cannot rely on catching OutOfMemoryError in Java in anything but the simplest scenarios and things break down when you have multiple threads and especially virtual threads.

[1] https://openjdk.org/projects/valhalla/

PaulKeeble · 3h ago
It's not really any different from the memory limitations we have always dealt with; virtual threads just consume memory faster than we might expect. The techniques we use to avoid running out of memory also apply to anything that can create threads: we have to cap how many can exist, especially since threads consume quite a bit of memory even in their virtual variant.
3cats-in-a-coat · 3h ago
Everything, always, returns to a main event loop, even the actual hardware threads on your CPU. So JavaScript developers don't get a pass on this.

The problem is you keep creating async requests, they need to be stored somewhere, and that's in memory. Thread or no thread, it's in memory.

And backpressure is essential, yes. Except for basic tasks.

But there is another solution, which is coalescing requests. I.e. in my current project, I combine thousands of operations, each of which could've been its own async request, into a single "unit of work" which then results in a single async request, for all operations combined.

If we treat IO seriously, we won't just fire off requests into the dark by the hundreds. We'd think about it, and batch things together. This can be done manually, but it's best done at a lower level, so you don't have to think about it at the business logic level.
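A toy sketch of that coalescing idea (all names hypothetical): many small operations are grouped into fixed-size units of work, each of which becomes a single combined request instead of one request per operation.

```java
import java.util.ArrayList;
import java.util.List;

public class Coalescer {
    static int requestsSent = 0;

    // Split individual operations into fixed-size "units of work".
    static <T> List<List<T>> batch(List<T> ops, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < ops.size(); i += batchSize) {
            batches.add(ops.subList(i, Math.min(i + batchSize, ops.size())));
        }
        return batches;
    }

    static void sendAsOneRequest(List<Integer> unit) {
        requestsSent++;   // stand-in for a single combined async I/O request
    }

    public static void main(String[] args) {
        List<Integer> ops = new ArrayList<>();
        for (int i = 0; i < 5_000; i++) ops.add(i);

        // 5,000 operations become 5 requests instead of 5,000 async calls.
        for (List<Integer> unit : batch(ops, 1_000)) {
            sendAsOneRequest(unit);
        }
        System.out.println("requests sent = " + requestsSent);
    }
}
```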

fjasdfwa · 4h ago
> Virtual Threads Ate My Memory

From what I see, it's a developer trade-off.

If you want to save on memory then you need to drop down to callbacks, promise chains, or async-await sugar that is a compiler transform to a state machine.

But if you do that, then you will write a blog complaining about the boilerplate!

I think Zig has a way to give you the best of both worlds, but a Zig expert would need to correct me on this.

geodel · 4h ago
I used virtual threads recently, with good results I should say. In my case I limited concurrency with a blocking queue of concurrent tasks, so the number of virtual threads didn't keep rising with the millions of input records I was processing.

I found that keeping the number of virtual threads at roughly 100x the number of concurrent calls to external resources was the sweet spot for overall throughput.

swsieber · 3h ago
Good timing. I was actually reading the official docs on virtual threads the other day, and they have a big section dedicated to rate-limiting virtual threads:

"The hardest thing to internalize about virtual threads is that, while they have the same behavior as platform threads, they should not represent the same program concept."

...

"Sometimes there is a need to limit the concurrency of a certain operation. For example, some external service may not be able to handle more than ten concurrent requests. Because platform threads are a precious resource that is usually managed in a pool, thread pools have become so ubiquitous that they're used for this purpose of restricting concurrency..."

...

"But restricting concurrency is only a side-effect of thread pools' operation. Pools are designed to share scarce resources, and virtual threads aren’t scarce and therefore should never be pooled!

"When using virtual threads, if you want to limit the concurrency of accessing some service, you should use a construct designed specifically for that purpose: the Semaphore class."

...

"Simply blocking some virtual threads with a semaphore may appear to be substantially different from submitting tasks to a fixed thread pool, but it isn't. Submitting tasks to a thread pool queues them up for later execution, but the semaphore internally (or any other blocking synchronization construct for that matter) creates a queue of threads that are blocked on it that mirrors the queue of tasks waiting for a pooled thread to execute them. Because virtual threads are tasks, the resulting structure is equivalent"

"Even though you can think of a pool of platform threads as workers processing tasks that they pull from a queue and of virtual threads as the tasks themselves, blocked until they may continue, the underlying representation in the computer is virtually identical. _Recognizing the equivalence between queued tasks and blocked threads will help you make the most of virtual threads._" (emphasis mine)

So it was cool seeing him arrive at the same conclusion (or he read the docs). Either way it was a nice, timely article.

Link to the docs: https://docs.oracle.com/en/java/javase/21/core/virtual-threa...

geodel · 1h ago
Well, 6 hours of debugging can save 5 minutes of reading the docs.