lsr: ls with io_uring

244 mpweiher 136 7/18/2025, 12:40:41 PM rockorager.dev ↗

Comments (136)

rockorager · 3h ago

Author of the project here! I have a little write up on this here: https://rockorager.dev/log/lsr-ls-but-with-io-uring

ChuckMcM · 15m ago

Nice writeup. I suspect you're measuring the cost of abstraction. Specifically, routines that can handle lots of things (like locale based strings and utf8 character) have more things to do before they can produce results. This was something I ran into head on at Sun when we did the I18N[1] project.

In my experience there was a direct correlation between the number of different environments where a program would "just work" and its speed. The original UNIX ls(1) which had maximum sized filenames, no pesky characters allowed, all representable by 7-bit ASCII characters, and only the 12 bits of meta data that God intended[2] was really quite fast. You add things like a VFS which is mapping the source file system into the parameters of the "expected" file system that adds delay. You're mapping different character sets? adds delay. Colors for the display? Adds delay. Small costs that add up.

1: The first time I saw a long word like 'internationalization' reduced to first and last letter and the count of letters in between :-).

2: Those being Read, Write, and eXecute for user, group, and other, setuid, setgid, and 'sticky' :-)

benreesman · 1h ago

This is fantastic stuff. I'm doing a C++ project right now that I'm doing with an eye to eventual migration in whole or in part to Zig. My little `libevring` thing is pretty young and I'd be very open to replacing it with `ourio`.

What's your feeling on having C/C++ bindings in the project as a Zig migration path for such projects?

rockorager · 1h ago

I think exposing a C lib would be very nice. Feel free to open a discussion or issue on the Github.

dang · 1h ago

(Thanks - we'll make that the main link (since it has more background info) and include the repo thread at the top as well.)

tavianator · 2h ago

My bfs project also uses io_uring: https://github.com/tavianator/bfs/blob/main/src/ioq.c

I'm curious how lsr compares to bfs -ls for example. bfs only uses io_uring when multiple threads are enabled, but maybe it's worth using it even for bfs -j1

rockorager · 2h ago

Oh that's cool. `find` is another tool I thought could benefit from io_uring like `ls`. I think it's definitely worth enabling io_uring for single threaded applications for the batching benefit. The kernel will still spin up a thread pool to get the work done concurrently, but you don't have to manage that in your codebase.

tavianator · 2h ago

I did try it a while ago and it wasn't profitable, but that was before I added stat() support. Batching those is probably good

mshockwave · 2h ago

and grep / ripgrep. Or did ripgrep migrate to using io_uring already?

burntsushi · 2h ago

No, ripgrep doesn't use io_uring. Idk if it ever will.

porridgeraisin · 42m ago

Curious: Why? Is it not a good fit for what ripgrep does? Isn't the sort of "streaming" "line at a time" I/O that ripgrep does a good fit for async io?

jeffbee · 3h ago

How much of the speedup over GNU ls is due to lacking localization features? Your results table is pretty much consistent with my local observations: in a dir with 13k files, `ls -al` needs 33ms. But 25% of that time is spent by libc in `strcoll`. Under `LC_ALL=C` it takes just 27ms, which is getting closer to the time of your program.

rockorager · 3h ago

I didn't include `busybox` in my initial table, so it isn't on the blog post but the repo has the data...but I am 99% sure busybox does not have locale support, so I think GNU ls without locale support would probably be closer to busybox.

Locales also bring in a lot more complicated sorting - so that could be a factor also.

swiftcoder · 2h ago

Kind of fascinating that slashing syscalls by ~35x (versus the `ls -la` benchmark) is "only" worth a 2x speedup

Galanwe · 43m ago

These syscalls are mostly through VDSO, so not very costly

bogwog · 1h ago

I vaguely remember some benchmark I read a while back for some other io_uring project, and it suggested that io_uring syscalls are more expensive than whatever the other syscalls were that it was being used to replace. It's still a big improvement, even if not as big as you'd hope.

I wish I could remember the post, but I've had that impression in the back of my mind ever since.

ninkendo · 6h ago

I wonder how it performs against an NFS server with lots of files, especially one over a kinda-crappy connection. Putting an unreliable network service behind blocking POSIX syscalls is one of the main reasons NFS is a terrible design choice (as can be seen by anyone who's tried to ctrl+c any app that's reading from a broken NFS folder), but I wonder if io_uring mitigates the bad parts somewhat.

mprovost · 3h ago

The designers of NFS chose to make a distributed system emulate a highly consistent and available system (a hard drive), which was (and is) a reasonable tradeoff. It didn't require every existing tool, such as ls, to deal with things like the server rebooting while listing a directory. (The original NFS protocol is stateless, so clients can survive server reboots.) What does vi do when the server hosting the file you're editing stop responding? None of these tools have that kind of error handling.

I don't know how io_uring solves this - does it return an error if the underlying NFS call times out? How long do you wait for a response before giving up and returning an error?

ninkendo · 2h ago

> The designers of NFS chose to make a distributed system emulate a highly consistent and available system (a hard drive), which was (and is) a reasonable tradeoff

I don't agree that it was a reasonable tradeoff. Making an unreliable system emulate a reliable one is the very thing I find to be a bad idea. I don't think this is unique to NFS, it applies to any network filesystem you try to present as if it's a local one.

> What does vi do when the server hosting the file you're editing stop responding? None of these tools have that kind of error handling.

That's exactly why I don't think it's a good idea to just pretend a network connection is actually a local disk. Because tools aren't set up to handle issues with it being down.

Contrast it with approaches where the client is aware of the network connection (like HTTP/GRPC/etc)... the client can decide for itself how long it should retry failed requests, whether it should bubble up failures to the caller, or work "offline" until it gets an opportunity to resync, etc. With NFS the syscall just hangs forever by default.

Distributed systems are hard, and NFS (and other similar network filesystems) just pretend it isn't hard at all, which is great until something goes wrong, and then the abstraction leaks.

(Also I didn't say io_uring solves this, but I'm curious as to whether its performance would be any better than blocking calls.)

pvtmert · 47m ago

I think it highly depends on your architecture and the scale you are pushing.

The other far-edge is the S3, where appending has just been possible within the last a few years as far as I can tell. Meanwhile editing a file requiring a full download/upload, not great either.

For the NFS case, I cannot say it's my favorite, but certainly easy to setup and run on your own. Obviously a rebooting server may cause certain issues during the unavailability, but the NFS server should be in highly-available. with NFSv4.1, you may use UDP as the primary transport, which allows you to swap/switch servers pretty quickly. (Given you connect to a DNS/FQDN rather than the IP address)

Another case is the plug and play, with NFS, UNIX permissions, ownership/group details, execute bit, etc are all preserved nicely...

Besides, you could always have a "cache" server locally. Similar to GDrive or OneDrive clients, constantly syncing back and forth, caching the data locally, using file-handles to determine locks. Works pretty well _at scale_ (ie. many concurrent users in the case of GDrive or OneDrive).

cwillu · 5m ago

Do you have similar thoughts about iscsi?

JonChesterfield · 1h ago

> Making an unreliable system emulate a reliable one is the very thing I find to be a bad idea.

It's the only idea though. We don't know how to make reliable systems, other than by cobbling together a lot of unreliable ones and hoping the emergent behaviour is more reliable than that of the parts.

mrlongroots · 1h ago

I think "making an unreliable system emulate a reliable one = bad" is too simplistic a heuristic.

We do this all the time with things like ECC and retransmissions and packet recovery. This intrinsically is not bad at all, the question is: what abstraction does this expose to the higher layer.

With TCP the abstraction we expect is "pretty robust but has tail latencies, do not use for automotive networks or avionics" and that works out well. The right question IMO is always "what kind of tail behaviors does this expose, and are the consumers of the abstraction prepared for them".

loeg · 5h ago

> as can be seen by anyone who's tried to ctrl+c any app that's reading from a broken NFS folder

Theoretically "intr" mounts allowed signals to interrupt operations waiting on a hung remote server, but Linux removed the option long ago[1] (FreeBSD still supports it)[2]. "soft" might be the only workaround on Linux.

[1]: https://man7.org/linux/man-pages/man5/nfs.5.html

[2]: https://man.freebsd.org/cgi/man.cgi?query=mount_nfs&sektion=...

ape4 · 4h ago

Samba too

pvtmert · 57m ago

> I have no idea what lsd is doing. I haven’t read the source code, but from viewing it’s strace, it is calling clock_gettime around 5 times per file. Why? I don’t know. Maybe it’s doing internal timing of steps along the way?

Maybe calculating "X minutes/hours/days/weeks ago" thing for each timestamp? (access, create, modify, ...). Could just be an old artifact of another library function...

namibj · 35m ago

This shouldn't be an actual syscall these days; it should be handled by vDSO (`man 7 vDSO`). Maybe zig doesn't use that, though.

Imustaskforhelp · 5h ago

Really interesting, the difference is real though I would just hope that some better coloring support could be added because I have "eza --icons=always -1" command set as my ls and it looks really good, whereas when I use lsr -1, yes the fundamental thing is same, the difference is in the coloring.

Yes lsr also colors the output but it doesn't know as many things as eza does

For example .opus will show up as a music icon and with the right color (green-ish in my case?) in eza whereas it would be shown up as any normal file in lsr.

Really no regrets though, its quite easy to patch I think but yes this is rock solid and really fast I must admit.

Can you please create more such things but for cat and other system utilities too please?

Also love that its using tangled.sh which is using atproto, kinda interesting too.

I also like that its written in zig which imo feels way more easier for me to touch as a novice than rust (sry rustaceans)

hnlmorg · 53m ago

“bat” is a pretty good modern “cat”

https://github.com/sharkdp/bat

Imustaskforhelp · 9m ago

So I just ran strace -c cat <file> and strace -c bat <file>

Bat did 445 syscall Cat did 48 syscall

Sure bat does beautify some things a lot but still I just wanted to tell this, I want something that can use io_uring for cat too I think,

like what's the least number of syscalls that you can use for something like cat?

johnisgood · 5h ago

As for coloring support, I think the best way would be to implement LS_COLORS / dircolors. My GNU ls looks nice.

SillyUsername · 6h ago

Love it.

I'm trying to understand why all command line tools don't use io_uring.

As an example, all my nvme's on usb 3.2 gen 2 only reach 740MB/s peak.

If I use tools with aio or io_uring I get 1005MB/s.

I know I may not be copying many files simultaneously every time, but the queue length strategies and the fewer locks also help I guess.

tyingq · 5h ago

Probably historical preference for portability without a bunch of #ifdef means platform+version-specific stuff is very late to get adopted. Though, at this point, the benefit of portability across various posixy platforms is much lower.

Retr0id · 5h ago

Has anyone written an io_uring "polyfill" library with fallback to standard posix-y IO? It could presumably be done via background worker threads - at a perf cost.

vlovich123 · 5h ago

Seems like a huge lift since io_uring is an ever growing set of interfaces that is encompassing more and more of the kernel surface area. Also, the problem tends to not necessarily be that the io_uring interface isn’t available at compile time but a) the version you distribute to has a kernel with it disabled or you don’t have permission to use it meaning you need to do LD_preload magic or use a framework b) the kernel you’re using supports some of the interfaces you’re trying to use but not all. Not sure how you solve that one without using a framework.

But I agree. It would be cool if it was transparent, but this is actually what a bunch of io-uring runtimes do, using epoll as a fallback (eg in Rust monoio)

namibj · 21m ago

You can just ask io_uring what commands you have available to you. Though the way of the background thread should be readily available/usable by just indirectly calling the syscall (-helper) and replacing it with a futex-based handshake/wrapper. If you're not using the backend-ordering-imposing link bit, you could probably even use minor futex trickery to dispatch multiple background threads to snatch up from the submission queue in a "grab one at a time" fashion.

fpoling · 3h ago

io_uring is the asynchronous interface and that requires to use even-based architecture to use it effectively. But many command-line tools are still written is a straightforward sequential style. If C would have async or similar mechanism to pretend doing async programming sequentially, it would be easier to port. But without that a very significant refactoring is necessary.

Besides, io_uring is not yet stable and who knows may be in 10 years it will be replaced by yet another mechanism to take advantage of even newer hardware. So simply waiting for io_uring prove it is here to stay is very viable strategy. Besides in 10 years we may have tools/AI that will do the rewrite automatically...

mananaysiempre · 3h ago

> If C would have async or similar mechanism to pretend doing async programming sequentially, it would be easier to port.

The *context() family of formerly-POSIX functions (clownishly deprecated as “use pthreads instead”) is essentially a full implementation of stackful coroutines. Even the arguable design botch of them preserving the signal mask (the reason why they aren’t the go-to option even on Linux) is theoretically fixable on the libc level without system calls, it’s just a lot of work and very few can be bothered to do signals well.

As far as stackless coroutines, there’s a wide variety of libraries used in embedded systems and such (see the recent discussion[1] for some links), which are by necessity awkward enough that I don’t see any of them becoming broadly accepted. There were also a number of language extensions, among which I’d single out AC[2] (from the Barrelfish project) and CPC[3]. I’d love for, say, CPC to catch on, but it’s been over a decade now.

[1] https://news.ycombinator.com/item?id=44546640

[2] https://users.soe.ucsc.edu/~abadi/Papers/acasync.pdf

[3] https://www.irif.fr/~jch/research/cpc-2012.pdf

elcapitan · 5h ago

iirc io_uring also had some pretty significant security issues early on (a couple of years ago). Those should be fixed by now, but that probably dampened adoption as well.

raesene9 · 30m ago

Last I checked it's blocked by most container runtimes exactly because of the security problems, and Google blocked io_uring across all their services. I've not checked recently if that's still the case, but https://security.googleblog.com/2023/06/learnings-from-kctf-... has some background.

jeffbee · 3h ago

Not years ago. io_uring has been a continuous parade of security problems, including a high severity one that wasn't fixed until a few months ago. Many large organizations have patched it out of their kernels on safety basis, which is one of the reasons it suffers from poor adoption.

Agingcoder · 11m ago

Iouring is very recent

cesarb · 1h ago

> I'm trying to understand why all command line tools don't use io_uring.

Because it's fairly new. The coreutils package which contains the ls command (and the three earlier packages which were merged to create it) is decades old; io_uring appeared much later. It will take time for the "shared ring buffer" style of system call to win over traditional synchronous system calls.

Thaxll · 5h ago

io_uring is a security nightmare.

marcodiego · 26m ago

I updated the Wikipedia article on io_uring to dispute that.

pjc50 · 4h ago

How so?

Thaxll · 4h ago

This is a good read on the topic: https://chomp.ie/Blog+Posts/Put+an+io_uring+on+it+-+Exploiti...

raesene9 · 30m ago

https://security.googleblog.com/2023/06/learnings-from-kctf-... - Has some interesting information on that topic.

sim7c00 · 4h ago

you give process direct access to a piece of kernel memory. its a reason why there is separation. thats all.

wtallis · 4h ago

Most of the security concerns with io_uring that I've seen aren't related to the shared buffers at all but simply stem from the fact that io_uring is a mechanism to instruct the kernel to do stuff without making system calls, so security measures that focus on what system calls a process is allowed to do are ineffective.

loeg · 3h ago

This isn't the issue; it's relatively easy to safely share some ring buffers. The issue was/is that io_uring is rapidly growing the equivalent of ~all historical Linux syscall interfaces and sometimes comparable security measures were missed on the new interfaces. (Also, stuff like seccomp filters on syscalls are kind of meaningless for io_uring.)

duped · 3h ago

...don't you supply the memory in the submission queue? or do you mean the queues themselves?

superkuh · 5h ago

One reason is so that they work in all linux environments rather than just bleeding edge installs from the last couple years.

tln · 5h ago

Thats a great speed boost. What tools are these?

never_inline · 5h ago

Poe's law hits again.

maplant · 4h ago

This seems more interesting as demonstration of the amortized performance increase you'd expect from using io_uring, or as a tutorial for using it. I don't understand why I'd switch from using something like eza. If I'm listing 10,000 files the difference is between 40ms and 20ms. I absolutely would not notice that for a single invocation of the command.

rockorager · 3h ago

Yeah, I wrote this as a fun little experiment to learn more io_uring usage. The practical savings of using this are tiny, maybe 5 seconds over your entire life. That wasn't the point haha

JuettnerDistrib · 3h ago

I'd be curious to know if this helps on supercomputers, which are notorious for frequently hanging for a few seconds on an ls -l.

mrlongroots · 46m ago

It could, but important to keep in mind that the filesystem architecture there is also very different with a parallel filesystem with disaggregated data and metadata.

When you run `ls -l` you could potentially be enumerating a directory with one file per rank, or worse, one file per particle or something. You could try making the read fast, but I also think that it makes no sense to have that many files: you can do things to reduce the number of files on disk. Also many are trying to push for distributed object stores instead of parallel filesystems... fun space.

maplant · 2h ago

It's a very cool experiment. Just wanted to perhaps steer the conversation towards those things rather than whether or not this was a good ls replacement because like you say that feels like it was missing the point

0x000xca0xfe · 4h ago

Well I have a directory with a couple million JSON files and ls/du take minutes.

Most of the coreutils are not fast enough to actually utilize modern SSDs.

otterley · 3h ago

What’s the filesystem type? Ext4 suffers terrible lookup performance with large directories, while xfs absolutely flies.

0x000xca0xfe · 3h ago

Yup, default ext4 and most files are <4KB, so it's extra bad.

Thanks for the comment, didn't know that!

jasonjmcghee · 3h ago

I find it funny that there are icons for .mjs and .cjs file extensions but not .c, .h, .sh

the8472 · 5h ago

io_uring doesn't support getdents though. so the primary benefit is bulk statting (ls -l). It'd be nice if we could have a getdents in flight while processing the results of the previous one.

loeg · 3h ago

POSIX adopting NFS' "readdirplus" operation (getdents + stat) could negate some of the benefit towards io_uring, too.

tln · 4h ago

The times seem sublinear, 10k files is less than 10x 1k files.

I remember getting in to a situation during the ext2 and spinning rust days where production directories had 500k files. ls processes were slow enough to overload everything. ls -F saved me there.

And filesystems got a lot better at lots of files. What filesystem was used here?

It's interesting how well busybox fares, it's written for size not speed iirc?

SkiFire13 · 3h ago

> The times seem sublinear, 10k files is less than 10x 1k files

Two points are not enough to say it's sublinear. It might very well be some constant factor that becomes less and less important the bigger the linear factor becomes.

Or in other words 10000n+C < 10000(n+C)

tln · 1h ago

The article has data points for n=10,100,1000,10000. Taking (n=10,000 - n=10)/(n=1,000 - n=10) would eliminate the constant factor and we'd expect about 10.09x higher times for a linear algorithm.

But for lsr, it's 9.34. The other tools have factors close to 10.09 or higher. Since ls has to sort it's output (unless -F is specified) I'd not be too surprised with a little superlinearity.

https://docs.google.com/spreadsheets/d/1EAYua3B3UeTGBtAejPw2...

otterley · 3h ago

Ext2 never got better with large directories even with SSDs (this includes up to ext4). The benchmarks don’t include the filesystem type, which is actually extremely important when it comes to the performance of reading directories.

quibono · 5h ago

Lovely, I might try doing this for some other "classic" utility!

A bit off-topic too, but I'm new to Zig and curious. This here: ``` const allocator = sfb.get();

    var cmd: Command = .{ .arena = allocator };

``` means that all allocations need to be written with an allocator in mind? I.e. one has to pick an allocator per each memory allocation? Or is there a default one?

kristoff_it · 5h ago

Allocator is an interface so you write library code only once, and then the caller decides which concrete implementation to use.

There's cases where you do want to change your code based on the expectation that you will be provided a special kind of allocator (e.g. arenas), but that's a more niche thing and in any case it all comes together pretty well in practice.

IggleSniggle · 5h ago

Caveat emptor, I don't write Zig but followed its development closely for awhile. A core design element of zig is that you shouldn't be stuck with one particular memory model. Zig encourages passing an allocator context around, where those allocators conform to a standardized interface. That means you could pass in different allocators with different performance characteristics at runtime.

But yes, there is a default allocator, std.heap.page_allocator

SkiFire13 · 3h ago

> you shouldn't be stuck with one particular memory model

Nit: an allocator is not a "memory model", and I very much want the memory model to not change under my feet.

throwawaymaths · 3h ago

> Zig encourages passing an allocator context around, where those allocators conform to a standardized interface.

in libraries. if youre just writing a final product it's totally fine to pick one and use it everywhere.

> std.heap.page_allocator

strongly disrecommend using this allocator as "default", it will take a trip to kernelland on each allocation.

hansvm · 4h ago

std.heap.smp_allocator

You should basically only use the page allocator if you're writing another allocator.

mnw21cam · 1h ago

Love the idea and execution, don't love the misplaced apo'strophe's.

rockorager · 1h ago

Oh no - where at?

nbf_1995 · 51m ago

Technically, the first, third, and fifth occurrence of "it's" should be "its". The dog chased its tail.

I didn't notice when I read the article though. The original commenter is being pedantic.

adgjlsfhk1 · 5h ago

It's a shame to see uutils doing so poorly here. I feel like they're our best hope for an organization to drive this sort of core modernization forward, but 2x slower than GNU isn't a good start.

Bender · 5h ago

I am curious what would happen if ls and other commands were replaced using io_uring and kernel.io_uring_disabled was set to 1. Would it fall back to an older behavior or would the ability to disable it be removed?

yencabulator · 2h ago

I just realized that one could probably write a userspace io_uring emulator in a library that spawns a thread to read the ringbuffer and a worker pool of threads to do the blocking operations. You'd need to get the main software to make calls to your library instead of the io_uring syscalls, that's it; the app logic could remain the same.

Then all the software wanting to use io_uring wouldn't need to write their low-level things twice.

rockorager · 3h ago

You would have to write your IO to have a fallback. The Ghostty project uses `io_uring`, but on kernels where it isn't available it falls back to an `epoll` model. That's all handled at the library level by libxev.

ReDress · 5h ago

I've been playing around with io_uring for a while.

Still, I am yet to come across a some tests that simulate typical real life application workload.

I heard of fio but are yet to check how exactly it works and whether it might be possible to simulate real life application workload with it.

izabera · 2h ago

what a "real life application workload" looks like is entirely dependent on your use case, but fio is very widely used in the storage industry

it's a good first approximation to test the cartesian product of

- sequential/random

- reads/writes

- in arbitrary sizes

- with arbitrarily many workers

- with many different backends to perform such i/o including io_uring

and its reporting is solid and thorough

implementing the same for your specific workload is often not trivial at all

neuroelectron · 5h ago

There used to be lsring by Jens Axboe (author of io_uring), but it no longer exists. This is more extreme than abandoning the project. Perhaps there is some issue with using io_uring this way, perhaps vulnerabilities are exposed.

arghwhat · 5h ago

> Perhaps there is some issue with using io_uring this way, perhaps vulnerabilities are exposed.

... no. It's just not interesting or particularly valuable to optimize ls, and Jens probably just used it as a demo and didn't want to keep it around.

neuroelectron · 5h ago

I'm sure there are uses in Bash scripts that could benefit from it but most people would use it directly in a compiled program, I suppose, if the performance was a reoccurring need.

neuroelectron · 1h ago

Explicit Vulnerabilities (Documented CVEs and Exploits)

These are actual discovered vulnerabilities, typically assigned CVEs and often exploited in sandbox escapes or privilege escalations: 1. CVE-2021-3491 (Kernel 5.11+)

    Type: Privilege escalation

    Mechanism: Failure to check CAP_SYS_ADMIN before registering io_uring restrictions allowed unprivileged users to bypass sandboxing.

    Impact: Bypass of security policy mechanisms.

2. CVE-2022-29582

    Type: UAF (Use-After-Free)

    Mechanism: io_uring allowed certain memory structures to be freed and reused improperly.

    Impact: Local privilege escalation.

3. CVE-2023-2598

    Type: Race condition

    Mechanism: A race in the io_uring timeout code could lead to memory corruption.

    Impact: Arbitrary code execution or kernel crash.

4. CVE-2022-2602, CVE-2022-1116, etc.

    Type: UAF and out-of-bounds access

    Impact: Escalation from containers or sandboxed processes.

5. Exploit Tooling:

    Tools like io_uring_shock and custom kernel exploits often target io_uring in container escape scenarios (esp. with Docker or LXC).

Implicit Vulnerabilities (Architectural and Latent Risks)

These are not necessarily exploitable today, but reflect deeper systemic design risks or assumptions. 1. Shared Memory Abuse

    io_uring uses shared rings (memory-mapped via mmap) between kernel and user space.

    Risk: If ring buffer memory management has reference count bugs, attackers could force races, data corruption, or misuse stale pointers.

 2. User-Controlled Kernel Pointers

    Some features allow user-specified buffers, SQEs, and CQEs to reference arbitrary memory (e.g. via IORING_OP_PROVIDE_BUFFERS, IORING_OP_MSG_RING).

    Risk: Incomplete validation could allow crafting fake kernel structures or triggering speculative attacks.

 3. Speculative Execution & Side Channels

    Since io_uring relies on pre-submitted work queues and long-lived kernel threads, it opens timing side channels.

    Risk: Predictable scheduling or timing leaks, esp. combined with hardware speculation (Spectre-class).

 4. Bypassing seccomp or AppArmor Filters

    io_uring operations can effectively batch or obscure syscall behavior.

    Example: A program restricted from calling sendmsg() directly might still use io_uring to perform similar actions.

    Risk: Policy enforcement tools become less effective, requiring explicit io_uring filtering.

 5. Poor Auditability

    The batched and asynchronous nature makes logging or syscall audit trails incomplete or confusing.

    Risk: Harder for defenders or monitoring tools to track intent or detect misuse in real time.

 6. Ring Reuse + Threaded Offload

    With IORING_SETUP_SQPOLL or IORING_SETUP_IOPOLL, I/O workers can run in kernel threads detached from user context.

    Risk: Desynchronized security context can lead to privileged operations escaping sandbox context (e.g., post-chroot but pre-fork).

 7. File Descriptor Reuse and Lifecycle Mismatch

    Some operations in io_uring rely on fixed file descriptors or registered files. Race conditions with FD reuse or closing can cause inconsistencies.

    Risk: UAF, type confusion, or logic bombs triggered by kernel state confusion.

 Emerging Threat Vectors
 eBPF + io_uring

    Some exploits chain io_uring with eBPF to do arbitrary memory reads or writes. e.g., io_uring to perform controlled allocations, then eBPF to read or write memory.

 io_uring + userfaultfd

    Combining userfaultfd with io_uring allows very fine-grained control over page faults during I/O — great for fuzzing, also for exploit primitives.

fermuch · 6h ago

The link isn't working for me. For those who were able to see it: does it improve anything by using that instead of what ls does now??

ta988 · 6h ago

70% faster, but more importantly 35x times less syscalls.

loeg · 5h ago

Why do you say more importantly? The time is all that matters, I think.

plq · 4h ago

%70 faster = you wait less

35x less system calls = others wait less for the kernel to handle their system calls

loeg · 4h ago

> 35x less system calls = others wait less for the kernel to handle their system calls

That isn't how it works. There isn't a fixed syscall budget distributed among running programs. Internally, the kernel is taking many of the same locks and resources to satisfy io_uring requests as ordinary syscall requests.

plq · 3h ago

More system calls mean more overall OS overhead eg. more context switches, or as you say more contention on internal locks etc.

Also, more fs-related system calls mean less available kernel threads to process these system calls. eg. XFS can paralellize mutations only up to its number of allocation groups (agcount)

loeg · 3h ago

> More system calls mean more overall OS overhead [than the equivalent operations performed with io_uring]

Again, this just isn't true. The same "stat" operations are being performed one way or another.

> Also, more fs-related system calls mean less available kernel threads to process these system calls.

Generally speaking sync system calls are processed in the context of the calling (user) thread. They don't consume kernel threads generally. In fact the opposite is true here -- io_uring requests are serviced by an internal kernel thread pool, so to the extent this matters, io_uring requests consume more kernel threads.

plq · 3h ago

> Again, this just isn't true.

Again, it just is true.

More fs-related operations mean less kthreads available for others. More syscalls means more OS overhead. It's that simple.

eviks · 5h ago

Is there a noticeable benefit of this huge syscall reduction?

Imustaskforhelp · 5h ago

Yes I just checked it after installing strace

strace -c ls gave me this

100.00 0.002709 13 198 5 total

strace -c eza gave me this

100.00 0.006125 12 476 48 total

strace -c lsr gave me this

100.00 0.001277 33 38 total

So seeing the number of syscalls in the calls directory

198 : ls

476 : eza

33 : lsr

A meaningful difference indeed!

richardwhiuk · 4h ago

That's just observing there is a difference, not explaining why that's a good thing.

fpoling · 3h ago

syscalls are expensive and their relative latency compared with the rest of code only grow especially in view of mitigations against cache-related and other other hardware bugs.

rybosome · 6h ago

It improves the latency of ls calls.

Imustaskforhelp · 6h ago

Hm interesting, it worked for me.

movomito · 6h ago

Link doesn’t work

Imustaskforhelp · 6h ago

Hm, well I have replied it to some other comment too but the link is working fine for me.

Currently downloading zig to build it.

api · 4h ago

Why isn’t it possible — or is it — to make libc just use uring instead of syscall?

Yes I know uring is an async interface, but it’s trivial to implement sync behavior on top of a single chain of async send-wait pairs, like doing a simple single threaded “conversational” implementation of a network protocol.

It wouldn’t make a difference in most individual cases but overall I wonder how big a global speed boost you’d get by removing a ton of syscalls?

Or am I failing to understand something about the performance nuances here?

ninkendo · 4h ago

In order to make this work, libc would have to:

- Start some sort of async executor thread to service the io_uring requests/responses

- Make it so every call to "normal" syscalls causes the calling thread to sleep until the result is available (that's 1 syscall)

- When the executor thread gets a result, have it wake up the original thread (that's another syscall)

So you're basically turning 1 syscall into 2 in order to emulate the legacy syscalls.

io_uring only makes sense if you're already async. Emulating sync on top of async is nearly always a terrible idea.

wtallis · 3h ago

You don't need to start spawning new threads to use io_uring as a backend for synchronous IO APIs. You just need to set up the rings once, then when the program does an fwrite or whatever, that gets implemented as sending a submission queue entry followed by a single io_uring_enter syscall that informs the kernel there's something in the submission queue, and using the arguments indicating that the calling process wants to block until there's something in the completion queue.

ninkendo · 1h ago

> using the arguments indicating the calling process wants to block

Nice to know io_uring has facilities for backwards compatibility with blocking code here. But yeah, that's still a syscall, and given that the whole benefit of io_uring is in avoiding (or at least, coalescing) syscalls, I doubt having libc "just" use io_uring is going to give any tangible benefit.

yencabulator · 3h ago

Not speaking of ls which is more about metadata operations, but general file read/write workloads:

io_uring requires API changes because you don't call it like the old read(please_fill_this_buffer). You maintain a pool of buffer that belong to the ringbuffer, and reads take buffers from the pool. You consume the data from the buffer and return it to the pool.

With the older style, you're required to maintain O(pending_reads) buffers. With the io_uring style, you have a pool of O(num_reads_completing_at_once) (I assume with backpressure but haven't actually checked).

api · 1h ago

In a single threaded flow your buffer pool is just the buffer you were given, and you don't return until the call completes. There are no actual concurrent calls in the ring. All you're doing is using io_uring to avoid syscall.

Other replies lead me to believe it's not worth doing though, that it would not actually save syscalls and might make things worse.

loeg · 3h ago

In addition to sibling's concern about syscall amplification, the async just isn't useful to the application (from a latency perspective) if you just serialize a bunch of sync requests through it.

No comments yet

rkangel · 5h ago

This was more interesting for the tangled.sh platform it's hosted on. Wasn't aware of that!

dang · 1h ago

One past thread so far:

Show HN: Tangled – Git collaboration platform built on atproto - https://news.ycombinator.com/item?id=43234544 - March 2025 (15 comments)

nikodunk · 3h ago

Same! Just signed up and will be following tangled and this repo. I like how tangled is built on atproto (bluesky).

seanw444 · 2h ago

Is there any actual focus on ATProto as a decentralized protocol? So far it seems like its only purpose is building Bluesky as a centralized service, which I have no interest in at all.

Retr0id · 2h ago

Doesn't the existence of tangled answer your question?

danbruc · 5h ago

Why does this require inventing lsr as an alternative to ls instead of making ls use io_uring? It seems pretty annoying to have to install replacements for the most basic command line tools. And especially in this case, where you do not even do it for additional features, just for getting the exact same thing done a bit faster.

tiagod · 5h ago

You don't have to install it. You can modify ls yourself too.

bicolao · 4h ago

The author answered on lobster thread [1]. This is more of an io_uring exercise than an attempt to replace ls.

[1] https://lobste.rs/s/mklbl9/lsr_ls_with_io_uring

nailer · 5h ago

`ls` is in C, `lsr` is in Zig. The `lsr` programmer probably doesn't want to make new code in C.

loeg · 5h ago

In addition, the author might not want to sign away their rights to the FSF.

andrepd · 5h ago

What on earth are you talking about? Why would this be the case?

scott_w · 4h ago

Depending on the implementation (and I don't know which `ls` is being referred to), modifying `ls` might mean modifying an FSF project which require copyright assignment as a condition of patch submissions.

leni536 · 17m ago

That's only the case if the author would want to upstream their changes. If they wanted to only fork ls then they would only be required to comply with the license, without assigning copyright over.

loeg · 3h ago

Are you unfamiliar with contributing to GNU projects (ls is part of GNU corutils)?

https://www.gnu.org/prep/maintain/maintain.html#Copyright-Pa...

mschuster91 · 5h ago

> Why does this require inventing lsr as an alternative to ls instead of making ls use io_uring?

Good luck getting that upstreamed and accepted. The more foundational the tools (and GNU coreutils definitely is foundational), the more difficult that process will be.

Releasing a standalone utility makes iteration much faster, partially because one is not bound to the release cycles of distributions.

WorldMaker · 5h ago

In the history of Unix its also a common way to propose tool replacements, for instance how `less` became `more` on most systems, or `vim` became the new `vi` which in its day became the new `ed`.

JdeBP · 3h ago

Yes and no. We don't really have the equivalent of comp.sources.unix nowadays, which is where the early versions of those occurred, and comp.sources.unix did not take just anything. Rich Salz had rules.

Plus, since I actually took stevie and screen and others from comp.sources.unix and worked on them, and wasn't able to even send my improvements to M. Salz or the original authors at all, from my country, I can attest that contributing improvements had hurdles just as large to overcome back then as there exist now. They're just different.

nailer · 4h ago

> instance how `less` became `more` on most systems

How `more` became `less`.

The name of 'more' was from paging - rather than having text scroll off the screen, it would show you one page, then ask if you wanted to see 'more' and scroll down.

'less' is a joke by the less authors. 'less is more' etc.

yencabulator · 3h ago

For a while there was a less competitor named most.

JdeBP · 2h ago

It hasn't gone away.

* https://freshports.org/sysutils/most/

* https://ftp.netbsd.org/pub/pkgsrc/current/pkgsrc/misc/most/i...

* https://packages.debian.org/sid/most

One can even get pg still, with Ilumos-based systems; even though that was actually taken out of the SUS years ago. This goes to show that what's standard is not the same as what exists, of course.

* https://illumos.org/man/1/pg

* https://pubs.opengroup.org/onlinepubs/9699919799.2008edition...

s1mplicissimus · 5h ago

> Releasing a standalone utility makes iteration much faster, partially because one is not bound to the release cycles of distributions.

which certainly is a valid way or prioritizing. similarly, distros/users may prioritize stability, which means the theoretical improvement would now be stuck in not-used-land. the value of software appears when it's run, not when it's written

KPGv2 · 5h ago

> the value of software appears when it's run, not when it's written

Have you ever tried to contribute to open source projects?

The question was why wouldn't someone writing software not take the route likely to end in rejection/failure. I don't know about you, but if I write software, I am not going to write it for a project whose managers will make it difficult for my PR to be accepted, and that 99% likely it never will be.

I will always contribute to the project likely to appreciate my work and incorporate it.

I'll share an anecdote: I got involved with a project, filed a couple PRs that were accepted (slowly), and then I talked about refactoring something so it could be tested better and wasn't so fragile and tightly coupled to IO. "Sounds great" was the response.

So I did the refactor. Filed a PR and asked for code review. The response was (after a long time waiting) "thanks but no, we don't want this." PR closed. No feedback, nothing.

I don't even use the software anymore. I certainly haven't tried to fix any bugs. I don't like being jerked around by management, especially when I'm doing it for free.

(For the record, I privately forked the code and run my own version that is better because by refactoring and then writing tests, I discovered a number of bugs I couldn't be arsed to file with the original project.)

s1mplicissimus · 4h ago

> Have you ever tried to contribute to open source projects?

yes, and it was often painful enough to make me consider very well wether I want to bother contributing. I can only imagine how terrible the experience must be at a core utility such as ls.

> The question was why wouldn't someone writing software not take the route likely to end in rejection/failure

Obviously they wouldn't - in my comment I assumed that the lsr author aimed for providing a better ls for people and tried to offer a perspective with a different definition of what success is.

> I don't like being jerked around by management, especially when I'm doing it for free

I get that. The older OSS projects become, the more they fossilize too - and that makes it more annoying to contribute. But you can try to see it from the maintainers perspective too: They have actual people relying on the program being stable and are often also not paid. Noone is forcing you to contribute to their project, but if you don't want to deal with existing maintainers, you won't have their users enjoying your patchset. Know what you want to achieve and act accordingly, is all I'm trying to say.

mschuster91 · 3h ago

> The older OSS projects become, the more they fossilize too - and that makes it more annoying to contribute.

Newer ones can be just as braindead, if they came out of some commercial entity. CLAs and such.

I'm Peter Roberts, immigration attorney who does work for YC and startups. AMA

Ask HN: Any active COBOL devs here? What are you working on?

Ask HN: GCP Outage?

Gmail's backup codes are useless to access account

Ask HN: What Pocket alternatives did you move to?

Ask HN: Does anyone have OpenBSD projects looking for unpaid/paid help?

I just got banned by Immunefi for reporting a real replay attack on LayerZero V2

Ask HN: Cursor is using 269,738 tokens to edit 1200 token file

Tell HN: Notion Desktop is monitoring your audio and network

Ask HN: What's Your Useful Local LLM Stack?

Ask HN: Changing Developer Career Specialty

Ask HN: How did Soham Parekh get so many jobs?

Ask HN: How do you find free academic/scientific material?

Ask HN: Is it time to fork HN into AI/LLM and "Everything else/other?"

Ask HN: What is the state of support for mutable torrents?

Ask HN: How are you tracking dev productivity without feeling micromanaging?

Ask HN: Is OpenAI Acquiring Cursor?

Ask HN: What should we do about state ID legislation?

How big is carpooling market?

Ask HN: Developer-as-a-Service?

Ask HN: How do you stay on top of AI tech?

Google raising Nest Aware Plus pricing by 25%

AIHint an open standard for signed verifiable metadata readable by AI on the web

Tell HN: Humanloop acquired, sunsetting Sept 8th

Ask HN: What is the best way to learn 3D modeling for 3D printing?

Ask HN: How do you avoid Kanban boards becoming "to-do list graveyards"?

Ask HN: How much of OpenAI code is written by AI?

lsr: ls with io_uring

Comments (136)