I'm confused, I thought the revolution of the past decade or so was in moving network stacks to userspace for better performance.
Bender · 6m ago
I don't know about using it in the kernel but I would love to see OpenSSH support QUIC so that I get some of the benefits of Mosh [1] while still having all the features of OpenSSH including SFTP, SOCKS, port forwarding, less state table and keep alive issues, roaming support, etc... Could OpenSSH leverage the kernel support?
What will the socket API look like for multiple streams? I guess it is implied it is the same as multiple connections, with caching behind the scenes.
I would hope for something more explicit, where you get a connection object and then open streams from it, but I guess that is fine for now.
https://github.com/microsoft/msquic/discussions/4257 ah but look at this --- unless this is an extension, the server side can also create new streams, once a connection is established. The client creating new "connections" (actually streams) cannot abstract over this. Something fundamentally new is needed.
My guess is recvmsg to get a new file descriptor for a new stream.
gte525u · 28m ago
I would look at SCTP socket API it supports multistreaming.
Seems like this is a step in the right direction to resolve some of those issues. I suppose nothing is preventing it from getting hardware support in future network cards as well.
miohtama · 54m ago
QUIC does not work very well for use cases like machine-to-machine traffic. However, most traffic on the Internet today is from mobile phones to servers, and that is where QUIC and HTTP/3 shine.
For other use cases we can keep using TCP.
thickice · 36m ago
Why doesn't QUIC work well for machine-to-machine traffic? Is it the lack of the offloads/optimizations that TCP enjoys, combined with machine-to-machine traffic tending to be high volume/high rate?
yello_downunder · 25m ago
QUIC would work okay, but not really have many advantages for machine-to-machine traffic. Machine-to-machine you tend to have long-lived connections over a pretty good network. In this situation TCP already works well and is currently handled better in the kernel. Eventually QUIC will probably be just as good as TCP in this use case, but we're not there yet.
jabart · 6m ago
You still have latency, legacy window sizes, and packet schedulers to deal with.
wosined · 40m ago
The general web is slowed down by bloated websites. But I guess this can make game latency lower.
> QUIC is meant to be fast, but the benchmark results included with the patch series do not show the proposed in-kernel implementation living up to that. A comparison of in-kernel QUIC with in-kernel TLS shows the latter achieving nearly three times the throughput in some tests. A comparison between QUIC with encryption disabled and plain TCP is even worse, with TCP winning by more than a factor of four in some cases.
Jesus, that's bad. Does anyone know if userspace QUIC implementations are also this slow?
Veserv · 2m ago
Yes. msquic is one of the best performing implementations and only achieves ~7 Gbps [1]. The benchmarks for the Linux kernel implementation only get ~3 Gbps to ~5 Gbps.
To be fair, the Linux kernel TCP implementation only gets ~4.5 Gbps at normal packet sizes and still only achieves ~24 Gbps with large segmentation offload. Both of which are ridiculously slow. It is easy to achieve ~100 Gbps/core of control plane at normal packet sizes with a properly designed protocol, so you should only be bottlenecking on your encryption at ~50 Gbps/core.
I think the ‘fast’ claims are just different. QUIC is meant to make things fast by:
- having a lower latency handshake
- avoiding some badly behaved ‘middleware’ boxes between users and servers
- avoiding resetting connections when user IP addresses change
- avoiding head of line blocking / the increased cost of many connections ramping up
- avoiding poor congestion control algorithms
- probably other things too
And those are all things about working better with the kind of network situations you tend to see between users (often on mobile devices) and servers. I don’t think QUIC was meant to be fast by reducing OS overhead on sending data, and one should generally expect it to be slower for a long time until operating systems become better optimised for this flow and hardware supports offloading more of the work. If you are Google then presumably you are willing to invest in specialised network cards/drivers/software for that.
jeroenhd · 9m ago
> - avoiding some badly behaved ‘middleware’ boxes between users and servers
Surely badly behaving middleboxes won't just ignore UDP traffic? If anything, they'd get confused about udp/443 and act up, forcing clients to fall back to normal TCP.
dahfizz · 40m ago
Yeah I totally get that it optimizes for different things. But the trade offs seem way too severe. Does saving one round trip on the handshake mean anything at all if you're only getting one fourth of the throughput?
yello_downunder · 6m ago
It depends on the use case. If your server is able to handle 45k connections but 42k of them are stalled because of mobile users with too much packet loss, QUIC could look pretty attractive. QUIC is a solution to some of the problematic aspects of TCP that couldn't be fixed without breaking things.
eptcyka · 15m ago
There are claims of 2x-3x operating costs on the server side to deliver better UX for phone users.
dan-robertson · 32m ago
Are you getting one fourth of the throughput? Aren’t you going to be limited by:
- bandwidth of the network
- how fast the nic on the server is
- how fast the nic on your device is
- whether the server response fits in the amount of data that can be sent given the client’s initial receive window or whether several round trips are required to scale the window up such that the server can use the available bandwidth
brokencode · 27m ago
Maybe it’s a fourth as fast in ideal situations with a fast LAN connection. Who knows what they meant by this.
It could still be faster in real world situations where the client is a mobile device with a high latency, lossy connection.
klabb3 · 1h ago
Yes, they are. Worse, I’ve seen them shrink down to nothing in the face of congestion with TCP traffic. If QUIC is indeed the future protocol, it’s a good thing to move it into the kernel IMO. It’s just madness to provide these massive userspace impls everywhere, on a packet-switched protocol no less, and expect it to beat good old TCP. Wouldn’t surprise me if we need optimizations all the way down to the NIC layer, and maybe even middleboxes. Oh and I haven’t even mentioned the CPU cost of UDP.
OTOH, TCP is like a quiet guy at the gym who always wears baggy clothes but does 4 plates on the bench when nobody is looking. Don't underestimate. I wasted months to learn that lesson.
vladvasiliu · 52m ago
Why is QUIC being pushed, then?
toast0 · 43m ago
It has good properties compared to tcp-in-tcp (http/2), especially when connected to clients without access to modern congestion control on iffy networks. http/2 was perhaps adopted too broadly; binary protocol is useful, header compression is useful (but sometimes dangerous), but tcp multiplexing is bad, unless you have very low loss ... it's not ideal for phones with inconsistent networking.
favflam · 48m ago
I know in the p2p space, peers have to send lots of small pieces of data. QUIC stops stream blocking on a single packet delay.
fkarg · 20m ago
because it _does_ provide a number of benefits (potentially fewer initial round-trips, more dynamic routing control by using UDP instead of TCP, etc), and is a userspace software implementation compared with a hardware-accelerated option.
QUIC getting hardware acceleration should close this gap, and keep all the benefits. But a kernel (software) implementation is basically necessary before it can be properly hardware-accelerated in future hardware (is my current understanding)
01HNNWZ0MV43FF · 1m ago
To clarify, the userspace implementation is not a benefit; it's just that you can't have a brand-new protocol dropped into a trillion dollars of existing hardware overnight, so you have to do userspace first as a PoC.
It does save 2 round-trips during connection compared to TLS-over-TCP, if Wikipedia's diagram is accurate: https://en.wikipedia.org/wiki/QUIC#Characteristics That is a decent latency win on every single connection, and with 0-RTT you can go further, but 0-RTT is stateful and hard to deploy and I expect it will see very little use.
dan-robertson · 48m ago
The problem it is trying to solve is not overhead of the Linux kernel on a big server in a datacenter
eptcyka · 16m ago
QUIC performance requires careful use of batching. Using UDP sockets naively, i.e. sending one QUIC packet per syscall, will incur a lot of overhead - every time, the kernel has to figure out which interface to use, queue it up on a buffer, and all the rest. If one uses it like TCP, batching up lots of data and enqueuing packets in one “call” helps a ton. Similarly, the kernel wireguard implementation can be slower than wireguard-go since it doesn’t batch traffic. At the speeds offered by modern hardware, we really need to use vectored I/O to be efficient.
rayiner · 19m ago
It’s an interesting testament to how well designed TCP is.
euphamism · 32m ago
> causing the next cat video to be that much slower to arrive.
You mean "causing the next advertisement to be that much slower"
> for that all-important web-browsing use case.
You mean "for that all-important advertising-display use case."
> But middleboxes on the Internet also make free use of connection information
> [...]
> As QUIC gains the hardware support that TCP benefits from,
It will gain the ossification problems that TCP suffers from. That _was_ quick!
valorzard · 48m ago
Would this (eventually) include the unreliable datagram extension?
wosined · 39m ago
Don't know if it could get faster than UDP if it is on top of it.
valorzard · 28m ago
The use case for this would be running a multiplayer game server over QUIC
jeffbee · 1h ago
This seems to be a categorical error, for reasons that are contained in the article itself. The whole appeal of QUIC is being immune to ossification, being free to change parameters of the protocol without having to beg Linux maintainers to agree.
corbet · 1h ago
Ossification does not come about from the decisions of "Linux maintainers". You need to look at the people who design, sell, and deploy middleboxes for that.
jeffbee · 1h ago
I disagree. There is plenty of ossification coming from inside the house. Just some examples off the top of my head are the stuck-in-1974 minimum RTO and ack delay time parameters, and the unwillingness to land microsecond timestamps.
otterley · 59m ago
Not a networking expert, but does TCP in IPv6 suffer the same maladies?
pumplekin · 49m ago
Yes.
Layer 4 TCP is pretty much just slapped on top of Layer 3 IPv4 or IPv6 in exactly the same way for both of them.
Outside of some little nitpicky things like details on how TCP MSS clamping works, it is basically the same.
toast0 · 1h ago
IMHO, you likely want the server side to be in the kernel, so you can get to performance similar to in-kernel TCP, and ossification is less of a big deal, because it's "easy" to modify the kernel on the server side.
OTOH, you want to be in user land on the client, because modifying the kernel on clients is hard. If you were Google, maybe you could work towards a model where Android clients could get their in-kernel protocol handling to be something that could be updated regularly, but that doesn't seem to be something Google is willing or able to do; Apple and Microsoft can get priority kernel updates out to most of their users quickly; Apple also can influence networks to support things they want their clients to use (IPv6, MP-TCP). </rant>
If you were happy with congestion control on both sides of TCP, and were willing to open multiple TCP connections like http/1, instead of multiplexing requests on a single connection like http/2, (and maybe transfer a non-pessimistic bandwidth estimate between TCP connections to the same peer), QUIC still gives you control over retransmission that TCP doesn't, but I don't think that would be compelling enough by itself.
Yes, there's still ossification in middle boxes doing TCP optimization. My information may be old, but I was under the impression that nobody does that in IPv6, so the push for v6 is both a way to avoid NAT and especially CGNAT, but also a way to avoid optimizer boxes as a benefit for both network providers (less expense) and services (less frustration).
jeffbee · 1h ago
This is a perspective, but just one of many. The overwhelming majority of IP flows are within data centers, not over planet-scale networks between unrelated parties.
toast0 · 50m ago
I've never been convinced by an explanation of how QUIC applies for flows in the data center.
Ossification doesn't apply (or it shouldn't, IMHO, the point of Open Source software is that you can change it to fit your needs... if you don't like what upstream is doing, you should be running a local fork that does what you want... yeah, it's nicer if it's upstreamed, but try running a local fork of Windows or MacOS); you can make congestion control work for you when you control both sides; enterprise switches and routers aren't messing with tcp flows. If you're pushing enough traffic that this is an issue, the cost of QUIC seems way too high to justify, even if it helps with some issues.
darksaints · 1h ago
For the love of god, can we please move to microkernel-based operating systems already? We're adding a million lines of code to the Linux kernel every year. That's so much attack surface area. We're setting ourselves up for a Kessler syndrome of sorts with every system that we add to the kernel.
wosined · 36m ago
I might be wrong, but microkernels also need drivers, so wouldn't the attack surface be the same?
mdavid626 · 1h ago
Most of that code is not loaded into the kernel; it is only loaded when needed.
darksaints · 48m ago
True, but the last time I checked (several years ago), the size of the portion of code that is not drivers or kernel modules was still 7 million lines of code, and the average system still has to load a few million more via kernel modules and drivers. That is still a phenomenally large attack surface.
The SeL4 kernel is 10k lines of code. OKL4 is 13k. QNX is ~30k.
arp242 · 31m ago
Can I run Firefox or PostgreSQL with reasonable performance on SeL4, OKL4, or QNX?
regularfry · 42m ago
You've still got a combinatorial complexity problem though, because you never know what a specific user is going to load.
[1] - https://mosh.org/
The Jevons Paradox is applicable in a lot of contexts.
More efficient use of compute and communications resources will lead to higher demand.
In games this is fine. We want more, prettier, smoother, pixels.
In scientific computing this is fine. We need to know those simulation results.
On the web this is not great. We don’t want more ads, tracking, JavaScript.
I'm benefiting from WebP, JS JITs, Flexbox, zstd, Wasm, QUIC, etc, etc
[1] https://microsoft.github.io/msquic/