Branch Privilege Injection: Exploiting branch predictor race conditions
421 alberto-m 213 5/13/2025, 4:44:51 PM comsec.ethz.ch
See also: ETH Zurich researchers discover new security vulnerability in Intel processors - https://ethz.ch/en/news-and-events/eth-news/news/2025/05/eth...
Paper: https://comsec.ethz.ch/wp-content/files/bprc_sec25.pdf
> [...] the contents of the entire memory to be read over time, explains Rüegge. “We can trigger the error repeatedly and achieve a readout speed of over 5000 bytes per second.” In the event of an attack, therefore, it is only a matter of time before the information in the entire CPU memory falls into the wrong hands.
Just fix the processors?
On the bright side, they will get to enjoy a much better music scene, because they’ll be visiting the 90’s.
> No. Our analysis has not found any issues on the evaluated AMD and ARM systems.
But if the fix for this bug (how many security holes have there been now in Intel CPUs? 10?) brings only a couple percent performance loss, like most of them so far, how can you even justify that at all? Isn't there a fundamental issue in there?
The ARM Cortex-R5F and Cortex-M7, to name a few, have branch predictors as well, for what it’s worth ;)
I thought about adding the blog post link to the top text (a bit like in this thread: https://news.ycombinator.com/item?id=43936992), but https://news.ycombinator.com/item?id=43974971 was the top comment for most of the day, and that seemed sufficient.
Edit: might as well belatedly do that!
- Predictor updates may be deferred until sometime after a branch retires. Makes sense, otherwise I guess you'd expect that branches would take longer to retire!
- Dispatch-serializing instructions don't stall the pipeline for pending updates to predictor state. Also makes sense, considering you've already made a distinction between "committing the result of the branch instruction" and "committing the result of the prediction".
- Privilege-changing instructions don't stall the pipeline for pending updates either. Also makes sense, but only if you can guarantee that the privilege level is consistent between making/committing a prediction. Otherwise, you might be creating a situation where predictions generated by code in one privilege level may be committed to state used in a different one?
Maybe this is hard because "current privilege level" is not a single unambiguous thing in the pipeline?
Do you know if there is any official recording or notes online?
Thanks in advance.
The Binary and Malware Analysis course that you mentioned builds on top of the book "Practical Binary Analysis" by Dennis Andriesse, so you could grab a copy of that if you are interested.
More info here: https://krebsonsecurity.com/2014/06/operation-tovar-targets-...
it's been a while back :)
I do have the book! I bought it a while ago but did not have the pleasure to check it out.
If I knew what I was getting into at the time, I'd do it. I did pay extra, but in my case it was the low Dutch rate, so for me it was 400 euros to follow hardware security, since I had already graduated.
But I can give a rough outline of what they taught. It's been years, but here you go.
Hardware security:
* Flush/Reload
* Cache eviction
* Spectre
* Rowhammer
* Implement research paper
* Read all kinds of research papers of our choosing (just use VUSEC as your seed and you'll be good to go)
Binary & Malware Analysis:
* Using IDA Pro to find the exact assembly line where the unpacker software we had to analyze unpacked its software fully into memory. Also we had to disable GDB debug protections, something to do with ptrace and nopping some instructions out, if I recall correctly (see the sketch after this list). Look, I only did low-level programming in my security courses and it was years ago; I'm a bit flabbergasted I remember the rough course outlines this well.
* Being able to dump the unpacked binary program from memory onto disk. Understanding page alignment was rough, because even if you got it, there were a few gotchas. I've looked at so many hexdumps it was insane.
* Taint analysis: watching user input "taint" other variables
* Instrumenting a binary with Intel PIN
* Cracking some program with Triton. I think Triton helped to instrument your binary with the help of Intel PIN by putting certain things (like XORs) into an SMT equation or something, and you had this SMT/Z3 solver thingy and then you cracked it. I don't remember exactly; I got a 6 out of 10 for this assignment and had a hard time cracking the real thing.
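For what it's worth, the "GDB debug protections" bit was most likely the classic ptrace self-check; here is a minimal reconstruction of the kind of check you end up nopping out (my guess at the mechanism, not the actual assignment binary):

    #include <stdio.h>
    #include <sys/ptrace.h>

    /* A process can only be ptrace'd once, so tracing yourself fails if a
       debugger is already attached. Packers sprinkle checks like this
       around; "nopping them out" patches the call or the branch on its
       result so the check never triggers. */
    static int debugger_attached(void)
    {
        return ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1;
    }

    int main(void)
    {
        if (debugger_attached()) {
            puts("debugger detected, bailing out");
            return 1;
        }
        puts("running normally");
        return 0;
    }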
Computer & Network Security:
* Web security: think XSS, CSRF, SQLi and reflected SQLi
* Application security: see binary and malware analysis
* Network security: we had to create our own packet sniffer, and we enacted a Kevin Mitnick attack (it's an old-school one) where we had to spoof our IP addresses and figure out the algorithm that generates TCP sequence numbers, all blind, without feedback. Mitnick, in '97 I believe, attacked the San Diego supercomputer (I might be wrong about the details here). He noticed that the supercomputer S trusted a specific computer T, so the assignment was to spoof the address of T and pretend we were sending packets from that location. I think... writing this packet sniffer was my first C program. My prof thought I was crazy that this was my first time writing C. I was, but I also had 80 hours of time and motivation per week. So that helped.
* Finding vulnerabilities in C programs. I remember: stack overflows, heap overflows and format string bugs (see the sketch after this list).
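The format string one is worth spelling out because it looks so harmless; a textbook version of the bug (illustrative, not from the course):

    #include <stdio.h>

    /* Passing user input directly as the format string lets an attacker
       smuggle in "%x" (read the stack) or "%n" (write through a pointer
       taken from the stack). */
    void log_message(const char *user_input)
    {
        printf(user_input);          /* BUG: user controls the format string */
        /* printf("%s", user_input);    is the correct version */
    }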
-----
For binary & malware analysis + computer & network security I highly recommend hackthebox.eu
For hardware security, I haven't seen an alternative. To be fair, I'm not looking. I like to dive deep into security for a few months out of the year and then I can't stand it for a while.
BTW I can see you were very motivated back then. It got to be pretty steep but you managed to break through. Congrats!
> BTW I can see you were very motivated back then. It got to be pretty steep but you managed to break through. Congrats!
Thanks! Yea I was :)
Training Solo: enter the kernel (and switch privilege level) and "self train" the predictor to mispredict branches to a disclosure gadget, then leak memory.
Branch predictor race conditions: enter the kernel while your trained branch predictor updates are still in flight, causing the updates to be associated with the wrong privilege level. Again, use this to redirect a branch in the kernel to a disclosure gadget and leak memory.
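As I read that second one, the user-space side boils down to something like the sketch below (my own rough reconstruction, not the paper's code; BURST and take_indirect_branch() are placeholders, and the real exploit additionally needs a trained target and a disclosure gadget in the kernel):

    #include <sys/syscall.h>
    #include <unistd.h>

    #define BURST 64
    extern void take_indirect_branch(void *target);  /* hypothetical training helper */

    void race_window(void *train_target)
    {
        /* Queue up branch-predictor updates in user mode... */
        for (int i = 0; i < BURST; i++)
            take_indirect_branch(train_target);
        /* ...then enter the kernel while those updates are still in flight,
           so they get associated with the wrong privilege level. */
        syscall(SYS_getpid);
    }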
Doing what you want would essentially require a hardware architecture where every load/store has to go through some kind of "augmented address" that stores boundary information.
Which is to say, you're asking for 80286 segmentation. We had that; it didn't do what you wanted. The reason is that those segment descriptors need to be loaded by software that doesn't mess things up, and software does mess things up: a descriptor is "just a pointer" to software, amenable to the same mistakes.
(CHERI already exists on ARM and RISC-V though.)
A "far pointer" was, again, a *software* concept where you could tell the compiler that this particular pointer needed to use a different descriptor than the one the toolchain assumed (by convention!) was loaded in DS or SS.
Again, it's just not a software problem. In the real world we have hardware that exposes "memory" to running instructions as a linear array of numbers with sequential addresses. As long as that's how it works, you can always form an out-of-bounds address (because the "bounds" are a semantic thing, not a hardware thing).
It is possible to change that basic design principle (again, x86 segmentation being a good example), but it's a whole lot more involved than just "Rust Will Fix All The Things".
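To make the "augmented address" idea concrete, this is roughly what the software-only version (a fat pointer) looks like (my own sketch; fat_ptr and fat_load are made-up names, and the whole point of CHERI or 286-style segments is to move this check into hardware where software can't skip it):

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Base and length travel with the pointer; every access is checked. */
    typedef struct {
        uint8_t *base;
        size_t   len;
    } fat_ptr;

    static inline uint8_t fat_load(fat_ptr p, size_t off)
    {
        assert(off < p.len);   /* in software, nothing stops you from skipping this */
        return p.base[off];
    }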
(*) ... although I don't think I can abstain ...
In the case of speculative execution, you need an insane amount of prep to use that exploit to actually do something. The only real way this could ever be used is if you have direct access to the computer where you can run low level code. It's not like you can write JS code with this that runs on browsers that lets you leak arbitrary secrets.
And in the case of systems that are valuable enough to exploit with a risk of a dedicated private or state funded group doing the necessary research and targeting, there should be a system that doesn't allow unauthorized arbitrary code to run in the first place.
I personally disable all the mitigations because performance boost is actually noticeable.
That's precisely what Spectre and Meltdown were, though. It's unclear whether this attack would work in modern browsers, but they did re-enable SharedArrayBuffer, and it's unclear if the existing mitigations for Spectre/Meltdown stymie this attack.
> I personally disable all the mitigations because performance boost is actually noticeable.
Congratulations, you are probably susceptible to JS code reading crypto keys on your machine.
OTOH if an adversary gets a low-privilege RCE on your box, exploiting something like Spectre or RowHammer could help elevate the privilege level, and more easily mount an attack on your other infrastructure.
But also note my caveat about database servers, for example. A database server shared between accounts of different trust levels will be affected, if the database supports stored procedures for example. Basically, as soon as there's anything on the box that not all users of it should be able to access anyway, you'll have to be very, very careful.
A quick search reveals that there is at least a timer mechanism, but I have no idea of any of its properties: https://docs.oracle.com/en/database/oracle/oracle-database/1...
But what I'm actually trying to say is: for many intents and purposes (which might or might not include relevance to this specific vulnerability), as soon as you allow stored procedures in your database, "not running arbitrary code" is no longer a generally true statement.
I got really good at CS because I used to work for a contractor in a SCIF where we couldn't bring in any external packages, so I basically had to write C code for things like web servers from scratch.
But I'd never state it definitively, as I don't know enough about what HTML without JS can do these days. For all I know there's a Turing tarpit in there somewhere...
With JS or WASM, it's much more straightforward.
1. No timers - timers are generally a required gadget, and often they need to be high-resolution, or else building a suitable timing gadget gets harder and the bandwidth of the attack goes down (see the counting-thread sketch after this list)
2. No loops - you have to do timing stuff in a loop to exploit bugs in the predictor.
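On the timer point: the standard workaround once you do have threads and shared memory is to build your own clock out of a counting loop. A generic sketch in C rather than JS/WASM, just to show the shape of the gadget:

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdint.h>

    /* One thread increments a shared counter as fast as it can; reading it
       before and after an operation gives a high-resolution relative clock,
       no explicit timer API needed. */
    static atomic_uint_fast64_t ticks;

    static void *counter_thread(void *arg)
    {
        (void)arg;
        for (;;)
            atomic_fetch_add_explicit(&ticks, 1, memory_order_relaxed);
        return NULL;
    }

    void start_counter(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, counter_thread, NULL);
    }

    uint64_t measure(void (*op)(void))
    {
        uint64_t start = atomic_load_explicit(&ticks, memory_order_relaxed);
        op();   /* e.g. the memory access whose latency we want to classify */
        return atomic_load_explicit(&ticks, memory_order_relaxed) - start;
    }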
By the way, you have to be careful on your database server to not actually run arbitrary code as well. If your database supports stored procedures (think PL/SQL), that qualifies, if the clients that are able to create the stored procedures are not supposed to be able to access all data on that server anyway.
Physical isolation simplifies a lot of this. This class of attacks isn't (as) relevant for single-tenant single-workload dedicated machines.
In other words, properly drawing the boundary around "this is safe with the Meltdown mitigations disabled" is very hard and non-intuitive, and you're one configuration/SW change or one violated assumption away from a Meltdown attack, which is cross-process memory access and one notch below remote access. There's a reason you design for security in depth rather than trying to carefully build a Jenga tower where you're one falling block away from total compromise.
No wonder you guys are scared AI is going to take your job lol.
That's not how it works at all. To grab a key stored in a JS variable, the following would need to happen:
1. Attacker needs to find a way to inject arbitrary JS code in a website, which means controlling either an iframe that is loaded or some component. This is a pretty hard thing to do these days with Same-Site strictness
2. The code needs to know specifically what memory address to target. When things like JWT or other tokens are stored in session or local storage, the variable name usually contains a random string. Injected code will have to figure out a way to find what that variable name is.
3. For the attack to work, the cache has to get evicted. How well that works is highly processor-specific, and the web app also has to be in a state where no other process is referencing that variable. With JS, you also have to infer memory layout (https://security.googleblog.com/2021/03/a-spectre-proof-of-c...) first, which takes time. Then you have to train the branch predictor, which also takes time.
So basically, I have a statistically higher chance of losing my keys to someone who physically robs me rather than a cache timing attack.
Generally when an exploit like this drops, people fail to update their systems and you see it being used in the wild. With Spectre/Meltdown, that didn't really happen, because of the nature of how these attacks work and the difficulty of getting the cache timing code to work correctly without targeting a specific processor and being able to execute arbitrary code on the machine.
The vulnerability however allows arbitrary reading of any memory in the system in at least some circumstances, the presented PoC (https://www.youtube.com/watch?v=jrsOvaN7PaA ) demonstrates this by literally searching memory for the system's /etc/shadow and dumping that.
Whether the attack is practical using JS instead of a compiled C program is unknown to me, but if it is, it's not clear to me why the attacker would need to inject JS code into other websites or know what addresses to target. (If it is not, the question is moot.)
The PoC uses compiled C code. I hope I don't have to explain the difference between C code that runs on the system versus JS code that runs in the context of the browser...
I personally would not trust that you couldn't, in the most extreme case, get close enough to the kernel (like the PoC does through system calls) to mispredict it into leaking any kernel-mapped memory through a timing side channel. And nowadays, kernels typically map almost all physical memory in their shared address space (it's not too expensive in a 64 bit address space).
EDIT: See my extended reasoning here: https://news.ycombinator.com/item?id=43991696
With C code, you can pretty much reference any memory location, so you can make things work.
Do you understand the scope of the issue? Do you know that this couldn't personally affect you in a dragnet (so, not targeted, but spread out, think opportunistic ransomware) attack?
Because this statement of yours:
> It's not like you can write JS code with this that runs on browsers that lets you leak arbitrary secrets.
was not true for Spectre. The original spectre paper notoriously mentions JS as an attack vector.
If you truly disable all mitigations (assuming CPU and OS allow you to do so), you will reopen that hole.
So:
> The only real way this could ever be used is if you have direct access to the computer where you can run low level code.
I'm a low level kernel engineer, and I don't know this to be true in the general case. JITs, i.e. the JavaScript ones, also generate "low level code". How do you know of this not being sufficient?
The issue is not whether or not it could affect me, the issue is what is the risk. And I can say for certain that the risk is very low, because I seem to have more understanding of the space.
>The original spectre paper notoriously mentions JS as an attack vector.
In an analogy, having an attack vector is having a certain type of weapon, while executing a full exploit end to end is on the scope of waging a war. Sure, a right person at the right place with that weapon can take out a critical target and win the war, but just having that weapon doesn't guarantee you winning a war.
In the cases of certain exploits, like Log4Shell, that's like having a portable shotgun that shoots hypersonic missiles in a scatter pattern. Log4Shell basically means that if anything gets logged, even an error message, that can be used to execute arbitrary code, and it's super easy to check if this is the case - send payloads to all services with a JNDI URL that you control and see what pops up, and boom, you can have shells on those computers.
In the case of Spectre/Meltdown, it's like having a specific type of booby trap. Whether or not you can actually set up that booby trap highly depends on the environment. If a website is fully protected against code injection, then executing JS cache timing would be impossible. And even if it wasn't, there would be other hurdles.
Of course nothing is ever for certain. For example, browsers can contain some crazy logic bug that bypasses Same-Origin checks that nobody has found yet. But the chance of this happening is extremely low, as browser code is public.
If you can run arbitrary machine code on a system, the memory you can target is (in theory) the entire memory space - you can put any value in a pointer and attempt to read that address through a side-channel attack.
In reality the task is much harder - you don't know where in memory the thing you want is, because of ASLR, virtual memory maps, and other factors, and to exploit cache timing attacks you need cache eviction to happen first, which isn't all that straightforward for some memory addresses.
JavaScript that runs in the browser, on the other hand, has a lot more restrictions. You can't dereference a pointer to an arbitrary memory address in JS; you need an existing variable in the current context that is mapped to some memory.
The paper demonstrates this with the C PoC using a system call as a gadget. Any value can be passed into the system call before it gets checked for validity on the other side of the kernel boundary. In their example, they use the "buffer" and "buflen" arguments to the keyctl system call, which results in the values passed into the system call ending up in the registers r12 and r13. Then, they mispredict into a disclosure gadget that uses r12 and r13 for dereferencing pointers.
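On the user-space side, the priming call presumably looks something like this (a sketch based only on the description above; the key id, which value plays which role, and the kernel-side gadget are details from the paper and not reproduced here):

    #define _GNU_SOURCE
    #include <stdint.h>
    #include <linux/keyctl.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* keyctl()'s "buffer"/"buflen" arguments carry attacker-chosen values
       into the kernel. Architecturally the call just fails validation, but
       by then the values are already sitting in registers when a
       mispredicted indirect branch runs a gadget that treats them as
       pointers. */
    static void prime_registers(uint64_t value_a, uint64_t value_b)
    {
        syscall(SYS_keyctl, KEYCTL_READ, 0,   /* 0: assumed-invalid key id */
                (void *)value_a,              /* "buffer" */
                (size_t)value_b);             /* "buflen" */
    }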
Note how "buflen" isn't even a pointer (for keyctl) to begin with, but the (as far as I understand) unrelated disclosure gadget code dereferences r13, because it treats it as a pointer, and they managed to mispredict into it through keyctl's call to the "read" function pointer (this is the part that's still a bit fuzzy to me, as I unfortunately don't fully understand the misprediction itself and how they control for arbitrary destinations). Now, obviously you can't directly make system calls through JS. But I don't understand yet what, if anything, is in place to absolutely make sure that there are no indirect ways that result in a system call (or another path!) where benign but arbitrary values get passed as arguments in registers, executing benign code but being mispredicted into a different kernel code path where those registers are used as pointers.
And then, once you can do that, you can affect almost arbitrary physical memory, since typically almost all physical memory is mapped in the kernel's address space.
Sure, this is much harder because of the layers in between, but I still don't quite understand why it's impossible, and why a sufficiently motivated attacker might not eventually find a workable solution?
Spectre just seems so fundamentally catastrophic to me that anything but proper hardware fixes to how privilege boundaries are honored by speculative execution merely makes things harder, and how much harder is a very non-trivial question. Is it hard enough?
(As for ASLR, in their paper they break that as their first step using their own methods.)
Still, who really knows if there isn't some jump table generator or whatever to allow an attacker to generate branch targets arbitrarily enough (remember that it's not necessary to branch to the full address to train the branch predictor).
Because this would not be a vulnerability in any sense by itself. It would be yet another completely benign but unlucky piece of code that just allows the tire fire that Spectre is to be leveraged.
I'm probably missing other relevant aspects.
As for cache flushing, I think that's what the disclosure gadget does: "The disclosure gadget needs to use the two attacker-controlled registers to leak and transmit the secret via Flush+Reload", so that's also kernel code which we mispredict into. But I'm not totally sure.
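For context, the receiving end of a Flush+Reload channel generally looks like this (a generic textbook sketch, not the paper's code; the 4096-byte stride and the timing threshold are the usual choices):

    #include <stdint.h>
    #include <x86intrin.h>   /* _mm_clflush, __rdtscp */

    #define STRIDE 4096
    static uint8_t probe[256 * STRIDE];

    static inline uint64_t time_load(volatile uint8_t *p)
    {
        unsigned aux;
        uint64_t start = __rdtscp(&aux);
        (void)*p;
        return __rdtscp(&aux) - start;
    }

    /* The transiently executed gadget is assumed to have touched
       probe[secret_byte * STRIDE]; whichever line now loads fast was hit. */
    int recover_byte(uint64_t threshold)
    {
        int best = -1;
        for (int b = 0; b < 256; b++) {
            if (time_load(&probe[b * STRIDE]) < threshold)
                best = b;
            _mm_clflush(&probe[b * STRIDE]);   /* re-flush for the next round */
        }
        return best;
    }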
So overall, putting together an exploit with this through JS becomes a matter of lots and lots of research and testing, for a specific target - i.e not worth the effort for anyone but a state sponsored agency.
> Does Branch Privilege Injection affect non-Intel CPUs?
> No. Our analysis has not found any issues on the evaluated AMD and ARM systems.
Source: https://comsec.ethz.ch/research/microarch/branch-privilege-i...
There are probably similar bugs in AMD and ARM, I mean how long did these bugs sit undiscovered in Intel, right?
Unfortunately the only real fix is to recognize that you can’t isolate code running on a modern system, which would be devastating to some really rich companies’ business models.
Does pinning VMs to hardware cores (including any SMT'd multiples) fix this particular instance? My understanding was that doing that addressed many of the modern side channel exploits.
Of course that's not ideal, but it's not too bad in an era where the core count of high end CPUs continues to creep upwards.
You could say we only update the predictor at retirement to solve this. But that can get a little dicey too: the retirement queue would have to track this locally, and retirement frees up registers, so you'd better be sure it's not the one your jump needs to read. Doable, but slightly harder than you might think.
Why mention only Windows, what about Linux users?
Not expert enough to know what to look for to see if these particular mitigations are present yet.
INTEL-SA-01247 covers that CVE.
Microcode release 20250512 has that INTEL-SA mitigated.
https://github.com/intel/Intel-Linux-Processor-Microcode-Dat...
https://www.intel.com/content/www/us/en/security-center/advi...
On top of that x86 seems to be pushed out more and more by ARM hardware and now increasingly RISC-V from China. But of course there's the US chip angle - will the US, especially after the problems during Covid, let a key manufacturer like Intel bite the dust?
It's not great but lol the sensationalism is hilarious.
Remember, gamers only make up a few percent of users for what Intel makes. But that's what you hear about the most. One or two data center orders are larger than all the gaming CPUs Intel will sell in a year. And Intel is still doing fine in the data center market.
Add in that Intel still dominates the business laptop market which is, again, larger than the gamer market by a pretty wide margin.
The two areas you mention (data center, integrated OEM/mobile) are the two that are most supply chain and business-lead dependent. They center around reliable deliveries of capable products at scale, hardware certifications, IT department training, and organizational bureaucracy that Intel has had captured for a long time.
But!
Data center specifically is getting hit hard from AMD in the x86 world and ARM on the other side. AWS's move to Graviton alone represents a massive dip in Intel market share, and it's not the only game in town.
Apple is continuing to succeed in the professional workspace, and AMD's share of laptop and OEM contracts just keeps going up. Once an IT department or their chosen vendor has retooled to support non-Intel, that toothpaste is not going back into the tube - not fully, at least.
For both of these, AMD's improvement in reliability and delivery at scale will be bearing fruit for the next decade (at Intel's expense), and the mindshare, which gamers and tech sensationalism are indicators for, has already shifted the market away from an Intel-dominated world to a much more competitive one. Intel will have to truly compete in that market. Intel has stayed competitive in a price-to-performance sense by undermining their own bottom line, but that lever only has so far it can be pulled.
So I'm not super bullish on Intel, sensationalism aside. They have a ton of momentum, but will need to make use of it ASAP, and they haven't shown an ability to do that so far.
https://en.wikipedia.org/wiki/Intel_Management_Engine
https://en.wikipedia.org/wiki/AMD_Platform_Security_Processo...
Now, i may be misremembering and i don't have time today to download all his talks and grep the .vtt for "ARM"; however, my memory is reinforced by literally 30 seconds of internet searches. i bought the Minix book because of one of the presentations.
i'm not doing any more research for free on this. Even if it isn't ARM, it isn't x86.
Are you perhaps looking at some slides from Cyber@UC Meeting 81, held Jan 16 2019, located at https://www.cyberatuc.org/files/slides/meeting_081.pdf which clearly say
> Physically separate processor embedded within the x86 processor that runs a custom MINIX image
and misreading that as saying more than what it does?
And those slides link to more resources saying it's an x86.
you win.
And somehow this processor would also not be on the target list for Coreboot/Libreboot/Purism/Google people trying to de-ME their hardware?
Mr. Occam says I have very little reason to trust your recall/judgement, at this time.
It's funny that i knew about the minix even though according to your sources that wasn't what was running on the x86 chips until after they removed the RISC embedded cpu and switched to "x86." i've looked at your wiki link and followed the footnote, to an archive.org page where it is merely claimed that it is "now x86" and "running minix 3".
So we're at an impasse. I'm not downloading a bunch of youtube .vtt files and you've linked as authoritative sources as i have at this point; "someone said so."
that is: wiki cites the ptsecurity blogpost from august 2017 as the source for the claim that it is x86. furthermore, the blogpost claims that the architecture is "lakemont" which is 32nm, but the blog claims it's 22nm. Further, it claims it's specifically the quark, which was discontinued in 2019. i understand they can use the IP in the toolchain to put that on the main die, as well as build that part of the die at a larger size. However, there are a few other assertions that appear in there (in the code listings) that appear nowhere else on the internet.
oh, and ask mister occam if a physically separate chip (the Intel PCH 100 and up) counts as "embedded in the intel CPU" which is what i've been saying (ring -1, ring -4 are all on the physical die of the CPU.)
since we like wiki so much https://en.wikipedia.org/wiki/Platform_Controller_Hub that's where the ME is, per your link, first paragraph: It is located in the Platform Controller Hub of modern Intel motherboards.
I knew this was a waste of time, and now i spent an hour digging through crap that makes my eyes bleed like https://www.intel.com/content/www/us/en/content-details/3326...
you're talking about a completely separate chip, and that was a red herring. I'm pretty annoyed at myself right now.
Product aside, from a shareholder/business point of view (I like to think of this separately these days as financial performance is becoming less and less reflective of the end product) I think they are too big to fail.
"Unfortunately for John, the branches made a pact with Satan and quantum mechanics [...] In exchange for their last remaining bits of entropy, the branches cast evil spells on future generations of processors. Those evil spells had names like “scaling-induced voltage leaks” and “increasing levels of waste heat” [...] the branches, those vanquished foes from long ago, would have the last laugh."
https://www.usenix.org/system/files/1401_08-12_mickens.pdf
> The Mossad is not intimidated by the fact that you employ https://. If the Mossad wants your data, they’re going to use a drone to replace your cellphone with a piece of uranium that’s shaped like a cellphone, and when you die of tumors filled with tumors, […] they’re going to buy all of your stuff at your estate sale so that they can directly look at the photos of your vacation instead of reading your insipid emails about them.
Kinda like the old chestnut that rich people are only rich on paper, and then Musk buys Twitter. Not Tesla, or some DBA; Musk.
This decade might actually be the season of reveal.
The Cold War for example was full of these intricate, complex and stunning feats of spycraft that they'd pull off on each other.
> “Making processors faster is increasingly difficult,” John thought, “but maybe people won’t notice if I give them more processors.” This, of course, was a variant of the notorious Zubotov Gambit, named after the Soviet-era car manufacturer who abandoned its attempts to make its cars not explode, and instead offered customers two Zubotovs for the price of one, under the assumption that having two occasionally combustible items will distract you from the fact that both items are still occasionally combustible.
> Formerly the life of the party, John now resembled the scraggly, one-eyed wizard in a fantasy novel who constantly warns the protagonist about the variety of things that can lead to monocular bescragglement.
And in 2013 the below would have been correct, but we live in a very different world now:
> John’s massive parallelism strategy assumed that lay people use their computers to simulate hurricanes, decode monkey genomes, and otherwise multiply vast, unfathomably dimensioned matrices in a desperate attempt to unlock eigenvectors whose desolate grandeur could only be imagined by Edgar Allen Poe. Of course, lay people do not actually spend their time trying to invert massive hash values while rendering nine copies of the Avatar planet in 1080p.
He wasn't too far off about the monkeys, though...
Suppose you want to measure the distribution of the delay between recurring events (which is basically what's at the heart of those vulnerabilities). Suppose the delays are all sub-milliseconds, and that your timer, to pick something ridiculous, only has a 2 second granularity.
You may at first think that you cannot measure the sub-millisecond distribution with such a coarse timer. But consider that events and timers are not synchronized to each other, so with enough patience you will still catch some events barely on the left or on the right side of your 2-second timer tick. Do this over a long enough time, and you can reconstruct the original distribution. Even adding some randomness to the timer tick just means you need more samples to suss the statistics out.
Again, I am not an expert, and I don't know if this actually works, but that's what I came up with intuitively, and it matches with what I heard from some trustworthy people on the subject, namely that non-precision timers are not a panacea.
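To make that intuition concrete, the idea is something like the sketch below (illustrative only; coarse_clock() and do_event() are placeholders, and real attacks are more sophisticated about it):

    #include <stdint.h>

    extern uint64_t coarse_clock(void);   /* hypothetical timer with a huge tick */
    extern void do_event(void);           /* the sub-tick event being measured */

    /* If the event starts at a random phase relative to the tick, the chance
       that a tick edge lands inside it is duration/tick_period, so counting
       edge crossings over many trials recovers the duration. */
    double estimate_duration(double tick_period, int trials)
    {
        int crossings = 0;
        for (int i = 0; i < trials; i++) {
            uint64_t before = coarse_clock();
            do_event();
            if (coarse_clock() != before)
                crossings++;
        }
        return tick_period * crossings / trials;
    }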
If each timer draws from the same random distribution then sure, you could work out the real tick with greater accuracy, but I don’t know if that is practical.
If the timers draw from different distributions then it is going to be much harder.
I imagine there is an upper limit on how much processing can be done per tick before any attack becomes implausible.
Again, I'm an amateur, but I think you just need to know that distribution, which I guess you usually do (open source vs. closed source barely matters there), law of large numbers and all.
Anyway, looking through the literature, this article presents some actual ways to circumvent timers being made coarse-grained: https://attacking.systems/web/files/timers.pdf
In that article, the "Clock interpolation" sounds vaguely related to what I was describing on a quick read, or maybe it's something else entirely... Later, the article mentions alternative timing sources altogether.
Either way, the conclusion of the article is that the mitigation approach as a whole is indeed ineffective: "[...] browser vendors decided to reduce the timer resolution. In this article, we showed that this attempt to close these vulnerabilities was merely a quick-fix and did not address the underlying issue. [...]"
The mitigating factor is actually that you don't go to malicious websites all the time, hopefully. But it happens, including with injected code in ads and the like that may be enabled by secondary vulnerabilities.
[1] Not even including "potentially exploitable from JavaScript", which Spectre was. It's sufficient if you name one where an ordinary userspace program can do it.
The exploit is being able to do it from usermode through an api (browser/js) that normally forbids that.
Userspace can only access its own memory, rather than the whole system's.
https://cheatengine.org/
https://www.wemod.com/
From that piece of text on the blog, I don't quite understand if Kaby Lake CPUs are affected or not.
https://comsec.ethz.ch/research/microarch/branch-privilege-i...
Look forward to learning how this can be meaningfully mitigated.
[1] https://www.intel.com/content/www/us/en/security-center/advi...
AMD has had SEV support in QEMU for a long time, which some cloud hosting providers use already, that would mitigate any such issue if it occurred on AMD EPYC processors.
[1] See, e.g., https://www.amd.com/en/resources/product-security/bulletin/a... and https://www.intel.com/content/www/us/en/developer/articles/t...
Their new processors are quite inviting, but as with all CPUs I'd prefer to keep the entire thing to myself.
Then people say "no that's not possible, we got security in place."
So then the researchers showcase a new demo where they use their existing knowledge with the same issue (i.e. scaling-induced voltage leaks).
I suspect this will go on and on for decades to come.
Of course at launch, the vulns haven't been found yet, so the performance uplift claims are all fluff based on the reckless actions of an unprotected trick. By the time the skeletons have come out of the closet, the release hype cycle is far in the past, and only a few nerd blogs will cover the corresponding performance hit.
Lather, rinse, repeat for the next launch. New tricks, new reckless performance gains, big uplift numbers, hope it takes a little while before anybody points out the holes in the emperor's clothes. But at the end of the day, the actual performance progress over time is, I suspect, dramatically slower than what the naïve summation of individual launch-day claims would suggest.
To your second point:
I think it's here to stay as long as we allow someone else's code (via js, webassembly, and their ilk) to run on our processors as a matter of course.
A rethink of the entire modern web, back to a simpler markup language that isn't Turing-complete and can't leak data back to the attacker's server, would be needed before we could turn the mitigations off and enjoy the performance we were promised.
But of course, if we get rid of the modern web as we know it, we probably don't need all that performance anyway. An old BlueWave/QWK mailer consumes 10,000x less resources than a Gmail tab, you know?
So not very up to date, but I suppose mitigations haven't changed significantly upstream since then.
The kernel has nothing to do with Ubuntu, its release schedule and LTSs. Distro LTS releases also often mean custom kernels, backports, hardware enablement, whatnot, which makes it a fork, so unless we're analyzing Ubuntu security rather than Linux security, mainline should be used.
And since security updates are backported to all supported versions, and 24.04 is an LTS release, it is as up to date as it gets.
If you're being pedantic, be the right kind of pedantic ;)
This differs from an actual later release which is closer to mainline and includes all newer fixes, including ones that are important but weren't flagged, and with less risk of having new downstream bugs.
If you're going to fight pedantry by being pedantic, better be the right kind of pedantic. ;)
Edit: "LTS" added due to popular demand
Distro LTS releases often mean custom kernels, backports, hardware enablement, whatnot, which makes it effectively a fork.
Unless we're interested in discovering kernel variation discrepancies, it's more interesting to analyze mainline.
On an LTS, you'll be running a Canonical kernel, or a Red Hat kernel, or a SuSE kernel, or an Oracle kernel, or...
Each will have different backports, different hardware enablement, different random patches of choice, and so different bugs and problems.
Unless we're evaluating the security of a particular distro release, mainline is what Linux is and will ultimately be the shared base for future releases.
LTS does not mean you get all updates, it only means you get to drag your feet for longer with random bugfixes. Only the latest release has updates.
But security research should be done against the current state. Something as simple as a performance optimization can end up affecting the exploitability, and while that doesn't change whether the CPU is vulnerable it does change the conclusion.
Evaluating whether a particular old, forked codebase is security-wise identical is a fool's errand, and even then that doesn't answer whether an equivalent Red Hat kernel is vulnerable, as that's a different fork with different backports and local patches. Mainline is the shared base.
The kernel has numerous CPU bug mitigations that change kernel behavior to make the CPU bug ineffective for active exploitation (microcode rarely fixes bugs other than just disabling a whole subsystem - they usually take silicon iterations to fix, and the kernel has to pick up the slack), and current kernel design choices may also unintentionally render the vulnerability ineffective.
That's why they specifically say what OS and version they're running, exactly because it is crucial. It's just that they are not, in fact, up to date when it comes to the kernel.