My understanding was that many of the fixes for speculative execution issues themselves led to performance degradation. Does anyone know the latest on that and how this compares?
Are these performance hit numbers inclusive of turning off the other mitigations?
snvzz · 35m ago
There's about one way[0] to fix timing side channels.
The RISC-V ISA has an effort to standardize a timing fence[1][2], to take care of this once and for all.
What kind of workloads have noticeably lower performance with VBS?
kookamamie · 46m ago
We're working on HPC / graphics / computer-vision software and noticed a particularly nasty issue with VBS enabled just last week. Although, it has to be mentioned, it was on Win10 Pro.
api · 2h ago
That's still really massive. It would only make sense in very high security environments.
Honestly, running system services in VMs, or using an OS like Qubes, would be cheaper and just as good. The VM hit is much smaller, less than 1% in some cases on newer hardware.
gpapilion · 1h ago
It makes sense in any environment where workloads from two different parties share compute, i.e. public clouds.
The protection here is to ensure the VMs are isolated. Without it, there is the potential to leak data via speculative execution across guests.
eptcyka · 1h ago
VMs suffer from memory use overhead. Would be cool if the guest kernel would cooperate with the host on that.
traverseda · 1h ago
It will! For Linux hosts and Linux guests, if you use virtio and memory ballooning.
shortrounddev2 · 1h ago
This was an issue for me a few years ago running Docker on macOS. macOS required you to allocate memory to Docker ahead of time, whereas Windows/Hyper-V was able to use memory ballooning in WSL2.
api · 1h ago
It's possible to address this to some extent with ballooning memory drivers, etc.
riedel · 2h ago
From reading the article, that is exactly also the feeling of the people involved. The question is whether they are on track towards e.g. the 1% eventually.
Traubenfuchs · 1h ago
Sometimes something in me starts wondering whether this regularly occurring slowing of chips through exploit mitigation is deliberate.
All of big tech wins: CPUs get slower and we need more vCPUs and more memory to serve our JavaScript slop to end customers. The hardware companies sell more hardware, the cloud providers sell more cloud.
gpapilion · 1h ago
I think it’s more pragmatic. We can eliminate hyperthreading to solve this, or increase memory safety at the cost of performance. One is a 50% hit in terms of vcpus, the other is now sub 50%.
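To put rough numbers on that trade-off, here is a minimal sketch. It assumes 2-way SMT and treats the mitigation cost as a flat fraction of throughput; the function names and the 25% overhead in the usage note are hypothetical illustration values, not measurements from the article.

```python
def vcpus_with_smt_off(physical_cores: int) -> int:
    """vCPUs available when SMT is disabled: one hardware thread per core."""
    return physical_cores


def effective_vcpus_with_mitigation(physical_cores: int,
                                    overhead: float,
                                    threads_per_core: int = 2) -> float:
    """vCPU-equivalents left when SMT stays on but every thread
    loses `overhead` (a fraction in [0, 1)) to the mitigation."""
    if not 0.0 <= overhead < 1.0:
        raise ValueError("overhead must be in [0, 1)")
    return physical_cores * threads_per_core * (1.0 - overhead)
```

For a hypothetical 64-core host, turning SMT off leaves 64 vCPUs, while keeping SMT on with a 25% mitigation overhead leaves the equivalent of 96. Under these assumptions the mitigation wins as long as its overhead stays below 50%.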
Traubenfuchs · 1m ago
They also need some phony justifications though.
Can't just turn off hyperthreading.
Avamander · 1h ago
These types of mitigations have the biggest benefit when resources are shared. Do you really think cloud vendors want to lose performance to CPU or other mitigations when they could literally sell those resources to customers instead?
bzzzt · 27m ago
They don't lose anything since they sell the same instance, which just performs worse with the mitigations on.
Customers are paying because they need more instances.
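As a back-of-the-envelope sketch of that argument: if each instance loses a fraction of its throughput, the number of instances needed for the same total work grows by 1 / (1 - overhead). The function name and the 10% figure in the usage note are made up for illustration, not taken from the article.

```python
import math


def instances_needed(baseline_instances: int, overhead: float) -> int:
    """Instances required to keep the same total throughput when each
    instance loses `overhead` (a fraction in [0, 1)) of its performance."""
    if not 0.0 <= overhead < 1.0:
        raise ValueError("overhead must be in [0, 1)")
    # Each instance now delivers (1 - overhead) of its former throughput.
    return math.ceil(baseline_instances / (1.0 - overhead))
```

With a hypothetical 10% overhead, a fleet of 100 instances becomes 112; at 50% it doubles to 200.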
depingus · 48m ago
Sometimes it's fun to engage in a little conspiratorial thinking. My 2 cents... That TPM 2.0 requirement on Windows 11 is about to create a whole ton of e-waste in October (Windows 10 EOL).
bzzzt · 1h ago
Why would big tech do this when customers bring it upon themselves by building JavaScript slop?
0. https://tomchothia.gitlab.io/Papers/EuroSys19.pdf
1. https://lf-riscv.atlassian.net/wiki/spaces/TFXX/pages/538379...
2. https://sel4.org/Summit/2024/slides/hardware-support.pdf