Maybe an easier way out is to add safe access instructions to LLVM itself. It's an IR after all, so it should be possible to do a three-phase update: add instructions to the IR, update the intermediate LLVM generator, then update the targeting backends.
wyldfire · 2h ago
A really good accompaniment to this is Carruth's "C++, bounds checking, performance, and compilers" [1]:
> ... strong belief that bounds checks couldn’t realistically be made cheap enough to enable by default. However, so far they are looking very affordable. From the above post, 0.3% for bounds checks in all the standard library types!
There's more to the hardening story than just bounds checks. But it's a big part IMO.
Even if bounds checks were only active in debug builds, that would already be of high value.
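To make the bounds-check discussion concrete, here is a minimal sketch of checked element access using only standard C++. The helper name `checked_get` is hypothetical and not from the thread. `std::vector::at` is always checked, while `operator[]` is unchecked by default and only gains checks when a library hardening option is enabled (for example, as far as I know, `_LIBCPP_HARDENING_MODE` in libc++ or `_GLIBCXX_ASSERTIONS` in libstdc++).

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical helper (illustrative only): read an element through the
// checked accessor at(), which throws std::out_of_range on a bad index
// instead of invoking undefined behaviour.
std::optional<int> checked_get(const std::vector<int>& v, std::size_t i) {
    try {
        return v.at(i);        // bounds-checked in every build mode
    } catch (const std::out_of_range&) {
        return std::nullopt;   // the bad index is reported to the caller
    }
}
```

Calling `checked_get(v, v.size())` yields an empty optional rather than reading past the end, which is roughly the behaviour a hardened `operator[]` turns into a trap.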
pjmlp · 1h ago
That, at least, has been covered almost as long as C++ has existed.
First in compiler vendors' frameworks, pre-C++98, and afterwards with build settings.
It is quite telling about the existing community culture that some folks only read their compiler manuals when the government knocks on the door.
dilawar · 4h ago
> So this mode needs to set user expectations appropriately: your code breaking between compiler releases is a feature, not a bug.
Good luck. I feel that the C++ community values backward compatibility way too much for this to succeed. Most package maintainers are not going to like it one bit.
pjmlp · 3h ago
There has been plenty of breakage throughout ISO revisions.
The biggest problem is ABI. In theory that isn't something the standard cares about, but in practice all compiler vendors do, so proposals that break ABI with existing binary libraries tend to be an issue.
Another issue is that WG21 nowadays is full of people without compiler experience, willing to push through their proposals even without implementations, which compiler vendors are then supposed to suck up and implement somehow.
Around the C++14 timeframe it became cool to join WG21, and now the process is completely broken; there are more than 200 members.
There is no guidance on an overall vision per se, everyone gets to submit their pet proposal, and then needs to champion it.
Most of these folks aren't that keen on security, hence the kind of baby steps that have been happening.
dzaima · 2h ago
Compilers at least allow specifying the standard to target, which solves the ISO revision issue. But breaking within the same -std=... setting is quite a bit more annoying, forcing either indefinite patching on otherwise-complete functional codebases, or keeping potentially every compiler version on your system, both of which are pretty terrible options.
pjmlp · 1h ago
Breaking within the same std is impossible to prevent in compiled languages with enough freedom in the build.
As for the C ABI many talk about, most of them don't have any idea what they are actually talking about.
First of all, it is the OS ABI, in operating systems that happened to be written in C.
Secondly, even C binary libraries have plenty of breakage opportunities within the same std and compiler.
ABI stability, even in languages that kind of promise it, is in reality a half promise.
Bytecode, or some part of the language, is guaranteed to be stable while being tied to a specific version; not all build flags fall under the promise, and not much is promised about the standard library.
Even good examples that go to great lengths, like Java, .NET or Swift, aren't fully ABI safe.
dzaima · 35m ago
It's certainly not impossible to write code that breaks, or modify a library in an ABI-incompatible way, but ABI stability, at least on Linux, does largely Just Work™. A missing older shared library can be quite problematic, but that's largely it.
And while, yes, there are times where ABIs are broken, compiler versions affecting things would add a whole other uncontrollable axis on top of that. I would quite like to avoid a world of "this library can only be used by code compiled with clang-25" as much as possible.
pjmlp · 30m ago
"Works most of the time, probably" isn't really the meaning of stable.
dzaima · 7m ago
Can't solve the issue of "you just don't have the library (or a specific version thereof) installed".
But you can make it worse by changing "You must have version X of library Y installed" to "You must have version X of library Y compiled by compiler Z installed".
But at least one can reasonably achieve ABI stability for their C library if they want to; really all it takes is "don't modify exposed types or signatures of exposed functions" and "don't use intmax_t", and currently you can actually break the latter.
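As a rough illustration of those two rules, here is a sketch of a C-callable interface that hides its type layout behind an opaque pointer and uses a fixed-width integer instead of `intmax_t`. The `counter` library and all of its function names are hypothetical, made up for this example.

```cpp
#include <cstdint>
#include <new>

// Hypothetical "counter" library; every name here is illustrative.
extern "C" {
    // Opaque type: callers only ever hold a pointer, so the fields can
    // change between library versions without breaking compiled callers.
    struct counter;

    counter*     counter_create(void);
    void         counter_add(counter* c, std::int64_t delta);  // fixed width, not intmax_t
    std::int64_t counter_value(const counter* c);
    void         counter_destroy(counter* c);
}

// Private definition, normally kept out of the public header.
struct counter {
    std::int64_t value;
};

extern "C" counter* counter_create(void) {
    return new (std::nothrow) counter{0};   // returns nullptr on allocation failure
}
extern "C" void counter_add(counter* c, std::int64_t delta) {
    if (c) c->value += delta;
}
extern "C" std::int64_t counter_value(const counter* c) {
    return c ? c->value : 0;
}
extern "C" void counter_destroy(counter* c) {
    delete c;
}
```

As long as the four signatures stay untouched, fields can be added to `counter` later without recompiling callers, because no caller ever sees its size or layout.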
yjftsjthsd-h · 40m ago
> First of all, it is the OS ABI, in operating systems that happened to be written in C.
It may be per-OS (I wouldn't try linking Linux and NT object files even if they were both compiled from C by GCC with matching versions and everything), but enough details come from C that I think it's fair to call it a C ABI. Like, I can write Unix software in Pascal, but in order to write to stdout that code is gonna have to convert Pascal strings into C strings. OTOH, Pascal binaries using Pascal libraries can use Pascal semantics even on an OS that uses C ABIs.
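The string conversion mentioned above can be sketched as follows, assuming the classic Pascal "short string" layout (one length byte followed by that many characters, no terminator). The helper name `pascal_to_c` is hypothetical, and other Pascal dialects lay out strings differently.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copy a length-prefixed Pascal short string into a
// std::string, whose c_str() then provides the NUL-terminated form that
// C interfaces expect.
std::string pascal_to_c(const unsigned char* p) {
    const std::size_t len = p[0];   // first byte holds the length
    return std::string(reinterpret_cast<const char*>(p + 1), len);
}
```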
pjmlp · 35m ago
Strings are the easy part.
Try to link two binary libraries on Linux, both compiled with GCC but not with exactly the same compiler flags or the same data padding, for example in structures.
Since committee people can explain it even better, see "To Save C, We Must Save ABI": https://thephd.dev/to-save-c-we-must-save-abi-fixing-c-funct...
Assuming the code is position independent, why can't the linker translate the ABI?
dzaima · 1h ago
Maybe some things could be translated by a linker, but a linker can't change the size/layout of an in-memory data structure, and there's no info on what to translate from, even if info was added on what to translate to, anyway.
tempodox · 1h ago
Data sizes, alignment, the way stuff is loaded into registers, all that can change.
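A small sketch of how the same declaration can end up with two different layouts. The struct names are made up for illustration, and `#pragma pack` stands in for a build-wide setting such as GCC's `-fpack-struct`. Exact sizes for the unpacked version depend on the platform.

```cpp
#include <cstddef>
#include <cstdint>

// Default layout: the compiler inserts padding after 'tag' so that
// 'value' is suitably aligned.
struct Record {
    std::uint8_t  tag;
    std::uint32_t value;
};

// Same fields with 1-byte packing, as if the translation unit had been
// built with a different packing setting.
#pragma pack(push, 1)
struct PackedRecord {
    std::uint8_t  tag;
    std::uint32_t value;
};
#pragma pack(pop)

// Offsets of 'value' under each layout; they differ, so two binaries
// that disagree on the setting read the field from different bytes.
constexpr std::size_t record_value_offset() { return offsetof(Record, value); }
constexpr std::size_t packed_value_offset() { return offsetof(PackedRecord, value); }
```

A linker sees only symbols and relocations, not these offsets, which is why it cannot reconcile two objects that were compiled with different layouts for the same type.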
porridgeraisin · 39m ago
I don't like that statement (or that whole paragraph) one bit either. My packages breaking between compiler releases is most definitely a big fat bug.
If bounds checks are going to be added, cool, -fstl-bounds-check. Or -fhardened like GCC. But not by default.
Working existing code is working existing code, I don't care if it looks "suspicious" to some random guy's random compiler feature.
[1] https://chandlerc.blog/posts/2024/11/story-time-bounds-check...
"To Save C, We Must Save ABI"
https://thephd.dev/to-save-c-we-must-save-abi-fixing-c-funct...