Memory safety is table stakes

78 comradelion 80 6/26/2025, 7:36:11 PM usenix.org ↗

Comments (80)

0xbadcafebee · 4h ago
Lisp, Algol 68, Pascal, Smalltalk, ML, all had both memory safety and type safety. Nobody uses them today. Why? Because software isn't developed by rational beings choosing the best tool for the job. It's developed by humans who are influenced by their cultural norms and environment. You can give someone a perfect programming language that produces bug-free programs, and they'll reject it because it uses curly-braces or some shit. Write all the papers you want; as long as the inmates are running the asylum, there is no safety.
pron · 3h ago
Algol 68 and Pascal weren't memory-safe, and as for Lisp, Smalltalk, and ML, their style of memory safety - based on GC - took over the world pretty much the second it became practical enough for widespread use.

It is true that some decisions people make aren't rational, and it may even be true that most decisions most people make aren't entirely rational, but the claim that the whole software market, which is under selective pressures, manages to make irrationally wrong decisions in a consistently biased way is quite extraordinary and highly unlikely. What is more likely is that the decisions are largely rational, just don't correspond to your preferences. It's like the VHS vs. Betamax story. Fans of the latter thought that the preference for the former was irrational because of the inferior picture quality, but VHS was superior in another respect - recording time - that mattered more to more people.

I was programming military applications in Ada in the nineties (also not memory-safe, BTW) and I can tell you we had very good reasons to switch to C++ at the time, even from a software correctness perspective (I'm not saying C++ still retains those particular advantages today).

If you think so many people who compete with each other make a decision you think is obviously irrational, it's likely that you're missing some information.

daymanstep · 3h ago
What were the reasons for switching from Ada to C++, if I may ask?
pron · 3h ago
The compiler was much faster, the tooling better, and it was easier to find knowledgeable programmers (we were spending quite a bit of time sifting through thick Ada reference manuals). Whatever correctness benefits Ada provided at the language level were more than made up for by C++'s productivity boosts (at the time) that allowed writing and running more tests and fixing bugs more quickly, resulting in code that was no less correct and easier to maintain and evolve to boot.
steveklabnik · 52m ago
"table stakes" does not mean "guaranteed to succeed," so it's no surprise that some memory safe languages have died.
burnt-resistor · 3h ago
Pascal has subrange integer types. I'm wondering if any other language besides family relatives Ada or Delphi has this, apart from dependent type systems like Idris[0] or an explicit Haskell type like Data.Range.[1]

0. https://stackoverflow.com/questions/28426191/how-to-specify-...

1. https://hackage.haskell.org/package/range-0.3.0.2/docs/Data-...
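
For readers unfamiliar with subrange types: Pascal lets you write something like `type Month = 1..12;` and have the range enforced. A minimal Rust sketch of emulating that with a newtype and a checked constructor follows. The `Month` type and its methods are hypothetical names made up for this illustration, and the check happens at run time, not in the type system as a dependent type would allow.

```rust
/// Hypothetical newtype emulating a Pascal-style subrange `1..12`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Month(u8);

impl Month {
    pub const MIN: u8 = 1;
    pub const MAX: u8 = 12;

    /// Returns `None` when `value` falls outside the subrange.
    pub fn new(value: u8) -> Option<Self> {
        if (Self::MIN..=Self::MAX).contains(&value) {
            Some(Month(value))
        } else {
            None
        }
    }

    /// Returns the wrapped value, which is always within `MIN..=MAX`.
    pub fn get(self) -> u8 {
        self.0
    }
}

fn main() {
    // In range: the constructor hands back a `Month`.
    assert_eq!(Month::new(12).map(Month::get), Some(12));
    // Out of range on either side: rejected at construction time.
    assert!(Month::new(0).is_none());
    assert!(Month::new(13).is_none());
}
```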

ksec · 4h ago
I am the only one on HN that brings up Ada because I think it deserves some credit. But then it seems there is a lot of hate towards Pascal-style syntax.
nick_ · 7h ago
If Rust is the language that finally overwhelms the resistance to memory safe languages, that's good.

I think it's also important not to centre Rust alone. In the larger picture, Rust has a combo of A) good timing, and B) the best evangelism. It stands on decades of memory safe language & runtime development, as well as the efforts of their many advocates.

jandrewrogers · 6h ago
This statement seems imprecise. We've had memory-safe languages for decades and they are the primary programming languages used today e.g. Java and Python. There is no meaningful resistance to them.

If you look at what unsafe languages are used for, it mostly falls into two camps (ignoring embedded). You have legacy code e.g. browsers, UNIX utilities, etc which are too expensive to rewrite except on an opportunistic basis even though they could be in principle. You have new high-performance data infrastructure e.g. database kernels, performance-engineered algorithms, etc where there are still significant performance and architectural advantages to using languages like C++ that are not negotiable, again for economic reasons.

Most of the "resistance" is economic reality impinging on wishful thinking. We still don't have a practical off-ramp for a lot of memory-unsafe code. To the extent a lot of evangelism targets these cases it isn't helpful. It is like telling people living in the American suburbs that they should sell their cars and take the bus instead.

odyssey7 · 4h ago
I don’t buy the economic argument favoring memory-unsafe languages. There are fast memory-safe options. Legacy codebases can eventually become more expensive to maintain than to rewrite. What is the economic cost of an Achilles’ heel when critical systems are destroyed?

There are critical systems today that are essentially Prince Rupert’s drops. Mightily impressive, but with catastrophic weaknesses in the details.

tptacek · 6h ago
I think it's important to keep the scope of the debate well-defined, because memory-safe languages completely stomped out memory-unsafe languages more than 20 years ago; almost all new code is written in languages that are unshowily memory safe (like Java and Python).

We're really talking about resistance to memory safety in the last redoubts of unsafety: browsers and operating systems.

olarm · 5h ago
> We're really talking about resistance to memory safety in the last redoubts of unsafety: browsers and operating systems.

And control systems. C++ (along with PLCs, of course) dominates in my experience developing maritime software, and there doesn't appear to be much inclination towards change.

chubot · 5h ago
There’s also Google, Yandex, Baidu, and Bing, which are incredible amounts of C++ code.

And probably lots of robotics, defense, and other industries

Granted, those aren’t consumer problems, but I would push back on the “last redoubts”.

We should absolutely move toward memory safe languages, but I also think there are still things to be tried and learned

zahlman · 5h ago
To be fair, there's a pretty clear difference between achieving memory safety with a garbage collector and run-time type information, versus achieving it through static analysis.
fiddlerwoaroof · 2h ago
Static analysis is worse and limits the programs you can write in annoying ways?
tuveson · 5h ago
> browsers and operating systems

And the VMs for the two languages that you mentioned above (edit: though to be fair to your comment, I suppose those were initially written 20+ years ago).

npalli · 4h ago
> We're really talking about resistance to memory safety in the last redoubts of unsafety: browsers and operating systems.

... and other performance-critical areas like financial applications (HFT), high-performance computing (incl. AI/ML), embedded, IoT, gaming/engines, databases, compilers, etc. Browsers and OSes are highly visible, but there is a gigantic ton of new C++ code written every day in spite of the availability of memory safe languages.

tptacek · 2h ago
People keep coming up with all these examples of things still written in C/C++. Sure. So are most AAA games. But so far nothing that's been identified --- maybe excepting databases, but vulnerabilities there are still rare --- that is a meaningful component of insecurity, which is what "memory safety" addresses.
spacechild1 · 34m ago
> that is a meaningful component of insecurity, which is what "memory safety" addresses.

There are plenty of people, though, who argue that everything must be memory safe (and therefore rewritten in Rust :) I personally don't agree with that sentiment and it seems like you don't agree either.

Ar-Curunir · 6h ago
and cryptographic code.
wglb · 1h ago
My favorite crypto bug was not a memory safety issue: https://i.blackhat.com/us-18/Wed-August-8/us-18-Valsorda-Squ...

Was a fascinating detective story to illustrate it.

noelwelsh · 6h ago
Rust also didn't give up, whereas earlier languages like Cyclone did. This is a problem with the different incentives in research; once you've shown it works there is no funding for further development.
jekwoooooe · 4h ago
Go is fast and memory safe. It has some data race protections built in but doesn’t go as far as Rust. This has its benefits, like not having to deal with borrow checker insanity (or Rust syntax, for that matter).

Unlike Python or Java, it’s both compiled and fast.

haimez · 4h ago
Java is both compiled (first to bytecode, then to machine code by the JIT) and fast (once JIT compiled).
jandrewrogers · 43m ago
It depends on what you are using it for; “fast” is relative. Java can be fast for applications where performance and scalability are not primary features. If performance and scalability are core objectives, even performance-engineered Java isn’t really competitive with a systems language. You can bend Java to make it perform better than most people believe, especially today, but the gap is still pretty large in practice.

I wrote performance-engineered Java for years. Even getting it to within 2x worse than performance-engineered C++ took heroic efforts and ugly Java code.

Raidion · 3h ago
Java is "fast" but not fast. Most of the time if performance is a true concern, you are not writing code in Java.
lmm · 55m ago
Java is fast for long-running server processes. Even HFT shops competing for milliseconds use it. But yeah every user-facing interactive Java application manages to feel clunky.
symbolicAGI · 23m ago
Learned from an NYC exchange 10 years ago that Java can be written so as to not use garbage collection. Fast and no pause for GC.

1. Pool and reuse objects that would otherwise be garbage collected. Use `new` sparingly.

2. Avoid Java idioms that create garbage. E.g. for a `List<String> strings`, replace `for (String s : strings) {...}`, which allocates an Iterator, with `for (int i = 0, n = strings.size(); i < n; i++) { String s = strings.get(i); ... }`

frollogaston · 3h ago
I have yet to run a Java program that I haven't had to later kill due to RAM exhaustion. I don't know why. Yeah an Integer takes 160 bits and that's without the JVM overhead, but still. Somehow it feels like Java uses even more memory than Python. Logically you'd point the finger at whoever wrote the software rather than the language/runtime itself, but somehow it's always Java. It's like the Prius of languages.

Ok, just glanced at my corp workstation and some Java build analysis server is using 25GB RES, 50GB VIRT when I have no builds going. The hell is it doing.

lmm · 54m ago
> Ok, just glanced at my corp workstation and some Java build analysis server is using 25GB RES, 50GB VIRT when I have no builds going. The hell is it doing.

Allocating a heap of the size it was configured to use, probably.

AlotOfReading · 3h ago
More of a historical footnote than a serious example, but you've never had to kill the Java applications running on your SIM card (or eSIM).
frollogaston · 3h ago
I don't know about that, my flip phone used to crash quite often. And it displayed a lot of Java logos.
AlotOfReading · 2h ago
Different processor and JVM. My understanding is that early versions of the Java card runtime didn't even support garbage collection. It was a very different environment to program, even if the language was "Java".
Animats · 26m ago
"Omniglot" is a rather dramatic title for something that's basically a way to call C from Rust with additional checking on the C side for type compatibility.

That said, it might be useful. The demo case is contrived, though. Passing Rust async semantics into C code is inherently iffy. I'd like to see something like OpenJPEG (a JPEG 2000 encoder written in C) safely encapsulated in this way.

b0a04gl · 7h ago
c/c++ you're in unsafe mode by default, unless you build guardrails yourself. rust built different: unsafe is loud, compiler flags it, tooling keeps count, you can gate it in ci. bugs don’t slip in quiet.. burden of proof shifts
taping-memory · 4h ago
I'm reading the article and so far it's great.

I'm just wondering in the explanation of listing 2 you say:

> a discriminant value indicating the enum’s active variant (4 bytes)

As far as I can find, there's no guarantee for that, the only thing I can find is that it might be interpreted as an `isize` value but the compiler is permitted to use smaller values: https://doc.rust-lang.org/reference/items/enumerations.html#...

Is there any reason to say it should be 4 bytes?

It doesn't change any of the conclusions, I'm just curious

OptionOfT · 3h ago
Using repr(C) makes it 4 bytes.

But then again, modeling a C enum to a Rust enum is bad design. You want to use const in Rust and match against those.

But it is a bad example in general, because the author passes on a pointer of a string slice to FFI without first converting it to a CString, so it isn't null terminated.
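
The size question above can be checked directly with `std::mem::size_of`. This is a minimal sketch; the three enums are made-up stand-ins, not the article's `PrintResult`. It assumes a common target where a C `int` is 4 bytes, and the size of the default-representation enum is deliberately left unasserted beyond an upper bound, since the compiler is free to choose it.

```rust
use std::mem::size_of;
use std::os::raw::c_int;

// Default representation: the compiler may pick any discriminant size.
#[allow(dead_code)]
enum DefaultRepr {
    Done,
    Failed,
}

// C-compatible representation: sized like the platform's C enum.
#[allow(dead_code)]
#[repr(C)]
enum CRepr {
    Done,
    Failed,
}

// Explicit one-byte discriminant.
#[allow(dead_code)]
#[repr(u8)]
enum ByteRepr {
    Done,
    Failed,
}

fn main() {
    println!("default repr: {} byte(s)", size_of::<DefaultRepr>());
    println!("repr(C):      {} byte(s)", size_of::<CRepr>());
    println!("repr(u8):     {} byte(s)", size_of::<ByteRepr>());

    // `repr(u8)` pins the discriminant to exactly one byte.
    assert_eq!(size_of::<ByteRepr>(), 1);
    // Assumption: on common targets a C enum is the size of `int`.
    assert_eq!(size_of::<CRepr>(), size_of::<c_int>());
    // The default layout is unspecified; it is typically no larger.
    assert!(size_of::<DefaultRepr>() <= size_of::<CRepr>());
}
```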

taping-memory · 3h ago
> Using repr(C) makes it 4 bytes.

That makes sense, they just don't use repr(C) for the PrintResult so I didn't consider that.

> But then again, modeling a C enum to a Rust enum is bad design. You want to use const in Rust and match against those.

That makes sense but if there could be a way to safely generate code that converts to an enum safely as proposed in the article that would be good as the enum is more idiomatic.

> But it is a bad example in general, because the author passes on a pointer of a string slice to FFI without first converting it to a CString, so it isn't null terminated.

The signature for async_print in C is `async_res_t async_print(const uint8_t *, size_t)` and they are passing a pointer to a &[u8] created from a byte string literal, so I think it's correct.
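
A minimal sketch of the pointer-plus-length convention discussed above. `count_nonspace` is a hypothetical stand-in for a C function of the shape `size_t f(const uint8_t *buf, size_t len)`; it is written in Rust here only so the example runs without linking any C code, and it is not the article's `async_print`. Because the length travels with the pointer, no NUL terminator (and so no `CString`) is needed.

```rust
use std::slice;

/// Hypothetical stand-in for a C function taking a buffer and its length.
///
/// # Safety
/// `buf` must point to `len` readable bytes.
unsafe extern "C" fn count_nonspace(buf: *const u8, len: usize) -> usize {
    // SAFETY: the caller guarantees `buf` is valid for `len` bytes.
    let bytes = unsafe { slice::from_raw_parts(buf, len) };
    bytes.iter().filter(|b| !b.is_ascii_whitespace()).count()
}

/// Safe wrapper: a slice always carries a valid pointer and length pair.
fn call_with_slice(msg: &[u8]) -> usize {
    // SAFETY: `msg.as_ptr()` is valid for exactly `msg.len()` bytes.
    unsafe { count_nonspace(msg.as_ptr(), msg.len()) }
}

fn main() {
    // A byte string literal has no trailing NUL; the length bounds the read.
    let msg: &[u8] = b"hello ffi";
    assert!(!msg.contains(&0));
    assert_eq!(call_with_slice(msg), 8);
}
```

A `CString` would only be required if the C side expected a NUL-terminated string and took no length argument.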

xTachyon · 6h ago
(Copied from Reddit)

What they're saying is kind of true, but the example is very bad. bindgen already doesn't generate Rust enums for C enums exactly for this reason. It instead generates consts with each variant's value, and the enum type is just an alias to its basic type (i32 or something else).

This forces you to do a match on an integer, where you have to treat the _ case (with unreachable!() probably).

I can't tell if this is the whole paper, but it seems low effort at best.
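
The constants-plus-match pattern described above can be sketched as follows. All names (`print_result`, `PRINT_OK`, `PRINT_ERR`, `PrintResult`) are hypothetical and hand-written for illustration; this is not actual bindgen output. Instead of `unreachable!()` in the catch-all arm, the sketch returns `None`, so an unexpected value from C is reported and never reinterpreted as a Rust enum.

```rust
// Hypothetical constants for a C `enum print_result { PRINT_OK, PRINT_ERR };`
// The enum type is just an alias for an integer.
#[allow(non_camel_case_types)]
pub type print_result = u32;
pub const PRINT_OK: print_result = 0;
pub const PRINT_ERR: print_result = 1;

/// Idiomatic Rust-side enum, only ever built from validated values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PrintResult {
    Success,
    Failure,
}

impl PrintResult {
    /// Converts a raw value from C, rejecting anything unknown.
    pub fn from_raw(raw: print_result) -> Option<Self> {
        match raw {
            PRINT_OK => Some(PrintResult::Success),
            PRINT_ERR => Some(PrintResult::Failure),
            // C is free to hand back any integer; never assume it is valid.
            _ => None,
        }
    }
}

fn main() {
    assert_eq!(PrintResult::from_raw(PRINT_OK), Some(PrintResult::Success));
    assert_eq!(PrintResult::from_raw(PRINT_ERR), Some(PrintResult::Failure));
    assert_eq!(PrintResult::from_raw(7), None);
}
```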

Ar-Curunir · 6h ago
You can just read the paper instead of making negative comments: https://patpannuto.com/pubs/schuermann2025omniglot.pdf

They are in particular careful to never state that bindgen emits the wrong code. Maybe they could have said that bindgen in fact does handle this case correctly. But Omniglot seems to be doing a lot more than bindgen, and

IshKebab · 5h ago
Well... he does have a point. Don't demonstrate your great tool with an issue that the existing solution doesn't actually have.
ARob109 · 4h ago
Learning Rust ATM and using bindgen on a C header. Just looked and it generates Rust enums from C enums. I'm not sure what the default behavior of bindgen is, but it seems there is an option for constifying enums:

--constified-enum <REGEX> Mark any enum whose name matches REGEX as a series of constants

--constified-enum-module <REGEX> Mark any enum whose name matches REGEX as a module of constants

IMO, saying bindgen avoids the issue presented in the article is not accurate.

edit: formatting

gavinray · 5h ago
Where'd you find this paper link, out of curiosity?

The referenced footnote, [9], leads to: https://www.usenix.org/conference/osdi25/presentation/schuer...

timewizard · 7h ago
> if it compiles, then it’s correct … or at least, will not contain use-after-free or other memory safety errors

In a language with the `unsafe` construct and effectively no automated tooling to audit the uses of it, you have no guarantee of any significance. You've just slightly changed where the security boundary _might_ lie.

> There is a great amount of software already written in other languages.

Yea. And development of those languages is ongoing. C++ has improved the memory safety picture quite a bit over the past decade and shows no signs of slowing down. There is no "one size fits all" solution here.

Finally, if memory safety were truly "table stakes" then we would have been using the dozens of memory safe languages that already existed. It should be blindingly obvious that /performance/ is table stakes.

zaphar · 6h ago
Languages with unsafe don't just change where the security boundary lies. They shrink the size of the area that the boundary surrounds.

C++ has artificially limited how much it can improve the memory safety picture because of their quite valid dedication to backwards compatibility. This is a totally valid choice on their part but it does mean that C++ is largely out of the running for the kinds of table stakes memory safety stuff the article talks about.

There are dozens of memory safe languages that already exist: Java, Go, Python, C#, Rust, ... And a whole host of other ones I'm not going to bother listing here.

torstenvl · 6h ago
All of the languages you listed are proprietary languages. Most of them have a single implementation. They could disappear tomorrow. While that's unlikely, it's a possibility that some will go the way of ColdFusion, and more will fade away like Pascal.
zahlman · 5h ago
> Most of them have a single implementation.

None of them have a single implementation. It only took a few minutes to find all the following:

* https://en.wikipedia.org/wiki/Free_Java_implementations

* Go has gofrontend and GopherJS aside from the reference implementation

* Python has a whole slew of alternate implementations listed on the main Python web site: https://www.python.org/download/alternatives/

* C# has Mono, which actually implements the entire .NET framework

* Rust has gccrs (the Rust-GCC project) and mrustc

johnfernow · 6h ago
The Java language specification is open and there are multiple implementations. OpenJDK is the official open-source reference implementation, and many of the alternative implementations pull from upstream, but OpenJ9 is a different JVM implementation (though does currently use OpenJDK's class libraries to form a complete JDK.)

Before Microsoft opened-up C#, Mono was a completely independent alternative implementation.

Python has CPython (reference open source implementation), but also PyPy, MicroPython and several others.

torstenvl · 5h ago
I'm not sure what you mean when you say the Java spec is open, but Oracle certainly took the position—and the Supreme Court confirmed—that they own copyright in the APIs.

Has Oracle dedicated those to the public domain in the meantime? Or at least licensed them extremely permissively?

More importantly, is there a public body that owns the spec?

Kranar · 5h ago
>the Supreme Court confirmed—that they own copyright in the APIs.

To use your own terminology, this is clearly and objectively false. The US Supreme Court made no such finding.

What the court concluded was that even if Oracle had a copyright on the API, Google's use of it fell under fair use so that making a ruling on the question of whether the API was protected by copyright was moot.

torstenvl · 3h ago
Your point of order is partially accurate. It was the Federal Circuit that held APIs copyrightable. SCOTUS did not disturb that holding, but did not explicitly affirm it either. However, your contention that this makes copyrightability moot is a stretch.
jcranmer · 2h ago
The majority opinion in Google v Oracle did an involved fair use analysis for the reimplementation of the API that really makes it clear that it's hard for anybody to violate the copyright of an API by doing a clean-room implementation and not have it be covered by fair use.
Jtsummers · 5h ago
For C# there is the ECMA specification for it https://ecma-international.org/publications-and-standards/st...

But who cares if there's a public body who owns the specification? The Supreme Court ruled Google's use of the copyrighted APIs fell within fair use. That gives, within the US (other countries will have other legal circumstances) a basis for anyone to copy pretty much any language so long as they steer clear of the actual copyrighted source code (don't copy MS's C# source code, for instance) and trademark violations.

torstenvl · 3h ago
I don't understand your thought process. You seem to be arguing that whether something has a proprietor is irrelevant to the question of whether it is proprietary. I cannot fathom the kind of reasoning that would lead to such a conclusion.

If the language is owned—by control and by IP—by a single corporation, it is proprietary.

Jtsummers · 2h ago
> I don't understand your thought process.

You claim to be a lawyer, I doubt your reading comprehension is really this bad but just in case I'll spell it out for you. You asked:

> More importantly, is there a public body that owns the spec?

And I answered:

> For C# there is the ECMA specification for it https://ecma-international.org/publications-and-standards/st...

Anyone can implement a compiler or interpreter for C# if they want, and there is a link to the standard for it. Is this clear enough for you?

Also, from an earlier comment you made a false claim and a strange reference.

You claimed that "most of" Java, Rust, C#, Python, and Go have only a single implementation. This is false. There are multiple implementations of each.

Second, you make a bizarre reference to "fad[ing] away like Pascal." Why do you think Pascal faded? I'll give a hint: It had nothing to do with being proprietary. At best that reference is a non sequitur, at worst it demonstrates more confusion on your part.

umanwizard · 6h ago
What does “proprietary” mean to you?

AlotOfReading · 7h ago

> In a language with the `unsafe` construct and effectively no automated tooling to audit the uses of it.

You can forbid using unsafe code with the lints built into rustc: https://doc.rust-lang.org/stable/nightly-rustc/rustc_lint/bu...

Cargo allows you to apply rustc lints to the entire project, albeit not dependencies (currently). If you want dependencies you need something like cargo-geiger instead. If you find unsafe that way, you can report it to the rust safety dance people, who work with the community to eliminate unsafe in crates.

All of this is worlds ahead of the situation in C++.
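
A minimal sketch of the built-in lint mentioned above. The `parser` module and `parse_u32` are made-up names for illustration. Writing `#![forbid(unsafe_code)]` at the top of a crate root applies the lint crate-wide; the outer-attribute form used here scopes it to one module so the snippet stays self-contained.

```rust
// Any `unsafe` block or `unsafe impl` inside this module is a hard error.
#[forbid(unsafe_code)]
mod parser {
    /// Parses an ASCII decimal number without any `unsafe`.
    pub fn parse_u32(input: &[u8]) -> Option<u32> {
        if input.is_empty() {
            return None;
        }
        let mut acc: u32 = 0;
        for &b in input {
            if !b.is_ascii_digit() {
                return None;
            }
            // Checked arithmetic: overflow yields `None`, not a wrapped value.
            acc = acc.checked_mul(10)?.checked_add(u32::from(b - b'0'))?;
        }
        Some(acc)
    }

    // Uncommenting this function makes the build fail under the lint:
    // pub fn peek(p: *const u8) -> u8 { unsafe { *p } }
}

fn main() {
    assert_eq!(parser::parse_u32(b"1234"), Some(1234));
    assert_eq!(parser::parse_u32(b"12x"), None);
    assert_eq!(parser::parse_u32(b"4294967296"), None); // one past u32::MAX
}
```

For a whole project, the same lint can be set in `Cargo.toml` under `[lints.rust]` with `unsafe_code = "forbid"`; as the comment above notes, that does not cover dependencies.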

vlovich123 · 6h ago
OP is wrong that there's no tooling. All the C++ tooling that I'm aware of (e.g. ASAN/UBSAN/MSAN/TSAN) is still available on Rust. Additionally, it has MIRI which can check certain code constructs for defined behavior at the MIR level which, unlike sanitizers, validates that all code is sound according to language rules regardless of what would be run by generated assembly; this validation includes unsafe code which still has to follow the language rules. C/C++ doesn't have anything like that for undefined behavior by the way.

However, let me apply the same nitpicking attitude to your argument about the ease with which unsafe can be kept out of a complex codebase. unsafe is pretty baked into the language because there are convenient constructs that the Rust compiler can't ever prove safe (e.g. a doubly-linked list), constructs it can't prove safe today (e.g. various accessors like split), and basic operations that require it (e.g. allocating memory). Pretending that you can really forbid unsafe code wholesale in your dependency chain is not practical, and this is ignoring soundness bugs within the compiler itself. That doesn't detract from the inherent advantage of safe by default.
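
To illustrate the "accessors like split" case: handing out two mutable halves of one slice is sound, but the borrow checker cannot see that the halves are disjoint, so the implementation needs `unsafe` behind a safe signature. The sketch below is a simplified re-implementation for illustration under the hypothetical name `split_at_mut_sketch`; it is not the standard library's actual code, and real code should call the existing `split_at_mut` method on slices.

```rust
use std::slice;

/// Splits one mutable slice into two non-overlapping mutable halves.
fn split_at_mut_sketch(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();
    // Enforce the precondition the unsafe block relies on.
    assert!(mid <= len, "mid out of bounds");

    // SAFETY: [0, mid) and [mid, len) are both in bounds and disjoint,
    // so the two returned mutable slices never alias each other.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut data = [1, 2, 3, 4, 5];
    let (left, right) = split_at_mut_sketch(&mut data, 2);
    // Both halves can be mutated while the other is still borrowed.
    left[0] = 10;
    right[0] = 30;
    assert_eq!(data, [10, 2, 30, 4, 5]);
}
```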

AlotOfReading · 6h ago
I do safety critical code. I would consider banning allocation (e.g. just using Core) or avoiding certain data structures a completely feasible strategy to avoid unsafe if I wanted to exclude it from my safety model. It's what I'm already doing in C++. The difference is that in C++, I can never prove the absence of undefined behavior from any part of the codebase, even if I review every single line. Even if I could, that proof might be invalidated by a single change anywhere.

It's not easy in Rust, but it's possible.

xvedejas · 7h ago
Safe rust is a safe language. Yes, it is built upon unsafe rust. But I still consider Python to be a memory safe language despite it being built on C. I can still trust that my Python code doesn't contain such memory errors. Safe Rust is the same in terms of guarantees. That's all that anyone is claiming.
burnt-resistor · 2h ago
The main problem now is that there isn't a platform that has the tooling or infrastructure to prove, including through formal methods, that they are correct and free from bugs in the spirit of the seL4 project.
UltraSane · 4h ago
It is a lot like how you have to trust the core proving kernel in a theorem prover but if you do then you can trust every proof created using it.
burnt-resistor · 2h ago
https://github.com/CertiCoq/certicoq can prove (most of) itself.
imglorp · 7h ago
That's an extreme take now and maybe uncharitable. The safe parts of rust are simply no comparison to the whole c/c++ world: the tooling is eliminating vast swaths of "easy" errors. Unsafe parts might be comparable if they're calling the same libraries.

Industry is seeing quantifiable improvements, eg: https://thehackernews.com/2024/09/googles-shift-to-rust-prog...

noisem4ker · 7h ago
> It should be blindingly obvious that /performance/ is table stakes.

I think a big part of it is just inertia.

dwattttt · 6h ago
It's been a very slow learning process trying to undo the "performance at every cost" mantra.
djha-skin · 3h ago
Nope: ease of use is table stakes. Rust is not easy to use. It will never become mainstream because of this. For all its faults, C is comparatively simple.
another_twist · 1h ago
I actually think Rust is very, very easy to use, to the point where I'd consider using it for scripting. They need to write out more detailed guides on how to do X with Rust, though. E.g. there's no runtime polymorphism in Rust since every trait + struct binding is unique. However, similar behaviour can be accomplished with generics, hence so many angle brackets in normal Rust code.
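
A minimal sketch of the generics approach mentioned above, with made-up `Shape`, `Square` and `Rect` types. The generic function is resolved per concrete type at compile time. For comparison, the second function uses a `dyn Shape` trait object, which the language also provides and which dispatches through a vtable at run time.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

/// Static dispatch: one copy is compiled per concrete `S`,
/// so every element of the slice has the same type.
fn total_area_generic<S: Shape>(shapes: &[S]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

/// Dynamic dispatch: trait objects allow mixed concrete types.
fn total_area_dyn(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let squares = [Square(2.0), Square(3.0)];
    assert_eq!(total_area_generic(&squares), 13.0);

    let mixed: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Rect(2.0, 3.0))];
    assert_eq!(total_area_dyn(&mixed), 10.0);
}
```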
TylerE · 1h ago
I find attitudes like this simply bizarre. Nothing about Rust is 'easy', no matter how much its fans insist it is.

Just the syntax is miserable punctuation soup to start with.

userbinator · 2h ago
The next step towards authoritarian dystopia.