Bloat is still software's biggest vulnerability (2024)

260 points by kristianp | 212 comments | 5/6/2025, 11:33:54 PM | spectrum.ieee.org

Comments (212)

GuB-42 · 33d ago
I am beginning to think that the terrible situation with dependency management in traditional C and C++ is a good thing.

Now, with systems like npm, maven or cargo, all you need to do to get a package is to add a line in a configuration file, and it fetches all the dependencies you need automatically from a central repository. Very convenient, however, you can quickly find yourself with 100+ packages from who knows where and 100s of MB of code.
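For example, in Cargo the entire act of taking on a dependency (and its whole transitive tree) is one manifest line; the crate and feature set here are just an illustration:

```toml
# Cargo.toml
[dependencies]
# one line; cargo resolves and downloads the full transitive tree,
# which for a big crate like this can mean dozens of packages
tokio = { version = "1", features = ["full"] }
```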

In C, traditionally, every library you include requires some consideration. There is no auto-download, and the library the user has may be a different version from the one you worked with, and you have to accommodate it, and so does the library publisher. Or you may have to ship it with your own code. Anyway, it is so messy that the simplest solution is often not to use a library at all and write the thing yourself, or even better, realize that you don't need the feature you would have used that library for.

It's a bad reason, and reinventing the wheel comes with its own set of problems, but at least the resulting code is of a manageable size.

otikik · 33d ago
I thought about this several years ago and I think I hit the right balance with these 2 rules of thumb:

* The closer something is to your core business, the less you externalize.

* You always externalize security (unless security is your exclusive core business)

Say you are building a tax calculation web app. You use dependencies for things like the css generation or database access. You do not rely on an external library for tax calculation. You maintain your own code. You might use an external library for handling currencies properly, because it's a tricky math problem. But you may want to use your own fork instead, as it is close to your core business.

On the security side, unless that's your specialty, there are people out there smarter than you and/or who have dedicated more time and resources than you to figure that stuff out. If you are programming a tax calculation web app you shouldn't be implementing your own authentication algorithm, even if keeping your tax information secure is one of your core needs. The exception is when your core business is literally implementing authentication and nothing else.

j_w · 33d ago
I feel like "shouldn't be implementing your own authentication" is overblown. Don't write the crypto algorithms. But how hard is it to write your own auth? If you are pulling in a third party dependency for that you still would need to audit it, and if you can audit authentication software why can't you implement it?

Just follow OWASP recommendations. A while back this was posted to HN and it also provides great recommendations: https://thecopenhagenbook.com/ .
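To the point about OWASP: the password-storage piece of hand-rolled auth really is small if you lean on a memory-hard KDF from the standard library. A minimal Python sketch (the scrypt parameters here are illustrative, not a vetted production configuration):

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) using scrypt, a memory-hard KDF."""
    salt = os.urandom(16)  # unique random salt per password
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
    # constant-time comparison avoids timing side channels
    return hmac.compare_digest(candidate, digest)
```

The hard parts of auth (session handling, rate limiting, recovery flows) are still on you, which is exactly where the OWASP cheat sheets earn their keep.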

bluefirebrand · 32d ago
The main challenge isn't necessarily implementing the algorithms, it is keeping up with the security space

Do you expect your team to be keeping up with new exploits in hardware and networking that might compromise your auth? That takes a lot of expertise and time, which they could instead be spending building features that add business value

It sounds cynical, and it kind of is, but offloading this onto external experts makes way more business sense and probably is what allows you to deliver at all. Security is just too big a space for every software company to have experts on staff to handle

GuB-42 · 32d ago
The thing is, your "roll your own" auth is likely way smaller and less targeted than the library everyone uses. So the new exploits may simply not apply to your case.

Many famous vulnerabilities happen in parts of software people don't actually use. For example, the "Heartbleed" vulnerability in OpenSSL targeted the "heartbeat" feature few people actually used. In the "Log4Shell" vulnerability the exploit targeted LDAP support in log4j, which I have never seen used and didn't even know existed.

In addition, the "experts" maybe aren't. You may think that whoever is writing that popular library has a team of experts in security, it is used by big, serious companies after all. But in reality it may just be one overworked guy, and people only notice when the system has been publicly compromised. And that's if the developers themselves don't have malicious intent, or have accepted someone with malicious intent in the team (for the latter, see the xz story).

j_w · 31d ago
I think this is a good retort to what was argued.

What's missed in me saying roll your own auth, even though I did say it, is that you aren't implementing the network stack or crypto. As long as you keep your dependencies up to date you shouldn't have any increased risk over using a third party library.

If there is a novel security flaw discovered, consider the first SQL-injection or XSS attack, then you definitely should know about it. The idea that not rolling your own security related functionality absolves you from the responsibility to know or understand major security considerations is incorrect. It is the responsibility of every programmer to be knowledgeable of the security risks in their space and the patterns that protect against those risks.

pphysch · 33d ago
There have been major F-ups in recent history with Okta, CrowdStrike, and so on. Keycloak had some major long-standing vulnerabilities. I've had PRs accepted in popular open-source IAM libraries a bit too easily.

Yeah, we shouldn't roll our own cryptography, but security isn't as clear-cut as this comment implies. It also frequently bleeds into your business logic.

Don't confuse externalizing security with externalizing liability.

ablob · 33d ago
As far as I know tacking on security after the fact usually leads to issues. It should be a primary concern from the beginning. Even if you don't do it 100% right, you'd be surprised how many issues you can avoid by thinking about this during (and not after) development.

Dropping your rights to open files as soon as possible, for example, or thinking about what information would be available to an attacker should they get RCE on the process. Shoehorning in solutions to these things after the fact tends to be so difficult that it's a rare sight.

I have been recommended to think of security as a process rather than an achievable state and have become quite fond of that perspective.

Extasia785 · 32d ago
You are describing domain-driven design. Outsource generic subdomains, focus your expertise on the core subdomains.

https://blog.jonathanoliver.com/ddd-strategic-design-core-su...

cogman10 · 33d ago
I think this helps, but I also think the default for any dev (particularly library authors) should be to minimize dependencies as much as possible. Dependencies have both a maintenance and a security cost. Bad libraries have deep and sprawling trees.

I've seen devs pull in entire frameworks just to get access to a single simple-to-write function.

casey2 · 31d ago
Even if you make the obviously wrong assumption that every library is more secure than the one you would write (one that would do less, the vast majority of the time), we still end up in an eggs-in-one-basket situation.

You haven't thought through any cyber security games or you are funded to post this bad argument over and over again by state agencies with large 0-day stockpiles.

the__alchemist · 33d ago
I would like to dig into point 2 a bit. Do you think this is a matter of degree, or of kind? Does security, in this, imply a network connection, or some other way that exposes your application to vulnerabilities, or is it something else? Are there any other categories that you would treat in a similar way as security, but to a lesser degree, or that almost meet that threshold for a special category, but don't?
SkiFire13 · 33d ago
How many vulnerabilities were due to badly reinventing the wheel in C/C++ though?

Also, people often complain about "bloat", but don't realize that C/C++ are often the most bloated ones precisely because importing libraries is a pain, so they try to include everything in a single library, even though you only need to use less than 10% of it. Look for example at Qt, it is supposed to be a UI framework but it ends up implementing vectors, strings, json parser and who knows how much more stuff. But it's just 1 dependency so it's fine, right?

phkahler · 33d ago
>> Look for example at Qt, it is supposed to be a UI framework but it ends up implementing vectors, strings, json parser and who knows how much more stuff. But it's just 1 dependency so it's fine, right?

Qt is an application development framework, not a GUI toolkit. This is one reason I prefer GTK (there are things I dislike about it too).

r0ze-at-hn · 32d ago
I remember that discussion back in the early 2000s, but now, with the tonnage that systems like npm can pull in, I laugh that we ever thought it wouldn't get worse.
GuB-42 · 31d ago
There is still an advantage to using Qt over dozens of libraries that offer the same functionality.

Qt is backed by a single company, so all you have to watch out for is that company. Also, Qt is generally high quality. I have worked with it, read the source code, etc., and I generally liked what I saw, so I can reasonably assume that quality is consistent overall. When you have many libraries from many independent developers, that doesn't work. The JSON parser may be good, but it doesn't tell me anything about the library that deals with internationalization, for instance, and if I wanted to keep track of everything, that's several times the work compared to a single vendor.

I agree that Qt is bloated though, but multiplatform UI frameworks are hard to keep light. There is a lot going on in a desktop UI that people only notice when it isn't there. I tend to treat them like I treat the standard libraries, the OS, and for web apps, the browser: big components, but ones you can't reasonably do without.

reaperducer · 33d ago
How many vulnerabilities were due to badly reinventing the wheel in C/C++ though?

I don't know. Suppose you tell us.

ChrisSD · 33d ago
In my experience every developer, company, team, sub-team, etc has their own "library" of random functions, utilities, classes, etc that just end up being included into new projects sooner or later (and everyone and their dog has their own bespoke string handling libraries). Copy/pasting large chunks of code from elsewhere is also rampant.

I'm not so sure C/C++ solves the actual problem. It only sweeps it under the carpet, where it's much less visible.

achierius · 33d ago
It definitely does solve one problem. Like it or not, you can't be hit by supply chain attacks if you don't have a supply chain.
dgfitz · 33d ago
I mirror all deps locally and only build from the mirror. It isn’t an issue. C/C++ is my dayjob
procaryote · 33d ago
at some point you could mirror a supply chain attack... xz was a pretty long game and was only found by accident, for example
dgfitz · 33d ago
I’m sure I will.
josephg · 33d ago
This runs the risk of shipping C/C++ libraries with known vulnerabilities. How do you keep track of that? At least with npm / cargo / etc, updating dependencies is a single command away.
dgfitz · 33d ago
Pull, update, build?
josephg · 32d ago
How do you even know a dependency has an open vulnerability?
dgfitz · 31d ago
Conversely, how do you know when a dependency doesn’t have a vulnerability?
Frieren · 33d ago
> every developer, company, team, sub-team, etc has their own "library" of random functions, utilities, classes, etc

You are right. But my conclusion is different.

If the team is stable and people have been there for a while, then the developers know that code as well as the rest. So, when something fails they know how to fix it.

Bringing in generic libraries may create long call stacks of very generic code (usually templates) that is very difficult to debug, while adding a lot of functionality that is never used.

Bringing a new library into the code base needs to be a thought-out decision.

ryandrake · 33d ago
> In my experience every developer, company, team, sub-team, etc has their own "library" of random functions, utilities, classes, etc that just end up being included into new projects sooner or later

Same here. And a lot of those homegrown functions, utilities and classes are actually already available, and better implemented, in the C++ Standard Library. Every C++ place I've worked had its own homegrown String class, and it was always, ALWAYS worse in all ways than std::string. Maddening. And you could never make a good business case to switch over to sanity. The homegrown functions had tendrils everywhere and many homegrown classes relied on each other, so your refactor would end up touching every file in the source tree. Nobody is going to approve that risky project. Once you start down the path of rolling your own standard library stuff, the cancer spreads through your whole codebase and becomes permanent.

rileymat2 · 33d ago
Although I like std::string, for some things it becomes a little tricky with cross-platform work that involves both Linux and Windows. It can also be tricky with Unicode and lengths.
grg0 · 33d ago
This is something that I think about constantly and I have come to the same conclusion. While the idea of being able to trivially share code worldwide is appealing, so far it seems to encourage shittier software more than anything else, and the benefit of sharing trivially seems to be defeated by the downsides that bloat and bad software bring with it. Adding friction to code re-use (by means of having to manually download shit from a website and compile it yourself like it's 1995) seems to be a good thing for now, until a better package management system is figured out. The friction forces you to think seriously about whether you actually need that shit, or whether you can write the subset of the functionality you need yourself. To be clear, I also think C++ projects suffer a lot from re-inventing the wheel, particularly in the gamedev world, but that seems less bad than, e.g., initializing some nodejs framework project and starting with 100+ dependencies when you haven't even started to write shit.
pixl97 · 33d ago
When doing SBOM/SCA we see apps with 1000+ deps. It's insane. All too often we see large packages pulled in because a single function/behavior is needed, which ends up massively increasing the risk profile.
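For anyone curious about their own numbers: with npm's lockfileVersion 2+ format, every installed package appears under a `node_modules/...` key, so a rough dependency count is a few lines of scripting (a sketch, assuming that lockfile layout):

```python
import json

def count_locked_deps(lock: dict) -> int:
    """Count installed packages recorded in a parsed npm package-lock.json (v2/v3)."""
    packages = lock.get("packages", {})
    # the "" key is the root project itself; everything under node_modules/ is a dep
    return sum(1 for key in packages if key.startswith("node_modules/"))

def count_from_file(path: str) -> int:
    with open(path) as f:
        return count_locked_deps(json.load(f))
```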
1over137 · 33d ago
Holy cow. What domain is this? Web-based probably?
whstl · 33d ago
Could be a Hello World React app using the legacy creator-tool :/

Check this out: https://news.ycombinator.com/item?id=39019001

Of course, this is the whole environment except for Node.js itself. And Vite has improved it.

But there are definitely some tools that are worse than others.

pixl97 · 33d ago
Npm/node_modules is typically one of the worst offenders, but programmers can do this with any import/library based system.
thewebguyd · 33d ago
> Npm/node_modules is typically one of the worst offenders, but programmers can do this with any import/library based system.

You can, but I think this thread speaks volumes about the problem with the JavaScript/NPM ecosystem as a whole vs. pretty much any other.

We need something else for the web. The only reason we have 200+ NPM packages for a blank project is because JavaScript is atrocious and has almost nothing built-in. We got crap like LeftPad, isodd, is-array, etc. because of the language. Most of what NPM will pull in on any new web front end project is likely already part of the standard library in C#/dotnet, or Java, Go, etc.

But you could go further back and say it's not javascript's fault, it's the fault of trying to hammer the web into doing things it was never designed to do in the first place. But, we insisted on making it an application delivery platform, and now we're suffering the consequences of that. I'm hopeful for WASM, but ideally I'd love to see a resurgence of native apps.
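For a sense of scale: the infamous micro-packages are one-liners over a real standard library. Python here purely for illustration; the same holds in C#, Java or Go:

```python
# The whole left-pad / is-odd genre, in a language with a standard library:
def left_pad(s: str, width: int, fill: str = " ") -> str:
    return s.rjust(width, fill)  # str.rjust is built in

def is_odd(n: int) -> bool:
    return n % 2 != 0
```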

1over137 · 32d ago
The problem with native apps is they are (mostly) locked behind "stores" run by Big Tech.
hombre_fatal · 32d ago
And they're opaque. You have to mitm proxy to even see if they're making requests to who knows where. They run with too many privileges. You can't block ads. You can't link to them.

Meanwhile, the web runs in a web browser. You have a network bar. You can inspect element. You can inject Javascript. You can run your own code inside people's apps.

The former scenario isn't better than the latter scenario just because some people build their website with too many NPM dependencies.

1over137 · 32d ago
I take your point, but a lot of that needn't be the case. Native apps can have lesser privileges with sandboxing, you can inject code if your OS doesn't forbid it, etc. A lot of this is just how many native apps are, not how they must be.
hombre_fatal · 31d ago
True. But one of the biggest wins of the web is that it's how things are, and it was only a historical fluke of luck that things panned out this way.

As it's often said, there's no way the concept of a web browser would be feasible in today's walled-garden app store norms. You mean a god app that can run remote, arbitrary code? And it lets you sideload arbitrary extensions that can do things like block ads and mutate apps?

The only reason it's allowed is because the web was ubiquitous by the time computing became so highly controlled.

So we shouldn't be quick to dismiss it, despite its warts.

rglullis · 33d ago
Cathedrals vs Bazaars.

Cathedrals are conservative. Reactionary, even. You can measure the rate of change by generations.

Bazaars are accessible and universal. The whole system is chaotic. Changes happen every day. No single agent is in control.

We need both to make meaningful progress, and it's the job of engineers to take any given problem and see where to look for the solution.

staunton · 33d ago
> While the idea of being able to trivially share code worldwide is appealing, so far it seems to encourage shittier software more than anything else, and the benefit of sharing trivially seems to be defeated by the downsides that bloat and bad software bring with it.

A lot of projects would simply not exist without it. Linux comes to mind. I guess one might take the position that "Windows is fine", but would there ever have been any competition for Windows?

Another example, everyone would be rolling their own crypto without openssl, and that would mean software that's yet a lot more insecure than what we have. Writing software with any cryptography functionality in mind would be the privilege of giant companies only (and still suck a lot more than what we have).

There are a lot more examples. The internet and software in general would be set back ~20 years. Even with all the nostalgia I can muster, that seems like a much worse situation than today.

grg0 · 32d ago
All those projects existed long before package managers in programming languages were a thing (although you could consider the distro's package manager to fulfill that purpose, I guess), so I don't think your point really takes away from mine. And for sure, there are critical dependencies like openssl that better be a shared endeavour. But whether you pull those dependencies in manually or through a package manager is somewhat tangential.
rgavuliak · 33d ago
I agree fully, most users care about making their lives easier, not about development purity. If you can't do both, the puritanistic approach loses.
crabbone · 33d ago
This is all heuristic (read "guessing") and not a real solution to the problem.

The ground truth is that software bloat isn't a bad enough problem for software developers to try and fight it. We already know how to prevent this, if we really want to. And if the problem were really hurting so much, we'd have automated ways of slimming down the executables / libraries.

In my role creating CI for Python libraries, I did more hands-on dependency management. My approach was to first install libraries with pip, see what was installed, research why particular dependencies had been pulled in, then, if necessary, modify the packages so that unnecessary dependencies were removed, and "vendor" the third-party code (i.e. store it in my repository, at the version I need). This, obviously, works better for programs, where you typically end up distributing the program with its dependencies anyway. Less so for libraries, but in the context of CI this saved some long minutes of reinstalling dependencies afresh for every CI run.

In the end, it was a much better experience than what you usually get with CI targeting Python. But, in the end, nobody really cared. If CI took less than a minute to complete instead of twenty minutes, very little was actually gained. The project didn't have enough CI traffic for this to have any actual effect. So, it was a nice proof of concept, but ended up being not all that useful.

ryandrake · 33d ago
The reason bloat doesn't get fixed is that it's a problem that doesn't really harm software developers. It is a negative externality whose pain is spread uniformly across users. Every little dependency developers add to make their work more convenient might increase the download size over the user's network by 100MB, or use another 0.5% of the user's CPU, or another 50MB of the user's RAM. The user gets hit, ever so slightly, but the developer sees only upside.
HPsquared · 33d ago
The phrase "cheap and nasty" comes to mind. Over time, some markets tend towards the cheap and nasty.
TeMPOraL · 33d ago
Some? Almost all. That's the default end state if there's actual competition on the market.
socalgal2 · 33d ago
100, ha! The official Rust docs, built in Rust, use ~750 dependencies - cue the apologists.
matheusmoreira · 33d ago
> There is no auto-download

There is. Linux distributions have package managers whose entire purpose is to distribute and manage applications and their dependencies.

The key difference between Linux distribution package managers and programming language package managers is the presence of maintainers. Any random person can push packages to the likes of npm or PyPI. To push packages to Debian or Arch Linux, you must be known and trusted.

Programming language package managers are made for developers who love the convenience of pushing their projects to the world whenever they want. Linux distribution package managers are made for users who prefer to trust the maintainers not to let malware into the repositories.

Some measured amount of elitism can be a force for good.

ozim · 33d ago
Writing everything from scratch by hand is an insane take. It is not just about reinventing the wheel; there are whole frameworks one should use, because writing such a thing on your own would take a lifetime.

Yes, you should not just pull in as a dependency something a kid wrote in his parents' basement for fun, or to get "OSS maintainer" on his CV.

But there are tons of legitimate libraries and frameworks from people who are better than you at that specific domain.

barrkel · 33d ago
That's not how it works.

Here's a scenario. You pull in some library - maybe it resizes images or something. It in turn pulls in image decoders and encoders that you may or may not need. They in turn pull in metadata readers, and those pull in XML libraries to parse metadata, and before you know it a fairly simple resize is costing you 10s of MB.

Worse, you pull in different libraries and they all pull in different versions of their own dependencies, with lots of duplication of similar but slightly different code. Node_modules usually ends up like this.

The point is not writing the resize code yourself. It's the cultural effect of friction. If pulling in the resize library means you need to chase down the dependencies yourself, first, you're more aware of the cost, and second, the library author will probably give you knobs to eliminate dependencies. Perhaps you only pull in a JPEG decoder because that's all you need, and you exclude the metadata functionality.

It's an example, but can you see how adding friction to pulling in every extra transitive dependency would push library authors to give engineers options to prune the dependency tree? The easier a library is to use, the more popular it will be, and a library that has you chasing dependencies won't be easy to use.
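One concrete shape such a knob can take is lazy, per-format loading, so users only pay for the decoders they ask for. A toy Python sketch; the plugin module names are made up:

```python
# resize.py -- the core stays tiny; format support is opt-in at import time.
from importlib import import_module

# hypothetical decoder plugins, keyed by format name (names are invented)
_DECODERS = {
    "jpeg": "mylib_jpeg",
    "png": "mylib_png",
    "tiff": "mylib_tiff",  # TIFF users pay for TIFF's metadata deps; nobody else does
}

def load_decoder(fmt: str):
    """Import a decoder module only when its format is actually requested."""
    try:
        return import_module(_DECODERS[fmt])
    except ImportError as e:
        raise RuntimeError(f"install the optional {fmt} extra to decode {fmt}") from e
```

In packaging terms this is what optional "extras" express: the core declares no decoder dependencies, and each format is a separately installable add-on.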

lmm · 33d ago
> You pull in some library - maybe it resizes images or something. It in turn pulls in image decoders and encoders that you may or may not need. They in turn pull in metadata readers, and those pull in XML libraries to parse metadata, and before you know it a fairly simple resize is costing you 10s of MB.

This is more likely to happen in C++, where any library that isn't header-only is forced to be an all encompassing framework, precisely because of all that packaging friction. In an ecosystem with decent package management your image resizing library will have a core library and then extensions for each image format, and you can pull in only the ones you actually need, because it didn't cost them anything to split up their library into 30 tiny pieces.

barrkel · 32d ago
Actually I think a big part of the problem in C++ is the low level of abstraction of the standard library. It isn't friction that might cause an image resizing library to be all-encompassing; it's the lack of an abstract Image class in the standard library which would enable a resizing library to live side by side with image encoders and decoders, instead of needing to bundle them together.

The C++ standard library isn't rich enough. It doesn't have enough concepts for a good ecosystem of smaller components.

nolist_policy · 33d ago
Do you have an example?
MonkeyClub · 33d ago
> The easier a library is to use, the more popular it will be

You're thinking correctly on principle, but I think this is also the cause of the issue: it's too easy to pull in a Node dependency even thoughtlessly, so it's become popular.

It would require adding friction to move back from that and render it less easy, which would probably give rise to a new, easy and frictionless solution that ends up in the same place.

procaryote · 33d ago
There's a difference between "I need to connect to the database and I need to parse json, so I need two commonly used libs for those two things" and whatever npm is doing, and to some extent cargo or popular java frameworks are doing.

Building everything from scratch is insane, but so's uncritically growing a dependency jungle

actionfromafar · 33d ago
I feel you are arguing a bit of a strawman. The take is much more nuanced than write everything from scratch.
ozim · 33d ago
... simplest solution is often not to use a library at all and write the thing yourself, or even better, realize that you don't need the feature you would have used that library for ... the resulting code is of a manageable size..

I don't see the nuance there; that is my take of the comment. Those are pretty much its strongest statements, and the points in favor of using libraries are minimal.

That is why I added mine, strongly pointing out that real-world systems are not going to be of "manageable size" unless they are really small or a single person is working on them.

actionfromafar · 33d ago
For me "realize that you don't need the feature" is strong and also hits home. I sometimes prototype in C because it makes me think really hard about "what does this thing really have to do? What can I omit for now?"

While in for instance C# I tend to think "this would be simple to implement with whatever-fancy-thing-is-just-a-package-away".

Neither way can be judged as good or bad on its own.

A real world system is almost always part of a larger system or system of systems. Making one thing simple can make another complex. The world is messy.

BrouteMinou · 33d ago
When you "Reinvent the wheel", you implement only what you need in an optimized way.

This gives a couple of advantages: you own your code; there is no bloat; the result is usually simpler, because it doesn't have all the bells and whistles; there is less abstraction, so it's faster (there is no free lunch); and you minimize the attack surface for supply chain attacks...

For fun, the next time you are tempted to install a BlaZiNg FaSt MaDe in RuSt software: get the source, install cargo-audit and run it on that project.

See how many vulnerabilities there are. So far, in my experience, all the software I checked come with their list of vulnerabilities from transitive dependencies.

I don't know about npm, I only know by reputation and it's enough for me to avoid.

nebula8804 · 33d ago
That wheel is only as good as your skill in making it. For many people (the majority i'd guess) someone else making that wheel will have a better end result.
doublerabbit · 33d ago
The skill is produced by carving the wheel. You've got to start somewhere. Whether a mess or not the returned product is a product of your own. By relying on dependencies you're forever reaching for a goal you'll never achieve.
nradov · 33d ago
There are no absolute good or bad reasons here, it depends on the problem domain and usage environment. If you're writing code where safety or security matters then of course you need to carefully manage the software supply chain. On the other hand, if you're writing an internal utility for limited use with no exposure then who cares, pull in all the dependencies you need and git 'er done.
account-5 · 33d ago
I'm not a professional dev, but I thought this was what tree-shaking is about? Certainly this happens in Flutter, whatever you feel about Flutter/Dart.

Or is this a sticking plaster? Genuinely don't know as I only develop personal projects.

victorNicollet · 33d ago
Tree-shaking is able to remove code that will never be called. And it's not necessarily good at it: we can detect some situations where a function is never called, and remove that function, but it's mostly the obvious situations such as "this function is never referenced".

It cannot detect a case such as: if the string argument to this function contains a substring shaped like XYZ, then replace that substring with a value from the environment variables (the Log4j vulnerability), or from the file system (the XML Entity Extension vulnerability). From the point of view of tree-shaking, this is legitimate code that could be called. This is the kind of vulnerable bloat that comes with importing large libraries (large in the sense of "has many complex features", rather than of megabytes).
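A toy sketch of that distinction, in Python for brevity: a tree-shaker can drop the first function because nothing references it, but the second is reachable from real code, so its dangerous branch survives even though it only fires on specially shaped input (loosely modeled on the Log4Shell-style lookup, not the actual log4j syntax):

```python
import os

def never_called():
    # no reference anywhere -> dead-code analysis can prove this unused and drop it
    return "unreachable"

def log(message: str) -> str:
    # referenced by real code, so it always survives tree-shaking --
    # including this lookup feature that only triggers on attacker-shaped input
    if message.startswith("${env:") and message.endswith("}"):
        return os.environ.get(message[6:-1], "")
    return message
```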

account-5 · 33d ago
Thanks for the explanations, much appreciated.

I suppose the options are then:

1. Write everything yourself, time consuming and hard, less likely to lead to these types of vulnerabilities.

2. Import others' code: easy and takes no time, but can lead to vulnerabilities.

3. Use others' code, but only what you actually need. Maybe less time-consuming than 1 but more than 2; adds a different sort of complexity; done correctly, less likely to lead to these vulnerabilities.

Not sure if there's any other options here?

victorNicollet · 33d ago
I would say 4. grab individual code files (as opposed to entire libraries) and manually edit them, removing unnecessary features and adding new ones where needed.
jajko · 33d ago
Yeah everybody should reimplement their own security for example, that's a really smart fool-proof approach especially down the line, no real cases for any contrarian opinions.

I do get what you mean, but it works only on some very specific types of projects, when you & potentially comparably (very) good & skilled peers are maintaining and evolving it long term. This was never the case in my 20 years of dev career.

This sort of shared-well-tested-libraries -> gradual-dependency-hell pipeline exists in some form across all similar languages, since it's a pretty basic use case of software development as an engineering discipline. I haven't seen a good silver bullet so far, and, e.g., the past 14 years of my work wouldn't have been possible with the approach you describe.

hinkley · 32d ago
Within reason, we need to be able to promote third party libraries into the standard library.

A small standard library pairs well with an easy mechanism to download code, but at some point it's probably a crutch. There are maybe 5 functions in lodash at this point that show up routinely in production code and aren't covered by existing additions to ECMAScript - sortBy, recursive get, and recursive merge being among the most useful. We could just have these and be done.

klysm · 33d ago
Unfortunately that comes with the baggage of terrible memory safety. I do agree with the sentiment though, that deps should be taken with more consideration.
privong · 33d ago
> Unfortunately that comes with the baggage of terrible memory safety.

Isn't this unrelated to the parent post's thoughts about the benefits of the C/C++ ecosystem (or lack thereof) for dependency management? I.e., a Rust-like language could still exist with a dependency management system similar to what C/C++ have now -- one that isn't predicated on how the language handles memory.

codr7 · 33d ago
Given how much critical software is written in C, and the number of problems we run into, I don't see a reason to keep repeating that line outside of the Rust marketing department.

Some people will always prefer C to Rust, might as well learn to live with that fact.

udev4096 · 33d ago
Remember how Cloudflare (in 2017) leaked pretty much everyone's secret tokens into search engine caches due to a simple buffer overflow? Yeah, that wouldn't have happened with Rust
guappa · 33d ago
I've seen segmentation faults in java, go, python. All you need is a bug in a hidden library :)
MrJohz · 33d ago
A segfault won't leak sensitive data, though.
guappa · 32d ago
The problem is that the code is incorrect; what matters is what happens instead of/before the segfault, not the segfault itself :)

What happens before/instead is normally worth a CVE.

packetlost · 33d ago
Segfaults, no. Usually they're a null dereference, but it could also be an out of bounds read on an array, which can leak data.
MrJohz · 33d ago
Not if it's segfaulting, no. That's the point of a segfault.
packetlost · 32d ago
This just isn't true. A segfault happens when an access falls outside the memory mapped into the process, such as dereferencing a 0x00 pointer. Pages are typically allocated in 4kB chunks, and allocators try to minimize the syscalls needed to allocate more virtual pages by maximizing reuse of existing pages. All of this makes an out-of-bounds access trivial to hit and hard to detect without additional checking code: https://paste.sr.ht/~chiefnoah/be5864cb0d78d6691fe3e36946709...

A runaway loop can access program memory until it segfaults pretty easily.

lelanthran · 33d ago
Remember that the most expensive exploit the world has ever seen was in a memory safe GC language?

My argument is that you are missing the point: the point is that a larger attack surface enables more exploits regardless of language.

When using a language that has tremendous friction in expanding the attack surface you tend to have a small attack surface as a result.

There's obviously a crossover point where you'd be safer with a memory-safe language and a larger attack surface than with a memory-unsafe language and a minuscule attack surface.

lmm · 33d ago
> Remember that the most expensive exploit the world has ever seen was in a memory safe GC language?

No I don't, which exploit are you talking about? The most expensive exploit I can think of was caused by heartbleed which was in a memory unsafe language. The "most expensive software bug" (not an exploit) caused by turning off the safe overflow handler in the language being used can hardly be considered an indictment of language level safety either. So what exploit are you talking about?

throw1111221 · 33d ago
Not the person you replied to, but they're probably talking about Log4j. It's a Java logging library that had a helpful feature where logging a special format string would pull code from a remote URL and execute it. So anywhere you can get a Java server to log something, you can run arbitrary code. (Ex: by setting a malicious User-Agent.) Estimates say 93% of enterprise cloud environments were affected.

I suppose Stuxnet could also count, where the initial infection depended on the human curiosity of plugging an unknown USB drive into an air-gapped system.

lelanthran · 32d ago
> No I don't, which exploit are you talking about?

Log4j

> The most expensive exploit I can think of was caused by heartbleed which was in a memory unsafe language.

Heartbleed was nowhere near as costly as Log4j. Last I checked, there were two orders of magnitude difference between the cost of fixing Log4j (which still isn't completely fixed for a few systems) and Heartbleed (which is completely fixed).

lmm · 32d ago
I wouldn't consider the remediation costs as being the costs of the exploit - that's more just a measurement of how widely used something is (if anything I'd say it should count for the other side - cost of exploitation divided by cost of remediation is a reasonable measure of how "bad" the bug was, because the cost of remediation is generally proportionate to the cost that was being saved in the first place). Heartbleed has the most expensive case of actual exploitation I can think of - the $73M JP Morgan hack. So far I haven't heard of any attackers actually using the log4j vulnerability.

> Log4j (which still isn't completely fixed for a few systems) than Heartbleed (which is completely fixed)

How are you counting that? There are definitely embedded systems out there running old versions of OpenSSL that will never be patched. Because there's no standard package management and vendoring dependencies is more common in the C world, it's probably less easy to get a list of vulnerable systems, but that doesn't mean the vulnerability isn't there.

lelanthran · 32d ago
> I wouldn't consider the remediation costs as being the costs of the exploit - that's more just a measurement of how widely used something is

Maybe you won't in general, but we're chatting on a thread about the threats of supply chain attacks.

Reading upthread, some GG...P thread espoused the idea that maybe the trade-off of using a memory-safe language with almost frictionless third-party dependencies might not always be worth it compared to a memory-unsafe language with very high friction for third-party dependencies.

In this context, the specific comment I replied to made a frankly asinine claim about how "this wouldn't happen in Rust", to which I felt compelled to point out that a) more expensive breaches have occurred in memory-safe languages, and b) supply chain attacks have large dollar impacts anyway.

To add, there's also c) The majority of breaches are occurring irrespective of tech stacks.

codr7 · 33d ago
Yeah I know, if only we could rewrite the entire world in Rust everything would be rainbows and unicorns. But it's not going to happen, deal with it.
klysm · 33d ago
I never mentioned rust. I’m just saying C and C++ have terrible memory safety.
codr7 · 33d ago
And what's the alternative then, from your perspective? What did you have in mind when you wrote the comment?
klysm · 33d ago
I had no alternative in mind. The topic at hand is security and bloat; C/C++ apps might be leaner in practice, but they are generally going to have memory safety bugs, which is a security problem.
codr7 · 32d ago
It is, but there are very good tools and plenty of experience with dealing with the problem. It's been blown way the fuck out of proportion lately by the Rust mob.
atoav · 33d ago
Bloat might be correlated with the ease of bloating software, and it is indeed easier to do precisely that if you don't have to write everything yourself.

Bloat is uncontrolled complexity, and making complexity harder to add reduces bloat. But it also makes it harder to write software that has to be complex for legitimate reasons. Not everybody should write their own library for handling SSL, SQL, or regexes, for example. But those libraries are rarely the problem; things like leftpad are.

Or: you can use package systems for good and for evil. The only real way to fight bloat is to be disciplined and vet your dependencies. It must cost you something to pull them in. If you have to read and understand everything you pull in, pulling in everybody and their dog suddenly becomes less desirable.

Also, I think this is much more an issue of the quality of dependencies than of using dependencies at all (it would be stupid to write 1000 implementations of HTTP for a language; one that works really well is better).

RetroTechie · 32d ago
> But it also makes it harder to write software that has to be complex for legitimate reasons.

Might have stolen this quote somewhere, but imho:

Simple things should be easy, complex things should be possible.

Related: software (binary) size should reflect the complexity of the problem domain.

Some time ago, I ran down the sizes of apps on my phone. Smallest one? ~2MB. What does that app do? Calculate some hash of a file. Select a file, it does its thing, shows the hash (and/or copies it to clipboard).

What the ..!?!#$ 2,000,000+ bytes for that?

This is on Android, 'batteries included'. Selecting / opening a file should be a couple (or couple dozen) lines of source code, a function call to the OS, and presto. Same with reading file contents, and display output / clipboard copy.

Which leaves... computing the hash. I'm not an expert, but what hash functions are so complex that you'd need a MB+ of code to calculate? (answer: none).

Note that this app was the least worse offender.

Conclusion: the Android app model is broken. Or the SDKs used to build Android apps are crap. Or other reasons / some combination thereof. Regardless, ~2MB to compute a file hash is ridiculous. Full-blown graphical user interfaces (GUIs) have been done in less.

I'd be interested to know what that 2MB consists of, though. And where the hash function is at. And what (minute) % of overall binary size. And what all the rest of that binary does.

reaperducer · 33d ago
> Now, with systems like npm, maven or cargo, all you need to do to get a package is to add a line in a configuration file

They can't hack what doesn't exist.

Reducing surface area is sometimes the easiest security measure one can take.

udev4096 · 33d ago
Go has the most lean and simple dependency management. It's far better than the npm or pypi dumpster fires
watermelon0 · 33d ago
It's also worth mentioning the extensive standard library and golang.org/x/, which means that you generally don't even need that many 3rd party packages.
udev4096 · 33d ago
Also the extensive measures to secure the package supply chain [0]

[0] - https://go.dev/blog/supply-chain

guappa · 33d ago
Go is a dumpster fire as well as those.

edit: lol at the downvotes. Go developers showing how insecure they are once again.

staunton · 33d ago
Since you're apparently interested in downvotes (why?), I'm pretty sure it's not due to criticism of Go but rather the fact that your criticism is entirely non-specific and therefore doesn't add anything to the discussion...
guappa · 33d ago
Because the comment I replied to was so specific?

There's plenty of perfectly good libraries on npm and pypi, and there's awful ones. Likewise for go which pulls from "the internet".

Must I really demonstrate that bad code exists in go? You want examples? There's plenty of bad libraries in go, and pinning to a commit is a terrible practice in any language. Encourages unstable APIs and unfixable bugs.

lmm · 33d ago
It added just as much to the discussion as the comment it was in reply to, so downvoting one but not the other seems somewhat unfair.
dvh · 33d ago
People often think "speed" when they read "bloat". But bloat often means layers upon layers of indirection. You want to change the color of the button in one dialog. You find the dialog code, change the color and nothing. You dig deeper and find that some modules use different colors for common buttons, so you find the module setting, change the color and nothing. You dig deeper and find that global themes can change colors. You find the global theme, change the color and nothing. You start searching the entire codebase and find that over 17 files change the color of that particular button, and one of those files does it in a timer loop because your predecessor couldn't find out why the button color changed 16 times on startup, so he just constantly changed it to brown once a second. That is bloat. A trivial change will take you half a day. And the PM is breathing down your neck asking why changing a button color takes so long.
alganet · 33d ago
No. What you described is known as technical debt.

Bloat affects the end user, and it's a loose definition. Anything that was planned, went wrong, and affects user experience could be defined as bloat (too many toolbars, like Office had; too many purposes, like iTunes had; etc).

Bloat and technical debt are related, but not the same. There is a lot of software that has a very clean codebase and bloated experience, and vice-versa.

Speed is an ambiguous term. It is often better to think in terms of real performance and user-perceived performance.

For example, many Apple UX choices prioritize user-perceived performance instead of real performance. Smooth animations to cover up loading times, things such as that. Their own users don't even know why; they often cannot explain why it feels smooth, even experienced tech people.

Things that are not performant but appear to be fast are good examples of good user-perceived performance.

Things that are performant but appear to be slow exist as well (fast backend lacking proper cache layer, fast responses but throttled by concurrent requests, etc).

FirmwareBurner · 33d ago
>many Apple UX choices prioritize user perceived performance instead of real performance.

Then why does Apple still ship 60Hz displays in 2025? The perceived performance of scrolling a web page at 60Hz is jarring no matter how performant your SoC is.

jsheard · 33d ago
Apple backed themselves into a corner with desktop monitors by setting the bar for Retina pixel density so high, display manufacturers still aren't able to provide panels which are that large and very dense and very fast. Nobody makes 5K 27" 120hz+ monitors because the panels just don't exist, not to mention that DisplayPort couldn't carry that much data losslessly until quite recently.

There's no excuse for 60hz iPhones though, that's just to upsell you to more expensive models.

os2warpman · 33d ago
> Then why does Apple still ship 60Hz displays in 2025?

To push people who want faster displays to their more expensive offerings.

60Hz: $1000

120Hz: $1600

That's one reason, among many, why Apple has a $3 trillion market cap.

For a site with so many people slavishly obsessed with startups and venture capital, there seems to be a profound lack of understanding of what the function of a business is. (mr_krabs_saying_the_word_money.avi)

alganet · 33d ago
I don't know why.

I said many choices are focused on user-perceived performance, not all of them.

Refresh rate only really makes a case for performance in games. In everyday tasks, like scrolling, it's more about aesthetics and comfort.

Also, their scrolling at 60Hz looks better than scrolling on Android at 60Hz. They know this. Why they didn't prioritize 120Hz screens is beyond my knowledge.

Also, you lack attention. These were merely examples to expand on the idea of bloat versus technical debt.

I am answering out of kindness and in the spirit of sharing my perspective to point the thread in a more positive discussion.

FirmwareBurner · 33d ago
>Refresh rate only really makes a case for performance in games

Refresh rate really matters for everything in motion, not just games; that's why I said scrolling.

> In everyday tasks, like scrolling, it's more about aesthetics and comfort.

Smooth scrolling IS everyday comfort. Try going from 120Hz to 60Hz and see how you feel.

>their scrolling on 60Hz looks better than scrolling on Android at 60Hz.

Apple beat physics?

alganet · 33d ago
You lack attention. It matters for comfort in everything; it matters for performance in games much more. Most users don't even know about refresh rates, they just know their iPhones feel good.

They don't let you scroll as fast as Android does, which makes the flickering, disorienting sensation of speed-scrolling at a low refresh rate less prominent. It optimizes for comfort given the hardware they opted to use.

Android lets you scroll faster, and it does not adjust the scrolling dynamics according to the refresh rate setting. It's optimized for the high-end models with 120Hz or more, so it sucks on low-end settings or phones.

Some people take years to understand those things. It requires attention.

insomagent · 33d ago
Battery life? Temperature? Price-to-performance ratio? These are not decisions that are solved as simply as decreeing "every device must have at least 3000Hz refresh rate."
nicce · 33d ago
I have heard that battery life is the primary reason. After all, it is the screen and modem that consume most of it.

Could be about 20% worse battery life.

https://www.phonearena.com/news/120Hz-vs-60hz-battery-life-c...

BobbyTables2 · 33d ago
At the library level, I dislike how coarse-grained most things are. Sadly, it becomes easier to reimplement things to avoid huge dependency chains.

Want a simple web server ? Well, you’re going to get something with a JSON parser, PAM authentication, SSL, QUIC, websockets, an async framework, database for https auth, etc.

Ever look at "curl"? The number of protocols is dizzying — one could easily think that HTTP is only a minor feature.

At the distro level, it is ridiculous that, so long after Alpine Linux, the chasm between it and Debian/RHEL remains. A minimal Linux install shouldn’t be 1GB…

We used to boot Linux from a 1.44MB floppy disk. A modern GRUB installation would require a sizable stack of floppies! (GRUB and Windows 3.0 are similar in size!)

procaryote · 33d ago
> Want a simple web server ? Well, you’re going to get something with a JSON parser, PAM authentication, SSL, QUIC, websockets, an async framework, database for https auth, etc.

Simple means different things to different people, it seems. For a simple web server, you need a TCP socket.

If you want a full featured high performance web server, it's not gonna be simple.

udev4096 · 33d ago
Alpine's biggest hurdle is musl. Most software still assumes glibc. You should look into unikernels [0]; they're the most slimmed-down version of Linux that you can ship. I'm not sure how different a unikernel is from a distroless image, though.

[0] - https://unikraft.org/

anacrolix · 32d ago
Alpine is not as good as it seems. It's mostly broken; it just works when you ask it to run a handful of common tools. Everything out of view is completely broken.
actionfromafar · 33d ago
I think we lost something with static linking when going from C to Dotnet. (And I guess Java.) Many C (and C++, especially "header only") libraries when statically linked are pretty good at filtering out unused code.

Bundling stuff in Dotnet is done much more at "runtime", often both by design of the library (it uses introspection¹) and the tools².

1: Simplified argument - one can use introspection and not expect all of the library to be there, but it's trickier.

2: Even when generating a self-contained EXE, the standard toolchain performs no end-linking of the program; it just bundles everything up in one file.

anacrolix · 32d ago
I disagree. Most people here myself included aren't using Java or .NET. You are in a microcosm in this audience.
neonsunset · 32d ago
> I think we lost something with static linking when going from C to Dotnet. (And I guess Java.) Many C (and C++, especially "header only") libraries when statically linked are pretty good at filtering out unused code.

This is an interesting statement because, for example, in C version of Mimalloc you end up paying for opt-in assertions because they still exist in the code unless you compile a different version that strips them away. In C# port, you can set the same assertions/checks early with AppContext switch, and then the values will be cached in static readonly fields. Then, when JIT recompiles the code to a more optimized version, these values will become JIT constants leading to all the unreachable code to be optimized away completely (and to much better inlining of now streamlined methods).

> Even when generating a self contained EXE, the standard toolchain performs no end-linking of the program, it just bundles everything up in one file.

  /p:PublishTrimmed=true
or even

  /p:PublishAot=true

(Note: it's better to set this as a project property, but either way it requires non-optional linking.)
Lastly, consider that JITing the bytecode essentially acts as if everything were a single, statically linked compilation unit, since it's not subject to the inconvenient compilation-unit restrictions that even Rust is subject to, the problems of which need to be cleaned up with link-time optimization.
kant2002 · 31d ago
I think you overestimate the ability of Dotnet to trim unused things. As a person who has spent a lot of time wandering across the ecosystem and measuring what can be done, I would say we have very bulky and complicated libraries in .Net.

Just bringing in HttpClient (without SSL support) adds 6 MB of generated code.

Minimal API gets you an additional 21 MB. And we're not even talking about desktop applications here.

Reflection is at the very core of the .Net ecosystem, and you cannot reliably trim it with how we use it currently.

neonsunset · 31d ago
Last time I checked the base web template (the one which uses minimal API) was around 10-12 MB (which is pretty good for something with a full web server, GC, async runtime and more). I’ll message you in private to see what’s going on.

But otherwise yes, reflection is used heavily even when completely unnecessary.

_fat_santa · 33d ago
> At the distro level, it is ridiculous that so long after Alpine Linux, the chasm between them and Debian/RHEL remains. A minimal Linux install shouldn’t be 1GB…

I would say this is a feature and not a bug. Alpine Linux is largely designed to be run in containerized environments, so you can have an extremely small footprint because you don't have to ship stuff like a desktop or really anything beyond the very basics.

Compare that to Ubuntu which for the 5GB download is the "Desktop" variant that comes with much more software

michaelmrose · 33d ago
>A minimal Linux install shouldn’t be 1GB

Why not? This seems pretty arbitrary. Seemingly developer time or functionality would suffer to achieve this goal. To what end?

Who cares how many floppies GRUB would require when it's actually running on a 2TB SSD? The actually simpler thing, instead of duplicating effort, is to boot into Linux and use Linux to show the boot menu, then kexec into the actual kernel or set it to boot next. See zfsbootmenu and "no more boot loader": this is simpler and less bloated, but it doesn't use less space.

spacerzasp · 33d ago
There is more to size than storage space. Larger applications take more memory and more CPU cache; things spill over to main memory, latencies grow, and everything runs much slower.
michaelmrose · 32d ago
For practical purposes, given more than enough RAM and fast storage, there is no meaningful user-discernible performance difference between a 500MB OS and a 30GB OS.

Whereas very small linux distros are useful in several areas like containers and limited hardware running such on the desktop is an objectively worse experience and is moreso a minimalism fetish than a useful strategy.

RetroTechie · 32d ago
> (..) there is no meaningful user-discernible performance difference between a 500MB OS and a 30GB OS.

I call BS. A small single-board computer I have came with 8 GB of RAM, not especially big or small. 500 MB would fit into this comfortably, leaving ~7.5GB for apps. Load everything into RAM once, run from there. RAM bandwidth is ~8.5GB/s.

30 GB wouldn't fit. So: swap everything in & out using a (cheapish) SSD over a x1 PCIe lane. Or (more common) from an SD card / eMMC module. Think ~100 MB/s on a good day. That's with apps competing for the memory crumbs left.

That's a ~85x factor difference. 2 orders of magnitude. Yes users would notice.

Sure, developer with fully decked out system doesn't see this. Or even understand it. But:

Size matters.

Note: smartphones, tablets etc are not unlike that SBC. And flash storage tends to be on the low-end side where speed is concerned. Desktop you say? Nope, smartphones & tablets are where it's at these days.

michaelmrose · 32d ago
Intelligently swapping stuff from storage to RAM is literally how most OSes on earth have worked for a while, because as long as you have enough RAM to keep what is liable to be used soon, performance can trivially be excellent.

Libreoffice on my system spends 99.9% of the time consuming only 650MB of storage. Opening an office doc makes it require about 165MB of RAM. The consequence of it being swapped out at some point is that it takes slightly longer to get started the next time on the order of an additional 0.6 seconds.

If you watched me and the computer whilst I completed a 15 minute task with office you would note that the computer spent most of its time waiting on me rather than the other way around.

It would start 0.6 seconds faster, but it wouldn't get done meaningfully faster. It would be about seven hundredths of 1% faster, rather than "two orders of magnitude" faster.

Worse, if I really want a faster LibreOffice I can just start it at boot and thereafter create new Writer windows in milliseconds; I wouldn't be obliged to run my entire OS from RAM to achieve this goal.

Virtually nobody runs standard desktop Linux on smartphones or tablets. Distros that target desktops and laptops should not reduce their fitness where they are actually used in order to be better suited for environments in which they are not.

RetroTechie · 32d ago
> Libreoffice on my system spends 99.9% of the time consuming only 650MB of storage. Opening an office doc makes it require about 165MB of RAM.

Most office type docs I have, are a few hundred KB (some smaller) to a couple of MBs.

So in your example, that means checking a small document takes (on average) on the order of 100..1000x the document's size worth of RAM. And 'only' 4x that amount of storage for the app doing it.

It wasn't long ago that file sizes vs. code to process it, were more like in the 10:1..1:10 range. 200KB text editor, 50KB text. 100KB image, image viewer under 1MB, etc.

As file sizes grow (higher screen resolutions etc), a reasonable expectation would be for code size (=file format complexity + interfacing with the OS) to lag behind. But the reverse seems to be happening. And let's not get started about browsers, or (worse) "web frameworks".

So if anything, your example nicely demonstrates the point of the article.

michaelmrose · 32d ago
There is no inherently stable ratio of RAM consumed to document size, because the smallest possible document still requires 100% of the basic app and its assets to be loaded in order for the app to work, and thereafter memory use isn't expected to grow linearly with the size of the file.

What you are seeing is the expansion of the baseline app, not the expansion of RAM required per KB of data. Indeed, multiplying your post into a 3000-page monstrosity through the magic of cut and paste and select-all only took around twice as much memory as a blank document.

> As file sizes grow (higher screen resolutions etc), a reasonable expectation would be for code size (=file format complexity + interfacing with the OS) to lag behind.

It is pretty clear that the opposite is always going to be true. Programs that don't die outright accumulate features, and file formats multiply over time. Further, even if the app were the same, there are going to be opportunities to trade RAM consumed for a better experience, trades that make more sense the more plentiful RAM is.

There is no expectation whatsoever that coders targeting machines with 16GB of RAM and TBs of storage produce applications as parsimonious as those targeting machines with 512MB of RAM and GBs of storage.

If you want parsimony, you can always run Emacs and export to PDF; it's rather fun.

datadrivenangel · 32d ago
There is never actually enough RAM and fast storage.
michaelmrose · 32d ago
There is enough that, for most consumer use cases, micro-optimization only makes sense in the context of poverty.