Bloat is still software's biggest vulnerability (2024)
245 points by kristianp on 5/6/2025, 11:33:54 PM | 187 comments | spectrum.ieee.org
Now, with systems like npm, Maven, or Cargo, all you need to do to get a package is add a line to a configuration file, and it fetches all the dependencies you need automatically from a central repository. Very convenient; however, you can quickly find yourself with 100+ packages from who knows where and hundreds of MB of code.
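For a sense of scale, here's a rough sketch (Python, purely illustrative) of the kind of check that makes the problem visible: after one install in a fresh environment, list everything that actually came along for the ride.

    # rough sketch: count what a single "pip install <something>" dragged in
    # (run inside a fresh virtualenv so the numbers aren't polluted)
    from importlib.metadata import distributions

    names = sorted(dist.metadata["Name"] for dist in distributions())
    print(len(names), "packages installed")
    for name in names:
        print(" ", name)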
In C, traditionally, every library you include requires some consideration. There is no auto-download, and the library the user has may be a different version from the one you worked with, so you have to accommodate that, and so does the library publisher. Or you may have to ship it with your own code. Anyway, it is so messy that the simplest solution is often not to use a library at all and write the thing yourself, or even better, realize that you don't need the feature you would have used that library for.
A bad reason, and reinventing the wheel comes with its own set of problems, but at least the resulting code is of a manageable size.
* The closer something is to your core business, the less you externalize.
* You always externalize security (unless security is your exclusive core business).
Say you are building a tax calculation web app. You use dependencies for things like CSS generation or database access. You do not rely on an external library for tax calculation; you maintain your own code. You might use an external library for handling currencies properly, because it's a tricky math problem, but you may want to use your own fork instead, as it is close to your core business.
On the security side, unless that's your speciality, there are people out there smarter than you and/or who have dedicated more time and resources than you to figuring that stuff out. If you are programming a tax calculation web app you shouldn't be implementing your own authentication algorithm, even if keeping your tax information secure is one of your core needs. The exception is when your core business is literally implementing authentication and nothing else.
Just follow OWASP recommendations. A while back this was posted to HN and it also provides great recommendations: https://thecopenhagenbook.com/ .
Dropping your rights to open files as soon as possible, for example, or thinking about what information would be available to an attacker should they get RCE on the process. Shoehorning in solutions to these things after the fact tends to be so difficult that it's a rare sight.
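A minimal sketch of the "drop your rights early" idea, assuming a Unix-like system and a made-up log path:

    import os

    # acquire privileged resources first...
    log = open("/var/log/myapp.log", "a")  # hypothetical path that needs elevated rights
    # ...then drop privileges, so RCE later in this process is confined
    os.setgid(65534)  # nogroup
    os.setuid(65534)  # nobody
    # the already-open handle keeps working; new privileged opens now fail
    log.write("running unprivileged\n")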
It has been recommended to me to think of security as a process rather than an achievable state, and I have become quite fond of that perspective.
I've seen devs pull in frameworks just to get access to a single, simple-to-write function.
Yeah, we shouldn't roll our own cryptography, but security isn't as clean cut as this comment implies. It also frequently bleeds into your business logic.
Don't confuse externalizing security with externalizing liability.
Also, people often complain about "bloat" but don't realize that C/C++ projects are often the most bloated ones, precisely because importing libraries is a pain, so authors try to include everything in a single library even though you only need less than 10% of it. Look at Qt, for example: it is supposed to be a UI framework, but it ends up implementing vectors, strings, a JSON parser, and who knows how much more. But it's just 1 dependency, so it's fine, right?
Qt is an application development framework, not a GUI toolkit. This is one reason I prefer GTK (there are things I dislike about it too).
I don't know. Suppose you tell us.
I'm not so sure C/C++ solves the actual problem. It only sweeps it under the carpet so it's much less visible.
You are right. But my conclusion is different.
If the codebase is stable and people have been there for a while, then developers know that code as well as the rest. So, when something fails, they know how to fix it.
Bringing in generic libraries may create long call stacks of very generic code (usually templates) that is very difficult to debug, while adding a lot of functionality that is never used.
Bringing a new library into the code base needs to be a tough decision.
Same here. And a lot of those homegrown functions, utilities and classes are actually already available, and better implemented, in the C++ Standard Library. Every C++ place I've worked had its own homegrown String class, and it was always, ALWAYS worse in all ways than std::string. Maddening. And you could never make a good business case to switch over to sanity. The homegrown functions had tendrils everywhere and many homegrown classes relied on each other, so your refactor would end up touching every file in the source tree. Nobody is going to approve that risky project. Once you start down the path of rolling your own standard library stuff, the cancer spreads through your whole codebase and becomes permanent.
Check this out: https://news.ycombinator.com/item?id=39019001
Of course, this is the whole environment except for Node.js itself. And Vite has improved it.
But there are definitely some tools that are worse than others.
You can, but I think this thread speaks volumes about the problem with the JavaScript/NPM ecosystem as a whole vs. pretty much any other.
We need something else for the web. The only reason we have 200+ NPM packages for a blank project is that JavaScript is atrocious and has almost nothing built in. We got crap like left-pad, is-odd, is-array, etc. because of the language. Most of what NPM will pull in on any new web front-end project is likely already part of the standard library in C#/.NET, Java, Go, etc.
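For comparison, the three packages named above are one-liners in any language with a reasonable standard library; a quick sketch (Python here, but C#/Java/Go look much the same):

    print("42".rjust(5, "0"))      # left-pad
    print(7 % 2 == 1)              # is-odd
    print(isinstance([], list))    # is-array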
You could go further back and say it's not JavaScript's fault; it's the fault of trying to hammer the web into doing things it was never designed to do in the first place. But we insisted on making it an application delivery platform, and now we're suffering the consequences. I'm hopeful for WASM, but ideally I'd love to see a resurgence of native apps.
Meanwhile, the web runs in a web browser. You have a network bar. You can inspect element. You can inject Javascript. You can run your own code inside people's apps.
The former scenario isn't better than the latter scenario just because some people build their website with too many NPM dependencies.
Cathedrals are conservative. Reactionary, even. You can measure the rate of change by generations.
Bazaars are accessible and universal. The whole system is chaotic. Changes happen every day. No single agent is in control.
We need both to make meaningful progress, and it's the job of engineers to take any given problem and see where to look for the solution.
A lot of projects would simply not exist without it. Linux comes to mind. I guess one might take the position that "Windows is fine," but would there ever even have been competition for Windows?
Another example: everyone would be rolling their own crypto without OpenSSL, and that would mean software that's a lot more insecure than what we have. Writing software with any cryptographic functionality in mind would be the privilege of giant companies only (and it would still suck a lot more than what we have).
There are a lot more examples. The internet and software in general would be set back ~20 years. Even with all the nostalgia I can muster, that seems like a much worse situation than today.
The ground truth is that software bloat isn't a bad enough problem for software developers to try to fight it. We already know how to prevent this, if we really want to. And if the problem were really hurting so much, we'd have automated ways of slimming down executables and libraries.
In my role creating CI for Python libraries, I did more hands-on dependency management. My approach was to first install libraries with pip, see what was installed, research why particular dependencies had been pulled in, then, if necessary, modify the packages so that unnecessary dependencies were removed, and "vendor" the third-party code (i.e., store it in my repository, at the version I need). This obviously works better for programs, where you typically end up distributing the program with its dependencies anyway, and less so for libraries; but in the context of CI it saved some long minutes of reinstalling dependencies afresh for every CI run.
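A rough sketch of the vendoring half of that approach (package name and version are only examples):

    # run once and commit the result:
    #   pip install --target vendor/ requests==2.31.0
    import pathlib
    import sys

    # make the vendored copy win over anything freshly installed
    sys.path.insert(0, str(pathlib.Path(__file__).resolve().parent / "vendor"))
    import requests  # now resolved from ./vendor, no network needed in CI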
In the end, it was a much better experience than what you usually get with CI targeting Python. But, also in the end, nobody really cared. If CI took less than a minute to complete instead of twenty minutes, very little was actually gained. The project didn't have enough CI traffic for this to have any real effect. So it was a nice proof of concept, but it ended up being not all that useful.
This gives a couple of advantages: you own your code; no bloat; it's usually simpler, since it skips the bells and whistles; less abstraction, so often faster (there is no free lunch); and it minimizes the attack surface for supply chain attacks...
For fun, the next time you are tempted to install a BlaZiNg FaSt MaDe in RuSt piece of software: get the source, install cargo-audit, and run it on that project.
See how many vulnerabilities there are. So far, in my experience, all the software I've checked comes with its own list of vulnerabilities from transitive dependencies.
I don't know about npm, I only know by reputation and it's enough for me to avoid.
Yes, you should not just pull in as a dependency something a kid wrote in his parents' basement for fun or to get "OSS maintainer" on his CV.
But there are tons of legitimate libraries and frameworks from people who are better than you at that specific domain.
Here's a scenario. You pull in some library - maybe it resizes images or something. It in turn pulls in image decoders and encoders that you may or may not need. They in turn pull in metadata readers, and those pull in XML libraries to parse metadata, and before you know it a fairly simple resize is costing you 10s of MB.
Worse, you pull in different libraries and they all pull in different versions of their own dependencies, with lots of duplication of similar but slightly different code. node_modules usually ends up like this.
The point is not writing the resize code yourself. It's the cultural effect of friction. If pulling in the resize library means you need to chase down the dependencies yourself, first, you're more aware of the cost, and second, the library author will probably give you knobs to eliminate dependencies. Perhaps you only pull in a JPEG decoder because that's all you need, and you exclude the metadata functionality.
It's an example, but can you see how adding friction to pulling in every extra transitive dependency would have the effect of library authors giving engineers options to prune the dependency tree? The easier a library is to use, the more popular it will be, and a library that has you chasing dependencies won't be easy to use.
This is more likely to happen in C++, where any library that isn't header-only is forced to be an all encompassing framework, precisely because of all that packaging friction. In an ecosystem with decent package management your image resizing library will have a core library and then extensions for each image format, and you can pull in only the ones you actually need, because it didn't cost them anything to split up their library into 30 tiny pieces.
You're thinking correctly on principle, but I think this is also the cause of the issue: it's too easy to pull in a Node dependency even thoughtlessly, so it's become popular.
It would require adding friction to move back from that and render it less easy, which would probably give rise to a new, easy and frictionless solution that ends up in the same place.
Building everything from scratch is insane, but so is uncritically growing a dependency jungle.
I don't see the nuance there; that is my take on the comment. Those are pretty much the strongest possible statements, and the points in favor of using libraries are minimal.
That is why I added mine, strongly pointing out that real-world systems are not going to be of a "manageable size" unless they are really small or a single person is working on them.
While in for instance C# I tend to think "this would be simple to implement with whatever-fancy-thing-is-just-a-package-away".
Neither way can be judged as good or bad on its own.
A real world system is almost always part of a larger system or system of systems. Making one thing simple can make another complex. The world is messy.
A small standard library pairs well with an easy mechanism to download code, but at some point it's probably a crutch. There are maybe 5 functions in lodash at this point that show up routinely in production code but aren't covered by existing additions to ECMAScript: sortBy, recursive get, and recursive merge being among the most useful. We could just have these and be done.
Or is this a sticking plaster? Genuinely don't know as I only develop personal projects.
It cannot detect a case such as: if the string argument to this function contains a substring shaped like XYZ, then replace that substring with a value from the environment variables (the Log4j vulnerability), or from the file system (the XML Entity Extension vulnerability). From the point of view of tree-shaking, this is legitimate code that could be called. This is the kind of vulnerable bloat that comes with importing large libraries (large in the sense of "has many complex features", rather than of megabytes).
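A toy illustration of the shape of that problem (not Log4j itself): from a static-reachability point of view this is perfectly ordinary, live code, so tree-shaking has no reason to drop it.

    import os
    import re

    def expand(message: str) -> str:
        # replace ${env:NAME} with the value of that environment variable
        return re.sub(
            r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}",
            lambda m: os.environ.get(m.group(1), ""),
            message,
        )

    print(expand("user said: ${env:HOME}"))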
I suppose the options are then:
1. Write everything yourself: time-consuming and hard, but less likely to lead to these types of vulnerabilities.
2. Import others' code: easy and takes no time, but can lead to vulnerabilities.
3. Use others' code, but only what you actually need: maybe less time-consuming than 1 but more than 2, and it adds a different sort of complexity; done correctly, it is less likely to lead to these vulnerabilities.
Not sure if there's any other options here?
Isn't this unrelated to the parent post's thoughts about the benefits of the C/C++ ecosystem (or lack thereof) for dependency management? I.e., a Rust-like language could still exist with a dependency management system similar to what C/C++ have now; that isn't predicated on how the language handles memory.
Some people will always prefer C to Rust, might as well learn to live with that fact.
What happens before/instead is normally worth a CVE.
A runaway loop can access program memory until it segfaults pretty easily.
My argument is that you are missing the point: the point is that a larger attack surface enables more exploits regardless of language.
When using a language that has tremendous friction in expanding the attack surface you tend to have a small attack surface as a result.
There's obviously a crossover point where you'd be safer with a memory-safe language and a larger attack surface than with a memory-unsafe language and a minuscule attack surface.
No, I don't; which exploit are you talking about? The most expensive exploit I can think of was caused by Heartbleed, which was in a memory-unsafe language. The "most expensive software bug" (not an exploit), caused by turning off the safe overflow handler in the language being used, can hardly be considered an indictment of language-level safety either. So which exploit are you talking about?
I suppose Stuxnet could also count, where the initial infection depended on the human curiosity of plugging an unknown USB drive into an air-gapped system.
Log4j
> The most expensive exploit I can think of was caused by heartbleed which was in a memory unsafe language.
Heartbleed was nowhere near as costly as Log4j. Last I checked, there were two orders of magnitude between the cost of fixing Log4j (which still isn't completely fixed for a few systems) and the cost of fixing Heartbleed (which is completely fixed).
> Log4j (which still isn't completely fixed for a few systems) than Heartbleed (which is completely fixed)
How are you counting that? There are definitely embedded systems out there running old versions of OpenSSL that will never be patched. Because there's no standard package management and vendoring dependencies is more common in the C world, it's probably less easy to get a list of vulnerable systems, but that doesn't mean the vulnerability isn't there.
There is. Linux distributions have package managers whose entire purpose is to distribute and manage applications and their dependencies.
The key difference between Linux distribution package managers and programming language package managers is the presence of maintainers. Any random person can push packages to the likes of npm or PyPI. To push packages to Debian or Arch Linux, you must be known and trusted.
Programming language package managers are made for developers who love the convenience of pushing their projects to the world whenever they want. Linux distribution package managers are made for users who prefer to trust the maintainers not to let malware into the repositories.
Some measured amount of elitism can be a force for good.
They can't hack what doesn't exist.
Reducing surface area is sometimes the easiest security measure one can take.
[0] - https://go.dev/blog/supply-chain
edit: lol at the downvotes. Go developers showing how insecure they are once again.
There's plenty of perfectly good libraries on npm and pypi, and there's awful ones. Likewise for go which pulls from "the internet".
Must I really demonstrate that bad code exists in go? You want examples? There's plenty of bad libraries in go, and pinning to a commit is a terrible practice in any language. Encourages unstable APIs and unfixable bugs.
I do get what you mean, but it works only on some very specific types of projects, when you & potentially comparably (very) good & skilled peers are maintaining and evolving it long term. This was never the case in my 20 years of dev career.
This sort of progression, from shared, well-tested libraries to gradual dependency hell, exists in some form across all similar languages, since it's a pretty basic use case of software development as an engineering discipline. I haven't seen a good silver bullet so far, and e.g. the past 14 years of my work wouldn't have been possible with the approach you describe.
Bloat is uncontrolled complexity and making it harder to manage complexity reduces bloat. But it also makes it harder to write software that has to be complex for legitimate reasons. Not everybody should write their own library handling SSL, SQL or regex for example. But those libraries are rarely the problem, things like leftpad are.
Or: you can use package systems for good and for evil. The only real way to fight bloat is to be disciplined and vet your dependencies. It must cost you something to pull them in. If you have to read and understand everything you pull in, pulling in everybody and their dog suddenly becomes less desirable.
Also, I think this is much more an issue of the quality of dependencies than of using dependencies themselves (it would be stupid to write 1,000 implementations of HTTP for a language; one that works really well is better).
Might have stolen this quote somewhere, but imho:
Simple things should be easy, complex things should be possible.
Related: software (binary) size should reflect the complexity of the problem domain.
Some time ago, I ran down the sizes of the apps on my phone. Smallest one? ~2MB. What does that app do? Calculate a hash of a file. Select a file, it does its thing, shows the hash (and/or copies it to the clipboard).
What the ..!?!#$ 2,000,000+ bytes for that?
This is on Android, 'batteries included'. Selecting / opening a file should be a couple (or couple dozen) lines of source code, a function call to the OS, and presto. Same with reading file contents, and display output / clipboard copy.
Which leaves... computing the hash. I'm not an expert, but what hash functions are so complex that you'd need a MB+ of code to calculate? (answer: none).
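(For scale, the hashing itself is a handful of lines with any stock crypto library; a sketch in Python rather than whatever the app is actually written in:)

    import hashlib
    import sys

    h = hashlib.sha256()
    with open(sys.argv[1], "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    print(h.hexdigest())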
Note that this app was the least bad offender.
Conclusion: the Android app model is broken. Or the SDKs used to build Android apps are crap. Or other reasons / some combination thereof. Regardless, ~2MB to compute a file hash is ridiculous. Full-blown graphical user interfaces (GUIs) have been done in less.
I'd be interested to know what that 2MB consists of, though. And where the hash function is at. And what (minute) % of overall binary size. And what all the rest of that binary does.
Bloat affects the end user, and it's a loose definition. Anything that was planned, went wrong, and affects user experience could be defined as bloat (many toolbars like Office had, many purposes like iTunes had, etc).
Bloat and technical debt are related, but not the same. There is a lot of software that has a very clean codebase and bloated experience, and vice-versa.
Speed is an ambiguous term. It is often better to think in terms of real performance and user-perceived performance.
For example, many Apple UX choices prioritize user-perceived performance instead of real performance: smooth animations to cover up loading times, things such as that. Their own users don't even know why; they often cannot explain why it feels smooth, even experienced tech people.
Things that are not performant but appear to be fast are good examples of good user-perceived performance.
Things that are performant but appear to be slow exist as well (fast backend lacking proper cache layer, fast responses but throttled by concurrent requests, etc).
Then why does Apple still ship 60Hz displays in 2025? The perceived performance on scrolling a web page on 60Hz is jarring no matter how performant your SoC is.
There's no excuse for 60hz iPhones though, that's just to upsell you to more expensive models.
To push people who want faster displays to their more expensive offerings.
60Hz: $1000
120Hz: $1600
That's one reason, among many, why Apple has a $3 trillion market cap.
For a site with so many people slavishly obsessed with startups and venture capital, there seems to be a profound lack of understanding of what the function of a business is. (mr_krabs_saying_the_word_money.avi)
I said many choices are focused on user-perceived performance, not all of them.
Refresh rate only really makes a case for performance in games. In everyday tasks, like scrolling, it's more about aesthetics and comfort.
Also, their scrolling on 60Hz looks better than scrolling on Android at 60Hz. They know this. Why they didn't prioritize using 120Hz screens is out of my knowledge.
Also, you lack attention. These were merely examples to expand on the idea of bloat versus technical debt.
I am answering out of kindness and in the spirit of sharing my perspective to point the thread in a more positive discussion.
Refresh rate really matters for everything in motion, not just games, that's why I said scrolling.
> In everyday tasks, like scrolling, it's more about aesthetics and comfort.
Smooth scrolling IS everyday comfort. Try going from 120Hz to 60Hz and see how you feel.
>their scrolling on 60Hz looks better than scrolling on Android at 60Hz.
Apple beat physics?
Could be about 20% worse battery life.
https://www.phonearena.com/news/120Hz-vs-60hz-battery-life-c...
They don't let you scroll as fast as Android does, which makes the flickering disorienting sensation of speed scrolling in a low refresh rate less prominent. It optimizes for comfort given the hardware they opted to use.
Android lets you scroll faster, and it does not adjust the scrolling dynamics according to the refresh rate setting. It's optimized for the high end models with 120Hz or more, so it sucks on low end settings or phones.
Some people take years to understand those things. It requires attention.
Want a simple web server? Well, you're going to get something with a JSON parser, PAM authentication, SSL, QUIC, websockets, an async framework, a database for HTTPS auth, etc.
Ever look at "curl"? The number of protocols is dizzying — one could easily think that HTTP is only a minor feature.
At the distro level, it is ridiculous that so long after Alpine Linux, the chasm between them and Debian/RHEL remains. A minimal Linux install shouldn’t be 1GB…
We used to boot Linux from a 1.44mb floppy disk. A modern Grub installation would require a sizable stack of floppies! (Grub and Windows 3.0 are similar in size!)
[0] - https://unikraft.org/
Simple means different things to different people, it seems. For a simple web server you need a TCP socket.
If you want a full featured high performance web server, it's not gonna be simple.
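To make "simple" concrete, a toy sketch of the bare-TCP-socket end of the spectrum, standard library only (obviously not something you'd ship):

    import socket

    srv = socket.create_server(("127.0.0.1", 8080))
    while True:
        conn, _ = srv.accept()
        conn.recv(4096)  # read (and ignore) the request
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
        conn.close()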
I would say this is a feature and not a bug. Alpine Linux is largely designed to be run in containerized environments, so you can have an extremely small footprint because you don't have to ship stuff like a desktop or really anything beyond the very, very basics.
Compare that to Ubuntu which for the 5GB download is the "Desktop" variant that comes with much more software
Bundling stuff in .NET is done much more at "runtime", often both by design of the library (it uses introspection¹) and by the tools².
1: Simplified argument - one can use introspection and not expect all of the library to be there, but it's trickier.
2: Even when generating a self-contained EXE, the standard toolchain performs no end-linking of the program; it just bundles everything up in one file.
Why not? This seems pretty arbitrary. Seemingly, developer time or functionality would suffer to achieve this goal. To what end?
Who cares how many floppies GRUB would require when it's actually running on a 2TB SSD? The actually simpler thing is, instead of duplicating effort, to boot into Linux and use Linux to show the boot menu, then kexec into the actual kernel or set it to boot next. See ZFSBootMenu and "no more boot loader": this is simpler and less bloated, but it doesn't use less space.
Whereas very small Linux distros are useful in several areas, like containers and limited hardware, running them on the desktop is an objectively worse experience and is more a minimalism fetish than a useful strategy.
I call BS. A small single-board computer I have came with 8 GB of RAM. Not especially big or small. 500 MB would fit into this comfortably, leaving ~7.5 GB for apps. Load everything into RAM once, run from there. RAM bandwidth is ~8.5 GB/s.
30 GB wouldn't fit. So: swap everything in and out using a (cheapish) SSD over a x1 PCIe lane. Or (more commonly) from an SD card / eMMC module. Think ~100 MB/s on a good day. And that's with apps competing for the memory crumbs left over.
That's an ~85x difference: two orders of magnitude. Yes, users would notice.
Sure, a developer with a fully decked-out system doesn't see this. Or even understand it. But:
Size matters.
Note: smartphones, tablets etc are not unlike that SBC. And flash storage tends to be on the low-end side where speed is concerned. Desktop you say? Nope, smartphones & tablets are where it's at these days.
Libreoffice on my system spends 99.9% of the time consuming only 650MB of storage. Opening an office doc makes it require about 165MB of RAM. The consequence of it being swapped out at some point is that it takes slightly longer to get started the next time on the order of an additional 0.6 seconds.
If you watched me and the computer whilst I completed a 15 minute task with office you would note that the computer spent most of its time waiting on me rather than the other way around.
It would start 0.6 seconds faster but it wouldn't get done meaningfully faster. It would be 6 100ths of 1% faster rather than being "two orders of magnitude faster"
Worse, if I really want a faster LibreOffice I can just start it at boot and thereafter create new Writer windows in milliseconds; I wouldn't be obliged to run my entire OS from RAM to achieve this goal.
Virtually nobody runs standard desktop Linux on smartphones or tablets. Distros that target desktops and laptops should not reduce their fitness where they are actually used in order to be better suited for environments in which they are not.
The fewer 3rd parties you involve in your product, the more likely you are to get a comprehensive resolution to whatever vulnerability as soon as a response is mounted. If it takes 40+ vendors to get pixels to your customers' eyeballs, the chances of a comprehensive resolution rocket toward zero.
If every component is essential, does it matter that we have diversified the vendor base? Break one thing and nothing works. There is no gradient or portfolio of options. It is crystalline in every instance I've ever encountered.
I am at a big tech company and have seen some wildly insecure code make it into the codebase. I will forever maintain that we should consider checking if candidates actually understand software engineering rather than spending 4 or 5 hours seeing if they can solve brainteasers.
A 2024 plea for lean software - https://news.ycombinator.com/item?id=39315585 - Feb 2024 (240 comments)
Back when I had slow ADSL (like 2 Mbps) I couldn't use Docker at all at home because the repository server had low timeouts. I was downloading 20GB games with Steam not to mention Freebase data dumps and other things that large because I had reliable tools to do the downloads, which Docker didn't use so downloading 5GB of images was not "wait for it" but rather "you can't do it."
By accelerating the rate at which you can attach random dependencies you can run into problems because you are using 6 different versions of libc for Christ's sake. Rather than getting Python from some reputable source like conda or deadsnakes, Docker gives data scientists superpowers to get Pythons with random strange build options and character encodings. A 20 megabyte patch requires 2 GB of disk IO once it goes through the Docker IO multiplier. A 5 minute build becomes a 20 minutes build. Docker is fast from the viewpoint of "ops" but is slow from the viewpoint of "dev"; where people use Docker they are always taking forever to do the simplest things and facing extreme burnout.
There are some places where people really want to run 8 versions of Java and 3 versions of PHP, and think it's going to make them productive that they can write 15 microservices in 15 different languages... It's a delusion. If you get the purposeless variation in your system under control, you are in control, and you have a huge competitive advantage over 10x larger teams who use tools that let them barrel on without being in control.
https://thethreevirtues.com/
You do not say: "there are two tasks: add some feature, takes 1 day, and delete some cruft, takes 1 day."
You say: "Yes, that feature. That's one task. It will take 2 days."
As per Tame Impala's Elephant:
He pulled the mirrors off his Cadillac
Because he doesn't like it looking like he looks back
Looking back gives the impression of missteps or regret. We have no such thing!
And because it is based on nothing, you can just lie about it
(1) https://googleprojectzero.blogspot.com/2021/12/a-deep-dive-i...
If they instead had filtered/disabled previews the security problems would still exist - and potentially have less visibility.
Every software component follows the same pattern. Software, thus made from these components, ends up being intractably complex. Nobody knows what a thing is, nor how things work.
This is where we are right now, before we add AI. Add AI and "vibe coding" to the mix, and we're in for a treat. But don't worry - there'll be another tool that'll make this problem, too, easy!
I'm hereby coining the term 'cognitive sovereignty'.
It’s like drugs: if a doctor prescribes, it’s probably ok. If you have an addiction, then you’re in for a lifetime of trouble.
The answer to your questions is already in my reply.
You're buying them with the risk that they could become a threat in the future. At some point it's not worth it anymore.
And of course, if you're doing just recreational coding to learn something, or if what you need differs from what is available, or the available thing seems sketchy somehow, then you'd write it yourself (if it's feasible). But for most things where what you need is clear and unambiguous, I don't see why you'd invent it yourself. For an established library it's unlikely that you'd do any better anyway.
(And again, if it's recreational what you are doing, you want to learn and have a hobby, of course, do it yourself. But in that case, you aren't actually looking for dependencies anyway - your goal is elsewhere.)
> So with infinite resources it would be best to write everything from scratch?
Re-read the parent and the other replies: a critical point you are missing is your interlocutor's practical mindset, in contrast to your idealistic one. This is about making engineering-mindset tradeoffs; they vary depending on the specific scenario. The answer to your reductio ad absurdum is yes, but I believe that sidetracks rather than elucidates.
EDIT: Can't post any deeper, but the child can see that no such "statements" have been made.
There's just too much invested in the building of software to dismantle current arrangements or change methodologies quickly, it would take years to do so. Commercial interests depend on bloat for income, so do programmers and support industries.
For example, take Microsoft Windows: these days it's so huge it will not even fit onto a DVD, which is pretty outrageous really. I recall Windows expert Mark Russinovich saying that the core/essential components of Windows only take up about 50MB.
But there's no incentive for Microsoft to make Windows smaller and thus have a smaller footprint for hackers to attack. Why? As that bloatware makes Microsoft money!
Rather than dispense with all that bloatware, Microsoft has built a huge security edifice around it: there are never-ending security updates, secure Windows boot/UEFI; it's even had to resort to a hardware security processor, Pluton. And much of this infrastructure is nothing but a damn nuisance and inconvenience for end users/consumers.
Microsoft doesn't just stop there, it then makes matters worse by unnecessarily changing the Windows GUI with every new version. Moreover, it's not alone, every Linux distribution is different. What this means is that there's less time to perfect code as its features keep changing.
Now take the huge numbers of programming languages out there. There are so many that many programmers have to learn multiple languages thus cannot become truly proficient in all of them. That lack of expertise alone is problematic. Surely it would be better to concentrate on fewer languages and make those more adaptable. But we know that's not going to happen for all the usual reasons.
Same goes for Web browsers and Web bloat. Every time I complain on HN about browser bloat, the abuse of JS by websites and the never-ending number of Web protocols that keep appearing, I'm voted down. That's understandable of course because programmers and others have a financial vested interest in them. Also, programmers have taken much time to learn all this tech and don't want to see their efforts wasted by its obsolescence.
And I've not yet mentioned the huge and unnecessary proliferation of video and sound codecs, image and audio formats, not to mention the many document formats. Programs that use all these formats are thus bigger, more bloated, and more prone to bugs and security vulnerabilities. In a more organized world only a fraction of that number would be necessary. Again, we know it's not just technological improvements that have brought such numbers into existence but also commercial and vested interests. Simply, there's money in introducing this tech even if it's only slightly different from the existing stuff.
I've hardly touched this subject and said almost nothing about the economic structure of the industry, but even at first glance it's obvious we can't fix any of this in the near future, except perhaps by tiny incremental steps which will hardly make much impact.
A distribution is just a collection of software to handle common needs. Most are quite similar: systemd, coreutils, glibc, dbus, polkit, pipewire/pulseaudio, and a DE, typically GNOME or KDE. You'll expect to see them on Debian, Ubuntu, Fedora, Nix, Arch, or anywhere else except Void, Alpine, and Gentoo. The only meaningful difference is typically the package manager. We have more standardization in the Linux ecosystem than ever and equally as much bloat, both thanks to systemd.
> Surely it would be better to concentrate on fewer languages and make those more adaptable.
Programming languages are a combination of tools and notation. Different domains have different needs and preferences. We don't lament quantum physicists using bra-ket notation instead of standard linear algebra notation. Unlike notation, though, there are material reasons to use one language over another beyond clarity. Some languages support deeper static analysis, some prove complete theorems about your specification, some are small enough to embed, some are easier to extend, and some exist only within a narrow domain like constraint satisfaction. We can add macros or introspection to a language, but in doing so it will fall outside a domain that might value predictability or performance.
> Now take the huge numbers of programming languages out there
I took data from the 2024 Stack Overflow survey, filtered for professional developers. The median release year for languages above 25% market share is 1994. The youngest serious language on the list is Swift, dated 2014. I don't think this is evidence of a growing number of programming languages. See converted data below. The release year was augmented by o4-mini.
I don't know why I mentioned Linux here because every time I do in such comments it distracts from the main issue, we Linux users have very fixed and firm opinions about such matters.
Despite your comments, which I essentially agree with (at least in principle), I cannot see rhyme nor reason why there are so very many Linux distros. Yes, there'd be good reason if they were one-offs for a specific application, say in embedded systems etc., but to have so many widespread and in the public domain makes little sense to me. It not only causes confusion amongst users, especially novices, but also spreads human effort widely that would be otherwise better spent on developing fewer systems—it's the more hands make light work philosophy. For the same reason it's why Linux has been so slow to take hold on the desktop. Yes, the usual hardcore Linux user who knows Linux well says 'who cares, that's the least of our worries'. For some odd reason they don't care that the Linux ecosystem would be better off with a more cohesive and unified approach to development.
Even with that said, Linux is forever changing, new kernels come out so frequently that it's hard to find two Linux distributions with simultaneously the same kernel code. No matter how one views it, that load puts a constant strain on bug finding, security testing, etc. Frankly it's a mess, if for no other reason that so many versions are a nightmare for administrators. All these updates cause lots of extra work for all those who don't work in tightly controlled environments that have rigid/strict update procedures. …And that's many of them.
Leaving me out of the argument for a moment, I'd reckon many of the Linux fraternity would object to you lumping Arch with, say, Debian in the one sentence, although they'd likely agree with you over Gentoo etc. That said, why then can't Linux have a single package manager? It's a damn nuisance that it's not so. As usual, not enough people can agree to reach a unified consensus (and they disagree for very questionable reasons). And it's why in many instances we've had to resort to messy, kludged solutions such as Flatpak. I've more, but I'll stop there.
"Programming languages are a combination of tools and notation. Different domains have different needs and preferences."
Why? Yes, I've seen many reasons but I've never seen it justified with solid argument. Most of those reasons arise out of historical happenstance, and or favouritism, or that 'we've always done it that way' syndrome. As I said, programmers have an investment in learning and they don't want to see it made obsolete. Whilst that makes sense to them, it doesn't go any way towards solving the chronic software problems as outlined in the IEEE story.
Let's look at the number-of-programming-languages problem a little further. A quick search finds this quote on the CLRN—California Learning Resource Network website:
"According to The International Organization for Standardization (ISO), there are approximately 14,000 programming languages out there. However, this number is often disputed, and different sources may provide varying estimates. For instance, Wikipedia lists over 23,000 programming languages, while Rosetta Code, a website that aims to document programming languages, claims to have data on over 6,000 languages."
That makes me shudder.
OK, lets whittle that down to something more reasonable. Some references claim the number of well-known languages is upward of 700, with between 200 and 400 being those most commonly used. Others say the most frequently used languages number upward of 50. How correct that is and how much of those numbers can be put down to programmers' favorites I cannot say (I only know a few, Lisp, Fortran, C and a few others, so I'm not qualified to speak for those others). I would suggest however that a rational approach would reduce that number down to many fewer than we have now.
To test that hypothesis one could begin with a mathematical analysis of each language. Perhaps the formal mathematical logic à la Whitehead and Russell's Principia Mathematica would be a good place to start as not only the mathematical structures of a language could be tested for coherence and correctness but also so could its grammatical syntax. Possibly there are even better ways of going about such an analysis but I've not given them much thought. Little doubt, AI will rationalize all this in the near future irrespective programmers' wishes.
Suffice to say, until those analyses are done I remain unconvinced that all those (at least common) languages are needed. Preference and favoritism may drive the current status quo but it's not a logical way to proceed and to properly tackle the problems outlined in that story.
If you base your solution on top of something else, like writing a DSL in Lisp instead of starting from scratch, it will still become a new language as it diverges, like Coalton. Otherwise we'd say Perl isn't a language, because its interpreter is written in C.
If these tools weren't needed, nobody but their author would use them. There's a genetic hill climb happening in every sphere of life, from film and poetry to science and programming, where every new thing either sets a new threshold of goodness, or is forgotten when it fails to. Sooner or later things stabilize, when the new solution is not better enough to outweigh the old one: we used to see many version control systems, but only Git remained, because for all its flaws, something like Jujutsu or Pijul wasn't as much better than Git as Git was than SVN. We measure how close we are to convergence not by the size of the population, but by the average age of the used solutions. By that metric, software is cooling.
There is no problem to solve: nobody is writing enterprise software in a bespoke SKI-combinator derived language they found on Rosetta Code, nor paralyzed by choice between 275 Linux distros to put on their server. The duplication of effort is a cost offset by dysfunctional application and maintenance of solutions designed by committee. Simplicity does not precede complexity, but follows it.
You are right, there is no central authority telling people what to do and what software to write. And you are correct "you can't force people to not reinvent the wheel…".
What can be done, however, is to mandate specified software that's gone through rigorous testing in certain businesses, government, utilities, the military, critical engineering—aircraft, nuclear, and so on. There's already been a bit of this with Ada and the military, but it's minuscule compared with what I am advocating.
Think of it this way: no matter what country one is in all electrical outlets are the same and comply with strict electrical standards for that country. That's not to say there is only one standard worldwide but there are far fewer than if it were a free-for-all as it is in the software industry.
You don't stop people from doing anything, reinventing the wheel or whatever—instead you make it unlawful to supply software to those vital entities if it does not comply with those specified standards (as set by the ISO, etc.). Outside that realm programmers can do what they want, but if they want to play with the big end of town then they'll have to play strictly by the rules.
We'll get to this stage eventually, but it's taking undue time.
As you've said, "Sooner or later things stabilize, when the new solution is not better enough to outweigh the old one…", but the software industry as a whole is nowhere near that stage of development. Individual programs may have reached that stage, but in a global sense the software industry is still decades behind the professional standards of other well-established professions (don't take my word for it, just consult the literature).
Right, that sounds authoritarian and something a dictatorship would do. But not so fast: those electrical standards to which I referred were only mandated by governments after the free-for-all chaos of the early electrical era, where industry could not or would not adopt common standards. The same applies to other disciplines: electrical engineering has any number of rigorous standards in addition to the example I've already given, as do civil and chemical engineering, transport, shipping, weights and measures, and almost all of them are tied to national and international standards. Moreover, a large subset is mandated by law for reasons of compatibility/interoperability (shipping containers, etc.), and/or for health and safety reasons, or for economic reasons: to minimize costs, to stop people being cheated, etc.
These standards and concomitant laws and regulations are a fact of life worldwide, and in many instances penalties apply for violating them. About the only exception is the software industry; it's no longer young and should have matured by now, but it still operates like the Wild West where anything goes.
I say that as someone who has sat on standards committees and been involved in writing standards. Moreover, in my profession if I were to act in the undisciplined manner of much of the software industry, I'd be struck off.
Right, those are harsh words indeed—but they are only harsh for an industry that has never had to comply with rigorous rules and regulations that have been set by law. Whilst other disciplines have learned to accept them long ago the software industry still does what it damn-well wants, and it's done so with impunity from its outset. That has to change.
So you think I'm a self-opinionated crank. OK, let me bring you back to this HN story and think again. Software programmers and developers like to call their work software engineering and themselves software engineers but I'd suggest many in other engineering professions just laugh at the notion. If you don't hear them shouting it out loud it's because they're being polite.
What we in other engineering professions laugh about isn't the skill sets of programmers and developers, we accept there are many very skilled people who work in the industry. The real issue is the laissez faire free-for-all attitude of the industry—an undisciplined industry not bound by strict procedures and lawful regulations. Without regulations and clearly defined rules and procedures we end up with inconsistent results, bugs and lots of mess.
I'd suggest you read this story again and then read the document in the link below; it was written nearly 31 years ago and covers the issues I've addressed. It's a SciAm article titled "Software's Chronic Crisis". One of its key postulates is that software development doesn't have the disciplined lineage of, say, chemical engineering, and that programmers are more akin to artists than engineers because they operate without industry-standard strictures and procedures (such as those set by law).
What's so poignant about that article nowadays is that precious little has changed in the software industry in respect to those matters it refers to. Now ask yourself why is that so given that there has been much development in other areas of software development.
Little doubt the above comment is correct. Look at the way Niklaus Wirth's Pascal lacks widespread support amongst programmers whereas languages such as C are very popular: programmers don't feel constrained to the extent that Pascal constrains them; they feel hemmed in by it. Pascal essentially works like other professions—you must define what you want up front (the concept, say a bridge), then draw up the plans and revise them before anyone starts building. After it's built, few if any changes can be made. That's the cultural difference between software development and other engineering professions. It's a fundamental one.
https://www.researchgate.net/publication/247573088_Software'... (best copy—PDF)
https://www.cse.psu.edu/~gxt29/bug/localCopies/SoftwareCrisi...
Bloated and crafted software - a rant:
The state of software is the state of the structures of primitive societies: some in cave shelters, some under roofs of sticks and leaves, some in mud huts.
We talk of Cathedral and Bazaar, but there are very few carefully designed Cathedrals of software, and those probably have plenty of barely hidden flaws.
The Bazaars are all around us, jammed together, spreading for miles and miles, tent walls and roofs billowing in the breeze, all awaiting a strong zephyr to carry many of them away, and leave most of the rest in ruins.
What software needs is building blocks. Bricks of uniform size, easily joined together. Concrete masonry units. Tilt-up walls. Trans-oceanic shipping containers (connex, seabox).
Solid, composable, engineered, units. We should be able to pull a well-known and heavily tested package or function to use, just like a contractor would call for a delivery of 200 8x8x16" CMU blocks, and be able to expect they will get just that, with no gaps, weak spots, or broken webs.
But, no, all of us software crafts-folk want to carefully create our very own artisanal version of whatever library functions are needed for the project at hand. In a world that could be made of solid concrete blocks, we are crafting our very own adobes, with our own special blend of straw and mud, and we think we have advanced far beyond the folks living in mud huts.
Some of us will say they are master masons, crafting cathedrals out of hand cut stones, each carefully measured and chiseled, and each stone unique. We're still duplicating effort when we could be using commercial off-shelf libraries. And all the while the project deadlines go zipping past as we try to craft our way to local perfection.
All I can suggest as a solution is a multi-government and multi-corporate effort to design and build fairly universal functions, libraries, and packages that are robust, exhaustively reviewed by humans, and tested thoroughly. I won't ask for provable correctness, yet. :-)
Would the result be an Ada on steroids? Depends on who is involved.
Choice of language should not matter. The APIs would matter, a lot. A few competing teams would be a possibility. Passing several existing functions to an AI, with a "do like these, only perfectly" might be useful, or useless.
And yes, then we would have 15 competing standards. https://xkcd.com/927/ (Well, we probably already have at least 1,500, so, go figure.)
The Erlang ecosystem has many useful abstractions at just about the right level that allow you to build the simplest possible custom solution instead of reaching for a 3rd party solution for a component of your (distributed) system.
Building just the right wheel for the task at hand does not mean you have to reinvent it first.
The services usually persisted except for automatic updates so I only had to restart all the services a few times per week so it didn't make sense to invest time to automate.
I'm sure the launch can be fully automated but it's kind of at the edge of not worth automating because of how relatively infrequently I need to restart everything... Also the CEO doesn't like to make time for work which doesn't yield visible features for end users.
I actually handed in my resignation a month ago, without another job lined up. It became too much, haha. Good practice though. Very stressful/annoying.
To me, that was the strangest idea - how could you "decouple" one service from another if it needs to know what to call, and what information to pass and in what format? Distributing the computing - for performance, or redundancy or security or organizational reasons - that I can understand - but "weak coupling" just never made sense to me.
The real reason for tight coupling is simply complex interfaces. That means a range of things; complex function signatures which rely on highly specific parameters (e.g. live instances instead of raw primitive values or raw data) or return complex values instead of raw information "here's what I did". It can also mean complex API parameters and response payloads. Ideally, complex processing should be hidden behind simple interfaces which don't encourage micromanaging the module/service. If the interface is as complex as the processing behind it, that's a design failure and will lead to tight coupling.
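A toy sketch of that contrast (all names made up): the first signature forces the caller to know and wire up the module's internals, the second takes raw data and returns raw information.

    # invites tight coupling: the caller must construct and wire four live objects
    class ComplexResizer:
        def resize(self, session, cache, decoder_registry, image_handle):
            ...

    # easier to keep loosely coupled: raw data in, raw result out
    def resize_image(image_bytes: bytes, width: int, height: int) -> bytes:
        """Return the resized image; nothing else to wire up."""
        ...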
Separating code into modules and services may be intended as a way to encourage developers to think about separation of concerns so that they may end up designing simpler interfaces but it doesn't seem to help certain people. Some see it as an opportunity to add even more complexity.
Firing up the whole mess and debugging one or two of them locally is always a major pain, and god help you if you have no idea which services to stub and which to debug.
Sometimes you need them all from source to debug across the stack, when you don't you might need a local container to avoid pollution from a test env, sometimes it is just fine to port-forward to a test env and save yourself the local resources.
If you take care of the developer, the project looks after itself.
Having burned-out employees is a cost for any business. I do not have concrete data to back this up, but from personal experience I can attest to it. I had to take sick leave and lose days of productivity due to illness caused by burnout from having to deal with bloated software and the deadlines associated with it. Business makes promises to clients without realising how difficult and time-consuming adding features and keeping software operational and secure can be when it is so bloated and difficult to understand.
I did not have the deadlines, but to bear having to deal with bloated software, my solution was vodka: since it has no color, I filled mineral water bottles with it and everyone thought I was drinking water.
From a user/fanboy/paranoid point of view, I don't like systemd. I've seen good development arguments for it, e.g. its improved handling of USB device drivers. Still, sometimes I have to reboot because my system is frozen, and it's more complex to use than, say, runit. Lastly, I'm nervous that if a company took it over, it's the one piece that might help destroy most distros. Please no hate; this is only my personal point of view, as an amateur, and there are people on both sides with a much better understanding of this.
Seems to favor the microkernel? I've been hoping we one day get a daily-driver microkernel distro. I asked about this but didn't get a lot of answers, except for those that mentioned projects that aren't there yet; e.g. I would love to try Redox, but from my understanding, after 10 years it's still not there yet.
It also brings me to a point that has confused me for years. As an amateur, how do I decide what is the right level of virtualization for a given task, from program images like AppImage/Flatpak, to containers, to VMs? So far, I've hated Snaps/Flatpaks because they make a mess of other basic admin commands, and because there seems to be missing functionality and/or configuration. It may be better now; I haven't tried in a while. Personally, I've enjoyed Portage-based systems in the past, and they are so fast now (to compile). A lot of forums forget that there are home enthusiasts and basically talk about it from an enterprise perspective. Is there a good article or book that might explain when to choose what? Much of what I've read is just "how to" or "how it works". I guess I would prefer someone who acknowledges we need something for the hardware to run on and explains when it makes more sense to use a regular install vs an image (AppImage/Flatpak/Snap).
Anyway, thanks so much for the article. I do believe you are right: a lot of companies just put out fires because none wants to invest in the future. I mean, even the CEO is usually only there a few years, historically speaking, so why would they care? Also, I think H-1B is a security risk in and of itself because, at least in TX, most IT is Indian H-1B. I mean, they want a better life, and don't have as many family ties here. If they were to "fall into" a large sum... they could live like kings in India, or elsewhere.
A get off my lawn section :)
I remember when GUIs started becoming a thing, I dreaded the move from Text to GUIs due to complexity. I also remember most programs I wrote when I started on minis were 64k code and 64k text. They were rather powerful even by today's standards, they did one thing and people had to learn which one to use to perform a task.
Now we have all in one where in some cases you need to page through endless menus or buttons to find an obscure function. In some cases you just give up looking and move on. Progress I guess.