Ask HN: Best codebases to study to learn software design?
70 pixelworm 59 8/24/2025, 5:08:19 AM
I’m working on improving my software design skills, and it was recommended that I study existing well designed codebases. What are some publicly accessible codebases you would consider gold standards for software design?
* "well designed": What were the objectives and ideas...
* "codebases": How well that was implemented
There are a lot of lofty claims about how this or that is "fast, secure, etc." that don't hold up in the actual implementation.
But most of the time, that can already be seen in the "design claims"! Good design is not just adjectives and nice-sounding goals, but concrete considerations: what the trade-offs were, the false starts, and the reasons behind the decisions.
You can see some examples by reading about the design of Erlang, early Pascal, most RDBMSs, etc.
So your first mid/long-term goal is to learn to recognize what good design actually looks like.
Then judging codebases becomes kind of easier: does the code actually follow the design?
A good example is the 'std' library of Rust. It makes a lot of lofty claims about security and the like that could raise alarms, but when you dive into its code you see that there is A LOT of care about it, with lots of doc comments discussing this stuff, and the code matches.
P.S.: The "std" or equivalent of a language is one of the most important codebases to learn and study, and the MAJOR way to judge how good the language truly is.
But sometimes I think I'm just not there yet. If I become able to read code like a book and really understand what happens, which often I don't, then perhaps I'll enjoy the process more.
This shows how immature the field of software engineering is. Imagine bridges or houses were built like that. Or your surgeon was trained like that.
Over time, we hopefully develop established norms, but at the moment, things are too much in flux. Put 5 sw engineers in a room, pose a problem and you will get not just 5 different solution proposals, but there will likely be strong disagreements on which approach is a good one.
"I recognize a good solution when I see it" is just not good enough for a serious engineering discipline.
While I don't disagree with you in general, this does feel a bit off.
By that logic you can call the field of music immature, and all of the arts. I think the difference is that it's easy to experiment without high costs.
I genuinely think that if building bridges were cheap and quick, the fastest way to learn would be to try...
I think the field could get better at knowing when costs are low (eg sometimes scalability, cheaper to change a database choice than rebuild a bridge) and where the costs are sometimes very high (eg security).
Bridge building is a lot more conservative when it comes to taking risk in the construction, but that is how we build bridges and lots of bridges collapse because of similar causes:
An average of 128 bridges collapse annually in the United States. More than 17,000 bridges in America are considered "fracture critical" (vulnerable to collapse from a single impact).
If they could afford experimenting and have a few bridges collapse before they get it right with no significant negative consequences IMHO it wouldn’t be the worst way to learn.
Maybe even more so for surgeons, being able to experiment and fail in a risk free environment seems like a good thing.
It's not that software engineering is immature, it's just more dynamic.
We are not the surgeon, we write the surgeon. We write a surgeon to fix a broken leg. Once that is done, we don't have to fix another leg. Now we need to reattach a finger. Once that is solved, maybe replace a kidney.
You cannot repetitively train or have strict rules for that, because every time it's something new. You need to have broad knowledge and experience to be able to fight the next unknown challenge. It's unknown because it's never been done before, or it has been done but your competitor will not reveal the details.
Building bridges or being a surgeon sounds very boring to me, since it's always the same (maybe some minor variants). Building software? Very much not the same.
In reality both things are necessary. The car analogy doesn't hold for road driving because we drive well within the limits, but for racing it really is necessary to know exactly where the limits are. I don't think we should really be treating our profession like a race, though.
But if you don't read, it's going to be an incredibly long, slow process with a lot of car crashes and mangled gearboxes etc. So I say read, read, and read some more. Even if you don't see the point of it right now, your experience will later find a place for it and you won't end up descending a hill for the first time not knowing to shift to a lower gear.
I'd say if you do it for a living, certain tedious chores must be learned. the best programmers i know (professional) can all read code. they spent many junior years learning to read it, being on the code auditing desk.... nowadays idk how the landscape looks, but all of them had to review and read code to find bugs before they were allowed to produce code (they all worked at the same company ofc... so my view is limited!)
i do feel such discipline is needed. they can always poke holes in my code no matter how many holes i plug :) - i am semi professional. i write code for work, but not production code. (experimental). i never learned to audit code and feel that makes it impossible for me to truly create production grade code
Software exists precisely because there is still a messy layer connecting user requirements to actions on a computer. If there were no messiness then we could just automate it all. Approaching software from some sort of Platonic ideal of what software should be will frequently lead to bad decisions on its own.
When you start to see how certain pressures lead to certain paths you learn to recognize the wrong decisions that are often good at the time, and avoid them. At the same time, you need to learn to develop methods that work quickly and effectively. By far the biggest real challenge in real world software is time constraints. This is almost never discussed in theoretical views of software, but the truth is you're always going to be writing code under pressure to ship. You will come across situations where you do not have time to do what you want to do or think is best.
Good software is software that runs and solves the user need, but you will come to realize that there are design solutions that will make successfully running happen more often. The best way to find these is to study the real software you're writing.
What if the question is asked by a college student?
I learned a lot by just stepping through the code of libraries I used with the debugger. That brought more practical insight while learning about design patterns etc. In the end, it is all about patterns. Finding the right pattern for a given problem.
In my case, I'm a junior engineer who has recently been given more responsibility designing aspects of our product. I'm just trying to learn all I can so my designs will be good!
I definitely learn so much from my team's codebase. Most of what I learn is either from the good designs I see in there or from my googling trying to fix the not so good parts.
* https://news.ycombinator.com/item?id=36370684
* https://news.ycombinator.com/item?id=30752540
* https://news.ycombinator.com/item?id=9896369 (Python specific)
I think there was another one with a similar name but I can't think of its name.
0: https://www.spinellis.gr/codereading/, check the TOC https://www.spinellis.gr/codereading/toc.html
https://medium.com/@012parth/what-source-code-is-worth-study...
That being said, taking the requirements into account does eliminate quite a few of the options. Right now, I work on safety-critical embedded systems, which require us to make some decisions that would most likely be way different in other environments.
To learn about design you need a wider perspective. You can theoretically learn it from code but that won’t be the most effective way. Look at great documentation and literature about design instead.
Is there any documentation or literature that has helped you?
https://aosabook.org/en/index.html
Maybe that would be a good start. You can then pick a project to dive in.
As a more specific tip, I did some hacks in Nginx a long time ago and found it quite nice.
EDIT: and to answer your question, if you're working on something that is "like X but different" then read the source code for X. You could also look at source code for software that you use from day to day: software where you already know what it does. For example, if you're in web, maybe the web framework, or web server, if you write python, maybe a core library that you use, or maybe the python interpreter, or if you use vscode ..., if you use android ..., you get the idea. At the start I would suggest smaller programs, and programs where you already know the domain (e.g. cpython might not be the best place to start if you never implemented an interpreter before, you may spend more time learning about interpreters than the design of this one, still a good thing to learn of course.)
https://www.postfix.org/OVERVIEW.html
You might need to know a bit about how email servers work to appreciate it though.
The decisions to exclude complexity, avoid premature abstractions, or reject certain patterns are often just as valuable as the code you can see. But when you're studying a codebase, you're essentially seeing the final edit without the editor's notes - all the architectural reasoning that shaped those choices is invisible.
This is why I've started maintaining Architectural Decision Records (ADRs) in my projects. These document the "why" behind significant technical choices, including the alternatives we considered and rejected. They're like technical blog posts explaining the complex decisions that led to the clean, simple code you see.
ADRs serve as pointers not just for future human maintainers, but also for AI tools when you're using them to help with coding. They provide readable context about architectural constraints and compromises - "we've agreed not to do X because of Y, so please adhere to Z instead." This makes AI assistance much more effective at respecting your design decisions rather than suggesting patterns you've deliberately avoided.
When studying codebases for design patterns, I'd recommend looking for projects that also maintain ADRs, design docs, or similar decision artifacts. The combination of clean code plus the architectural reasoning behind it - especially the restraint decisions - provides a much richer learning experience.
Some projects with good documentation of their design decisions include Rust's RFCs, Python's PEPs, or any project following the ADR pattern. Often the reasoning about what not to build is more instructive than the implementation itself.
The codebases I learned from the most are Git, Postgres, CPython. Not saying they are perfect designs, but they are well maintained, solve hard problems, have seen many years of evolution, and are very easy to get your hands on.
https://www.instagram.com/reel/C2x4Ge5RtNC/
Top 5 codebases for changing my mind about things:
Wietse Venema's Postfix mail server. Taught me tons about security posture, the architecture i'd describe as microservices before microservices was a thing, but contrary to the modern take on microservices (it's mostly a tool for decomposing work across large semi-isolated groups) this was primarily about security and simplicity.
Spring framework - this opened my eyes to ways of working that i hadn't really thought enough about before, the developers on that project have a culture of deeply considering the needs of their users (who are java developers often in an enterprise environment).
Git - the thing i like about the git code base is that once you've covered the objects database (e.g. blobs, trees and commits) and the implementation of refs, everything else just feels like additional incremental features. With those core concepts, everything else is kinda harmoniously built on top.
Varnish by Poul-Henning Kamp is another one - feels like he went to great lengths to make that code base a teaching tool despite the fact it's also a top tier reverse proxy.
Last one isn't a code base - but it will help with software design in the large; studying how the lieutenants model works in the Linux kernel.
Thinking about my answers, i think i've highlighted something subtly different than "well designed codebases" it's more a list of codebases that left a notable long lasting impression on me because of design decisions they made.
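To make the Git point above a bit more concrete, here is a minimal sketch of the core idea: objects are stored under the hash of a small header plus their content, and refs are just names that point at object IDs. The hashing rule (SHA-1 over "<type> <size>\0<content>") is how Git computes blob IDs by default. Everything else here (the `ObjectStore` class, the in-memory dicts) is a hypothetical toy for illustration, not Git's actual on-disk layout or API.

```python
import hashlib
import zlib


def git_object_id(kind: str, payload: bytes) -> str:
    """Compute the SHA-1 ID Git gives an object: hash of '<kind> <size>\\0' + payload."""
    header = f"{kind} {len(payload)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + payload).hexdigest()


class ObjectStore:
    """Toy in-memory content-addressed store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}  # object ID -> compressed header + payload
        self.refs: dict[str, str] = {}        # ref name -> object ID

    def put(self, kind: str, payload: bytes) -> str:
        """Store an object and return its ID. Identical content always maps to the same ID."""
        header = f"{kind} {len(payload)}".encode("ascii") + b"\x00"
        oid = hashlib.sha1(header + payload).hexdigest()
        self._objects[oid] = zlib.compress(header + payload)
        return oid

    def get(self, oid: str) -> tuple[str, bytes]:
        """Return (kind, payload) for a stored object ID."""
        raw = zlib.decompress(self._objects[oid])
        # Split on the first NUL only, since the payload may itself contain NUL bytes.
        header, _, payload = raw.partition(b"\x00")
        kind, _size = header.decode("ascii").split(" ")
        return kind, payload


if __name__ == "__main__":
    store = ObjectStore()
    oid = store.put("blob", b"some file contents\n")
    store.refs["refs/heads/main"] = oid  # a ref is just a name for an object ID
    print(oid, store.get(store.refs["refs/heads/main"]))
```

Real Git layers trees and commits on top of the same mechanism, which is roughly why the rest of the codebase can feel like incremental features once this part is understood.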
you should not only study 'good' code... how will you know what is bad code?
study code that does similar things to what u want (client/server/game/ai/datacrunching etc.) and study a lot of it: different qualities, ages, and sources
Or maybe open source projects are all usually well designed. I haven't looked at many in depth. The only one I've really looked into was Clang, to try to figure out why clang-format ignored my style rules when styling files with certain names. (Turns out there is a list of file names that are automatically considered Objective-C rather than C++)
Yes, all open-source projects are well designed with documentation and to-dos, otherwise open-source contribution becomes difficult.
One thing to keep in mind is that what was well-designed 30, 20 or 10 years ago may not be considered such now. Hardware changes and so do the design decisions involving performance.
For example, if you are looking at C++ networking libraries, learning from ACE or even Asio may not be the best idea - better look at "thread per core, share nothing" Seastar[0].
Another thing is that it may be better to read design docs, not the code. For example, the rationale for the mold linker design[1].
[0] https://docs.seastar.io/master/tutorial.html#asynchronous-pr...
[1] https://github.com/rui314/mold/blob/main/docs/design.md
A code base with excellent design will show you the end state, but not how it got there, and probably not the trade-offs and decisions involved.
Practicing refactoring on subpar code bases and dealing with the consequences of your decisions is a better way to improve.
I read design patterns books when i was younger but in retrospect that was a hindrance more than a help.