I love this! I have done a lot of CS research lately, and some of these I haven’t come across yet.
Let me share some of my favourites not listed here, off the top of my head:
- Ian Piumarta’s “Open, Extensible Object Models” (https://www.piumarta.com/software/id-objmodel/objmodel2.pdf) is about creating the most minimal object-oriented metaobject system that allows the maximum amount of freedom for the programmer. It basically only defines a message send operation, everything else can be changed at runtime. The practical counterpart to the dense “Art of the Metaobject Protocol” book.
- John Ousterhout’s “Scripting: Higher-Level Programming for the 21st Century” (https://web.stanford.edu/~ouster/cgi-bin/papers/scripting.pd...) - not really a paper, but an article from the creator of Tcl about the dichotomy between systems programming languages and scripting languages. Obvious at first sight, the lessons therein have wide ramifications IMO. We always seek the perfect multi-paradigm language that can do anything at high performance with the most productivity, while perhaps it is best to have compiled, fast, clunky systems languages paired with an ergonomic, flexible interpreted frontend. Often all you need is C+Tcl in the same app. A must-read for anyone writing yet another programming language.
- Niklaus Wirth's Project Oberon (https://people.inf.ethz.ch/wirth/ProjectOberon/) is the implementation of an entire computer system, from the high-level UI down to kernel, compiler, and a RISC-like CPU architecture. He wrote the seminal "plea for lean software" and actually walked the walk. A long lost art in the era of dependency hell and towering abstractions from mediocre coders.
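To make the Piumarta item a bit more concrete, here is a loose Python sketch of the idea, not the paper's C implementation. Objects carry a vtable, and the only fixed primitive is `send`; methods can be added or replaced while the program runs. The names `VTable`, `Obj`, `add_method` and `send` are my own for illustration, and as a simplification `lookup` is an ordinary Python method here, whereas in the paper lookup is itself a message that can be overridden.

```python
class VTable:
    """Maps selectors to methods; may delegate to a parent vtable."""

    def __init__(self, parent=None):
        self.methods = {}
        self.parent = parent

    def add_method(self, selector, fn):
        self.methods[selector] = fn

    def lookup(self, selector):
        # Walk the delegation chain until a method is found.
        vt = self
        while vt is not None:
            if selector in vt.methods:
                return vt.methods[selector]
            vt = vt.parent
        raise AttributeError(f"message not understood: {selector}")


class Obj:
    """An object is just a vtable reference plus some slots."""

    def __init__(self, vtable):
        self.vtable = vtable
        self.slots = {}


def send(receiver, selector, *args):
    # The single fixed operation: find the method, then call it.
    method = receiver.vtable.lookup(selector)
    return method(receiver, *args)


animal_vt = VTable()
animal_vt.add_method("speak", lambda self: "...")
dog_vt = VTable(parent=animal_vt)
dog = Obj(dog_vt)

before = send(dog, "speak")                       # inherited from animal_vt
dog_vt.add_method("speak", lambda self: "woof")   # changed at runtime
after = send(dog, "speak")
```

Everything apart from `send` (how lookup works, what a vtable contains) is ordinary data that a program could swap out.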
johnecheck · 3h ago
Hmm, I disagree with Ousterhout's dichotomy and conclusions.
First, my understanding of his points - a language is either a systems language like C or a scripting language like Tcl or Python. Systems languages have "strong" types and are for data structures/algorithms, scripting languages are "typeless" and are for "gluing things together".
The main claim is that scripting languages are more concise and allow for faster development when gluing due to their 'typeless' nature.
> With C++ and Microsoft Foundation Classes (MFC), it requires about 25 lines of code in three procedures. Just setting the font requires several lines of code in MFC:
>
>     CFont *fontPtr = new CFont();
>     fontPtr->CreateFont(16, 0, 0, 0, 700,
>         0, 0, 0, ANSI_CHARSET,
>         OUT_DEFAULT_PRECIS,
>         CLIP_DEFAULT_PRECIS,
>         DEFAULT_QUALITY,
>         DEFAULT_PITCH|FF_DONTCARE,
>         "Times New Roman");
>     buttonPtr->SetFont(fontPtr);
>
> Much of this code is a consequence of the strong typing[...] In Tcl, the essential characteristics of the font (typeface Times, size 16 points) can be used immediately with no declarations or conversions. Furthermore, Tcl allows the button’s behavior to be included directly in the command that creates the button, while C++ and Java require it to be placed in a separately declared method.
End quote.
We've come a long way, and examples have made clear that this dichotomy is a false one. Ousterhout's view was colored by his limited experience, resulting in him misunderstanding what he actually likes about Tcl.
Let's talk about syntax. His exact Tcl code as presented could be parsed and compiled by a language that does static type analysis. It's not, because he's running it in an interpreter that only checks types at runtime. But the point is that whether code is compiled or interpreted is an implementation detail that has very little to do with the syntax. He likes his syntax because his syntax is obviously better than C++, nothing more.
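As a rough illustration of the syntax point, here is a Python sketch in which a Tcl-like one-liner is fully type-annotated, so a static checker such as mypy could verify the call before the program runs. The `Button` class and `button` function are hypothetical stand-ins invented for this example, not Tk or MFC APIs.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Button:
    path: str
    text: str
    font: Tuple[str, int]          # (typeface, size in points)
    command: Callable[[], None]    # behaviour attached at creation time


def button(path: str, *, text: str, font: Tuple[str, int],
           command: Callable[[], None]) -> Button:
    """Hypothetical constructor mirroring the shape of the Tcl command."""
    return Button(path, text, font, command)


# Roughly as terse as the Tcl version, yet every argument has a declared type.
b = button(".b", text="Hello!", font=("Times", 16),
           command=lambda: print("hello"))
```

A checker should reject something like `font=("Times", "16")` without running anything, which suggests the brevity comes from the API and syntax rather than from the absence of types.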
And types: he claims that 'typeless' languages allow for faster development because of fewer restrictions. This is, ofc, nonsense. The amount of restrictions you have is a function of your problem, not your language. If it feels like dynamic typing is letting you develop faster, that's because you're putting off encountering those restrictions til later.
It's better to guarantee we encounter all type errors before the program even runs. But since you can do static type analysis for any language, why don't all languages do that?
Because it's hard. Type theory is complicated. Not all type systems are decidable, so for practical reasons we need to pick one that is. Those that are decidable may require annotations, or complex algorithms and semantic restrictions, as with Hindley-Milner.
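To give a feel for the machinery involved, here is a minimal sketch of type unification, the core step used by Hindley-Milner-style inference. It is only that one step (no let-generalization, no expression language), and the names `TVar`, `TCon`, `resolve`, `unify` and `fn` are my own choices for the example.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class TVar:
    """A type variable such as 'a'."""
    name: str


@dataclass(frozen=True)
class TCon:
    """A type constructor such as Int, or '->' applied to two types."""
    name: str
    args: Tuple["Type", ...] = ()


Type = Union[TVar, TCon]
Subst = Dict[str, Type]


def resolve(t: Type, s: Subst) -> Type:
    """Apply substitution s to t, following chains of bound variables."""
    if isinstance(t, TVar):
        return resolve(s[t.name], s) if t.name in s else t
    return TCon(t.name, tuple(resolve(a, s) for a in t.args))


def occurs(name: str, t: Type) -> bool:
    """True if the variable `name` appears anywhere inside t."""
    if isinstance(t, TVar):
        return t.name == name
    return any(occurs(name, a) for a in t.args)


def unify(a: Type, b: Type, s: Subst) -> Subst:
    """Return a substitution extending s that makes a and b equal."""
    a, b = resolve(a, s), resolve(b, s)
    if a == b:
        return s
    if isinstance(a, TVar):
        if occurs(a.name, b):
            raise TypeError("infinite type")
        return {**s, a.name: b}
    if isinstance(b, TVar):
        return unify(b, a, s)
    if a.name != b.name or len(a.args) != len(b.args):
        raise TypeError(f"cannot unify {a.name} with {b.name}")
    for x, y in zip(a.args, b.args):
        s = unify(x, y, s)
    return s


INT, BOOL = TCon("Int"), TCon("Bool")


def fn(arg: Type, ret: Type) -> Type:
    return TCon("->", (arg, ret))


# Unifying (a -> Int) with (Bool -> b) binds a to Bool and b to Int.
solution = unify(fn(TVar("a"), INT), fn(BOOL, TVar("b")), {})
```

Even this toy needs an occurs check to stay terminating, which hints at why a language designer in a hurry might defer all checking to runtime.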
As a PL designer, it's a whole lot easier to just give up and only check types at runtime. And that makes a whole lot of sense if your priority is just embedding a usable language ASAP. But let's not pretend that it's because it's actually better.
sph · 2h ago
> It's better to guarantee we encounter all type errors before the program even runs.
This is only valid if you either are writing mission-critical software, or have infinite time.
Your argument doesn’t consider the case that you have a deadline and need to ship, so optimising for productivity, rather than asymptotic typing perfection, is paramount. There is a reason even performance-critical environments, where scripting languages are not very well suited, somehow still lean on them for productivity.
Case in point, game dev (Unreal Engine with its blueprint system, Godot with GDScript, the myriad of game engines in C++ paired with Lua). Of course in a vacuum game devs would like to write ideal data structures with strong typing so that the game doesn’t crash, but their goal is to ship a game within the decade, so they limit the strong typing to the performance-critical engine and can focus on building and iterating on the core product without consulting type theory books.
The point of Mr. Ousterhout’s argument is that there are only two choices: either we invent the panacea, the mythical productive strong-typed language that gives 100% safety yet enables experimentation, optional typing and compiles to C++ speed native code, or we accept that this is an impossible pipe dream, and we need to use the correct tool for the problem. Again, obvious on the surface, but still a contentious point to this day.
RetroTechie · 1h ago
>The point of Mr. Ousterhout’s argument is that there are only two choices:
Ultimately, it all compiles down to assembly (which is executed), or an interpreter (which is executed). Assembly all the way. The real choice here is: turn into assembly ahead of time (programmer's code becomes assembly), or do so at runtime (assembly interprets programmer's code). Or some in-between like JIT compilation. And: what guardrails you put in place.
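A small Python sketch of that choice, as an analogy only: Python closures stand in for "assembly" here, and the tuple-based expression format is made up for the example. `interpret` walks the program text every time it runs, while `compile_expr` translates it once, ahead of time, into something directly callable.

```python
# Expressions are nested tuples, e.g. 2 + x * 3:
EXPR = ("add", ("num", 2), ("mul", ("var", "x"), ("num", 3)))


def interpret(expr, env):
    """Runtime approach: inspect the expression tree on every evaluation."""
    kind = expr[0]
    if kind == "num":
        return expr[1]
    if kind == "var":
        return env[expr[1]]
    left, right = interpret(expr[1], env), interpret(expr[2], env)
    return left + right if kind == "add" else left * right


def compile_expr(expr):
    """Ahead-of-time approach: inspect the tree once, return a callable."""
    kind = expr[0]
    if kind == "num":
        value = expr[1]
        return lambda env: value
    if kind == "var":
        name = expr[1]
        return lambda env: env[name]
    left, right = compile_expr(expr[1]), compile_expr(expr[2])
    if kind == "add":
        return lambda env: left(env) + right(env)
    return lambda env: left(env) * right(env)


compiled = compile_expr(EXPR)          # translation happens here, once
interpreted_result = interpret(EXPR, {"x": 4})
compiled_result = compiled({"x": 4})
```

Both paths compute the same value; what differs is when the structure of the program gets examined, which is also where guardrails such as error reporting can be placed.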
So you could say that higher-level languages are just a way to make tedious/nasty assembly coding more palatable. Eg. by producing error reports vs. just crashing a machine. Or provide a sandbox. Or provide a model of computation that fits well with programmer's thinking.
Between those, you simply need some useful abstractions. Preferably ones that are simple yet versatile/universal.
Eg. a (bitmap) "screen" could just be a flat memory space (framebuffer). Or a 2-D array of pixels, potentially with "pixel" defined elsewhere.
Programmer doesn't care how GPU is coerced into displaying those pixels, low-level code doesn't care what higher-level software does to produce them.
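For instance, the two views of a screen can coexist over the same bytes. This is a minimal sketch assuming one byte per pixel in row-major order; `WIDTH`, `HEIGHT`, `set_pixel` and `as_rows` are illustrative names, not any real graphics API.

```python
WIDTH, HEIGHT = 4, 3

# Low-level view: a flat block of memory, one byte per pixel.
framebuffer = bytearray(WIDTH * HEIGHT)


def set_pixel(x: int, y: int, value: int) -> None:
    """Higher-level view: address pixels by (x, y) over the same bytes."""
    framebuffer[y * WIDTH + x] = value


def as_rows() -> list:
    """Present the flat buffer as a 2-D array of pixel values."""
    return [list(framebuffer[r * WIDTH:(r + 1) * WIDTH]) for r in range(HEIGHT)]


set_pixel(1, 2, 255)
rows = as_rows()
```

Code above `set_pixel` never needs to know the layout, and whatever consumes `framebuffer` never needs to know who produced the pixels.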
And then have as few of those abstractions as possible (but no less!). Tcl: "everything is a string". Lisp: "everything is a list". Unix(-like): "everything is a file". Etc.
Personally, I have a soft spot for languages that punch above their weight for expressiveness / few but useful abstractions vs. their implementation size. Things like Forth, Tcl or Lua come to mind.
But hey that's just me. Developers & organisations they're in make their own choices.
stonemetal12 · 50m ago
Languages introduce restrictions of their own all the time. In Java I can't have a function that works on any type of number, I have to specify the type. In Haskell I can. In Haskell I can't put random stuff in a list, in Java I can as long as they are all objects.
Those language restrictions slow down your development as you have to design and code around them. People see the extra design/code as a good trade-off because it improves safety, but it slows you down if you are just messing around.
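Python with type hints can express both sides of this, which makes the trade-off easy to see in one place. This is a sketch of the general idea rather than a translation of the Java or Haskell behaviour; `Num`, `double` and `mixed` are names invented for the example.

```python
from typing import List, TypeVar

# One function over several numeric types, via a constrained type variable.
Num = TypeVar("Num", int, float)


def double(x: Num) -> Num:
    return x * 2


# A list of "random stuff": allowed, but the element type collapses to object,
# so a checker will make you narrow each element before using it.
mixed: List[object] = [1, "two", 3.0]
```

In each case the language decides which of the two is convenient and which one you have to design around.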
MadcapJake · 1h ago
What I've never understood about this argument is this:
How often are you passing around data that you don't fully understand?
Also, people use types and then end up reaching for reflection to perform pattern matching on types at which point you've just moved the typing from the user level to a type system. Not much gained imo.
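The pattern being described looks roughly like this in Python (a sketch; `describe` is a made-up function). The declared type is a broad union, and the real decision is made at runtime by inspecting the value's type.

```python
from typing import Union


def describe(value: Union[int, str, list]) -> str:
    # The signature says little; the branching below does the actual "typing".
    if isinstance(value, int):
        return f"int:{value}"
    if isinstance(value, str):
        return f"str:{value}"
    return f"list of {len(value)}"
```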
coldtea · 9m ago
>How often are you passing around data that you don't fully understand?
Like all the time. As a program grows, the full extent of what is passed where explodes.
>Also, people use types and then end up reaching for reflection to perform pattern matching
As a tool at their disposal, with the default being the opposite.
yason · 2h ago
> And types: he claims that 'typeless' languages allow for faster
> development because of fewer restrictions. This is, ofc, nonsense.
> The amount of restrictions you have is a function of your problem,
> not your language.
Unless you're implementing something that's already known, you're effectively carving out your problem when you're in the process of writing code. There are more unknowns than there are knowns. A strongly typed language forces you to focus on satisfying the types before you can do anything, at a point where you don't exactly yet know what you want to do. The right time to cast your data rigidly into types is gradually when your data structures settle in their own place one by one; in other words when you're pretty sure things won't change too easily anymore. This allows writing code that works first and then, depending on your needs, gets next correct and then fast, or next fast and then correct. You use the weakly typed or "typeless" style as a prototyping tool that generates snippets of solid code that you can then typefy and write down for good.
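A sketch of that workflow in Python, with made-up names (`player_proto`, `damage_proto`, `Player`, `damage`): start with a loose dict while the shape is still moving, then pin it down once it has settled.

```python
from dataclasses import dataclass

# Prototype stage: a plain dict, fields can come and go freely.
player_proto = {"name": "ada", "hp": 10}


def damage_proto(player, amount):
    player["hp"] -= amount
    return player


# Settled stage: the same data with a fixed shape a checker can verify.
@dataclass
class Player:
    name: str
    hp: int


def damage(player: Player, amount: int) -> Player:
    return Player(player.name, player.hp - amount)
```

The typed version is barely longer, but writing it first would have meant committing to `name` and `hp` before knowing whether they were the right fields.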
coldtea · 18m ago
>We've come a long way, and examples have made clear that this dichotomy is a false one.
I wouldn't be so sure. Even JavaScript, when it added types in the form of TypeScript, looks more like the convoluted Windows API example and less like the Tcl example.
>Let's talk about syntax. His exact Tcl code as presented could be parsed and compiled by a language that does static type analysis. It's not, because he's running it in an interpreter that only checks types at runtime. But the point is that whether code is compiled or interpreted is an implementation detail that has very little to do with the syntax.
Pedantically true. You can of course compile anything, including BASIC and Tcl, or interpret C++ if you so wished.
But to really take advantage of compilation, languages tend to add types and other annotations or semantic constructs (like Haskell does). And so compiled languages tend to a certain syntax/semantics, and interpreted languages to another.
>And types: he claims that 'typeless' languages allow for faster development because of fewer restrictions. This is, ofc, nonsense. The amount of restrictions you have is a function of your problem, not your language. If it feels like dynamic typing is letting you develop faster, that's because you're putting off encountering those restrictions til later.
So it's not "nonsense", but rather a tradeoff, just like Ousterhout puts it.
For many, "putting off encountering those restrictions til later" is much faster, as that's a large part of exploring the problem space.
See, while "the amount of restrictions you have is a function of your problem", you don't always know your exact problem, its boundaries, and the form you'll use to solve it, until you've experimented and played with early prototype solutions and changed them, etc. While doing so, having to tackle hard restrictions based on assumptions that can and will change or be discarded, slows you down.
>It's better to guarantee we encounter all type errors before the program even runs.
Only if it doesn't entail other issues, like making the exploratory programming we do to arrive at a design for our program slower and more tedious.
7thaccount · 2h ago
But some languages DO have fewer restrictions. There's a lot less I have to worry about with Python than the quagmire of Java. You may say I'm putting something off (and maybe I am), but that is perfectly reasonable in a lot of scripting use cases.
e-topy · 1h ago
Commenting just because you can't favorite comments
tom_ · 1h ago
Click the comment timestamp and use the favourite link to add it to your favourites, accessible from your profile page.
gnubison · 1h ago
Click the time (next to the author) to go to a page for the comment, where you will be able to favorite it.
titzer · 2h ago
It's a shame that Abdulaziz went quiet after moving back to Kuwait. He was our intern on Maxine VM back in 2009. A super nice guy and that paper is a gem!
tekknolagi · 2h ago
I know :( but it looks like he opened a bakery and that business is going well. So in some sense he's living the dream
kierangill · 2h ago
Love this post. Writing on programming languages has changed how I think about _programming_ in general.
I often think about this quote from TAPL. This framing of “safety” changed how I design systems.
> Informally, though, safe languages can be defined as ones that make it impossible to shoot yourself in the foot while programming.
> Refining this intuition a little, we could say that a safe language _is one that protects its own abstractions_.
> Safety refers to the language's ability to guarantee the integrity of these abstractions and of higher-level abstractions introduced by the programmer using the definitional facilities of the language. For example, a language may provide arrays, with access and update operations, as an abstraction of the underlying memory. A programmer using this language then expects that an array can be changed only by using the update operation on it explicitly—and not, for example, by writing past the end of some other data structure.
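A tiny Python illustration of the array example from the quote (my own sketch, not from TAPL): an out-of-range update is refused by the language rather than silently landing in some neighbouring data structure.

```python
a = [0, 0, 0]
b = [9, 9, 9]

try:
    a[3] = 1                      # one past the end of a
    outcome = "wrote past the end"
except IndexError:
    outcome = "rejected"          # the list abstraction protects itself
```

In this sense the language is safe regardless of when types are checked: `b` can only change through an explicit update on `b`.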
I would also highly recommend watching one of Rich Hickey's talks (especially the earlier ones). Watching those certainly changed how I thought about programming in general.
sph · 3h ago
Skip “Simple made easy” because I cannot stand hearing that talk quoted by basically every single conference speaker in the last decade. It’s become its own cliché.
(Joking of course. I much prefer “Hammock driven development” but it’s not very corporate friendly)
cnity · 2h ago
You say that, but if there's one thing I wish more mid-level engineers understood better about good code it's the distinction between simple and easy.
7thaccount · 2h ago
Regarding weird development methods of interest...Aaron Hsu of APL fame writes a lot of code in calligraphy with fountain pens when trying to organize his thoughts. I do something kind of similar, but in print with a crummy bic pen and a flow chart of Python objects (kind of like poor man's UML).
cvoss · 52m ago
I also turn to a fountain pen for the hardest problems. It puts me in a completely different head space. Something about the limited editing ability forces more coherent, linear thought, but also the freedom to seamlessly switch between English, code, math, and diagrams opens up the creativity.
deanebarker · 4h ago
I wish someone would write this for higher-level languages: JavaScript or .NET. I'm sure this person is brilliant, but they're operating at a much lower (higher?) level than most of us.
e-topy · 1h ago
Damn, his other blog posts are stellar as well, nice!
AlphaGeekZulu · 8h ago
For micrograd: is there more documentation available than just the source code in the Github repo?
1_08iu · 8h ago
He (Andrej Karpathy) has got a series on youtube which goes through how he made it!
I like this guy, so nothing against him, but none of these are about PL and all of these are about compilers (except for the one about GC). Which is fine (I like compilers) but they're just not in any way about PL.
In his example, he creates a button in Tcl.
button .b -text Hello! -font {Times 16} -command {puts hello}
He goes on to say:
https://www.cis.upenn.edu/~bcpierce/tapl/
https://www.youtube.com/watch?v=VMj-3S1tku0&list=PLAqhIrjkxb...
https://www.cosmopolitan.com/lifestyle/a7664693/mandela-effe...