I've never understood why people care so much about the linter settings. It's so obviously bikeshedding, just make a choice, run the linter automatically and be done with it. I'm too busy doing actual software engineering to care about where exactly everything goes - I promise after a week you'll just get used to whatever format your team lands on.
kelseyfrog · 4h ago
The tradeoff here is not being able to use a universal set of tooling to interact with source files. Anything but text makes grep, diff, sed, and version control less effective. You end up locked into specialized tools, formats, or IDE extensions, while the Unix philosophy thrives on composability with plain text.
There's a scissor that cuts through the formatting debate: If initial space width was configurable in their editor of choice, would those who prefer tabs have any other arguments?
jsharpe · 4h ago
Exactly. This idea comes up time and time again, but the cost/benefit just doesn't make sense at all. You're adding an unbelievable amount of complex tooling just to avoid running a simple formatter.
The goal of having every developer viewing the code with their own preferences just isn't that important. On every team I've been on, we just use a standard style guide, enforced by formatter, and while not everyone agrees with every rule, it just doesn't matter. You get used to it.
Arguing and obsessing about code formatting is simply useless bikeshedding.
Buttons840 · 6m ago
> Arguing and obsessing about code formatting is simply useless bikeshedding.
Unless it's an accessibility issue, and it is an accessibility issue sometimes.
scubbo · 3h ago
I disagree with almost every choice made by the Go language designers, but `Gofmt's style is no one's favorite, yet gofmt is everyone's favorite` is solid. Pick a not-unreasonable standard, enforce it, and move on to more important things.
spyspy · 3h ago
My only complaint about gofmt is that it’s not even stricter about some things.
duskwuff · 46m ago
Good news: there are tools like https://github.com/mvdan/gofumpt which fork gofmt and enforce stricter rules (while remaining invariant under gofmt).
rbits · 2h ago
Yeah it would probably be a waste of time. It's a nice idea to dream about though. It would be nice to be able to look at some C# code and not have opening curly brackets on a separate line.
accelbred · 3h ago
What if the common intermediate encoding is text, not binary?
Then grep/diff/sed all still work.
If we had a formatting tool that operated solely on the AST, checked-in code could be in a canonical form for a given AST. Editors could then parse the AST and display the source in a formatting of the user's choice, and convert back to canonical form when writing the file to disk.
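A minimal sketch of the "canonical form on disk" idea, using Go's standard `go/format` package as a stand-in for the canonicalizer. The helper name `canonical` and the sample function are made up for illustration. One caveat: gofmt is not a pure function of the AST, since it preserves some of the author's line-break choices, so the two inputs below deliberately differ only in spacing and indentation.

```go
package main

import (
	"fmt"
	"go/format"
)

// canonical is a hypothetical helper: it returns the gofmt-formatted
// text of a Go source file, which stands in for the "canonical form".
func canonical(src string) (string, error) {
	out, err := format.Source([]byte(src))
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// Two layouts of the same function; only spacing and indentation differ.
	loose := "package p\nfunc  add( a,b int )int{\nreturn a+b\n}\n"
	tidy := "package p\n\nfunc add(a, b int) int {\n\treturn a + b\n}\n"

	cl, err := canonical(loose)
	if err != nil {
		panic(err)
	}
	ct, err := canonical(tidy)
	if err != nil {
		panic(err)
	}

	// Both layouts collapse to the same text, so a diff of the stored
	// files would show no change between them.
	fmt.Println(cl == ct)
}
```

An editor following this scheme would run something like `canonical` on save and apply the user's own printer on load; the second half (a per-user printer) is the part that has no off-the-shelf equivalent here.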
sublinear · 3h ago
Nobody wants to have to run their own formatter rules in reverse in their head just to know what to grep for. That defeats the point of formatting at all.
pwdisswordfishz · 19m ago
That's why you grep for a syntactic structure, not undifferentiated text.
Avshalom · 4h ago
The entire OS was built around these source files.
the unix philosophy on the other hand only "thrives" if every other tool is designed around (and contains code to parse) "plain text"
lmm · 1h ago
> The entire OS was built around these source files.
And how did that work out for them?
This seems like one of the many cases where unix won out by being a lowest common denominator. Every platform can handle plain text.
cowsandmilk · 4h ago
How is diff less effective? I see the diff in the formatting I prefer? With sed, I can project the source into a formatting most convenient for what I’m trying to do with sed. And I have no idea what you’re on about version control. It ruins sending patch files that require a line number around, but most places don’t do that any more.
What I would be curious about is tracing from errors back to the source code. Nearly every language I’ve used prints the line number and the offset on the line for the error. How that worked in the DIANA world would be interesting to learn.
sublinear · 3h ago
You'd have to run diff and sed before the formatter which is harder for everyone.
bee_rider · 2h ago
Is it possible to convert from the DIANA IR back to something that looks like source code? Then the result of the backward conversion could be grepped, etc…
charcircuit · 3h ago
In practice how many tools do you really need to handle the custom format? Probably single digits and they could all use a common library to handle the formatting aspect of things.
davetron5000 · 2h ago
There’s also a typography element to formatting source code. The notion that all code formatting is mere personal preference isn’t true. Formatting code a certain way can help to communicate meaning and structure. This is lost when the minimal tokens are serialized and re-constituted using an automated tool.
> A C argument declaration is made up of modifiers (register, const), a data type (char *), and a name (from).
Now explain a declaration like "char *argv[]"...
> We’ve also re-set the data type such that there is no space between char and * - the data type of both of these variables is “pointer to char”, so it makes more sense to put the space before the argument name, not in the middle the data type’s name (update: it should be pointed out that this only makes sense for a single declaration. A construct like char* a, b will create a pointer to char, a, and a regular char, b).
Ah, yes, the delusional C++ formatting style. At least it's nice that the update provides the explanation why it should be avoided.
frizlab · 1h ago
Yes! I’m always appalled that people cannot see that.
jauntywundrkind · 2h ago
I'm pretty unconvinced by the examples.
> Some of us even align other parts of our code, such repeated inline comments
> Now, the arguments block forms a table of three columns. The modifiers make up the first column, the data types are aligned in the second column, and the names are in the third column
These feel like pretty trivial routines that can be encompassed by code formatting.
We can contrive more extreme examples, like the for loop, but super custom formatting ("typesetting") like that has always made me feel awkward; it feels like it gives license for people to use all manner of arbitrary formatting. The author has some intent, but when you run into an inconsistent codebase with lots of things going on, the variance doesn't feel informative or helpful: it sucks and it's a drain.
What's stored is perhaps more minimal, some kind of reference encoding, maybe prettier-ifies for js. The meat of this article to me is that it shouldn't matter: the IDE should let you view and edit as you like:
> Everyone had their own pretty-printing settings for viewing it however they wanted.
laserbeam · 6m ago
Reminds me of dion systems. A few years ago a group of devs was working on a programming environment that feels very close to what DIANA is describing.
The project is dead enough that they no longer own the TLD for the company. As far as I know, the only remnants of the project are youtube recordings of demos held at conferences.
crq-yml · 3h ago
I think the problem can be defined equally as: we can't invest in something more abstract than "plain text" at this time. When we try, it gets downgraded to a plain text projection of the syntax.
The plain text encoding itself exists in a process of incremental, path-dependent development from Morse Code signals to Unicode resulting in a "Gigantic Lookup Table" (GLUT, my coining) approach to symbolic comprehension. The assumption is useful - lots of features can "just work" by knowing that a particular bit pattern is always a particular symbol.
If we push up the abstraction level, we get a different set of symbols that are better suited to the app, but not equivalent GLUT tooling. Instead we usually get parsing of plain text as a transport. For example, CSV parsing. It is sloppy; it is also good enough.
Edit: XML is also a key example. It goes out of its way to respect the text transport approach. There are dedicated XML editors. But people want to edit it as plain text and they can't quite get there because funny-business with character encodings gets in the way, adding a bunch of ampersands and semicolons onto the symbols they want to edit. Thus we have ended up with "the CSV of hypertext documents", Markdown.
preommr · 3h ago
Others have already mentioned why this is a bad idea (e.g. common plaintext tools don't work, added complexity, etc.)
But I'll also mention that this pretty much already exists. You can have whitespace options for git. I also imagine there's some setup using hooks that uses one formatter locally, and another for remote.
Also, the common IR already exists - it's just the AST. It was "solved" back in the day when people were throwing whatever they could at the wall to see what sticks, since it was all so new. With the benefit of hindsight, I think we can say that it's not that good of an idea.
banashark · 4h ago
Interesting read. I’ve often wondered why the projection we see needs to be the same as the stored artifact. Even something like a git diff should be viewable via a projection of the source IR.
With things like treesitter and the like, I sometimes daydream about what an efficient and effective HCI for an AST or IR would look like.
Things like F#'s ordered compilation often make code reviews simpler for me, but that’s because a piece of the intermediate form (dependency order) is exposed to me as a first-class item. I find it much simpler to reason about than small changes in code with more lax ordering requirements, where I often find myself jumping up and down and back and forth in a diff and all the related interfaces and abstract classes and implementations to understand what effect the delta is having on the program as a whole.
PaulKeeble · 2h ago
In theory we could have an IDE apply a reformatting to any piece of code we looked at, and format any changes back to the codebase's standard on updates. One of the things I dislike is that autoformatting sometimes does a poor job and loses some information that manual formatting provides, but honestly go fmt is mostly fine; it just works.
All of this seems doable. I just think for the most part we don't care very much about our preferences; they have very little impact on readability. It's definitely doable, though: we could view the code however we most wanted and have it stored in a different formatting. It might not be 100% round-trip stable, but that probably doesn't matter.
There is always a better version where the defaults can be overridden and formatting forced, and where we only format new and changed lines to reduce potential instability, but again go fmt doesn't really suffer from this, so it's possible to make things pretty reliable. It's simple really: there is a default formatting and the code is stored that way, and our view of choice can then reformat the code as we want it; when it's stored, it's stored in the default.
kesor · 2h ago
This is how Chrome Dev Tools shows source code. The original is often minified or in whatever format the author left it. And when you check the "pretty" checkbox in dev tools, it shows up using whichever format Chrome developers decided it should look like.
lisper · 3h ago
It never ceases to amaze me how many times people can essentially re-invent S-expressions without realizing that's what they are doing.
mdaniel · 1h ago
Wait until that Bablr user shows up to these threads, and then you'll really have to start drinking
__MatrixMan__ · 1h ago
Unison doesn't move the formatting choices further than the machine on which the code was written. The codebase only contains the AST.
It's such a cool idea, though I haven't spent much time using it in anger, so it's hard to say if it's a useful idea.
ChrisMarshallNY · 3h ago
I've heard that Google works [sort of] that way (don't know, myself). They have a lot of tools that allow devs to use what formatting they want, and it's made standard, during checkin.
I heard this, many years ago, when we used Perforce. The Perforce consultant that we dealt with, told us this, as an example of triggers. Back then, I was told that Google was a big Perforce shop (maybe just a part of Google. I dunno).
I have heard that this was one of the goals of developing IDLs. I think the vision was, that you could have a dozen different programmers, working in multiple languages (for example, C for the drivers, Haskell for the engine, and Lua for the UI). They would be converted to a common IDL, when submitted to configuration management, and then extracted from that, when the user looks at it.
I can't see that working, but a lot of stuff that I used to think was crazy, has happened, so, who knows?
yojo · 3h ago
I can confirm that Google was using Perforce for version control extensively, at least through 2008. I think it was somehow customized, but I definitely have lingering muscle memory around “p4 sync” and “p4 submit”.
I was on an internal tools team doing distinctly unsexy LAMP-stack work, but all the documentation I ever saw talked about perforce/p4.
__loam · 3h ago
Go was designed at Google with a built in style checker to explicitly address this and prevent bikeshedding.
lordnacho · 4h ago
Aren't most projects these days written in a mix of languages, most of them text? You'd have to get them to change to use the same tools we currently use, or else you'd have to use special tools. The beauty of the modern stack is the base tools are near universal.
If you want everyone to see their own preference of format, either write a script or get AI to format it for you.
lxe · 2h ago
> you could view the source however you wanted. Spaces vs. tabs didn't matter because neither affects the semantics and the editor on the system let you modify the program tree directly (known today as projectional editing).
But formatting still doesn't matter. Outside of whitespace-dependent languages, formatting is a subjective thing -- it's a people concern, not a computer concern. I can store my JavaScript as AST if I want to.
hackerbrother · 3h ago
Along these lines, Go eliminates many formatting decisions at the syntax level. E.g.,
    func main()
    {
        fmt.Println("HELLOWORLD")
    }
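is not just non-standard formatting, but illegal Go syntax (a semicolon is automatically inserted after the closing parenthesis, so the opening brace has to be on the same line). Extra parentheses around if conditions, on the other hand, are legal, but gofmt strips them.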
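For anyone who wants to check this rather than take it on faith, here is a small sketch that feeds a few layouts to the standard `go/parser` package. The helper name `parses` and the sample snippets are assumptions for illustration. It also shows that a parenthesized `if` condition does parse; removing those parentheses is something gofmt does, not the grammar.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// parses is a hypothetical helper: it reports whether src is
// syntactically valid Go according to go/parser.
func parses(src string) bool {
	_, err := parser.ParseFile(token.NewFileSet(), "example.go", src, 0)
	return err == nil
}

func main() {
	// Opening brace on the same line as the signature.
	sameLine := "package main\n\nfunc main() {\n}\n"
	// Opening brace on the next line: a semicolon is inserted after ")",
	// so the "{" is left dangling at the top level.
	nextLine := "package main\n\nfunc main()\n{\n}\n"
	// Redundant parentheses around an if condition.
	parenIf := "package main\n\nfunc f(x int) {\n\tif (x > 0) {\n\t}\n}\n"

	fmt.Println(parses(sameLine)) // true
	fmt.Println(parses(nextLine)) // false
	fmt.Println(parses(parenIf))  // true
}
```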
yeasku · 2h ago
Kind of a stupid take if you ever plan on sharing your code or using git.
kesor · 2h ago
Is it? Every developer has an IDE that can easily format the code in whichever way they prefer, and minify it back just before pushing a commit.
MaxLeiter · 2h ago
No reason websites couldn’t let you choose how to view it like editors, either
yeasku · 2h ago
The world is not javascript.
bertil · 2h ago
That’s essentially what black has done with Python, though.
dubya · 1h ago
I want to like Black (or rather, uv format), but the mandatory trailing commas weird me out, especially in function definitions. It always looks like an error to me.
shmerl · 4h ago
You can't easily search / grep etc. an IR, unless you use some kind of reverse translator. Readable source files have their benefits in being simple in that sense.
marssaxman · 3h ago
Imagine having to write a new diff tool for each language!
kesor · 2h ago
You don't need a special grep for every language, you just need a tool that translates the mini version into the formatted version and back. Then you chain the tools, just like anything else in UNIX.
marssaxman · 1h ago
Seems reasonable. Since you're likely to perform this translation more than once for any given file, it seems like it would be practical to cache the translated output, perhaps as a file on disk.
cnnlives83 · 3h ago
It basically is, unless you’re in a whitespace-Nazi language like Python (no offense!).
It doesn’t get much less formatted than Minified JavaScript, except maybe Perl or Brainfuck.
kesor · 2h ago
Minified JS often comes with mangling the names of functions, variables, etc... Formatters and prettifiers lack the ability to bring back the original names and meaning.
jmward01 · 1h ago
I have gotten into discussions with people about linters and code formatting standards in general and I always liken it to a work desk. If my company decided that every work desk had to be 100% generic and that every day if I put any adjustment, even to the seat height, on that desk they would reset it, I would probably think that place was hostile. Even if I could 'auto format' it back to something close every time I stepped up to the desk I would be pretty unhappy. It just wouldn't feel like mine and eventually I would be beaten into whatever style, which wasn't my own, the code came out of the repo as. Basically, linters are evil. They only work for the person that set them up.
Leave code format up to the primary owner of the file. It is pretty rare that code has more than one person that does 95% of the edits on a file so let them own the formatting. In the rare case where there are shared files with shared edits then it is ok to mandate some sort of enforced format but those are so rare that it generally isn't worth discussing. The proposed approach here ignores all the messy non-standard stuff that happens because of the margins or the rules that are very hard to build in when codifying personal coding style.
Let me have my messy desk and I'll let you have yours.
https://naildrivin5.com/blog/2013/05/17/source-code-typograp...