Formatting code should be unnecessary

320 MaxLeiter 438 9/7/2025, 11:08:42 PM maxleiter.com

Comments (438)

automatoney · 17h ago

I've never understood why people care so much about the linter settings. It's so obviously bikeshedding, just make a choice, run the linter automatically and be done with it. I'm too busy doing actual software engineering to care about where exactly everything goes - I promise after a week you'll just get used to whatever format your team lands on.

AdieuToLogic · 14h ago

> I've never understood why people care so much about the linter settings.

Source code formatting programs are not the same as lint[0] programs. The former rewrites source code files such that the output is conformant with a set of layout rules without altering existing logic. The latter is a category of idempotent source code analysis programs typically used to identify potential implementation errors within otherwise valid constructs.

Some language tools support both formatting and source code analysis, but this is an implementation detail.

0 - https://en.wikipedia.org/wiki/Lint_(software)
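One way to make the distinction above concrete, as a minimal sketch in Python using only the standard library: a formatter's output should parse to the same syntax tree as its input, while a lint-style check only reads the code and reports findings. The helper names `same_logic` and `find_bare_excepts` are made up for this illustration and do not come from any real tool.

```python
import ast


def same_logic(before: str, after: str) -> bool:
    """True if both sources parse to the same AST, i.e. they differ only in layout."""
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))


def find_bare_excepts(source: str) -> list[int]:
    """Lint-style check: line numbers of bare `except:` clauses.

    It only inspects the source and reports; nothing is rewritten.
    """
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]


# Same function twice: once with sloppy spacing, once tidied up.
messy = "def f( x ):\n    try:\n        return 1/x\n    except:\n        return 0\n"
tidy = "def f(x):\n    try:\n        return 1 / x\n    except:\n        return 0\n"

print(same_logic(messy, tidy))   # True: a layout-only change
print(find_bare_excepts(tidy))   # [4]: the lint finding is unaffected by layout
```

Comparing syntax trees before and after is only one possible safety check for a layout-only rewrite; whether a given formatter does this is specific to that tool.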

ryandrake · 5h ago

Thank you! I am almost going out of my mind reading this ~200 comment thread with everyone just casually saying "linter" when they mean "formatter". Do people really not distinguish between these two very different programs?

Narushia · 1h ago

I guess the line might feel fuzzy to some people, since nowadays many tools bundle both linting and formatting. And with modern IDE integrations, you might not even run them explicitly — the editor just does both automatically in one go.

trallnag · 1h ago

Lately I've switched to the terms "check" and "fix" because more and more formatters (at least in my bubble of Python and Go) are incorporating fixes. So not just rearranging code and maybe adding a comma here and there.

stavros · 14h ago

Right, but it's obvious they meant "formatter".

_mu · 5h ago

Why build understanding when you could be pedantic?

kristopolous · 10h ago

Formatters, if you want to be specific, are even worse.

They slyly add git noise and pollute your audit trails by just going through and moving shit around whenever you save a file.

And sometimes, they actually insert bugs - string formatting errors are my favorite example.

It's for people who think good code is about adhering to aesthetic ideologies instead of making things documented and accountable.

This is most noticeable in open source contributions. Sometimes I'll get a pull request with like 2 lines of change and 120 lines of some reformatting tool.

You think I accept that?

It's not a good idea

zahlman · 6h ago

> just going through and moving shit around whenever you save a file.

This only happens because the file doesn't already adhere to the rules it's implementing. These are normally highly configurable, and once your code complies with a standard, the tool prevents future code from pulling you away from that standard.

> And sometimes, they actually insert bugs - string formatting errors are my favorite example.

Do you have a concrete example?

> Sometimes I'll get a pull request with like 2 lines of change and 120 lines of some reformatting tool.

Is your existing code formatting at least consistent?

> You think I accept that?

This is a social issue rather than a technical one. You can tell people in your development readme to use specific style rules, or even a project-wide precommit hook. If your own code is formatted with one of these tools, you can even (to my understanding) set up automated checks on GitHub's side.

But of course you are free to reject any PR you want.

kristopolous · 3m ago

I don't keep examples around to defend my stance on this, sorry.

I left out my largest critique - spacing is semantic both for the compiler and the human.

Often I police the whitespace very thoughtfully with long comments for clarity.

I care deeply about maintainability and legibility of code and try to consider future human readers everywhere.

Then the formatter says "haha, fuck that!"

That's my biggest personal gripe with it. It's consistency over clarity, conformity over craft.

This all depends on what kind of code you're writing.

But honestly, the jobs I sign up for demand that kind of care, so I am really frustrated when I'm prevented from exercising my professional judgement and doing what I think is best due to some bureaucratic red tape.

kristopolous · 12m ago

I'll sound off more on this.

Formatters also value consistency over clarity.

I break formatting all the time for the sake of clarity.

Sometimes my comments are paragraphs long with citations and things are carefully broken down with interstitial comments and references and then the formatter fucks it all up and the linter says "wah this oblivious pedant rule isn't followed"

The problem is it doesn't treat me like an adult. And I'm not in this industry for dumb nanny tools that scold me because they don't understand things.

thfuran · 8h ago

You have it entirely backwards. Enforcing a consistent format is useful precisely because it avoids pointless git noise from different people changing formatting differently as they go.

rcxdude · 10h ago

Running random formatters on random subsets of your code is not a good idea. If you want code in a repo to be formatted a certain way, you need to have one set of settings and enforce it, and yeah, reject anything that just has spurious formatting changes that someone else has run.

jchw · 3h ago

Formatters breaking code is not something that happens in all language ecosystems; I think it's mostly a C++ and occasionally JS issue, but gofmt and many other formatters just don't break code. It's also not really that common anyways.

You can solve the Git noise issue by enforcing formatting in CI and keeping formatter configuration in repo. This is what most high quality open source projects will do. The purpose of this is not about "adhering to aesthetic ideologies", it's about not bothering people with the minutiae of yet another pointless set of formatting conventions. Most developers couldn't give a shit less where you think braces should go, or whether you like tabs or spaces, or whatever else, they care about more important things like data structures and writing more correct code. Having auto formatting enables them to effortlessly follow project norms without needing to, for every single repo they work in, carefully try to adhere to the documented formatting (which usually winds up being inconsistent eventually anyways, in projects without auto formatting, because humans are fallible.)

The reason why people submit code with a huge formatting diff is usually because your project didn't ship a formatter manifest but their editor is configured to format on save. That's because probably most of the projects people work on now do actually use some form of automatic formatting, be it clang-format, gofmt, prettier, black, etc. so it winds up being necessary to special case your project to not try to run a formatter. It's still a beginner's mistake to actually commit and PR a huge reformatting, but it definitely happens by accident to even experienced devs when working on projects that have weird manual formatting.

mvieira38 · 7h ago

> This is most noticeable in open source contributions. Sometimes I'll get a pull request with like 2 lines of change and 120 lines of some reformatting tool.

This wouldn't happen nearly as much if you had a defined set of formatting rules plugged into CI instead of chaos

bluGill · 4h ago

If the rule is not enforced in CI it isn't a rule. I've made that a mantra for a long time now and it helps. For a while I did verify formatting in CI, but eventually we decided that formatting wasn't as important as getting builds done fast. We still run other tests and linters in CI, and if they break we fix the build, but formatting isn't really that important so we don't care. Yes, our formatting is somewhat of a mess, but it isn't really that bad even after a decade of agreeing not to care.

misiek08 · 10h ago

I’ve seen multiple repos with pre-hook and just CI running formatter on _modified_ code only. Those repos were the cleanest to date.

craftkiller · 3h ago

> Sometimes I'll get a pull request with like 2 lines of change and 120 lines of some reformatting tool.

The reformatting tools should be CI-enforced so you'll only end up with sudden massive changes like this once when you start using auto-formatters.

Regardless, tell your teammates to separate out formatting changes vs logic changes into separate commits (preferably separate PRs). Since they're auto-formatters it wouldn't even be any additional work, just:

  git fetch origin
  git checkout origin/main
  git checkout -b formatting
  ./run_the_autoformatter.bash
  git commit -a -m "Ran the auto-formatter, which should have been enforced by the CI."
  git push -u origin formatting

shepherdjerred · 16m ago

And then set up git-ignore-revs

https://github.com/orgs/community/discussions/5033

galangalalgol · 8h ago

Avoiding that situation is what I like about formatters. As long as the language has an obvious standard like rust or go.

WalterBright · 6h ago

What I do is make two separate PRs - one for the coding change, the other for reformatting only.

johnnypangs · 6h ago

I’ve used this before, it helps when you format the entire repo and remove the one commit from the history https://docs.github.com/en/repositories/working-with-files/u...

triknomeister · 6h ago

formatting on git diffs is a concept which should be embraced.
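A rough sketch of how "format only what the diff touched" can start, assuming a single-file unified diff produced with zero context lines (for example `git diff -U0`): read the hunk headers to find which lines of the new file were added or modified. The function name `changed_ranges` and the sample diff are invented for illustration; how a particular formatter accepts line ranges is tool-specific and not shown here.

```python
import re

# Hunk header: "@@ -old_start[,old_count] +new_start[,new_count] @@"
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")


def changed_ranges(diff_text: str) -> list[tuple[int, int]]:
    """Return inclusive (first, last) line ranges added or modified in the new file."""
    ranges = []
    for line in diff_text.splitlines():
        match = HUNK_HEADER.match(line)
        if not match:
            continue
        start = int(match.group(1))
        # An omitted count means exactly one line.
        count = int(match.group(2)) if match.group(2) is not None else 1
        if count > 0:  # a count of 0 means the hunk only deletes lines
            ranges.append((start, start + count - 1))
    return ranges


# Hypothetical diff: one hunk rewrites a line and adds another, one only deletes.
sample = """\
diff --git a/app.py b/app.py
--- a/app.py
+++ b/app.py
@@ -10 +10,2 @@ def main():
-    x=1
+    x = 1
+    y = 2
@@ -40,2 +40,0 @@
-    unused()
-    also_unused()
"""

print(changed_ranges(sample))  # [(10, 11)]
```

With the default three lines of context, the same headers would also cover unchanged neighbouring lines, which is why the sketch assumes `-U0`.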

closeparen · 3h ago

Before picking up Go with its formatter and format-on-save norm, I mostly worked in contexts where a program would scan your source code and complain about style violations, but not actually fix them. We called those linters.

kolme · 7h ago

Yes, you are technically correct and yet absolutely irrelevant to the conversation, just adding meaningless noise.

Also, there are many linters that also do formatting, blurring the "line" you're pointing at.

vidarh · 8h ago

Because I spend the vast majority of the time I spend on code reading it, and the layout matters to me in terms of how much time it takes for me to read code.

Yes, I can get used to other layouts, but that by no means means all layouts are equal to me in terms of how readable they are, and how well things stand out when they should, or blend in when they should.

I recognise this isn't the case for everyone - some people read code beginning to end and it doesn't matter how it's laid out. But I pattern match visually, and read fragments based on layout, and I remember code based on visual patterns.

Ironically, because I have aphantasia, and don't visualise things with my "mind's eye", but I still remember things by visual appearance and spatial cues better than by text.

socalgal2 · 13h ago

some settings have advantages. For example, trailing commas on tables

    [
      'apple',
      'banana',
      'orange',
    ]

has an advantage over

    [
      'apple',
      'banana',
      'orange'
    ]

Because adding a new line at the end of the table (1) requires editing 1 line instead of 2, and (2) makes the diffs in code review smaller and easier to read and review. So a bad choice makes my life harder. The same applies to local variable declarations.
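The diff-size claim can be checked with a small sketch using Python's standard `difflib`, which compares line by line much as code review tools do. The helper `lines_touched` and the sample lists are illustrative assumptions, not taken from any real tool.

```python
import difflib


def lines_touched(before: str, after: str) -> tuple[int, int]:
    """Return (removed, added) line counts from a line-based comparison."""
    a, b = before.splitlines(), after.splitlines()
    removed = added = 0
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(None, a, b).get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


with_comma = "[\n  'apple',\n  'banana',\n  'orange',\n]"
with_comma_new = "[\n  'apple',\n  'banana',\n  'orange',\n  'peach',\n]"

no_comma = "[\n  'apple',\n  'banana',\n  'orange'\n]"
no_comma_new = "[\n  'apple',\n  'banana',\n  'orange',\n  'peach'\n]"

print(lines_touched(with_comma, with_comma_new))  # (0, 1): only the new item
print(lines_touched(no_comma, no_comma_new))      # (1, 2): 'orange' is rewritten too
```

Without the trailing comma, the previous last line shows up in the diff even though its value did not change.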

Sorted lists (or sorted includes) are also something that makes my life easier. If they're not sorted then everyone adds their new things to the end, which means there are many times more merge conflicts. Sorting doesn't mean there are zero conflicts, but it does mean there are fewer than with "append to the end". So, just like an auto-formatter is there to save time, don't waste my time by not sorting where possible.

Also, my OCD hates inconsistency. So

    [1, 2, 3]
    {a, b, c}

Is ok and

    [ 1, 2, 3 ]
    { a, b, c }

Is ok but

    [1, 2, 3]
    { a, b, c }

Is not. I don't care which but pick ONE style, not two styles!

austin-cheney · 9h ago

Yes, everyone has personal opinions about code vanity. When this becomes a holy war I really start to question the maturity of people on the project. I find that people worry about trivial nonsense to mask their inability to address more valid concerns.

All that really matters is consistency. Let a team make some decisions and then just move forward.

2muchcoffeeman · 8h ago

Don’t bother making decisions. Steal a standard. Vote on it once if you want to be democratic. Done forever.

gorgoiler · 7h ago

Democracy, strictly speaking, would be to periodically elect the most popular formatting policy once every sensible-time-period.

I’ve seen companies with such a large amount of developer churn that literally one person was left defending the status quo saying “we do X here, we voted on it once in 2019 and we’re not changing it just for new people”. 90% of the team were newcomers.

(The better teams I’ve worked on maintain a core set of leaders who are capable of building consensus through being very agreeable and smart. Gregarious Technocracy >> Popular Democracy!)

ParetoOptimal · 4h ago

> All that really matters is consistency. Let a team make some decisions and then just move forward.

Not so! The number of tokens correlates with perceived code complexity for some people. One example is how some people can't unsee or look past Lisp's parentheses.

Another example is how some people get used to longDescriptiveVariableNames but others find that overwhelming (me for instance) when you have something like:

    userSignup = do
        let fullName = userFirstNameInput ++ userLastNameInput
            userName = take 1 userFirstNameInput ++ take 10 userLastNameInput
        saveToDB userName

The above isn't bad, but imagine variables named that verbosely used over and over, especially on the same line.

Compare it to:

    userSignup = do
        let fullName = firstName ++ lastName
            userName = take 1 firstName ++ take 10 lastName
        saveToDB userName

The second example loses some information, but I'd argue it doesn't matter too much given the context one would typically have in a function named `userSignup`.

I've had codebases where consistency required naming all variables like `firstNameInputField` rather than just `firstName` and it made functions unreadable because it made the unimportant parts seem more important than they were simply by taking up more space.

parthdesai · 6h ago

It's one of my pet peeves when some senior engineers are bothered more by these coding semantics in a PR when there are bigger data model/code architectural issues, and don't call that out.

yes_man · 11h ago

The problem is when 2 people with the same level of enthusiasm for linter rules but opposing views collide. If there’s nothing more impactful you could be solving and spending energy and time on than arguing those linter rules, then it’s time to question where the project is at and where it is going.

And if there is something more important, then instead of micro-optimizing the rules when there is strong disagreement, it’s probably best if one of the parties takes the high road and lives with it so you can all focus on what matters.

vbezhenar · 11h ago

I guess that's one reason why opinionated tools like prettier or gofmt are popular. They made all the choices for you, they don't have configurable knobs, so you just learn to live with it.

lsaferite · 58m ago

FWIW, there are formatting decisions that gofmt doesn't make for you, so it's not as simple as just using gofmt.

megamalloc · 7h ago

The bad thing about these sort of tools is when you work in a shop where multiple platforms are used for development and one of the platforms doesn't support the tool, or the tool fights with other tooling on that platform. You should for example never use pre-commit to enforce line ending style because git has brain dead defaults (which is to say, unless you have a .gitattributes file in your repo to prevent it, it will change line endings itself, fighting pre-commit). This just creates confusion and wasted time. In aid of what? So some anal so-and-so can get their way about code formatting as though it makes everyone else more productive to format code THEIR way. When in fact it makes others LESS productive as they fight "computer says no" format-nazi jobs in CI that don't even report what is "wrong" with the formatting and rely on tooling that they don't have installed to run locally.

Not to mention the overhead of running these worthless inefficient tools on every commit (even locally).

Tools like this just raise the debate from different opinions about formatting to different opinions about workflows. Workflows impact productivity a lot more than formatting.

bluGill · 7h ago

You should force them to choose from someone else's style. Don't let them tweak individual settings, choose a complete standard and apply it with both the thing they like and things they don't. A style is useful, the details do not matter that much.

rapind · 10h ago

Let this sink in though:

    [ 'apple'
    , 'banana'
    , 'orange'
    ]

maest · 10h ago

That makes prepending an element a special case.

ParetoOptimal · 4h ago

It makes it easier to read though because the least important parts are most easily ignored. The reader can focus on the contents of the list.

3pt14159 · 3h ago

I don't really know why we even need commas for lists of things. Just use the white space.

JBiserkov · 9h ago

In Clojure, commas are treated as whitespace and are thus completely optional.

Nevermark · 9h ago

This is so clearly superior. Delimiters are prefixes.

But the scale of technical debt this insight has revealed is depressing.

citizenkeen · 9h ago

Saying this is clearly superior means you don’t keep your lists sorted. A sorted list is as likely to add something to the beginning as the end, where this solution has the same problem.

Nevermark · 9h ago

I just reverse the sort order when that case happens.

setr · 9h ago

The only correct syntax/format

If only there existed a language designer intelligent enough to support it

maccard · 8h ago

You want yaml

    key:
      - a
      - b
      - c

MonkeyClub · 2h ago

Lisp:

crazygringo · 7h ago

Thank you for the humor!

I'm just suddenly slightly terrified someone's going to see this and think it's genuinely a good idea and make it part of the next popular scripting language, where lists are defined by starting commas or something :S

jrochkind1 · 7h ago

You'll be annoyed to know that your last "not okay" style is what's considered standard in ruby (although the curly braces have different semantics, they are, well, either a hash or a code block (which is kind of annoying to me that they're used for two entirely different things) never a list/array).

zahlman · 6h ago

> Also, my OCD hates inconsistency

Mine hates trailing commas :)

More seriously, I don't like having lists like that in the code in the first place. I don't want multiple lines taken up for just constant values, and if it turns out to require maintenance then the data should be in a config file instead anyway.

whizzter · 3h ago

Rule of 3: write/change it once or twice (or seldom enough with no possible negative impact) and it doesn't need any complexity. More than that... yeah, it probably goes into a config.

Revisional_Sin · 3h ago

Constants in the code are easier to navigate to than config files.

aleph_minus_one · 9h ago

> Because adding a new line at the end of the table (1) requires editing 1 line, instead of 2 (2) makes the diffs in code review smaller and easier to read and review.

This judgement is rather based on a strong personal opinion (which I don't claim to be wrong, but also not as god-given) on what is one, and what are two changes in the code:

- If you consider adding an additional item to the end of the list to be one code change, I agree that a trailing comma makes sense

- On the other hand, it is also a sensible judgment to consider this to be a code change of two lines:

1. an item (say 'peach') is added to the end of the list

2. 'orange' has been turned from the last element of the list to a non-last element of the list

If you are a proponent of the second interpretation, the version that you consider to be non-advantageous is the one that does make sense.

dghf · 8h ago

> 2. 'orange' has been turned from the last element of the list to a non-last element of the list

Then why not consider it four changes?

3. 'banana' has been turned from the last-but-one element of the list to the last-but-two element of the list

4. 'apple' has been turned from the last-but-two element of the list to the last-but-three element of the list

Skeime · 9h ago

But the second interpretation only makes sense if the last item somehow deserves special treatment (over, say, the second-to-last item). Otherwise, you should similarly argue that the previous second-to-last item should also show up in the changes as it has now turned into the third-to-last item. (So maybe every item in the list should be preceded by as many spaces as are items before it and succeeded by as many commas as are items following it. Then, every change to the list will be a diff of the entire list.)

    first item,,,
     second item,,
      third item,
       fourth item

In my experience, special treatment for the last item is rarely warranted, so a trailing comma is a good default. If you want the last item to be special, put a comment on that line, saying that it should remain last. (Or better yet, find a better representation of your data that does not require this at all.)

aleph_minus_one · 7h ago

> But the second interpretation only makes sense if the last item somehow deserves special treatment (over, say, the second-to-last item).

There do exist reasons why this can make sense:

- In an Algebraic Data Type implementation of a non-empty list, the last symbol is a different type constructor than the one to append an item to the front of an existing non-empty list (similarly how for an Algebraic Data Type implementation of an arbitrary list, the type constructor for an initial empty list is "special").

- In a single-linked list implementation, sometimes (depending on the implementation) the terminal element of the list is handled differently.

---

By the way: at work, because adding parameters at the beginning of a (parameter) list of a function is "special" (because in the code for many functions the first parameters serve a very special purpose), but adding some additional parameter at the end is not, we commonly use parameter lists formatted like

    'foo'
  , 'bar1'
  , 'bar2'
  , 'blub'

muzani · 11h ago

I agree with you on all these points. If you were to argue the opposite point, I'd agree as well.

hananova · 10h ago

Meanwhile, I know and understand the reasons for trailing commas, but I find them incredibly ugly so I always strip them out.

sarchertech · 10h ago

Can’t strip them out if the compiler requires them.

huflungdung · 12h ago

That isn’t ocd.

kolme · 7h ago

I did that when I was young and naive. I'll tell you why I did it.

I thought I was very smart. Like, really really smart, maybe the smartest programmer in the team.

And as such my opinion was very important. Maybe the most important opinion in the team. Everyone had to listen to it!

That is all. Also, I was wrong.

robertlagrant · 6h ago

> Also, I was wrong.

This is probably the only useful takeaway, but can you explain why you were wrong?

kolme · 5h ago

Yes, I was wrong on several levels.

First and foremost I was wrong thinking that I was smarter than others — that's not even how intelligence works.

Second I was wrong being so stubbornly pro-tabs / anti-spaces (for example). It doesn't make that much of a difference, so there's no point in being so passionate about it.

And third I was wasting everyone's time (and my persuasion powers) by not choosing my battles more wisely.

My suggestion would be nowadays: let's choose a popular style guide, set up a linter and be done with it.

psychoslave · 13h ago

I don't care that much about the specific options retained (though my own preferences of the day are obviously the best taste ever in the whole existence of the universe), but having a common linter setting to prevent the noise in every damn PR is a must-have.

Yes, both git and all these PLs are actually damn stupid to take lines at face value instead of something more elegant like Ada does. In my 20+ year career I've only once been offered a project that involved Ada.

It's hard to come up with something elegant and efficient. It's even harder to make it reach a top-tier global presence, all the more when the ecological niche is already filled with good-enough stuff.

schneems · 8h ago

I learned to love rustfmt but there’s one thing that bothers me: there are a few cases where there are two ways to do something. For example, a one-line closure can omit the curly brackets, but multi-line closures cannot. Rustfmt prefers to remove those brackets when it can, but I prefer to keep them, which makes editing the code faster since I don’t get a syntax error if I suddenly need a second line.

I can still live with it. And I like the clean, minimal version when I don’t have to edit. Just adding that “style” can have impact beyond how it looks involving ease of editing. And it stinks when your preferences clash with the community.

jupp0r · 16h ago

I generally agree, but max line length being so high you have to horizontally scroll while reading code is very detrimental to productivity.

elevation · 15h ago

Formatters eliminating long lines is a pet peeve of mine.

About once every other project, some portion of the source benefits from source code being arranged in a tabular format. Long lines which are juxtaposed help make dissimilar values stand out. The following table is not unlike code I have written:

  setup_spi(&adc,    mode=SPI_01, rate=15, cs_control=CS_MUXED,  cs=0x01);
  setup_spi(&eeprom, mode=SPI_10, rate=13, cs_control=CS_MUXED,  cs=0x02);
  setup_spi(&mram,   mode=SPI_10, rate=50, cs_control=CS_DIRECT, cs=0x08);

Even if we add 4-5 more operational parameters, I find this arrangement much more readable than the short-line equivalent:

  setup_spi(&adc,
      mode=SPI_01,
      rate=15,
      cs_control=CS_MUXED,
      cs=0x01);
  setup_spi(&eeprom,
      mode=SPI_10,
      rate=13,
      cs_control=CS_MUXED,
      cs=0x02);
  setup_spi(&mram,
      mode=SPI_10,
      rate=50,
      cs_control=CS_DIRECT,
      cs=0x08);

Or worse, the formatter may keep the long lines but normalize the spaces, ruining the tabular alignment:

  setup_spi(&adc, mode=SPI_01, rate=15, cs_control=CS_MUXED, cs=0x01);
  setup_spi(&eeprom, mode=SPI_10, rate=13, cs_control=CS_MUXED, cs=0x02);
  setup_spi(&mram, mode=SPI_10, rate=50, cs_control=CS_DIRECT, cs=0x08);

Sometimes a neat, human-maintained block of 200 character lines brings order to chaos, even if you have to scroll a little.

sn0wleppard · 13h ago

The worst is when you have lines in a similar pattern across your formatter's line length boundary and you end up with

  setup_spi(&adc, mode=SPI_01, rate=15, cs_control=CS_MUXED, cs=0x01);
  setup_spi(&eeprom,
      mode=SPI_10,
      rate=13,
      cs_control=CS_MUXED,
      cs=0x02);
  setup_spi(&mram, mode=SPI_10, rate=50, cs_control=CS_DIRECT, cs=0x08);

crazygringo · 7h ago

I think with the Black formatter you can force the multiline version by adding a trailing comma to the arguments.

The pain point you describe is real, which is why that was intentionally added as a feature.

Of course it requires a language that allows trailing commas, and a formatter that uses that convention.

dvdkon · 6h ago

A similar tip: As far as I can tell, clang-format doesn't reflow across comments, so to force a linebreak you can add a // end-of-line comment.

crazygringo · 7h ago

I get what you're saying, and used to think that way, but changed my mind because:

1) Horizontal scrolling sucks

2) Changing values easily requires manually realigning all the other rows, which is not productive developer time

3) When you make a change to one small value, git shows the whole line changing

And I ultimately concluded code files are not the place for aligned tabular data. If the data is small enough that it belongs in a code file rather than a CSV you import, then great, but bothering with alignment just isn't worth it. Just stick to the short-line equivalent. It's the easiest to edit and maintain, which is ultimately what matters most.

paddy_m · 6h ago

This comes up in testing a lot. I want testing data included in test source files to look tabular. I want it to be indented such that I can spot order of magnitude differences.

a_e_k · 14h ago

Yes, so much this!

I've often wished that formatters had some threshold for similarity between adjacent lines. If some X% of the characters on the line match the character right above, then it might be tabular and it could do something to maintain the tabular layout.

Bonus points if it's able to do something like diff the adjacent lines to detect table-like layouts, figure out if something nudged a field or two out of alignment, and then insert spaces to fix the table layout.
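The similarity threshold described above could be sketched roughly like this in Python. This is a toy heuristic and not a feature of any existing formatter; the function names and the 0.6 threshold are assumptions picked for illustration.

```python
def column_match_ratio(upper: str, lower: str) -> float:
    """Fraction of character positions where both lines hold the same character."""
    longest = max(len(upper), len(lower))
    if longest == 0:
        return 0.0
    same = sum(1 for a, b in zip(upper, lower) if a == b)
    return same / longest


def looks_tabular(lines: list[str], threshold: float = 0.6) -> bool:
    """Heuristic: every adjacent pair of lines matches in at least `threshold` of columns."""
    if len(lines) < 2:
        return False
    return all(
        column_match_ratio(a, b) >= threshold for a, b in zip(lines, lines[1:])
    )


# The hand-aligned table from the comment above, then the same calls with spaces normalized.
aligned = [
    "setup_spi(&adc,    mode=SPI_01, rate=15, cs_control=CS_MUXED,  cs=0x01);",
    "setup_spi(&eeprom, mode=SPI_10, rate=13, cs_control=CS_MUXED,  cs=0x02);",
    "setup_spi(&mram,   mode=SPI_10, rate=50, cs_control=CS_DIRECT, cs=0x08);",
]
squashed = [
    "setup_spi(&adc, mode=SPI_01, rate=15, cs_control=CS_MUXED, cs=0x01);",
    "setup_spi(&eeprom, mode=SPI_10, rate=13, cs_control=CS_MUXED, cs=0x02);",
    "setup_spi(&mram, mode=SPI_10, rate=50, cs_control=CS_DIRECT, cs=0x08);",
]

print(looks_tabular(aligned))   # True
print(looks_tabular(squashed))  # False
```

A check like this can only notice a table that is already aligned. Repairing a table that has drifted out of alignment, the "bonus points" part, would need a real diff of the adjacent lines.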

Cthulhu_ · 12h ago

I believe some formatters have an option where you can specify a "do not reformat" block (or override formatting settings) via specific comments. As an exception, I'm okay with that. Most code (but I'm thinking business applications, not kernel drivers) benefits from default code formatting rules though.

And sometimes, if the code doesn't look good after automatic formatting, the code itself needs to be fixed. I'm specifically thinking about e.g. long or nested ternary statements; as soon as the auto formatter spreads it over multiple lines, you should probably refactor it.

a_e_k · 2h ago

I'm used to things like `// clang-format off` and on pairs to bracket such blocks, and adding empty trailing `//` comments to prevent re-flowing, and I use them when I must.

This was more about lamenting the need for such things. Clang-format can already somewhat tabularize code by aligning equals signs in consecutive cases. I was just wishing it had an option to detect and align other kinds of code to make or keep it more table like. (Destroying table-like structuring being the main places I tend to disagree with its formatting.)

VBprogrammer · 14h ago

Those kinds of tables improve readability right until someone hits a length constraint and has to either touch every line in order to fix the alignment, causing weird conflicts in VCS, or ignore the alignment, and its slow decay into a mess begins.

Cthulhu_ · 11h ago

It's not an either/or though. Tables are readable and this looks very much like tabular data. Length constraints should not be fixed if you have code like this, and it won't be "a slow decay into a mess" if escaping the line length rules is limited to data tables like these.

VBprogrammer · 8h ago

By length constraint I meant that one of the fields grows longer than originally planned rather than bypassing the linter.

lambdaba · 14h ago

I agree, I'm very much against any line length constraint, it's arbitrary and word wrapping exists.

jaimebuelta · 11h ago

The first line should be readable enough, but in case it's longer than that, I much prefer the style of

  setup_spi(&adc, mode=SPI_01, rate=15, cs_control=CS_MUXED,
            cs=0x01);
  setup_spi(&eeprom, mode=SPI_10, rate=13, cs_control=CS_MUXED,
            cs=0x02);
  setup_spi(&mram, mode=SPI_10, rate=50, cs_control=CS_DIRECT,
            cs=0x08);

over the short-line alternative presented there.

I like short lines in general, as having a bunch of short lines (which tend to be the norm in code) and suddenly a very long line is terrible for readability. But everything has exceptions. It's also very dependent on the programming language.

bryanrasmussen · 13h ago

People have already outlined all the reasons why the long line might be less than optimal, but I will note that really you are using formatting to do styling.

In a post-modern editor (by which I mean any modern editor that takes this kind of thing into consideration which I don't think any do yet) it should be possible for the editor to determine similarity between lines and achieve a tabular layout, perhaps also with styling for dissimilar values in cases where the table has a higher degree of similarity than the one above. Perhaps also with collapsing of tables with some indicator that what is collapsed is not just a sub-tree but a table.
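A rough sketch of what that display step might look like, in Python. It assumes the similar lines are calls whose arguments are separated by ", ", and `align_calls` is a made-up helper name, not something any editor ships as far as I know. It only changes what is shown; the stored lines stay as they are.

```python
def align_calls(lines):
    """Return display-only copies of `lines` with their fields padded into columns.

    Hypothetical helper: fields are whatever sits between ", " separators.
    """
    rows = [line.split(", ") for line in lines]
    # Only tabulate when every line has the same number of fields.
    if len({len(row) for row in rows}) != 1:
        return list(lines)
    # Widest entry in each column decides that column's width.
    widths = [max(len(cell) for cell in column) for column in zip(*rows)]
    aligned = []
    for row in rows:
        # Pad every field except the last one, which simply ends the line.
        padded = [(cell + ",").ljust(width + 1)
                  for cell, width in zip(row[:-1], widths)]
        aligned.append(" ".join(padded + [row[-1]]))
    return aligned


calls = [
    "setup_spi(&adc, mode=SPI_01, rate=15, cs_control=CS_MUXED, cs=0x01);",
    "setup_spi(&eeprom, mode=SPI_10, rate=13, cs_control=CS_MUXED, cs=0x02);",
    "setup_spi(&mram, mode=SPI_10, rate=50, cs_control=CS_DIRECT, cs=0x08);",
]
for line in align_calls(calls):
    print(line)
```

A real editor would need a smarter notion of "similar lines" than a matching field count, but the padding itself is the easy part.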

account42 · 12h ago

Another issue with fixed line lengths is that it requires tab stops to have a defined width instead of everyone being able to choose their desired indentation level in their editor config.

rerdavies · 7h ago

I think you have that backward. Allowing everyone to choose their desired indentation in their editor config is the issue. That's insane!

DonHopkins · 10h ago

Another issue with everyone being able to choose their desired indentation level in their editor config is unbounded line length.

growse · 14h ago

//nolint

bloak · 10h ago

/* clang-format off */

vbezhenar · 11h ago

It is an obvious example where an automatic formatter fails.

But are there more examples? Maybe it's not a high price to pay. I'm using either the second or third approach for my code and I've never had many issues. Yes, the first example is pretty, but it's not a huge deal for me.

someothherguyy · 13h ago

  setup_spi(
    &adc,
    mode=SPI_01,
    rate=15,
    cs_control=CS_MUXED,
    cs=0x01
  );
  setup_spi(
    &eeprom,
    mode=SPI_10,
    rate=13,
    cs_control=CS_MUXED,
    cs=0x02
  );
  setup_spi(
    &mram,
    mode=SPI_10,
    rate=50,
    cs_control=CS_DIRECT,
    cs=0x08
  );

ftfy

DonHopkins · 10h ago

This is good, and objectively better than letting the random unbounded length of the function name define and inflate and randomize the indentation. It also makes it easier to use long descriptive function names without fucking up the indentation.

  setup_spi(&adc,
            mode=SPI_01,
            rate=15,
            cs_control=CS_MUXED,
            cs=0x01
  );
  setup_spoo(&adc,
             mode=SPI_01,
             rate=15,
             cs_control=CS_MUXED,
             cs=0x01
  );
  setup_s(&adc,
          mode=SPI_01,
          rate=15,
          cs_control=CS_MUXED,
          cs=0x01
  );
  validate_and_register_spi_spoo_s(&adc,
                                   mode=SPI_01,
                                   rate=15,
                                   cs_control=CS_MUXED,
                                   cs=0x01
  );

Marazan · 13h ago

That is harder to read than the long line version.

However, it is the formatting I adopt when forced to bow down to line length formatters.

lenkite · 11h ago

Err..I find the short-line version easier to read. Esp if you need to horizontally scroll.

This is why a Big Dictator should just make a standard. Everyone who doesn't like the standard approach just gets used to it.

someothherguyy · 11h ago

to you, to me, it reads nicely, and thus the issue -- editors should have built in formatters that don't actually edit source code, but offer a view

thaumasiotes · 11h ago

To me, that reads fine, but it has lost the property elevation wanted, which was that it's easy to compare the values assigned to any particular parameter across multiple calls. In your version you can only read one call at a time.

IlikeKitties · 14h ago

I'm surprised. I find the short-line version to be much better.

komali2 · 14h ago

Devs have different pixel count screens. Your table wrapped for me. The short line equivalent looks best on my screen.

Thus 80 or perhaps 120 char line lengths!

account42 · 12h ago

So fix your setup? Why should others with wider screens leave space on their screen empty for your sake?

Especially 80 characters is a ridiculously low limit that encourages people to name their variables and functions some abbreviated shit like mbstowcs instead of something more descriptive.

genericspammer · 11h ago

Do you guys never read code as side by side diffs in the browser?

komali2 · 8h ago

Never mind in a browser, this is how I review a ton of code, either in magit or lazygit or in multiple terminals.

saagarjha · 10h ago

I softwrap so I don't care about line length myself but I read code on a phone a lot so people who hardwrap at larger columns are a little more annoying

maratc · 10h ago

> Why should others with wider screens leave space on their screen empty for your sake?

Because "I" might be older or sight-impaired, and have "my" font at size 32, and it actually fills "my" (wider than yours) screen completely?

Would you advise me to "fix my eyes" too? I'd love to!

"Why should I accommodate others" is a terrible take.

rerdavies · 7h ago

I would advise you to buy one of these: https://www.dell.com/en-ca/shop/dell-ultrasharp-49-curved-us...

80-column line lengths is a pretty severe ask.

komali2 · 11h ago

My main machine is an ultrawide, but I usually have multiple files open, and text reads best top-down so I stack files side-by-side. If someone has like, a 240 character long line, that is annoying. My editor will soft wrap and indicate this in the fringe of course but it's still a little obnoxious.

80 is probably too low these days but it's nice for git commit header length at least.

DonHopkins · 10h ago

So haul your wide monitor around with your laptop, you mean? No.

Just use descriptive variable names, and break your lines up logically and consistently. They are not mutually exclusive, and your code will be much easier for you and other people to read and edit and maintain, and git diffs will be much more succinct and precise.

delusional · 11h ago

> So fix your setup? Why should others with wider screens leave space on their screen empty for your sake?

What a terrible attitude to have when working with other people.

"Oh, I'm the only one who writes Python? Fix your setup. why should I, who know python, not write it for your sake?"

"Oh, I'm the only one who speaks German? Fix your setup. Why should I, who know German, not speak it for your sake?"

How about doing it because your colleagues, who you presumably like collaborating with to reach a goal, ask you to?

account42 · 8h ago

Yes, I don't think we should discourage people from using Python or German just because you don't want to learn those particular languages either.

Working together with others should not mean having to limit everyone to the lowest common denominator, especially when there are better options for helping those with limitations that don't impact everyone else.

balamatom · 10h ago

What do you do about the "oh, I'm the only one who cares about [???]? should I just fucking kill myself then?" Many such cases.

>How about doing it because your colleagues, who you presumably like collaborating with to reach a goal, asks you to?

If someone wants me to do a certain thing in a certain way, they simply have to state it in terms of:

- some benefit they want to achieve

- some drawback they want to avoid

- as little as an acknowledged unexamined preference like "hey I personally feel more comfortable with approach X, how bout we try that instead"

I'm happy to learn from their perspective, and gladly go out of my way to accommodate them. Sometimes even against my better judgment, but hell, I still prefer to err on the side of being considerate. Just like you say, I like to work with people in terms of a shared goal, and just like you do, in every scenario I prefer to assume that's what's going on.

If, however, someone insists on certain approaches while never going deeper in their explanations than arbitrary non-falsifiable qualifiers such as "best practice", "modern", "clean", etc., then I know they haven't actually examined those choices that they now insist others should comply with. They're just parroting whatever version they imagine of industry-wide consensus describes their accidental comfort zone. And then boy do they hate my "make your setup assume less! it's the only way to be sure!". But no, I ain't reifying their meme instead of what I've seen work with my own two.

delusional · 8h ago

> If, however, someone insists on certain approaches while never going deeper in their explanations than arbitrary non-falsifiable qualifiers such as "best practice", "modern", "clean"

You're moving the goalposts of this discussion. The guy I was responding to said "fix your setup" to another person saying "Your table wrapped for me. The short line equivalent looks best on my screen." That's a stated preference based on a benefit he'd like to achieve.

We are not discussing "best practice" type arguments here.

balamatom · 7h ago

"Best practice" type arguments are the universal excuse for remaining inconsiderate of the fact that different people interact with code differently, but fair enough I guess

brettermeier · 12h ago

Living in the 80's XD

tsimionescu · 13h ago

I am at the opposite end. Having any line length constraints whatsoever seems like a massive waste of time every time I've seen it. Let the lines be as long as I need them, and accept that your colleagues will not be idiots. A guideline for newer colleagues is great, but auto-formatters messing with line lengths is a source of significant annoyance.

Cthulhu_ · 11h ago

> auto-formatters messing with line lengths is a source of significant annoyance.

Unless they have been a thing since the start of a project; existing code should never be affected by formatters, that's unnecessary churn. If a formatter is introduced later on in a project (or a formatting rule changed), it should be applied to all code in one go and no new code accepted if it hasn't passed through the formatter.

I think nobody should have to think about code formatting, and no diff should contain "just" formatting changes unless there's also an updated formatting rule in there. But also, you should be able to escape the automatic formatting if there is a specific use case for it, like the data table mentioned earlier.

jitl · 16h ago

every editor can wrap text these days. good ones will even indent the wrapped text properly

giveita · 16h ago

That's a slippery slope towards storing semantics and displaying locally preferred syntax ;)

Cthulhu_ · 11h ago

And that's fine, as long as whatever ends up in version control is standardized. Locally you can tweak your settings to have / have not word wrapping, 2-8 space indentation, etc.

But that's the core of this article, too; since then it's normalized to store the plain text source code in git and share it, but it mentions a code and formatting agnostic storage format, where it's down to people's editors (and diff tools, etc) to render the code. It's not actually unusual, since things like images are also unreadable if you look at their source code, but tools like GitHub will render them in a human-digestible format.

jitl · 16h ago

I prefer storing plain text and displaying locally preferred syntax, to a degree.

With some expressions, like lookup tables or bit strings, hand wrapping and careful white space use is the difference between “understandable and intuitive” and “completely meaningless”. In JS world, `// prettier-ignore` above such an expression preserves it but ideally there’s a more universal way to express this.

rightbyte · 37m ago

Is this a subtle pro-tab pinch?

NL807 · 16h ago

And the bikeshedding has begun...

thfuran · 8h ago

Who’s going to be bikeshedding (about formatting) when everyone can individually configure their own formatting rules without affecting anyone else?

giveita · 13h ago

What's the nuclear reactor in this analogy?

pferde · 13h ago

That the values could have been extracted to an array of structs, and iterated over in a small cycle that calls the function for each set of values.
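Roughly like this, sketched in Python rather than C. The `SpiConfig` type and the `setup_spi` stand-in are made up for illustration (the real driver call isn't shown anywhere in this thread); the values are the ones from the example above.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SpiConfig:
    """One row of the table: the settings for a single SPI device."""
    device: str
    mode: str
    rate: int
    cs_control: str
    cs: int


# The "array of structs": the data lives in one place, one row per device.
SPI_DEVICES = [
    SpiConfig("adc", "SPI_01", 15, "CS_MUXED", 0x01),
    SpiConfig("eeprom", "SPI_10", 13, "CS_MUXED", 0x02),
    SpiConfig("mram", "SPI_10", 50, "CS_DIRECT", 0x08),
]


def setup_spi(device, mode, rate, cs_control, cs):
    """Stand-in for the real setup call; it only describes what it was given."""
    return f"{device}: mode={mode} rate={rate} cs_control={cs_control} cs={cs:#04x}"


# The small loop: one call per row of the table.
results = [setup_spi(**asdict(config)) for config in SPI_DEVICES]
for line in results:
    print(line)
```

Whether the formatter then leaves the table itself alone is, of course, the original argument all over again.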

virtue3 · 14h ago

was going to say the same thing.

Boy that was fast.

rTX5CMRXIfFG · 15h ago

You still have to minimize the wrapping that happens, because wrapped lines of code tend to be continuous instead of being properly spaced so as to make their parts individually readable.

hulitu · 15h ago

> every editor can wrap text these days.

could. Yesterday notepad (win 10) just plainly refused.

jitl · 8h ago

Windows is so weird

jghn · 9h ago

I’d agree with you except for the trend over the last 10 years or so to set limits back to the Stone Age. For a while there we seemed to be settling on somewhere around 150 characters and yet these days we’re back to the 80-100 range.

appellations · 16h ago

I forget there are people who don’t configure softwrap in their text editor.

Some languages (java) really need the extra horizontal space if you can afford it and aren’t too hard to read when softwrapped.

forrestthewoods · 16h ago

Define high? I think 120 is pretty reasonable. Maybe even as high as 140.

Log statements however I think have an effectively unbounded length. Nothing I hate more than a stupid linter turning a sprinkling of logs into 7 line monsters. cargo fmt is especially bad about this. It’s so bad.

skinner927 · 16h ago

I still prefer 80. I won’t (publicly) scoff at 100 though. IMO 120 is reasonable for HTML and Java, but that’s about it.

Sent from my 49” G9 Ultrawide.

guenthert · 12h ago

Obviously 100 is the right choice.

https://en.wikipedia.org/wiki/Line_length#cite_note-dykip-8

Joker_vD · 14h ago

Give a try to 132 mode, maybe? It was the standard paper width for printouts since, well, forever.

balamatom · 9h ago

That's actually just weirdly specific enough to be worth a shot.

psychoslave · 13h ago

The printing industry has not been around for anything close to forever; even writing is relatively novel compared to human spoken languages.

All that said, I'm interested with this 132 number, where does it come from?

Joker_vD · 10h ago

"Since forever" as in, "since the start of electronic computing"; we started printing the programs out on paper almost immediately. The 132 columns come from IBM's ancient line printers (circa 1957); most other manufacturers followed suit, and even the glass ttys routinely had a 132-column mode (for the VT100 you had to buy a RAM extension, for later models it was just there, I believe). My point is, most people did understand, even back in the sixties, that an 80-column-wide screen is tiny, especially for reading source code.

dcminter · 12h ago

Printers aside the VT220 terminal from DEC had a 132 column mode. Probably it was aping a standard printer column count. Most of the time we used the 80 column mode as it was far more readable on what was quite a small screen.

guenthert · 12h ago

Not only a small screen by modern standards, but the hardware lacked the needed resolution. The marketing brochure claims a 10x10 dot matrix. That will be for the 80 column mode. That works out to a respectable 800 pixels horizontally, and a barely sufficient 6x10 pixels per character in 132 column mode. There was even a double-high, double-width mode for easier reading ;-)

Interesting here perhaps is that even back then it was recognized, that for different situations, different display modes were of advantage.

dcminter · 11h ago

> There was even a double-high, double-width mode for easier reading

I'd forgotten that; now that was a fugly font. I don't think anyone ever used it (aside from the "Setup" banner on the settings screen)

I think the low pixel count was rather mitigated by the persistence of phosphor though - there are reproductions of the fonts that had to take this into account; see the stuff about font stretching here: https://vt100.net/dec/vt220/glyphs

bloak · 13h ago

The IBM 1403 line printer, apparently.

typpilol · 14h ago

That's literally my setup everywhere. 120 for html/java/JavaScript and 80 elsewhere.

Really suits each language imo. Although I could probably get away with 80, my habit of using Tailwind classes can get messy at 80 compared to 120.

Cthulhu_ · 11h ago

Caveat, my personal experience is mainly limited to JS/TS, Java, and associated languages. 120 is fine for most use cases; I've only seen 80 work in Go, but that one also has unwritten rules that prefer reducing indentation as much as possible; "line-of-sight programming", no object-oriented programming (which gives almost everything a layer of indentation already), but also it has no ternary statements, no try/catch blocks, etc. It's a very left-aligned language, which is great for not unnecessarily using up that 80 column "budget".

forrestthewoods · 14h ago

Ugh. 80 is the worst. For C++ it’s entirely unreasonable. I definitely can not reconcile “linters make code easier to read” and “80 width is good”. Those are mutually exclusive imho.

What I actually want from a linter is “120, unless the trailing bits aren’t interesting in which case 140+ is fine”. The ideal rule isn’t hard and fast! It’s not pure science. There’s an art to it.

anilakar · 13h ago

But a 49" ultrawide is just two 27" monitors side by side. :-)

account42 · 12h ago

Better yet, it's three monitors with more reasonable aspect ratios side by side.

16:9 is rarely what you want for anything that is mainly text.

setopt · 15h ago

It’s tricky to find an objective optimum. Personally I’ve been happy with up to 100 chars per line (aim for 80 but some lines are just more readable without wrapping).

But someone will always have to either scroll horizontally or wrap the text. I’m speaking as someone who often views code on my phone, with a ~40 characters wide screen.

In typography, it’s well accepted that an average of ~66 chars per line increases readability of bulk text, with the theory being that short lines require you to mentally «jump» to the beginning of the next line frequently which interrupts flow, but long lines make it harder to mentally keep track of where you are in each line. There is however a difference between newspapers and books, since shorter ~40-char columns allows rapid skimming by moving your eyes down a column instead of zigzagging through the text.

But I don’t think these numbers translate directly to code, which is usually written with most lines indented (on the left) and most lines shorter than the maximum (few statements are so long). Depending on language, I could easily imagine a line length of 100 leading to an average of ~66 chars per line.
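If anyone wants to check this against their own code instead of guessing, here is a quick sketch in Python. The `line_stats` name and the tab width of 4 are my own assumptions; it ignores blank lines and reports both the full width and the width with indentation stripped.

```python
def line_stats(source, tab_width=4):
    """Summarize line widths of `source`, ignoring blank lines.

    Returns None when there is nothing to measure.
    """
    lines = [line.expandtabs(tab_width).rstrip() for line in source.splitlines()]
    lines = [line for line in lines if line]
    if not lines:
        return None
    widths = [len(line) for line in lines]            # indentation included
    content = [len(line.lstrip()) for line in lines]  # indentation stripped
    return {
        "max_width": max(widths),
        "mean_width": sum(widths) / len(widths),
        "mean_content": sum(content) / len(content),
    }


sample = "def f():\n    return 1\n\n"
print(line_stats(sample))
```

Comparing `max_width` with `mean_content` on a real file shows how far the hard limit sits above what the eye actually has to track per line.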

fmbb · 15h ago

> the theory being that short lines require you to mentally «jump» to the beginning of the next line frequently which interrupts flow, but long lines make it harder to mentally keep track of where you are in each line.

In my experience, with programming you rarely have lines of 140 printable characters. A lot of it is indentation. So it’s probably rarely a problem to find your way back on the next line.

forrestthewoods · 14h ago

I don’t think code is comparable. Reading code is far more stochastic than reading a novel.

For C/C++ headers I absolutely despise verbose doxygen bullshit comments spreading relatively straightforward functions across 10 lines of comments and args.

I want to be able to quickly skim function names and then read arguments only if deemed relevant. I don’t want to read every single word.

layer8 · 11h ago

100 is the sweet spot, IMO.

I like splitting long text as in log statements into appropriate source lines, just like you would a Markdown paragraph. As in:

    logger.info(
        "I like splitting long text as in log statements " +
        "into " + suitableAdjective + " source lines, " +
        "just like you would a Markdown paragraph. " +
        "As in: " + quine);

I agree that many formatters are bad about this, like introducing an indent for all but the first content line, or putting the concatenation operator in the front instead of the back, thereby also causing non-uniform alignment of the text content.

maleldil · 8h ago

Nitpick: this looks like Python. You don't need + to concatenate string literals. This is the type of thing a linter can catch.

Sohcahtoa82 · 1h ago

IMO, implicit string concatenation is a bug, not a feature.

I once made a stupid mistake of having a list of directories to delete:

    directories_to_delete = (
        "/some/dir"
        "/some/other/dir"
    )
    for dir in directories_to_delete:
        shutil.rmtree(dir)

Can you spot the error? I somehow forgot the comma in the list. That meant that rather than creating a tuple of directories, I created a single string. So when the `for` loop ran, it iterated on individual characters of the string. What was the first character? "/" of course.

I essentially did an `rm -rf /` because of the implicit concatenation.
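A harmless way to see the same behaviour without deleting anything (same paths as above, only printing):

```python
# Missing comma: the adjacent literals are joined into ONE string,
# and the parentheses are just grouping, not a tuple.
missing_comma = (
    "/some/dir"
    "/some/other/dir"
)

# With commas (including a trailing one) this is a tuple of two paths.
with_commas = (
    "/some/dir",
    "/some/other/dir",
)

print(type(missing_comma).__name__, repr(missing_comma))
print("first item when iterating:", repr(next(iter(missing_comma))))
print(type(with_commas).__name__, with_commas)
```

Iterating the first value yields single characters, starting with "/", which is exactly what the loop above handed to `shutil.rmtree`.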

layer8 · 2h ago

It’s actually Java, where the “+” is necessary.

saagarjha · 10h ago

This makes it really annoying to grep for log messages. I can't control what you do in your codebase but I will always argue against this in the ones I work on.

layer8 · 10h ago

I haven’t found this to be a problem in practice. You generally can’t grep for the complete message anyway due to inserted arguments. Picking a distinctive formulation from the log message virtually always does the trick. I do take care to not place line breaks in the middle of a semantic unit if possible.

saagarjha · 10h ago

Yes, I find the part of the message that doesn't have interpolated arguments in it. The problem is that the literal part of the string might be broken up across lines.
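To make it concrete, a small Python sketch. The log message and the search phrase are made up for illustration; the point is only that the phrase exists in the logged text but not in the source as written.

```python
# Source text of a log call whose message was split across lines
# (hypothetical example in the style shown above).
source = '''logger.info(
    "Connection to upstream " +
    "service timed out");'''

# What actually ends up in the log at runtime.
message = "Connection to upstream " + "service timed out"

# A phrase copied from the log that happens to straddle the split.
needle = "upstream service"

print(needle in message)  # the phrase is in the logged message
print(needle in source)   # but a plain text search of the source misses it
```

Searching for a shorter fragment on either side of the break still works, which is the workaround described above, but you have to guess where the break is.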

smokel · 14h ago

I've never understood why we still look at the plain text representation of code, and not a visualization of the code that makes more sense.

Note that, in my mind, this visualization is not automatically generated, but lovingly created by humans who wish their code to be understood by others. It is not separate from the code, as typical design documentation is, but an integral part of it, stored in metadata. Consider it an extension of variable and function naming.

There is of course "literate programming" [1], but somehow (improvements of) that never took off in larger systems.

[1] https://en.wikipedia.org/wiki/Literate_programming

AdieuToLogic · 14h ago

> I've never understood why we still look at the plain text representation of code, and not a visualization of the code that makes more sense.

My guess is it is the same reason why the most common form of creating source code is typing and not other readily available mechanisms:

  Semantic density

Graphical visualizations are approachable representations and very useful for introductory, infrequent, and/or summary needs. However, they become cumbersome when either a well-defined repetitive workflow is used or usage variations are not known a priori.

An example of both are the emacs and vi editors. The vast majority of supported commands are at most a few keystrokes and any programming language source code can be manipulated by them.

jraph · 14h ago

> I've never understood why we still look at the plain text representation of code, and not a visualization of the code that makes more sense.

I suppose this is because nobody has been able to create good tooling for it (the visualization itself, the efficient editing, etc). You'll have to deal with the text version of it at some point if not all tools that we rely on get a version for the new visualization.

Another hypothesis is that it might not matter this much that we work with text directly after all.

> Note that, in my mind, this visualization is not automatically generated, but lovingly created by humans who wish their code to be understood by others.

If you allow manual crafting there, I suspect you'll need some sort of linting too.

seer · 8h ago

Um, isn't that what Lisp and its children / siblings have been all about? I've written a bit of Clojure; it has a very clear idea that code is data and data is code. Your code is trivially serializable in your mind and by various tools, and because it is lisp - it all kinda makes sense.

I really wish we lived in a universe where a lisp became the lingua franca of the world instead of javascript, as almost happened with Netscape, but alas ...

jraph · 8h ago

The "code is data" aspect of lisp seems orthogonal to how code is still written as text, and btw lisp is still written using text. You still need to indent all these parentheses.

Virtually all programming languages are parsed into ASTs, and these ASTs can be serialized back. This is what formatters/"prettifiers" usually do.
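As a small illustration, here is that round trip using Python's standard `ast` module (assuming Python 3.9 or later, where `ast.unparse` exists). The `normalize` name is mine. Note that this particular AST does not keep comments, so real formatters have to do more work than this.

```python
import ast


def normalize(source):
    """Parse `source` into an AST and serialize it back to text.

    The original whitespace and line breaks are discarded along the way.
    """
    return ast.unparse(ast.parse(source))


messy = "x   =  [1,2,\n   3]\ndef f( a,b ):\n        return a+b\n"
tidy = normalize(messy)
print(tidy)

# Same tree before and after: only the textual layout changed.
print(ast.dump(ast.parse(messy)) == ast.dump(ast.parse(tidy)))
```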

Did I miss something?

rerdavies · 7h ago

    We must include the standard I/O definitions, since we want to
    send formatted output to stdout and stderr.
    <<Header files to include>>=
    #include <stdio.h>
    @

Not hard to see why nobody really embraced it. And not helped by the fact that it was published right around the time that best practice was switching toward "don't comment unless absolutely necessary".

VonGallifrey · 1h ago

> just make a choice, run the linter automatically and be done with it.

Most people probably do this. These types of discussions (probably) come up when someone else made the choice and other people also need to adhere to this choice. This is important for teams, but sometimes big egos don't want these choices made for them.

yoyohello13 · 4h ago

Same! I have no patience for these kinds of arguments about formatting. I don't care that you don't like what the formatter does, it isn't about you. I've written code in several different languages over the years and the main takeaway is that I can get used to reading anything. It's so important to pick a standard and follow it. As long as that standard is somewhat sane I couldn't care less what the actual standard is.

Another argument that is a pet peeve of mine is significant white-space vs curly braces. It literally doesn't matter. We often get new Python developers coming from a C# background and the amount of bitching about curly braces is so annoying. Just learn the language bro, it's not that hard.

anbotero · 3h ago

Those that complain:

I've worked with several Development Leads to actually define these. After the initial adjustment period, with everybody's local environment set up properly, no one ever spent time reviewing style and formatting on Pull Requests.

Just decide as a team, auto-apply if possible (less than 5 seconds for big changes), enforce, and be done with it. Stop wasting everybody's time because after weeks you cannot make up your mind about it and also don't tell your team/Lead about it.

ParetoOptimal · 5h ago

> I promise after a week you'll just get used to whatever format your team lands on.

Arthur Whitney formats like this:

    C vt[]="+{~<#,";
    A(*vd[])()={0,plus,from,find,0,rsh,cat},
     (*vm[])()={0,id,size,iota,box,sha,0};

If your code was formatted automatically like that, do you think you'd get used to it after a week?

My point is there is meaning in how code is formatted and there is an effect on understanding for certain people.

I think that at a certain point of "reasonable" and for most "normal" people your statements hold true, but I don't want anyone to think that every person caught up on formatting is just doing it for bike-shedding or other trivial reasons.

I don't know what is actionable if what I say is true, but it feels important to say.

fallpeak · 4h ago

The unreadability of that example has approximately nothing to do with code formatting, which is generally understood to refer to modifying the textual representation of the code while leaving the actual logic more or less unchanged. Can you propose some alternative whitespace or indentation scheme which would make that example significantly more readable?

rs186 · 10h ago

That is true if a set of good linting rules is set up, those that help discover errors or other code smells which are valid issues in 99% of cases, or pure formatting rules when there is no "correct" thing to do. Linting becomes a problem when it is opinionated and has questionable rationale to begin with, and stands in your way instead of helping you catch issues. Nobody should be fighting linting rules, but sadly that's what often happens.

See my other comment: https://news.ycombinator.com/item?id=45166670

mhh__ · 4h ago

Some styles can actively make some people less productive though, e.g. I really try to avoid Allman braces because I can work a lot better with denser code (for a certain definition of dense).

This, however, usually doesn't affect me if the official format for a project is one way or the other because [drumroll] I just format my tree differently and then format to the official style when I push.

torginus · 13h ago

The problem is that tools like ESLint often come with highly opinionated rules that might not even be applicable all of the time (leading to me having to manually turn them off via annotations).

And there's no centralized idea on best practices.

Cthulhu_ · 9h ago

ESLint is the centralized idea I suppose, but getting consensus is difficult.

When it comes to formatting, there's other languages (Go, Python?) that have clear, top-down guidelines applied by tooling, at least for code style. I think that's clever, and besides the odd mailing list post trying to change it because of a personal preference, it minimizes discussions about trivialities over the really important things.

Because 2 vs 4 spaces or line length discussions are ultimately futile; those aren't features, individual preferences don't matter. Codebases have millions of lines and thousands of developers; individual opinions do not matter at scale, consistency does.

HelloNurse · 13h ago

And best practices depend.

Recently, I discovered that the ruff linter for Python doesn't like the assert statement, because since it does nothing in "optimized" mode it isn't reliable. But such complaints about unit tests are not particularly useful.
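For anyone who hasn't seen the underlying behaviour, here is a small Python sketch. It uses the built-in `compile()` with its `optimize` argument to mimic what `python -O` does; the snippet being compiled and the `run` helper are made up for illustration.

```python
# A tiny program with an assert guarding a value.
SNIPPET = "assert value > 0, 'value must be positive'\nresult = 'reached'"


def run(optimize):
    """Execute SNIPPET with a bad value at the given optimization level."""
    namespace = {"value": -1}
    try:
        # optimize=0 keeps asserts; optimize=1 strips them, like `python -O`.
        exec(compile(SNIPPET, "<demo>", "exec", optimize=optimize), namespace)
    except AssertionError:
        return "AssertionError"
    return namespace["result"]


print(run(0))  # the assert fires
print(run(1))  # the assert is gone, so execution carries on
```

That is a fair warning for validation in production code; for test suites, which are normally not run with `-O`, it matters a lot less.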

mr_mitm · 13h ago

Rule S101 [1] is not in the default settings. If you choose to enable it, you have the possibility of disabling it for your tests like so:

    [tool.ruff.lint.per-file-ignores]
    "tests/*" = ["S101"]

(Besides, this was about formatting, not linting, but I realize it's related.)

[1] https://docs.astral.sh/ruff/rules/

__alexs · 13h ago

eslint is slow and has terrible UX. Use Biome instead.

torginus · 12h ago

Hi, I'm you from the future. Biome is slow and has terrible UX. Use Tokamak instead (/s)

__alexs · 11h ago

Biome has several advantages that make this future unlikely and evidence from similar attempts in other languages seems to support their direction. Such pessimism is unwarranted.

worldsayshi · 10h ago

I agree. Linters are one of the more frustrating aspects of modern dev. It's of such little relevance and yet it takes up a sizeable portion of my time when I'm going for a merge. Many editor/language combinations don't give automatic linting out of the box, and when they do I can bet that the rules they infer are different from what the CI pipeline infers.

duxup · 8h ago

I'm in the same boat. I have not run into any situations where someone's choice on formatting was bad enough that I couldn't read the code so ... just pick a format / standard and let's go. I'll get used to it if I'm not already.

Cthulhu_ · 12h ago

But (at least for a long time), "run the linter automatically" wasn't available, not until Go's gofmt put the idea into people's heads that they could leave it to a tool. I think there were some formatting tools before then, but e.g. jslint/eslint had a lot of gaps which I unfortunately ended up pointing out in code reviews a lot. Which was nitpicking / bikeshedding, in hindsight.

memset · 11h ago

Interestingly, for over 30 years, C has had “indent” https://www.gnu.org/software/indent/manual/indent.html

psychoslave · 10h ago

What are the defaults, though, as not everyone seems to agree with GNU coding style?

>First off, I’d suggest printing out a copy of the GNU coding standards, and NOT read it. Burn them, it’s a great symbolic gesture.

https://www.kernel.org/doc/html/v4.10/process/coding-style.h...

Bender · 8h ago

I can see why people prefer particular styles so it's easier to read but on that note with Perl it was just perltidy flags. I can run perltidy on any code anyone here writes and it's easy for me to read, then I can pass it back to whomever and they can run perltidy with their favorite flags and it's easy for them to read. It probably doesn't quite work this way with all languages. I would imagine python being less flexible in this regard.

patwolf · 7h ago

I went through this on a few projects, and what surprised me the most was that some devs have very strong opinions about import ordering. I mostly rely on the IDE to manage imports, and most of the time they're not even visible. We had to add a lot of prettier rules to get import orders just right.

scott_w · 11h ago

> It's so obviously bikeshedding

I think you just answered your own question ;-)

garbagepatch · 15h ago

> just make a choice

Now you are bikeshedding. Just go with the defaults.

wartijn_ · 15h ago

Which defaults? The programming languages I’ve worked with don’t have defaults for everything related to formatting. Editor defaults don’t work, since not everybody uses the same editor. So you have to make a choice somewhere.

swiftcoder · 14h ago

A lot of (relatively) recent languages do have defaults. Go and rust both come with an auto formatter out of the box, and defaults that are sane enough to just run with

wartijn_ · 12h ago

Ah yeah, in those cases it is possible to just use the defaults. Come to think of it, I have worked with Deno, which comes with a formatter (and linter and testing library) and I’m a fan. Saves a couple of dependencies, some config files and a bit of mental overhead when creating a new project.

stavros · 14h ago

I guess the GP means "use an opinionated formatter", I agree with both of you.

genericspammer · 10h ago

Please inform me what the defaults are for Java, C#, C++, C, Bash and Python?

sfn42 · 9h ago

As far as C# goes there's `dotnet format`. You can use it as is or provide an `.editorconfig` file to customize it.

sotix · 9h ago

A strong reason I enjoy Rust for collaboration is that it's so opinionated, it forces people to focus on solving real problems. I agree that bikeshedding over ESLint and Prettier configs is not a strong use of time.

forrestthewoods · 16h ago

I’ll go a step further.

I’ve never understood why people care so much about the linter. Just let people write code and don’t worry about the linter. I don’t need to fight a linter which makes my code worse when I could just write it in a way that doesn’t suck. I promise it’ll be fine. I’m too busy doing actual software engineering to care if code is not perfectly formatted to some arbitrary style specification.

I feel like style linters are horseshoe theory. Use them enough and eventually you wrap back around to just living without them.

ngruhn · 15h ago

Why is this flagged? I completely agree.

99% of the time the linter is not enforcing correctness, in my experience. It's just enforcing a bunch of subjective aesthetic constraints: which import order, max number of empty lines between statements, what type of string literal to use, no trailing white space, etc. A non-trivial part of my day is spent dealing with this giant catalog of dinner etiquette. Not all of it is auto-fixable. Also, there are plenty of situations where everyone would agree that violating the rule is necessary (e.g. "no use before define" but you need mutual recursion). Also, sometimes rules are circularly in conflict (e.g. you have to change a line but there is no way to do it without violating the max-line-length rule).

MrJohz · 14h ago

If your linter is enforcing subjective aesthetic constraints, then I'd argue it's not really a linter but a formatter at that point, and it should be automatically fixing all that stuff for you rather than have you do that manually. Things like import order, empty lines, white space etc can all be fixed automatically in most languages I've worked with.

Linters enforcing rules that need to be broken is a pet peeve of mine, and I agree with you there. Most linters allow for using comments to explicitly exclude certain lines from being linted. This should ~never be necessary. If it is regularly necessary, then either you're programming bad (always a possibility!) or the rule has too many false positives and you should remove it.

komali2 · 14h ago

Does your linter not have a fix command for things like import order? Does your editor not auto trim white space?

To be frank, everyone I've worked with that complained about the linter didn't know much about their tooling. They didn't know about the fix command (even though I put it in the readme and told them about it), they didn't know how to turn on lintfix and prettier on save, wouldn't switch on git hooks and didn't know their lint failed until GitHub said so, and none of the people like this were so productive that it made up for this trait.

skinner927 · 15h ago

The point of linters is so the code looks the same regardless of who wrote it. This way it’s easier to read. Some people have horrible style and linters really help.

I find linters make me faster. Sometimes I’m feeling lazy and I just want to pump out a bunch of lines of ugly code with mappings poorly formatted, bad indents, and just have it all synched up when I save.

carlosjobim · 9h ago

I see no reason to accommodate to worthless programmers who aren't able to read or format the code that's sent to them. They can lint it themselves if they want.

mindwok · 8h ago

It's not about accommodating people, it's about consistency in codebases when many people with different preferences or styles are working on it. You just eliminate the cognitive overhead so people can develop intuition about how the code works and flows.

carlosjobim · 6h ago

Good, let huge companies do that. Now this lint madness is pushed onto everybody who wants to program, including individual hobbyists. And many of them are trying to learn coding and get their compiling sabotaged by the linter, not knowing it's something they can opt out of.

tacitusarc · 6h ago

That may be true for you, but I have worked with plenty of devs who cannot even be consistent with naming conventions in a function, let alone throughout the application.

Don’t get me wrong: modern linters often annoy me and devs who spend a lot of time fiddling with those settings tend not to be very good programmers. But sometimes having guardrails is necessary.

pletnes · 16h ago

Some linters find issues you care about. Forgotten print statements or confusing indentations come to mind. I’ve worked with people who easily forget, and I’m one of them myself.
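
As a rough illustration of that kind of check, here is a minimal sketch of a lint-style rule that reports forgotten `print` calls without rewriting anything. The function name `find_print_calls` and the `SAMPLE` snippet are made up for this example; real linters implement far more than this, and the sketch only uses Python's standard `ast` module.

```python
import ast


def find_print_calls(source: str) -> list[int]:
    """Return the line numbers of calls to the bare name `print`."""
    tree = ast.parse(source)
    lines = []
    for node in ast.walk(tree):
        # Only real call expressions count; text inside strings or
        # comments never shows up as an ast.Call node.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "print"
        ):
            lines.append(node.lineno)
    return sorted(lines)


# Hypothetical input: one forgotten debug print, one harmless string.
SAMPLE = '''\
def total(items):
    print("debug:", items)  # forgotten debug output
    return sum(items)

message = "print(this) is only text inside a string"
'''

if __name__ == "__main__":
    for lineno in find_print_calls(SAMPLE):
        print(f"line {lineno}: print call found")
```

The check only reads the code and reports a location, which is the analysis side of the linter/formatter split discussed in this thread.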

thfuran · 8h ago

Everyone has their own opinion of what format doesn’t suck, so without a consistent code format, you’ll have to review diffs where fights over white space are mixed in with the meaningful change.

maratc · 10h ago

Agree.

There's a python linter named `black` and it converts my code:

    important_numbers = {
        "x": 3,
        "y": 42, # Answer to the Ultimate Question!
        "z": 2
    }

into this:

    important_numbers = {"x": 3, "y": 42, "z": 2}  # Answer to the Ultimate Question!

This `black` is non-configurable (because it's "opinionated") and yet, out of some strange cargo cult, people swear by it and try to impose it on everybody.

iainmerrick · 9h ago

This is the flip side of "I’ve never understood why people care so much about the linter"!

Why are you caring about formatting? Just write your code, get it working, let Black tidy it up in the standard way. Don't worry about the formatting.

In cases where you're annoyed about some choice the formatter makes, somebody else would be equally annoyed by the choice you would rather make. There is no perfect solution. The whole point is to have a reasonable, plausible default, and to automate it so that nobody has to spend any time thinking about it whatsoever.

Running a standard formatter when code is checked in minimizes the source control churn due to re-formatting. That churn is a pointless waste of time. If you don't run a standard formatter, I guarantee that badly-formatted code will make it into source control, and that's annoying.

maratc · 9h ago

I may be unusual in the way I treat my profession and care about my professional output (the code I write), and I take both very seriously.

There's a quote from Steve Jobs (or maybe his carpenter father):

    “When you’re a carpenter making a beautiful chest of drawers, you’re not going to use a piece of plywood on the back, even though it faces the wall and nobody will ever see it. You’ll know it’s there, so you’re going to use a beautiful piece of wood on the back. For you to sleep well at night, the aesthetic, the quality, has to be carried all the way through.”

When you say "Don't worry about the formatting", what you're saying is "use a piece of plywood on the back," and I'm just not going to do that.

iainmerrick · 7h ago

I don't think we'll ever fully agree, but I'd just like to clarify that I value that kind of craftsmanship too!

I just honestly believe that if you fully automate the formatting, the results are better than if you do it painstakingly by hand; better by virtue of being more consistent. It's using the right tool for the job.

tacitusarc · 6h ago

Did you read the example pietnas gave? The changed formatting ruined the communicative intent of his code. Formatters do that a lot, and it makes the code unambiguously worse.

I don’t really care about whether the back is plywood or whatever. I don’t know how to write plywood code. I do care about creating clear, readable code that communicates my intent. Sometimes formatters help with that. Often they hinder, as they reflect the arbitrary aesthetic preferences of their creators.

thfuran · 8h ago

That’s an obviously terrible formatting change. A format that prevents scoping comments narrowly is absurd. Why not just tuck all the inline comments at the end of the file so the code is denser while we’re at it?

iainmerrick · 7h ago

It works the way you want if you add a trailing comma:

  important_numbers = {
    "x": 3,
    "y": 42,  # Answer to the Ultimate Question!
    "z": 2,
  }

You might complain that that seems a bit obscure, but it only took me 10 or 20 seconds to discover it after pasting the original code snippet into an editor.

The trailing comma is an improvement as it makes the diff clearer on future edits.

Edit to add: occurs to me that I oversimplified my position earlier and it probably looks like I'm trying to have it both ways. I do advocate aiming for clean and clear formatting; I'm just against doing this manually. You should instead use automation, and steer it lightly only when you have to.

For example, I explicitly don't want people to manually "tab-align" columns in their code. It looks nice, sure, but it'll inevitably get messed up in future edits. Better to do something simpler and more robust.

maratc · 5h ago

The trailing comma communicates an intent of possibly adding more things in the future. I actually use it quite a lot -- when I have that intent.

In the above example, if I think I have listed all of the `important_numbers`, there is a certain point of not having the trailing comma there.

Here's another terrible example from `black`:

From this:

    my_print(f"This string has two parameters, `a` which is equal to {a} and `b` which is equal to {b}", 
        a=1, b=2)

To this:

    my_print(
        f"This string has two parameters, `a` which is equal to {a} and `b` which is equal to {b}",
        a=1,
        b=2,
    )

The trailing comma it added makes no sense whatsoever because I can not have an intent of adding more things -- I've already exhausted the parameters in the string!

On the top of it, I don't quite get why I need to change the way I write in order to please the machine. Who should be serving whom?

Edit: changed "print" to "my_print" to not have to argue about named parameters of print ("sep", "file" etc.).

Edit 2: here's a variant that `black` has no issues with whatsoever. It does not suggest a trailing comma or any other change:

    my_print(f"This string has two params, `a` which is {a} and `b` which is {b}", a=1, b=2)

So an existence of a trailing comma is a product of string length?

lenzm · 4h ago

Yes, it gets a trailing comma if it's on its own line. That way when you add/remove arguments in a multi-line call it's only a one-line diff. This doesn't apply when the diff is only one line anyway.

Who's to say you don't add a new argument to the function in the future, like

    my_print(
        "This string has two parameters, `a` which is equal to {a} and `b` which is equal to {b}",
        a=1,
        b=2,
        color_negative_red=True,
    )

maratc · 3h ago

> it gets a trailing comma if it's on its own line.

Sorry but it doesn't make any sense to me. If your argument is "a trailing comma is a good thing," it should go into any and all function calls/list declarations/etc. Who's to say I won't add this in the future:

    my_print("a={a}, b={b}", a=1, b=2, color_negative_red=True)

So do I need to have this now?

    my_print("a={a}, b={b}", a=1, b=2,)

There's a very responsive playground at https://black.vercel.app/ and whatever it does looks strange to me, because the underlying assumptions look inconsistent one with the other (to my eye at least.) Specifically, "the length of the string should decide whether there is a trailing comma or there isn't" makes zero sense.

thfuran · 3h ago

> Sorry but it doesn't make any sense to me. If your argument is "a trailing comma is a good thing," it should go into any and all function calls/list declarations/etc.

No, the argument is quite specifically that a one line diff to add a new argument/element to the end of a list is preferable to a two line diff to do the same thing. The presence of the trailing comma is necessary to achieve that only when elements are on their own line.
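
The diff-size claim can be checked with a small sketch using the standard `difflib` module. The helper name `diff_stats` and the sample dictionaries are invented for illustration; the sketch counts removed and added lines when one element is appended to a multi-line literal, with and without a trailing comma.

```python
import difflib


def diff_stats(before: str, after: str) -> tuple[int, int]:
    """Return (removed, added) line counts from a unified diff."""
    removed = added = 0
    diff = difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm=""
    )
    for line in diff:
        if line.startswith("---") or line.startswith("+++"):
            continue  # file header lines, not content
        if line.startswith("-"):
            removed += 1
        elif line.startswith("+"):
            added += 1
    return removed, added


# Last element has no trailing comma, so appending must also edit it.
NO_COMMA_BEFORE = 'numbers = {\n    "x": 3,\n    "y": 42\n}\n'
NO_COMMA_AFTER = 'numbers = {\n    "x": 3,\n    "y": 42,\n    "z": 2\n}\n'

# Last element already ends with a comma, so appending is one new line.
COMMA_BEFORE = 'numbers = {\n    "x": 3,\n    "y": 42,\n}\n'
COMMA_AFTER = 'numbers = {\n    "x": 3,\n    "y": 42,\n    "z": 2,\n}\n'

if __name__ == "__main__":
    print("without trailing comma:", diff_stats(NO_COMMA_BEFORE, NO_COMMA_AFTER))
    print("with trailing comma:   ", diff_stats(COMMA_BEFORE, COMMA_AFTER))
```

Without the trailing comma the existing `"y": 42` line is removed and re-added next to the new line; with it, only the new line appears in the diff. When everything sits on one line, the whole line changes either way, which is why the argument only applies to one-element-per-line layouts.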

maratc · 3h ago

Ok, we're then back to `print` example:

    print(
        'Hello there from a very long line abcdefghijklmnopqrstuvwxyz',
        sep=' ', 
        end='\n', 
        file=None, 
        flush=False,
    )

All of the existing named parameters to `print()` function are already provided, and that standard function is highly unlikely to change. Should I add another string to `print`, I will have to do it before the named parameters anyway. There is no sense in the trailing comma here however you look at it.

Edit: sorry for using single quotes, in my 20 years of writing Python it was never an issue, but now with `black` it apparently is.

Revisional_Sin · 3h ago

Putting a trailing comma stops that.

maratc · 2h ago

https://news.ycombinator.com/item?id=45169331

raincole · 15h ago

People like you are the exact reason why linter is a thing.

forrestthewoods · 14h ago

Don’t be rude.

I write perfectly legible code. More legible than a linter, in fact. Because the rules for what is ideal are not so simple as to be encoded in simple lint rules. Sure it gets like 95%. But the last 5% is so bad it ruins the positives.

If your goal is “code that is easy to read and understand” then a linter is only maybe the first 20%. Lots of well linted code is thoroughly inscrutable.

raincole · 12h ago

> I write perfectly legible code

I 100% believe you. And for god's sake please use linter.

British and American spelling are both 100% legible English. But when multiple people coauthor a book, they should stick to one instead of letting each author use their favorite spelling.

stavros · 14h ago

I disagree. It gets you 95%, and do you know how many people are better than that? One in twenty.

I'll gladly pay the price of making the one person's code worse if it improves the other nineteen's.

genericspammer · 10h ago

I'm sure you write very readable code, but in most companies, there are a bunch of devs who completely rape the codebase with unintelligible bullshit. The linter is the first line of defense against these bozos; unfortunately it must be enforced company wide.

jwilber · 16h ago

But what’s the issue? Setting lint rules is one and done - running pre-commit can be made automatic?

deadbabe · 12h ago

It’s more of a political thing. Controlling the linter is the first step of kingdom building.

einpoklum · 13h ago

If you ride a bike every day, bike sheds are rather important.

If you write and edit and read and search code every day, code formatting is rather important.

genericspammer · 10h ago

The point is that in the larger picture there are many more important topics with higher impact to focus on. The company won't make much more money by having consistently formatted code, compared to putting that energy towards new features.

onion2k · 10h ago

Consistency is important because it helps you pattern match.

What the pattern is doesn't really matter.

DonHopkins · 9h ago

You're missing what the bike shedding metaphor is about. It's not about having bike sheds or not, it's about coloring bike sheds, which every day bike riders in their right mind really don't give a shit about, because it doesn't affect their life in any tangible way.

Moomoomoo309 · 9h ago

No, the original metaphor is they were planning to build a nuclear reactor and they spent significantly more time than expected on the details of the bike shed because it was simple to understand and change, unlike the details of the reactor which were complex and required expertise and had lots of constraints. Who cares what color the bike shed is, we're building a nuclear reactor here!

anonymars · 8h ago

"Sigh, this guy is pedantically missing the...oh"

Took me a sec, but well played

xpe · 10h ago

I suggest rephrasing as a series of questions:

1. Assuming at least one person who cares about linter settings isn't utterly confused or moronic, what are their self-described reasons why they care? People's work styles, brains, and even sensory perception differ in some important ways!

2. As freedom-loving developers [1] who want to make our own choices to help our own styles of work, why should we even have to care about "enforcing" one standard for something that isn't really necessary? This one-standard-per-project thing is a downstream result of a design decision upstream (storing source code as plain text).

3. How should we design languages going forward? This brings the conversation back to top-level post (which is why we're here -- to think about what languages could be, not to rehash tired old debates, after all): how can we take what we've learned and build better languages -- perhaps ones where the primary source of truth for source code is not plain text?

[1] Slightly tongue-in-cheek. It is one thing to want to have freedom to do our jobs well, it is another thing to turn this into advocacy for an overarching system such as a political philosophy or various decentralized financial mechanisms and so on. Here, I'm merely referring to the "let me do my job in the way that actually works for my brain" sense.

kelseyfrog · 21h ago

The tradeoff here is not being able to use a universal set of tooling to interact with source files. Anything but text makes grep, diff, sed, and version control less effective. You end up locked into specialized tools, formats, or IDE extensions, while the Unix philosophy thrives on composability with plain text.

There's a scissor that cuts through the formatting debate: If initial space width was configurable in their editor of choice, would those who prefer tabs have any other arguments?

gr__or · 11h ago

Text surely is a hill, but I believe it's a local one, we got stuck on due to our short-sighted inability to go into a valley for a few miles until we find the (projectional) mountain.

All of your examples work better for code with structural knowledge:

- grep: symbol search (I use it about 100x as often as a text grep) or https://github.com/ast-grep/ast-grep

- diff: https://semanticdiff.com (and others), i.e.: hide noisy syntax only changes, attempt to capture moved code. I say attempt, because with projectional programming we could have a more expressive notion of code being moved

- sed: https://npmjs.com/package/@codemod/cli

- version control: I'd look towards languages like Unison to see what funky things we could do here, especially for libraries. A general example: no conflicts due to non-semantic changes (re-orderings, irrelevant whitespaces, etc.)
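
To make the grep versus symbol-search contrast concrete, here is a minimal sketch using only Python's standard `ast` module. It is not how ast-grep or any of the linked tools work internally; `text_matches`, `call_sites`, and the `SOURCE` snippet are hypothetical names chosen for this example.

```python
import ast

# Hypothetical file: `load` appears in a definition, a comment,
# a string, and exactly one real call.
SOURCE = '''\
def load(path):
    return path

def main():
    # load is mentioned in this comment
    note = "call load() later"
    return load("config.toml")
'''


def text_matches(source: str, name: str) -> list[int]:
    """grep-style search: every line whose text contains `name`."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if name in line
    ]


def call_sites(source: str, name: str) -> list[int]:
    """Structural search: only lines where `name` is actually called."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
    )


if __name__ == "__main__":
    print("text search:", text_matches(SOURCE, "load"))
    print("call sites: ", call_sites(SOURCE, "load"))
```

The text search reports the definition, the comment, and the string as well as the call, while the structural search reports only the call. Note that both operate on the same plain-text file.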

gorgoiler · 7h ago

I feel it’s important to stick up for the difference between text and code. The two overlap a lot, but not all text is code, even if most code is text.

It’s a really subtle difference but I can’t quite put my finger on why it is important. I think of all the little text files I’ve made over the decades that record information in various different ways where the only real syntax they share is that they use short lines (80 columns) and use line orientation for semantics (lah-dee-dah way of saying lots of lists!)

I have a lot of experience of being firmly ensconced in software engineering environments where the only resources being authored and edited were source code files.

But I’ve also had a lot of experience of the kind of admin / project / clerical work where you make up files as you go along. Teaching in a high school was a great place to practice that kind of thing.

zokier · 11h ago

But as the tools you link demonstrate, having "text" as the on-disk format does not preclude AST based (or even smarter) tools. So there is little benefit in having non-text format. Ultimately it's all just bytes on disk

gr__or · 6h ago

Even that is not without its cost. Most of these tools are written in different languages, which all have to maintain their own parsers, which have to keep up with language changes.

And there are abilities we lose completely by making text the source of truth, like a reliable version control for "this function moved to a new file".

theamk · 5h ago

At least the parsers are optional now - you can still grep, diff, etc.. even if your tools have no idea about language's semantics.

But if you store ASTs, you _have_ to have the support of each of the language for each of the tools (because each language has its own AST). This basically means a major chicken-and-egg problem - a new language won't be compatible with any of the tools, so the adoption will be very low until the editor, diff, sed etc.. are all updated.. and those tools won't be updated until the language is popular.

And you still don't get any advantages over text! For example, if you really cared about "this function moved to new file" functionality, you could have unique id after each function ("def myfunc{f8fa2bdd}..."), and insert/hide them in your editor. This way the IDE can show nice definition, but grep/git etc.. still work but with extra noise.
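
A minimal sketch of that idea, assuming the hypothetical `def name{id}` marker syntax from the example above (it is not valid Python and no existing tool is known to use it). The names `MARKER`, `display_text`, and `function_ids` are invented here; the point is only that an editor could hide the ids while other tools match functions across renames by id.

```python
import re

# Matches the hypothetical marker syntax: def <name>{<8 hex digits>}
MARKER = re.compile(r"\b(def\s+(\w+))\{([0-9a-f]{8})\}")


def display_text(stored: str) -> str:
    """What an editor would show: the stored text with ids hidden."""
    return MARKER.sub(r"\1", stored)


def function_ids(stored: str) -> dict[str, str]:
    """Map each stable id to the function name it is attached to."""
    return {m.group(3): m.group(2) for m in MARKER.finditer(stored)}


# The same function before and after a rename (or a move to another file).
OLD_FILE = "def myfunc{f8fa2bdd}(x):\n    return x + 1\n"
NEW_FILE = "def renamed{f8fa2bdd}(x):\n    return x + 1\n"

if __name__ == "__main__":
    print(display_text(OLD_FILE))
    old_ids, new_ids = function_ids(OLD_FILE), function_ids(NEW_FILE)
    for ident in old_ids.keys() & new_ids.keys():
        print(f"{old_ids[ident]} -> {new_ids[ident]} (id {ident})")
```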

In fact, I bet that any technology that people claim requires non-readable AST files, can be implemented as text for many extra upsides and no major downsides (with the obvious exception of truly graphical things - naive diffs on auto-generated images, graphs or schematics files are not going to be very useful, no matter what kind of text format is used)

Want to have each person see their own formatting style? Reformat to the person's style on load and format back to project style on save. Modern formatters are so fast, people won't even notice this.

Want fast semantic search? Maintain the binary cache files, but use text as source-of-truth.

Want better diff output? Same deal, parse and cache.
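
One way to picture the "parse" half of that, as a sketch only: compare the parsed structure of two versions of a text file to decide whether a change is formatting-only. This uses Python's standard `ast` module; the helper name `same_structure` and the sample strings are assumptions for illustration, and caching is left out.

```python
import ast


def same_structure(old: str, new: str) -> bool:
    """True when both sources parse to the same syntax tree.

    ast.dump omits line and column attributes by default, so layout,
    quoting style, and trailing commas do not affect the comparison.
    Python's ast also discards comments, so comment edits are invisible.
    """
    return ast.dump(ast.parse(old)) == ast.dump(ast.parse(new))


ONE_LINE = 'important_numbers = {"x": 3, "y": 42, "z": 2}\n'

MULTI_LINE = '''\
important_numbers = {
    'x': 3,
    'y': 42,
    'z': 2,
}
'''

VALUE_CHANGED = 'important_numbers = {"x": 3, "y": 43, "z": 2}\n'

if __name__ == "__main__":
    print("reformatted only:", same_structure(ONE_LINE, MULTI_LINE))
    print("value changed:   ", same_structure(ONE_LINE, VALUE_CHANGED))
```

A diff tool built this way could hide the first kind of change and show only the second, while the file on disk stays ordinary text.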

Want to have no files, but instead have function list and edit each one directly, a la Smalltalk? Maintain files transparently with text code - maybe one file per function, or one file per class, or one per project...

The reason people keep source code as text is that it's really a global maximum. The non-text format gives you a modest speedup, but at the expense of imposing incredible version compatibility pain.

gr__or · 4h ago

The complexity of a parser is orders of magnitude higher than that of an AST schema.

I'm also not saying we can't have all these good things, but they are not free, and the costs are more spread out and thus less obviously noticeable than the ones projectional code imposes.

theamk · 3h ago

Are you talking about runtime complexity or programming-time complexity?

If the runtime, then I bet almost no one will notice, especially if the appropriate caching is used.

If the programming-time - sure, but it's not like you can avoid parsers altogether. If the parsers are not in the tools, they must be in IDE. Factor out that parsing logic, and make it a library all the tools can use (or a one-shot LSP server if you are in the language that has hard-to-use bindings).

Note even with AST-in-file approach, you _still_ need the library to read and write that AST, it's not like you can have a shared AST schema for multiple languages. So either way, tools like diff will need to have a wide variety of libraries linked in, one for each language they support. And at that point, there is not much difference between AST reader and code parser.

gr__or · 1h ago

I meant programming-time, but runtime is also a good point.

Cross-language libraries don't seem to be super common for this. The recovering-sense-from-text tools I named all use different parsers in their respective languages.

Again, reading (and yes, technically that's also parsing) an AST from a data-exchange formatted file is magnitudes simpler. And for parsing these schemas there are battle-tested cross-language solutions, e.g. protobuf.

rafaelmn · 5h ago

Why even have a database - let's just keep the data in CSVs, we can grep it easily, it's all bytes on a disk.

jrochkind1 · 7h ago

So there was an era, as the OP says, where your arguments were popular and believed and it was understood that things would move in this direction.

And yet it didn't, it reversed. I think the fact that "plain text for all source files" actually won in the actual ecosystem wasn't just because too many developers had the wrong idea/short-sightedness -- because in fact most influential people wanted and believed in what you say. It's because there are real factors that make the level of investment required for the other paths unsustainable, at least compared to the text source path.

It's definitely related to the "victory" of unix and unix-style OSs. Which is often understood as the victory of a philosophy of doing it cheaper, easier, simpler, faster, "good enough".

It's also got to do with how often languages and platforms change -- both change within a language/platform and languages/platforms rising and falling. Sometimes I wish this was less quick, I'm definitely a guy who wants to develop real expertise with a system by using it over a long time, and think you can work so much more effectively and productively when you have done such. But the actual speed of change of platforms and languages we see depends on reduced cost of tooling.

gr__or · 6h ago

For me, that's what "short-sighted inability" means. The business ecosystem we have does not have the attention span for this kind of project. What we need is individuals grouping together against the gradient of incentives (which is hard indeed).

Tooster · 10h ago

I’d also add:

* [Difftastic](https://difftastic.wilfred.me.uk/): my go-to diff tool for years

* [Nu shell](https://www.nushell.sh/): a promising idea, but still lacking in design/implementation maturity

What I’d really like to see is a *viable projectional editor* and a broader shift from text-centric to data-centric tools.

The issue is that nearly everything we use today (editors, IDEs, coreutils) is built around text, and there’s no agreed-upon data interchange format. There have been attempts (Unison, JetBrains MCP, Nu shell), but none have gained real traction.

Rare “miracles” like the C++ --> Rust migration show paradigm shifts can happen. But a text → projectional transition would be even bigger. For that to succeed, someone influential would need to offer a *clear, opt-in migration path* where:

* some people stick with text-based tools,

* others move to semantic model editing,

* and both can interoperate in the same codebase.

What would be needed:

* Robust, data-native alternatives to [coreutils](https://wiki.archlinux.org/title/Core_utilities) operating directly on structured data (avoid serialize ↔ parse boundaries). Learn from Nushell’s mistakes, and aim for future-compatible, stable, battle-tested tools.

* A more declarative-first mindset.

* Strong theoretical foundations for the new paradigm.

* Seamless conversion between text-based and semantic models.

* New tools that work with mainstream languages (not niche reinventions), and enforce correctness at construction time (no invalid programs).

* Integration of the semantic model with existing version control systems.

* Shared standards for semantic models across languages/tools (something on the scale of MCP or LSP; JetBrains’ are better, but LSP won thanks to Microsoft’s push).

* Dual compatibility in existing editors/IDEs (e.g. VSCode supporting both text files and semantic models).

* Integrate knowledge across many different projects to distill the best way forward: for example, learn from Roslyn's semantic vs syntax model, look into tree sitter, check how difftastic does tree diffing, find tree regex engines, learn from S-expressions and LISP-like languages, check unison, adopt the helix editor/vim editing model, see how it can be integrated with LSP and MCP, etc.

This isn’t something you can brute-force — it needs careful planning and design before implementation. The train started on text rails and won’t stop, so the only way forward is to *build an alternative track* and make switching both gradual and worthwhile. Unfortunately it is pretty impossible to do for an entity without enough influence.

zokier · 9h ago

But almost every editor worth its salt these days has structural editing.

https://docs.helix-editor.com/syntax-aware-motions.html

https://www.masteringemacs.org/article/combobulate-structure...

https://zed.dev/blog/syntax-aware-editing

Etc etc.

Tooster · 7h ago

And that's a great thing! I look forward to them being more mature and more widely adopted, as I have tried both zed and helix, and for the day to day work they are not yet there. They need that to gain traction, though. Neither of them, however, intends to be a projectional editor as far as I am aware. As for the vims or emacs out there, I don't think they are mainstream tools which can tip the scale. Even now vim is considered a niche, quirky editor with a very high barrier of entry. And still, they operate primarily on text.

Without tools in mainstream editors I don't see how it can push us forward instead of staying a niche barely anyone knows about.

jsharpe · 21h ago

Exactly. This idea comes up time and time again, but the cost/benefit just doesn't make sense at all. You're adding an unbelievable amount of complex tooling just to avoid running a simple formatter.

The goal of having every developer viewing the code with their own preferences just isn't that important. On every team I've been on, we just use a standard style guide, enforced by formatter, and while not everyone agrees with every rule, it just doesn't matter. You get used to it.

Arguing and obsessing about code formatting is simply useless bikeshedding.

scubbo · 20h ago

I disagree with almost every choice made by the Go language designers, but `Gofmt's style is no one's favorite, yet gofmt is everyone's favorite` is solid. Pick a not-unreasonable standard, enforce it, and move on to more important things.

spyspy · 20h ago

My only complaint about gofmt is that it’s not even stricter about some things.

duskwuff · 17h ago

Good news: there are tools like https://github.com/mvdan/gofumpt which fork gofmt and enforce stricter rules (while remaining invariant under gofmt).

rbits · 19h ago

Yeah it would probably be a waste of time. It's a nice idea to dream about though. It would be nice to be able to look at some C# code and not have opening curly brackets on a separate line.

mdaniel · 15h ago

I say this fully cognizant of the thread in which it's posted, but these people are sick

https://astyle.sourceforge.net/astyle.html#_style=whitesmith

And then someone said: oh yeah? Hold my beer https://astyle.sourceforge.net/astyle.html#_style=pico

masklinn · 13h ago

These are beginner horrors. For me nothing beats the insanity of the gnu style: https://astyle.sourceforge.net/astyle.html#_style=gnu

Buttons840 · 17h ago

> Arguing and obsessing about code formatting is simply useless bikeshedding.

Unless it's an accessibility issue, and it is an accessibility issue sometimes.

mmastrac · 17h ago

Maybe if you use 16-wide tabs or a 40 character line length.

raspasov · 15h ago

>> The goal of having every developer viewing the code with their own preferences just isn't that important.

Bah! So, what is more important? Is the average convenience of the herd more important? Average of the convenience, even if there was ever such a thing.

What if you really liked reading books in paper format, but were forced to read them on displays for... reasons?

rendaw · 15h ago

Grep, diff, sed, and line-based non-semantic merge are all terrible tools for manipulating code... rather than dig ourselves in even further with those, maybe this is a reason to come up with something better.

accelbred · 20h ago

What if the common intermediate encoding is text, not binary? Then grep/diff/sed all still work.

If we had a formatting tool that operated solely on the AST, checked-in code could be in a canonical form for a given AST. Editors could then parse the AST and display the source with a different formatting of the user's choice, and convert to canonical form when writing the file to disk.
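To make the "canonical form for a given AST" idea concrete, here is a minimal sketch using Python's standard `ast` module (Python 3.9+). The function name `canonical` is just for illustration, and this is not how any real formatter works: `ast.unparse` throws away comments and blank lines, so a real tool would need a richer tree.

```python
import ast


def canonical(source: str) -> str:
    """Parse source into an AST and print it back in one fixed layout.

    Assumption: losing comments and blank lines is acceptable for the sketch.
    """
    return ast.unparse(ast.parse(source)) + "\n"


# Two layouts of the same program.
a = "x = {  'k':1,\n   'j' : 2 }\n"
b = "x = {'k': 1, 'j': 2}\n"

# Both collapse to the same text, which is what would be checked in.
print(canonical(a) == canonical(b))
```

An editor would then do the reverse step on load: parse the canonical text and render it with whatever layout the user prefers.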

pmontra · 13h ago

All mainstream editors agreeing to work on a standard AST for any given language would be nice. I'm not expecting that to happen any time in the future.

About grep and diff working on a textual representation of the AST, it would be like grepping on JavaScript source code when the actual source code is TypeScript or some other more distant language that compiles to JavaScript (does anybody remember CoffeeScript?). We want to see only the source code we typed in.

By the way, add git diff to the list of tools that should work on the AST but show us the real source code.

sublinear · 20h ago

Nobody wants to have to run their own formatter rules in reverse in their head just to know what to grep for. That defeats the point of formatting at all.

pwdisswordfishz · 17h ago

That's why you grep for a syntactic structure, not undifferentiated text.
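A rough sketch of what "grep for a syntactic structure" could look like, using only Python's standard `ast` module. The helper `find_calls` and the sample source are made up for illustration; real structural search tools are far more general than this.

```python
import ast


def find_calls(source: str, name: str) -> list[int]:
    """Return the line numbers of calls to `name`, regardless of layout."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
    )


sample = '''\
log = "fetch(url) is only text in a string"
result = fetch(
    url,
)
other = prefetch(url)
'''

print(find_calls(sample, "fetch"))  # → [2]
```

A plain-text search for `fetch(url` would hit the string on line 1 and `prefetch(url)` on line 5 while missing the real call, which is split across lines. The structural search finds only the call on line 2.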

michaelmrose · 16h ago

Which grep doesn't do, so you need to use a new, different tool (or more likely several) for little real benefit.

hnlmorg · 14h ago

grep is half a century old now.

If we can’t progress our ecosystem because we are reliant on one very specific 50+ year old line parser, then that says more about the inflexibility of the industry to move forward than it does about the “new” ideas being presented.

account42 · 12h ago

We still use grep because it's useful. And it's useful precisely because it doesn't depend on syntax, so it will work on anything text-based.

hnlmorg · 11m ago

grep is great. My point isn’t that we shouldn’t use it. My point is that we shouldn’t be held back by it.

komali2 · 13h ago

The things all being described are way beyond non trivial to solve, and they'd need to be solved for every language.

Grep works great.

hnlmorg · 2m ago

Actually no.

If languages compile to a common byte code then you just need one tool. You already see examples of this with things like the IR assembly produced by LLVM, various Microsoft languages that compile to CLR, and the different languages that target JVM.

There are also already common ways to create reusable parsing rules like LSP for IDEs and treesitter.

In fact there are already grep-like utilities that are based on treesitter.

So it’s not only very possible to create reusable tools for different languages; these tools already exist and are being used by a great many developers.

> Grep works great

For LF-separated lists it does. But if it worked great for structured content then we wouldn’t be having this conversation to begin with.

jitl · 16h ago

comby is fantastic, give it a shot. It’s saved me huge amounts of time.

theamk · 5h ago

You'd need all-new tools for the non-text world as well.

So the real choice is either:

- new tool: grep with caching reverse-formatter filter.

- new tool: ast-grep with understanding of AST serialization format for your specific language.

At least in the first case, you still have a fallback.

Avshalom · 21h ago

The entire OS was built around these source files.

the unix philosophy on the other hand only "thrives" if every other tool is designed around (and contains code to parse) "plain text"

lmm · 19h ago

> The entire OS was built around these source files.

And how did that work out for them?

This seems like one of the many cases where unix won out by being a lowest common denominator. Every platform can handle plain text.

account42 · 12h ago

Not all platforms come with powerful text handling tools out of the box - or at least they didn't use to until Unix-based systems forced them to catch up.

aleph_minus_one · 12h ago

> This seems like one of the many cases where unix won out by being a lowest common denominator.

The lowest common denominator rather is binary blobs. :-)

thfuran · 7h ago

The conversion of which to text and back has historically proven rather fraught.

MyOutfitIsVague · 15h ago

The way I envision this working is with something like git filters. Checking out from version control converts it all into text in your preferred formatting, which you then work with as expected. Staging it converts it into the stored representation. In git, this would be done with smudge and clean filters, like how git LFS works. You'd also have viewers for forges and the like that are built to interpret all the stored representations as needed.

You still work with text, the text just isn't the canonical stored representation. You get diffs to resolve only when structure is changed.

You get most of the same benefit with a pre-commit linter hook, though.
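The "diffs only when structure is changed" part can be sketched like this in Python. It is only an illustration: `same_structure` is a hypothetical helper, and a real clean/smudge setup would plug an actual formatter into git rather than compare dumps.

```python
import ast


def same_structure(old: str, new: str) -> bool:
    """True when two sources differ only in layout, not in syntax tree.

    ast.dump() leaves out line and column information by default,
    so whitespace and line breaks do not affect the comparison.
    """
    return ast.dump(ast.parse(old)) == ast.dump(ast.parse(new))


# Reformatting only: no diff to resolve.
print(same_structure("f(a,b)", "f(\n    a,\n    b,\n)"))  # → True

# Argument order changed: a real diff.
print(same_structure("f(a, b)", "f(b, a)"))  # → False
```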

zokier · 9h ago

The problem is that there is little benefit in not having the canonical stored representation be text. The crucial thing is to have some canonical representation but it might as well be human readable.

bapak · 12h ago

This is it; unfortunately git is "too dumb" for this. In order to merge code, it would have to understand the AST.

What happens when you stage the line `} else return {`? git doesn't let you stage specific AST nodes. It would also mean that you can't stage partial code (that produces syntax errors).

zokier · 9h ago

Git can use arbitrary merge (and diff) tools. Something like https://mergiraf.org/introduction.html works with git and gets you AST-aware merging. Do not underestimate git's flexibility.

Hendrikto · 11h ago

Smudge and clean filters work on text, git would not need to change at all.

You would still store text, and still check out text, just transformed text. You could still check in anything you want, including partial code, syntax errors, or any other arbitrary text. Diffs would work the same way they do now.

account42 · 12h ago

Please no, git trying to automatically "correct" \n vs \r\n line endings is already horrible enough. At least you can turn that off.

danielheath · 16h ago

If you’re going to store the source in a canonical format and unpack that to suit each developer… why shouldn't the canonical format just be regular source code?

All the same tools can exist with a text backend, and you get grep/sed support for free too!

psychoslave · 13h ago

That seems like a genius remark, actually. If you store the abstract objects and have the mechanism to transform them into whatever the desired output form is, it’s almost trivial to expose a version as files and text rendering for tools that are thus oriented, isn’t it?

giveita · 16h ago

My grep may not work on your settings for the same code.

This becomes an issue with say CI where maybe I add a gate to check something with grep. But whose format do I assume? My local (that I used to test it locally) or the canonical (which means I need to switch local format to test it)?

brabel · 15h ago

You really rely on grep on CI? How fragile is that ?! This is a good argument for storing non-text. Grepping code is laughably unreliable. The only way to write things like that reliably is by actually parsing the code and working in its AST. Working in text is like writing code in a completely untyped language. It can be done, but it’s beyond stupid for anything where accuracy matters.

treadmill · 15h ago

You're misunderstanding the idea I think.

You would use the format on disk for the grep. "Your format" only exists displayed in your editor.

giveita · 10h ago

Aha

eviks · 16h ago

> If initial space width was configurable in their editor of choice, would those who prefer tabs have any other arguments?

Yes, of course, because tab width is *dynamically* flexible, so initial space width isn't enough

pasc1878 · 9h ago

Yes, because if you want to deindent with tabs you just delete one character, whilst spaces require you to delete x characters, where x is the number of spaces you indent by.

eviks · 9h ago

For "clean-fixed-width" unambiguous indent (eg, at the beginning of lines) you can make delete also delete X=indent_width spaces.

But for "dirty-width" indents, eg, after some text that can vary in size (proportional fonts or some special chars even in fixed fonts) you can't align with spaces while a tab width can be auto-adjusted to match the other line

aleph_minus_one · 12h ago

> Anything but text makes grep, diff, sed, and version control less effective.

Perhaps this is rather a design mistake in how UNIX handles things and is so focused on text.

bee_rider · 19h ago

Is it possible to convert from the DIANA IR back to something that looks like source code? Then the result of the conversion backward could be grepped, etc…

teo_zero · 16h ago

From TFA:

> Everyone had their own pretty-printing settings for viewing [DIANA] however they wanted.

bee_rider · 15h ago

> Back when he was working on Ada, they didn't store text sources at all — they used an IR called DIANA. Everyone had their own pretty-printing settings for viewing it however they wanted.

I’m still confused because they specifically call the IR DIANA, and they talk about viewing the IR. It isn’t clear to me if the IR is more like a bytecode or something, or more like just the original source code with a little processing done to it. They also have a quote,

> Grady Booch summarizes it well: R1000 was effectively a DIANA machine. We didn't store source code: source code was simply a pretty-printing of the DIANA tree.

So maybe the other visualizations they could do by transforming the IR were so nice that nobody even cared to look at the original ADA that they’d written to generate it?

brabel · 15h ago

I imagine it’s like storing JVM bytecode, ie class files instead of Java files. So when you open it up the editor decompiles it , like IntelliJ does if you try to open a class file, but then it also applies your own style, like from .editorconfig, on the code it shows. It’s a really good idea and I can’t believe people here are complaining that it’s bad because they can’t use grep! But that’s a good thing!! Who the hell is grepping code as if code had no structure and that’s the best you can do? So you also grep JSON instead of using jq? Just don’t!

cowsandmilk · 21h ago

How is diff less effective? I see the diff in the formatting I prefer? With sed, I can project the source into a formatting most convenient for what I’m trying to do with sed. And I have no idea what you’re on about version control. It ruins sending patch files that require a line number around, but most places don’t do that any more.

What I would be curious on is tracing from errors back to the source code. Nearly every language I’ve used prints line number and offset on the line for the error. How that worked in the Diana world would be interesting to learn.

sublinear · 20h ago

You'd have to run diff and sed before the formatter which is harder for everyone.

peanball · 15h ago

I can only recommend difftastic[1], which is a language-aware diff. It is independent of any linter and shows the logical diff, not an assortment of characters or lines that changed.

[1]: https://github.com/Wilfred/difftastic

charcircuit · 20h ago

In practice how many tools do you really need to handle the custom format? Probably single digits and they could all use a common library to handle the formatting aspect of things.

Ygg2 · 6h ago

> would those who prefer tabs have any other arguments?

Yes. Because YAML exists. And mixing tabs and spaces is horrible in it. And the rules are very finicky.

Optimal tab usage is to emit 2-4 spaces.

froh · 16h ago

yes, contemporary editors and tools like treesitter have decided this debate in favor of plain text file representation, exactly for the reasons you give: universal accessibility by general purpose tools.

xslt was a Diana like pre-parsed representation of dsssl. oh how I miss dsssl (a scheme based sgml transformation language) but no. dsssl was a lisp! with hygienic macros! "ikes" they went and invented XSLT.

the "logic" escapes me to this day.

no. plain text it is. human readable. and grep/sed/diff able.

davetron5000 · 20h ago

There’s also a typography element to formatting source code. The notion that all code formatting is mere personal preference isn’t true. Formatting code a certain way can help to communicate meaning and structure. This is lost when the minimal tokens are serialized and re-constituted using an automated tool.

https://naildrivin5.com/blog/2013/05/17/source-code-typograp...

Mikhail_Edoshin · 13h ago

And I'd add that typographers go out of their skin to typeset tables and formulae so that everything is aligned and has proper spacing. For centuries this was done manually because it is important, even though an outsider cannot notice it.

(That said, it must be possible to make a more sophisticated formatter for the source code too.)

anticodon · 16h ago

Yes. In Python, black formatter consistently breaks SQLAlchemy queries in an unreadable way (e.g. splitting conditions over multiple lines when it's not really necessary and makes reading harder).

3036e4 · 14h ago

For C++ clang-format does things like that all the time as well. Of course it has no idea what semantically belongs together on the same line or not. I wish the C++ world had settled on some other standard linter.

IshKebab · 10h ago

clang-format is probably the worst of the autoformatters. They tried to get fancy with a sort of global optimisation algorithm but in practice it's buggier and uglier than the classic Prettier algorithm which is elegant and generally works very well. It's also way less diff friendly.

I wouldn't draw any conclusions about autoformatters from clang-format.

frizlab · 18h ago

Yes! I’m always appalled that people cannot see that.

psychoslave · 11h ago

Caring for typography but blindly bending to dubious programming-language convention feels really like putting efforts on the wrong starting point though.

What’s the point of such a heavy obfuscation of the intent, really? Let’s take the first example.

    char *
    strcpy(to, from)
            register char *to;
            register const char *from;
    {
            char *save = to;

            for (; (*to = *from) != 0; ++from, ++to);
            return(save);
    }

If we are fine with the "lengthy" register, why not use character in full word? Or if we want something shorter sign would be actually semantically more on point in general.

What's with the star to designate a pointer? Why not sign-pointer? Or pin for short if we dare to use a pretty straightforward metaphor, so sign-pin. Ah yes, by the way, using "dot" (.) or "dash, greater than" (->) is such typographical nonsense.

And as a side note *char brings nothing in readability compared to sign-pin-pin. Remember that most people read words or even word sequences as a whole. And let’s compare **char to something like sign-pin-back-5.

What with strcpy? Do we want to play code-obfuscation to look smart being able to decode this pile of letter sequence? What’s wrong with string·copy* or even stringcopy (compare photocopy)? Or even simply copy? If we want to avoid some redundant identifier without relying on overriding through argument types, English is rich in synonyms. For example duplicate, replicate, reproduce.

Various parentheses could be just as well optional to ease code browsing if proper typography is already on place, and English already provide many adverb/preposition that could replace/complement them into a linguistically more usual counterparts.

Speaking about prepositions, using from and to as identifiers for things which would be far more aptly described with nouns is really such a confusing choice. What’s wrong with origin/source and destination/target? It’s also a bit counterproductive to put the identifier, which is the main point of interest, at the very end of its declaration statement.

Equal for assignment is just really an artifact of more relevant symbol like ← or ≔ because most keyboard layouts stem from disastrous design. But using a more adequate symbol is really pushing for unnecessary obscured notation.

Mandatory semicolon to end a statement is obviously also a typographical nonsense.

If a parameter is to be left blank in for, we would obviously be better served with a separate control-flow construction rather than any way to highlight it’s not filled in that employ.

So packing it all:

     duplicate as function ⟨
          requiring (
               origin as sign-pin-register,
               destination as sign-pin-register
          )
          making {
               save as sign-pin
               save assigned origin
               destination-pin assigned origin-pin until ( zeroized,
                    whilst [
                        origin-increment,
                        destination-increment
                    wrought ]
               done )
               return save
          made }
     built ⟩

Given that in that case the parentheses and commas are purely ornamental, the compiler could just ignore them and would have enough information with something like

     duplicate as function
          requiring
               origin as sign-pin-register
               destination as sign-pin-register
          making
               save as sign-pin
               save assigned origin
               destination-pin assigned origin-pin until zeroized
                    whilst
                        origin-increment
                        destination-increment
                    wrought
               done
               return save
          made
     built

Or even

     duplicate as function requiring origin as sign-pin-register destination as sign-pin-register making save as sign-pin save assigned origin destination-pin assigned origin-pin until zeroized whilst origin-increment destination-increment wrought done return save made built

pwdisswordfishz · 17h ago

> A C argument declaration is made up of modifiers (register, const), a data type (char *), and a name (from).

Now explain a declaration like "char *argv[]"...

> We’ve also re-set the data type such that there is no space between char and * - the data type of both of these variables is “pointer to char”, so it makes more sense to put the space before the argument name, not in the middle the data type’s name (update: it should be pointed out that this only makes sense for a single declaration. A construct like char* a, b will create a pointer to char, a, and a regular char, b).

Ah, yes, the delusional C++ formatting style. At least it's nice that the update provides the explanation why it should be avoided.

yccs27 · 13h ago

My $0.02: Don't throw away a perfectly good mental model because of a compiler idiosyncrasy. Just treat it as a special case and use a linter against stuff like char* a, b.

You also don't think about dollars differently than other units, just because the sign goes before the number.

jauntywundrkind · 19h ago

I'm pretty unconvinced by the examples.

> Some of us even align other parts of our code, such repeated inline comments

> Now, the arguments block forms a table of three columns. The modifiers make up the first column, the data types are aligned in the second column, and the names are in the third column

These feel like pretty trivial routines that can be encompassed by code formatting.

We can contrive more extreme examples, like the for loop, but super custom formatting ("typesetting") like that has always made me feel awkward; it feels like it gives license for people to use all manner of arbitrary formatting. The author has some intent, but when you run into an inconsistent code base with lots of things going on, the variance doesn't feel informative or helpful: it sucks and it's a drain.

What's stored is perhaps more minimal, some kind of reference encoding, maybe prettier-ifies for js. The meat of this article to me is that it shouldn't matter: the IDE should let you view and edit as you like:

> Everyone had their own pretty-printing settings for viewing it however they wanted.

IshKebab · 10h ago

Yeah in theory people can do a better job than auto-formatters. In practice they absolutely do not, so that argument is moot.

xpe · 10h ago

> Yeah in theory people can do a better job than auto-formatters. In practice they absolutely do not, so that argument is moot.

Status quo fallacy alert. Arguments are not forever mired in a current state of affairs. People can learn and can build tools to help them do better.

This could change quickly; e.g. if Claude or GitHub or (Your Team) decide to prioritize how source code looks.

chowells · 15h ago

I have to disagree with the premise. Formatting code is a critical communication channel. Well-formatted code should tell you:

1. The developer has enough experience to understand that formatting matters.

2. The developer has enough discipline to stick with their chosen formatting rules.

3. The developer has the taste necessary to choose good formatting rules.

4. The developer has the judgement necessary to identify when other concerns justify one-off violations of the rules.

These are really important attributes for a developer to have. They affect every aspect of the code, not just formatting. Formatting is just a very quick proxy to measure those by.

Unfortunately, things like autoformatting and linter rules are destroying the signal. Goodhart's law strikes again.

babel_ · 14h ago

The blog entry is short and simple, perhaps consider reading it before knee-jerk reacting to the title, and then you might understand why "should" and "unnecessary" are operative in said title.

chowells · 5h ago

You've jumped to a fascinatingly false conclusion here. Is this the so-called death of media literacy? I replied to the ideas underlying the post rather than the words in it, and you think that means I didn't read it?

To go through the details: The post explicitly complained about a linter enforcing style rules. It did not object to the presence of mechanically-enforced style rules. In fact, it glorified them implicitly by saying how great it would be if everything was formatted at presentation-time. This glorification is the exact thing I was criticizing.

I think machine-enforced rules are bad because they destroy a communication channel that importantly has point 4 that I listed - when well-formatted code breaks its conventions, there must be a reason for it. That is important information that enforced presentation rules force to be put into another channel.

And it's certainly true that other channels do convey this other information, but I find more value in having it conveyed in the presentation channel than I do in having that channel replaced by mechanistic formatting.

This is the premise underlying the article that I object to. It is present so heavily in the subtext that if you pretend it's not, the post becomes incoherent.

And FWIW, HN rules say not to accuse people of not having read the article. I think that rule is mostly there because someone can read the article and notice something you missed, and it's wiser to not post than it is to assume you absorbed 100% of the context of the post.

rho4 · 14h ago

Not caring about formatting also signals to me that:

- they have probably never worked on a codebase where files are edited by more than 1 person

- they have never done any significant amount of merging between branches

- they have never maintained a large codebase

- they have never had to refactor a large codebase

- they don't use diff/comparison tools to read the history of their codebase

- they have never written any tooling for their codebase

- they are not good team-players and/or only care about their own stuff

pure-orange · 14h ago

Did you not read the article?

KronisLV · 14h ago

If you are in circumstances where the answers to those questions are a resounding "No" then you should just set up the tooling to format the code on save / commit and perhaps to make the CI complain if anyone skips that and leave it at that.

Furthermore, instead of nitpicking over small details, it can actually be a good idea to just leave everything on default, forgo whatever your individual style might be and stick to what's been deemed to be good enough as the default - so the code will look more familiar to anyone who picks it up (and has used the tools you use for linting and formatting). Yes, formatting is different from linting; though if you set up one, you might as well do the other.

falcor84 · 12h ago

The same personality attributes can be assessed even better based on penmanship, so going forward, I'll require all PRs to be submitted in cursive

chowells · 5h ago

You know, my first job during college involved updating construction documents based on changes that were approved by both the contractors and the owners. Penmanship was critical when updating blueprints by hand - which was always a lot cheaper than getting the source documents, revising them, and reprinting them.

In my very limited experience, I learned the importance of penmanship in that profession.

In my much larger experience since, I've learned the irrelevance of penmanship to writing code. I don't practice my blueprint handwriting anymore. It would be wholly unfit-for-purpose without a bunch of practice. But I understand its value in that context.

If I understand the thrust of your comment correctly, you're pointing towards removing formatting as a channel being a net positive, despite the loss of all these indicators. I might almost agree with that, except for my point 4. Sometimes it's better, on the whole, to break conventions. Mechanical formatting systems cannot make these judgement calls.

I think the minor friction of explicit formatting is a net positive. I think the communication channel it adds carries more value than the friction it imposes hurts. (And I'm calling it explicit formatting because it doesn't have to be manual - it just has to be done with intention, judgement, and approval.)

I don't think the massive friction imposed by submitting code as ink on paper provides enough value to be worth its costs, by contrast.

linhns · 3h ago

I’d say you go re-read the article.

> The developer has the taste necessary to choose good formatting rules

Rely on this and you’re in trouble. More time will be lost just to argue which style is better. Go with the in-built formatter way of Go and Rust

PaulStatezny · 7h ago

You didn't read the blog.

It's talking about the Ada programming language and that its code was apparently stored not as plaintext but an intermediate representation (IR) that could then be transformed back into code.

So formatting was handled by tooling by the nature of the setup. Developers would each have their own custom settings for "pretty printing" the code.

The author isn't saying don't use code formatters. They're highlighting an unusual approach that the industry at large isn't aware of. Instead of getting rid of arguments about code style via formatters, you can get rid of them by saving code in an IR instead of plaintext.

shit_game · 13h ago

Would you say that someone's code formatting is a shibboleth? How do you feel about formatters and linters in regards to this?

teaearlgraycold · 14h ago

There are times when you really want a specific formatting of the text, like visually turning a list into a table.

rho4 · 12h ago

The system should support this, e.g. via // @formatter:off/on tags

teaearlgraycold · 6h ago

For the stored IR version that means it needs to store raw source code when those directives are used. And then you lose the benefits.

jillesvangurp · 9h ago

There was a movement towards working with syntax trees directly and treating source code as a generated serialization of those syntax trees about 20-25 years ago. This probably started with refactoring as it was pioneered in the nineties. Things like Visual Age actually stored code in a database instead of on the file system. Later intentional programming (Charles Simonyi was pushing that) tried to also do things with this. And of course model driven development was a thing around the same time.

Refactorings (when done right) are syntax tree transformations that preserve things like referential integrity, etc. that ensure code does the same thing before and after applying a refactoring.

A rename becomes trivial if you are simply working on the symbol directly. For that to work with file based source trees, you need to parse the whole thing, keep track of where symbols are referred in files, rename the symbol and then update all the places in the source tree. That stuff becomes a lot easier when the code representation isn't a bunch of files but the syntax tree. The symbol just gets a different name. Anything that uses the symbol will still use the same symbol.
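As a toy illustration of renaming on the tree rather than on text, here is a sketch with Python's standard `ast` module. The `Rename` class and `rename` helper are made up for this example, and it is deliberately naive: it ignores scopes, function parameters, and attributes, so it is nowhere near a real refactoring engine.

```python
import ast


class Rename(ast.NodeTransformer):
    """Rename every identifier node called `old` to `new` (no scope analysis)."""

    def __init__(self, old: str, new: str) -> None:
        self.old, self.new = old, new

    def visit_Name(self, node: ast.Name) -> ast.Name:
        if node.id == self.old:
            node.id = self.new
        return node


def rename(source: str, old: str, new: str) -> str:
    tree = Rename(old, new).visit(ast.parse(source))
    return ast.unparse(tree)


src = 'total = 1\nprint("total", total)\nsubtotal = total + 1\n'
print(rename(src, "total", "amount"))
```

The string literal `"total"` and the unrelated name `subtotal` are left alone, which a textual search and replace for `total` would not manage.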

People like editing files of course and that has resulted in a lot of friction developing richer tools that don't store text but something that preserves more structure. The fact that we're still going on about formatting issues a quarter century later maybe shows that this is something to revisit. For many languages and editors, robust symbol renames are still somewhat science fiction. And that's just the most basic refactoring.

zokier · 8h ago

Meh.

> That stuff becomes a lot easier when the code representation isn't a bunch of files but the syntax tree

You are just mixing abstraction layers here. That syntax tree still needs to be stored in file(s) somehow, and nothing prevents having syntax-tree-aware (or smarter) tooling operating on human-readable files. Basically, deserializing an AST and parsing source code are the same thing. The storage format really isn't that significant a factor here.

So what is needed is better tools rather than fiddling with the storage format. Microsoft's Roslyn is an obvious example, but plenty of modern compilers are moving in the direction of exposing APIs to interact with the codebase.

jillesvangurp · 6h ago

> That syntax tree still needs to be stored in file(s) somehow

Sure, but there are less flaky ways than spreading a syntax tree across files. Visual Age actually used a database for this back in the day. Smalltalk did similar things by storing code in an image file that contained both byte code and method definitions. You could export source code if you wanted. But wouldn't do that while developing typically. That's not an approach that caught on. But it has some advantages.

What you are describing is what Eclipse did with Java. Eclipse was the successor to Visual Age. The Eclipse incremental compiler for Java updated an internal data structure for the IDE. It could do neat things such as partial compilation to enable running tests even in the presence of some compile errors. It also was really fast. By the time you stopped typing, it would have already compiled your code. Running the tests was similarly fast.

The problem of syncing a tree of source files with an AST is just a bit hard. Intellij never came close to this and has always had lots of trouble keeping its internal caches coherent. There's even a top level "invalidate caches" option in the File menu (still there, I checked. Right next to the Repair IDE option). They were off by 2-3 orders of magnitude. Seconds (at best) instead of milliseconds. I still miss Eclipse's speed every day I use Intellij.

Some compilers are taking some steps toward supporting more advanced IDEs. But there aren't a lot of those beyond what Jetbrains provides. VS Code support varies between different languages. But mostly it's very limited on this front. The Rust compiler is one of those. Though I don't know the current state of that. Mostly it's not well known for its blazing performance (the compiler). I'm not sure if Jetbrains leverages many of those features in its Rust IDE (I'm not a Rust developer).

aleph_minus_one · 12h ago

Some (sometimes) desirable source code formatting cannot be deduced from the abstract syntax tree alone:

Consider the following (pseudo-)code example:

  bar.glob = 1;
  bar.plu.a1 = 21;
  bar.plu.coza = fol;

Should this code be formatted this way? Or should it be formatted

  bar.glob     = 1;
  bar.plu.a1   = 21;
  bar.plu.coza = fol;

to emphasize that three assignments are done?

Or should this code be formatted

  bar.glob      = 1;
  bar.plu .a1   = 21;
  bar.plu .coza = fol;

to make the "depth" of the structure variables more tabular so that you can immediately see by the tabular shape which "depth" a member variable has?

We can go even further like

  bar.glob     =   1;
  bar.plu.a1   =  21;
  bar.plu.coza = fol;

which emphasizes that the author considers it to be very important that the reader can easily grasp the magnitudes of the numbers involved (which is why in Excel or LibreOffice Calc, numbers are right-aligned by default). Or combining this with making the depth "tabular":

  bar.glob      =   1;
  bar.plu .a1   =  21;
  bar.plu .coza = fol;

Each of these formattings emphasizes different aspects of the code that the author wants to emphasize. This information cannot be deduced from some abstract syntax tree alone. Rather, this needs additional information by the programmer in which sense the structure behind the code intended by the programmer is to be "interpreted".
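
To make the distinction concrete, here is a small Python sketch. The helper name `align_assignments` is hypothetical. It shows that producing the aligned layout is purely mechanical; what no tool can read off the tree is whether the author wanted that emphasis in this particular block:

```python
def align_assignments(lines):
    """Pad the left-hand sides so every '=' sits in the same column."""
    pairs = [line.split("=", 1) for line in lines]
    lefts = [lhs.strip() for lhs, _ in pairs]
    width = max(len(lhs) for lhs in lefts)
    return [
        f"{lhs.ljust(width)} = {rhs.strip()}"
        for lhs, (_, rhs) in zip(lefts, pairs)
    ]


block = ["bar.glob = 1;", "bar.plu.a1 = 21;", "bar.plu.coza = fol;"]
aligned = align_assignments(block)
print("\n".join(aligned))
```

The choice between `block` and `aligned` is exactly the extra information that would have to be stored next to the tree, for example as a per-block annotation.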

kennywinker · 11h ago

I see what you’re saying, but I also haven’t ever used anything but the first two formats, and my goal was always readability not emphasis.

Storing the AST instead of the text is a lossy encoding, but would we lose something more valuable than what we gain? If your example is the best thing we’d lose - i’d say it’s still net a massive win.

and there are ways to emphasize different parts, that would survive the roundtrip to AST. E.g. one way to emphasize depth:

    setValue([bar, glob], 1)

    setValue([bar, plu, a1], 21)

or to emphasize the data:

    configure(bar, 1, 21, fol)

Or heck you could allow style overrides if you really wanted to preserve this kind of styling:

    // $formatblk: tabular_keypaths, aligned_assignments

    bar   .glob       = 1

    bar   .plu    .a1 = 21

    // $formatblk-end

Cthulhu_ · 12h ago

But "desirable code formatting" is subjective; some people prefer 2, 4 or 8 spaces, some prefer columnar layout like you demonstrated, etc. You can't deduce formatting from an AST alone as an AST is not source code and does not have formatting information.

gentooflux · 12h ago

The second two lines of your example smell like LoD violations. It's not a formatting problem, it's a structural problem.

aleph_minus_one · 12h ago

Sometimes you have to use libraries that are badly designed.

gentooflux · 12h ago

When that happens they're usually badly formatted too.

aleph_minus_one · 12h ago

Indeed, but this bad formatting should not "spill over" to your own code if possible.

efortis · 16h ago

Projectional Editing can be done with text sources.

Here’s an old video of JetBrains MPS rendering a table from code https://www.youtube.com/watch?v=XolJx4GfMmg&t=63s

I’m hoping for an IDE able to render dictionaries as tables -- my wishlist doesn’t stop there.

Currently, we have a glimpse of those features, such as code folding, inlay hints, or docstrings rendered as HTML:

https://x.com/efortis/status/1922427544470438381

crq-yml · 20h ago

I think the problem can be defined equally as: we can't invest in something more abstract than "plain text" at this time. When we try, it gets downgraded to a plain text projection of the syntax.

The plain text encoding itself exists in a process of incremental, path-dependent development from Morse Code signals to Unicode resulting in a "Gigantic Lookup Table" (GLUT, my coining) approach to symbolic comprehension. The assumption is useful - lots of features can "just work" by knowing that a particular bit pattern is always a particular symbol.

If we push up the abstraction level, we get a different set of symbols that are better suited to the app, but not equivalent GLUT tooling. Instead we usually get parsing of plain text as a transport. For example, CSV parsing. It is sloppy; it is also good enough.

Edit: XML is also a key example. It goes out of its way to respect the text transport approach. There are dedicated XML editors. But people want to edit it as plain text and they can't quite get there because funny-business with character encodings gets in the way, adding a bunch of ampersands and semicolons onto the symbols they want to edit. Thus we have ended up with "the CSV of hypertext documents", Markdown.

rs186 · 10h ago

Ah, eslint-config-airbnb. My favorite airbnb config issues:

https://github.com/airbnb/javascript/issues/1271

https://github.com/airbnb/javascript/issues/1122

I literally spent over an hour when adapting an existing project to use the airbnb config, when code was perfectly correct, clear and maintainable. I ended up disabling those specific rules locally. I never used it in another project. (Looks like the whole project is no longer maintained. Good riddance.)

The airbnb config is, in my view, the perfect example of unnecessarily wasting people's productivity when linting is done badly.

banashark · 21h ago

Interesting read. I’ve often wondered why the projection we see needs to be the same as the stored artifact. Even something like a git diff should be viewable via a projection of the source IR.

With things like treesitter and the like, I sometimes daydream about what an efficient and effective HCI for an AST or IR would look like.

Things like F#'s ordered compilation often make code reviews simpler for me, but that’s because a piece of the intermediate form (dependency order) is exposed to me as a first-class item. I find it much simpler to reason about compared to small changes in code with more lax ordering requirements, where I often find myself jumping up and down and back and forth in a diff and all the related interfaces and abstract classes and implementations to understand what effect the delta is having on the program as a whole.

oftenwrong · 6h ago

Storing an IR also means we can create languages beyond the limits of syntactical practicality. Imagine, for example, an entire comment/documentation dimension of the code. Instead of commenting on a line near some code, you could attach comments semantically to an expression, or to a variable, or to any unit of code.

hliyan · 5h ago

Token, statement and block level annotations would actually be nice. Perhaps even nicer if those annotations could be structured data instead of just text. You could create a truly self-describing code base without having to worry too much about the second hardest problem in programming.

PaulKeeble · 19h ago

In theory we could have an IDE apply a reformatting to any piece of code we looked at and format any changes back to the standard for the code base on updates. One of the things I dislike is that sometimes autoformatting does a poor job and loses some information that manual formatting provides, but honestly go fmt is mostly fine, it just works.

All of this seems doable, I just think for the most part we don't care very much about our preferences, since they have very little impact on readability. It's definitely doable, however: we could view the code however we most wanted it and have it stored in a different formatting. It might not be 100% round-trip stable, but it probably doesn't matter.

There is always a better version where the defaults can be overridden and formatting forced, and where we only format new and changed lines to reduce potential instability, but again go fmt doesn't really suffer from this, so it's possible to make things pretty reliable. It's simple really: there is a default formatting and the code is stored that way, and our view of choice can then reformat the code as we want it. When it's stored, it's stored in the default.

GuB-42 · 11m ago

And yet, judging by the number of comments, it is definitely not a solved problem :)

lisper · 20h ago

It never ceases to amaze me how many times people can essentially re-invent S-expressions without realizing that's what they are doing.

benrutter · 13h ago

Scanning the comments waiting for a lisper to comment and found one!

I guess lisp still has whitespace? That seems like the only meaningful way it isn't already just what the post is describing.

Jach · 10h ago

In actual Common Lisp development, code is stored in text files and edited and diffed as text in source controlled repositories. Once code is evaluated by an implementation, it's a different story, but before that there are many formatting options. It's mostly around where to put line breaks, whitespace, and parens, but still. The other day I wrote this simple function:

    (defun check-password-against-hash (password hash)
      (handler-case
        (bcrypt:password= password hash)
        (error () nil)))

There's already multiple choices on formatting (and naming, and other things) just from this sample.

In theory a system could be made where this level of code isn't what's actually stored and is just a reverse pretty-print-with-my-preferences version of the code, as the post mentions. SBCL compiles my function when I enter it, I can ask SBCL to describe it back to me:

    * (describe #'check-password-against-hash)
    #<FUNCTION CHECK-PASSWORD-AGAINST-HASH>
      [compiled function]
    
    Lambda-list: (PASSWORD HASH)
    Derived type: (FUNCTION (T T) *)
    Source form:
      (LAMBDA (PASSWORD HASH) (BLOCK CHECK-PASSWORD-AGAINST-HASH (HANDLER-CASE (CL-BCRYPT:PASSWORD= PASSWORD HASH) (ERROR NIL NIL))))

I can also ask SBCL to show me the disassembly, perhaps again in theory a system could be made where you can get and edit text at that level of abstraction before putting it back in.

    * (disassemble #'check-password-against-hash)
    ; disassembly for CHECK-PASSWORD-AGAINST-HASH
    ; Size: 308 bytes. Origin: #xB8018AA278                       ; CHECK-PASSWORD-AGAINST-HASH
    ; 278:       498B4510         MOV RAX, [R13+16]               ; thread.binding-stack-pointer
    ; 27C:       488945F8         MOV [RBP-8], RAX
    ; 280:       488965D8         MOV [RBP-40], RSP
    ; 284:       488D45B0         LEA RAX, [RBP-80]
    ; 288:       4D8B7520         MOV R14, [R13+32]               ; thread.current-unwind-protect-block
    ; 28C:       4C8930           MOV [RAX], R14
    ; ... and so on ....

(SBCL does actually let you modify the compiled code directly if you felt the urge to do such a thing. You just get a pointer to the given origin address and offset and write away.)

But just going back to the Lisp source form, it's close enough that you could recover the original and format it a few different ways depending on different preferences. e.g. someone might prefer the first expression given to handler-case to be on the same line instead of a new line like I did. But to such a person, is that preference universal, or does it depend on the specific expressions involved? There are other not strictly formatting preferences at play here too, like the use of "cl-bcrypt" vs "bcrypt" as package name, or one could arrange to have no explicit package name at all. My own preferences on both matters are context-sensitive. The closest universal preference I have around this general topic is that I really hate enforced format tools even if they bent to my specific desires 100% of the time.

I'd say the closest modern renditions of what the post is talking about are expressed by node editors. Unreal's Blueprints or Blender's shader editor are two examples, ETL tools are another. But people tend to work at the node level (and may have formatting arguments about the node layout) rather than a pretty-printed text representation of the same data. I think in the ETL world it's perhaps more common to go under the hood a little and edit some text representation, which may be an XML file (and XML can be pretty-printed for many different preferences) or a series of SQL statements or something CSV or INI like... whether or not that text is a 'canonical' representation or a projection would depend on the tool.

lisper · 5h ago

> In actual Common Lisp development, code is stored in text files and edited and diffed as text in source controlled repositories.

That's true, but there is a very big difference between S-expressions stored as text and other programming languages stored as text because there is a standard representation of S-expressions as text, and Common Lisp provides functions that implement that standard in both directions (READ and PRINT) as part of its standard library. Furthermore, the standard ensures READ-PRINT equivalency, i.e. if you READ the result of PRINTing an object the result is an equivalent object. So there is a one-to-one mapping (modulo copying) between the text form and the internal representation. And, most importantly, the semantics of the language are defined on the internal representation and not the textual form. So if you wanted to store S-expressions in, say, a relational database rather than a text file, that would be an elementary exercise. This is why many CL implementations provide alternative serializations that can be rendered and parsed more efficiently than the standard one, which is designed to be human-readable.

This is in very stark contrast to nearly every other programming language, where the semantics are defined directly on the textual form. The language standard typically doesn't even require that an AST exist, let alone define a canonical form for it. Parsers for other languages are typically embedded deep inside compilers, and not provided as part of the standard library. Every one is bespoke, and they are often byzantine. There are no standard operations for manipulating an AST. If you want to write code that generates code, the output must be text, and the only way to run that code is to parse and compile it using the bespoke parser that is an opaque part of the language compiler. (Note that Python is a notable exception.)
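
To illustrate the Python exception noted above: a minimal sketch, assuming CPython 3.9 or later (for `ast.unparse`), of generating code as a tree, running it without ever producing text, and rendering text only when asked. The variable name `result` and the filename `<generated>` are made up for the example:

```python
import ast

# Build the tree for `result = 6 * 7` directly, without writing source text.
assign = ast.Assign(
    targets=[ast.Name(id="result", ctx=ast.Store())],
    value=ast.BinOp(
        left=ast.Constant(value=6),
        op=ast.Mult(),
        right=ast.Constant(value=7),
    ),
)
module = ast.Module(body=[assign], type_ignores=[])
ast.fix_missing_locations(module)  # fill in the line/column info compile() needs

# Compile and run the tree itself; no parser is involved at this point.
namespace = {}
exec(compile(module, filename="<generated>", mode="exec"), namespace)

# The same tree can still be rendered as text on demand.
rendered = ast.unparse(module)
print(namespace["result"], "|", rendered)
```

This is still weaker than READ and PRINT in Common Lisp: `ast.unparse` returns equivalent code, not the original layout or comments, and the language semantics are not defined on the `ast` node classes.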

whartung · 2h ago

It's interesting that despite the utility of S-Expressions, as mentioned, semantic diff, for example, of CL code is uncommon.

By that I mean highlighting the diff between these:

  (dolist (i l)
    (print (car i)))

  (dolist (i l) (print (cdr i)))

With the diff highlighting the `car` changed to `cdr` rather than just the raw lines being changed.

I'm pretty sure this exists, but it's uncommon (at least to me it's uncommon).
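
A rough sketch of the kind of diff described above, written in Python rather than Lisp. The reader `read_sexpr` and the helper `tree_diff` are hypothetical names, the reader ignores strings, comments and quoting, and the comparison only descends into lists of equal length, so this is nowhere near a general structural diff:

```python
def read_sexpr(text):
    """Parse one S-expression into nested Python lists of atom strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def parse(pos):
        if tokens[pos] == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = parse(pos)
                items.append(item)
            return items, pos + 1  # skip the closing paren
        return tokens[pos], pos + 1

    tree, _ = parse(0)
    return tree


def tree_diff(old, new, path=()):
    """Yield (path, old, new) for each subtree that differs."""
    if isinstance(old, list) and isinstance(new, list) and len(old) == len(new):
        for i, (a, b) in enumerate(zip(old, new)):
            yield from tree_diff(a, b, path + (i,))
    elif old != new:
        yield path, old, new


before = read_sexpr("(dolist (i l)\n  (print (car i)))")
after = read_sexpr("(dolist (i l) (print (cdr i)))")
changes = list(tree_diff(before, after))
print(changes)
```

The line break between the two versions disappears in the reader, so the only reported change is the atom at path `(2, 1, 0)`. Insertions, deletions and moved subtrees are where it stops being easy.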

lisper · 1h ago

It is uncommon because it turns out that text diff is good enough 99% of the time, especially if you follow normal formatting and indentation conventions.

Also, structural diff is actually a very hard problem.

mdaniel · 19h ago

Wait until that Bablr user shows up to these threads, and then you'll really have to start drinking

conartist6 · 9h ago

Wow I am thoroughly honored. You are probably the first person ever who isn't me to bring it up in a thread.

I had never heard of DIANA but I love old ideas being new again. (Plus you made me laugh)

mhh__ · 4h ago

A small point on formatting that I'm getting increasingly firm about as I "age" (still not that old): Formatting is very important, but if you find yourself complaining about what a formatter does to your code, the code may be bad (obviously making an exception for a big block of constants or something)

e.g. if the formatter is really shifting stuff around, your code might be too nested - if you have a compiler, let it take the strain.

stevedekorte · 4h ago

Io (http://iolanguage.org) can work this way, as the message tree (maybe including comments - I don't recall) is what the interpreter used to evaluate the code and is accessible at runtime. However, it didn't store the choice of terminator (newline vs return) or indentation info (though it would be easy to add), so the pretty print of the message tree might look different depending on the source conventions.

TheAlchemist · 16h ago

I like that. We should have something like this for python.

Black is great, but maybe it's just me since it aligns with how I like the code formatted.

Would there be any downsides for python (or git ?) to define a standard way of formatting to save a valid file, and all the formatting necessary to read a file happens in the IDE showing the file ?

That would very much fit with python ethos 'There should be one-- and preferably only one --obvious way to do it.'

benrutter · 13h ago

I think the downside would mainly be complexity. As soon as you do that, you have to develop what that intermediate representation is and how it gets stored. But moreover, you'd need to develop workarounds for the fact that all external code infrastructure (version control, editors, command line tools) is built for text.

I can't see a crazy huge downside from a python point of view, but seems like a much bigger upside than flexible formatting would be needed to justify breaking from all of that stuff.

TheAlchemist · 1h ago

I was thinking just plain python - let's say Black formatted code is the default and we commit only that. Then on the visualization side, the IDE can format it for whatever we want.

Actually, this could be a really easy feature for the IDE and could work already easily.

burnte · 3h ago

Whitespace being important is literally the reason I don't use Python. I'm absolutely TERRIBLE with whitespace in my code, I do not indent like most people, I indent a lot less frequently. I learned coding on TRS80 Model 100 basic, and in DOS with Borland IDEs. Bytes mattered! :D

zahlman · 7h ago

> It's 2025, how are we still dealing with this sort of thing?

You have to get everyone set up to use it, whereas everyone is already, of necessity, set up to use plain text.

And we aren't all using the same programming language and the same hardware setup.

Thus, specifically:

* everyone has to agree on an IR standard; if it can't accommodate every programming language, then there needs to be coverage for all the programming languages, and a way for software systems to know which one to use

* everyone has to have local software that can convert back and forth (they can't just rely on something built in to the "development system", I assume burned into a ROM)

* everyone's version control setup has to invoke that software as a commit hook

* the IR has to be designed in a way that allows for meaningful diffs, and the version-control software needs to be aware of how to diff and patch (which potentially also means a new standard for diff files)

whartung · 2h ago

What "everyone"?

They don't have to agree on anything.

The poster child for this is Smalltalk. An untraditional environment despite being around 60 years. The source code is stored locally in an internal file tied directly to the runtime image. You can export/import code through "traditional" avenues, but not just anything, it needs to be structured source code. You can't readily move raw text in and out.

Despite this impedance mismatch with "the rest of the world", ST folks have been developing code and collaborating for decades. They even manage to get things accomplished.

Also, consider many of the modern logging platforms that are logging to databases rather than just raw files. While some grouse about that, others manage to make do. Letting the structured log managers handle the lower level details and provide a better UX.

The game is to make sure that your UX for your system is capable of the task, not worrying about interoperating with everyone else.

kesor · 19h ago

This is how Chrome Dev Tools shows source code. The original is often minified or in whatever format the author left it. And when you check the "pretty" checkbox in dev tools, it shows up using whichever format Chrome developers decided it should look like.

ozim · 7h ago

I wouldn’t have an issue if the title had "could" instead of "should".

There are most likely good reasons why Ada and DIANA are not in widespread use.

__MatrixMan__ · 18h ago

Unison doesn't move the formatting choices further than the machine on which the code was written. The codebase only contains the AST.

It's such a cool idea, though I haven't spent much time using it in anger, so it's hard to say if it's a useful idea.

oftenwrong · 6h ago

Unison's immutable definitions also enable a bunch of compelling capabilities. No merge conflicts. Incremental everything: build, test, lint, distribution, rendering as formatted text, et cetera. Trivial to apply "hot" updates to running systems.

wonger_ · 14h ago

Yeah, if any language has potential for AST source of truth instead of textual source of truth, it's Unison.

I'm just waiting for a breakthrough project to show that it's ready for wider adoption. Leaving text-based tooling is a big ask.

The principles behind Unison, for those who haven't read them yet: https://www.unison-lang.org/docs/the-big-idea/#richer-codeba...

> Each Unison definition is identified by a hash of its syntax tree.
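
A loose analogy in Python, not Unison's actual scheme: hashing a dump of the parsed tree instead of the text gives an identifier that survives reformatting. The function name `tree_hash` and the sample definitions are invented for the sketch, and unlike a real content-addressed codebase this version still depends on the names used in the code:

```python
import ast
import hashlib


def tree_hash(source: str) -> str:
    """Hash the parsed tree rather than the text, so layout does not matter."""
    dumped = ast.dump(ast.parse(source))  # positions are left out by default
    return hashlib.sha256(dumped.encode("utf-8")).hexdigest()


one_line = "def double(x): return x*2\n"
reflowed = "def double( x ):\n\n    return x * 2\n"
changed = "def double(x): return x*3\n"

print(tree_hash(one_line) == tree_hash(reflowed))
print(tree_hash(one_line) == tree_hash(changed))
```

Reformatting keeps the hash, while changing the constant produces a different one.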

laserbeam · 17h ago

Reminds me of dion systems. A few years ago a group of devs was working on a programming environment that feels very close to what DIANA is describing.

The project is dead enough that they no longer own the TLD for the company. As far as I know, the only remnants of the project are youtube recordings of demos held at conferences.

lordnacho · 21h ago

Aren't most projects these days written in a mix of languages, most of them text? You'd have to get them to change to use the same tools we currently use, or else you'd have to use special tools. The beauty of the modern stack is the base tools are near universal.

If you want everyone to see their own preference of format, either write a script or get AI to format it for you.

giorgioz · 4h ago

In Javascript there is Prettier which auto-formats the code on saving: https://prettier.io/ So essentially you stop caring about adding new lines or tabs, just press save and the code gets indented/formatted correctly.

rglynn · 4h ago

I think the point is that:

A. not everyone on your team is using prettier

B. not everyone is using the same config/agrees on what it should be

MaxLeiter · 4h ago

Yeah something I should have covered more is a lot of my frustration comes from the _tooling_ around formatting. Prettier is quite slow, people may not have it setup right, etc.

ChrisMarshallNY · 20h ago

I've heard that Google works [sort of] that way (don't know, myself). They have a lot of tools that allow devs to use what formatting they want, and it's made standard, during checkin.

I heard this, many years ago, when we used Perforce. The Perforce consultant that we dealt with, told us this, as an example of triggers. Back then, I was told that Google was a big Perforce shop (maybe just a part of Google. I dunno).

I have heard that this was one of the goals of developing IDLs. I think the vision was, that you could have a dozen different programmers, working in multiple languages (for example, C for the drivers, Haskell for the engine, and Lua for the UI). They would be converted to a common IDL, when submitted to configuration management, and then extracted from that, when the user looks at it.

I can't see that working, but a lot of stuff that I used to think was crazy, has happened, so, who knows?

yojo · 20h ago

I can confirm that Google was using Perforce for version control extensively, at least through 2008. I think it was somehow customized, but I definitely have lingering muscle memory around “p4 sync” and “p4 submit”.

I was on an internal tools team doing distinctly unsexy LAMP-stack work, but all the documentation I ever saw talked about perforce/p4.

__loam · 20h ago

Go was designed at Google with a built in style checker to explicitly address this and prevent bikeshedding.

stared · 8h ago

There are multiple ways to write exactly the same program. Linters are able to reduce this number by moving to some canonical version.

With modern tools it is easy to add formatting on saving or on commit. So I don't understand what the fuss is about.

At the same time, for the most important tool in software engineering, Git, it matters which lines are changed. And it is better to only see actual logic changes, not swamped in tabs vs spaces or other parts that are just formatting.

That said, I would love to see more of this splitting between actual internal representation and view. Don't like something in the style guide (or even syntax, such as curly brackets vs indentation)? Just change the view, like folding.

karel-3d · 13h ago

Formatting code is a typical bikeshed argument. Nobody can say a thing about the nuclear reactor, everyone has an opinion about the bike-shed.

grim_io · 2h ago

I came prepared to fight the author, but I left in agreement.

Yes, we should expect better from our tools and languages.

BurningFrog · 6h ago

Code layout is important, even though it - much like naming - doesn't impact functionality.

I can write code that (IMHO) is substantially better than any formatter. But I've realized that there is no way to make other people on a team have the same opinions and skill as me, so I accept automatic code formatters.

preommr · 20h ago

Others have already mentioned why this is a bad idea (e.g. common plaintext tools don't work, added complexity, etc.)

But I'll also mention that this pretty much already exists. You can have whitespace options for git. I also imagine there's some setup using hooks that uses one formatter locally, and another for remote.

Also, the common IR already exists - it's just the AST. It was "solved" back in the day when people were throwing whatever they could at the wall to see what sticks since it was all so new. With the benefit of hindsight, I think we can say that it's not that good of an idea.

leipert · 15h ago

With git you could set up “smudge” filters to do your own formatting on checkout and “clean” formatting for the canonical formatting on staging files.

https://git-scm.com/book/pt-br/v2/Customizing-Git-Git-Attrib...
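
A sketch of what such a filter pair could look like as a Python script. Git runs the commands configured under `filter.<name>.clean` and `filter.<name>.smudge` for paths that a `.gitattributes` entry assigns to that filter, piping the file content through stdin and stdout. Everything else here is an assumption for illustration: four spaces as the canonical indent, tabs as the personal preference, and the `clean`/`smudge` command-line arguments of this script:

```python
import re
import sys

INDENT = "    "  # assumed canonical indent for the repository: four spaces


def clean(text: str) -> str:
    """Canonical form for staging: each leading tab becomes four spaces."""
    return re.sub(
        r"^\t+", lambda m: INDENT * len(m.group()), text, flags=re.MULTILINE
    )


def smudge(text: str) -> str:
    """Personal view on checkout: each leading group of four spaces becomes a tab."""
    return re.sub(
        r"^(?: {4})+",
        lambda m: "\t" * (len(m.group()) // 4),
        text,
        flags=re.MULTILINE,
    )


if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1] in ("clean", "smudge"):
    # Git pipes the file content in on stdin and reads the result from stdout.
    convert = clean if sys.argv[1] == "clean" else smudge
    sys.stdout.write(convert(sys.stdin.read()))
```

The pair only round-trips cleanly when the canonical file is indented in whole multiples of four spaces. Anything that is not reversible in this way would show up as spurious changes, which is the same stability concern raised elsewhere in the thread.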

perlgeek · 14h ago

> Back when he was working on Ada, they didn't store text sources at all — they used an IR called DIANA. Everyone had their own pretty-printing settings for viewing it however they wanted.

What about comments? Were they part of the IR?

(I agree with others that version control, grep etc. are also very important, and kind of a deal breaker).

sirwhinesalot · 13h ago

I wrote an article saturday on visual programming which is very related to this, but my thinking is the opposite of this article.

Raw text is amazing at smaller scales. The ability to apply a bunch of intermediate incorrect transformations to reach a valid destination is invaluable (like doing a bunch of hacky find/replace).

Projectional editors like JetBrains MPS have tons of disadvantages vs text, and the few advantages don't make up for it.

Formatting is a silly problem to have, but far beyond that why are we manipulating text files directly rather than editing a live program (ala Smalltalk). Text can just be the on-disk serialization format you never look at.

(Raw text is still how you edit individual functions and methods in Smalltalk, there just isn't any actual text file on disk)

whartung · 1h ago

My singular problem with visual programming is simply the amount of detail necessary in modern programming.

Code is flat out complicated, with lots and lots and lots of steps, each with perhaps even more detail.

And it's hard to do that efficiently with visual editors. Imagine a display with instead of thousands of lines of code, you have thousands of symbols.

Or, the visual editors break things down in to components that are so small they do not convey the "big picture" well.

It's a personal complaint with the way Smalltalk works. Lots of methods, small (ideally) snippets of code, all viewed in isolation.

It's common (at least for me) to put related code together in the source file. It's useful to scan the whole file to get a feel for the flow of the code, and the system. Looking at isolated code, out of context, has always been a struggle for me. There's a reason my code is not sorted alphabetically by function name.

Maybe if you organized code visually, that is, perhaps the upper left is the start up code, the lower right is some core math all collected together like beads in a pot. "All red ones go here, all the 1" ones go there".

Granted I have not worked on such a tool or such a project. But the linear presentation of code as structured text has worked well for me, even when I bounce around between modules in the IDE.

sirwhinesalot · 56m ago

Indeed, putting related code together in the same source file is one of the ways we cope with complexity.

Though it too breaks down, because the relations between various bits of code may be so complex that there's no good way to "linearize" them.

And you should be documenting your code, but documentation comments take up space on the screen since they are linearly arranged in the same file, so you see less functions at a time.

Imagine as an alternative that the documentation was presented on a side view of the functions, like how you can open two files side by side in VSCode. Then you'd be able to see many more functions at the same time.

If you have any unit tests then it would be great if you could see them (and run them) while editing the function. In Rust you can put tests in the same file as the function (very nice) but usually on a submodule at the bottom of the file rather than near the function itself. Again, the problem of trying to linearize everything in a single file.

The issue with visual programming tools is that they don't put any thought into this. On how they could actually help get you the information you want to see. Instead they focus on letting you make cute drawings.

It's a UX problem, we should be able to do better than text files, even if what we end up editing is still text (because of all the advantages it has).

MaxLeiter · 12h ago

Drop a link!

sirwhinesalot · 2h ago

Sorry for the late reply, here you go:

https://btmc.substack.com/p/thoughts-on-visual-programming

amdivia · 6h ago

I assume that's something similar to the Unison [1] programming language

https://www.unison-lang.org/

hannasm · 15h ago

This article is barely a comment on some other situation; but I've been saying this to anyone who wants to listen for years.

There's nothing special about whitespace (unless you write python).

Capitalization and a bunch of other stuff in your coding convention document are usually just signs that you have poor tooling and lack of skill.

Give me a PR that satisfies the requirements and the appropriate test cases and i'll happily rewrite it to spaces only indented with curly braces on newlines and etc... as I see fit.

The hard part is the first two tasks, you can train an intern to do the third

hackerbrother · 20h ago

Along these lines, Go eliminates many formatting decisions at the syntax level. E.g.,

  func main()
  {
          fmt.Println("HELLOWORLD")
  }

is not just non-standard formatting, but illegal Go syntax. Similarly, extra parentheses around if clauses are not allowed.

TheDong · 13h ago

> Similarly, extra parentheses around if clauses are not allowed.

However 'if (x) == (1) {}' is totally fine with the formatter. As is an assignment of '(x) = (y)'.

It's actively annoying too because like, extra parentheses often have important meaning.

For example, consider the following code:

    if (x.isFoo() || x.isBar()) /* && x.isBaz() */ { /* code */ }

In that case, the code is obviously temporarily commented out, but go's formatting will make it so that if you comment it out like that, fmt, and then uncomment it and forget to re-add the parens, you get shot in the foot.

I've hit that far more times than it's uhh... I dunno, I guess removed parentheses I didn't want? I don't write them if I don't want them.
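To make the foot-gun above concrete, here is a minimal sketch in Python rather than Go. It relies only on `and` binding tighter than `or`, which mirrors `&&` binding tighter than `||` in Go. The three booleans are hypothetical stand-ins for `x.isFoo()`, `x.isBar()` and `x.isBaz()`.

```python
# Hypothetical stand-ins for x.isFoo(), x.isBar(), x.isBaz().
is_foo, is_bar, is_baz = True, False, False

# What was originally written: the parentheses group the "or" first.
with_parens = (is_foo or is_bar) and is_baz

# What you get if the parentheses were stripped while the last clause was
# commented out and then the clause is restored without re-adding them.
# "and" binds tighter, so this parses as is_foo or (is_bar and is_baz).
without_parens = is_foo or is_bar and is_baz

print(with_parens, without_parens)
```

With these particular values the two expressions disagree, which is the kind of silent behavior change the comment describes.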

Tractor8626 · 9h ago

Author advocates for a thing they never used.

"It must have been good because Grady Booch says so".

jaimebuelta · 11h ago

I think that formatting code is necessary to maintain a codebase that's used by multiple people and keep some consistency. It's very confusing to have different standards in different parts of the same code.

Code should be generally written so it's easy to read.

BlueUmarell · 6h ago

Ohhh, a topic about code formatting

inetknght · 5h ago

tbqh I would be insanely happy if a formatter would format code the team's way when committing/pushing and my way when I'm viewing locally

I've never found a single formatter that formats my way though...

fridental · 9h ago

Typing keywords letter by letter is unnecessary too. Think about ZX Spectrum keyboard allowing you to type BASIC keywords with just one key press.

whartung · 1h ago

This was actually a potential problem, at least on Commodore machines.

On those machines you were able to abbreviate keywords.

At the same time, they support full screen editing. That meant you could just cursor up over some code, make changes, hit enter, and the changes would take place.

However, when using the abbreviations, it was possible to create lines that were too long. I don't recall the specifics, but there was a line limit for BASIC input. Let's say it was 80 chars (for discussion).

Using abbreviations (like ? for print), you could end up with a line that would LIST for more than 80, but if you tried to change it with the screen editor, the line would be too long and would be truncated silently.

So you had to be cautious with your use of the abbreviations.

Cockbrand · 5h ago

Similar, and maybe more related to the article's topic: Commodore BASIC also saved the commands as tokens, so you could enter abbreviated commands like

  10 ? "Hello"
  20 gO 10

and a LIST command would yield

  10 print "Hello"
  20 goto 10

So saving commands as tokens in memory and formatting them on output was somewhat common back then.

The speccy was more advanced in terms of this (as mentioned in the parent comment), and it had the better BASIC for sure.
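A rough sketch of the tokenize-on-entry, expand-on-LIST idea, written in Python as a toy model. The token numbers, the abbreviation table and the function names are illustrative assumptions, not a faithful copy of any Commodore ROM, and splitting on spaces would break string literals that contain spaces.

```python
# Toy model only: token values and the abbreviation table are illustrative.
TOKENS = {"print": 0x99, "goto": 0x89}
ABBREVIATIONS = {"?": "print", "gO": "goto"}
KEYWORDS = {value: name for name, value in TOKENS.items()}


def tokenize(line):
    """Store keywords (or their abbreviations) as integer tokens."""
    stored = []
    for word in line.split(" "):
        keyword = ABBREVIATIONS.get(word, word)
        # Anything that is not a known keyword is kept as plain text.
        stored.append(TOKENS.get(keyword, word))
    return stored


def list_line(stored):
    """Expand tokens back to full keywords, the way LIST prints them."""
    return " ".join(KEYWORDS.get(item, item) for item in stored)


for entered in ['10 ? "Hello"', "20 gO 10"]:
    print(list_line(tokenize(entered)))
```

The listed line is longer than the entered one, which is also how an abbreviated line could LIST past the input limit mentioned upthread.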

cluckindan · 8h ago

Or think about the M (aka MUMPS) language, which allows you to type just the first letter(s) of a keyword and considers it valid syntax.

Imagine Java if you could…

    na com.mycompany.myapp;
    
    pu cl MyClass {
      pro sta i = 42;
    
      pri fi ch[] MAGIC = ['a', 'b'];

      pu sta v main(String[] args) {
        OtherClass otherClass = n OtherClass();
        f (i i = 0; i < MyClass.i; i++) {
          otherClass.hex(i, this.MAGIC);
        }
      }
    }

lxe · 19h ago

> you could view the source however you wanted. Spaces vs. tabs didn't matter because neither affects the semantics and the editor on the system let you modify the program tree directly (known today as projectional editing).

But formatting still doesn't matter. Outside of whitespace-dependent languages, formatting is a subjective thing -- it's a people concern, not a computer concern. I can store my JavaScript as AST if I want to.
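The same point can be sketched in Python instead of JavaScript, using the standard `ast` module: two differently formatted sources parse to the same tree, and the tree can be rendered back to text on demand. The helper name `same_program` and the sample function are made up for illustration.

```python
import ast

compact = "def add(a,b):return a+b"
spread_out = """
def add(
    a,
    b,
):
    return a + b
"""


def same_program(left, right):
    """Compare two sources by syntax tree, ignoring layout."""
    # ast.dump omits line and column attributes by default, so only
    # the structure of the program is compared.
    return ast.dump(ast.parse(left)) == ast.dump(ast.parse(right))


print(same_program(compact, spread_out))
# Render the stored tree back to text in one canonical layout (Python 3.9+).
print(ast.unparse(ast.parse(compact)))
```

Note that comments are not part of this tree, so a real AST-based store would need to keep them some other way.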

kmoser · 16h ago

There are annoying edge cases where formatting does matter, such as whitespace around HTML text nodes, e.g.:

  <span>foo</span>

vs:

  <span>
    foo
  </span>
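A small sketch of why the two snippets above are not interchangeable, using Python's standard `html.parser`. The `TextCollector` class and `text_of` helper are hypothetical names for illustration. How a browser finally renders the whitespace depends on CSS and context, but the text nodes themselves already differ.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the raw text found between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def text_of(markup):
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


inline = "<span>foo</span>"
multiline = "<span>\n  foo\n</span>"

print(repr(text_of(inline)))
print(repr(text_of(multiline)))
```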

grumbelbart2 · 10h ago

Sure, but the same goes for verbatim strings, or leading whitespaces in python. "Code formatting" questions usually only concern themselves with those degrees of freedom that do not alter the semantic meaning of the code.

kmoser · 6h ago

Agreed, but my experience is that many formatters will try to "help" by formatting non-code, thereby breaking something.

jmward01 · 18h ago

I have gotten into discussions with people about linters and code formatting standards in general and I always liken it to a work desk. If my company decided that every work desk had to be 100% generic and that every day if I put any adjustment, even to the seat height, on that desk they would reset it, I would probably think that place was hostile. Even if I could 'auto format' it back to something close every time I stepped up to the desk I would be pretty unhappy. It just wouldn't feel like mine and eventually I would be beaten into whatever style, which wasn't my own, the code came out of the repo as. Basically, linters are evil. They only work for the person that set them up.

Leave code format up to the primary owner of the file. It is pretty rare that code has more than one person that does 95% of the edits on a file so let them own the formatting. In the rare case where there are shared files with shared edits then it is ok to mandate some sort of enforced format but those are so rare that it generally isn't worth discussing. The proposed approach here ignores all the messy non-standard stuff that happens because of the margins or the rules that are very hard to build in when codifying personal coding style.

Let me have my messy desk and I'll let you have yours.

semiinfinitely · 3h ago

some people do not understand the difference between form and substance

gethly · 12h ago

I am quite thankful for Go. It ticks so many boxes. Having native gofmt and simple syntax means I can easily read anyone else's code instantly.

Perz1val · 10h ago

One thing I absolutely despise is 2 space indentation. Instead of not creating stupid 5+ level deep nesting, people decided to decrease the indentation size. Then everyone uses tab colorizers or those stupid vertical margins in vscode, because you can't read it otherwise. And then those people put it everywhere with some opinionated formatters - looking at you Elixir. I don't even have vision problems and it's already shit to look at.

shmerl · 21h ago

You can't easily search / grep etc. an IR, unless you use some kind of reverse translator. Readable source files have their benefits in being simple in that sense.

marssaxman · 20h ago

Imagine having to write a new diff tool for each language!

kesor · 19h ago

You don't need a special grep for every language, you just need a tool that translates the mini version into the formatted version and back. Then you chain the tools, just like anything else in UNIX.

marssaxman · 18h ago

Seems reasonable. Since you're likely to perform this translation more than once for any given file, it seems like it would be practical to cache the translated output, perhaps as a file on disk.

eviks · 16h ago

> unless you use some kind of reverse translator

Would a few decades help in universally having such a translator in all the tools?

account42 · 12h ago

In all the tools? No, you'd need an infinite amount of time for that.

eviks · 12h ago

The tools didn't require an infinite amount of time to write, why would it take infinity to change a format??? (but no not all the tools, just all the ones that are used)

yeasku · 19h ago

Kind of a stupid take if you ever plan on sharing your code or using git.

kesor · 19h ago

Is it? When every developer has an IDE that can easily format the code in whichever way they prefer, and minify it back just before pushing a commit.

MaxLeiter · 19h ago

No reason websites couldn’t let you choose how to view it like editors, either

yeasku · 19h ago

The world is not javascript.

globular-toast · 5h ago

I feel like squeezing out every possible place for people to be creative and express themselves creates a pretty boring environment. Like, imagine if all buildings were just purely functional. They'd all be the same grey box shaped things. I recently worked on a codebase that was lovingly developed by a single maintainer and had all kinds of weird and wonderful quirks that made me smile every so often. I'm a pretty serious person most of the time but I'm not going to lie it made my day that little bit more interesting.

bertil · 20h ago

That’s essentially what black has done with Python, though.

dubya · 18h ago

I want to like Black (or rather, uv format), but the mandatory trailing commas weird me out, especially in function definitions. It always looks like an error to me.

thewisenerd · 14h ago

this is so future 'git diff's when adding new parameters don't look bad

i wonder how many default formatting decisions are made this way (including go fmt, etc)
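The diff argument can be checked with a quick sketch using Python's standard `difflib`. The function `f` and the helper `changed_lines` are made up for illustration. The helper counts added and removed lines when a parameter `c` is appended, first without and then with a trailing comma.

```python
import difflib


def changed_lines(before, after):
    """Count added and removed lines in a unified diff of two sources."""
    diff = list(
        difflib.unified_diff(
            before.splitlines(), after.splitlines(), lineterm="", n=0
        )
    )
    body = diff[2:]  # skip the '---' and '+++' header lines
    return sum(1 for line in body if line.startswith(("+", "-")))


no_comma_before = "def f(\n    a,\n    b\n):\n    pass"
no_comma_after = "def f(\n    a,\n    b,\n    c\n):\n    pass"

comma_before = "def f(\n    a,\n    b,\n):\n    pass"
comma_after = "def f(\n    a,\n    b,\n    c,\n):\n    pass"

print(changed_lines(no_comma_before, no_comma_after))
print(changed_lines(comma_before, comma_after))
```

Without the trailing comma the line for `b` has to change as well, so the diff touches three lines instead of one.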

dubya · 5h ago

I've seen the reason, I just don't find it convincing. I would like to omit these commas, but also use the formatter sometimes, but that's not really an option. It's frustrating since 'uv format' will omit them for older python versions, so the logic is there.

Something between "everything fits on one short line" and "every argument gets its own line" would be nice too. Spreading a function definition or call across ten lines when it would fit on two or three doesn't feel like an automatic win.

Jean-Papoulos · 14h ago

So I need to run a tool to even be able to read the code. No thank you, if anything goes wrong your file is now garbage.

benrutter · 13h ago

In theory that's true of standard office document files too (.docx etc)? Don't think I've encountered too many issues of that actually happening though in the wild?

I'm mainly just being pedantic to be honest, I realise my comment is just me essentially saying "what could possibly go wrong?"

account42 · 12h ago

Word documents absolutely do get messed up if you open them in a different office suite - or even a different enough version of the same suite.

IshKebab · 9h ago

Doesn't seem like a real risk - these files are all version controlled.

The bigger problem is you now need custom tooling for your IDE, version control, diff & merge, code review, code hosting, etc. etc.

conartist6 · 8h ago

You're spot-on. HTML and XML are similar textual embedding formats for documents which can contain arbitrary text. Neither is particularly good when the documents are syntax trees though. CSTML takes the design ideas that obviously work and adapts them to be natural for syntax trees, like this: https://gist.github.com/conartist6/75dc969b685bbf69c9fa9800c.... The trees always store inside themselves the complete source code as the parser saw it, so there's little fear of not being able to recover the original, and they're quite human-readable unlike, say, docx files. My team and I are working on building out the VCS, diff and merge, review, hosting, grepping, blogging and other kinds of solutions necessary to make our format able to take over as a de-facto standard.

dark-star · 13h ago

In most languages (other than e.g. Python), formatting code is actually unnecessary...

pluto_modadic · 15h ago

my initial gut take from the title was "OH MY GOODNESS don't let {x} toxic developer who writes TERRIBLE code see this as justification". There are enough toxic bros who either 1. think this is unnecessary or 2. think it is perfectly solved.

re: intermediate representation and projectional editing: yes, editors are now getting better at helping you refactor code (rename function in language XYZ is possible in language servers for IDEs, /no AI required, it works better when a human coded AST tool does it/)

projectional editors aren't around /because the more complex parts of it are harder/ - BUT - I could definitely see more intelligent refactor tooltips written by humans.

For example: in Rust, if I've been passing a pointer vs borrowing (or whatever), pattern A for most of my code, then pattern B and it complains, it would be useful to have a tooltip that goes "do you want to refactor all the other references/parameters to pattern B" instead of Rust's default "this function isn't using pattern A" borrow checker error.

anacrolix · 7h ago

you don't say

paphillips · 6h ago

Ahh yes, nothing brings out the strong opinions like formatting. Let's do SQL next!

pbiggar · 11h ago

This is how darklang works as well.

komali2 · 14h ago

Formatting code is unnecessary. You write it however you want, then run the lint fix command and prettier command, commit, and move on.

jgalt212 · 7h ago

line length limits make code harder to grep

_ZeD_ · 14h ago

good luck with that.. I've never had a decent IDE capable of even implementing elastic tab stops... I expect people using hex editors to flip bytes directly on the binary blob saved on disk instead of using the appropriate tools to view the "source" files.

btw: have a look at how much disdain was reserved for systemd and its plethora of binary blobs + custom tools (e.g. the journal stuff) ... and that was basically forced upon us by the distributions

pandemic_region · 13h ago

I wish for Java to have a built-in formatter just like Go has. Enough with these plugins and google-format and messy IntelliJ settings.

cnnlives83 · 20h ago

It basically is, unless you’re in a whitespace-Nazi language like Python (no offense!).

It doesn’t get much less formatted than Minified JavaScript, except maybe Perl or Brainfuck.

kesor · 19h ago

Minified JS often comes with mangling the names of functions, variables, etc... Formatters and prettifiers lack the ability to bring back the original names and meaning.

numtel · 12h ago

With how much LLMs do nowadays, I'm waiting for the time when specifying types is unnecessary. Like, if it can write code, shouldn't we also be able to have an AI type checker?

linhns · 4h ago

It writes better when correct type is specified beforehand. So chicken and egg problem you have.

IshKebab · 10h ago

Systematic checking like that is pretty much an LLM's worst case. They're really bad at it. Definitely better to use them to suggest types.

Frieren · 12h ago

Correctness is a key characteristic of a compiler.

To have something that sometimes checks the types and sometimes doesn't is not a feasible solution.
