Converting a Git repo from tabs to spaces (2016)

38 points · keybored · 82 comments · 5/2/2025, 1:06:56 PM · eev.ee

Comments (82)

mmastrac · 12h ago
I wish Git had a way to "skip" a commit for blame for mechanical changes like this. It's the one big shortcoming I keep running into. A commit should be able to be marked as "blame-free" and git blame should then walk up to the parent commit.

It might be expensive to compute but man it would be so useful.

Edit: TIL about .git-blame-ignore-revs. I am the 1 in 10000 for this one today, thanks.

js2 · 12h ago
It does. See `--ignore-revs-file`:

https://git-scm.com/docs/git-blame

You can configure a default:

  git config blame.ignoreRevsFile .git-blame-ignore-revs
GitHub supports it too:

https://docs.github.com/en/repositories/working-with-files/u...
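
For anyone who has not seen one of these files, here is a minimal Python sketch of the format, assuming (as I understand the git-blame docs) one full, unabbreviated commit hash per line, with `#` comments and blank lines ignored. The function name `parse_ignore_revs` and the sample hash are made up for illustration; this is not part of Git.

```python
import re

# Full (unabbreviated) object names: 40 hex chars for SHA-1, 64 for SHA-256.
_FULL_HASH = re.compile(r"^(?:[0-9a-fA-F]{40}|[0-9a-fA-F]{64})$")


def parse_ignore_revs(text: str) -> list[str]:
    """Return the commit hashes listed in an ignore-revs file's contents."""
    revs = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if not _FULL_HASH.match(line):
            raise ValueError(f"line {lineno}: not a full commit hash: {line!r}")
        revs.append(line)
    return revs


# Hypothetical file contents; the hash below is a placeholder, not a real commit.
sample = """\
# Convert tabs to spaces
0123456789abcdef0123456789abcdef01234567
"""
print(parse_ignore_revs(sample))
```

A check like this could run in CI to catch abbreviated hashes before they land in `.git-blame-ignore-revs`.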

I'm really curious though. This is a feature you've wished for: have you never bothered to run `man git-blame`, `git blame --help`, or Google for it? Git has supported it for ages and it's a trivially easy feature to find. Using your own description:

https://www.google.com/search?hl=en&q=git%20skip%20commit%20...

McP · 4h ago
Nice to see ignore-revs getting some love :)

I originally wrote it because I wanted to do a mass-refactoring to llvm-project to change its weird naming convention and "it will mess up git blame" was an objection that was raised. Getting ignore-revs landed took many iterations over several months (thanks Barret!) and at the end of it I felt so drained that I didn't have the energy to do the mass refactoring I originally planned. Oh well. Maybe someday.

mabster · 3h ago
A big thank you! Blame history being correct is something I care quite a bit about and I always add one of these files when I do formatting changes. I think I'm probably the only developer on my teams with this configured on though haha!
jayd16 · 7h ago
The annoying thing about git is that you can't really set this kind of stuff up globally for a project w/o digging into some custom hook solutions. They should really have some kind of default config file with all these things. I really don't understand why everything needs to be per user settings ONLY.
npendleton · 2h ago
Whoa, it looks like my old patch is getting fixed up properly now! We might be getting this feature https://lore.kernel.org/git/20250501214057.371711-4-gitster@...
IshKebab · 11h ago
It would be a lot more usable if you could put that info in the commit.
prepend · 6h ago
No. I don’t want the author to make that decision for me. I’d rather git record everything and then I can choose how to view or render it.

Different people have different view preferences.

OptionOfT · 11h ago
That on its own is a security risk, as it would introduce means to hide a commit in the commit itself.

At least with the . file you have to make 2 separate transactions.

IshKebab · 8h ago
No it wouldn't. You would still be able to see the commit in logs and file histories and if you ran blame without the skipping option.
mmastrac · 12h ago
I've been using Git since the early 2010s and this feature was released in Aug 2019 (https://github.blog/open-source/git/highlights-from-git-2-23...).

You don't think I looked for it for the first 7-8 years of using Git at least a few times and came up empty? Seems a little uncharitable. Hacker News is a place to learn about stuff, not be chided for missing a point note in a release.

Come on man, you've been using HN for almost as long as I have. Be curious, treat people's comments with charity, continue the life-long learning tradition.

Obligatory XKCD lucky 10,000 link: https://xkcd.com/1053/

js2 · 11h ago
You're right. My apologies. It wasn't meant as a critique. I've been using git even longer and my memory was that the feature had been there way before 2019. Time flies. Relevant commit:

https://github.com/git/git/commit/ae3f36dea16e51041c56ba9ed6...

xiaoyu2006 · 10h ago
I like how the length of the commit message is at the same order of magnitude as the commit itself.
keybored · 11h ago
Thanks. Git (proj.) commits can be an enjoyable read.
joshstrange · 12h ago
`.git-blame-ignore-revs` is probably what you are looking for

Example: https://gist.github.com/kateinoigakukun/b0bc920e587851bfffa9...

y-curious · 12h ago
My one gripe with this is that devs need to point their IDE to the file in the IDE settings. When I implemented .git-blame-ignore-revs, I got a lot of people complaining about blame disappearing completely and I had to point them all to editing IDE settings
braiamp · 11h ago
`blame -w` ignores the ones that are described in the article.
PhilipRoman · 11h ago
git-blame-ignore-revs is great, but ultimately a half measure. Replace blame with log -L
patrickthebold · 12h ago
Is .git-blame-ignore-revs what you are looking for?
kwk1 · 7h ago
See also `git blame -w`
joshstrange · 12h ago
> Then, commit! As per Yelp tradition when rewriting every single file in the whole codebase, I attributed the commit to Yelp’s lovable mascot Darwin. It stands out better in git blame, and it preserved the extremely critical integrity of my commit stats.

Interesting, I fully expected this blog post to touch on `.git-blame-ignore-revs` as a way to not "pollute" the git history but I'm not sure when that "came out". I found a Github issue from 2021 asking for support to be added to Github so it may just be newer.

How do other people feel about this? Massive code changes across the codebase? Where I work some people are (understandably) concerned about it "ruining" `git blame` or IDE tools to blame. It's not useful to see "Converting to spaces!" on every line you want more context on. Yes, you can step further back but that's always been a little awkward for me (at least in IntelliJ) but maybe I'm missing something. I just find it incredibly helpful to understand the context of why a line was last changed and I'd want to skip over any edits like tabs->spaces.

johnmaguire · 11h ago
zahlman · 11h ago
> I fully expected this blog post to touch on `.git-blame-ignore-revs` as a way to not "pollute" the git history but I'm not sure when that "came out".

Per https://news.ycombinator.com/item?id=43869828, it appeared August 2019 - so, indeed too late for OP.

e: Also, FTA:

> Blame is not, in fact, permanently ruined. git blame -w ignores whitespace-only changes.

matsemann · 12h ago
What if one instead rewrote the last commit for each line to use spaces for that line? Or just rewrite the whole history to have used spaces. Might break something in the history if one were to check out an old commit, though. And makes it hard to revert if something breaks due to changing to spaces (impossible to find the offender in the diff).
_Algernon_ · 12h ago
>Or just rewrite the whole history to have used spaces.

Ah, yes. The 1984 approach to coding

woodrowbarlow · 12h ago
`git blame -w` ignores whitespace-only changes, for what it's worth.
Alifatisk · 12h ago
Why would you want to convert from tabs to spaces?
diggan · 12h ago
> their mostly-Python codebase had always been indented with tabs

Tabs VS spaces isn't usually very important, but what's more important is that all the stuff is the same way. So if all the other codebases (in the same language) are using tabs, then make everything (in that language) use tabs. Consistency basically :)

gwbas1c · 7h ago
I used to agree with that, until I read this article. I would always use the IDE's default and "not care" as long as the code was consistent.

The problem with tabs is that they render as different widths in different contexts. For example, Visual Studio shows them as 4 spaces by default, but GitHub shows them as 8.

Puts me firmly in the spaces camp now.

InsideOutSanta · 6h ago
> The problem with tabs is that they render as different widths in different contexts

The funny thing is that this is why I prefer them. It means I control how indents render rather than the person who wrote the code.

mabster · 3h ago
I would agree, except that only deals with the left hand side of the code. We are also making decisions on the right hand side of the code to deal with line width as well which only really works if all developers have the same tab size.

Nowadays I just chuck format on save on all the code I deal with so I don't have to deal with any of this stuff anymore.

If we take this to its logical conclusion though, it would be pretty good if our tooling supported a difference between the view (using your own preferences) and storage (consistent code for committing to git or whatever).

diggan · 6h ago
> I would always use the IDE's default and "not care" as long as the code was consistent.

I mean, "just use the IDE's default" isn't really agreeing, unless that's what your entire organization does too, and you all use the same IDE :)

Alifatisk · 12h ago
I agree.
mcdonje · 11h ago
Because they're deranged control freaks who need to convert a single character that is semantically a tab into multiple characters that are an opinionated representation of a tab.

Devs: We need to separate concerns and split the view from the model.

Also devs: Someone might view the code differently!!1!

maw · 8h ago
A codebase that's formatted notgivingashittily is an accessibility issue. It's not just deranged control freakism.

Maybe Yelp's codebase was otherwise clean, but aside from golang projects (and the Linux kernel) I've come to associate tabs with unreadable slop code. Maybe your experience is different.

smrq · 6h ago
Forcing a single opinionated tab width is an accessibility issue -- a real one, not a weird heuristic that boils down to "tab fans can't format". I've read multiple accounts from people who need either very small tab widths (to accommodate unusually large font sizes for eyesight reasons without cascading off the side of the screen), or very large tab widths (to accommodate difficulty in seeing indentation differences, again for eyesight reasons).
umbra07 · 57m ago
I'm confused. how does handing control of the reading experience over to the reader = accessibility issue? isn't it the other way around? accessibility issues come in many different forms, and you can't accommodate them all yourself.
Defletter · 5h ago
I've been firmly pro-spaces ever since I discovered there was an everlasting war over this, and it came about primarily over documentation. Say you're writing documentation within a /***/ block, so each line is prefixed by three characters. Now say your documentation includes a code snippet. Or let's say that particular sections of the documentation (such as JavaDoc's @see) are indented so each line always starts after the @see. You end up with documentation indented with spaces because it's the only way to ensure consistency. And if you're doing it with your documentation... why not your code too?

However, my conviction has since been tested by Dart which opinionatedly forces you to use two-space indentation. There's no way to disable this and its IDE plugins enforce the style. I just find it so difficult to read, even with Rainbow Brackets. I yearn for Dart to use tabs just so I can configure the tabs to appear as four-space indentation. Or better yet, stop trying to coerce how people write their own code.

ooterness · 12h ago
zahlman · 11h ago
My best guess: using spaces selects for developers who understand how their editor works (which correlates with higher overall cluefulness), because they'd go insane otherwise.
david2ndaccount · 5h ago
Tabs are a control character and have no business being in a text file. Do you use ascii record separator characters too?
yjftsjthsd-h · 2h ago
> Do you use ascii record separator characters too?

The only reason I don't use them is because nothing supports/expects/shows them. The alternate history where we properly use them is a world where CSV isn't needed and we're better off for it.

IvyMike · 4h ago
Galaxy brain: indent using U+001F Unit Separator
imiric · 2h ago
No, but I do use Line Feed and Carriage Return.
rascul · 11h ago
This is the discussion I came here for.
mmastrac · 12h ago
Because it's the one true way, and tabs are WRONG.

Also Vim > Emacs, the new BSG was better than the old BSG, TNG is the best Trek, and all the other hashed-out flamewars of the 90's and 2000's. :)

evbogue · 12h ago
There's a debate about new BSG being better than old BSG?
mmastrac · 11h ago
I posit:

For every topic of A vs B where A and B are related in some way, no matter how small, there exists an argument C where two people take increasingly opposed positions about which is better.

HideousKojima · 11h ago
I actually love the original BSG. And the new one started out strong but the writers clearly didn't have a plan for where they wanted things to go despite the opening credits insisting the Cylons have a plan.
mixmastamyk · 5h ago
Agreed. Not to mention the original BSG was strangled in its crib for costing too much. Something a production in the aughts didn't have to worry much about.
y-curious · 12h ago
daneel_w · 12h ago
Because the latter is universal, and it can always align perfectly.

    # using tabs with tabsize 4
    
    some_func( eyesore,
                blah );
    
    some_func(
            eyesore,
            blah );
    
    some_func(
                eyesore,
                blah
            );
DaiPlusPlus · 11h ago
> and it can always align perfectly.

I'm firmly in Team Tab, and I want to arrest any misconception that us Tabbers would do anything as nonsensical as using our precious variable-width tab-stop chars for anything like column-aligning identifiers: we don't.

My very hard and fast rule is that tabs are for only indenting at the block level, while spaces are used for alignment after the initial tab chars; tabs must never be used on a line if preceded by any non-tab char.

Whereas I can't stand always-using-only-spaces-for-indenting-and-alignment - especially because when you're drag-selecting text most editors won't snap your selection to the indent level, so you get RSI in your wrist from having to make micro-movements to make sure you don't select more - or less - spaces than the intended indent. ...or worse: when moving the caret via the keyboard and having to tap your arrow-keys 4 or 8 times per indent instead of just once.

You spaces-only people are totally spaced out, man.

Ferret7446 · 4h ago
> I want to arrest any misconception that us Tabbers would do anything as nonsensical as using our precious variable-width tab-stop chars for anything like column-aligning identifiers: we don't.

The irony is that this is exactly what tab characters are used for. Have you wondered why they're called tabs? Because they're used for tabulation, making tables. They are intended for aligning columns in a table. Not for indentation.

imiric · 2h ago
I'm so happy that languages like Go have formatting tools that sidestep all these pointless discussions. :)
DaiPlusPlus · 4h ago
We aren't using typewriters anymore.
Cieric · 10h ago
I personally agree with this, but a lot of the tools out there break this easily. I'm curious if you have any tools that handle the formatting like this properly. I've written my own tool that will report invalid whitespace when following this, but it can't fix any of it automatically. The commonly used clang-format also messes up this scheme as it will convert alignment spaces to tabs.
DaiPlusPlus · 9h ago
I'll admit that I spend most of my time in Visual Studio which supports my preferences very well, including my .editorconfig (which is now the .editorconfig for my entire org... it's almost as if good ideas have a following ;)

I do understand the appeal and advantages of having automated+opinionated re-formatting as part of a gated check-in process, because it's about having a normalized and consistent representation in the canonical repo; the idea being that you'd have a git-hook that would apply your own preferred formatting style on checkout which would be undone on commit; alas, we're not quite there yet.

...but having a single, normalized format (even if everyone hates it for different reasons) is the reason why gofmt and clang-format stick to spaces. I remember (back in 2017) being forced to submit to gofmt's dominion over my code and it ruining my beautifully aligned mass-assignments - and in my frustration I complained about this on StackOverflow and almost immediately someone replied with a working solution: use C-style comments to "protect" whitespace from being mangled by gofmt, see here: https://stackoverflow.com/questions/46940772/how-can-i-use-g...

Also, apparently clang-format now supports tabs with some hoops: https://stackoverflow.com/questions/69135590/how-make-clang-... - does that work for you?

Cieric · 8h ago
I'm mainly in Visual Studio at my job as well, I was more asking for my personal projects since at work the issue has been "solved." Sadly the clang-format stuff doesn't work, while it looks like it supports tabs on the surface all those settings do (at least last I used it a year or 2 ago) is convert all of the tabs to spaces, do all the formatting it typically does, and then convert all x number of spaces back to tabs if they're at the beginning of the line. Effectively converting all the alignment spaces to tabs (leaving a few spaces at the end if it's not an even multiple.)

My tool at this point basically just has a bunch of rules like,

  1) if tab indentation changes, spaces for alignment aren't allowed
  2) tab indentation can never be off by more than 1 of the prior line
Also flags cases of trailing whitespace and I believe tabs not at the beginning of a line. Still debating how I'd like to handle fully spaced files as my current program reports no errors in that case, maybe just throw a warning somewhere that the file looks suspicious.
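
A rough Python sketch of what a checker with those rules might look like. This is not the tool described above; the function name, the messages, and the decision to skip blank lines when tracking depth are all assumptions made for the example.

```python
import re

# Leading tabs (indentation) followed by optional spaces (alignment).
_INDENT = re.compile(r"^(\t*)( *)")


def check_whitespace(source: str) -> list[tuple[int, str]]:
    """Return (line_number, message) pairs for lines that break the rules."""
    problems = []
    prev_tabs = None
    for number, line in enumerate(source.splitlines(), start=1):
        if not line.strip():
            if line:
                problems.append((number, "trailing whitespace"))
            continue  # blank lines do not affect indentation tracking
        match = _INDENT.match(line)
        tabs, spaces = len(match.group(1)), len(match.group(2))
        if "\t" in line[match.end():]:
            problems.append((number, "tab after non-tab character"))
        if line != line.rstrip():
            problems.append((number, "trailing whitespace"))
        if prev_tabs is not None:
            # Rule 1: no alignment spaces on a line that changes tab depth.
            if tabs != prev_tabs and spaces:
                problems.append(
                    (number, "alignment spaces on a line that changes tab depth")
                )
            # Rule 2: tab depth may differ from the prior line by at most one.
            if abs(tabs - prev_tabs) > 1:
                problems.append(
                    (number, "tab depth changes by more than one level")
                )
        prev_tabs = tabs
    return problems
```

Like the tool described above, this sketch reports nothing for a file indented purely with spaces, since every line then has a tab depth of zero.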
daneel_w · 11h ago
Sir, you are way out of alignment. Detabulate immediately.
zahlman · 11h ago
My style using spaces is

  some_func(
      eyesore,
      blah
  )
which would work just as well with tabs.

Many years ago, I used tabs, and set them to two-space indent. The former because the entire point is that tabs carry different semantic information - this is a level of indentation, not just making things align vertically - and allow each developer to set the indentation width to their preference. (The other comment from DaiPlusPlus explains the proper use of tabs, just as I did it.)

The latter because that makes them more square. Aesthetics matter.

I switched mostly out of peer pressure. But one argument I did find convincing is that setting some specific limit on line length - whether it's 72 or 78 or 80 or 100 or anything else - makes sense, and letting people change the amount of indentation defeats that purpose. That is: the guy who likes 8-space indents can't actually have them, because it produces a horizontal scroll for code that "conformed to the style guidelines" when written by the 2-space guy.

But now I alias names, break up complex subexpressions etc. to avoid questions of how to split code across multiple lines - and most lines in my code are nowhere near any such length limit. And I write short functions, so there aren't enough levels of indentation to matter.

And I use 4-space indents, because standards have value after all.

Asooka · 12h ago
We use 4-wide tabs and in our code style it would be

    some_func(
        verylongarg0,
        verylongarg1
    );
Which I feel is the most readable option. If you have to break the args into a vertical list, you want to use the least amount of whitespace before each arg. It's also a bit easier to read with every term starting at a tab break.
daneel_w · 11h ago
With a large set of arguments broken down to multiple lines I prefer to keep them clear of the function name.

    with_long_func_names(
        this_scheme,
        looks_muddled
    );
    
    long_func_name(
                    tidier,
                    scheme
                  );
But my main gripe with tabs is that no one agrees on the width.
Vendan · 11h ago
that's the entire point of tabs, they can be customized to what the person reading them wants. It's an accessibility issue (https://www.reddit.com/r/javascript/comments/c8drjo/nobody_t...).
mcdonje · 11h ago
Not agreeing on width is an argument in favor of tabs.
creeble · 11h ago
The beauty is you don’t have to.
jdrek1 · 11h ago
> But my main gripe with tabs is that no one agrees on the width.

That's the entire point of tabs. One tab means one indentation level and you as the user can decide how that's displayed. Spaces forces everyone to see the code exactly as whoever decided on his favourite width and that is in the best case "only" annoying to people with different preferences and in the worst case actively hurtful to people with disabilities.

The only argument spaces people ever have is "some of my colleagues are too stupid to properly indent with tabs and align with spaces" and that is trivially fixed by either of those:

- don't use alignment, it's useless anyway

- get better coworkers

- educate your coworkers

- use commit hooks to check wrong usage

So basically there is no argument left on the spaces side at all^[1]. Meanwhile tabs semantically mean "one indentation level", take up fewer bytes, and most importantly allow everyone to have their own preferences without affecting other people. And honestly I am insanely baffled by how many people don't get the importance of that last part. Accessibility like that costs you nothing but means the world to other people, similar to how we have ramps at public buildings for the elderly, wheelchair users, strollers, and so on. And not to mention the fact that there are a lot of autistic people in programming, who often have a harder time dealing with things not being as they want them to be. Is there any reason to choose an objectively inferior method and force that onto those demographics just because "muh alignment"?

[1] Okay fine, there is one: "Tools I don't own don't display tabs as I want them, for example GitHub with their retarded default of 8". But first of all you can change that if you're logged in and second you're supposed to use your IDE and not a web interface...

Asraelite · 8h ago
I would agree that there aren't any arguments for spaces and would be 100% on the side of tabs, except for one problem: variable width means you can't enforce a maximum column limit.

Some people don't care about column limits, but they're important to me because I like to tile multiple editor panes side-by-side with no space wasted.

The entire debate is stupid anyway and should already be a solved problem. If we used tooling that operates on syntax trees instead of source text, then every developer could have exactly the formatting they want without conflicts. I don't know why that isn't more widespread; the only language I know of to do it is Unison.

aeonik · 4h ago
Why can't you just have a linter or a hook check that (tabs*2 + chars) < $defined_width?
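
A minimal Python sketch of that check, assuming the team agrees on a nominal tab width (2 here, matching the formula) purely for measuring line length. The names `visual_width` and `too_long` are hypothetical, and tabs that appear after the indentation are counted as a single column, which is a simplification.

```python
def visual_width(line: str, tab_width: int = 2) -> int:
    """Width of a line when each leading tab counts as tab_width columns."""
    stripped = line.rstrip("\n")
    body = stripped.lstrip("\t")
    leading_tabs = len(stripped) - len(body)
    return leading_tabs * tab_width + len(body)


def too_long(lines, defined_width: int = 80, tab_width: int = 2):
    """Yield (line_number, width) for lines failing width < defined_width."""
    for number, line in enumerate(lines, start=1):
        width = visual_width(line, tab_width)
        if width >= defined_width:
            yield number, width
```

Readers who display tabs wider than the nominal width would still see some lines run past the limit on their screen; the check only guarantees the limit at the agreed width.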
SoftTalker · 12h ago
Well, obviously tabs should always be 8 spaces.
daneel_w · 11h ago
Not sure if you're joking since 8 makes the whole problem even worse :)
zzo38computer · 1h ago
Note that some files will need tabs, such as Gopher menus and Makefiles.
gwbas1c · 11h ago
FYI: If you're in the .net ecosystem, you can choose your tabbing style (tabs or spaces) with an .editorconfig file. Then running "dotnet format" will change everything for you. (And, if you use github, you can create actions to assert that the .editorconfig is followed.)
diggan · 8h ago
FWIW: EditorConfig isn't a ".net ecosystem" thing but works across a ton of languages, editors and IDEs: https://editorconfig.org/

Also, rather than using GitHub Actions to validate if it was followed (after branch was pushed/PR was opened), add it as a Git hook (https://git-scm.com/docs/githooks) to run right before commit, so every commit will be valid and the iteration<>feedback loop gets like 400% faster as you don't have to wait for Actions to finish.

gwbas1c · 7h ago
Git hooks require environment-specific configuration. CI enforcement makes sure that everyone follows the rules, even if they "forget" to set up the git hook.

Also: dotnet format is kind of slow, which is why git hooks aren't used where I work.

diggan · 6h ago
> CI enforcement makes sure that everyone follows the rules, even if they "forget" to set up the git hook.

Yeah, my wording was a bit poor (shouldn't have said "rather"), both are needed, one just helps you fix stuff faster :)

And if you write your hook in a language that can cross-compile and can easily deal with multiple platforms (Go, Rust, NodeJS, many options [probably .net too?]), it's really easy. Just need to make the setup of them part of the onboarding.

baobun · 4h ago
I would just approach this like text. Something like:

    # GNU sed; \( *\) lets each run convert the next tab that follows already-converted spaces
    find -type f -name '*.py' -exec sed -i 's/^\( *\)\t/\1    /' {} \+
, until you don't see a diff

Seems simpler to adjust that general approach to whatever codebase and replacement.
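
The same text-level idea as a single pass in Python, offered as a sketch rather than a drop-in tool. It assumes 4 spaces per tab and only touches the run of tabs at the very start of each line; the function name is made up.

```python
import re

# One or more tabs at the start of any line.
_LEADING_TABS = re.compile(r"^\t+", flags=re.MULTILINE)


def expand_leading_tabs(source: str, width: int = 4) -> str:
    """Replace each run of tabs at the start of a line with spaces."""
    return _LEADING_TABS.sub(lambda m: " " * (width * len(m.group())), source)
```

Because tabs after the first non-tab character are left alone, a tab inside a string literal on an ordinary line survives the conversion.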

gwbas1c · 7h ago
One funny anecdote: I once did a similar cleanup on a codebase that was mostly spaces, but a few tabs slipped in. (I just did a find and replace on \t -> " ")

Suddenly, one unit test broke. On closer inspection, whoever wrote it put a tab character into a string. I changed the test to use \t.
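
For a Python codebase, one way to spot this before a blanket replace is to ask the tokenizer which string literals contain a raw tab. This is a sketch under that assumption (the anecdote does not say which language was involved), and the function name is invented. f-strings are tokenized differently on newer Python versions and are not covered here.

```python
import io
import tokenize


def strings_with_literal_tabs(source: str) -> list[tuple[int, str]]:
    """Return (line_number, token_text) for string literals holding a raw tab."""
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # A raw tab character inside the literal, not the two-character escape \t.
        if tok.type == tokenize.STRING and "\t" in tok.string:
            hits.append((tok.start[0], tok.string))
    return hits
```

Any hits can be rewritten to use the `\t` escape first, so the later find-and-replace cannot change the string's value.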

s09dfhks · 3h ago
what is this furry tomfoolery
user9999999999 · 3h ago
whitespace is a terrible block scope definition, it's literally using 'invisible' characters to determine block scope! just use semi colons. LONG LIVE SEMI COLONS ;;;;;;;;;;;
kgwxd · 8h ago
I never understood why programmers universally like fixed width fonts, but then about half want just 1 of those characters to be batshit crazy.
gwbas1c · 10h ago
> One way or another, you must get this block in your devs’ Git configuration

Uhm, things like this should be enforced in CI. IE, as a rule that must pass in order for a pull request to be merged.