Microsoft Office migration from Source Depot to Git (danielsada.tech)
305 points by dshacker on 6/12/2025, 12:15:23 AM | 246 comments
TFA throws some shade at how "a single get of the office repo took some hours" then elides the fact that such an operation was practically impossible to do on git at all without creating a new file system (VFS). Perforce let users check out just the parts of a repo that they needed, so I assume most SD users did that instead of getting every app in the Office suite every time. VFS basically closes that gap on git ("VFS for Git only downloads objects as they are needed").
Perforce/SD were great for the time and for the centralised VCS use case, but the world has moved on I guess.
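For what it's worth, stock git can now get most of the way to that "check out only what you need" workflow without VFS, via partial clone plus sparse checkout (the scalar wrapper that ships with recent git wires both up for you). A rough sketch, with a made-up repo URL and paths rather than the actual Office setup:

    # Blobless clone: commit history and trees come down, file contents are fetched on demand
    git clone --filter=blob:none https://example.com/bigrepo.git
    cd bigrepo

    # Materialise only the subtrees you actually work on
    git sparse-checkout set --cone apps/excel shared/lib

Everything else stays server-side until some command actually touches it, which is roughly the gap VFS for Git was built to close.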
It uses the same technology built into Windows that remote-drive programs (probably) use.
Personally I kind of still want some sort of server-based VCS which can store your entire company's source without needing to keep the entire history locally when you check something out. But unfortunately git is still good enough to use on an ad-hoc basis between machines for me that I don't feel the need to set up a central server and CI/CD pipeline yet.
Also being able to stash, stage hunks, and interactively rebase commits are features that I like and work well with the way I work.
So when you have a repo that's hundreds of GB in size, the entire history can be massive.
It was not. It was literally a fork of Perforce with the executable renamed from p4 to sd.exe; the command line was pretty much identical.
I moved to Google from Microsoft and back when employee orientation involved going to Mountain View and going into labs to learn the basics, it was amusing to see fresh college hires confused at not-git while I sat down and said "It's Source Depot, I know this!"[1]
[1] https://www.youtube.com/watch?v=dFUlAQZB9Ng
A legacy tool might be bad, or it might be very good but just unpopular. A company that devotes political will to modernize for the sake of modernizing is the kind of craziness we get in the JS ecosystem.
I can, if the version control software is just not up to standards.
I absolutely didn’t mind using mercurial/hg, even though I literally haven’t touched it until that point and knew nothing about it, because it is actually pretty good. I like it more than git now.
Git is a decent option that most people would be familiar with, cannot be upset about it either.
On the other hand, Source Depot sucked badly; it felt like I had to fight against it the entire time. I wasn’t upset because it was unfamiliar to me. In fact, the more familiar I got with it, the more I disliked it.
They might not be upset in the first few weeks, but after a month or so they will be familiar with the pain.
Credit where credit is due at my time at Excel we did improve things a lot (migration from Script# to TypeScript, migration from SourceDepot to git, shorter dev loop and better tooling etc) and a large chunk of development time was spent on developer tooling/happiness.
But it does suck to have to go to one of the old places and use Source Depot and `osubmit` (the "make a change" tool), then click through 16 popups in the "happy path" to submit your patch for review (also done in a weird Windows GUI review tool).
Git was quite the improvement :D
I was asking because I wonder what enterprises that want to use AI (like LLMs) in their workflows while keeping their data and pipelines 100% owned and air-gapped are doing right now.
Feels to me like one of the few areas where you could compete with the big labs; I might be wrong.
What I hate about bitbucket is how stagnated it is.
https://git-scm.com/docs/partial-clone
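For the "don't keep the entire history locally" part specifically, a couple of hedged one-liners (the URL is a placeholder):

    # Keep the full commit graph but fetch trees and blobs lazily as commands need them
    git clone --filter=tree:0 https://example.com/bigrepo.git

    # Or skip old history entirely and take only the latest snapshot
    git clone --depth=1 https://example.com/bigrepo.git

It's not a central-server model like SD/Perforce, but it avoids paying for hundreds of GB of history up front.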
Quite a few devs were still using it even then. I wonder if everything has been migrated to git yet.
VSS (Visual SourceSafe) being Microsoft's own source versioning product, unlike Source Depot which was licensed from Perforce.
One of the people who joined Microsoft via the acquisition was Brian Harry, who led the development of Team Foundation Version Control (part of Team Foundation Server - TFS) which used SQL Server for its storage. A huge improvement in manageability and reliability over VSS. I think Brian is retired now - his blog at Microsoft is no longer being updated.
From my time using VSS, I seem to recall a big source of corruption was its use of network file locking over SMB. If there were a network glitch (common in the day) you'd have to repair your repository. We set up an overnight batch job to run the repair so we could be productive in the mornings.
Indeed, my experience with VSS was also not amazing, and we certainly got corrupted files too.
Shared database files (of any kind) over SMB... shudder Those were such bad days.
I started a job at MSFT in 2004 and I recall someone explaining that VSS was unsafe and prone to corruption. No idea if that was true, or just lore, but it wasn't an option for work anyway.
I think VSS was fine if you used it on a local machine. If you put it on a network drive things would just flake out. It also got progressively worse as newer versions came out. Nice GUI, very straight forward to teach someone how to use it (checkout file, change, check in like a book), random corruptions about sums up VSS. That checkin/out model seems simpler for people to grasp. The virtual/branch systems most of the other ones use is kind of a mental block for many until they grok it.
It's an absurd understatement. The only people that seriously used VSS and didn't see any corruption were the people that didn't look at their code history.
When I was at Microsoft, Source Depot was the nicer of the two version control systems I had to use. The other, Source Library Manager, was much worse.
kinda nice to know it wasn't just our experience
Migrating from MSMAIL -> Exchange, though - that was rough
I have more long, boring stories about projects there, but that’s for another day
MSMAIL was designed for Win3.x. Apps didn't have multiple threads. The MSMAIL client app that everyone used would create the email to be sent and store the email file on the system.
An invisible app, the Mail Pump, would check for email to be sent and received during idle time (N.B. Other apps could create/send emails via APIs, so you couldn't have the email processing logic in only the MSMAIL client app).
So the user could hit the Send button and the email would be moved to the Outbox to be sent. The mail pump wouldn't get a chance to process the outgoing email for a few seconds, so during that small window, if the user decided that they had been too quick to reply, they could retract that outgoing email. Career-limiting move averted.
Exchange used a client-server architecture for email. Email client would save the email in the outbox and the server would notice the email almost instantly and send it on its way before the user blinked in most cases.
A few users complained that Exchange, in essence, was too fast. They couldn't retract a misguided email reply, even if they had reflexes as quick as the Flash.
That is something that's actually pretty common and called "benevolent deception" - it has been discussed on HN some years past, too [1].
[1] https://news.ycombinator.com/item?id=16289380
"How do you write code so that it compiles differently from the IDE vs the command line?" to which the answer was "If you do this your colleagues will burn you in effigy when they try to debug against the local build and it works fine"
Thanks for sharing this authentic story! As an ex-MSFT in a relatively small product line that only started switching to Git from SourceDepot in 2015, right before I left, I can truly empathize with how incredible a job you guys have done!
I wonder if Microsoft ever considered using BitKeeper, a commercial product that began development in 1998 and had its public release in 2000. Maybe centralized systems like Perforce were the norm and a DVCS like BitKeeper was considered strange or unproven?
They should have recalled it to avoid continued public use…
I spent a couple years at Microsoft and our team used Source Depot because a lot of people thought that our products were special and even Microsoft's own source control (TFS at the time) wasn't good enough.
I had used TFS at a previous job and didn't like it much, but I really missed it after having to use Source Depot.
It should have existed around the same time and other parts of MS were using it. I think it was released around 2005 but MS probably had it internally earlier.
NT created (well, not NT itself; IIRC there was some MS-internal developer tools group in charge)/moved to Source Depot, since a shared file system doesn't scale well to thousands of users. Especially if some file gets locked and you DoS the whole division.
Source depot became the SCCS of choice (outside of Dev Division).
Then git took over, and MS had to scale git to NT-size scale, and upstream many of the changes to git mainline.
Raymond Chen has a blog that mentions much of this - https://devblogs.microsoft.com/oldnewthing/20180122-00/?p=97...
It wasn't too bad for a centralized source control system tbh. Felt a lot like SVN reimagined through the prism of Microsoft's infamous NIH syndrome. I'm honestly not sure why anyone would use it over SVN unless you wanted their deep integration with Visual Studio.
Can't explain TFS though, that was still garbage internally and externally.
Get this: if you wanted to change a file you had to check it out. It was then locked and no one else could change it. Files were literally read-only on your machine unless you checked them out. The 'one at a time please' approach to source control (the other approach being 'let's figure out how to merge this later').
They were file based revision control, not repository based.
SVN added folders like trunk/branches/tags that overlaid the file based versioning by basically creating copies of the files under each folder.
Which is why branch creation/merging was such a complicated process, because if any of the files didn't merge, you had a half merged branch source and a half merged branch destination that you had to roll back.
A pretty good tradeoff, because you can set it on complex structured files (e.g. PSDs and the like) to avoid the ballache of getting a conflict in an unmergeable file, but it does not block editing code.
And importantly anyone can steal locks by default. So a colleague forgetting to unlock and going on holidays does not require finding a repo admin.
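Assuming the SVN flavour of this is what's meant (an assumption on my part; Perforce's exclusive-open filetype is similar in spirit), the workflow looks roughly like this, with a made-up asset path:

    # Mark an unmergeable file so it comes down read-only until someone takes the lock
    svn propset svn:needs-lock '*' art/hero.psd
    svn commit -m "Require a lock on hero.psd"

    # Take the lock before editing, release it when done
    svn lock -m "editing hero pose" art/hero.psd
    svn unlock art/hero.psd

    # Colleague went on holiday holding the lock? Steal it.
    svn lock --force art/hero.psd

Code files never get the property, so day-to-day merging is untouched.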
It sucked; but honestly, not using anything is even worse than SourceSafe.
I still ‘member articles calling it a source destruction system. Good times.
> It sucked; but honestly, not using anything is even worse than SourceSafe.
There have always been alternatives. And even when you didn’t use anything, at least you knew what to expect. Files didn’t magically disappear from old tarballs.
Highly debatable.
CVS has a horrendous UI, but didn’t have a tendency to corrupt itself at the drop of a hat and didn’t require locking files to edit them by default (and then require a repository admin to come in and unlock files when a colleague went on holidays with files checked out). Also didn’t require shared write access to an SMB share (one of the reasons it corrupted itself so regularly).
And afaik P4 still does good business, because DVCS in general and git in particular remain pretty poor at dealing with large binary assets so it’s really not great for e.g. large gamedev. Unity actually purchased PlasticSCM a few years back, and has it as part of their cloud offering.
Google uses its own VCS called Piper which they developed when they outgrew P4.
But yeah, it's basically all about having binaries in source control. It's not just game dev, either - hardware folk also like this for their artifacts.
And yes, p4 just rolls with it; git lfs is a creaky hack.
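For anyone lucky enough not to have used it, the creak is mostly the extra moving parts; a minimal sketch of the workflow (patterns and paths are just examples):

    git lfs install                  # one-time: hooks LFS into git's clean/smudge filters
    git lfs track "*.psd" "*.fbx"    # store matching files as small pointer files
    git add .gitattributes art/hero.psd
    git commit -m "Add hero art via LFS"
    git push                         # pointers go to the git remote, payloads to the LFS server

Forget the track step before your first commit and the real blob lands in history anyway, which is where a lot of the pain starts.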
I think I read somewhere that game dev teams would also check in the actual compiler binary and things of that nature into version control.
Usually it's considered "bad practice" when you see, like, an entire sysroot of shared libs in a git repository.
I don't even have any feeling one way or another. Even today "vendoring" cpp libraries (typically as source) isn't exactly rare. I'm not even sure if this is always a "bad" thing in other languages. Everyone just seems to have decided that relying on a/the package manager and some sort of external store is the Right Way. In some sense, it's harder to make the case for that.
The better organised projects I've worked on have done this, and included all relevant SDKs too, so you can just install roughly the right version of Visual Studio and you're good to go. It doesn't matter if you're not on quite the right point revision or haven't got round to doing the latest update (or had it forced upon you); the project will still build with the compiler and libraries you got from Perforce, same as for everybody else.
I think there are three main issues:
1. Since it's a distributed VCS, everyone must have a whole copy of the entire repo. But that means anyone cloning the repo or pulling significant commits is going to end up downloading vast amounts of binaries. If you can directly copy the .git dir to the other machine first instead of using git's normal cloning mechanism then it's not as bad, but you're still fundamentally copying everything.
2. git doesn't "know" that something is a binary (although it seems to in some circumstances), so some common operations try to search them or operate on them in other ways as if they were text. (I just ran git log -S on that repo and git ran out of memory and crashed, on a machine with 64GB of RAM.) See the sketch after this list.
3. The cure for this (git lfs) is worse than the disease. LFS is so bad/strange that I stopped using it and went back to putting the tarballs in git.
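On point 2: you can at least tell git to treat particular paths as opaque via .gitattributes. I'm not sure it saves the pickaxe/log -S case, but it stops textual diffs, text merges, and line-ending conversion on them (patterns are examples):

    # .gitattributes
    *.tar.gz   binary          # shorthand for -diff -merge -text
    blobs/**   -diff -text     # never diff or re-encode anything under blobs/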
We built Oxen to solve this problem: https://github.com/Oxen-AI/Oxen (I work at Oxen.ai).
Source control for large data. Currently our biggest repository is 17 TB. Would love for you to try it out. It's open source, so you can self-host as well.
And it’s no sweat off p4’s back.
In both cases the designer does not recompile, but in the second case there are no checked-in binaries in the repo... I still think NuGet/Maven would be more appropriate for this task...
VCS + Nuget: half the things are in the VCS, you checkout the project and then you have to hunt down a bunch of packages from a separate thing (or five), when you update the repo you have to update the things, hopefully you don't forget any of the ones you use, scripts run on a prayer that you have fetched the right things or they crash, version sync is a crapshoot, hope you're not working on multiple projects at the same time needing different versions of a utility either. Now you need 15 layers of syncing and version management on top of each project to replicate half of what just checking everything into P4 gives you for free.
Oh, and there's things like x509/proxy/whatever errors when on a corpo machine that has ZScaler or some such, so you have to use internal Artifactory/thing but that doesn't have the version you need or you need permissions to access so.. and etc etc.
One does not forget which NuGet packages are used: VS projects do that bookkeeping for you. You update the VS project with the new packages your task requires, and this bookkeeping carries over when you merge your PR.
I have seen this model work with no issues in large codebases: VS solutions with upwards of 500,000 lines of code and 20-30 engineers.
Also, where does nuget get this stuff from? It doesn't build this stuff for you, presumably, and so the binaries must come from somewhere. So, you just got latest from version control to get the info for nuget - and now nuget has to use that info to download that stuff?
And that presumably means that somebody had to commit the info for nuget, and then separately upload the stuff somewhere that nuget can find it. But wait a minute - why not put that stuff in the version control you're using already? Now you don't need nuget at all.
AOSP with 50M LoC uses a manifest-based, depth=1 tool called repo to glue together a repository of repositories. If you’re thinking “why not just use git submodules?”, it’s because git submodules has a rough UX and would require so much wrangling that a custom tool is more favorable.
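For the curious, the flow is roughly this (the manifest URL is AOSP's public one; the branch and job count are just examples):

    # Fetch the manifest repo that pins every project to a revision
    repo init -u https://android.googlesource.com/platform/manifest -b main --depth=1

    # Clone/sync every project the manifest names, shallow and in parallel
    repo sync -c -j8

Each project is still an ordinary git repo underneath; repo is just the glue that checks out the right revision of each one.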
Meta uses a custom VCS. They recently released sapling: https://sapling-scm.com/docs/introduction/
In general, the philosophy of distributed VCS being better than centralized is actually quite questionable. I want to know what my coworkers are up to and what they’re working on to avoid merge conflicts. DVCS without constant out-of-VCS synchronization causes more merge hell. Git’s default packfile settings are nightmarish — most checkouts should be depth==1, and they should be dynamic only when that file is accessed locally. Deeper integrations of VCS with build systems and file systems can make things even better. I think there’s still tons of room for innovation in the VCS space. The domain naturally opposes change because people don’t want to break their core workflows.
I think it is neat that at least one company with mega-repos is trying to lift all boats, not just their own.
Every month or two, we get notifications along the FINAL WARNING lines, telling us about some critical system about to be deleted, or some new system that needs to be set up Right Now, because it is a Corporate Standard (that was never rolled out properly), and by golly we have had enough of teams ignoring us, the all powerful Board has got its eyes on you now.
It's a full time job to keep up with the never-ending churn. We could probably just spend all our engineering effort being compliant and never delivering features :)
Company name withheld to preserve my anonymity (100,000+ employees).
Add in the various spam (be it attacks or just random vendors trying to sell something).
At some point, people start to zone out and barely skim, if that, most of their work emails. Same with work chats, which are also more prone to people sharing random memes or photos from their picnic last week or their latest lego set.
My #1 method of keeping my inbox clean, is unsubscribing from newsletters.
I also set up a rule to auto-delete phishing test emails based on their headers, which annoyed the security team.
TFS was developed in the Studio team. It was designed to work on Microsoft scale and some teams moved over to it (SQL server). It was also available as a fairly decent product (leagues better than SourceSafe).
I’d never heard of Source Depot before today.
Basically, instead of everyone creating their own short-lived branches (an expensive operation), you would have long-lived branches that a larger group of people would commit to (several product areas). The branch admin's job was then to get the work of all of these people forward integrated to a branch upwards in the hierarchy. This was attempted a few times per day, but if tests failed you would have to reach out to the responsible people to get those tests fixed. Then later, when you get the changes merged upwards, some other changes have also been made to the main integration branch, and now you need to pull these down into your long-lived branch - reverse integration - such that your branch is up to date with everyone else in the company.
I'd love a system that would essentially be a source control of my patches, while also allowing a first class view of the upstream source + patches applied, giving me clear controls to see exactly when in the upstream history the breakages were introduced, so that I'm less locking in precise upstream versions that can accept the patches, and more actively engaging with ranges of upstream commits/tags.
I can't imagine how such a thing would actually be commercially useful, but darned if it wouldn't be an obvious fit for AI to automatically examine the upstream and patch history and propose migrations.
I’ve only ever really used CVS, SVN, and Git.
Subdirectories-as-branches (like bare repo + workspace-per-branch practices w/git) is so much easier for average computer users to grok, too. Very easy to admin too.
No idea what the current "enterprisey" offering is like, though.
For corporate teams, it was a game changer. So much better than any alternative at the time.
We're all so used to git that we've become numb to its terribleness and see every other system as deficient. Training and supporting a bunch of SWE-adjacent users (hw eng, ee, quality, managers, etc.) is a really, really good reality check on how horrible the git UX and data model are (e.g. obliterating secrets--security, trade, or PII/PHI--that get accidentally checked in is a stop-the-world moment).
For the record, I happily use git, jj, and Gitea all day every day now (and selected them for my current $employer). However, also FTR, I've used SCCS, CVS, SVN, VSS, TFS and MKS SI professionally, each for years at a time.
All of the comments dismissing tools that are significantly better for most use cases other than distributed OSS, but that lost the popularity contest, are shortsighted.
Git has a loooong way to go before it's as good in other ways as many of its "competitors". Learning about their benefits is very enlightening.
And, IIRC, p4 now integrates with git, though I've never used it.
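To make the obliterate comparison concrete (the file path is made up; git-filter-repo is a third-party tool, and everyone has to re-clone after the rewrite):

    # Perforce: an admin removes the file and its history, in place, on the server
    p4 obliterate -y secrets/creds.txt

    # Git: rewrite every affected commit locally, force-push, then get every clone re-synced
    git filter-repo --invert-paths --path secrets/creds.txt
    git push --force --all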
What makes it convoluted? Where did it lose the beat?
The ability to lock files centrally might seem outdated by the branching and PR model, but for some organizations the centralized solution works way better because they have built viable business processes around it. Centralized can absolutely smoke distributed in terms of iteration latency if the loop is tight enough and the team is cooperating well.
Perforce is a complete PITA to work with, too expensive and is outdated/flawed for modern dev BUT for binary files it's really the only game in town (closely followed by svn but people have forgotten how good svn was and only remember how bad it was at tracking branch merging).
Will they get an annoying window in the middle of the migration telling them that Office must be updated now or the world will end?
"Gosh, that sounds like a right mother," said Unix.
I'm sorry, what?! 4,000 engineers doing what, exactly?
Excel turns 40 this year and has changed very little in those four decades. I can't imagine you need 4,000 engineers just to keep it backwards compatible.
In the meantime we've seen entire companies built with a ragtag team of hungry devs.
Overall good story though
To have never touched it in the last decade? You've got a gap in your CV.
Turn it around: If I were to apply for a job at Microsoft, they would probably find that my not using Windows for over twenty years is a gap on my CV (not one I would care to fill, mind).
These are young industries. So most hiring teams expect that you take the time to learn new technologies as they become established.
Yes, there is a transition, no it isn't really that hard.
Anyone who views lack of git experience as a gap in a CV is selecting for the wrong thing.
(You can get 80% of the way there with symlinks, but in my experience they eventually break in git when too many different platforms are making commits.)
Also at one point I maintained an obscenely advanced test tool at MS, it pounded through millions of test cases across a slew of CPU architectures, intermingling emulators and physical machines that were connected to dev boxes hosting test code over a network controlled USB switch. (See: https://meanderingthoughts.hashnode.dev/how-microsoft-tested... for more details!)
Microsoft had some of the first code coverage tools for C/C++, spun out of a project from Microsoft Research.
Their debuggers are still some of the best in the world. NodeJS debugging in 2025 is dog shit compared to C# debugging in 2005.
Git has sparse client views with VFS these days.
Cross-repo commits are not a problem as long as you understand "it only counts as truly committed if the child repo's commit is referenced from the parent repo".
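Concretely, bumping a submodule is a two-commit dance (paths are made up):

    # In the child repo: land the change
    cd libs/parser
    git commit -am "Fix tokenizer bug"
    git push

    # In the parent repo: record the new child commit as the one that "counts"
    cd ../..
    git add libs/parser
    git commit -m "Bump libs/parser to the tokenizer fix"
    git push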
This is a big problem in my experience. Relying on consumers of your dependency to upgrade their submodules isn't realistic.
Is this a "git" failure or a "Linux filesystems suck" failure?
It seems like "Linux filesystems" are starting to creak under pressure from several directions (Nix needing binary patching, atomic desktops having poor deduplication, containers being unable to do smart things with home directories or too many overlays).
Would Linux simply sucking it up and adopting ZFS solve this or am I missing something?
OverlayFS has had performance issues on Linux for a while (once you start composing a bunch of overlays, the performance drops dramatically as well as you start hitting limits on number of overlays).
Other file systems, e.g. the much faster XFS, have equally efficient snapshots.
Git isn’t even very old, it came out in 2005. Microsoft Office first came out in 1990. Of course Office wasn’t using git.
Most people don’t know or realize that Git is where it is because of Microsoft. About 1/2 of the TFS core team spun out to a foundation where they spent several years doing things like making submodules actually work, writing git-lfs, and generally making git scale.
You can look for yourself at the libgit2 repo back in the 2012-2015 timeframe. Nearly the whole thing was rewritten by Microsoft employees as the earliest stages of moving the company off source depot.
It was a really cool time that I’m still amazed to have been a small part of.
AJAX, that venerable piece of kit that enabled every dynamic web-app ever, was a Microsoft invention. It didn't really take off, though, until Google made some maps with it.