> The root inode is the root of the file system. Inode 0 can't be used for normal purposes and historically bad blocks were linked to inode 1 (inode 1 is no longer used for this purpose; however, numerous dump tapes make this assumption, so we are stuck with it). Thus the root inode is 2.
This is also echoed on the Wikipedia page for it.
The Linux kernel also has this comment explaining why it does not hand out that inode, for shmem for instance:
> Userspace may rely on the inode number being non-zero. For example, glibc simply ignores files with zero i_ino in unlink() and other places.
On macOS it's pretty clear that inode 0 is reserved:
> Users of getdirentries() should skip entries with d_fileno = 0, as such entries represent files which have been deleted but not yet removed from the directory entry
A file descriptor can't be -1 but it's not 100% clear whether POSIX bans other negative numbers. So Rust's stdlib only bans -1 (for a space optimization) while still allowing for e.g. -2.
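That space optimization can be seen directly (a minimal sketch, assuming a Unix target: `OwnedFd` declares only -1 as invalid, so the compiler reuses that bit pattern for `None`):

```rust
use std::mem::size_of;
use std::os::fd::{OwnedFd, RawFd};

fn main() {
    // Because -1 is the only forbidden value, Option<OwnedFd> needs no
    // separate discriminant: None is represented as the banned -1.
    assert_eq!(size_of::<Option<OwnedFd>>(), size_of::<RawFd>());
    // Other negative values such as -2 remain representable descriptors.
    println!("Option<OwnedFd> is {} bytes", size_of::<Option<OwnedFd>>());
}
```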
nulld3v · 4h ago
Also, there seems to be an effort brewing in the kernel to push userspace away from depending on inode #s due to the difficulty of guaranteeing uniqueness and stability across reboots. https://youtu.be/TNWK1zbTMOU
AndrewDavis · 4h ago
They definitely aren't unique even without reboots. Postfix uses the inode number as a queue id. At $dayjob we've seen reuse surprisingly quickly, even within a few hours. Which is a little annoying when we're log spelunking and we get two sets of results because of the repeating id!
(there is now a long queue id option which adds a time component)
koverstreet · 20m ago
The combination of st_ino and the inode generation is guaranteed to be unique (excepting across subvolumes, because snapshots screw everything up). Filesystems maintain a generation number that's incremented when an inode number is reused; this exists for NFS.
Unfortunately, it doesn't even seem to be exposed in statx (!). There's change_cookie, but that's different.
If anyone wants to submit a patch for this, I'll be happy to review it.
amiga386 · 4h ago
...but it's unique while the file exists, right?
The combination of st_dev and st_ino from stat() should be unique on a machine, while the device remains mounted and the file continues to exist.
If the file is deleted, a different file might get the inode, and if a device is unmounted, another device might get the device id.
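That identity can be checked directly: two hard links to one live file share a single (st_dev, st_ino) pair (a small sketch; the demo paths are invented):

```rust
use std::fs;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

// The pair that identifies a live file on one machine.
fn file_id(path: &Path) -> std::io::Result<(u64, u64)> {
    let m = fs::metadata(path)?;
    Ok((m.dev(), m.ino()))
}

fn main() -> std::io::Result<()> {
    let a = std::env::temp_dir().join("id-demo-a");
    let b = std::env::temp_dir().join("id-demo-b");
    let _ = fs::remove_file(&a);
    let _ = fs::remove_file(&b);
    fs::write(&a, b"x")?;
    fs::hard_link(&a, &b)?;
    // Two names, one inode: the (dev, ino) pairs are equal.
    assert_eq!(file_id(&a)?, file_id(&b)?);
    fs::remove_file(&a)?;
    fs::remove_file(&b)?;
    Ok(())
}
```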
the_mitsuhiko · 3h ago
> The combination of st_dev and st_ino from stat() should be unique on a machine
It should, but it seems no longer to be the case. I believe there was an attempt to get a sysctl flag in to force the kernel to return the same inode for all files to see what breaks.
londons_explore · 3h ago
> ...but it's unique while the file exists, right?
I don't think all filesystems guarantee this. Especially network filesystems.
amiga386 · 3h ago
That's a problem for programs that do recursive fs descent (e.g. find, tar) because they use st_dev and st_ino alone for remembering what directories they've been in. They can't just use the absolute path, because symbolic links allow for loops.
find:
* https://cgit.git.savannah.gnu.org/cgit/findutils.git/tree/fi...
* https://cgit.git.savannah.gnu.org/cgit/findutils.git/tree/fi...
tar:
* https://cgit.git.savannah.gnu.org/cgit/tar.git/tree/src/crea...
* https://cgit.git.savannah.gnu.org/cgit/tar.git/tree/src/name...
* https://cgit.git.savannah.gnu.org/cgit/tar.git/tree/src/incr...
In particular, I'm intrigued by the comment in the last link:
/* With NFS, the same file can have two different devices
if an NFS directory is mounted in multiple locations,
which is relatively common when automounting.
To avoid spurious incremental redumping of
directories, consider all NFS devices as equal,
relying on the i-node to establish differences. */
So GNU tar expects an inode to be unique across _all_ NFS mounts...
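The visited-set those tools keep can be sketched like this (a toy walker, not tar's actual code; it keys directories by (st_dev, st_ino) so a symlink cycle is visited once and then skipped):

```rust
use std::collections::HashSet;
use std::fs;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

// Toy recursive descent in the style of find/tar: remember every
// directory's (st_dev, st_ino) so symlink loops terminate.
fn walk(dir: &Path, seen: &mut HashSet<(u64, u64)>, files: &mut Vec<String>) -> std::io::Result<()> {
    let m = fs::metadata(dir)?; // follows symlinks
    if !seen.insert((m.dev(), m.ino())) {
        return Ok(()); // already visited: a loop was detected
    }
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if fs::metadata(&path)?.is_dir() {
            walk(&path, seen, files)?;
        } else {
            files.push(path.display().to_string());
        }
    }
    Ok(())
}

fn main() -> std::io::Result<()> {
    let root = std::env::temp_dir().join("walk-demo");
    let _ = fs::remove_dir_all(&root);
    fs::create_dir_all(root.join("sub"))?;
    fs::write(root.join("sub/file.txt"), b"x")?;
    // A symlink pointing back at the root: naive recursion would never end.
    std::os::unix::fs::symlink(&root, root.join("sub/loop"))?;
    let (mut seen, mut files) = (HashSet::new(), Vec::new());
    walk(&root, &mut seen, &mut files)?;
    assert_eq!(files.len(), 1); // terminates and finds file.txt exactly once
    fs::remove_dir_all(&root)?;
    Ok(())
}
```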
the_mitsuhiko · 3h ago
You are not wrong, but the issues with tar are well known. Linus himself had this to say [1]:
> Well, the fact that it hits snapshots, shows that the real problem is just "tar does stupid things that it shouldn't do".
> Yes, inode numbers used to be special, and there's history behind it. But we should basically try very hard to walk away from that broken history.
> An inode number just isn't a unique descriptor any more. We're not living in the 1970s, and filesystems have changed.
You might still get away with it most of the time today, but it's causing more and more issues.
[1]: https://lkml.iu.edu/hypermail/linux/kernel/2401.3/04127.html
If it's not the 1970s anymore, then update the POSIX standard with a solution that works for all OSes (including the BSDs) and can be relied upon. Definitely don't suggest a Linux-only solution for a Linux-only problem.
the_mitsuhiko · 2h ago
Again, you are not wrong. This is all clearly not intended. However it has become a challenge to map things like Btrfs subvolumes (when seen from a Btrfs mount) onto POSIX semantics [1].
You are absolutely right that ideally there is an update to the POSIX standard. But things like this take time and it's also not necessarily clear yet what the right path here is going forward. You can consider a lot of what is currently taking place as an experiment to push the envelope.
As for whether this is a Linux-specific problem, I'm not sure. I'm not sufficiently familiar with the situation on other operating systems to know what conversations are taking place there.
[1]: https://lwn.net/Articles/866582/
ZFS has the same problem, for the same reasons. But it also has additional reasons. The simplest of them is that inode numbers are 64-bit integers but ZFS filesystems can have up to 2¹²⁸ files.
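The mismatch is plain pigeonhole arithmetic: squeezing 128-bit object IDs through a 64-bit st_ino forces each inode number to stand for up to 2⁶⁴ distinct objects (the bit widths are the only inputs here):

```rust
fn main() {
    let ino_bits = 64u32;  // width of st_ino
    let obj_bits = 128u32; // width of a ZFS-style object id
    // Number of distinct objects forced to share each inode number.
    let objects_per_ino: u128 = 1u128 << (obj_bits - ino_bits);
    assert_eq!(objects_per_ino, 1u128 << 64);
    println!("up to {objects_per_ino} objects per 64-bit inode number");
}
```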
dwattttt · 2h ago
Have you checked what POSIX has to say about inode numbers? It may say less than you think.
> The st_ino and st_dev fields taken together uniquely identify the file within the system.
It says exactly what it ought to say. As an example, it offers that the identity of a file that's deleted could be reused.
the_mitsuhiko · 3h ago
It's effectively impossible to guarantee this when you have a file system that unifies and re-exports. Network file systems being an obvious one, but overlayfs is in a similar position.
Even if inodes still work nowadays they will eventually run into issues a few years down the line.
AndrewDavis · 3h ago
Yes! It's reusable, but not duplicated.
quotemstr · 4h ago
The problem isn't relying on inode numbers; it's inode numbers being too short. Make them GUIDs and the problems of uniqueness disappear. As for stability: that's just a matter of filesystem durability in general.
the_mitsuhiko · 3h ago
> The problem isn't relying on inode numbers; it's inode numbers being too short.
It's a bit of both. Inodes conflate two things, in a way. They are used by the file system to identify a record, but they are _also_ exposed in APIs that are really cross-file-system (and it comes to a head in the case of network file systems or overlayfs).
A more realistic path is to make inodes just an FS-internal thing, let the file system do its thing, and then create a set of APIs that does not rely on inodes as much. Linux, for instance, is trying to move towards file handles as that API layer.
Animats · 6h ago
It's been a long time since what user space sees as an "inode" has had anything to do with the representation within the file system.