symlinks, hardlinks and reflinks explained

summary: symlinks reference names, hardlinks reference meta-data and reflinks reference data.

The 3 different types of links on unix systems are used to connect the 3 components of a file, i.e., names (stored in directory entries), meta-data such as permissions (stored in inodes), and the data blocks themselves. As an illustration, I'll describe each link type, and the commands used to create the structure below:

name2 -- symlink --> name1 -- hardlink --> inode1 -- reflink ---> data
                     name3 -- hardlink -/                      /
                     name4 -- hardlink --> inode2 -- reflink -/

We'll create name1, inode1 and data first. These are the components of a standard file, and can be created in many ways, for example: echo > name1

symlinks

cp --symbolic-link name1 name2

symlinks can reference any name in the system hierarchy, including names on other file systems. Not shown in the above diagram, since it's not significant here, is that the symlink reference is usually stored in a specially tagged inode, and hence timestamps for a symlink can be set on some systems. Permissions on symlink inodes are ignored. The main use for symlinks is to create aliases.
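
To see that a symlink references a name rather than an inode or data, here's a minimal sketch run in a scratch directory. It assumes a POSIX shell with mktemp and readlink available, and uses ln -s, which has the same effect here as the cp --symbolic-link command above. The variable names are just for illustration.

```shell
# Work in a scratch directory so nothing outside it is touched.
dir=$(mktemp -d) && cd "$dir"

echo hello > name1
ln -s name1 name2          # same effect here as: cp --symbolic-link name1 name2

target=$(readlink name2)   # the stored reference is just the text "name1"
via_link=$(cat name2)      # opening name2 follows the reference to name1's data

rm name1                   # remove the name the symlink points at
if [ -L name2 ] && [ ! -e name2 ]; then
    state=dangling         # the symlink still exists, but resolves to nothing
else
    state=resolves
fi
echo "$target $via_link $state"
```

Once name1 is removed, name2 is left dangling, since nothing ties the symlink to the inode or data that name1 used to reference.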

hardlinks

cp --link name1 name3

All names for standard files are hardlinked to an inode, but you can use cp -l to hardlink multiple names to an inode. Since hardlinks reference inodes directly, they're restricted to the same file system. Note a file can be held open by a process while all hardlinks are subsequently unlinked, leaving the data accessible until the file is closed. The main use for multiply hardlinked files is to create efficient backups.
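
The points above can be checked with a short sketch like the following. It assumes GNU stat (the -c format option; BSD and macOS stat use different options), and the choice of descriptor 7 is arbitrary.

```shell
# Work in a scratch directory so nothing outside it is touched.
dir=$(mktemp -d) && cd "$dir"

echo hello > name1
ln name1 name3                     # same effect here as: cp --link name1 name3

# GNU stat: %i is the inode number, %h the hard link count.
inode1=$(stat -c %i name1)
inode3=$(stat -c %i name3)
links=$(stat -c %h name1)

# Hold the file open on descriptor 7, then remove every name for it.
exec 7< name1
rm name1 name3
still_readable=$(cat <&7)          # the data stays readable while the descriptor is open
exec 7<&-                          # closing the last reference lets the system free it
echo "$inode1 $inode3 $links $still_readable"
```

Both names report the same inode number with a link count of 2, and the contents can still be read through the open descriptor after both names are gone.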

reflinks

cp --reflink name1 name4

reflinks are supported by BTRFS, OCFS2, XFS, and APFS on macOS, and provide transparent copy on write, which is especially useful for snapshotting. Note that since separate inodes are used, the same data can be accessed with different permissions. Reflinks have the same use as hardlinks, but are more space efficient, and generally protect the copy against all subsequent operations on the original file, such as writes, not just unlink().
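
As a rough sketch of the difference from hardlinks, the following writes through a reflinked name and through a hardlinked name, and checks what name1 sees each time. It assumes GNU cp and GNU stat, and uses --reflink=auto rather than the bare --reflink above so that it also runs on file systems without reflink support. There cp silently falls back to an ordinary copy, which behaves the same from the point of view of this test but shares no data blocks.

```shell
# Work in a scratch directory so nothing outside it is touched.
dir=$(mktemp -d) && cd "$dir"

echo hello > name1
# --reflink=auto asks for a reflink but falls back to a plain copy where the
# file system cannot share blocks; a bare --reflink fails there instead.
cp --reflink=auto name1 name4
ln name1 name3                     # a hardlink to name1, for comparison

inode1=$(stat -c %i name1)         # GNU stat: %i is the inode number
inode4=$(stat -c %i name4)

echo changed > name4               # write via the reflinked name
after_copy_write=$(cat name1)      # name1 is unaffected

echo shared > name3                # write via the hardlinked name
after_link_write=$(cat name1)      # name1 sees the change, as it's the same inode
echo "$inode1 $inode4 $after_copy_write $after_link_write"
```

The write through name4 leaves name1 alone, while the write through name3 shows up under name1 as well. This is why a hardlinked backup is only protected against unlink() of the original.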