Filesystems

In general, the space on a large hard disk will be split up into convenient-sized pieces, called disk partitions. With smaller disks, the whole disk might be treated as a single partition. Either way, once a system is large enough to have several partitions to use for disk storage (on a single disk or on several disks) there is a question to sort out - how should the system treat the various partitions so as to make them all visible to users? One possibility would be to have a separate root directory on each partition and then specify file names as pathnames starting from a particular partition root directory. But from what you have seen so far there seems to be only a single directory hierarchy and not one for each disk partition.

What happens in Linux is that each partition will have a filesystem imprinted on it with its own partition top level directory and its own directory hierarchy underneath. Linux then makes these individual filesystems appear to be one directory hierarchy by mounting the top level directory of one filesystem over a leaf directory of another and making the join appear seamless (see Figure 1).
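As a rough sketch of how to see where these joins are, the following asks df which mounted filesystem holds a given path. The helper name mount_point_of is hypothetical (it is not a standard command), and it assumes the POSIX `df -P` output format, in which the mount point is the last field of the second line and contains no spaces.

```shell
#!/bin/sh
# mount_point_of is a hypothetical helper, not a standard command.
# Assumption: "df -P" prints one header line, then one line whose last
# field is the mount point (and that mount point contains no spaces).
mount_point_of() {
    df -P "$1" | awk 'NR == 2 { print $NF }'
}

root_mp=$(mount_point_of /)
here_mp=$(mount_point_of .)

echo "/ belongs to the filesystem mounted at: $root_mp"
echo "the current directory belongs to the filesystem mounted at: $here_mp"
```

If the two mount points differ, the current directory lives in a filesystem that has been mounted somewhere below the root of the hierarchy.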

Each file of whatever type, stored in a disk partition, is allocated a number (called its inode number) which is actually the index number of an entry in an array stored on the disk. Each element of the array is an inode, which stores the administrative information about a single file (such as its timestamps, who owns it and where the data blocks for this file are stored on the disk partition). It is the inode number of a file that is stored in a directory alongside the file's name. So, essentially, directories are just tables that associate file names with inode numbers. Each file name and inode number pair in a directory is called a link.
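A minimal sketch of looking at this pairing directly: ls -i shows the inode number recorded in the directory entry, while stat reads fields from the inode itself. The file name note.txt is made up for illustration, and the `-c` format option assumes the GNU (or BusyBox) version of stat.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
echo "hello" > "$workdir/note.txt"

# ls -i prints the inode number held in the directory entry.
ls_inode=$(ls -i "$workdir/note.txt" | awk '{ print $1 }')

# stat reads the inode: %i = inode number, %h = number of links.
# Assumption: GNU-style "stat -c"; BSD stat uses different options.
stat_inode=$(stat -c '%i' "$workdir/note.txt")
link_count=$(stat -c '%h' "$workdir/note.txt")

echo "inode according to ls -i: $ls_inode"
echo "inode according to stat:  $stat_inode (links: $link_count)"

rm -r "$workdir"
```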

Given that this is so, there is no real reason why the same inode number should not appear in more than one link. What this would mean is that you would have a single file with more than one valid pathname. This is shown diagrammatically in Figure 2. There are many instances where this can be useful. One possibility allows you to set up a directory of links to important files in order to provide a kind of 'undelete' facility, against the accidental removal of any of these files. This works because when you issue a remove (rm) command on a file and there is more than one link to the inode, then only your link will be removed. The inode itself and any other links will remain intact. It is not until you rm a file for which only one link exists that the inode itself will be released along with the file data blocks and the directory link.

To create a new link to an existing file you use the ln command and specify as parameters the pathname for the existing file followed by the new link pathname:

	$ ln text/passwd text/newpass
	$ mkdir backup
	$ ln text/motd backup/motd.bak

In order to see the inode numbers to check that the links are all as expected, you use the -i switch to the ls command, as follows:

	$ ls -i backup text 
	backup:
		338 motd.bak 
	text:
		338 motd	340 newpass	340 passwd

If you change the contents of text/motd in the example, then the contents of the file backup/motd.bak will also be changed because it is exactly the same file.
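The behaviour of shared inodes can be checked with a short experiment. This sketch recreates the text and backup directories inside a temporary directory (so your real files are not involved) and assumes GNU-style `stat -c` for reading the link count.

```shell
#!/bin/sh
workdir=$(mktemp -d)
mkdir "$workdir/text" "$workdir/backup"
echo "first line" > "$workdir/text/motd"

# Add a second link to the same inode.
ln "$workdir/text/motd" "$workdir/backup/motd.bak"
links_after_ln=$(stat -c '%h' "$workdir/text/motd")   # %h = link count (GNU stat)

# Append through one name; the change is visible through the other.
echo "second line" >> "$workdir/text/motd"
lines_via_backup=$(wc -l < "$workdir/backup/motd.bak")

# Removing one name leaves the inode and its data reachable via the other.
rm "$workdir/text/motd"
links_after_rm=$(stat -c '%h' "$workdir/backup/motd.bak")

echo "links after ln: $links_after_ln"
echo "lines seen through backup/motd.bak: $lines_via_backup"
echo "links after rm: $links_after_rm"

rm -r "$workdir"
```

Note that the sketch appends to the file in place. Some editors save by writing a new file and renaming it over the old name, in which case the edited name ends up with a new inode and the other link keeps the old contents.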

The only problem with all this is that each filesystem (disk partition) has its own array of inodes, so that the inode numbers are only unique within a single filesystem. This means that you could not use ln to set up this kind of link between, say, text/motd under your home directory and another pathname in a different filesystem. This is because, in another filesystem, inode number 338 (the inode number of text/motd in your home directory's filesystem) would be a completely different file.
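Before trying ln between two locations, you can make a rough check of whether they are in the same filesystem by comparing device numbers. The helper same_filesystem below is hypothetical, and it assumes GNU-style `stat -c '%d'`; it is a sketch, not a guarantee that ln will succeed in every configuration.

```shell
#!/bin/sh
# same_filesystem is a hypothetical helper, not a standard command.
# Two paths are treated as being on the same filesystem when their
# device numbers (GNU stat format %d) are equal.
same_filesystem() {
    [ "$(stat -c '%d' "$1")" = "$(stat -c '%d' "$2")" ]
}

workdir=$(mktemp -d)
touch "$workdir/a"
mkdir "$workdir/sub"

if same_filesystem "$workdir/a" "$workdir/sub"; then
    verdict="same filesystem: an inode link is possible"
else
    verdict="different filesystems: an inode link is not possible"
fi
echo "$verdict"

rm -r "$workdir"
```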

If you need to set up a link between pathnames in different filesystems it can be done, but not using the shared inode technique above. What you do in this case is to use ln -s to set up a symbolic link between the two instead. A symbolic link is one of the Linux special file types and in effect it is just a text file which contains the pathname of the other file to which it provides a link. The other file is the real file which contains all the data. All the commands that read or write the contents of a file, when they are applied to a symbolic link, will follow the link and access the real file instead. Obviously, there is some very small time penalty for having to take this extra step but it is too small to make any real difference.
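The following sketch shows the three properties just described: the link stores a pathname, it has an inode of its own, and reading it reaches the real file. The names real.txt and alias.txt are made up for illustration, and `stat -c` again assumes GNU stat, which reports on the link itself unless given -L.

```shell
#!/bin/sh
workdir=$(mktemp -d)
echo "real data" > "$workdir/real.txt"

# Create the symbolic link; the target is stored exactly as written.
ln -s real.txt "$workdir/alias.txt"

# readlink prints the pathname stored inside the symbolic link.
stored_target=$(readlink "$workdir/alias.txt")

# The link is a separate file with its own inode number.
real_inode=$(stat -c '%i' "$workdir/real.txt")
link_inode=$(stat -c '%i' "$workdir/alias.txt")

# Commands that read file contents follow the link to the real file.
via_link=$(cat "$workdir/alias.txt")

echo "alias.txt points to: $stored_target"
echo "inodes: real=$real_inode link=$link_inode"
echo "contents read through the link: $via_link"

rm -r "$workdir"
```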

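When you want to create a link between two files and you want to make sure that what you do can be ported to other systems, you should use symbolic links. This is because they can be used both within a single filesystem and between separate filesystems, while inode links can only be used in the first of these situations. Note that a relative target pathname is stored in the link exactly as you type it and is later resolved relative to the directory containing the link, not your current directory, which is why the target is written here as ../text/motd:

	$ ln -s ../text/motd backup/motd2.bak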
	$ ls -i backup text
	backup:
		338 motd.bak    328 motd2.bak 
	text:
		338 motd	340 newpass	340 passwd
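One detail worth checking when you create symbolic links with relative pathnames is where the target is resolved from: it is the directory that holds the link, not the directory you were in when you ran ln. The sketch below builds the same text and backup layout in a temporary directory; the link names dangling.bak and working.bak are made up for illustration.

```shell
#!/bin/sh
start_dir=$PWD
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir text backup
echo "welcome" > text/motd

# Resolved from inside backup/, so this points at backup/text/motd,
# which does not exist: the link is created but dangles.
ln -s text/motd backup/dangling.bak

# Resolved from inside backup/, so this points at text/motd.
ln -s ../text/motd backup/working.bak

# "test -e" follows symbolic links, so it is false for a dangling link.
if [ -e backup/dangling.bak ]; then dangling=no; else dangling=yes; fi
if [ -e backup/working.bak ]; then working=yes; else working=no; fi
content=$(cat backup/working.bak)

echo "dangling.bak dangles: $dangling"
echo "working.bak resolves: $working (contents: $content)"

cd "$start_dir" || exit 1
rm -r "$workdir"
```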

Another command that operates by manipulating inode links is mv. This command is used to move a directory link from one place to another within the same filesystem. Effectively, it performs the same function as ln, creating a new link to the inode, but then removes the old link to the file:

	$ mv text/newpass backup/passwd.bak
	$ ls -i backup text
	backup:
		338 motd.bak	328 motd2.bak	340 passwd.bak
	text:
		338 motd	340 passwd
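To confirm that mv within one filesystem only moves the link and leaves the inode alone, you can compare the inode number before and after. This is a sketch in a temporary directory, assuming GNU-style `stat -c`.

```shell
#!/bin/sh
workdir=$(mktemp -d)
mkdir "$workdir/text" "$workdir/backup"
echo "data" > "$workdir/text/newpass"

inode_before=$(stat -c '%i' "$workdir/text/newpass")
mv "$workdir/text/newpass" "$workdir/backup/passwd.bak"
inode_after=$(stat -c '%i' "$workdir/backup/passwd.bak")

# The old name no longer exists once the link has been moved.
if [ -e "$workdir/text/newpass" ]; then old_name=present; else old_name=gone; fi

echo "inode before mv: $inode_before"
echo "inode after mv:  $inode_after"
echo "old name is: $old_name"

rm -r "$workdir"
```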

One side effect of the mv command, which is actually used more often than its ability to move files between directories, is that if the new name is in the same directory as the old one, it acts as a rename command. In fact this is the standard way to rename a file in Linux; you just use mv:

	$ mv backup/motd2.bak backup/motd.sym.link
	$ ls -i backup
		338 motd.bak	328 motd.sym.link	340 passwd.bak

Just as with the cp command, mv can have its second parameter specified as a directory rather than a file and the first parameter can then be a list of files, all of which will be moved into the specified directory, whilst keeping the same file names as before.
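A short sketch of this multi-file form, using made-up file names in a temporary directory:

```shell
#!/bin/sh
workdir=$(mktemp -d)
mkdir "$workdir/archive"
touch "$workdir/a.txt" "$workdir/b.txt" "$workdir/c.txt"

# The last parameter is a directory, so every file named before it
# is moved into that directory under its existing name.
mv "$workdir/a.txt" "$workdir/b.txt" "$workdir/c.txt" "$workdir/archive"

moved_count=$(ls "$workdir/archive" | wc -l)
if [ -e "$workdir/archive/b.txt" ]; then kept_name=yes; else kept_name=no; fi

echo "files now in archive: $moved_count"
echo "b.txt kept its name: $kept_name"

rm -r "$workdir"
```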
