The layout of a disk partition configured as a standard UNIX System V file-system is very simple and is illustrated in Figure 1.
As you can see, each disk partition is split into just four parts:
In the case of modes and data blocks, some method is required to show when they are free to be allocated, or when they are already allocated and in use. For the System V filesystem a list of free modes and data blocks is kept in the super block to make allocation faster when they are required, though the way in which they are held is different for the two things.
It is generally not possible, by a simple inspection of a data block's contents, to determine whether or not the block is in use. This is because all of a data block's bytes are used to hold file contents, and a file's contents can be any sequence of bytes at all. This means that it is not possible to store a particular sequence of bytes in a data block to indicate that the block is free for allocation. The necessary solution to this problem is to have some data structure which lists those data blocks that are free.
This is actually implemented as a form of linked list, the first part of which is stored in the filesystem's super block. Data blocks on the disk partition are numbered sequentially and are thus each given a unique block number. When the filesystem is first created, all of the data blocks are obviously free and should, therefore, all appear on the free list. However, the space in the super block for this list is limited and only a small part of the list will fit. The next part of the list is stored in one of the free data blocks, the number of which is arranged to be the last block number in the super block list. This idea is repeated, using free blocks to store lists of free block numbers, and making sure that the last free block number in each list is a pointer to the next list of free block numbers. This idea is shown in Figure 2.
Using this arrangement, when a request is made for a new data block, the next free one on the super block list will be allocated (blocks will be allocated right to left in the diagram). As more data blocks are requested, this will be repeated until the super block list has only one entry left in it. Don't forget that this last block in the super block list contains the block numbers of the next part of the free block list. Therefore, these block numbers need to be copied into the super block list before the block which contained them is allocated, which it can be once the copy has taken place.
Figure 3 shows the situation after all the initial list of free block numbers in the super block have been allocated and the contents of block number 200 have been copied into the super block, just before block 200 was allocated as free.
When files are deleted and data blocks become free, their block numbers are just added to the free list in the super block. When this list becomes full and another data block becomes free (say, 188), then there is no space left to store the number. In this situation what happens is that the portion of the data block free list which is in the super block is copied into the newly freed block (188) and then the super block list is emptied and the newly freed block (188) is made the leftmost (and only) entry in the super block list.
The case of inode allocation is different to the allocation of disk data blocks for several reasons and, consequently, the free inode information is stored in a different format to that of free data blocks. The main reasons for the difference are:
Taking these points into account, the kernel could just do a linear search through the inode array to find a free inode every time a request for a new inode was made. While this strategy could be made to work, it is made much more efficient by having a set of free inode numbers available in the super block. Unlike the case of free data blocks, however, the free inode list in the super block is just a stand alone list, so that there can be many inodes that are free and yet do not appear in the list. The use of the super block inode free list is illustrated in Figure 4, parts (A) to (F) as follows:
Now that a mechanism exists for allocating a new inode to a file and for finding disk data blocks to add to the file, all that is now required is some method for accessing the data blocks in a file given its inode. The most obvious mechanism for this is to have the data block numbers listed in the inode. Sadly, for any reasonably large file, this solution is impracticable due to the amount of space it would require. within the iode.
Looking at the average sizes of files created by typical users shows that the majority are 1024 bytes or less in size and that, as the size of files increases beyond this, the number of files of each size gets less. This suggests that an arrangement which makes accesses to small files quick and efficient, even at the expense of some speed for larger files, would be an acceptable compromise. It is this type of arrangement which System V implements.
Inside a System V mode there is an array of 13 elements, each of which can hold the number of a disk data block (see Figure 5). The first ten elements of the array are used to hold the data block numbers of the first ten data blocks assigned to storing the file's contents.
Assuming, for the purposes of illustration, that each disk data block is 1024 bytes in size, then these ten data block pointers will allow files to be created that are up to 10 Kb in size. As you can see, for the large majority of files it should be possible to access the data with nothing more than a direct lookup required to find the data block that contains any particular data byte.
With this scheme, once a file has grown to 10 Kb, there are only three block pointers in the inode left to use, whatever the eventual size of the file. Obviously, some new arrangement must be found so that the three remaining block pointers will suffice for any realistic file size, while at the same time not degrading the data access time too much.
This goal is achieved by using the idea of indirect block pointers. Specifically, when an 11th data block needs to be allocated to the file, the 11th inode block pointer is used, but instead of pointing to the block which will contain the data, the 11th pointer is a single indirect pointer which points to a data block filled with a list of direct block pointers. In our example, if we assume that a data block number is a 32-bit value, then a list of 256 of them will fit into the single indirect block. This list will point directly to the data blocks for the next 256 Kb of our file. This means that with 11 block pointers in the inode, files of up to 266 Kb (10 + 256) can be created. True, it takes a little longer to access the data beyond the first 10 Kb in the file, but it takes only one extra disk block read to find the position on the disk of the required data.
For files bigger than 266 Kb the double indirect (12th) inode block pointer is used. This is the same idea as the previous inode pointer except that the double indirect pointer points to a list of pointers in a data block, each of which is itself a single indirect block pointer which points to a list of 256 direct block pointers. This means that the 12th inode block pointer gives access to the next 65536 Kb (256x256) of data in our file.
By now, you should be able to spot the pattern and see that when the file grows bigger than 64 Mb (actually 65802 Kb), the inode's 13th data block pointer will be used, but this time as a triple indirect pointer, which will give access to a staggering 16 Gb (256x256x256 Kb) of extra file space. A single file bigger than 16Gb sounds huge. However, even though the calculation we have just done suggests that this file size is possible with the inode layout as given, in fact there are other factors which limit the maximum size of a file to a smaller value than this. For example, the size of a file, in bytes, is stored separately in its inode in a field of type unsigned long. This is a 32-bit number which limits the size of a file to 4 Gb, so that 13 data block pointers in an inode really are enough.