Monday, January 16, 2012

Understand UNIX / Linux Inodes Basics with Examples


Several countries provides a unique identification number (for example, social security number in the USA) to the people who live in that country. This makes it easier to identify an individual uniquely. This makes it easier to handle all the paper work necessary for an individual by various government agencies and financial institutions.
Similar to the social security number, there is a concept of Inode numbers which uniquely exist for all the files on Linux or *nix systems.

Inode Basics

An Inode number points to an Inode. An Inode is a data structure that stores the following information about a file :
  • Size of file
  • Device ID
  • User ID of the file
  • Group ID of the file
  • The file mode information and access privileges for owner, group and others
  • File protection flags
  • The timestamps for file creation, modification etc
  • link counter to determine the number of hard links
  • Pointers to the blocks storing file’s contents
Please note that the above list is not exhaustive. Also, the name of the file is not stored in Inodes (We will come to it later).
When a file is created inside a directory then the file-name and Inode number are assigned to file. These two entries are associated with every file in a directory. The user might think that the directory contains the complete file and all the extra information related to it but this might not be the case always. So we see that a directory associates a file name with its Inode number.
When a user tries to access the file or any information related to the file then he/she uses the file name to do so but internally the file-name is first mapped with its Inode number stored in a table. Then through that Inode number the corresponding Inode is accessed. There is a table (Inode table) where this mapping of Inode numbers with the respective Inodes is provided.

Why no file-name in Inode information?

As pointed out earlier, there is no entry for file name in the Inode, rather the file name is kept as a separate entry parallel to Inode number. The reason for separating out file name from the other information related to same file is for maintaining hard-links to files. This means that once all the other information is separated out from the file name then we can have various file names which point to same Inode.
For example :
$ touch a

$ ln a a1

$ ls -al
drwxr-xr-x 48 himanshu himanshu 4096 2012-01-14 16:30 .
drwxr-xr-x 3 root root 4096 2011-03-12 06:24 ..
-rw-r--r-- 2 himanshu family 0 2012-01-14 16:29 a
-rw-r--r-- 2 himanshu family 0 2012-01-14 16:29 a1
In the above output, we created a file ‘a’ and then created a hard link a1. Now when the command ‘ls -al’ is run, we can see the details of both ‘a’ and ‘a1′. We see that both the files are indistinguishable. Look at the second entry in the output. This entry specifies number of hard links to the file. In this case the entry has value ’2′ for both the files.
Note that Hard links cannot be created on different file systems and also they cannot be created for directories.

When are Inodes created?

As we all now know that Inode is a data structure that contains information of a file. Since data structures occupy storage then an obvious question arises about when the Inodes are created in a system? Well, space for Inodes is allocated when the operating system or a new file system is installed and when it does its initial structuring. So this way we can see that in a file system, maximum number of Inodes and hence maximum number of files are set.
Now, the above concept brings up another interesting fact. A file system can run out of space in two ways :
  • No space for adding new data is left
  • All the Inodes are consumed.
Well, the first way is pretty obvious but we need to look at the second way. Yes, its possible that a case arises where we have free storage space but still we cannot add any new data in file system because all the Inodes are consumed. This may happen in a case where file system contains very large number of very small sized files. This will consume all the Inodes and though there would be free space from a Hard-disk-drive point of view but from file system point of view no Inode available to store any new file.
The above use-case is possible but less encountered because on a typical system the average file size is more than 2KB which makes it more prone to running out of hard disk space first. But, nevertheless there exists an algorithm which is used to create number of Inodes in a file system. This algorithm takes into consideration the size of the file system and average file size. The user can tweak the number of Inodes while creating the file system.

Commands to access Inode numbers

Following are some commands to access the Inode numbers for files :

1) Ls -i Command

As we explained earlier in our Unix LS Command: 15 Practical Examples article, the flag -i is used to print the Inode number for each file.
$ ls -i
1448240 a 1441807 Desktop 1447344 mydata 1441813 Pictures 1442737 testfile 1448145 worm
1448240 a1 1441811 Documents 1442707 my_ls 1442445 practice 1442739 test.py
1447139 alpha 1441808 Downloads 1447278 my_ls_alpha.c 1441810 Public 1447099 Unsaved Document 1
1447478 article_function_pointer.txt 1575132 google 1447274 my_ls.c 1441809 Templates 1441814 Videos
1442390 chmodOctal.txt 1441812 Music 1442363 output.log 1448800 testdisk.log 1575133 vlc
See that the Inode number for ‘a’ and ‘a1′ are same as we created ‘a1′ as hard link.

2) Df -i Command

df -i command displays the inode information of the file system.
$ df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sda1            1875968  293264 1582704   16% /
none                  210613     764  209849    1% /dev
none                  213415       9  213406    1% /dev/shm
none                  213415      63  213352    1% /var/run
none                  213415       1  213414    1% /var/lock
/dev/sda2            7643136  156663 7486473    3% /home
The flag -i is used for displaying Inode information.

3) Stat Command

Stat command is used to display file statistics that also displays inode number of a file
$ stat a
File: `a'
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: 805h/2053d Inode: 1448240 Links: 2
Access: (0644/-rw-r--r--) Uid: ( 1000/himanshu) Gid: ( 1001/ family)
Access: 2012-01-14 16:30:04.871719357 +0530
Modify: 2012-01-14 16:29:50.918267873 +0530
Change: 2012-01-14 16:30:03.858251514 +0530

Example Usage Scenario of an Inode number

  1. Suppose there exist a file name with some special character in it. For example:  ”ab*
  2. Try to remove it normally using rm command, you will not be able to remove it.
  3. However using the inode number of this file you can remove it.
Lets see these steps in this example :
1) Check if the file exists:
$ ls -i
1448240 a 1447274 my_ls.c
1448240 a1 1442363 output.log
1448239 "ab* 1441813 Pictures
1447139 alpha
So we have a file with name “ab* in this directory
2) Try to remove it normally:
$ rm "ab*
> ^C
$ rm "ab*
> ^C
$
See that I tried couple of times to remove the file but could not.
3) Remove the file using Inode number:
As we discussed earlier in our find command examples article, you can search for a file using inode number and delete it.
$ find . -inum 1448239 -exec rm -i {} \;
rm: remove regular empty file `./"ab*'? y
$ ls -i
1448240 a 1447274 my_ls.c
1448240 a1 1442363 output.log
1447139 alpha 1441813 Pictures
So we used the find command specifying the Inode number of the file we need to delete. The file got deleted. Though we could have deleted the file otherwise also by using the command rm \”ab* instead of using the complicated find command example above but still I used it to demonstrate one of the use of Inode numbers for users.