File Descriptors, File Descriptor Tables, and File Structures
Linux: The different structures pointing to an open file
Yes, to access a file, the file system goes through multiple structures. Among them are the three elements mentioned in the title. I use the word “element” here since the file descriptor is not a structure but merely a number identifying an open file. Without further ado, let’s demystify these confusing terms.
A file descriptor is the value returned by a successful open(2) system call. It is a non-negative integer between 0 and N-1, where N is the per-process limit on the number of open file descriptors. The value of N depends on the system configuration. The limit has a soft value, which the kernel actually enforces, and a hard value, which is the ceiling up to which the soft value can be raised. To find the hard limit, we can execute this command (on my system the command returns 4096):
$ ulimit -Hn
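The same limit can also be queried from inside a program. The sketch below is a minimal illustration, assuming a Linux system and Python's standard `resource` module; the helper name `fd_limits` is made up for this example.

```python
import resource


def fd_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    # RLIMIT_NOFILE is one greater than the highest descriptor number
    # that the process is allowed to open.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


soft, hard = fd_limits()
print(f"soft limit (what 'ulimit -Sn' reports): {soft}")
print(f"hard limit (what 'ulimit -Hn' reports): {hard}")
```

The values printed depend on the machine and on the shell that started the program, so they will probably differ from the 4096 quoted above.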
The file descriptor effectively identifies an open file. Once obtained through the open(2) system call, it can be used to access the opened file with the read(2) and write(2) system calls. Once the file is no longer needed, it can be closed with the close(2) system call.
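The life cycle described above (open, write, read, close) can be sketched as follows. This is only an illustration: it uses Python's `os` module, whose functions wrap the system calls of the same name, and the scratch file name `demo.txt` and the helper `roundtrip` are hypothetical.

```python
import os
import tempfile


def roundtrip(message: bytes) -> bytes:
    """Write message to a scratch file, then read it back using raw descriptors."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "demo.txt")  # hypothetical file name

    # open(2): the kernel hands back a small non-negative integer.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, message)  # write(2) through the descriptor
    finally:
        os.close(fd)  # close(2): the descriptor number becomes free again

    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.read(fd, len(message))  # read(2) through the descriptor
    finally:
        os.close(fd)

    # Clean up the scratch file and directory.
    os.remove(path)
    os.rmdir(directory)
    return data


print(roundtrip(b"hello, descriptor"))
```

Note that the program never sees anything but the integer `fd`; everything else stays inside the kernel.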
File Descriptor Table
The file descriptor is an index into the “file descriptor table”, which the kernel keeps for each process. Each entry of the table is a pointer to an (open) file structure. The figure below shows the relation between these elements:
The role of the file descriptor table is to identify a file structure “safely”. A naive alternative would be to give the address of the (open) file structure directly to the user space application as an identifier for the (open) file. However, a user space program could tamper with this address and pass the address of a different structure when accessing the file, which could corrupt kernel memory. With an index, the kernel can check that the number refers to a valid entry of the table before following the pointer, and reject the request otherwise.
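One way to observe this check from user space is to hand the kernel a descriptor that no longer refers to a table entry. The sketch below assumes a single-threaded program (so the number is not reused in between) and uses `os.devnull` only as a convenient file to open; the helper name `use_after_close` is made up.

```python
import errno
import os


def use_after_close() -> int:
    """Return the errno reported when a closed descriptor is used again."""
    fd = os.open(os.devnull, os.O_RDONLY)
    os.close(fd)  # the table entry for fd is now empty

    try:
        os.fstat(fd)  # the kernel looks fd up in the table and finds nothing
    except OSError as exc:
        return exc.errno
    return 0  # not expected: the lookup should have been rejected


code = use_after_close()
print(errno.errorcode.get(code, code))
```

The request fails with an error code instead of touching an arbitrary kernel structure, which is the kind of protection the table is meant to provide.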
(Open) File Structure
This structure represents an open file. For our purposes, it contains two pieces of information. The first is a pointer to a Dentry or Inode: two structures representing a file. The second is the current position in the file. This position starts at 0 when the file is opened and is advanced by the number of bytes transferred after each read(2) or write(2) operation. It can also be directly modified through the lseek(2) system call.
What the above elements have in common is that they have no representation on the disk and are only available in memory while the operating system is running. This is in contrast to two other data structures, Dentries (short for directory entries) and Inodes, which have an on-disk representation. The first contains the filename; the second contains the rest of the information about a file: access rights, size, location of the file content, and so on.
This is a simplified view of some of the structures that the file system uses to grant “safe” access to a file. These structures have other fields that we have not mentioned, such as the synchronization mechanisms used to keep them coherent: locks, reference counters, etc. But this goes beyond the scope of this introductory article.
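The position stored in the file structure can be observed with lseek(2). The sketch below is an illustration under a few assumptions: it uses Python's `os` wrappers, a hypothetical scratch file `offsets.txt`, and additionally dup(2), which is not covered in this article and creates a second descriptor pointing to the same file structure.

```python
import os
import tempfile


def offsets_demo():
    """Return the positions observed at four points in the life of an open file."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "offsets.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"0123456789")

    fd = os.open(path, os.O_RDONLY)
    start = os.lseek(fd, 0, os.SEEK_CUR)  # position right after open(2)

    os.read(fd, 4)  # read(2) advances the position by the bytes read
    after_read = os.lseek(fd, 0, os.SEEK_CUR)

    # dup(2): a second table entry pointing to the SAME file structure,
    # so moving the position through one descriptor is visible via the other.
    dup_fd = os.dup(fd)
    os.lseek(dup_fd, 7, os.SEEK_SET)
    shared = os.lseek(fd, 0, os.SEEK_CUR)

    # A second open(2) creates a NEW file structure with its own position.
    other_fd = os.open(path, os.O_RDONLY)
    independent = os.lseek(other_fd, 0, os.SEEK_CUR)

    for descriptor in (fd, dup_fd, other_fd):
        os.close(descriptor)
    os.remove(path)
    os.rmdir(directory)
    return start, after_read, shared, independent


print(offsets_demo())
```

The third and fourth values are the interesting ones: the duplicated descriptor shares its position with the original, while the independently opened one starts again from the beginning.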
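The split between the filename and the rest of the file's information can be made visible with stat(2). The following sketch assumes a Linux file system and Python's `os` wrappers; the file names and the helper `rename_keeps_inode` are hypothetical. It renames a file and checks that the inode number stays the same.

```python
import os
import tempfile


def rename_keeps_inode():
    """Rename a file and report whether its inode number changed, plus its size."""
    directory = tempfile.mkdtemp()
    old_path = os.path.join(directory, "old-name.txt")  # hypothetical names
    new_path = os.path.join(directory, "new-name.txt")
    with open(old_path, "wb") as f:
        f.write(b"hello")

    before = os.stat(old_path)  # stat(2) reports information kept in the inode
    os.rename(old_path, new_path)  # only the directory entry changes
    after = os.stat(new_path)

    os.remove(new_path)
    os.rmdir(directory)
    return before.st_ino == after.st_ino, after.st_size


same_inode, size = rename_keeps_inode()
print(f"same inode after rename: {same_inode}, size in bytes: {size}")
```

Only the name changed; the size and the other attributes reported by stat(2) belong to the inode and are unaffected by the rename.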