Our first experiment determines the network message overhead of common file and directory operations at the granularity of individual system calls. We consider the sixteen commonly used system calls shown in Table 1 and measure their network message overheads using the Ethereal packet monitor. Note that this list does not include the read and write system calls, which are examined separately in Section 4.4.
For each system call, we first measure its network message overhead assuming a cold cache and then repeat the experiment for a warm cache. We emulate a cold cache by unmounting and remounting the file system at the client and restarting the NFS server or the iSCSI server; this is done prior to each invocation of a system call. The warm cache is emulated by invoking the system call on a cold cache and then repeating the system call with similar (though not identical) parameters. For instance, to understand warm-cache behavior, we create two directories in the same parent directory using mkdir, open two files in the same directory using open, or perform two different chmod operations on a file. In each case, the network message overhead of the second invocation is taken to be the overhead in the presence of a warm cache.
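As an illustration, the fragment below sketches the measurement sequence for the warm-cache mkdir case. It is only a sketch: the mount point and directory names are hypothetical, and the packet trace itself is collected separately with Ethereal while the program runs.

```c
/* Sketch of the warm-cache emulation for mkdir.  The mount point
 * /mnt/test and the directory names are hypothetical; the packet
 * trace is captured separately with Ethereal while this runs. */
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

int main(void)
{
    /* First invocation: issued against a cold cache and populates
     * the client-side caches. */
    if (mkdir("/mnt/test/dir1", 0755) != 0)
        perror("mkdir dir1");

    /* Second invocation with a similar argument: the messages it
     * generates are attributed to the warm-cache cost of mkdir. */
    if (mkdir("/mnt/test/dir2", 0755) != 0)
        perror("mkdir dir2");

    return 0;
}
```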
Table 1: File and directory operations (system calls) examined in the micro-benchmark.

| Directory operations | File operations |
| --- | --- |
| Directory creation (mkdir) | File create (creat) |
| Directory change (chdir) | File open (open) |
| Read directory contents (readdir) | Hard link to a file (link) |
| Directory delete (rmdir) | Truncate a file (truncate) |
| Symbolic link creation (symlink) | Change permissions (chmod) |
| Symbolic link read (readlink) | Change ownership (chown) |
| Symbolic link delete (unlink) | Query file permissions (access) |
| | Query file attributes (stat) |
| | Alter file access time (utime) |
The depth of a file or directory within the directory structure can impact the network message overhead of a given operation. Consequently, we report overheads for directory depths of zero and three. Section 4.3 reports additional results obtained by systematically varying the directory depth from 0 to 16.
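A minimal sketch of how such a depth-varying experiment can be driven is shown below. The mount point, the single-letter directory names, and the assumption that the nested directories already exist are ours for illustration, not part of the measurement setup described above.

```c
/* Sketch of a depth-varying micro-benchmark driver.  The mount point
 * /mnt/test and the directory names are hypothetical, and the nested
 * directories are assumed to exist already. */
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Build "/mnt/test/d/.../d/file" with `depth` intermediate directories. */
static void build_path(char *buf, size_t len, int depth)
{
    snprintf(buf, len, "/mnt/test");
    for (int i = 0; i < depth; i++)
        strncat(buf, "/d", len - strlen(buf) - 1);
    strncat(buf, "/file", len - strlen(buf) - 1);
}

int main(void)
{
    char path[1024];
    struct stat sb;

    for (int depth = 0; depth <= 16; depth++) {
        build_path(path, sizeof(path), depth);
        stat(path, &sb);   /* the messages triggered by this call are counted */
    }
    return 0;
}
```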
Table 2: Network message overhead (number of messages exchanged) for a cold cache, for directory depths of zero and three.

| Operation | NFS v2 (depth 0) | NFS v3 (depth 0) | NFS v4 (depth 0) | iSCSI (depth 0) | NFS v2 (depth 3) | NFS v3 (depth 3) | NFS v4 (depth 3) | iSCSI (depth 3) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| mkdir | 2 | 2 | 4 | 7 | 5 | 5 | 10 | 13 |
| chdir | 1 | 1 | 3 | 2 | 4 | 4 | 9 | 8 |
| readdir | 2 | 2 | 4 | 6 | 5 | 5 | 10 | 12 |
| symlink | 3 | 2 | 4 | 6 | 6 | 5 | 10 | 12 |
| readlink | 2 | 2 | 3 | 5 | 5 | 5 | 9 | 10 |
| unlink | 2 | 2 | 4 | 6 | 5 | 5 | 10 | 11 |
| rmdir | 2 | 2 | 4 | 8 | 5 | 5 | 10 | 14 |
| creat | 3 | 3 | 10 | 7 | 6 | 6 | 16 | 13 |
| open | 2 | 2 | 7 | 3 | 5 | 5 | 13 | 9 |
| link | 4 | 4 | 7 | 6 | 10 | 9 | 16 | 12 |
| rename | 4 | 3 | 7 | 6 | 10 | 10 | 16 | 12 |
| trunc | 3 | 3 | 8 | 6 | 6 | 6 | 14 | 12 |
| chmod | 3 | 3 | 5 | 6 | 6 | 6 | 11 | 12 |
| chown | 3 | 3 | 5 | 6 | 6 | 6 | 11 | 11 |
| access | 2 | 2 | 5 | 3 | 5 | 5 | 11 | 9 |
| stat | 3 | 3 | 5 | 3 | 6 | 6 | 11 | 9 |
| utime | 2 | 2 | 4 | 6 | 5 | 5 | 10 | 12 |
Table 2 depicts the number of messages exchanged between the client and the server for NFS versions 2, 3, and 4 and for iSCSI, assuming a cold cache.
We make three important observations from the table. First, on average, iSCSI incurs a higher network message overhead than NFS. This is because a single message is sufficient to invoke a file system operation on a path name in the case of NFS. In contrast, in the case of iSCSI, the path name must be completely resolved before the operation can proceed, which results in additional message exchanges. Second, the network message overhead increases with directory depth. For NFS, this is due to the additional access checks on the path name. In the case of iSCSI, the file system fetches the directory inode and the directory contents at each level of the path name; since directories and their inodes may reside on different disk blocks, this triggers additional block reads. Third, NFS version 4 has a higher network message overhead than NFS versions 2 and 3, which have comparable overheads. The higher overhead in NFS version 4 is due to access checks performed by the client via the access RPC call.
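These trends can be captured by a simple back-of-the-envelope model, sketched below. It is only an approximation that happens to be consistent with the mkdir row of Table 2 (an NFS v2 base of 2 messages and an iSCSI base of 7 at depth 0), not the exact protocol accounting: NFS adds roughly one message per extra path component for the additional checks, while the iSCSI-mounted file system adds roughly two block reads per component (the directory inode and the directory contents).

```c
/* Back-of-the-envelope cold-cache message-count model.  An approximation
 * consistent with the mkdir row of Table 2, not the exact protocol
 * accounting. */
#include <stdio.h>

/* NFS: roughly one extra message (lookup/access check) per path component. */
static int nfs_messages(int base, int depth)   { return base + 1 * depth; }

/* iSCSI: roughly two extra block reads (directory inode + directory
 * contents) per path component. */
static int iscsi_messages(int base, int depth) { return base + 2 * depth; }

int main(void)
{
    /* Base costs for mkdir at depth 0 taken from Table 2. */
    for (int depth = 0; depth <= 3; depth++)
        printf("depth %d: NFS v2 ~%d msgs, iSCSI ~%d msgs\n",
               depth, nfs_messages(2, depth), iscsi_messages(7, depth));
    return 0;
}
```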
We make one additional observation that is not directly reflected in Table 2: the average message size in iSCSI can be higher than that of NFS. Since iSCSI is a block access protocol, the granularity of reads and writes in iSCSI is a disk block, whereas RPCs allow NFS to read or write smaller chunks of data. While reading entire blocks may seem wasteful, a side effect of this policy is that iSCSI benefits from aggressive caching. For instance, reading an entire disk block of inodes enables applications with meta-data locality to benefit in iSCSI. In the absence of meta-data or data locality, however, reading entire disk blocks may hurt performance.
While message size can be an important contributor to network message overhead, our macro-benchmark observations indicated that the number of messages exchanged was the dominant factor. Consequently, we focus on the number of messages exchanged in the rest of the analysis.
Table 3: Network message overhead (number of messages exchanged) for a warm cache, for directory depths of zero and three.

| Operation | NFS v2 (depth 0) | NFS v3 (depth 0) | NFS v4 (depth 0) | iSCSI (depth 0) | NFS v2 (depth 3) | NFS v3 (depth 3) | NFS v4 (depth 3) | iSCSI (depth 3) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| mkdir | 2 | 2 | 2 | 2 | 4 | 4 | 3 | 2 |
| chdir | 1 | 1 | 0 | 0 | 3 | 3 | 2 | 0 |
| readdir | 1 | 1 | 0 | 2 | 3 | 3 | 3 | 2 |
| symlink | 3 | 2 | 2 | 2 | 5 | 4 | 4 | 2 |
| readlink | 1 | 2 | 0 | 2 | 3 | 3 | 3 | 2 |
| unlink | 2 | 2 | 2 | 2 | 5 | 4 | 3 | 2 |
| rmdir | 2 | 2 | 2 | 2 | 4 | 4 | 3 | 2 |
| link | 3 | 2 | 6 | 2 | 5 | 5 | 9 | 2 |
| creat | 4 | 3 | 2 | 2 | 6 | 4 | 6 | 2 |
| open | 1 | 1 | 4 | 0 | 4 | 4 | 6 | 0 |
| rename | 4 | 3 | 2 | 2 | 6 | 6 | 6 | 2 |
| trunc | 2 | 2 | 4 | 2 | 5 | 5 | 7 | 2 |
| chmod | 2 | 2 | 2 | 2 | 4 | 5 | 5 | 2 |
| chown | 2 | 2 | 2 | 2 | 4 | 5 | 5 | 2 |
| access | 1 | 1 | 1 | 2 | 4 | 4 | 3 | 0 |
| stat | 2 | 2 | 2 | 2 | 5 | 5 | 5 | 0 |
| utime | 1 | 1 | 1 | 2 | 4 | 4 | 4 | 2 |
Table 3 depicts the number of messages exchanged between the client and the server for warm-cache operations. Whereas iSCSI incurred a higher network message overhead than NFS in the presence of a cold cache, it incurs a comparable or lower overhead than NFS in the presence of a warm cache. Further, the network message overhead is nearly identical for directory depths of zero and three for iSCSI, whereas it increases with directory depth for NFS. Last, both iSCSI and NFS benefit from a warm cache, and the overheads for each operation are smaller than those for a cold cache. The better performance of iSCSI can be attributed to aggressive meta-data caching performed by the file system; since the file system is resident at the client, many requests can be serviced directly from the client cache. This is true even for long path names, since all directories in the path may be cached from a prior operation. NFS is unable to extract these benefits despite using a client-side cache, since NFS v2 and v3 need to perform consistency checks on cached entries, which trigger message exchanges with the server. Further, meta-data update operations are necessarily synchronous in NFS, while they can be asynchronous in iSCSI. This asynchrony enables applications to update a dirty cache block multiple times prior to a flush, thereby amortizing multiple meta-data updates into a single network block write.
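The amortization effect described above can be illustrated with a small write-back sketch. The block size, update count, and data structure are hypothetical and serve only to show that several meta-data updates to the same cached block result in a single block write to the iSCSI target.

```c
/* Illustrative write-back amortization: several meta-data updates dirty
 * the same cached block, but only one block write reaches the iSCSI
 * target at flush time.  Block size and update count are hypothetical. */
#include <stdio.h>

#define BLOCK_SIZE 4096

struct cached_block {
    char data[BLOCK_SIZE];
    int  dirty;
};

static int block_writes;   /* network block writes issued to the target */

static void update_metadata(struct cached_block *b, int off, char val)
{
    b->data[off] = val;    /* modify the cached copy only */
    b->dirty = 1;          /* no network traffic yet */
}

static void flush(struct cached_block *b)
{
    if (b->dirty) {
        block_writes++;    /* one block write covers all prior updates */
        b->dirty = 0;
    }
}

int main(void)
{
    struct cached_block b = { .dirty = 0 };

    for (int i = 0; i < 8; i++)   /* eight meta-data updates ... */
        update_metadata(&b, i, 1);
    flush(&b);                    /* ... amortized into one block write */

    printf("meta-data updates: 8, network block writes: %d\n", block_writes);
    return 0;
}
```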