How it works

The source tree

Shfs sources are split across a few directories. The shfs/Linux-2.x directory contains kernel module code for the specified kernel version, while the shfsmount/ directory is where user-space utilities live (shfsmount, the perl/shell server core is here too).

Where the code came from

Since writing new filesystem module from scratch is not very fun thing, shfs was partially based on Florin Malita's ftpfs. Using the ftpfs's code base required some bug fixing (locking, memory leaks, handling of date). As the time and development went on, the code from ftpfs began to vanish and now there is hardly any of the original code there. I have assimilated some portions of smbfs(ncpfs) code (mainly directory caching code through page cache) from the main Linux kernel tree.

Caches

Sending a shell command to the remote host on every request from the kernel VFS layer is not a wise idea, because of the high load it generates on both sides of ssh channel. A much better way is to use the caches for some operations, such as reading directories, reading and writing files, etc.

Read-write cache

(fcache.c)

This makes great performance improvements, since calling dd (= storing data on the remote side) for each page generates quite a high system load. By employing this read-write cache, dd is only called every on nth request. You can tune this cache using "cachesize" and "cachemax" options while mounting the filesystem.

Directory cache

(dcache.c)

Readline cache

(proc.c, function sock_readln)

Lines are read all at once instead of char-by-char. This speeds up directory lookups.

How it all works together

Figure illustrates: When a user calls shfsmount (or mount -t shfs) command in order to mount a remote share, basic checks are done, and a process is forked, and user command (ssh in most cases) is executed. This command has stdin/stdout redirected so it can be used for the connection by shfsmount. After the connection is established, shfsmount initializes the remote side and transfers "server" code to the remote side.

Although shfs is a shell filesystem, there are two different server code implementations available: shell and perl. Both have the same basic functionality although perl code is faster and more robust. Each implementation has its test phase which does all necessary checks (for perl this would be perl version, available modules, etc.).

Upon executing shfsmount, it calls mount syscall which passes the file descriptor (of ssh stdin/stdout) to the kernel module. Shfsmount could exit or wait for ssh to die and restart the connection again (while in persistent mode).