How it works
The source tree
Shfs sources are split across a few directories. The shfs/Linux-2.x directory contains the kernel module code for the specified kernel version, while the shfsmount/ directory holds the user-space utilities (shfsmount itself, plus the perl/shell server core).
Where the code came from
Since writing a new filesystem module from scratch is not much fun, shfs was partially based on Florin Malita's ftpfs. Using the ftpfs code base required some bug fixing (locking, memory leaks, date handling). As development went on, the ftpfs code began to vanish, and now hardly any of the original code remains. I have also assimilated some portions of the smbfs (ncpfs) code (mainly the directory caching code built on the page cache) from the main Linux kernel tree.
Caches
Sending a shell command to the remote host on every request from the kernel VFS layer is not a wise idea, because of the high load it generates on both sides of the ssh channel. A much better way is to use caches for some operations, such as reading directories and reading and writing files.
Read-write cache
- on file open, n pages are allocated as a simple read-write buffer
- a file offset and size are associated with the buffer
- the entire buffer is either clean (read only) or dirty (data not yet written)
- on a read request, an attempt to read a full buffer is performed (dirty data is flushed first)
- subsequent requests read data from this buffer (hit)
- on a write request, if there is enough space in the buffer, data is written to the buffer
- if the buffer is full or the file is closed, the entire buffer is sent to the remote peer
This improves performance considerably, since calling dd (= storing data on the remote side) for each page generates quite a high system load. With this read-write cache, dd is called only on every nth request. You can tune the cache with the "cachesize" and "cachemax" options when mounting the filesystem.
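The buffering rules of the read-write cache can be modelled in user space. The following is only a sketch, not the kernel code: the class name ReadWriteBuffer, the byte-sized buffer, and the transfers counter are assumptions made for illustration, and a bytearray stands in for the file on the remote host.

```python
class ReadWriteBuffer:
    """Toy model of one per-file read-write buffer (not the shfs kernel code).

    `remote` is a bytearray standing in for the remote file; `transfers`
    counts the round trips that would each cost a remote command.
    """

    def __init__(self, remote: bytearray, size: int = 4096):
        self.remote = remote
        self.size = size          # buffer capacity in bytes (hypothetical)
        self.offset = 0           # file offset where the buffer starts
        self.data = bytearray()   # buffered bytes
        self.dirty = False        # True while buffered data is not yet sent
        self.transfers = 0

    def flush(self) -> None:
        """Send the whole dirty buffer in one transfer, then empty it."""
        if self.dirty and self.data:
            if len(self.remote) < self.offset:
                # Pad a gap with zero bytes, as a sparse write would.
                self.remote.extend(b"\0" * (self.offset - len(self.remote)))
            end = self.offset + len(self.data)
            self.remote[self.offset:end] = self.data
            self.transfers += 1
        self.data = bytearray()
        self.dirty = False

    def read(self, offset: int, count: int) -> bytes:
        """Return up to `count` bytes, refilling the buffer on a miss."""
        hit = (not self.dirty
               and self.offset <= offset
               and offset + count <= self.offset + len(self.data))
        if not hit:
            self.flush()          # dirty data is flushed first
            self.offset = offset
            self.data = bytearray(self.remote[offset:offset + self.size])
            self.transfers += 1   # one full-buffer read from the remote side
        start = offset - self.offset
        return bytes(self.data[start:start + count])

    def write(self, offset: int, chunk: bytes) -> None:
        """Buffer a write; flush first if it does not fit or is not contiguous."""
        contiguous = self.dirty and offset == self.offset + len(self.data)
        if not contiguous or len(self.data) + len(chunk) > self.size:
            self.flush()
            self.offset = offset
        self.data += chunk
        self.dirty = True

    def close(self) -> None:
        """Closing the file sends whatever is still buffered."""
        self.flush()
```

In this model, many small sequential writes end up as a few large transfers, which is the effect the cache is after.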
Directory cache
- this cache is taken almost intact from smbfs/ncpfs; it uses the plain dentry cache and page cache to prevent rm -rf from complaining
- the time-to-live of a dentry is 30 seconds by default; it can be changed at mount time (ttl option)
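The time-to-live rule can be illustrated with a small user-space model. This is a sketch only: the real cache lives in the kernel's dentry and page caches, while the TtlDirCache class, its fetch callback, and the injectable clock below are assumptions made for the example.

```python
import time


class TtlDirCache:
    """Toy directory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl=30.0, clock=time.monotonic):
        self.ttl = ttl            # 30 seconds mirrors the documented default
        self.clock = clock        # injectable so the expiry can be tested
        self.entries = {}         # path -> (expiry time, listing)

    def lookup(self, path, fetch):
        """Return the cached listing, calling `fetch(path)` only when stale."""
        now = self.clock()
        cached = self.entries.get(path)
        if cached is not None and now < cached[0]:
            return cached[1]      # still fresh: no remote command needed
        listing = fetch(path)     # stale or missing: ask the remote side
        self.entries[path] = (now + self.ttl, listing)
        return listing
```

A lookup within the ttl window is answered locally; once the window has passed, the next lookup goes to the remote side again.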
Readline cache (proc.c, function sock_readln)
Lines are read all at once instead of char-by-char. This speeds up directory lookups.
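The idea can be sketched in user space as follows. This is not the actual sock_readln code: the LineReader class, its chunk size, and the recv_calls counter are assumptions for illustration. It receives data in chunks and serves lines out of the pending bytes, rather than issuing one receive call per character.

```python
import socket


class LineReader:
    """Serve lines from chunked socket reads instead of one byte at a time."""

    def __init__(self, sock, chunk=4096):
        self.sock = sock
        self.chunk = chunk        # bytes requested per recv (hypothetical)
        self.pending = b""        # data received but not yet returned
        self.recv_calls = 0       # counts receive calls, for comparison

    def readline(self):
        """Return the next line without its newline (b"" at end of stream)."""
        while b"\n" not in self.pending:
            data = self.sock.recv(self.chunk)
            self.recv_calls += 1
            if not data:
                # Peer closed the connection: return whatever is left.
                line, self.pending = self.pending, b""
                return line
            self.pending += data
        line, _, self.pending = self.pending.partition(b"\n")
        return line
```

When a directory listing arrives as many short lines, most readline calls are then answered from the pending bytes without touching the socket.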
How it all works together
The figure illustrates the process. When a user runs the shfsmount (or mount -t shfs) command to mount a remote share, basic checks are done, a process is forked, and the user command (ssh in most cases) is executed. This command has its stdin/stdout redirected so that shfsmount can use it for the connection. After the connection is established, shfsmount initializes the remote side and transfers the "server" code to it.
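The redirection step can be sketched like this. It is an illustration only, not shfsmount's code: the helper names open_channel and send_line are made up, and a tiny echo process (ECHO_COMMAND) stands in for ssh so that the example runs without a remote host.

```python
import subprocess
import sys

# Stand-in for the user command (ssh in most cases): echoes each line back.
ECHO_COMMAND = [
    sys.executable, "-u", "-c",
    "import sys\n"
    "while True:\n"
    "    line = sys.stdin.readline()\n"
    "    if not line:\n"
    "        break\n"
    "    sys.stdout.write(line)\n"
    "    sys.stdout.flush()\n",
]


def open_channel(command):
    """Start `command` with its stdin/stdout connected to pipes we hold."""
    return subprocess.Popen(command,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)


def send_line(proc, line):
    """Write one line to the child and return the one-line reply."""
    proc.stdin.write(line + b"\n")
    proc.stdin.flush()
    return proc.stdout.readline().rstrip(b"\n")
```

Once the child is running, the parent talks to it purely through these two descriptors, which is what lets the same channel later be handed over for filesystem traffic.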
Although shfs is a shell filesystem, there are two different server code implementations available: shell and perl. Both have the same basic functionality, although the perl code is faster and more robust. Each implementation has a test phase that does all the necessary checks (for perl, these would be the perl version, available modules, etc.).
Shfsmount then calls the mount syscall, which passes the file descriptor (of ssh stdin/stdout) to the kernel module. After that, shfsmount can either exit or, in persistent mode, wait for ssh to die and restart the connection.
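The wait-and-restart behaviour of persistent mode follows a common supervisor pattern, sketched below. This is an assumption-laden illustration, not shfsmount's implementation: the supervise function and its max_restarts limit are hypothetical, and the sketch leaves out how a new connection would be handed to the kernel module.

```python
import subprocess


def supervise(command, max_restarts):
    """Run `command`, restarting it each time it exits.

    Gives up after `max_restarts` restarts (a hypothetical limit) and
    returns the total number of times the command was launched.
    """
    launches = 0
    while launches <= max_restarts:
        proc = subprocess.Popen(command)
        launches += 1
        proc.wait()               # block until the connection process dies
    return launches
```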