The following paper was originally published in the Proceedings of the USENIX Summer 1993 Technical Conference, Cincinnati, Ohio, June 21-25, 1993.

For more information about the USENIX Association contact:
1. Phone:   510 528-8649
2. FAX:     510 548-5738
3. Email:   office@usenix.org
4. WWW URL: https://www.usenix.org


The Restore-o-Mounter: The File Motel Revisited

Joe Moran
Bob Lyon
Legato Systems, Incorporated


Abstract

We present a scheme for referencing and accessing saved (( footnote 1: We use the word "save" to denote the superset of "backup" and "archive"; save is also easier to conjugate than backup. )) files in a manner that is transparent to UNIX applications. The scheme requires no kernel modifications. Instead, it uses a "mounted" process that allows users to change directories to the past and browse their saved files with their favorite utilities. The mounted process acts as a protocol gateway between NFS and a commercially available network backup product. Time travel is supported; users may change directories to any moment in the past. Any saved version (not just the most recent version) of any file can be viewed or recovered, even if the file has since been deleted.

Using this transparent method of retrieving saved files by naming their location in the past, a poor man's file migration scheme can be implemented by substituting a symbolic link to a saved location for a file. Once the file is referenced, the symbolic link can be replaced with the original file. This migration scheme requires no kernel modifications yet remains transparent to UNIX applications and users.

Introduction

This paper describes two file management features (( footnote 2: The features described in this document should not be construed as product features currently available from Legato. )) that have eluded UNIX users - file recovery integrated into the operating environment and transparent file archiving (more commonly called file migration). Since saved files can be named and accessed with the same ABI as normal files, users can employ their favorite UNIX tools to find and restore deleted or damaged files. The most obvious examples of such tools are cd, ls, find and cp. Equally important, though, are the users' environments - shells like csh or ksh that are customized to each individual and provide services such as file name completion and pattern matching, or even windowing environments that completely hide the normal UNIX commands.

File archiving and migration strategies are driven by economics and convenience. Most users do not mind that "inactive" files are removed from their local disks and archived elsewhere, provided that the files are easily and rapidly recoverable when needed. Migration of this sort is economical because low capacity, high performance, high priced disks contain only frequently accessed files, while high capacity, low performance, low cost media (like optical disks or tapes) contain seldom or never accessed files. Users find it far more convenient to archive files to free local disk space than to perform the numerous tasks and endure the long elapsed times associated with ordering and installing new and bigger disks.
Customers might also wish to "actively" archive complete directory sub-trees after important events occur at their companies. For example, when a software company ships a major release of its product, it might want to immediately archive that release, including documentation and SCCS files, perhaps even the compilers and system libraries used to build the release. A second example is a company that archives a departed employee's home directory, thereby freeing much needed disk space without risking the deletion of important data. These features are of very limited use if they are not available in the networked environment.

Why delivering solutions is hard

Ease of use is the key attribute of any solution involving file recovery. In the UNIX environment, this usually means the solutions have to be compatible with and transparent to users' existing tools. Since the common ABI for all tools is the UNIX system call interface, this generally means that a solution must be implemented as enhancements to the kernel's filesystem primitives. The Virtual File System (VFS) architecture [KLEI86] was introduced as part of the implementation of the Network Filesystem [SAND85] to separate the filesystem interface from the various filesystem implementations. VFS clearly made adding filesystem functionality easier, provided that the implementor had access to the kernel source and understood the rules associated with extending the kernel. However, independent software vendors (ISVs) cannot depend on the VFS interfaces; they are not identical on each UNIX vendor's platform, and each new release of a specific UNIX variant can render the ISVs' added functionality useless until the code can be re-ported to the new kernel.

Hardware costs also conspired to keep solutions unavailable. The price of media jukeboxes allowed only the richest and most dedicated companies to consider deploying automated recovery systems. The limited size of this niche market further discouraged UNIX ISVs from producing software that assisted in the automation. However, within the last twelve months, the price of tape jukeboxes has plummeted while the reliability of their robotics has risen sharply. For example, one can buy a 50 gigabyte (before compression) tape jukebox with tape drive for less than $5,000. With compression and larger jukeboxes, end-user cost might drop as low as three cents per megabyte.

About NetWorker

Legato sells a backup and recovery product called NetWorker [LEGA93]. NetWorker is designed and implemented using a client/server architecture. A server is a machine that manages the saved media (usually tapes or optical disks) and an "on-line index" consisting of two databases. The first database maps a file's key and a time to the file's attributes. The second database maps a file's key and a time to a location in the server's media pool where the complete file's attributes and data are saved. Note that the NetWorker server knows no details about its clients or the clients' files and filesystems. The server is required to retrieve file attributes (quickly) and file data (a little more slowly) when and if a client demands them. The server supports most known tape devices, optical disks, and normal files. Many popular media robots are also supported for "unattended" operation.

The NetWorker client is the machine with files and filesystems that need to be saved. The client walks the local filesystems and sends their descriptions and data to the server. In the case of a UNIX client, the files' keys are both file names and device-inumber pairs, while the files' attributes are identical to NFS attributes. A UNIX directory has no data, but its attributes include a list of all files found within the directory. The client knows nothing about how its files' attributes and data are archived by the server; it only knows that the server can deliver either on demand.
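The shape of the on-line index can be pictured with a small C sketch. The structures and field names below are our own illustration of the two index mappings described above, not NetWorker's actual definitions:

    #include <sys/types.h>
    #include <sys/stat.h>

    /* Illustrative sketch (not NetWorker source) of the two on-line
     * index mappings.  Both are keyed by a file key plus a save time. */
    struct nsr_key {
        char   *name;           /* file name on the client               */
        dev_t   dev;            /* device and inumber complete the key   */
        ino_t   ino;            /* for a UNIX client                     */
        time_t  save_time;      /* when this instance was saved          */
    };

    /* First database: (key, time) -> the file's attributes. */
    struct nsr_attr_entry {
        struct nsr_key key;
        struct stat    attrs;   /* NFS-like attributes as of the save    */
    };

    /* Second database: (key, time) -> where the data lives in the pool. */
    struct nsr_media_entry {
        struct nsr_key key;
        char  *volume;          /* e.g. a tape or optical volume name    */
        off_t  offset;          /* position of the saved file on media   */
    };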
The product's client side includes file browsers that allow a user to view saved filesystems as of any time in the past. The command line interface to the browser is similar to BSD restore -i, with additional features for time specification and file version information. The product also includes X Window System browsers for command-line challenged users. The browser translates user inputs into queries against the server's on-line index and presents the results as a view into a saved filesystem. Once the user has selected files for recovery, the browser submits a request to the server for the data associated with the desired files.

Clients communicate with servers via an application specific protocol built on ONC (Sun) Remote Procedure Calls [NOWI88]. The name of the application protocol is "NetWorker Save and Recover" or NSR. Typical protocol stacks are NSR/RPC/TCP/IP or NSR/RPC/SPX/IPX. NetWorker server implementations are available on eight UNIX vendors' platforms as well as NetWare. Client-only implementations are also available for DOS and numerous other UNIX platforms.

More details of NetWorker's design and implementation are covered in NetWorker's theory of operations manual and are beyond the scope of this paper. Most of the hard design issues were addressed years ago during NetWorker's development, and the features of the restore-o-mounter are built upon these mature solutions.

Ib: the file index browser

{{ Figure 1. The restore-o-mounter forks processes that translate the NFS protocol originating from the local kernel to the NSR protocol destined for a NetWorker ("backup") server. The NetWorker server is not necessarily a UNIX machine or even an NFS server. }}

When you lose a file, why can't you just "change directory" to the past and copy the file back to the present? This simple question was the inspiration for the restore-o-mounter. The first reason why you can't do that is because the past is not a mounted filesystem, and normal users can't use mount anyway. However, the automounter [CALL89] mounts filesystems for normal users on demand. The automounter is a UNIX process that mounts itself as an NFS server in the UNIX filesystem. The automounter then turns name references sent from its kernel into mounts of actual filesystems. The automounter also periodically attempts to auto-unmount the filesystems that it mounted.

The index browser (ib) takes a similar approach. By default, it mounts itself on the mount point /ib, but instead of translating NFS requests from its kernel into mount actions, the requests are translated into browse sessions with NSR servers. Figure 1 shows the conceptual workings of the index browser; it is essentially a protocol gateway that translates from the NFS protocol to the NSR protocol. When presented with a new client name or a new time to browse the saved files, ib forks and execs a copy of iba, the index browser agent. The generic form of the names that ib translates to iba sessions is

    [[server:]client][@time]

Server is the name of the NSR server to use; note that NSR client software usually determines the appropriate server.
The ability to specify the NSR server is needed only in environments where a single NSR client has multiple NSR servers. Client is the name of the NSR client whose saved files will be browsed and potentially recovered; the default client name is the local machine's name. Time determines which view will be browsed; the default time is the most recently saved version.

Ib's most visible function is to choose suitable and intuitive defaults and to convert human usable time strings into NSR suitable time values. Ib uses the getdate routines [BELL87] to provide user friendly time translation. Here are some examples of changing the working directory into some root directories in the past:

    cd /ib/@now
        changes directory to the most recent save for the local machine.
    cd /ib/@yesterday
        changes directory to yesterday's save for the local machine.
    cd /ib/ganymede
        changes directory to the most recent save for the machine ganymede.
    cd /ib/jupiter:io
        changes directory to the most recent save for the machine io saved to the NSR server jupiter.
    cd /ib/jupiter:io@last_month
        changes directory to the save of a month ago for the machine io saved to the NSR server jupiter.

By the time the ib process receives an NFS lookup or stat on a name in /ib, the kernel has already concluded that the name is not a filesystem mount point. Therefore, ib creates a new directory whose name is derived from the name submitted by the kernel, mounts an index browser agent on the derived directory, then creates the submitted name as a symbolic link to the derived directory, and finally returns the symbolic link information to the kernel. For example, after a user references a file from "yesterday's" save of the local machine, /ib may look like:

    io% cd /ib/@yesterday
    io% ls -l /ib
    total 2
    lrwxrwxrwx  1 root    10 Apr  9 10:33 @yesterday -> jupiter:io@04-08-93_10:33:46
    drwxr-xr-x 25 root  1536 Apr  6 09:02 jupiter:io@04-08-93_10:33:46

One can see that the local machine's name is io and that its default NSR server is jupiter. The getdate routine mapped "yesterday" to April 8, 1993 at 10:33:46 am, which is exactly twenty-four hours prior to the execution of this example.

Ib periodically attempts to unmount the mounts that it automatically performed. Upon a successful unmount, the mount point and symbolic link are removed. We found that removing the mount point caused problems with programs that remember the true mount point path names (e.g., the OpenWindows File Manager and ksh). To alleviate this, ib's time translator appends (or removes) an extra underscore character `_' at the end of the time string when the time string is otherwise identical to the string submitted by the kernel. So, should the unmounted name be re-submitted, the contents of /ib may eventually look like:

    io% ls -l /ib
    total 2
    lrwxrwxrwx  1 root    28 Apr  9 11:27 jupiter:io@04-08-93_10:33:46 -> jupiter:io@04-08-93_10:33:46_
    drwxr-xr-x 25 root  1536 Apr  6 09:02 jupiter:io@04-08-93_10:33:46_
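The lookup-time behavior just described can be condensed into a short C sketch. The helper functions (derive_canonical_name, fork_and_exec_iba) are placeholders invented here for illustration; they are not ib's actual internals:

    #include <sys/param.h>
    #include <sys/stat.h>
    #include <string.h>
    #include <unistd.h>

    /* Hypothetical helpers standing in for ib internals. */
    extern void derive_canonical_name(const char *in, char *out, size_t len);
    extern void fork_and_exec_iba(const char *mount_point);

    /* Sketch of ib reacting to an NFS lookup on a new name under /ib.
     * Paths are shown relative to /ib for brevity. */
    void
    ib_lookup(const char *submitted)             /* e.g. "@yesterday" */
    {
        char derived[MAXPATHLEN];

        /* Apply the defaults and run the time through getdate(), yielding
         * something like "jupiter:io@04-08-93_10:33:46". */
        derive_canonical_name(submitted, derived, sizeof derived);

        /* When the derived name would equal the submitted one, append a
         * trailing '_' so the link and its target remain distinct names. */
        if (strcmp(submitted, derived) == 0)
            (void) strncat(derived, "_", sizeof derived - strlen(derived) - 1);

        (void) mkdir(derived, 0755);         /* fabricate the mount point  */
        fork_and_exec_iba(derived);          /* iba mounts itself on it    */
        (void) symlink(derived, submitted);  /* submitted name -> derived  */

        /* ib finally answers the lookup with the symbolic link's attributes. */
    }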
Iba: the index browser agent

{{ Figure 2. The typical mount points in a UNIX filesystem after two browsing sessions are initiated with the NetWorker server. }}

Figure 2 shows the results of ib forking and execing two index browser agent (iba) processes. Like ib, iba is a "mounted" NFS server process that only responds to calls from its kernel. When launched by ib, iba mounts itself on top of a directory fabricated by ib. Iba establishes an NSR connection subject to the arguments passed to it by its parent. The arguments include which NSR server to bind to, which NSR client's index to browse, which view (in time) to present to its kernel, and where to mount itself in the UNIX file name space.

As mentioned before, NetWorker's on-line index for a UNIX file contains the complete stat information of that file. If the file is a symbolic link, then the link's value is also kept on-line; this means that NSR media need not be accessed to resolve symbolic links. Because NFS and NSR file attributes are similar, and because the NFS and NSR architectures are similar (the original designers of both architectures overlap considerably), iba's primary task is implemented by a one-to-one mapping of NFS operations onto NSR operations that are already available, through a library, to NetWorker's browsers.

The challenge in most NFS server implementations is the design of the NFS file handle (fhandle). Iba fhandles contain actual pointers to cached in-core data structures built by the library as files are referenced. Iba fhandles also contain information used to verify their validity. The size and speed of iba are covered in a later section.

{{ Figure 3. A typical application's view of restore-o-mounted filesystems. The two sub-trees were generated by the user pointing the OpenWindows File Manager to the directories /ib/io@now and /ib/io@yesterday. In this example, numerous directories were deleted between yesterday and today. Note that the two folders at the far right are symbolic links to their corresponding sub-trees. }}

Iba servicing NFS reads

Iba has command line arguments that determine one of three policies for servicing NFS read requests:

1. Always attempt file recovery on NFS read.

2. Perform file recovery on NFS read if the data can be recovered automatically, without requiring any human intervention. This is the default. NetWorker software knows if the volume(s) that contain the requested data are mounted on a device or available in a robotic jukebox. If so, the data is considered "near-line" and iba will fetch it. If the data is not near-line, iba returns ERREMOTE (( footnote 3: ERREMOTE is not an error defined in the NFS protocol specification. It is defined as part of the System V RFS definition. Its associated error string "Object is remote" almost matches the desired "the data is too far away". This is probably good enough for a system that was wont to print "Not a typewriter". Clearly iba is relying on its NFS client, the local kernel, to pass most errors up to its callers without any attempt to interpret them. )) as the result of the NFS read operation.

3. Never attempt file recovery on NFS read. The result of the NFS read operation is always ERREMOTE. This is handy if the customer does not want the NetWorker server bogged down doing actual data recovers and desires a "stat-only" filesystem. In this case, the restore-o-mounter could more appropriately be called the "browse-o-mounter".

The latter options are not found in related work but are useful in the restore-o-mounter, because the NetWorker server most likely stores the data on tape. The seek performance of this class of media is at least three orders of magnitude slower than that of rotating media. So, while fetching a few files from tape via iba may be practical, grep'ing one's entire home directory in the past for a pattern is not recommended. NetWorker's recover command is more suitable for moving vast amounts of data from tapes to disks. Still, users agree that a wealth of information is available from stat-only filesystems.
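The three policies amount to a small dispatch at read time. In the C sketch below, the policy names and the nearline() and recover_to_cache() helpers are invented for illustration; only the overall decision structure and the ERREMOTE result follow the description above:

    /* "Object is remote"; the numeric value here is illustrative only. */
    #ifndef ERREMOTE
    #define ERREMOTE 66
    #endif

    enum read_policy { READ_ALWAYS, READ_IF_NEARLINE, READ_NEVER };

    /* Hypothetical stand-ins for iba/NetWorker internals. */
    extern int nearline(const char *file);         /* volume mounted or in a jukebox? */
    extern int recover_to_cache(const char *file); /* fetch the whole file            */

    /* Sketch of iba deciding how to answer an NFS read on a file. */
    int
    iba_service_read(enum read_policy policy, const char *file)
    {
        switch (policy) {
        case READ_ALWAYS:                   /* 1. always attempt recovery   */
            return recover_to_cache(file);
        case READ_IF_NEARLINE:              /* 2. recover only if no human  */
            if (nearline(file))             /*    intervention is needed    */
                return recover_to_cache(file);
            return ERREMOTE;                /* the data is too far away     */
        case READ_NEVER:                    /* 3. stat-only filesystem      */
        default:
            return ERREMOTE;
        }
    }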
When a read request does trigger recovery, iba requests the entire file from the NSR server. The file is then cached in a more traditional filesystem on the UNIX machine. (The location of the file cache is yet another command line argument to iba.) The original read operation is not responded to until the entire file is recovered. Subsequent reads are then serviced from the cache. File caching is the optimal way of dealing with the very cheap but very low performance tape media. However, a subsequent section shows how the cache may become the real thing.

The UNIX kernel is multi-threaded, but iba (like most user NFS server processes) is single-threaded. To avoid suspending all lookup service for the duration of any file fetch, iba forks a child to handle the recovery of each file. The parent then resumes NFS service but ignores read requests associated with children who are actively recovering the data. That is, a read request is ultimately answered by the parent iba, but only after a child has placed a complete file in the cache.

Versions and hidden files

The NetWorker product provides a means to see every saved version of any file. Iba employs hidden file techniques to see and explicitly name these versions. A hidden file name never appears in the results of the NFS readdir call, but the same name yields successful results when used as an argument to the NFS lookup call. As originally described in The File Motel [HUME88], versions of a file f may be found in the hidden directory f.V, independent of iba's browse time. For example,

    io% cd /ib/io/etc
    io% ls -l rc.local.V
    total 28
    -rw-r--r--  1 root  7564 Apr  1 09:03 v1:_jupiter_5g.124_at_\dev\nrst9
    -rw-r--r--  1 root  7396 Mar  5 08:39 v2:_jupiter_5g.113_at_Engn_Jukebox
    -rw-r--r--  1 root  7361 Jan  9 19:58 v3:_jupiter_5g.062_at_Engn_Jukebox
    -rw-r--r--  1 root  7096 Dec  3 12:32 v4:_jupiter_5g.018

shows all the saves of the machine io's /etc/rc.local file. The generic name of files in a versions directory is

    v#:_volume[_at_location]

where v1 through vn are assigned to versions, beginning with the most recent save. Volume names the save media containing that version. Location is provided if it is known. In this example, the most recent save of rc.local is on a tape that is currently mounted in /dev/nrst9. In order to have legal file names, iba translates any forward slashes in the location information into backslashes. The next two versions reside in the engineering jukebox, while the last version, on the tape "jupiter_5g.018", may be sitting in the non-automated portion of the media pool.

The listed modification times above indicate that the file was saved because it changed. Files are also saved during "fulls"; listings of their versions are pretty boring. For example,

    io% ls -l /ib/ganymede/vmunix.V
    total 6401
    -rwxr-xr-x  1 root  828881 Jan 11 09:17 v1:_jupiter_5g.124_at_\dev\nrst9
    -rwxr-xr-x  1 root  828881 Jan 11 09:17 v2:_jupiter_5g.114_at_Engn_Jukebox
    -rwxr-xr-x  1 root  828881 Jan 11 09:17 v3:_jupiter_5g.101_at_Engn_Jukebox
    -rwxr-xr-x  1 root  828881 Jan 11 09:17 v4:_jupiter_5g.063_at_Engn_Jukebox

shows that the vmunix built for ganymede has not changed since its installation.
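The naming convention can be made concrete with a short C sketch; the function below is our own illustration of how such names could be composed, not iba source:

    #include <stdio.h>
    #include <string.h>

    /* Compose a version name of the form v#:_volume[_at_location],
     * translating any '/' in the location into '\' so that the result
     * is a legal file name.  Illustrative only. */
    static void
    version_name(char *buf, size_t len, int n,
                 const char *volume, const char *location)
    {
        char *p;

        if (location != NULL)
            (void) snprintf(buf, len, "v%d:_%s_at_%s", n, volume, location);
        else
            (void) snprintf(buf, len, "v%d:_%s", n, volume);

        for (p = strchr(buf, '/'); p != NULL; p = strchr(p, '/'))
            *p = '\\';
    }

For example, version_name(buf, sizeof buf, 1, "jupiter_5g.124", "/dev/nrst9") yields v1:_jupiter_5g.124_at_\dev\nrst9, the first entry in the rc.local.V listing above.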
Iba usurps all the access time values and replaces them with the (NSR supplied) time when the save occurred. So by using the -u option to ls, we see

    io% ls -lu /ib/ganymede/vmunix.V
    total 6401
    -rwxr-xr-x  1 root  828881 Apr  2 22:25 v1:_jupiter_5g.124_at_\dev\nrst9
    -rwxr-xr-x  1 root  828881 Mar  5 22:41 v2:_jupiter_5g.114_at_Engn_Jukebox
    -rwxr-xr-x  1 root  828881 Feb  5 21:49 v3:_jupiter_5g.101_at_Engn_Jukebox
    -rwxr-xr-x  1 root  828881 Jan 11 21:38 v4:_jupiter_5g.063_at_Engn_Jukebox

that ganymede's vmunix was saved soon after it was built and late in the evening of the first Friday of each month thereafter.

Users should approach hidden directories with caution, since pwd can't find their names:

    io% cd /ib/ganymede/vmunix.V
    io% pwd
    pwd: getwd: read error in ..
    io% cd ..

Iba also supports naming non-directory files at any point in time with the simple syntax f@time. These names are also hidden, and they remain independent of iba's browse time. For example, the following two commands are identical:

    io% strings vmunix.V/v3*
    io% strings vmunix@Feb_6

{{ Figure 4. This is a screen shot of an NSR Motif-based console monitoring the NSR server jupiter. All four sessions originate from two browsing sessions on the machine io. The first browser is navigating the machine quattro's saved files and has forked a child, pid 2023, to recover some of quattro's data (indicated by the last two lines of the Sessions panel). The data may be a while in coming since 114 tape file marks must first be skipped (indicated by the last line of the Devices panel). The second line of the Sessions panel shows that the browser on io is navigating the machine secondo's filesystems as of yesterday. }}

Yclept files

The previous section dealt with iba hiding versions of files in various directories. A different type of hidden file is one that is hidden by time. This corresponds to the case where a user knows the name of a lost file but does not recollect a date (usually in the distant past) when he last had the file. The user could eventually locate the file by systematically traveling backwards in time until finding the date when the file last existed, but this could be cumbersome. Iba provides another feature inherited from NetWorker to avoid the tedious searching in time. If a user can name a file, he can recover it. Iba requires no special syntax for naming files (including directories) obscured by time. However, the exact name is required, since the name does not exist in the current version of its parent directory. Once yclept, files remain legitimate (not hidden) members of their parent directories.

Final words on ib and iba

The file handles provided by iba may contain pointers to data structures (nodes) associated with referenced files. These nodes are currently never freed, so the persistence of file handles is guaranteed. Two obvious consequences are that mapping a file handle to its associated data is trivial, but that the worst case may require large amounts of virtual memory. The average measured size of iba's file node is 115 bytes, including malloc overhead. The storage is mostly used by the NFS stat structure; actual file names are a distant second. Because file access through iba is casual, and because iba exits once auto-unmounted, the anticipated swap requirements when caching all file nodes are still quite reasonable. Should iba memory usage become a problem, an alternative implementation of iba could use less caching and place more of a burden on the NSR server by regenerating iba file nodes on demand from other information contained in the iba fhandle.
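What such a file handle might carry can be pictured with a small C sketch; the field names below are illustrative, and only the pointer-plus-validity idea comes from the description above:

    /* Illustrative layout of an iba file handle.  The structure must fit
     * within the opaque 32-byte NFS (version 2) file handle. */
    struct iba_node;                     /* cached attributes, names, etc.  */

    struct iba_fhandle {
        struct iba_node *node;           /* direct pointer into the cache   */
        unsigned long    generation;     /* must match a value stored in    */
                                         /* the node for the handle to be   */
                                         /* considered valid                */
        unsigned long    magic;          /* guards against stale or foreign */
                                         /* handles                         */
    };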
The designs of the NFS file handles are different in ib and iba because these two programs have very different job descriptions, though both are NFS servers. Iba is an application-level gateway between the NFS and NSR protocols. Ib provides ease-of-use features by auto-mounting and launching iba with the appropriate arguments. The original deployment of NFS separated launch (mount) from operation, and this separation has since proven to be a good idea [PUGS84]. We anticipate other methods for launching the existing iba at points spread throughout the traditional UNIX filesystems.

Performance

The find command was used to traverse a machine's filesystems in three ways - local UFS, via NFS from across a quiescent Ethernet, and via the restore-o-mounter to an NSR server across the same quiescent Ethernet. In all cases, the machines involved were Sun SPARCstation-2s with 64 MB of main memory and 3.25 inch SCSI disks with 13.5 msec average access time. The NSR server managed 416 gigabytes from 34,000 distinct save sessions from 56 clients. The client chosen for the benchmark had 398,000 file instances in the NSR server's on-line index. Each of the client's filesystems had at least three full saves (stretching back at least three months) in the on-line index. The identical find command is run twice to show the caching effects of UFS, NFS and the restore-o-mounter. Figure 5 shows the results.

    Access    Files    Seconds    Norm'z    Norm'z
    method    found    elapsed    UFS       NFS

    UFS 1     71441       621     1.00      0.61
    UFS 2     71452       750     1.00      0.60
    NFS 1     71284      1120     1.64      1.00
    NFS 2     71286      1144     1.67      1.00
    IB  1     70979      1890     2.76      1.67
    IB  2     70979       479     0.70      0.42

Note that NFS takes 64% - 67% more time than local UFS. In the first pass, the restore-o-mounter takes 67% more time than NFS, or 176% more time than UFS. Most of the restore-o-mounter time for the first pass is attributable to the NSR server, and not to the iba process. The second time the command is run yields little difference in the performance of UFS and NFS. But the restore-o-mounter blazes away, taking only 42% of the time that NFS takes and only 70% (( footnote 4: The biggest advantage that the restore-o-mounter has over the traditional filesystems is that its files are static and its metadata can be cached in a small amount of virtual memory. Nevertheless, we hope this astounding number helps to debunk three myths: that code runs faster in the kernel; that marshalling data through XDR is inefficient; and that context switching to user level (file) services is expensive. )) of the time that UFS takes!

Each pass of the find benchmark shows the extremes of the relative lookup and stat performance of the restore-o-mounter. Neither is very realistic given the usage model of the restore-o-mounter. Read performance was not benchmarked because it is highly dependent on the speed of the underlying media holding the file data. Obviously, minutes may pass before any data is forthcoming from a tape jukebox. Related work asserts that these long lag times are reason enough to exclude tape media from any serious consideration of transparent file recovery services. Our experience does not bear this out; users tolerate seemingly long delays because the process is entirely automated, highly reliable, and provides feedback (via the NetWorker monitors) regarding its progress. Once users trust the system, they usually switch to another task (take a coffee break or read e-mail) and switch back after their files are recovered.
Transparent file migration

We have shown how the restore-o-mounter provides application-transparent file access to saved files. In this section we present a new application that exploits restore-o-mounter features to provide a poor man's file migration utility.

Note that NSR clients distinguish backup data from archive data. The NSR server separates these two types of data into different media pools. The media with backup data is eventually deleted or recycled according to customer policies; after all, the data is merely a copy of the real thing. Archived data is expected to live forever. Rather than being a copy, it is assumed that the data is the real thing.

As the NetWorker product ships today, files are archived by explicit user actions. Once the files are archived, users may then choose to explicitly remove some files. (( footnote 5: In some of our competitors' products, file removal is automatic. This is euphemistically called "filesystem grooming". )) In doing so, it is up to the users to remember their actions and to recover the files should they ever be needed.

The best way to remember removed files is by systematically making notes about them in the filesystem. Related work incorporates new information into a file's inode. This has the advantage that users or applications need not perform any special actions to recover removed files when running a modified kernel that automatically reacts to references to the new type of inode. Our scheme also makes notes about removed files. But rather than extending the inode information, a removed file is simply replaced by a symbolic link to the restore-o-mounter name of the archived file. For example, a user may record all his outgoing e-mail messages in a file that he periodically moves to his notion of a mail archive. If the full path name of such a file was /home/io/mojo/mail.record.92, then the associated symbolic link might appear as:

    mail.record.92 -> /ib/jupiter:io/home/io/mojo/mail.record.92@03-03-93_11:52:47

Note that the resolution of the symbolic link contains the explicit save time of the archived file. Omitting the time would cause an infinite loop once the symbolic link itself is saved and the symbolic link within the iba filesystem is dereferenced! The following symbolic link is functionally equivalent:

    mail.record.92 -> /ib/jupiter:io@03-03-93_11:52:47/home/io/mojo/mail.record.92

But, while the first link can use any existing iba session for the client io to the NSR server jupiter, the second requires an explicit iba session at the time 03-03-93_11:52:47. Therefore, the first style of link is used to avoid unnecessary process forking.

Two nice features fall out of the symbolic link approach to migrated files. The first is that users may rename their files without losing the associated archives. Second, the symbolic links are saved during traditional backups. So if an entire directory or an entire disk is lost, the traditional recovery replaces the symbolic links without placing all of the archived data back onto the disk.
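A minimal C sketch of the substitution itself, assuming the first style of link, is shown below; the function name and buffer handling are our own illustration, not the migrator's actual code:

    #include <stdio.h>
    #include <unistd.h>

    /* Replace an already-archived file with a symbolic link into the
     * restore-o-mounter name space.  The explicit save time is embedded
     * so that resolving the link can never loop back onto the link itself. */
    static int
    migrate_to_link(const char *path,       /* e.g. "/home/io/mojo/mail.record.92" */
                    const char *server,     /* e.g. "jupiter"                      */
                    const char *client,     /* e.g. "io"                           */
                    const char *save_time)  /* e.g. "03-03-93_11:52:47"            */
    {
        char target[1024];

        (void) snprintf(target, sizeof target, "/ib/%s:%s%s@%s",
                        server, client, path, save_time);

        if (unlink(path) == -1)             /* remove the archived original        */
            return -1;
        return symlink(target, path);       /* leave the stub in its place         */
    }

With the example arguments shown in the comments, the link target is exactly the mail.record.92 link listed above.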
Unmigration

As mentioned before, iba usually caches a complete file's data when it processes an NFS read request. The restore-o-mounter can be invoked with an option which recovers files back to their original locations when the files' data is accessed. The following conditions must be met if iba is to recover a file in place:

+ the file name presented to iba must be of the form f@date, where date is an exact match of f's save time;
+ the file cannot overflow the filesystem's space;
+ the original location of the file must still be a symbolic link that directly resolves to the file name presented to iba.

The last rule implies that renamed migrated files cannot be recovered in place. Heuristics could be added to iba to allow it to hunt down the renamed symbolic link and recover to it. A simple heuristic would be to look for the symbolic link in the original parent directory. A more complicated one would have the NetWorker save command build a special symbolic link cache as it performs its daily backups. (If the file is recovered from tape, then iba may have quite a bit of clock time to burn; the search for the renamed symbolic link could be accomplished then.)

Use of the -L option to ls (use stat instead of lstat) hides the fact that a file is migrated:

    io% cd /home/io/mojo
    io% ls -lLt mail.record*
    -rw-------  1 mojo   985484 Apr 10 23:42 mail.record
    -rw-------  1 mojo  4871761 Jan  3  1993 mail.record.92
    -rw-------  1 mojo  6720668 Jan  2  1992 mail.record.91

while ls without the -L option shows the files' real status:

    io% ls -lt mail.record*
    -rw-------  1 mojo   985484 Apr 10 23:42 mail.record
    lrwxrwxrwx  1 mojo       57 Mar  3 11:54 mail.record.92 -> /ib/jupiter:io/home/io/mojo/mail.record.92@03-03-93_11:52:47
    lrwxrwxrwx  1 mojo       57 Mar  3 11:54 mail.record.91 -> /ib/jupiter:io/home/io/mojo/mail.record.91@03-03-93_11:52:47

Here we use egrep to recover in place a file from a tape jukebox, and use ls to see its new status:

    io% /bin/time egrep presto mail.record.92 > /tmp/foo
        253.8 real         5.7 user         2.1 sys
    io% ls -lt mail.record*
    -rw-------  1 mojo   985484 Apr 10 23:42 mail.record
    -rw-------  1 mojo  4871761 Jan  3  1993 mail.record.92
    lrwxrwxrwx  1 mojo       57 Mar  3 11:54 mail.record.91 -> /ib/jupiter:io/home/io/mojo/mail.record.91@03-03-93_11:52:47

The filesystem that iba exports for recovery in place is mounted for read and write access, since a file could be modified by applications. Applications may continue to access the file via its NFS handle into the iba process, while subsequent opens of the file will be handled by the file's originating filesystem. Thus one can modify a currently migrated file and get the correct behavior (e.g., cat >> currently_migrated_file).

The final observation is that in-place recovery can be performed by a restore-o-mounter running on any machine. The example above was performed on the machine io, which was also the originator of the migrated files. The example could have been performed on ganymede, which NFS mounts io:/home/io. This demonstrates why both the NSR server and the NSR client are named in every symbolic link.

Migration

Readers may have resigned themselves to the fact that this paper only addresses recovers and not saves. Despair no more. We allocate the following half page to describing archive saves.

The migrator was implemented by modifying the NetWorker save command to declare its saves as archive data. Then normal save policies were replaced with migration policies. Policies include:

+ the file's type
+ the file's change time
+ the file's size
+ the file's owner and permissions
+ the number of exact copies of the file on archive media

Only regular files are automatically migrated. Files greater than 16 Kbytes were typically chosen. Some systems also provide upper bounds for candidate files.
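A candidate test reflecting these policies might look like the following C sketch; the thresholds shown (16 Kbytes, roughly a month since the last change) are illustrative, as is the function itself:

    #include <sys/stat.h>
    #include <time.h>

    /* Sketch of a per-file migration-candidate test. */
    static int
    migration_candidate(const struct stat *st, time_t now)
    {
        if (!S_ISREG(st->st_mode))                /* only regular files        */
            return 0;
        if (st->st_size < 16 * 1024)              /* skip small files          */
            return 0;
        if (now - st->st_ctime < 30 * 24 * 3600)  /* skip recently changed     */
            return 0;                             /* files (threshold is an    */
                                                  /* assumption, not a policy  */
                                                  /* taken from the text)      */
        return 1;
    }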
Filtering files owned by special users or with certain permissions is also a good idea. For example, if we migrate executables owned by root, then we may end up migrating the kernel, shared libraries, single-user utilities like fsck, and even iba itself.

Some users feel that a file should be archived on multiple media before it is replaced with a symbolic link. The migrator checks this by reading the versions directory of a candidate file as it is presented by iba. If there are enough duplicate versions located on different volumes, then the migrator replaces the file with a symbolic link to the last such version. If a symbolic link is not substituted for the file but the file is a candidate for later migration, the migrator then saves the file to archive media. To actually migrate a file, the migrator must visit the file at least twice. The first time it saves the file to archive media; the last time, it replaces the file with the corresponding symbolic link.

Caveats

The migrator is an application that we expect to traverse the filesystems once a day, once a week, or maybe only once a month. A daemon that polls for free space in the filesystems could be implemented to launch a migrator should the free space drop below some threshold. Without kernel support, the migrator cannot protect applications from ENOSPC errors caused by short-term anomalous behavior.

Recovery in place of archived files by NFS clients can only succeed if the NFS clients share a common view of mounted filesystems with the NFS server. A file that was recovered in place but never modified may be re-migrated; this is not desirable. The unnecessary re-migration could be avoided if iba could control a restored file's change time.

Related work

Several earlier systems have provided a filesystem interface with time travel capability (primarily to access versions from past backups). The restore-o-mounter was originally inspired by The File Motel [HUME88], which implemented a Version 8 File System in addition to its own access commands to browse and recover past backups to optical disks. The restore-o-mounter provides additional functionality by extending the name space to allow specification of the NSR server, NSR client, and browse time. It also allows specification of an arbitrary version of any file within the restore-o-mounter file system and includes in-place recovery of archived files.

The 3DFS [ROOM92] uses NFS access to a filesystem constructed from previous backups to an optical disk jukebox. 3DFS has many similarities to the restore-o-mounter and NetWorker, but has a number of differences in design and implementation.

+ Both 3DFS and the restore-o-mounter look like a filesystem behind a standard NFS interface. Both allow you to use unmodified UNIX commands to browse and access old versions of files. Both accept dates for files at any point in the filesystem hierarchy. While the restore-o-mounter restricts the dates for a directory tree to the top of the tree (i.e., a mount point), 3DFS allows time travel at arbitrary points in the tree. But with the greater flexibility, 3DFS may present a less consistent view of the filesystem name space.

+ 3DFS uses only optical disk jukeboxes. The restore-o-mounter (via NetWorker) can use a variety of save media (e.g., 8mm and 4mm tapes, optical disk) with or without jukebox support.

+ 3DFS uses remote NFS mounts to a dedicated server, while the restore-o-mounter uses only local NFS mounts to a gateway process.
  Using local NFS mounts makes it easy to figure out when a mount is no longer busy (the unmount system call doesn't return EBUSY), provides control over NFS access (e.g., ib sets up reasonable timeouts and retransmission parameters), and makes it possible to provide better file migration support (files can be recovered in place). The disadvantage of using local NFS gateway processes is that there can be no centralized, network-wide caching of recovered files (although this functionality can be provided by the NetWorker server).

+ 3DFS has more complicated naming rules and glob'ing rules for increased flexibility, but sometimes requires special commands. Each directory can have a date associated with it, but this date is not visible via the standard UNIX pwd command. The restore-o-mounter has simpler file naming conventions and doesn't require any special commands to access version information, but provides less flexibility for accessing the data. We view the less complicated naming conventions and fewer options as a benefit.

+ 3DFS internal access methods are strictly path name based, so it does not handle file or directory rename situations gracefully. For example, renaming a directory renames all the files under it and breaks their history. The restore-o-mounter (via NetWorker) uses both path name and file id access methods and properly handles rename situations. This allows a correct view of a filesystem as of any time (subject to the times at which saves are performed).

The Plan 9 filesystem [PIKE90] uses a true filesystem on optical disk and a two-level cache to provide transparent filesystem access to previously backed up files. It also provides automatic file migration. However, Plan 9 provides support only for files in the Plan 9 filesystem, while the restore-o-mounter works with any traditional UNIX filesystem.

The Inversion filesystem [OLSO93] uses the POSTGRES database system to provide fine-grained time travel and transactional operations. However, to take advantage of these facilities, applications currently need to be rewritten to use new programmatic interfaces provided by a special library that must be linked with each application.

A number of systems provide a more tightly integrated file migration solution by providing custom OSes or modifying the kernel. The BUMP project used kernel modifications to provide automatic file migration hooks [MUUS89]. Epoch and NetStor provide file migration products based on NFS servers. Epoch has traditionally used specialized file servers to provide automatic file migration services. Currently Epoch is moving away from specialized kernels and is working to define UNIX kernel hooks that allow user processes to handle the bulk of the file migration services [WEBB93]. The restore-o-mounter file migration philosophy differs from these approaches in that it requires absolutely no OS changes, at the cost of some functionality.

Conclusions

We have shown that both transparent file recovery and transparent file migration can be accomplished with no modifications to the kernel. We avoided the VFS abstraction because an ISV cannot take advantage of it when delivering products into the ever-changing UNIX markets. A far more stable and universal interface, the NFS protocol, was readily combined with our NSR protocol to provide the new functionality.
Symbolic links, often reviled by users, were exploited as stubs for migrated files, making our scheme transparent to any application and compatible with any backup method. The Automounter philosophy is useful for moving functionality out of the kernel and providing new privileges for normal users. We believe that extending the UNIX system via new processes will soon become the norm and not the exception. Finally, the architecture described in this paper allows a variety of devices (not just optical jukeboxes) to be practical for both backup and file migration.

Acknowledgments

Joe Moran implemented all aspects of the restore-o-mounter after Bob Lyon said it should be easy. Bob Lyon wrote most of this paper. The restore-o-mounter builds upon the NetWorker product, which is implemented and deployed by the team at Legato Systems. Dave Cohrs, Bill Nowicki and Linda Weinert helped to turn our random thoughts into mostly coherent English. We give special acknowledgment to Tom Lyon for envisioning and championing the mechanisms that allow NFS servers to do more than just share files.

References

[BELL87] Steven M. Bellovin, the getdate routines, public domain software written in 1987, acquired from the University of North Carolina at Chapel Hill.

[CALL89] Brent Callaghan and Tom Lyon, The Automounter, 1989 Winter (San Diego) Usenix Conference Proceedings.

[HUME88] Andrew Hume, The File Motel - An Incremental Backup System for Unix, 1988 Summer (San Francisco) Usenix Conference Proceedings.

[KLEI86] Steven R. Kleiman, Vnodes: An Architecture for Multiple File System Types in Sun UNIX, 1986 Summer (Atlanta) Usenix Conference Proceedings.

[LEGA93] Legato Systems, NetWorker Product Overview, 1993, Palo Alto, CA.

[MUUS89] Michael John Muuss, Terry Slattery, and Donald F. Merritt, BUMP - The BRL/USNA Migration Project, November 1989 LISA Conference Proceedings, Monterey, CA.

[NOWI88] William I. Nowicki, RPC: Remote Procedure Call Protocol Specification, RFC 1057, Network Information Center, USC ISI, Marina del Rey, CA, 1988.

[OLSO93] Michael A. Olson, The Design and Implementation of the Inversion File System, 1993 Winter (San Diego) Usenix Conference Proceedings.

[PIKE90] Rob Pike, Dave Presotto, Ken Thompson, and Howard Trickey, Plan 9 From Bell Labs, 1990 Summer UKUUG Conference Proceedings, London.

[PUGS84] Tom Lyon, Why separate mounts from the NFS service, 1984, private communication to Sun's Network File System group.

[ROOM92] William D. Roome, The 3DFS: A Time-Oriented File Server, 1992 Winter (San Francisco) Usenix Conference Proceedings.

[SAND85] Russel Sandberg, David Goldberg, Steve Kleiman, Dan Walsh, and Bob Lyon, Design and Implementation of the Sun Network Filesystem, 1985 Summer (Portland) Usenix Conference Proceedings.

[WEBB93] Neil Webber, Operating System Support for Portable Filesystem Extensions, 1993 Winter (San Diego) Usenix Conference Proceedings.

Trademarks

The authors have made every effort to supply trademark information about products and services mentioned in this paper. 3DFS, Ethernet, Motif, NetWare, NFS, ONC, OpenWindows, RFS, SPARCstation-2, Sun-3, SunOS, UNIX, and X Window System are trademarks of their respective companies.

Biographies

Joseph Moran is a principal engineer at Legato Systems Incorporated, which he cofounded in 1988.
At Legato, he has had major roles in architecting and implementing all of Legato's products, including Prestoserve, the client and server sides of the UNIX and NetWare versions of NetWorker, and NetWorker for DOS, in addition to pitching in wherever needed. From 1984 to 1988 he was a Member of Technical Staff at Sun Microsystems. While at Sun he worked on a variety of tasks for the SunOS kernel, including implementing the virtual memory system, doing the original Sun-3 SunOS port, bringing SunOS up on a number of new machines, and designing and implementing kadb after getting tired of debugging kernels on bare hardware. From 1982 to 1984 he was a System Programmer at Hewlett Packard, where he was involved with various UNIX-related activities, including working on the original BSD UNIX port to the Hewlett Packard Precision Architecture. He received an M.S. in Computer Science in 1982 and a B.S. in Electrical and Computer Engineering in 1980, both from the University of Wisconsin, Madison. He can be reached via e-mail at mojo@Legato.COM.

Robert B. Lyon is the Vice President of Core Technologies at Legato Systems Incorporated, which he cofounded in 1988. At Legato, he has exerted architectural influence on the implementation and deployment of the company's products and has, at times, implemented components like NetWorker's database for its on-line file indices. From 1983 to 1988 he was the Project Leader and Manager of Sun Microsystems' Network File System group, where he played a similar role after implementing Sun's Remote Procedure Call and eXternal Data Representation (RPC/XDR) package. From 1979 to 1983 he was a Member of Technical Staff at Xerox Corporation's Systems Development Division, where he assisted in the implementation and tuning of all levels of the XNS protocol stack and had project lead responsibilities for the Clearinghouse Name Service. Prior to Xerox, he was an MTS at Bell Labs in Holmdel, NJ, where he had system administration responsibilities for a small lab of UNIX machines; he claims to be the first to deploy uucp outside of Murray Hill. He received an M.S. in Electrical Engineering from Stanford University in 1978 and a B.S. in Engineering from Cornell University in 1977. He can be reached via e-mail at blyon@Legato.COM.