Coda File System

3. Common Scenarios

There are several common scenarios that you may encounter. This chapter attempts to list as many of them as possible and suggest how to handle each scenario.

3.1 Constructing a hoardfile

Coda allows you to give files priorities that affect the cache manager. The higher the priority, the less likely it is that a file will be flushed from the cache to make space for another file. These priorities are stored in a hoard database internal to Venus. This database is preserved across invocations of Venus, but is erased when Venus is re-initialized.

The best way to set up a hoard database is by creating hoard files. After you've created the files once, you do not need to do it again for that set of files. You can create a hoard file by hand or by using the spy program. See the hoard(1) man page for more details.

To run spy:

  1. Run spy in the background, redirecting its output to a file.
  2. Run all of the programs and access files you want to hoard.
  3. Send a SIGTERM to spy (do not use ^C).
  4. Sort the output file, removing duplicates.
  5. Remove unnecessary entries.
  6. On each line, add an "a" at the beginning and a priority at the end.
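
Steps 4 and 6 are mechanical and can be scripted. The following is a minimal sketch, assuming the spy output contains one pathname per line; the function name make_hoardfile and the default priority of 600 are illustrative choices, not part of Coda. Step 5, removing entries you do not need, still has to be done by hand.

```shell
# make_hoardfile [PRIORITY]
# Hypothetical helper (not part of Coda): reads pathnames on standard input,
# one per line, and writes hoard "add" commands on standard output.
make_hoardfile() {
    # Step 4: sort the paths and drop duplicates.
    # Step 6: prefix each non-empty line with "a" and append the priority.
    sort -u | awk -v prio="${1:-600}" 'NF { print "a " $0 " " prio }'
}
```

For example, make_hoardfile 600 < gemacs.out > gemacs.hdb would turn a raw spy log into a file that is ready for pruning.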

The following is an example of creating a hoard file for gnu-emacs. Note that while running gnu-emacs, I explicitly enter "scribe mode." This makes sure that the scribe-specific files are fetched.

% spy > gemacs.out &
[1] 316
% gnu-emacs
% kill %1
[1]    Done                 spy
% sort -u gemacs.out > gemacs.hdb
% cat gemacs.hdb

Next, I would delete the first and last lines of the file, as I do not need them, and then add the hoard-specific commands. The final file looks like:

a /coda/i386_mach/omega/usr/local/emacs 600
a /coda/i386_mach/omega/usr/misc/.gnu-emacs 600
a /coda/misc/gnu-emacs/i386_mach/omega/bin/gnu-emacs 600
a /coda/misc/gnu-emacs/i386_mach/omega/lisp/scribe.elc 600
a /coda/misc/gnu-emacs/i386_mach/omega/lisp/term/x-win.el 600
a /coda/misc/gnu-emacs/i386_mach/omega/lisp/x-mouse.elc 600

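The a that starts each line tells hoard to "add" the named file to the database. The 600 that ends each line sets the file's priority. You may also specify additional attributes for each line. These attributes are separated from the priority by a ":" and are:

  c    Current children of the directory inherit its hoard status.
  c+   Current and future children of the directory inherit its hoard status.
  d    Current descendants of the directory inherit its hoard status.
  d+   Current and future descendants of the directory inherit its hoard status.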

For example, to hoard all of the emacs directory, its descendants, and any future descendants, I would include the following line in a hoard file:

a /coda/i386_mach/omega/usr/local/emacs 600:d+

This ensures you get all of the files you need, but you will use tens of megabytes of cache space to hoard many files that you do not need, so you often want to be more specific about which files to hoard.

Other valid hoard commands are clear, delete, list, and modify. See the hoard(1) man page for more details on these commands.
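
Because hoard files are often edited by hand, a quick shape check before loading one can catch slips such as a missing priority. The sketch below is only an illustration: check_hoardfile is a hypothetical helper, it assumes pathnames contain no whitespace, and it verifies only that each "a" line has the form "a path priority" with an optional ":attributes" suffix. It does not verify that the attributes or the priority value are ones hoard accepts.

```shell
# check_hoardfile FILE
# Hypothetical helper (not part of Coda): reports "a" lines that are not of
# the form  a <path> <priority>[:<attributes>]  and returns non-zero if any
# are found. Lines starting with other commands (clear, delete, ...) are
# not examined.
check_hoardfile() {
    awk '
        BEGIN { bad = 0 }
        $1 == "a" && !(NF == 3 && $3 ~ /^[0-9]+(:[a-z+]+)?$/) {
            print FILENAME ": line " FNR ": " $0
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```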

3.2 Hoarding for a Weekend

One of the most common uses of a Coda laptop is to take it home overnight or for the weekend. Naturally, you want to be sure that all of the files that you need over the weekend are in the cache; otherwise, there is little point in bringing the laptop home. The hoard(1) program helps you do this.

Create a hoard file, as described in Section 3.1, for each project you want to work on. You may also want hoard files for your personal files, such as your home directory if it's in Coda, and for other tools that you plan on using. It's best to clear the hoard database whenever you switch projects to make sure you have enough space in your cache. You might consider having clear as the first command in your personal hoard file. If you do, make sure you always run hoard with this file before any other files.

Once you've built hoard files for all of your tools and projects, it's simply a matter of running hoard to build the hoard database you want. When you run hoard, you must be logged onto the machine's console (don't run X). About fifteen minutes before you are ready to leave, force a hoard walk with the following command:

% hoard walk

This will cause Venus to attempt to cache all of the files in the hoard database. Wait until the hoard command completes. You are now ready to disconnect from the network. Once disconnected, try all of the commands you intend to use before you leave; if you are missing some files, it is easy to reconnect and hoard them.
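
The loading order matters when the personal hoard file begins with clear, so it can help to wrap the whole routine in one function. This is a sketch under stated assumptions: load_and_walk is a hypothetical name, and "hoard -f FILE" is assumed to be the way your hoard reads commands from a file (check hoard(1) on your system). Only "hoard walk" is taken directly from the text above.

```shell
# load_and_walk FILE...
# Hypothetical wrapper: loads each hoard file in the order given, then forces
# a hoard walk. Pass the file that starts with "clear" first.
load_and_walk() {
    for f in "$@"; do
        # Assumption: "hoard -f FILE" executes the hoard commands in FILE.
        hoard -f "$f" || return 1
    done
    # Ask Venus to fetch everything now listed in the hoard database.
    hoard walk
}
```

A call such as load_and_walk personal.hdb project.hdb (both file names are placeholders) stops at the first file that hoard rejects, so a bad file does not silently lead to an incomplete cache.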

3.3 Reintegrating After a Disconnected Session

When you reconnect to the network after a disconnected session, Coda will automatically try to reintegrate your changes with the Coda servers. You must be authenticated before reintegration occurs. Watch the file /usr/coda/etc/console with the codacon command or by running tail -f /usr/coda/etc/console. The reintegration status is written to this file.

If the reintegration was successful, the log entries look like:

Reintegrate u.raiff, (1, 244) ( 13:33:43 )
coda: Committing CML for u.raiff ( 13:33:43 )
coda: Reintegrate: u.raiff, result = SUCCESS, elapsed = 2640.0 (15.0, 2609.0, 15.0) ( 13:33:43 )
coda:   delta_vm = 0, old stats = [0, 1, 0, 0, 0] ( 13:33:43 )
coda:   new stats = [   0,   0.0,     0.0,    1,   0.2], [   0,   0.0, 0.0,   0,   0.0] ( 13:33:43 )

The following example is from a failed reintegration on the volume u.raiff.

Reintegrate u.raiff, (1, 244) ( 13:27:10 )
coda: Checkpointing u.raiff ( 13:27:10 )
coda: to /usr/coda/spool/2534/u.raiff@@%coda%usr%raiff ( 13:27:10 )
coda: Aborting CML for u.raiff ( 13:27:10 )
coda: Reintegrate: u.raiff, result = 198, elapsed = 2437.0 (15.0, 2265.0, 531.0)  ( 13:27:10 )
coda:   delta_vm = 1000, old stats = [0, 0, 1, 0, 0] ( 13:27:10 )
coda:   new stats = [   0,   0.0,     0.0,    1,   0.2], [   0,   0.0, 0.0,   0,   0.0] ( 13:27:10 )

Notice that the client modify log (CML) was checkpointed to /usr/coda/spool/2534/u.raiff@@%coda%usr%raiff. This file is a tar file containing the changes that were made during the disconnected session. The files in the tar file are relative to the root of u.raiff. The cfs examineclosure and cfs replayclosure commands show which files were not reintegrated and force a reintegration, respectively.
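
When the console log is long, it can be easier to pull out only the summary lines. The following sketch assumes the log lines have the same shape as the two examples above; reintegration_results is a hypothetical helper name, and the default path is the console file named earlier in this section.

```shell
# reintegration_results [FILE]
# Hypothetical helper: prints "<volume> <result>" for every
# "Reintegrate:" summary line found in the console log.
reintegration_results() {
    sed -n 's/.*Reintegrate: \([^,]*\), result = \([^,]*\),.*/\1 \2/p' \
        "${1:-/usr/coda/etc/console}"
}
```

Run against the two logs shown above, this would print "u.raiff SUCCESS" for the first and "u.raiff 198" for the second; any result other than SUCCESS means the CML was not committed.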

3.4 Dealing With a Flaky Network

When the network is acting up, you can use Coda to help isolate yourself from the networking problems. Set up your hoard database so that Venus will hoard the files you are working on. Then, disconnect from the Coda servers with the cfs disconnect command. To Coda, this is equivalent to physically disconnecting from the network.

Once the network becomes stable, you can use cfs reconnect to reconnect to the Coda servers and reintegrate your work. Don't forget to clear your hoard database with hoard clear once you are done working on the set of files that you hoarded.

Note: AFS will not be affected by cfs, so access to AFS files will still be affected by the network problems.
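
For a short task, the disconnect and reconnect steps can be paired around a single command. This is a sketch only: with_coda_disconnected is a hypothetical wrapper, it uses just the cfs disconnect and cfs reconnect commands named above, and it reconnects as soon as the command finishes, so it suits cases where you expect the network to have settled by then.

```shell
# with_coda_disconnected COMMAND [ARG...]
# Hypothetical wrapper: runs COMMAND while Venus is disconnected from the
# servers, then reconnects and returns COMMAND's exit status.
with_coda_disconnected() {
    cfs disconnect || return 1
    status=0
    "$@" || status=$?
    # Reconnect even if the command failed, so updates can be reintegrated.
    cfs reconnect
    return "$status"
}
```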

3.5 Reintegrating Over the Phone Line

If you are planning on taking a Coda laptop on an extended trip, you should consider using SLIP to reintegrate with the Coda servers periodically. Using SLIP will allow updates to be visible to other Coda users, protect against client crashes such as hard drive failure or theft, and allow you to work on multiple projects, even when your cache space is not large enough for all of the projects. By using the following instructions, you can reintegrate over the phone and change the files in your hoard database.

1. Read the dialup manual page, as well as /afs/cs/help/03-Communication/03-Tcons_and_Dialups/slip.doc and /afs/cs/help/03-Communication/03-Tcons_and_Dialups/cisco_tcon.doc.

2. Get an account on a terminal server from facilities. The name of this server is the name that you will use when you start SLIP.

3. Connect to the terminal server and start SLIP.

4. Run /etc/slattach /dev/com{0,1} speed. If you are using the internal modem, specify /dev/com0 and 2400. If you are using an external modem, use /dev/com1 and the modem's speed (for example, 9600).

5. Exit your communications program (kermit or whatever you use). slattach holds the line open.

6. Now you have to reset the routes in your routing table. First delete the old routes:

% set slipaddr = # address of
% set gwaddr =    # address of

% /etc/route delete 0 $gwaddr
% /etc/route delete net 128.2 $hostaddr
% /etc/route -f delete $hostaddr $hostaddr

Then configure the SLIP interface up with the new routes:
% /etc/ifconfig sl0 $hostaddr $slipaddr -trailers up
% /etc/route add net 128.2 $slipaddr 0
% /etc/route add 0 $slipaddr 1

If you've started up disconnected, you will also have to run the command:
% ifconfig par0 down

Finally, tell Coda to see which servers it can communicate with:

% cfs checkservers

Your laptop will now behave as if it is on the network. Response time to commands will be sluggish.

If you want to stop running SLIP before you shut down your computer, simply turn off your modem or kill slattach to terminate your SLIP connection.

3.6 Repairing an Inconsistent Directory

Occasionally, a directory entry will become inconsistent. This happens when there is a conflict between file server replicas that Coda cannot automatically resolve, or when a reintegration fails because a local update conflicts with the global state. The most common causes of a conflict are a file being changed on more than one partition while the file servers are partitioned, and a disconnected client updating a file that is also updated on the servers. When this happens, the directory containing the conflict will look like a symbolic link pointing to its fid. For example, if a directory, conflict, is inconsistent, it will now appear as:

% ls -l conflict
lr--r--r--  1 root      27 Mar 23 14:52 conflict -

Most applications will return the error "File not found" when they try to open a file that is inconsistent. You need to resolve this conflict by using the repair(1) tool.
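
Since an inconsistent object shows up as a symbolic link whose target does not exist, a search for dangling links is one way to look for candidates in a directory tree. The sketch below is generic and not Coda-specific: find_dangling_links is a hypothetical name, it assumes find reports the inconsistent object as a symbolic link, and an ordinary broken link will be listed too, so treat the output as a list to inspect rather than a definitive list of conflicts.

```shell
# find_dangling_links [DIR]
# Hypothetical helper: lists symbolic links under DIR (default: the current
# directory) whose targets do not exist.
find_dangling_links() {
    # -type l selects symbolic links. "test -e" follows the link, so it
    # fails exactly when the link's target is missing.
    find "${1:-.}" -type l ! -exec test -e {} \; -print
}
```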

Server/Server Conflicts

Once you run repair, you need to do a "beginRepair" on the object that is inconsistent. After "beginRepair" is issued, the inconsistent directory will have an entry for each of the replicated volumes. You can look at all of these to decide which copy you want. Use repair to copy the correct version and clear the inconsistency. In the following example, the file conflict/example is replicated on three servers and has gone inconsistent.

% ls -lL conflict
lr--r--r--  1 root           27 Dec 20 13:12 conflict -
% repair
The repair tool can be used to manually repair files and directories 
that have diverging replicas.  You will first need to do a "beginRepair" 
which will expose the replicas of the inconsistent object as its children.

If you are repairing a directory, you will probably use the "compareDir" and "doRepair" commands.

For inconsistent files you will only need to use the "doRepair" command.

If you want to REMOVE an inconsistent object, use the "removeInc" command.

Help on individual commands can also be obtained using the "help" facility.
* begin conflict
a server-server-conflict repair session started
use the following commands to repair the conflict:
* ^Z
% ls conflict
% ls conflict/*

% fg
Pathname of Object in conflict?  [conflict]  
Pathname of repair file produced?  []  /tmp/fix


-rw-r--r--  1 raiff           0 Dec 20 13:10
-rw-r--r--  1 -101            0 Dec 20 13:11

        Fid: (0xb0.612) VV:(0 2 0 0 0 0 0 0)(0x8002f23e.30c6e9aa)
        Fid: (0x9e.5ea) VV:(2 0 0 0 0 0 0 0)(0x8002ce17.30d56fb9)
Should /coda/project/coda/demo/basic/rep/conflict/ be removed?   [no]  yes
Should /coda/project/coda/demo/basic/rep/conflict/ be removed?   [no]  
Do you want to repair the name/name conflicts  [yes]  
Operations to resolve conflicts are in /tmp/fix
* do
Pathname of object in conflict?  [conflict]  
Pathname of fix file?  [/tmp/fix]  
OK to repair "conflict" by fixfile "/tmp/fix"?  [no]  yes
* quit
% ls conflict
% exit

Local/Global Conflicts

Local/global conflicts are caused by reintegration failures, which means that the mutations performed while the client was disconnected conflict with the mutations performed on the servers by other clients during the disconnection. The objects involved in a local/global conflict are represented in the same fashion as server/server conflicts, i.e., they become dangling symbolic links.

To start a local/global repair session for an object OBJ, you need to invoke the repair tool and issue the "beginrepair" command with the pathname of OBJ as the argument. Once the repair session is started, both the local and global replicas of OBJ are visible at OBJ/local (read-only) and OBJ/global (mutable and serving as the workspace for storing the repair result for OBJ and its descendants).

The central process of repairing the local/global conflicts on OBJ is to iterate over the local-mutations-list, which contains all the local updates performed on OBJ or its descendants and can be displayed with the "listlocal" command. Each operation in the list must be accounted for, and the repair tool cooperates with Venus to keep track of the current-mutation being iterated. The "checklocal" command shows the conflict information between the current-mutation and the global server state. You can advance the iteration to the next operation using either the "preservelocal" or the "discardlocal" command; the former replays the current-mutation operation on the relevant global replicas, while the latter discards it. You can also use the "preservealllocal" and "discardalllocal" commands to speed up the iteration.

Because the global replica of OBJ is mutable, existing tools such as emacs can be used directly to make the necessary updates. The "quit" command is used to either commit or abort the repair session. The man page on the repair commands contains more detailed information, and the following simple example illustrates the main process of repairing a local/global conflict.

Suppose that during disconnection, a user creates a new directory /coda/usr/luqi/papers/cscw/figs and stores a new version of the file /coda/usr/luqi/papers/cscw/paper.tex. However, during the disconnection, a co-author also creates a directory /coda/usr/luqi/papers/cscw/figs and stores some PS files in it. Upon reintegration, a local/global conflict is detected at /coda/usr/luqi/papers/cscw, which ls -l would then show as a dangling symbolic link:

% ls -l /coda/usr/luqi/papers/cscw
lr--r--r--  1 root           27 Dec 20 00:36 cscw -
% repair
* begin
Pathname of object in conflict?  []  /coda/usr/luqi/papers/cscw
a local-global-conflict repair session started
the conflict is caused by a reintegration failure
use the following commands to repair the conflict:
a list of local mutations is available in the .cml file in the coda spool directory

* !ls -l /coda/usr/luqi/papers/cscw
total 4
drwxr-xr-x  3 luqi         2048 Dec 20 00:51 global
drwxr-xr-x  3 luqi         2048 Dec 20 00:51 local
Back to *

* listlocal
local mutations are:

Mkdir   /coda/usr/luqi/papers/cscw/local/figs
Store   /coda/usr/luqi/papers/cscw/local/paper.tex (length = 19603)

* checklocal
local mutation: mkdir /coda/usr/luqi/papers/cscw/local/figs
conflict: target /coda/usr/luqi/papers/cscw/global/figs exist on servers

* discardlocal
discard local mutation mkdir /coda/usr/luqi/papers/cscw/local/figs

* checklocal
local mutation: store /coda/usr/luqi/papers/cscw/local/paper.tex
no conflict

* preservelocal
store /coda/usr/luqi/papers/cscw/global/paper.tex succeeded

* checklocal
all local mutations processed

* quit
commit the local/global repair session?  [yes]
