Coda File System

Re: New troubles in coda land

From: Jan Harkes <>
Date: Sun, 3 Jul 2005 17:16:05 -0400
On Fri, Jul 01, 2005 at 11:44:58AM -0600, Patrick Walsh wrote:
> 	I left for a week on vacation and left tests running on our servers
> only to come back and find a series of problems and probably coda bugs.
> I think most of these should be reproducible.  I'll do my best to make
> them easy to track down.
> 1) cfs getpath fid@realm
> 	This command works fine on consistent objects, but not at all on
> inconsistent objects.  So when you get a log entry that looks like
> this:  
> VIOC_GETPATH: No such file or directory

Yeah, getpath doesn't set the 'GetInconsistent' flag when it calls
fsdb::Get. I guess that could be considered a bug, but it shouldn't be
hard to fix (depending on how many other places are using GetPath).
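In other words, the fix is mostly a matter of propagating one extra flag through the lookup. A rough sketch of the pattern (Python, with made-up names; `fsdb_get`/`getpath` here are illustrative stand-ins, not venus's actual fsdb::Get / GetPath API):

```python
# Made-up sketch of the missing-flag bug; names are illustrative only.

GET_INCONSISTENT = 0x1   # caller is willing to see inconsistent objects

class FsObject:
    def __init__(self, fid, path, inconsistent=False):
        self.fid = fid
        self.path = path
        self.inconsistent = inconsistent

def fsdb_get(db, fid, flags=0):
    """Return the cached object; fail as if it doesn't exist when it is
    inconsistent and the caller didn't pass GET_INCONSISTENT."""
    obj = db[fid]
    if obj.inconsistent and not (flags & GET_INCONSISTENT):
        raise FileNotFoundError("VIOC_GETPATH: No such file or directory")
    return obj

def getpath(db, fid):
    # current behavior: no flag passed, so inconsistent objects fail
    return fsdb_get(db, fid).path

def getpath_fixed(db, fid):
    # proposed fix: pass the flag so getpath works on inconsistent objects too
    return fsdb_get(db, fid, flags=GET_INCONSISTENT).path
```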

> 2) cfs getmountpoint volid
> 	This command doesn't seem to be working anymore.  I'm sure I used to be
> able to use it without problems.  Here's a line in our VRList file:

The path leading to a volume is in many cases actually unknown. It is
known only when we have recently traversed the mountpoint, and things
like callbacks, disconnections and other events 'uncover' the
mountpoints again so that the client can discover when volumes have been
removed, moved somewhere else, or had replicas added or removed. We
need to see the hidden mountlink object because volume information is
only refreshed during the traversal (automatic mount) of a volume.

Oh, and it probably would be 'volid@realmname', because I could have
7f000002 volumes in two different realms and the client isn't psychic.

$ cfs getmountpoint ???/

(here you can also see that the /coda -> /testserver mountlink is
currently uncovered, so it failed to resolve the complete path name to
the root).
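The realm qualifier matters because volume ids are only unique within a realm, so the client has to key its volume lookups on both. A toy illustration (the realm names and table below are made up, not real Coda data structures):

```python
# Toy illustration: volume ids are only unique per realm, so a client-side
# volume table has to be keyed on (realm, volid). Realms/names are made up.

volumes = {
    ("testserver.example.com", 0x7f000002): "vol:apps",
    ("other.example.org",      0x7f000002): "vol:users",
}

def getmountpoint(volspec):
    """Resolve 'volid@realm'; a bare volid is ambiguous across realms."""
    volid_str, _, realm = volspec.partition("@")
    if not realm:
        raise ValueError("ambiguous: the same volid can exist in many realms")
    return volumes[(realm, int(volid_str, 16))]
```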

> 3) One of the reasons for all of the new problems is that we made a
> watchdog script that checks for inconsistencies every five minutes by
> doing a find command.  This happens on every client.  Ironically, this
> seems to be causing inconsistencies.  I expect what is happening is that
> the find commands, which look like this: 
> find /coda/realm -noleaf -lname '@*'
> are blocking the server so that normal write operations are timing out
> on the client side and version mismatches are happening.  I expect this
> is partially due to our very fast connections between clients and
> servers and I think much of this might be avoidable if we could manually
> increase the client timeouts.  RPC2_timeout and RPC2_retries in
> venus.conf are commented out, so they should be 60 seconds and 5 retries.
> The find command takes nowhere near that long to run (it's pretty darn
> quick, actually).  Any ideas?

When some client is performing a write operation, it is a two-phase
commit process. It first sends the operation to all servers and collects
the responses (COP1), and then sends the combined result in the form of
an update vector (COP2).

This is the description from coda-src/venus/

 *    Implementation of the Venus COP2 facility.
 *    The purpose of these routines is to distribute the UpdateSet to the AVSG
 *    members following a mutating operation.  Distribution may be
 *    synchronous, i.e., COP2() is invoked by the worker thread immediately
 *    following the COP1 call, or asynchronous. Asynchronous has two variants:
 *    piggybacked and non-piggybacked.  In the piggybacked case the UpdateSet
 *    is sent in the next worker-invoked RPC to the relevant VSG.  In the
 *    non-piggybacked case the UpdateSet is sent in a COP2 RPC invoked by the
 *    volume daemon.  Non-piggybacked is used when piggybacked is not enabled
 *    or when no RPC occurs within a timeout period.  In either case multiple
 *    UpdateSets (corresponding to multiple COP1s) may be sent in the same
 *    RPC.  Also, UpdateSet propagation is idempotent, so UpdateSets
 *    piggybacked on successive RPCs to the same VSG may overlap.

Venus defaults to asynchronous COP2 updates with a timeout value of 10
seconds, but they are triggered by a separate thread which runs about
once every 5 seconds, so the delay can be up to 15 seconds.
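The worst case follows from those two numbers: an entry becomes eligible after the 10 second timeout, but it is only sent on the next pass of the 5 second daemon. A small sketch of that arithmetic (this is my model of the scheduling, not the venus code):

```python
import math

# My model of the scheduling: a queued COP2 becomes eligible after
# `timeout` seconds, but is only sent on the next pass of the volume
# daemon, which wakes every `period` seconds (at t = 0, 5, 10, ...).

def cop2_flush_time(queued_at, timeout=10.0, period=5.0):
    """Time at which the volume daemon actually sends the COP2."""
    eligible = queued_at + timeout
    return math.ceil(eligible / period) * period  # first daemon pass >= eligible
```

So an update committed just after a daemon pass waits almost timeout + period, i.e. up to about 15 seconds, matching the numbers above.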

So what is happening here: after the update is made on the servers,
they send out callback messages to all clients. There is a reasonable
chance that one of the finds happens to traverse that part of the tree
before the COP2 message arrives, and the client notices that the
version vectors are not (yet) identical. So the client triggers
resolution, which normally resolves this without trouble: different
version vectors combined with identical store identifiers are
recognized as a missing COP2 update, the VVs are forced identical, and
the client is happy. But then the delayed COP2 hits, and the versions
differ again.

The problem here is that a COP2 doesn't send off a callback message, so
all the clients now have cached data whose version differs from what
the servers have. The next time anyone tries to write to the object,
this is identified as an update/update conflict (two clients
concurrently updating the same object).
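The whole race can be replayed with a toy model, treating version vectors as tuples with one slot per replica (this is my illustration, not the actual venus/server data structures):

```python
# Toy replay of the race: version vectors as tuples, one slot per replica.
# Illustration only, not the actual venus/server data structures.

def cop1(vvs, sid):
    """Each replica performs the mutation and bumps only its own slot."""
    new = [tuple(v + (1 if j == i else 0) for j, v in enumerate(vv))
           for i, vv in enumerate(vvs)]
    return new, [sid] * len(vvs)

def resolve(vvs, storeids):
    """Different VVs + identical store ids is recognized as a missing COP2:
    force every replica to the element-wise maximum."""
    assert len(set(storeids)) == 1
    merged = tuple(max(col) for col in zip(*vvs))
    return [merged] * len(vvs)

def cop2(vvs, update_set):
    """The delayed COP2 adds the combined update vector at every replica --
    and sends no callback, so clients keep their stale cached VV."""
    return [tuple(a + b for a, b in zip(vv, update_set)) for vv in vvs]

vvs = [(1, 1), (1, 1)]           # two replicas, in sync
vvs, sids = cop1(vvs, "S1")      # write: (2,1) and (1,2), same store id
vvs = resolve(vvs, sids)         # find races in: both forced to (2,2)
client_vv = vvs[0]               # client caches the resolved (2,2)
vvs = cop2(vvs, (1, 1))          # late COP2 lands: both become (3,3)
conflict = client_vv != vvs[0]   # next write: spurious update/update conflict
```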

It is fixable, but this one is a bit harder. I think the server already
keeps track of the COP2s/store-ids that it is expecting; resolution
would have to clear that entry to make sure that the delayed COP2
message is dropped on the floor. I'm not too familiar with this part of
the code though, so I don't know if the solution is really as simple as
it sounds.
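Sketched in the same toy style, the fix would look something like this; it assumes the server really does track the store ids for which it still expects a COP2 (every name here is hypothetical, not actual server code):

```python
# Hypothetical sketch of the fix, assuming the server tracks store ids
# for which it still expects a COP2; all names are made up.

class Server:
    def __init__(self):
        self.vv = (1, 1)
        self.pending_cop2 = set()   # store ids still awaiting a COP2

    def cop1(self, sid, slot):
        vv = list(self.vv)
        vv[slot] += 1               # bump only our own slot
        self.vv = tuple(vv)
        self.pending_cop2.add(sid)

    def resolved(self, merged_vv, sid):
        """Resolution already reconciled the VVs: install the merged vector
        and clear the pending entry so the late COP2 can be dropped."""
        self.vv = merged_vv
        self.pending_cop2.discard(sid)

    def cop2(self, sid, update_set):
        if sid not in self.pending_cop2:
            return False            # late or duplicate COP2: ignore it
        self.pending_cop2.discard(sid)
        self.vv = tuple(a + b for a, b in zip(self.vv, update_set))
        return True
```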

> 4) We're back to having issues with clog (or so I believe).  To
> reproduce this, you need to log in to coda (as the same user) over and
> over again every two seconds or so, while in another window
> while [ 1 ] ; do mv WORK/s/* WORK/t; mv WORK/t/* WORK/s; done
> This will eventually kill coda and require a server restart.

No idea on that one. It actually succeeds at killing the server, that's
pretty impressive.

Actually, I have one idea... If your server is still running, but
becomes unresponsive, the following rpc2 patch may fix the problem.

Received on 2005-07-03 17:16:55