Coda File System

Re: weakly connected reintegration

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Mon, 29 Jan 2001 03:46:17 -0500
On Sun, Jan 28, 2001 at 03:31:17PM -0500, Brad Clements wrote:
> On 28 Jan 2001, at 15:14, Jan Harkes wrote:
> 
> > No way of telling. Only the initiation of a resolve shows in the codacon
> > output of the client that triggered the resolution.
> 
> So, if a volume is reintegrating, only the client that caused that
> reintegration to occur knows what's happening.

There is reintegration, where a client sends a batch of operations to the
servers, and there is resolution, where servers try to make sure their
copies of the data are in sync. Resolution is triggered by a client that
detects differences in the version vectors when accessing the object.
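
To make the version vector check concrete, here is a minimal sketch of how
differing vectors could be detected. It assumes a version vector is just a
list with one update counter per server; the names compare_vv and
needs_resolution are made up for illustration and are not the actual Coda
code.

```python
from enum import Enum


class VVOrder(Enum):
    EQUAL = "equal"
    DOMINATES = "dominates"   # first vector is strictly newer
    DOMINATED = "dominated"   # second vector is strictly newer
    CONFLICT = "conflict"     # each side has updates the other lacks


def compare_vv(a, b):
    """Compare two version vectors (one update counter per server)."""
    if len(a) != len(b):
        raise ValueError("version vectors must cover the same servers")
    a_ahead = any(x > y for x, y in zip(a, b))
    b_ahead = any(x < y for x, y in zip(a, b))
    if a_ahead and b_ahead:
        return VVOrder.CONFLICT
    if a_ahead:
        return VVOrder.DOMINATES
    if b_ahead:
        return VVOrder.DOMINATED
    return VVOrder.EQUAL


def needs_resolution(replica_vvs):
    """True if any replica's vector differs from the first replica's."""
    first = replica_vvs[0]
    return any(compare_vv(first, vv) is not VVOrder.EQUAL
               for vv in replica_vvs[1:])
```

In this sketch a client that reads [2, 1] from one server and [1, 1] from
another would see a mismatch and ask the servers to resolve.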

On slow links there is `weak reintegration', where a client reintegrates
with only one server in the replicated group and then immediately
triggers resolution on the affected objects.
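
As a rough illustration of that flow, the toy model below applies a batch
of operations to a single replica and then resolves each touched object.
Everything in it is an assumption made for the example: replicas are plain
dicts, each object carries a single version number instead of a version
vector, and "resolution" just copies the newest version everywhere without
handling conflicts.

```python
def resolve(replicas, name):
    """Toy resolution: propagate the highest-versioned copy to all replicas."""
    newest = max((r[name] for r in replicas if name in r),
                 key=lambda copy: copy[0])
    for r in replicas:
        r[name] = newest


def weak_reintegrate(replicas, ops, target=0):
    """Apply a batch of (name, data) ops to one replica, then resolve.

    replicas: list of dicts mapping object name -> (version, data).
    Returns the names of the objects that were touched, in order.
    """
    chosen = replicas[target]
    touched = []
    for name, data in ops:
        version = chosen.get(name, (0, None))[0]
        chosen[name] = (version + 1, data)   # update only the chosen server
        touched.append(name)
    for name in touched:
        resolve(replicas, name)              # bring the other servers in sync
    return touched
```

The point of the split is that the client pays for one transfer over the
slow link, and the servers do the rest among themselves.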

The volume locks are normally `invisible' to clients; pending RPC2 calls
simply wait for their turn on the lock. However, it used to be quite
common during resolution that a lock would get `forgotten', so there is
a timer that kills locks that have been around for 5 minutes, sort of
like a garbage collector for stale locks.
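
The stale-lock timer could look roughly like the sketch below. It is not
the server's actual code: VolumeLockTable and its methods are hypothetical,
and the only figure taken from the description above is the 5 minute limit.

```python
import time

STALE_AFTER = 5 * 60  # seconds; the 5 minute limit described above


class VolumeLockTable:
    """Hypothetical table of volume locks with a stale-lock reaper."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._locks = {}  # volume id -> time the lock was taken

    def acquire(self, volume_id):
        """Take the lock; return False if the volume is already locked."""
        if volume_id in self._locks:
            return False
        self._locks[volume_id] = self._clock()
        return True

    def release(self, volume_id):
        """Drop the lock if it is held."""
        self._locks.pop(volume_id, None)

    def reap_stale(self):
        """Remove locks held for STALE_AFTER seconds or more; return their ids."""
        now = self._clock()
        stale = [v for v, taken in self._locks.items()
                 if now - taken >= STALE_AFTER]
        for v in stale:
            del self._locks[v]
        return stale
```

A periodic timer would call reap_stale(), so a lock that was `forgotten'
during resolution cannot block a volume forever.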

> > The servers also
> > lock any volumes that are resolving, so clients will be blocked out
> > until the servers are back in sync.
> 
> will be blocked, but won't know why. What do the blocked clients see?

They will not see the RPC2 response, retransmit the request, receive
RPC2_BUSY because their call is being `processed', wait for 15 seconds
and try again. I'm not completely sure whether they retry indefinitely
as long as they receive the BUSY back, or at some point (5-6 tries) give
up and return with an error code.
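
In code, the client side of that behaviour might be sketched as follows.
This is a simplified model, not the RPC2 library: send_request is a
hypothetical callable standing in for the retransmitted call, BUSY is a
placeholder for the RPC2_BUSY reply, and the limit of 6 tries is an
assumption, since it is unclear whether the real client ever gives up.

```python
import time

BUSY = "RPC2_BUSY"   # placeholder for the busy reply
RETRY_DELAY = 15     # seconds to wait after a busy reply
MAX_TRIES = 6        # assumption; the real client may retry indefinitely


def call_with_busy_retry(send_request, max_tries=MAX_TRIES,
                         delay=RETRY_DELAY, sleep=time.sleep):
    """Call send_request until it returns something other than BUSY."""
    for attempt in range(1, max_tries + 1):
        reply = send_request()
        if reply != BUSY:
            return reply
        if attempt < max_tries:
            sleep(delay)  # the server is still processing the call
    raise TimeoutError("server still busy after %d tries" % max_tries)
```

From the user's point of view the operation simply hangs in 15 second
steps until the servers finish resolving, or fails if a retry limit exists.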

Jan
Received on 2001-01-29 03:46:24