Coda File System

Venus errors: "Re: Long Running (Multi)XXXX: code = -2001, elapsed = NNNN"

From: Chet Murthy <chet_at_watson.ibm.com>
Date: Wed, 30 Jun 2004 00:52:53 -0400
I've been messing with Coda (v6.0.6, from DEBs, but also built from
source (apt-get source)) for a while, and have been having serious
problems with stability of the client under network conditions that
aren't perfect.

The situation is a little weird, so I thought I'd describe it a bit,
but first, here's the basic symptom:

(1) I'm untarring the Linux kernel into my Coda directory on my client,
which is connected to the server via DSL.

(2) To elicit the bug, I either do this during the day or, at night,
simulate network congestion by, say, pushing large files up to some
server.  In short, the network connection between my client and the
Coda server is not flawless.

(3) Basically, at some point, I see in codacon:

connection::unreachable lambda.csail.mit.edu

(4) I use the timestamp to look in the venus.log, and see:

[ W(24) : 0000 : 00:36:05 ] *** Long Running (Multi)Store: code = -2001, elapsed = 17576.1 ***

where the operation might be Reintegrate, one of the SHA operations,
etc.  Lots of operations seem to hit this timeout, and it doesn't seem
like the timeout ALWAYS causes a problem.  Often, even for long
periods, operations will get this return code and things will
continue swimmingly.

Then, out of the blue, such a code comes back, and the client decides
to hang up.
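
To see which operations hit this code and how often, the venus.log
entries can be tallied with a small script.  What follows is only a
sketch: the regular expression is inferred from the single sample line
quoted above, so it assumes every such entry has that exact layout, and
the function names (parse_long_running, count_by_op) are made up for
illustration and are not part of Coda.

```python
import re
from collections import Counter

# Layout inferred from one sample venus.log line; this is an assumption,
# and other Venus versions may format the entry differently.
LONG_RUNNING = re.compile(
    r"\[\s*(?P<thread>\S+)\s*:\s*(?P<seq>\d+)\s*:\s*"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s*\]\s*"
    r"\*\*\* Long Running \(Multi\)(?P<op>\w+): "
    r"code = (?P<code>-?\d+), elapsed = (?P<elapsed>[\d.]+) \*\*\*"
)


def parse_long_running(lines):
    """Return one dict per 'Long Running' entry found in the given lines."""
    events = []
    for line in lines:
        m = LONG_RUNNING.search(line)
        if m is None:
            continue  # not a "Long Running" entry, skip it
        events.append({
            "time": m.group("time"),          # log timestamp, HH:MM:SS
            "op": m.group("op"),              # e.g. Store, Reintegrate
            "code": int(m.group("code")),     # e.g. -2001
            "elapsed": float(m.group("elapsed")),
        })
    return events


def count_by_op(events):
    """Count entries per (operation, return code) pair."""
    return Counter((e["op"], e["code"]) for e in events)
```

Passing the lines of venus.log to parse_long_running() and the result
to count_by_op() gives a per-operation tally, and the "time" field can
be compared against the timestamp of the connection::unreachable
message in codacon.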

================================================================

OK.  So this is really, really easy to reproduce -- basically, I've
done it by installing the Coda client and server onto vanilla Debian
boxes (client == stable, server == testing), with the client
cachesize set to 200M, and the server set to one of the larger
configurations in the default list.  In short, a pretty vanilla setup.

I'm going to try it with my client running Debian testing, just to be
sure that things are really failing even with testing on both ends,
but I suspect this is the case.

================================================================

If anybody has any advice as to what is going on here, I'd appreciate
it.  Alternatively, if anybody has any suggestions as to how to debug,
those would be appreciated, too.

Thanks,
--chet--
Received on 2004-06-30 01:01:28