Coda File System

Re: starting over

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Sat, 12 Mar 2005 00:03:02 -0500
On Fri, Mar 11, 2005 at 06:26:40PM -0300, Gabriel B. wrote:
> I decided to start over. I wiped all coda and venus conf
> 
> vice-setup was practically this:
> /vice
> yes, scm
> scm id 1
> user codaroot with uid 1075
> /vice/LOG 20M
> /vice/DATA 90M
> 
> then vice-setup-srvdir
> use /vicepa
> 2M files

I thought vice-setup already ran vice-setup-srvdir, so why run it
separately? And if vice-setup didn't complete normally, your server might
not be running. At the end, vice-setup (should) ask whether it can start
the various daemons, which are

    rpc2portmap updatesrv updateclnt auth2 codasrv

And once codasrv is started it asks if it can create the rootvolume.
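
One way to compare that list with what is actually running is sketched
below. It is only an illustration: missing_daemons is a hypothetical
helper, and it assumes you have already collected the names of the
running processes yourself (for example from the output of ps).

```python
# Daemons that vice-setup is expected to start, as listed above.
EXPECTED_DAEMONS = ["rpc2portmap", "updatesrv", "updateclnt", "auth2", "codasrv"]


def missing_daemons(running_names):
    """Return the expected daemons that do not appear in running_names.

    running_names is any iterable of process names; how you obtain it
    (ps, /proc, ...) is up to you and is not covered by this sketch.
    """
    running = {name.strip() for name in running_names}
    return [d for d in EXPECTED_DAEMONS if d not in running]


# Example: only the auth server and the file server are up.
print(missing_daemons(["auth2", "codasrv"]))
```

If the result is not empty, vice-setup most likely did not finish and the
remaining daemons still have to be started.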

> pbtool
> > nu bossnes
> > ng www 1076 (bossnes id)
> 
> createvol_rep / camboinha.servers/vicepa -3
> that didn't work. i waited 2 hours and ctrl+C'd it.

Where did you get that '-3'? The script is probably trying to resolve it
as a hostname, or maybe inet_aton turned it into some strange IP address
which volutil is trying to connect to.

> volutil create_rep /vicepa / 00001
> bldvldb.sh
> (a valid workaround?)

No, it is not, since this only creates the underlying volume replica
(which should be named "/.0"). And again, where does that strange 00001
number come from? The createvol_rep script does this step first, but then
creates the replicated volume: it dumps the existing (currently empty)
VRDB into the /vice/db/VRList file, appends an entry that describes
which replicas are part of the replicated volume, and builds a new
VRDB file from the data in the updated /vice/db/VRList.

> now to the client.

While the server is not correctly set up... You only have an underlying
volume replica, so if your client mounts it there are no version vectors,
and it will not keep anything cached locally when disconnecting because
there is no way to revalidate the cached objects.

> now, lets make some mount points for the future volumes:
> bossnes# mkdir /coda/camboinha.servers/htdocs
> 
> and that's it. it will hang in there practically forever. until i ctrl+c it.

Actually, venus is probably still hanging internally at this point; you
just told the kernel to stop waiting for the result. But indefinite
hanging is quite unusual. An RPC operation times out and aborts after
about 60 seconds, except when the server responds with a BUSY.

The only 'worst case' that I know of is when we initially contact a
realm, since we are hit by multiple RPC2 timeouts: one when we try to get
the rootvolume name and one when we fall back on getvolumeinfo. At this
point the mount fails, but with a colorizing ls we get an additional
readlink and getattr on the mountpoint, both of which also trigger an
attempt to mount the realm (i.e. another 4 timeouts). So we end up
blocking for about 6 minutes if the realm is unreachable.
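
The arithmetic behind that 6 minute figure can be written out as a small
sketch. The function and parameter names are made up for illustration, and
the 60 second value is only the approximate timeout mentioned above, not a
constant taken from the venus source.

```python
RPC_TIMEOUT_SECONDS = 60  # "about 60 seconds" per RPC operation


def worst_case_block_seconds(extra_lookups=2, timeouts_per_mount=2,
                             rpc_timeout=RPC_TIMEOUT_SECONDS):
    """Estimate how long an ls blocks when the realm is unreachable.

    timeouts_per_mount: one timeout for the rootvolume name plus one for
                        the getvolumeinfo fallback.
    extra_lookups:      the readlink and getattr issued by a colorizing ls,
                        each of which triggers another mount attempt.
    """
    initial = timeouts_per_mount                  # first mount attempt
    retries = extra_lookups * timeouts_per_mount  # "another 4 timeouts"
    return (initial + retries) * rpc_timeout


print(worst_case_block_seconds() / 60, "minutes")
```

With the defaults this gives 6 timeouts of 60 seconds each, i.e. 360
seconds.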

> then. i restart both server and client, and sometimes it shows the
> server inside /coda. others time it simply shows an empty /coda
>
> i then created two more volumes. now venus report  "2 volume replicas"

Did you mount those volumes then? How would venus know about the newly
created volumes? Those 2 replicas are probably the one that is at /coda
and the one at /coda/camboinha.servers.

> the logs have nothing even with the highest debuglevel.

Venus is really verbose at the highest debuglevel, so verbose in fact
that I tend to limit myself to 100, and normally only set it to that
level for a short period of time around the operation that I'm trying to
figure out.

> cfs lv /coda/camboinha.servers and it magically appears there
> ...with the file whose creation i ctrl+C'd before when it hung. so, it
> probably is the cache. the weird part is that i delete all rvm files
> on the client, and every time i start it up with -init. anyhow, as
> soon as i create anything it hangs.

It probably did end up creating the file on the server, because ^C only
makes the kernel module stop waiting for the result. Those hangs are not
normal, especially when you are not seeing any traffic on port 2432. There
is probably something wrong with the IP addresses, such as your server
claiming its IP address is 127.0.0.1 or some other address that is
unreachable for the client.
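
A quick sanity check for the 127.0.0.1 case might look like the sketch
below. It assumes the problem shows up in what the server's hostname
resolves to (for example through /etc/hosts); how codasrv actually picks
the address it advertises is not verified here, and both function names
are hypothetical.

```python
import ipaddress
import socket


def is_unusable_server_address(addr):
    """True if addr is a loopback or unspecified address.

    Such an address cannot be reached from another machine, so a client
    that is handed it will never find the server.
    """
    ip = ipaddress.ip_address(addr)
    return ip.is_loopback or ip.is_unspecified


def resolve_and_check(hostname):
    """Resolve hostname and report whether the result looks unusable."""
    addr = socket.gethostbyname(hostname)
    return addr, is_unusable_server_address(addr)


print(is_unusable_server_address("127.0.0.1"))
```

Running resolve_and_check with the server's own hostname, on the server
itself, shows which address that name maps to locally.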

Jan
Received on 2005-03-12 00:05:11