Coda File System

Re: /coda: Input/output error

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Mon, 23 Aug 1999 14:53:07 -0400
On Sun, Aug 22, 1999 at 09:08:03PM +0200, Torsten Foertsch wrote:
> 
>   cp -r /usr /coda
> 
> First it looked good. cp had copied files for about an hour. Then I missed
> the sound of my hard disk. The cp process hung. All the coda
> processes were there. I couldn't find anything pointing to an error
> in the log files. (/vice/srv/Srv{Log,Err}, /usr/coda/venus.cache/venus.log
> ...)

Hi Torsten,

Ok, during the recursive copy of the whole /usr tree, Coda probably
detected that the server was sluggish and switched to weakly connected
operation. In that mode it starts logging writes. Of course the log (and
the not-yet-reintegrated objects) are finite resources, and there are
several cases where running out of these local resources crashes the
client.

There probably was an error logged in /usr/coda/etc/console. Yeah, I
know, too many logs. 
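
Since the logs are spread over several places, a small helper can show
the tail of each one in a single pass. This is only a sketch: the
function name show_coda_logs is made up for illustration, and the
default paths are simply the ones mentioned in this thread, so they may
differ on your installation.

```shell
#!/bin/sh
# Print the last lines of each Coda log mentioned in this thread.
# show_coda_logs is a hypothetical helper, not part of Coda.
# Default paths are the ones quoted above; pass other paths as
# arguments if your installation keeps its logs elsewhere.
show_coda_logs() {
    lines=20
    if [ "$#" -eq 0 ]; then
        set -- /usr/coda/etc/console /usr/coda/venus.cache/venus.log \
               /vice/srv/SrvLog /vice/srv/SrvErr
    fi
    for log in "$@"; do
        if [ -r "$log" ]; then
            echo "== $log =="
            tail -n "$lines" "$log"
        else
            echo "== $log: not readable =="
        fi
    done
}
```

Calling show_coda_logs with no arguments walks the default list;
client-only or server-only machines will just report the missing files
as not readable.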

> ls /coda gave /coda: Input/output error

Venus has crashed, so the kernel can only return EIO on all operations.

> df said /coda is mounted and has about 9GB free (nothing used)

That's something the kernel does know: it knows how to `fake' the result
of the statfs call when venus isn't around :)

> At the end I have tried
> 
> kill -9 <venus_pid>
> umount /coda
> rmmod coda
> insmod coda
> venus -init

That is exactly the right sequence, except in most cases you don't need
to restart venus with the -init flag. You do need to get coda tokens
before it will reintegrate what's left in the logs. The -init flag is
mostly useful if the client is badly hosed, you don't care about
local updates, and you want to get back to a sane state quickly.
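
Put together, the recovery without -init could look roughly like the
sketch below. The function names and the DRY_RUN switch are made up for
illustration; by default the script only prints what it would do. It
also assumes that clog is how you obtain tokens on your client and that
venus detaches after starting; if it stays in the foreground on your
setup, start it in the background instead.

```shell
#!/bin/sh
# Sketch of the restart sequence from this thread, minus the -init flag.
# DRY_RUN=1 (the default) only prints the commands. Set DRY_RUN=0 and
# run as root to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper that prints or executes a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# recover_coda_client: hypothetical wrapper around the quoted steps.
recover_coda_client() {
    venus_pid="$1"            # pid of the crashed or hung venus
    run kill -9 "$venus_pid"
    run umount /coda
    run rmmod coda
    run insmod coda
    run venus                 # no -init, so the local log is kept
    run clog                  # assumption: clog obtains coda tokens
}
```

For example, `recover_coda_client 1234` prints the six steps in order.
Leaving out -init is the point here: the pending log survives, and
reintegration can start once tokens are available.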

> and ls /coda works and df shows reasonable values.
> 
> What did I wrong? How can I avoid problems like this?

Lots of write operations still manage to confuse clients into believing
that servers are unreachable or weakly connected. To avoid the automatic
switching to weakly connected mode, run `cfs strong' before starting to
copy the tree. Venus can still switch to logging if it believes the
server has died, which is when it hasn't responded for about 15
seconds.
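
As a rough sketch, the bulk copy could be wrapped like this. The
function names and the DRY_RUN switch are hypothetical; only `cfs
strong' and the cp command come from this thread. By default the script
just prints the commands.

```shell
#!/bin/sh
# Force strongly connected mode before a large copy into /coda.
# DRY_RUN=1 (the default) only prints the commands. Set DRY_RUN=0 to
# execute them.
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper that prints or executes a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# copy_into_coda: hypothetical wrapper, e.g. copy_into_coda /usr /coda
copy_into_coda() {
    src="$1"
    dst="$2"
    run cfs strong            # avoid the automatic switch to weak mode
    run cp -r "$src" "$dst"
}
```

This does not protect against the 15 second timeout case, so for very
large trees it may still be worth copying in smaller pieces.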


> I plan to use coda on a highly used WEB server. Actually there are
> 2 computers that should work as coda server and replicate the WEB
> content. One of them may be read-only.

Just have read-write replicated volumes, and run the scripts that index
the website and update, for instance, mailing list archives from another
client. If the writing client fails, the problems are isolated to that
single writer. The webserver normally has only read access, and when the
cache is large enough, it will only have to talk to the servers to get an
updated copy. (At least, that's how our lightly used webserver works.)

> Thanks,
> torsten foertsch

Jan
Received on 1999-08-23 14:55:11