Coda File System

Re: venus stuck at 100% cpu?

From: <u-thti_at_aetey.se>
Date: Mon, 27 Oct 2014 16:55:20 +0100

Hello Greg,

On Mon, Oct 27, 2014 at 10:55:49AM -0400, Greg Troxel wrote:
> I am reorganizing some files I have in coda.  A lot of things are
> working ok, including rsyncing in a subtree, resulting in 2100 pending
> reintegration operations, and forcing reintegration.

Your system looks healthy.

> However, when I do "rmdir foo" of an empty directory, venus goes to 100%
> CPU time.  Even after I kill -9 it and restart, the directory is still
> there.  This is on NetBSD 6, i386, with what I think are the latest
> official releases:
> 
>   coda-6.9.5nb7       Coda distributed fileystem
>   rvm-1.17            Recoverable Virtual Memory
>   rpc2-2.10nb3        CMU (Coda) remote procedure call package
>   lwp-2.6             Light Weight Process style threads
> 
> Amusingly, I went to try this with the venus on my server.  It's
> netbsd-5 i386, and been up 220 days.  Apparently the socket used by
> venus for clog/cfs got purged due to being old, but restarting venus
> brought that back.  On that system I could remove directories.

clog/cfs use ioctls on a special file object under /coda,
so I wouldn't expect it to be able to disappear. The venus
process was probably not in good shape at that time.

> So apparently there is some issue with venus doing rmdir on NetBSD 6.
> 
> Is anyone else seeing this, on any platform?

For better or worse, I do not recall a similar incident.
My experience is mostly on Linux; my limited use on NetBSD was
certainly insufficient to trigger all corner cases.

On the other hand, venus can become unstable when "abused" with
massive cache replacements and/or used actively for a long time,
and sometimes it does. I assume there are still some uncaught races.

If/when you reinitialize venus, you should be able to remove
the troublesome directory. Did you try a full reinit?

Regards,
Rune
Received on 2014-10-27 12:01:22