Coda File System

Re: Repair problems

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Mon, 24 Jan 2005 16:44:04 -0500
On Sun, Jan 23, 2005 at 12:00:55AM +0100, Michael Tautschnig wrote:
> Since I'm using Kerberos, but usually logging in using my ssh-key, I 
> repeatedly get in trouble, probably because of expired tokens. This 
> usually results in
> 
> timeout in locking authority file /coda/TIKI/tautschn/.Xauthority
> 
> This wouldn't be a problem, if I could then access all the files when I do 
> have tokens - which isn't the case (most of the time) - I have to delete 
> my .Xauthority file.

I've been having similar problems. It seems that venus does not fall
back to the System:AnyUser ACL rights when I lose tokens, but keeps
trying to validate my access based on the identity of the expired token.
I haven't figured it out yet; it may not even be venus's fault, as the
kernel seems to be caching inode information too aggressively.

In any case, you might consider replacing the file with a symlink to a
file in a local disk directory. Xauthority really isn't supposed to be
shared across systems, and Coda's directory ACLs probably aren't really
protecting it, as the UNIX mode bits are more like a hint interpreted by
the client than a rule enforced by the server.
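
A minimal sketch of that relocation, assuming a per-host directory on
local disk (local_dir is a hypothetical name, nothing in Coda or X
requires it). Whether xauth keeps the link intact when it rewrites the
file is worth checking on your system; pointing the XAUTHORITY
environment variable at the local path is a common alternative.

```python
import os
import shutil


def relocate_to_local(shared_path, local_dir):
    """Move shared_path into local_dir and leave a symlink behind.

    local_dir is a hypothetical per-host directory on local disk;
    the caller chooses it, it is not a Coda convention.
    """
    if os.path.islink(shared_path):
        # Already a symlink: report where it points and change nothing.
        return os.readlink(shared_path)
    os.makedirs(local_dir, mode=0o700, exist_ok=True)
    local_path = os.path.join(local_dir, os.path.basename(shared_path))
    if os.path.exists(shared_path):
        # shutil.move copies and deletes when crossing filesystems.
        shutil.move(shared_path, local_path)
    os.symlink(local_path, shared_path)
    return local_path
```

Calling it a second time is harmless: the existing link is detected and
its target is returned.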

> be able to repair these conflicts - but don't know how. doing 
> repair->beginrepair->.Xauthority just tells me, that the object is not in 
> conflict; doing "ls" results in
> 
> ls: .: Permission denied
> 
> and "cfs lv ." tells me about 4 objects being in conflict;
> 
> venus.log contains the following entries:
> 
> [ W(2830) : 0000 : 23:47:33 ] fsdb::Get: trying to access localized object 
> 5539d148.7f000002.1.1
> (many times)

Strange, which version of venus is this?

> [ L(19) : 0000 : 23:51:34 ] LocalInconsistentObj: 
> objFid=5539d148.7f000002.1.1
> 
> venus.err says:
> 
> 23:38:51 volume tautschn has unrepaired local subtree(s), skip 
> checkpointing CML!
> 
> 23:41:17 Local inconsistent object at /coda/TIKI/tautschn, please check!

Since it seems to be a reintegration conflict, the object in conflict
should be one involved in the first operation in the CML checkpoint
file (/usr/coda/spool/<userid>/<volume>.cml; it might also be under
/var/lib/coda/spool). Alternatively, try cfs getpath 7f000002.1.1@TIKI,
but I think that call is mostly resolved in the kernel, so it depends on
whether the kernel actually has the object cached or not.
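
To check both spool locations in one go, something like the following
could be used. It is only a sketch: the two roots are the ones named
above, and find_cml_checkpoints is a made-up helper, not part of Coda.

```python
import os

# The two spool roots mentioned above; other installs may differ.
SPOOL_ROOTS = ("/usr/coda/spool", "/var/lib/coda/spool")


def find_cml_checkpoints(userid, volume, spool_roots=SPOOL_ROOTS):
    """Return every existing <root>/<userid>/<volume>.cml path."""
    candidates = [
        os.path.join(root, str(userid), volume + ".cml")
        for root in spool_roots
    ]
    return [path for path in candidates if os.path.isfile(path)]
```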

In any case, since the vnode.uniquifier is 1.1, this object is the root
directory of the volume 0x7f000002 (which is probably the directory
/coda/TIKI/tautschn).
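
The same reading of the fid can be sketched in a few lines. This assumes
the fields are hexadecimal, that the last three are volume, vnode and
uniquifier, and that a leading fourth field (as in the venus.log lines)
is a realm identifier; parse_fid and is_volume_root are illustrative
names, not venus functions.

```python
from typing import NamedTuple, Optional


class Fid(NamedTuple):
    realm: Optional[int]  # assumed meaning of the leading field, if present
    volume: int
    vnode: int
    unique: int


def parse_fid(text):
    """Parse 'volume.vnode.unique' or 'realm.volume.vnode.unique' (hex)."""
    parts = [int(field, 16) for field in text.split(".")]
    if len(parts) == 4:
        return Fid(*parts)
    if len(parts) == 3:
        return Fid(None, *parts)
    raise ValueError("expected 3 or 4 dot-separated fields: %r" % text)


def is_volume_root(fid):
    """A vnode.uniquifier of 1.1 marks the root directory of the volume."""
    return fid.vnode == 1 and fid.unique == 1
```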

I guess something has a reference to the directory, for instance all
your shell windows and applications that happen to have their current
working directory set to your home directory. As a result we can't turn
the directory into a dangling symlink, which is why the conflict is
invisible (and unrepairable).
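
On Linux, one way to see which processes are holding the directory as
their working directory is to read the /proc/<pid>/cwd links. A rough
sketch (pids_with_cwd_under is a hypothetical helper; it only sees
processes whose cwd link you are allowed to read):

```python
import os


def pids_with_cwd_under(directory, proc_root="/proc"):
    """List PIDs whose current working directory is directory or below it."""
    base = directory.rstrip("/")
    prefix = base + "/"
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self'
        try:
            cwd = os.readlink(os.path.join(proc_root, entry, "cwd"))
        except OSError:
            continue  # process exited, or we lack permission
        if cwd == base or cwd.startswith(prefix):
            pids.append(int(entry))
    return sorted(pids)
```

For example, pids_with_cwd_under("/coda/TIKI/tautschn") would list the
shells and applications that need to change directory (or exit) before
the conflict can be exposed.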

> These problems are somewhat annoying, since I don't really know, what 
> venus is telling me - and how to solve the problem without a need to 
> restart venus all the time...
> 
> Despite these problems, coda is doing really fine here, but repairing 
> conflicts without restarting would be really really important for 
> production use!

The problem is really twofold. First, the code is trying to change an
existing object into a dangling symlink, which is a pretty nasty
operation for the kernel, especially when the object is considered in
use. Second, there is a Linux kernel 'issue': anything that has a
reference to a directory cache entry, such as the current working
directory of a process, automatically has a reference on the inode
(making the object 'in use'). Probably the only way around this is to do
something similar to the NFS 'silly rename' used when a referenced file
is removed on the server: the referenced object is moved aside, and in
our case would be replaced by a new object that represents the dangling
symlink.
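
As a userspace analogy of that idea on an ordinary local filesystem
(this is not Coda or kernel code; the '.silly-' prefix and the link
target passed in are made up for illustration):

```python
import os
import uuid


def replace_with_dangling_symlink(path, link_target):
    """Move path aside, then put a symlink to link_target in its place.

    Existing references (open descriptors, working directories) keep
    pointing at the moved-aside inode; new lookups of path see the link.
    """
    parent, name = os.path.split(path.rstrip("/"))
    aside = os.path.join(parent, ".silly-%s-%s" % (name, uuid.uuid4().hex[:8]))
    os.rename(path, aside)         # same inode, new name
    os.symlink(link_target, path)  # dangling if link_target does not exist
    return aside
```

The rename leaves whatever the running processes hold untouched, which
is the property the in-place conversion lacks.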

Jan
Received on 2005-01-24 16:46:13