Coda File System

From: Jan Harkes <jaharkes_at_cs.cmu.edu> Date: Wed, 27 Apr 2005 17:30:28 -0400

On Wed, Apr 27, 2005 at 02:56:22PM -0600, Patrick Walsh wrote:
> 	First, someone tried to use rpm to install a package to coda (by making
> a sym link from the normal destination to the coda dir).  rpm insists on
> using the lchown() system call to set the ownership of the files.
> Apparently coda doesn't like this and gives an error.  Is there a reason
> coda doesn't like this system call?  Can it just ignore it?

I think Coda doesn't allow someone to change the owner/group id of a
file if the user doesn't have admin rights from the directory ACL. Not
sure though, and since Coda ultimately relies on the ACL-based
permissions it doesn't matter all that much.

The only thing that you might run into is that we explicitly disallow
setuid binaries, and if RPM double-checks the mode bits it might
complain a bit.

>	Also, there were questions about why unpacking a tar file seemed so
> slow.  I speculated that coda, which is connected strongly, was
> uploading each file to the server before letting the next one unpack.
> Is this true?  I was also asked if it waited for the updates to be sent
> out between the servers and I'm pretty sure that it doesn't, but I
> wanted to double-check.  Would it be quicker or is there any benefit to
> disconnecting before unpacking the files?

Correct, in connected mode we don't return to the application until
the file is stored on all replicas. So if you're using 2 replicas it
will end up transferring twice the amount of data. On top of that, the
Coda server will force all the changes to disk (and probably even
flush/truncate the RVM) before it returns to the client. In addition,
the client probably performs at least 4 operations for every file
(create, store, chown, chmod, possibly utimes), and since there is no
coalescing, every operation becomes a separate RVM transaction, along
with a bunch of RVM flushing/truncating/fsyncing.
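
To make the per-file cost concrete, here is a rough sketch of the
sequence of calls an archive extractor might issue for a single file.
This is not tar's or Coda's actual code; the function name
unpack_one_file and the labels in the returned list are made up for
illustration, and the chown simply uses the current user so it runs
without special privileges.

```python
import os
import tempfile

def unpack_one_file(path, data, mode=0o644, mtime=0):
    """Write one file roughly the way an extractor would and return
    the list of operations issued (labels are illustrative only)."""
    ops = []
    # create: make the new, empty file
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    ops.append("create")
    # store: write the contents and close the file
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    ops.append("store")
    # chown: set owner/group (here: the current user, as an assumption)
    os.chown(path, os.getuid(), os.getgid())
    ops.append("chown")
    # chmod: restore the archived mode bits
    os.chmod(path, mode)
    ops.append("chmod")
    # utimes: restore the archived timestamps
    os.utime(path, (mtime, mtime))
    ops.append("utimes")
    return ops

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(unpack_one_file(os.path.join(d, "example.txt"), b"hello\n"))
```

In connected mode each of those five steps would be a separate round
trip and a separate server-side transaction.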

Write-disconnected mode is in this respect a lot faster. The 4-5
operations will get optimized to just 2 or 3 (create/store/setattr).
Although it could, I'm not sure if the setattr will merge with the store
operation. It will also send the operations in batches of up to 100,
which will all get committed within a single transaction, so the server
will essentially have to perform about 1/167th the number of
transactions.
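
One way to arrive at a figure like 1/167 is the following
back-of-the-envelope calculation. It is only a sketch under assumed
numbers: 100 files, 5 operations per file in connected mode, 3 per file
after optimization, and full batches of 100 operations per transaction.
The function names are hypothetical.

```python
import math

def connected_transactions(files, ops_per_file=5):
    # Connected mode: every operation is its own transaction.
    return files * ops_per_file

def write_disconnected_transactions(files, ops_per_file=3, batch_size=100):
    # Write-disconnected mode: operations are sent in batches and each
    # batch is committed as a single transaction.
    return math.ceil(files * ops_per_file / batch_size)

files = 100
connected = connected_transactions(files)          # 500
batched = write_disconnected_transactions(files)   # 3
ratio = connected / batched
print(connected, batched, round(ratio))
```

With these assumptions the server handles 3 transactions instead of
500, a ratio of roughly 167 to 1. Smaller batches or fewer optimized
operations would shift the ratio accordingly.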

However... in write-disconnected mode the client tries to predict what
the version-vectors on the server will look like and is sometimes wrong
and gets a reintegration conflict. Also the client often sends the
updates to only a single server and then triggers resolution to
propagate them to all other replicas. If it sends the next batch to
another server before the resolution has completed, we get yet another
type of reintegration conflict. So write-disconnected mode is not
really the best solution, especially for new users who are just taking
their first steps with Coda.

Jan

Re: Coda and RPM/tar/lchown()