Coda File System

Re: Group ID's and Apache

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Tue, 3 Sep 2002 15:32:53 -0400
On Tue, Sep 03, 2002 at 08:36:40PM +0200, Kees Hoekzema wrote:
> chose to run NFS. Now, a year later, NFS has become too overloaded, so I
> went to look at the current Coda, and it worked almost immediately :).

Great.

> But.. I do have a couple of configuration issues that I couldn't figure out
> with the docs and the mailing list archive.
> 
> 1. How do I change the default group-id? I tried to set the System:AnyUser
> groupId to 65441 (iow, group 95) but then my coda wouldn't start.

Ok, I don't know why you want to change this, as Coda doesn't do
'groups' in the traditional sense. The group-id that you see when doing
ls -l /coda/foo will always be 65534 (i.e. nogroup). The only exception
to this rule is right after a file is created: as long as the object is
cached in the kernel, it still has whatever the kernel decided to fill in
for the group-id. There is no group-id information stored in venus or on
the Coda servers, so whenever the kernel cache refreshes its information
the gid will be nogroup again.

Coda groups are only used in directory ACLs. A Coda group contains
other Coda groups or users. When there is a group ACL, any user that is
a direct or indirect member will have the same access rights as were
given to the group.

Internally we store user-ids as positive numbers and group-ids as
negative numbers, so the userid and groupid numbers each have 31 bits of
addressable space. I don't think it should even be possible to set a
group-id to 65441 (-65441 would be possible). Any time we try to map an
ACL to actual permissions we do a lookup in the pdb.
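
A minimal sketch of the arithmetic behind the "65441 (iow, group 95)"
remark in the question, assuming that number is simply what -95 looks
like when it passes through a 16-bit unsigned gid field. The helper name
to_signed16 is made up for illustration and is not a Coda tool:

```shell
# Hypothetical helper: reinterpret an unsigned 16-bit value as a signed one.
# Assumption: the id was squeezed through a 16-bit field somewhere on the way.
to_signed16() {
    if [ "$1" -ge 32768 ]; then
        # Upper half of the 16-bit range wraps around to negative numbers.
        echo $(( $1 - 65536 ))
    else
        echo "$1"
    fi
}

# Usage: to_signed16 65441
```

Under that assumption, a negative Coda group-id and the large positive
number seen in a 16-bit field are the same value viewed two ways.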

Whenever a group or user id is changed, we fix up everything in the pdb
(the groups we are a member of) to refer to the new number, so that
typically should cause no problems. However, we don't go through all
the existing ACLs and rewrite those.

As you changed the id of System:AnyUser, the server probably assumes
that your unauthenticated client doesn't have the right to access /coda,
because that right now belongs to whatever group uses id -101.
Newly created volumes will of course default to providing 'rl' rights
for the new System:AnyUser group. So you could create a new volume, mount
that as a Coda root, and then recursively change all directory ACLs.

I'm guessing you also need to create a temporary groupname for id -101,
otherwise you won't be able to remove the old ACL, or maybe even see the
ACL rights at all.

    find /coda -type d -exec cfs sa {} System:AnyUser rl \;
    find /coda -type d -exec cfs sa {} OldAnyUser none \;
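
Before touching every ACL it may be worth previewing what would be run.
The following is only a sketch: plan_acl_fix is a hypothetical helper
name, and OldAnyUser is assumed to be the temporary group name created
for id -101 as described above. It prints the cfs commands instead of
executing them:

```shell
# Hypothetical dry-run helper: list the cfs commands for each directory
# under the given root without running any of them.
# Note: paths are printed unquoted, so this is a preview, not a script
# to pipe straight back into a shell.
plan_acl_fix() {
    find "$1" -type d | while IFS= read -r dir; do
        printf 'cfs sa %s System:AnyUser rl\n' "$dir"
        printf 'cfs sa %s OldAnyUser none\n' "$dir"
    done
}

# Usage: plan_acl_fix /coda | less
```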

> 2. I do want a filesystem like:
> /coda/
>   + web/
>     - site1/
>     - site2/
>     - siteN/
>   + home/
>     - user1/
>     - user2/
>     - userN/
> Will coda store different userId's (eg, in /coda/home/kees/ there will be
> files from user "kees" and in /coda/home/jan/ there will be files from user
> "jan", do I need to create separate volumes or will one just do? still,
> apache must be able to read/write in those dirs)

Separate volumes will allow for more parallelism: write operations,
server-server resolution, reintegration, and repair all lock on a
per-volume basis. It also makes it simpler to administer the server.

Because apache typically runs as a dedicated local user, 'www-data' or
something similar, you could even give that local user a Coda token, and
thus control how much (or little) apache can access in /coda.

> 3. Is it possible for apache to acquire tokens without anyone logged in?
> (Will a shellscript echo pass | clog pipe apache; apachectl start do the
> trick? Apache needs to read AND write/insert to the filesystem (image upload
> system etc)).

Probably something closer to:
    cat /etc/coda/apache | su www-data -c 'clog -pipe apache'

That way the password doesn't show up in 'ps' output and the token is
given to the actual user-id that apache uses. Have it both in the
init.d script and a daily cronjob to make sure it stays valid.
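
A rough sketch of a shared helper that both the init.d script and the
cronjob could call, so the token logic lives in one place. The function
name refresh_token and the CLOG override variable are assumptions for
illustration; only 'clog -pipe' comes from the command above. The caller
is still expected to run it as the user apache runs as:

```shell
# Command used to obtain the token; overridable, e.g. for testing.
CLOG=${CLOG:-clog}

# Hypothetical helper: refresh_token CODA_USER PASSWORD_FILE
# Feeds the password to clog on stdin so it never appears in 'ps' output.
refresh_token() {
    coda_user=$1
    passfile=$2
    if [ ! -r "$passfile" ]; then
        echo "refresh_token: cannot read $passfile" >&2
        return 1
    fi
    "$CLOG" -pipe "$coda_user" < "$passfile"
}

# Usage (as the apache user): refresh_token apache /etc/coda/apache
```

Failing early on an unreadable password file keeps a cron run from
silently leaving apache with an expired token.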

> 4. A more general question; How stable is Coda, considering there are 60k
> files and a total of 1GB data, is it stable enough to use in a production
> environment?

If the apache server is only read-only, I can definitely say it is
reasonably stable, our own webserver works that way. Read-write is
another story. The client might become disconnected, or a conflict could
pop up that basically makes the filesystem inaccessible until the
conflict is repaired.

For instance, our mailinglist archives are stored in /coda and updated
automatically from a cron-script. However, hypermail was creating too
many temporary files and would occasionally push the system into
disconnected mode (not because the server went down, but because local
CPU usage and slow server response times combined to make the client
believe the server was overloaded). While disconnected it would then
write more data than the client could keep around in its write-back
cache, crashing the client...

So for now I'm rebuilding the archives on the local disk, and then use
the following script to update the exported copy in /coda:

    # make sure we are authenticated
    cat passwordfile | clog -pipe webmaint

    # try to avoid disconnections due to observed network latency
    cfs strong
    cfs forcereintegrate

    rsync -av --delete /local/mail/archives/ /coda/html/maillists

    # if we still went disconnected and 'survived' we probably have a
    # conflict. let's just purge all pending operations, clearing out
    # the conflict; the next update should then be able to pick up where
    # we left off.
    yes | cfs purgeml /coda/html/maillists

    # Now go back to our regular scheduled program
    cfs adaptive

As you can see, I have a lot of workarounds to make unsupervised writes
on a 24/7 system at least a bit more reliable :)

Jan
Received on 2002-09-03 15:34:12