Coda File System

Re: replicated servers

From: Patrick Walsh <pwalsh_at_esoft.com>
Date: Thu, 07 Apr 2005 15:44:59 -0600
        I've been trying to get replication to work all day, and I think
I've identified some issues and sharpened my questions.

        First, suppose I have a realms file like this:

realm1          coda1 coda2
realm2          coda2

where coda1 is the scm and coda2 is (supposedly) replicating it.
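
For reference, a minimal sketch of how the two realms lines above map
realm names to servers.  It assumes the "name server server ..." layout
shown in the example, and it assumes that '#' starts a comment; the
function name parse_realms is made up for illustration and is not part
of Coda.

```python
def parse_realms(text):
    """Map each realm name to the list of servers named on its line."""
    realms = {}
    for line in text.splitlines():
        # Assumption: '#' introduces a comment that runs to end of line.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name, *servers = line.split()
        realms[name] = servers
    return realms


# The example realms file from the message above.
REALMS = """\
realm1          coda1 coda2
realm2          coda2
"""

realms = parse_realms(REALMS)
# Servers that are listed under both realms.
shared = [s for s in realms["realm1"] if s in realms["realm2"]]
print(realms)
print(shared)
```

In this layout coda2 is listed under both realm names, so a client can
reach the same server through either one.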

        If I add a file to realm1 and then go to realm2 to see if it has
appeared, I get timeout and permission denied errors.  A `cfs lv` tells
me I'm disconnected from realm2 and no amount of `cfs reconnect` fixes
that.  

        By the way, realm2 doesn't have to be defined: going to
/coda/x.x.x.x has the same effect when x.x.x.x is the IP of coda2.

        At this point, restarting the client results in an error.  To
get venus to start again I have to touch /usr/coda/venus.cache/INIT.

        So at this point, I seem to have a replicating server.  However,
I have a question.  I spent much of the day not realizing the problem
stated above, so I was trying to create volumes on the replicated
server.  After analyzing createvol_rep I came up with this command:

volutil -h coda2 create_rep /vicepa /.1 2130706432

where that last number is the decimal representation of the hex number
found in the groupid column of the / partition on the scm (as shown in
the VRList file).
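
As a quick check of that conversion, here is a small sketch.  The
decimal value 2130706432 used in the command is 7f000000 in hex; the
helper names below are made up for illustration and are not part of any
Coda tool.

```python
def groupid_to_decimal(hex_id):
    """Convert a hex id string (with or without a 0x prefix) to an int."""
    return int(hex_id, 16)


def decimal_to_groupid(value):
    """Convert a decimal id back to lowercase hex without a prefix."""
    return format(value, "x")


print(groupid_to_decimal("7f000000"))
print(decimal_to_groupid(2130706432))
```

Converting in both directions makes it easy to confirm that the number
handed to volutil matches the hex value it was derived from.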

        I did this because it looks like createvol_rep will create
volumes on multiple servers using that command, but won't let you add
volumes on other servers if they come later.

        The question: what is the point of creating volumes on the
replicated server when it appears that volumes I didn't use volutil to
create can be seen on that server anyway?  Or did creating the "/"
partition trigger everything to work?  (I'll find the answer to that
last part myself as soon as I wipe the replicating server clean and try
again from scratch.)

..Patrick

On Thu, 2005-04-07 at 08:54 -0600, Patrick Walsh wrote:
>       I need some help setting up a non-scm server.  I ran through
> vice-setup and configured it appropriately.  I started the updateclnt,
> auth2, and codasrv processes.  After fixing an error where the server
> couldn't find a ROOTVOLUME file that it wanted to replicate, the db
> files were properly transferred over to the replicated server.  I
> added the replicated server to the servers file.
> 
>       If I use a client to connect directly to the replicated server
> (by cd'ing to /coda/x.x.x.x using the IP address of the replicated
> server), I can see a list of the files and directories, but if I try
> to cd into them or cat a file in the root dir, I get permission denied
> and timeout errors.  I'm logged in as the administrator and my tokens
> are good for x.x.x.x.
> 
>       I expect I haven't set something up properly, but the
> documentation is outdated.  For example, the documentation says I
> should create two new entries in the /vice/db/VSGDB file, but there is
> no longer any such file.
> 
>       The non-scm server will simply be replicating the volumes on the
> scm -- it won't have any of its own volumes (unless those are needed
> for replication).
> 
>       I'd really appreciate any help.
> 
>       Also, an unrelated question: if I have a cluster of web servers
> using coda for shared storage, do I have to worry about their tokens
> expiring?  Do other people use cron jobs to renew their tokens every
> day or so?
> 
> Thanks,
> 
-- 
Patrick Walsh
eSoft Incorporated
303.444.1600 x3350
http://www.esoft.com/

Received on 2005-04-07 17:46:21