Coda File System

Re: Replication server problems

From: Achim Stumpf <newgrp_at_gmx.de>
Date: Mon, 18 Dec 2006 10:35:05 +0100
Hi,

On clusty1 (SCM):
# cat vicetab
clusty1.mytest.de   /vicepa   ftree   width=256,depth=3
clusty2.mytest.de   /vicepa   ftree   width=256,depth=3

On clusty2 (slave):
# cat vicetab
clusty1.mytest.de   /vicepa   ftree   width=256,depth=3
clusty2.mytest.de   /vicepa   ftree   width=256,depth=3

After that, I restarted the servers on both machines.
Then, on clusty1 (SCM):
# createvol_rep coda.rep clusty1.mytest.de/vicepa clusty2.mytest.de/vicepa
egrep: /tmp/vollist.1947: No such file or directory
egrep: /tmp/vollist.1947: No such file or directory
grep: /tmp/vollist.1947: No such file or directory
Found no partitions for server clusty2.mytest.de.

The same failure happened again :o(
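
Since createvol_rep apparently builds its list of partitions by querying each server, one way to narrow this down might be to ask clusty2 directly from the SCM. A rough sketch (it assumes volutil's -h option for addressing a remote server; getvolumelist should print one "P/vicepa ..." line per registered partition):

# volutil -h clusty2.mytest.de getvolumelist

If that prints nothing or an RPC2 error, the SCM either cannot reach the codasrv on clusty2, or clusty2 never registered /vicepa from its vicetab. It may also be worth checking that updatesrv/updateclnt are running on both machines, since the /vice/db files are distributed through them:

# ps ax | grep -E 'updatesrv|updateclnt'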

The SrvLog of clusty1 (SCM):
Date: Mon 12/18/2006

10:14:51 Coda Vice, version 6.1.2       log started at Mon Dec 18 10:14:51 2006

10:14:51 RvmType is Rvm
10:14:51 Main process doing a LWP_Init()
10:14:51 Main thread just did a RVM_SET_THREAD_DATA

10:14:51 Setting Rvm Truncate threshhold to 5.

Partition /vicepa: inodes in use: 1116, total: 16777216.
10:15:00 Partition /vicepa: 8211208K available (minfree=5%), 8161284K free.
10:15:00 The server (pid 1867) can be controlled using volutil commands
10:15:00 "volutil -help" will give you a list of these commands
10:15:00 If desperate,
                "kill -SIGWINCH 1867" will increase debugging level
10:15:00        "kill -SIGUSR2 1867" will set debugging level to zero
10:15:00        "kill -9 1867" will kill a runaway server
10:15:00 Vice file system salvager, version 3.0.
10:15:00 SanityCheckFreeLists: Checking RVM Vnode Free lists.
10:15:00 DestroyBadVolumes: Checking for destroyed volumes.
10:15:00 Salvaging file system partition /vicepa
10:15:00 Force salvage of all volumes on this partition
10:15:00 Scanning inodes in directory /vicepa...
10:15:01 Entering DCC(0x1000001)
10:15:01 MarkLogEntries: loglist was NULL ...   [this line repeated 144 times]

10:15:01 DCC: Salvaging Logs for volume 0x1000001
10:15:01 done:  1261 files/dirs,        27892 blocks
10:15:01 SalvageFileSys completed on /vicepa
10:15:01 VAttachVolumeById: vol 1000001 (/.0) attached and online
10:15:01 Attached 1 volumes; 0 volumes not attached
lqman: Creating LockQueue Manager.....LockQueue Manager starting .....
10:15:01 LockQueue Manager just did a rvmlib_set_thread_data()

done
10:15:01 CallBackCheckLWP just did a rvmlib_set_thread_data()

10:15:01 CheckLWP just did a rvmlib_set_thread_data()

10:15:01 Starting AuthLWP-0
10:15:01 Starting AuthLWP-1
10:15:01 Starting AuthLWP-2
10:15:01 Starting AuthLWP-3
10:15:01 Starting AuthLWP-4
10:15:01 ServerLWP 0 just did a rvmlib_set_thread_data()
10:15:01 ServerLWP 1 just did a rvmlib_set_thread_data()
10:15:01 ServerLWP 2 just did a rvmlib_set_thread_data()
10:15:01 ServerLWP 3 just did a rvmlib_set_thread_data()
10:15:01 ServerLWP 4 just did a rvmlib_set_thread_data()
10:15:01 ServerLWP 5 just did a rvmlib_set_thread_data()
10:15:01 ServerLWP 6 just did a rvmlib_set_thread_data()
10:15:01 ServerLWP 7 just did a rvmlib_set_thread_data()
10:15:01 ServerLWP 8 just did a rvmlib_set_thread_data()
10:15:01 ServerLWP 9 just did a rvmlib_set_thread_data()
10:15:01 ResLWP-0 just did a rvmlib_set_thread_data()

10:15:01 ResLWP-1 just did a rvmlib_set_thread_data()

10:15:01 VolUtilLWP 0 just did a rvmlib_set_thread_data()

10:15:01 VolUtilLWP 1 just did a rvmlib_set_thread_data()

10:15:01 Starting SmonDaemon timer
10:15:01 File Server started Mon Dec 18 10:15:01 2006

10:15:06 New Data Base received
10:18:45 AuthLWP-0 received new connection 1153619517 from 192.168.222.24:32802
10:18:45 client_GetVenusId: got new host 192.168.222.24:32802
10:18:45 Building callback conn.
10:18:45 AuthLWP-1 received new connection 987041452 from 192.168.222.24:32802
10:18:47 GetAttrPlusSHA: Computing SHA 1000001.114.1deb, disk.inode=9b
10:18:47 GetAttrPlusSHA: Computing SHA 1000001.150.1e09, disk.inode=ba
10:18:47 GetAttrPlusSHA: Computing SHA 1000001.2a2.1ed2, disk.inode=166
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.f2.1dba, disk.inode=8a
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.304.1f23, disk.inode=198
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.68.1d75, disk.inode=64
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.e8.1db5, disk.inode=85
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.62.1d72, disk.inode=61
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.22c.1e97, disk.inode=129
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.242.1ea2, disk.inode=134
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.13c.1dff, disk.inode=af
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.6a.1d76, disk.inode=65
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.66.1d74, disk.inode=63
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.302.1f22, disk.inode=197
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.244.1ea3, disk.inode=135
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.158.1e0d, disk.inode=bf
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.234.1e9b, disk.inode=12d
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.24a.1ea6, disk.inode=138
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.18a.1e26, disk.inode=d8
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.152.1e0a, disk.inode=bb
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.6c.1d77, disk.inode=66
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.c4.1da3, disk.inode=72
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.130.1df9, disk.inode=a9
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.e4.1db3, disk.inode=83
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.14c.1e07, disk.inode=b8
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.19c.1e2f, disk.inode=e1
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.d2.1daa, disk.inode=79
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.14a.1e06, disk.inode=b7
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.190.1e29, disk.inode=db
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.236.1e9c, disk.inode=12e
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.112.1dea, disk.inode=9a
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.10a.1de6, disk.inode=96
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.178.1e1d, disk.inode=cf
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.110.1de9, disk.inode=99
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.292.1eca, disk.inode=15e
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.21e.1e90, disk.inode=122
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.32e.1f38, disk.inode=1e9
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.64.1d73, disk.inode=62
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.d0.1da9, disk.inode=78
10:18:48 GetAttrPlusSHA: Computing SHA 1000001.8d0.1d29, disk.inode=5
10:19:05 Worker0: Unbinding RPC connection 1153619517
10:19:05 Deleting client entry for user System:AnyUser at 192.168.222.24.32802 rpcid 1153619517
10:19:15 AuthLWP-2 received new connection 1453781193 from 192.168.222.24:32802
10:26:37 New Data Base received


The SrvLog of clusty2 (slave):
Date: Mon 12/18/2006

10:15:47 Coda Vice, version 6.1.2       log started at Mon Dec 18 10:15:47 2006

10:15:47 RvmType is Rvm
10:15:47 Main process doing a LWP_Init()
10:15:47 Main thread just did a RVM_SET_THREAD_DATA

10:15:47 Setting Rvm Truncate threshhold to 5.

Partition /vicepa: inodes in use: 0, total: 16777216.
10:15:52 Partition /vicepa: 8211208K available (minfree=5%), 8191888K free.
10:15:52 The server (pid 23185) can be controlled using volutil commands
10:15:52 "volutil -help" will give you a list of these commands
10:15:52 If desperate,
                "kill -SIGWINCH 23185" will increase debugging level
10:15:52        "kill -SIGUSR2 23185" will set debugging level to zero
10:15:52        "kill -9 23185" will kill a runaway server
10:15:52 Vice file system salvager, version 3.0.
10:15:52 SanityCheckFreeLists: Checking RVM Vnode Free lists.
10:15:52 DestroyBadVolumes: Checking for destroyed volumes.
10:15:52 Salvaging file system partition /vicepa
10:15:52 Force salvage of all volumes on this partition
10:15:52 Scanning inodes in directory /vicepa...
10:15:52 SalvageFileSys completed on /vicepa
10:15:52 Attached 0 volumes; 0 volumes not attached
lqman: Creating LockQueue Manager.....LockQueue Manager starting .....
10:15:52 LockQueue Manager just did a rvmlib_set_thread_data()

done
10:15:52 CallBackCheckLWP just did a rvmlib_set_thread_data()

10:15:52 CheckLWP just did a rvmlib_set_thread_data()

10:15:52 Starting AuthLWP-0
10:15:52 Starting AuthLWP-1
10:15:52 Starting AuthLWP-2
10:15:52 Starting AuthLWP-3
10:15:52 Starting AuthLWP-4
10:15:52 ServerLWP 0 just did a rvmlib_set_thread_data()
10:15:52 ServerLWP 1 just did a rvmlib_set_thread_data()
10:15:52 ServerLWP 2 just did a rvmlib_set_thread_data()
10:15:52 ServerLWP 3 just did a rvmlib_set_thread_data()
10:15:52 ServerLWP 4 just did a rvmlib_set_thread_data()
10:15:52 ServerLWP 5 just did a rvmlib_set_thread_data()
10:15:52 ServerLWP 6 just did a rvmlib_set_thread_data()
10:15:52 ServerLWP 7 just did a rvmlib_set_thread_data()
10:15:52 ServerLWP 8 just did a rvmlib_set_thread_data()
10:15:52 ServerLWP 9 just did a rvmlib_set_thread_data()
10:15:52 ResLWP-0 just did a rvmlib_set_thread_data()

10:15:52 ResLWP-1 just did a rvmlib_set_thread_data()

10:15:52 VolUtilLWP 0 just did a rvmlib_set_thread_data()

10:15:52 VolUtilLWP 1 just did a rvmlib_set_thread_data()

10:15:52 Starting SmonDaemon timer
10:15:52 File Server started Mon Dec 18 10:15:52 2006

10:16:13 New Data Base received
10:26:43 New Data Base received
10:27:13 New Data Base received



Sean Caron wrote:
> Hi Achim,
>
> I think you need to have all vice partitions on all Coda servers
> listed in the vicetab on your SCM. e.g.
>
> clusty1.mytest.de   /vicepa   ftree   width=256,depth=3
> clusty2.mytest.de   /vicepa   ftree   width=256,depth=3
>
> Theoretically it is then replicated down to other servers with
> updateclnt, but I found that in reality, this is not always the case,
> and that I needed to manually replicate this file to all the servers.
>
> Doing this "worked for me" when I was running a multi-server cell.
>
> Regards, Sean
> scaron_at_umich.edu
>
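
For anyone wanting to copy the file by hand as Sean describes, the straightforward way would be something like this (a rough sketch; it assumes the default /vice/db/vicetab location from server.conf and root ssh access between the servers):

# scp /vice/db/vicetab root@clusty2.mytest.de:/vice/db/vicetab

followed by restarting codasrv on the receiving server so it re-reads the file.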
Received on 2006-12-18 04:40:05