Coda File System

Strange hang and deqing log message

From: Patrick Walsh <pwalsh_at_esoft.com>
Date: Fri, 26 Aug 2005 15:23:45 -0600
	Hey gang, haven't seen much traffic here for awhile.  That may be a
good sign.  Alas, I have a new problem to report.  I don't think it's
the first time it happened.  A colleague was making sym links by hand.
He deleted one, then readded it a minute later.  The directory he was
working in became unusable from any connected client.  Other directories
continued to work fine.  In the end I was forced to restart both of our
coda servers to get things going again.

	There may have been a cron job or two running in the background at the
same time, so the linking thing may not have been the cause, but it's
suspicious because the issue occurred in the directory he was working
in.

	SrvErr on both servers was empty.  Restarting clients didn't fix the
problem.  The log messages in SrvLog were fairly usual except for a few
that I haven't seen before that look like this:

****** WARNING entry at 0x812ac80 already has deqing set!

	Anyone have any ideas for what may have caused this?  I should have
attached with gdb, but didn't think of it until after restarting the
servers.  Here's the log from server 1:

14:25:04 GetAttrPlusSHA: Computing SHA 1000004.fd4.3569, disk.inode=1d77
14:25:04 GetAttrPlusSHA: Computing SHA 1000004.28dc.1eaf,
disk.inode=165e
14:30:02 ****** WARNING entry at 0x812ac80 already has deqing set!

14:30:07 LockQueue Manager: found entry for volume 0x1000004
14:30:20 SmonDaemon timer expired
14:30:20 Entered CheckRVMResStat
14:30:20 Starting SmonDaemon timer
14:31:07 LockQueue Manager: found entry for volume 0x1000004
14:32:07 LockQueue Manager: found entry for volume 0x1000004
14:32:49 ****** WARNING entry at 0x812ac80 already has deqing set!

14:33:07 LockQueue Manager: found entry for volume 0x1000004
14:33:54 GetAttrPlusSHA: Computing SHA 1000002.1c0.101, disk.inode=187e
14:33:54 VGetVnode: vnode 1000002.2048 is not allocated
14:33:54 GetAttrPlusSHA: Computing SHA 1000002.20c.127, disk.inode=19c7
14:33:58 GetAttrPlusSHA: Computing SHA 1000002.204c.144f, disk.inode=4cd
14:33:58 GetAttrPlusSHA: Computing SHA 1000002.20e.128, disk.inode=1873
14:34:07 LockQueue Manager: found entry for volume 0x1000004
14:34:18 VGetVnode: vnode 1000002.2048 is not allocated
14:35:07 LockQueue Manager: found entry for volume 0x1000004
LQMan: Unlocking 1000004
14:38:45 VGetVnode: vnode 1000002.2048 is not allocated


	And here's the log from server 2:

14:25:04 GetAttrPlusSHA: Computing SHA 2000004.28dc.1eaf,
disk.inode=1667
14:29:12 Entering RecovDirResolve 7f000003.14d.1407

14:29:12 RegDirResolution: WEAKLY EQUAL DIRECTORIES
14:29:12 RecovDirResolve: RegDirResolution succeeded

14:30:00 Entering RecovDirResolve 7f000003.14d.1407

14:30:00 RegDirResolution: WEAKLY EQUAL DIRECTORIES
14:30:00 RecovDirResolve: RegDirResolution succeeded

14:30:02 ****** WARNING entry at 0x8124eb0 already has deqing set!
14:32:49 ****** WARNING entry at 0x8124eb0 already has deqing set!

14:37:07 Callback failed RPC2_NAKED (F) for ws 192.168.30.82:2435
14:37:07 Deleting client entry for user backend at 192.168.30.82.2435
rpcid 811174596
14:37:07 Deleting client entry for user System:AnyUser at
192.168.30.82.2435 rpcid 735880060
14:37:07 Deleting client entry for user System:AnyUser at
192.168.30.82.2435 rpcid 338432730
14:37:07 Deleting client entry for user System:AnyUser at
192.168.30.82.2435 rpcid 385662857
14:38:41 SmonDaemon timer expired
14:38:41 Entered CheckRVMResStat
14:38:41 Starting SmonDaemon timer
14:41:07 Callback failed RPC2_NAKED (F) for ws 192.168.30.81:2435
14:41:07 Deleting client entry for user backend at 192.168.30.81.2435
rpcid 951606769
14:41:07 Deleting client entry for user backend at 192.168.30.81.2435
rpcid 948031054
14:41:07 Deleting client entry for user backend at 192.168.30.81.2435
rpcid 219515365
14:41:07 Deleting client entry for user scmca at 192.168.30.81.2435
rpcid 1036434862
14:41:07 Deleting client entry for user System:AnyUser at
192.168.30.81.2435 rpcid 193309155
14:41:07 Deleting client entry for user System:AnyUser at
192.168.30.81.2435 rpcid 993820966


	I'm definitely open to thoughts on this.  Similar behavior hit us a
little while ago, but I didn't have time to investigate so I'm not sure
if the logs would have looked similar.

	Thanks for any help.


-- 
Patrick Walsh
eSoft Incorporated
303.444.1600 x3350
http://www.esoft.com/

Received on 2005-08-26 17:29:47