Coda File System

Re: coda crash

From: Steffen Neumann <sneumann_at_techfak.uni-bielefeld.de>
Date: 10 Oct 2003 12:06:15 +0200
Steffen Neumann <sneumann_at_techfak.uni-bielefeld.de> writes:

> Jan Harkes <jaharkes_at_cs.cmu.edu> writes:
> 
> > On Tue, Oct 07, 2003 at 07:42:15PM +0200, Eduard Frese wrote:
> > > Hello,
> > > we have had a venus crash every day for the last few days :-(
> > > 
> > > What can I do?
> > 
> > What kernel version is this with?
> 
> It is 2.4.20; Coda is a CVS build from around 6.0.2,
> and both client and server run on this machine.
> The crash happens during the nightly backups,
> which run a kind of tar each night, so we're eager
> to get the backups working again ;-)

After two very quiet days, here we are again.
I have the feeling Coda is very flaky at the moment,
and I am not sure what is happening.

This time the error hit us during an

	rm -rf /coda/vol/kernel/src/linux-2.2.20

which went berserk. Any ideas?

Yours,
Steffen,
still trying to hunt down that other 
problem we have.

---------------------------------------------------------


11:15:00 root acquiring Coda tokens!
11:49:09 Fatal Signal (11); pid 10164 becoming a zombie...
11:49:09 You may use gdb to attach to 10164

---------------------------------------------------------

[ H(07) : 0234 : 11:48:43 ] HDBDaemon just woke up
[ H(07) : 0234 : 11:48:44 ] Hoard Walk interrupted -- object missing! <529d80c8.7f000036.f02.15533>
[ H(07) : 0234 : 11:48:44 ] Number of interrupt failures = 1
[ H(07) : 0234 : 11:48:45 ] Hoard Walk interrupted -- object missing! <529d80c8.7f000036.1708.15580>
[ H(07) : 0234 : 11:48:45 ] Number of interrupt failures = 2
[ H(07) : 0234 : 11:48:45 ] Hoard Walk interrupted -- object missing! <529d80c8.7f000036.f0e.15539>
[ H(07) : 0234 : 11:48:45 ] Number of interrupt failures = 3
[ H(07) : 0234 : 11:48:46 ] Hoard Walk interrupted -- object missing! <529d80c8.7f000036.1714.15586>
[ H(07) : 0234 : 11:48:46 ] Number of interrupt failures = 4
[ H(07) : 0234 : 11:48:47 ] Hoard Walk interrupted -- object missing! <529d80c8.7f000036.7fa0.15952>
[ H(07) : 0234 : 11:48:47 ] Number of interrupt failures = 5
[ H(07) : 0234 : 11:48:48 ] Hoard Walk interrupted -- object missing! <529d80c8.7f000036.7fa2.15953>
[ H(07) : 0234 : 11:48:48 ] Number of interrupt failures = 6
[ H(07) : 0234 : 11:48:49 ] Hoard Walk interrupted -- object missing! <529d80c8.7f000036.d7da.1530b>
[ H(07) : 0234 : 11:48:49 ] Number of interrupt failures = 7
[ H(07) : 0234 : 11:48:50 ] Hoard Walk interrupted -- object missing! <529d80c8.7f000036.d828.15352>
[ H(07) : 0234 : 11:48:50 ] Number of interrupt failures = 8
[ H(07) : 0234 : 11:48:50 ] Hoard Walk interrupted -- object missing! <529d80c8.7f000036.d82a.15353>
[ H(07) : 0234 : 11:48:50 ] Number of interrupt failures = 9
[ H(07) : 0234 : 11:48:51 ] Hoard Walk interrupted -- object missing! <529d80c8.7f000036.d836.15359>
[ H(07) : 0234 : 11:48:51 ] Number of interrupt failures = 10
[ H(07) : 0234 : 11:49:03 ] Hoard Walk interrupted -- object missing! <529d80c8.7f000036.804a.153ea>
[ H(07) : 0234 : 11:49:03 ] Number of interrupt failures = 11
[ H(07) : 0234 : 11:49:09 ] *****  FATAL SIGNAL (11) *****
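
For anyone who wants to pull the missing objects out of a longer venus log,
here is a minimal Python sketch. It is not part of Coda; the function name
summarize and the sample text are made up for illustration, and the regular
expressions only assume the line layout visible in the excerpt above. The
identifiers in angle brackets are treated as opaque strings.

```python
import re

# Sample lines copied from the venus log excerpt above (shortened).
SAMPLE_LOG = """\
[ H(07) : 0234 : 11:48:43 ] HDBDaemon just woke up
[ H(07) : 0234 : 11:48:44 ] Hoard Walk interrupted -- object missing! <529d80c8.7f000036.f02.15533>
[ H(07) : 0234 : 11:48:44 ] Number of interrupt failures = 1
[ H(07) : 0234 : 11:48:45 ] Hoard Walk interrupted -- object missing! <529d80c8.7f000036.1708.15580>
[ H(07) : 0234 : 11:48:45 ] Number of interrupt failures = 2
[ H(07) : 0234 : 11:49:09 ] *****  FATAL SIGNAL (11) *****
"""

# Assumed layout: "[ <thread> : <seq> : <hh:mm:ss> ] <message>".
MISSING_RE = re.compile(
    r"\[\s*\S+\s*:\s*\d+\s*:\s*(\d{2}:\d{2}:\d{2})\s*\]\s*"
    r"Hoard Walk interrupted -- object missing! <([0-9a-f.]+)>"
)
FATAL_RE = re.compile(r"FATAL SIGNAL \((\d+)\)")


def summarize(log_text):
    """Return ([(time, object_id), ...], fatal_signal_or_None)."""
    missing = []
    fatal_signal = None
    for line in log_text.splitlines():
        m = MISSING_RE.search(line)
        if m:
            # Keep the timestamp and the identifier exactly as logged.
            missing.append((m.group(1), m.group(2)))
            continue
        f = FATAL_RE.search(line)
        if f:
            fatal_signal = int(f.group(1))
    return missing, fatal_signal


if __name__ == "__main__":
    objects, signal_number = summarize(SAMPLE_LOG)
    for when, object_id in objects:
        print(when, object_id)
    print("fatal signal:", signal_number)
```

Running it over the full log should list each missing object once, which
makes it easier to compare them against the tree that was being removed.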

---------------------------------------------------------

(gdb) where
#0  0x40138c16 in __sigsuspend (set=0x150a03e4) at ../sysdeps/unix/sysv/linux/sigsuspend.c:45
#1  0x080b24c0 in SigChoke (sig=11) at sighand.cc:241
#2  <signal handler called>
#3  olist_iterator::operator() (this=0x150a07d0) at olist.cc:246
#4  0x0805b134 in fso_vol_iterator::operator() (this=0x150a07d0) at fso1.cc:2803
#5  0x080a60dc in repvol::ValidateFSOs (this=0x525bfd08) at vol_vcb.cc:453
#6  0x080a4cf6 in repvol::GetVolAttr (this=0x525bfd08, uid=4294967294) at vol_vcb.cc:113
#7  0x0808f451 in volent::Enter (this=0x525bfd08, mode=2, uid=4294967294) at venusvol.cc:1050
#8  0x080a7e97 in vproc::Begin_VFS (this=0x8246500, volid=0x150a2db8, vfsop=22, volmode=-1) at vproc.cc:590
#9  0x08071f0b in hdb::ValidateCacheStatus (this=0x529dcac8, vp=0x8246500, interrupt_failures=0x150a2e38, statusBytesFetched=0x150a3efc)
    at vproc.h:224
#10 0x08072647 in hdb::StatusWalk (this=0x529dcac8, vp=0x8246500, TotalBytesToFetch=0x150a3ef8, BytesFetched=0x150a3efc) at hdb.cc:881
#11 0x080732c0 in hdb::Walk (this=0x529dcac8, m=0x0, local_id=0) at hdb.cc:1169
#12 0x08075e49 in HDBDaemon () at hdb_daemon.cc:121
#13 0x080a7ab5 in vproc::main (this=0x8246500) at vproc.cc:428
#14 0x080a72a2 in VprocPreamble (init_lock=0x8246540) at vproc.cc:146
#15 0x400882f0 in Create_Process_Part2 () at lwp.c:796
Received on 2003-10-10 06:13:37