Coda File System

Re: Problem with mv(1)

From: Peter J. Braam <braam_at_cs.cmu.edu>
Date: Fri, 27 Nov 1998 16:16:56 -0500 (EST)
Michael,

Yes what you show is slow, but for a reason.  The Coda servers are
stateful and protect clients from seeing bad data.  The protocol is that
of callback promises, and when a client wants to store a new version,
servers will first break callbacks at all other clients.  In this way read
only access to files/directories goes without ever contacting the servers
provided files are in cache. 

Before you made the final change on client2 your server called all other
clients to break the callback (including the client1 that you had
shutdown). [The servers do this with a multi rpc to limit the total wait
to a single timeout.] Client1 was no longer on the net, so the server let
the RPC time out [sending a variety of retransmisions etc].  This is what
took a minute.

Yet, you do have a point: when a client is shut down orderly, we should
tell all the servers to which we hold about it, so that it destroys state
in an orderly fashion.

- Peter -






On Fri, 27 Nov 1998, Michael Salzmann wrote:

> 
> 
> Hi,
> 
> I have a coda-server (Linux 2.0.34, coda 4.6.6), two clients cc1, cc2 (Linux
> 2.1.128, coda 4.6.6, kernel-module built from linux-coda-4.7.0.tgz) and a
> problem with mv(1).
> 
> I did the following as admin-user (with tokens):
> 
> [admin_at_cc1 /coda/var]$ctokens
> 
> Token held by the Cache Manager:
> 
> Local uid: 500
> Coda user id: 500
> Expiration time: Sat Nov 28 12:06:30 1998   
> 
> [admin_at_cc2 /coda/var]$ctokens
> 
> Token held by the Cache Manager:
> 
> Local uid: 500
> Coda user id: 500
> Expiration time: Sat Nov 28 10:06:47 1998  
> 
> 
> create file on cc1:
> 
> [admin_at_cc1 /coda/var]$touch test1/dummy  
> 
> move file to directory test2 (on cc2):
> 
> [admin_at_cc2 /coda/var]$time mv test1/dummy test2/
> 0.00user 0.01system 0:00.03elapsed 29%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (76major+8minor)pagefaults 0swaps  
> 
> move file to directory test1 (on cc1):
> 
> [admin_at_cc1 /coda/var]$mv test2/dummy test1/
> 
> stop venus on cc1:
> 
> [root_at_cc1 /]# /etc/rc.d/init.d/venus.init stop 
> 
> 
> another move on cc2:
> 
> [admin_at_cc2 /coda/var]$time mv test1/dummy test2/
> 0.01user 0.00system 1:00.43elapsed 0%CPU (0avgtext+0avgdata 0maxresident)k
>                     ^^^^^^^^^^^^^^
> 0inputs+0outputs (76major+8minor)pagefaults 0swaps
> 
> 
> That's quite slow .... :-(   
> 
> Some lines from venus.log (vutil d 1):
> 
> [ W(15) : 0000 : 12:09:12 ] vproc::getattr: fid = (7f000003.5.10b)
> [ W(15) : 0000 : 12:09:12 ] vproc::Begin_VFS(Getattr): vid = 7f000003, u.u_vol = 0, mode = -1
> [ W(15) : 0000 : 12:09:12 ] fsobj::GetAttr: failed validation
> [ W(15) : 0000 : 12:09:12 ] 	remote = [33 0 0 0 0 0 0 0] [4c36eb7 365e87cb] [0]
> [ W(15) : 0000 : 12:09:12 ] 	local = [32 0 0 0 0 0 0 0] [4c368b7 365e8575] [0]
> [ W(15) : 0000 : 12:09:12 ] Transaction: elapsed = 0.0, delta_vm = 0
> [ W(15) : 0000 : 12:09:12 ] Transaction: elapsed = 0.0, delta_vm = 0
> [ W(15) : 0000 : 12:09:12 ] vproc::End_VFS(Getattr): code = 157
> [ W(15) : 0000 : 12:09:12 ] vproc::Begin_VFS(Getattr): vid = 7f000003, u.u_vol = 0, mode = -1
> [ W(15) : 0000 : 12:09:12 ] fsobj::GetAttr: failed validation
> [ W(15) : 0000 : 12:09:12 ] 	remote = [33 0 0 0 0 0 0 0] [4c36eb7 365e87cb] [0]
> [ W(15) : 0000 : 12:09:12 ] 	local = [32 0 0 0 0 0 0 0] [4c368b7 365e8575] [0]
> [ W(15) : 0000 : 12:09:12 ] Transaction: elapsed = 0.0, delta_vm = 0
> [ W(15) : 0000 : 12:09:12 ] vproc::End_VFS(Getattr): code = 0
> [ W(15) : 0000 : 12:09:12 ] Getattr : returns SUCCESS, elapsed = 4.0 msec
> [ W(15) : 0000 : 12:09:12 ] vproc::lookup: fid = (7f000003.1.1), name = test1, nc = 0
> [ W(15) : 0000 : 12:09:12 ] vproc::Begin_VFS(Lookup): vid = 7f000003, u.u_vol = 0, mode = -1
> [ W(15) : 0000 : 12:09:12 ] Transaction: elapsed = 1.0, delta_vm = 0
> [ W(15) : 0000 : 12:09:12 ] Transaction: elapsed = 0.0, delta_vm = 0
> [ W(15) : 0000 : 12:09:12 ] vproc::End_VFS(Lookup): code = 0
> [ W(15) : 0000 : 12:09:12 ] Lookup : returns SUCCESS, elapsed = 5.0 msec
> [ W(15) : 0000 : 12:09:12 ] vproc::getattr: fid = (7f000003.3.10a)
> [ W(15) : 0000 : 12:09:12 ] vproc::Begin_VFS(Getattr): vid = 7f000003, u.u_vol = 0, mode = -1
> [ W(15) : 0000 : 12:09:12 ] vproc::End_VFS(Getattr): code = 0
> [ W(15) : 0000 : 12:09:12 ] Getattr : returns SUCCESS, elapsed = 0.0 msec
> [ W(15) : 0000 : 12:09:12 ] vproc::access: fid = (7f000003.3.10a), mode = 01
> [ W(15) : 0000 : 12:09:12 ] vproc::Begin_VFS(Access): vid = 7f000003, u.u_vol = 0, mode = -1
> [ W(15) : 0000 : 12:09:12 ] vproc::End_VFS(Access): code = 0
> [ W(15) : 0000 : 12:09:12 ] Access : returns SUCCESS, elapsed = 0.0 msec
> [ W(15) : 0000 : 12:09:12 ] vproc::lookup: fid = (7f000003.3.10a), name = dummy, nc = 0
> [ W(15) : 0000 : 12:09:12 ] vproc::Begin_VFS(Lookup): vid = 7f000003, u.u_vol = 0, mode = -1
> [ W(15) : 0000 : 12:09:12 ] Transaction: elapsed = 0.0, delta_vm = 0
> [ W(15) : 0000 : 12:09:12 ] Transaction: elapsed = 0.0, delta_vm = 0
> [ W(15) : 0000 : 12:09:12 ] Transaction: elapsed = 0.0, delta_vm = 0
> [ W(15) : 0000 : 12:09:12 ] Transaction: elapsed = 1.0, delta_vm = 0
> [ W(15) : 0000 : 12:09:12 ] Transaction: elapsed = 0.0, delta_vm = 0
> [ W(15) : 0000 : 12:09:12 ] vproc::End_VFS(Lookup): code = 0
> [ W(15) : 0000 : 12:09:12 ] Lookup : returns SUCCESS, elapsed = 12.0 msec
> [ W(15) : 0000 : 12:09:12 ] vproc::getattr: fid = (7f000003.44.12a)
> [ W(15) : 0000 : 12:09:12 ] vproc::Begin_VFS(Getattr): vid = 7f000003, u.u_vol = 0, mode = -1
> [ W(15) : 0000 : 12:09:12 ] vproc::End_VFS(Getattr): code = 0
> [ W(15) : 0000 : 12:09:12 ] Getattr : returns SUCCESS, elapsed = 0.0 msec
> [ W(15) : 0000 : 12:09:12 ] vproc::lookup: fid = (7f000003.5.10b), name = dummy, nc = 0
> [ W(15) : 0000 : 12:09:12 ] vproc::Begin_VFS(Lookup): vid = 7f000003, u.u_vol = 0, mode = -1
> [ W(15) : 0000 : 12:09:12 ] Transaction: elapsed = 0.0, delta_vm = 0
> [ W(15) : 0000 : 12:09:12 ] Transaction: elapsed = 0.0, delta_vm = 0
> [ W(15) : 0000 : 12:09:12 ] Transaction: elapsed = 0.0, delta_vm = 0
> [ W(15) : 0000 : 12:09:12 ] vproc::End_VFS(Lookup): code = 2
> [ W(15) : 0000 : 12:09:12 ] Lookup : returns No such file or directory, elapsed = 7.0 msec
> [ W(15) : 0000 : 12:09:12 ] vproc::access: fid = (7f000003.3.10a), mode = 03
> [ W(15) : 0000 : 12:09:12 ] vproc::Begin_VFS(Access): vid = 7f000003, u.u_vol = 0, mode = -1
> [ W(15) : 0000 : 12:09:12 ] vproc::End_VFS(Access): code = 0
> [ W(15) : 0000 : 12:09:12 ] Access : returns SUCCESS, elapsed = 0.0 msec
> [ W(15) : 0000 : 12:09:12 ] vproc::rename: fid = (7f000003.3.10a), td_fid = (7f000003.5.10b), name = dummy, toname = dummy
> [ W(15) : 0000 : 12:09:12 ] vproc::Begin_VFS(Rename): vid = 7f000003, u.u_vol = 0, mode = -1
> 
> [ D(02) : 0119 : 12:09:17 ] NotifyUserOfServerBandwidthEvent
> 
> [ F(06) : 0120 : 12:09:27 ] Transaction: elapsed = 0.0, delta_vm = 0
> 
> [ F(06) : 0123 : 12:09:42 ] Transaction: elapsed = 3.0, delta_vm = 0
> 
> [ T(01) : 0124 : 12:09:42 ] RecovFlush: count = 1, size = 24320, elapsed = 20.0, delta_vm = 0
> 
> [ F(06) : 0128 : 12:10:12 ] Transaction: elapsed = 0.0, delta_vm = 0
> 
> [ W(15) : 0000 : 12:10:12 ] *** Long Running (Multi)VRename: code = -2001, elapsed = 60105.0 ***
> [ W(15) : 0000 : 12:10:12 ] Transaction: elapsed = 0.0, delta_vm = 0
> [ W(15) : 0000 : 12:10:12 ] Transaction: elapsed = 1.0, delta_vm = 0
> [ W(15) : 0000 : 12:10:12 ] Rename : returns SUCCESS, elapsed = -1631886080.0 msec
> [ W(15) : 0000 : 12:10:12 ] vproc::End_VFS(Rename): code = 0
> [ W(15) : 0000 : 12:10:12 ] [Return Done] Rename : returns SUCCESS, elapsed = 60109.0
> 
> 
> 
> What went wrong? What have I done wrong?
> 
> 
> 
> -MIC
> 
> p.s: I found "[ H(07) : 0003 : 12:15:38 ] Tally for vuid=-1:" somewhere
> else in the log-file. What does this mean? Is is related to the problem?
> 
> 
> 
> 
Received on 1998-11-27 17:01:16