Coda File System

Re: CVS updates take down the client

From: Stephen J. Turnbull <stephen_at_xemacs.org>
Date: Wed, 03 Sep 2003 09:32:14 +0900
Just FYI, I've finally gotten around to applying this patch.  The
workspace is otherwise straight CVS HEAD, up-to-date as of Sep 2 11:00
UTC or so, except for a couple of ancient tweaks to stuff in debian/.

[Speaking of debian, I noticed that several of the library names
didn't include ".so"---this seems to be better now, but I'm not really
sure because /usr/lib had some cruft from slightly earlier builds of
6.0.3---the rev nos were bumped for this one.  Also dh_shlibsdep
bitched about improper format for some of the library symlinks when I
was building _this_ time, so probably there's still a problem
somewhere.]

The 6.0.3 venus has been somewhat more stable recently, but it still
crashes on the "vol_cml.cc, line 2256" assertion once or twice a week,
always during the daily CVS update on volume "xe-21.5".  That volume
contains the most recent XEmacs sources, so I conjecture it is "big"
in some sense, most likely in the number of mkdir/rmdir operations
generated by cvs update -dP.  (With Coda 5.3.20, every cvs update in
that tree used to produce a local/global conflict on some random
directory, never a file.)

Classes just started, so I probably will neglect to report experience
(especially if it's good :-).  Don't hesitate to ask if you're
interested, and of course I'm not throwing away any logs.

>>>>> "Jan" == Jan Harkes <jaharkes_at_cs.cmu.edu> writes:

    Jan> On Thu, Aug 14, 2003 at 02:23:01PM +0900, Stephen J. Turnbull
    Jan> wrote:

    >> Assertion failed: f, file "vol_cml.cc", line 2256

    Jan> This assertion is interesting. When we're reintegrating, the
    Jan> reintegration is smart enough to apply most directory
    Jan> operations to a directory that was already modified on the
    Jan> server (because it was dirty on the client it is missing a
    Jan> current version).

    Jan> When reintegration completes, the server includes a list of
    Jan> 'stale directories' that should be refetched by the
    Jan> client. And this assertion triggered in the piece of code
    Jan> that tries to mark/remove the stale directories. The
    Jan> assertion happens because the client can't find the stale
    Jan> object (i.e. it must have already been thrown out).

    Jan> So instead of asserting because we can't find it, we should
    Jan> probably just 'skip' the (already removed?) object.

    Jan> Jan

--- vol_cml.cc.orig	2003-06-01 19:03:52.000000000 -0400
+++ vol_cml.cc	2003-08-21 21:55:25.000000000 -0400
@@ -2253,10 +2253,11 @@
 		    LOG(0, ("ClientModifyLog::COP1: stale dir %s\n", 
 			    FID_(&StaleDir)));
 		    fsobj *f = FSDB->Find(&StaleDir);
-		    CODA_ASSERT(f);
-		    Recov_BeginTrans();
+		    if (f) {
+			Recov_BeginTrans();
 			f->Kill();
-		    Recov_EndTrans(DMFP);
+			Recov_EndTrans(DMFP);
+		    }
 		}
 	    }
     }
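
For readers without the venus sources at hand, the effect of the patch can be sketched in isolation. This is only an illustration under stated assumptions: Fid, CacheObj, Cache and KillStaleDirs are hypothetical stand-ins invented here, not the real Coda types (fsobj, FSDB), and the recoverable-memory transaction around Kill() is left out entirely. It shows only the control flow: a stale directory that is no longer cached is skipped rather than treated as a fatal error.

```cpp
#include <cassert>   // only needed to re-enable the old assert-style check
#include <cstdio>
#include <map>
#include <vector>

// Hypothetical stand-in for a Coda file identifier.
struct Fid { int vnode; int unique; };

inline bool operator<(const Fid &a, const Fid &b) {
    return a.vnode != b.vnode ? a.vnode < b.vnode : a.unique < b.unique;
}

// Hypothetical stand-in for a cached object (fsobj in the patch).
struct CacheObj {
    bool killed = false;
    void Kill() { killed = true; }
};

// Toy cache standing in for FSDB: Find() returns nullptr when the
// object has already been thrown out.
struct Cache {
    std::map<Fid, CacheObj> objs;
    CacheObj *Find(const Fid &fid) {
        auto it = objs.find(fid);
        return it == objs.end() ? nullptr : &it->second;
    }
};

// Kill every stale directory that is still cached and skip the rest.
// Returns how many entries were skipped because they were not found.
inline int KillStaleDirs(Cache &cache, const std::vector<Fid> &stale) {
    int skipped = 0;
    for (const Fid &fid : stale) {
        CacheObj *f = cache.Find(fid);
        if (!f) {
            // Before the patch, the equivalent of assert(f) aborted here.
            std::fprintf(stderr, "stale dir %d.%d not cached, skipping\n",
                         fid.vnode, fid.unique);
            ++skipped;
            continue;
        }
        f->Kill();
    }
    return skipped;
}
```

Returning the skip count is a choice made for this sketch so that a caller can log or test how often the case occurs; the actual patch simply does nothing when Find() comes back empty.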


-- 
Institute of Policy and Planning Sciences     http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.
Received on 2003-09-02 20:39:05