Coda File System

Re: crash in rvmlib_free (not necessarily) during repair

From: Piotr Isajew <pki_at_ex.com.pl>
Date: Tue, 9 Jul 2013 18:52:43 +0200
On Mon, Jul 08, 2013 at 12:17:03PM +0000, u-codalist-rcma_at_aetey.se wrote:

> I would suggest scripting the copy into slices per file
> hierarchy level (i.e. breadth first) and ensuring that the
> changes are fully reintegrated/resolved before copying the next
> deeper slice.

Just to let you know about the result: it seems that the crash is
independent of the amount and structure of the data being
copied. Copying more files just gives a crash more chances to
occur.
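
For reference, the sliced copy suggested above can be sketched
roughly as below. This is only an illustration in Python, not the
script that was actually used: the function name copy_breadth_first
and the after_level hook are hypothetical, and the hook is merely a
placeholder for whatever check confirms that venus has fully
reintegrated the previous slice. No real Coda command is assumed.

```python
import os
import shutil


def copy_breadth_first(src, dst, after_level=None):
    """Copy the tree under src into dst one hierarchy level at a time.

    after_level(depth) is a hypothetical hook, called once after each
    level has been copied. It is where a wait for reintegration or
    resolution would go; by default nothing is waited for.
    Symbolic links are skipped in this sketch.
    Returns the number of levels that were copied.
    """
    os.makedirs(dst, exist_ok=True)
    current = [""]  # directories of the current level, relative to src
    depth = 0
    while current:
        next_level = []
        for rel in current:
            src_dir = os.path.join(src, rel)
            dst_dir = os.path.join(dst, rel)
            for entry in sorted(os.scandir(src_dir), key=lambda e: e.name):
                target = os.path.join(dst_dir, entry.name)
                if entry.is_dir(follow_symlinks=False):
                    # Create the directory now, fill it in the next slice.
                    os.makedirs(target, exist_ok=True)
                    next_level.append(os.path.join(rel, entry.name))
                elif entry.is_file(follow_symlinks=False):
                    shutil.copy2(entry.path, target)
        if after_level is not None:
            after_level(depth)  # e.g. block until the slice is reintegrated
        current = next_level
        depth += 1
    return depth
```

Each slice contains the files of one level plus the empty
directories for the next one, so the hook always sees a complete
level before anything deeper is written.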

I upgraded everything to the most recent git sources. It improved
reliability when it comes to server/server conflict
resolution, but gave no advantage for copy operations.

As of now it seems possible to crash both servers in a
repeatable way, even when copying a small number of files to a
single directory (both servers crashed on the same rvm assertion
when copying 20 pdf files, 60M total).

The most stable behaviour can be achieved by turning off the
non-SCM, performing the copy operation, waiting for venus to
reintegrate everything to the SCM and then bringing up the
non-SCM and propagating the changes to it. For larger sets of
data, however, this gives a "No space left in volume log." error
on the SCM, and it crashes on another assertion. Turning on the
non-SCM in such a situation leads to a repeatable suicide at its
start, and the whole situation starts to look like a dog trying
to catch its own tail.
Received on 2013-07-09 12:53:00