Coda File System

Re: server-server conflict doesn't seem resolvable

From: Patrick Walsh <pwalsh_at_esoft.com>
Date: Fri, 06 May 2005 10:00:21 -0600
> Is this on a volume that was 'grown' from singly replicated to doubly
> replicated? 

	Yes.

> I bumped into a problem the other day when one of my servers
> died as it tried to resolve a directory conflict in such a volume. It turns
> out that even when resolution is reenabled, the directories in the
> original replica that do not have any resolution log entries trigger a
> null-pointer dereference in the log-based resolution path.

	Hmmmm.  Could be the case.  But other directories went through exactly
the same process and no conflicts appeared, so I suspect this is a
different situation.

> Is this the first repair you tried, or did you have a failed or aborted
> repair on the same object before? 

	The first repair that I tried was, by the time I asked for help, out of
my scroll history...

> However, it seems that either we don't send callbacks, or they get
> ignored when the version vectors get bumped. So the next time the client
> tries to repair, it is still looking at the (stale) directories in the
> local cache and the repair will always be rejected because of the
> version-vector mismatch. I have seen this more with files than with
> directories. To check if this is the case,
> 
>     cfs br /coda/director/httpd/html		# expand the conflict
>     cfs getfid /coda/director/httpd/html/*	# show version vectors
>     cfs fl /coda/director/httpd/html/*		# flush cached replicas
>     cfs getfid /coda/director/httpd/html/*	# refetch and show vvs
>     cfs er /coda/director/httpd/html		# collapse the conflict

	That worked like a charm!  Fantastic!  Thank you very much.  

> If the version vectors before and after the flush were different, repair
> should now be able to fix the conflict.

	The version vectors were in fact different, and running repair again
fixed the conflict.
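
	The check quoted above could be wrapped in a small shell function so
the before and after version vectors are compared automatically. This is
only a sketch: the function name check_stale_vv and the CFS override
variable are made up for illustration, and it assumes that the output of
"cfs getfid" differs only when the version vectors differ, which may not
hold on every Coda release.

```shell
#!/bin/sh
# Sketch only: wraps the five cfs commands quoted in this thread.
# CFS is a hypothetical override so the function can be dry-run
# with a stand-in command; it defaults to the real cfs binary.
CFS="${CFS:-cfs}"

# check_stale_vv DIR
# Returns 0 if the getfid output changed after the flush (the client
# was holding stale replicas), 1 if the output was identical.
check_stale_vv() {
    dir=$1
    "$CFS" br "$dir"                     # expand the conflict
    before=$("$CFS" getfid "$dir"/*)     # version vectors as cached
    "$CFS" fl "$dir"/*                   # flush cached replicas
    after=$("$CFS" getfid "$dir"/*)      # refetch and show version vectors
    "$CFS" er "$dir"                     # collapse the conflict
    if [ "$before" != "$after" ]; then
        echo "getfid output changed after flush; try repair again"
        return 0
    fi
    echo "getfid output unchanged; stale cache is probably not the cause"
    return 1
}
```

	For the directory in this thread it would be called as
"check_stale_vv /coda/director/httpd/html". A return status of 0 means the
cached replicas were stale and repair is worth retrying.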

	Thank you again, Jan.  I'll add these instructions to the wiki.  I
tried cache flushes, but it didn't occur to me to flush the objects
after the repair began.

-- 
Patrick Walsh
eSoft Incorporated
303.444.1600 x3350
http://www.esoft.com/

Received on 2005-05-06 12:01:24