Coda File System

Re: Yellow zone, slowing down writer

From: Jason A. Pattie <>
Date: Thu, 19 Feb 2004 13:29:02 -0600

Jan Harkes wrote:
| On Thu, Feb 19, 2004 at 12:11:10PM -0600, Jason A. Pattie wrote:
|>What does that mean?  I suppose that's why it's taking quite a while to
|>transfer the 580MB of data into the coda volume?
| It means that your client is write-disconnected and performing more
| operations faster than it can reintegrate with the server. If we didn't
| slow down the writes, your client would end up using up the rest of the
| reintegration log quite quickly and most likely crash. If your client is
| not reintegrating at all (possibly caused by a conflict) it will at some
| point reach the 'red zone' and block any mutating operations. This is
| pretty fatal in itself, because each blocked operation will use up a
| worker thread and there are only about 20 of those.
|>I did a 'cfs strong' command, but that didn't change anything.  Is it
|>possible that the cache is getting full (only 400MB) and needs to purge
|>least recently used entries or something and so it's taking lots of time?
| I've said this many times before, there is no such thing as guaranteed
| connected operation in Coda. If anything goes wrong during a write/store
| operation the client will silently switch to write-disconnected
| operation (logging state). If the server is slow to respond we switch to
| a logging state. And conversely, when the client can't be reached by the
| server, the server triggers the disconnect and we are likely to switch to
| a logging state.
| The only thing that cfs strong does is prevent the client from listening
| to the often incorrect 'bandwidth estimates' from the RPC2 communication
| layer, so that transitions only happen in error cases and not based on
| incorrect estimates. In fact, if you were already write-disconnected
| before calling cfs strong, the client will never discover that the
| network actually has good bandwidth and will never transition to the
| connected state.
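
For readers following along, the yellow/red zone behaviour Jan describes can be pictured with a toy model. This is only a sketch and not Venus code: the log capacity, the 75%/90% thresholds, and the "halve the write rate" slowdown are made-up numbers chosen for illustration, and the names ReintegrationLog and simulate are hypothetical. Only the limit of about 20 worker threads comes from the message above.

```python
from dataclasses import dataclass


@dataclass
class ReintegrationLog:
    """Toy stand-in for a client's reintegration log (all numbers assumed)."""
    capacity: int = 100       # hypothetical number of log entries
    yellow_percent: int = 75  # assumed threshold, not Coda's real value
    red_percent: int = 90     # assumed threshold, not Coda's real value
    used: int = 0

    def zone(self) -> str:
        # Integer arithmetic avoids floating-point edge cases at the thresholds.
        if self.used * 100 >= self.capacity * self.red_percent:
            return "red"
        if self.used * 100 >= self.capacity * self.yellow_percent:
            return "yellow"
        return "green"


def simulate(log, write_rate, reintegrate_rate, ticks, workers=20):
    """Run the model; return (zone at the start of each tick, blocked workers)."""
    history = []
    blocked = 0
    for _ in range(ticks):
        zone = log.zone()
        history.append(zone)
        if zone == "red":
            # Mutating operations block, and each one ties up a worker thread.
            accepted = 0
            blocked = min(workers, blocked + write_rate)
        else:
            # Yellow zone: slow the writer down (the factor of 2 is arbitrary).
            accepted = write_rate // 2 if zone == "yellow" else write_rate
            blocked = 0
        # The reintegrator drains whatever it can, up to its rate.
        drained = min(reintegrate_rate, log.used + accepted)
        log.used = min(log.capacity, log.used + accepted - drained)
    return history, blocked


# Slow but working reintegration: the writer gets throttled, the log survives.
slow_history, slow_blocked = simulate(ReintegrationLog(), 10, 6, 200)
print(sorted(set(slow_history)), slow_blocked)

# No reintegration at all (e.g. a conflict): the log ends up in the red zone.
stuck_history, stuck_blocked = simulate(ReintegrationLog(), 10, 0, 20)
print(stuck_history[-1], stuck_blocked)
```

In this model, slowing the writer in the yellow zone is enough to keep the log out of the red zone as long as reintegration makes some progress. When reintegration makes no progress at all, throttling only delays the red zone, and the blocked operations then use up every worker.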

So where do I begin the process of troubleshooting what went wrong?  I'm
currently in the Red state.  The scenario is that I'm on the actual file
server (scm) transferring files from one location on another partition
to the /coda/<realm>/<dir> location using tar.  I.e., the file server is
running both codasrv and coda-client (venus).  Is this a bad thing?
Should I only connect from a remote venus instance (i.e., my laptop,
etc.)?  I would think this would be a scenario that never gets
"disconnected".  I also don't see how anything could "go wrong" in this
scenario, especially since I'm copying data that shouldn't have changed,
but I can delete all the data in the directory and start the copy over.
Technically, I did have a previous copy of the data in the directory
before I had created users and groups that matched those of the original
directory.  My assumption was that I could reissue the tar command and
replace all the files with the correct ownership and permissions.  If
this is not the case, then I can easily delete all files and start the
copy over again.


--
Jason A. Pattie
Xperience, Inc. (


Received on 2004-02-19 14:35:06