Coda File System

Re: CODA starting questions

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Tue, 9 Oct 2001 15:30:14 -0400
On Tue, Oct 09, 2001 at 11:53:09AM +0200, René Rebe wrote:
> As an alternative I find CODA quite promising. But I ran into problems reading
> through the documentation.
> 
> The introduction about coda says:
> ..., Venus fetches the entire file from the servers, ...
> 
> Err? I have some multi-track wave files (> 2 GB of course), RockLinux tars
> (< 1GB) and ISOs (600MB) lying around ... . So Coda will not be able to serve
> these files? Also, even if I create a 4GB cache, it would be extremely slow to
> copy the whole file over 100MBit ethernet when an application calls open().
> Is Coda not able to serve the data on-the-fly (for such files only ...)?

They will be extremely fast once they are already in the cache. How do you
expect to work with a file while disconnected from the network if you only
have a random few bits and pieces cached locally? And when both the local
and the server version of the file have been updated, how would you
reconstruct a consistent, complete version of the local copy?

Also, if we wanted to work with file chunks, we would suddenly need to
make more upcalls from the kernel to userspace whenever a file is accessed,
to check whether the requested data is present. Upcalls are pretty expensive:
just compare creating 100 files in /coda with creating 100 files on the
local disk (while Coda is write-disconnected, otherwise you're measuring
client-server communication overhead).
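
A rough sketch of that comparison in Python. The environment variable
CODA_TEST_DIR and the helper name time_file_creation are made up for this
illustration; point the variable at a writable directory under /coda while
the client is write-disconnected. Absolute numbers will vary by machine;
only the ratio between the two runs is of interest.

```python
import os
import tempfile
import time


def time_file_creation(directory, count=100):
    """Create `count` empty files in `directory` and return elapsed seconds."""
    start = time.perf_counter()
    for i in range(count):
        # Each create goes through the kernel; under /coda it also
        # triggers an upcall to the userspace cache manager.
        with open(os.path.join(directory, f"upcall-test-{i}"), "w"):
            pass
    elapsed = time.perf_counter() - start

    # Clean up outside the timed region.
    for i in range(count):
        os.remove(os.path.join(directory, f"upcall-test-{i}"))
    return elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as local_dir:
        print(f"local disk: {time_file_creation(local_dir):.4f}s")

    # Hypothetical variable: set it to a directory under /coda.
    coda_dir = os.environ.get("CODA_TEST_DIR")
    if coda_dir:
        print(f"coda:       {time_file_creation(coda_dir):.4f}s")
```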

Partial caching just won't work if you want any guarantees on
consistency; just look at the mess AFS3 made of it. A small example for
AFS users: open a file for read/write, write 100KB, and sleep for a bit. In
the meantime, truncate the file and write a few bytes of data from
another client. Then continue writing on the first client. You end up
with... the few bytes written by the second client, almost 64KB of 0's,
and then the tail of the file as written by the first client. Similar
things apply to NFS, which basically caches on a page basis, so earlier
writes can possibly replace more recent updates. Just google for the
horror stories.
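
The order of operations in the AFS example can be sketched as follows. This
is only an illustration: both "clients" here are file descriptors in one
process working on a local file, so it shows the sequence of calls, not the
AFS result. On a typical local POSIX file system the gap is zero-filled all
the way up to the first writer's offset; what AFS produces with two real
clients is what the example above describes. The function name
interleaved_writes is made up for this sketch.

```python
import os
import tempfile


def interleaved_writes(path, first_size=100 * 1024):
    """Replay the write / truncate / write sequence and return the file contents."""
    # "Client 1": open for read/write and write 100KB.
    fd1 = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    os.write(fd1, b"A" * first_size)

    # "Client 2" (while client 1 sleeps): truncate and write a few bytes.
    fd2 = os.open(path, os.O_WRONLY | os.O_TRUNC)
    os.write(fd2, b"hello")
    os.close(fd2)

    # "Client 1" continues writing at its old offset.
    os.write(fd1, b"TAIL")
    os.close(fd1)

    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = interleaved_writes(os.path.join(tmp, "demo.bin"))
        zeros = data.count(b"\0")
        print(f"{len(data)} bytes total, {zeros} of them zero")
```

To try this against a distributed file system, the two halves would have to
run as separate programs on separate clients, with a sleep in the first one.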

> In one docu I found that the cache shouldn't be larger than ~300MB. Is this
> the latest info? Especially for the larger files (mentioned above) and hording
> on my laptop this would not be that cool ...

That is derived from an average file size of 24KB, i.e. we're really
concerned about the number of cacheable files. 300MB is approximately
12000 to 13000 cacheable files. If your files are on average 600MB, the
cache size can be several terabytes (never tried, but...) as long as you
override the default calculation by defining values for both cacheblocks
_and_ cachefiles in /etc/coda/venus.conf.
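
The arithmetic behind those numbers, as a small Python sketch. Two things
here are assumptions and not taken from the text above: that cacheblocks is
counted in 1KB units, and that venus.conf uses plain key=value lines. The
helper names and the 100GB example size are made up for illustration; this
is not Venus's actual default calculation.

```python
def cacheable_files(cache_kb, avg_file_kb=24):
    """Number of average-sized files that fit in a cache of `cache_kb` kilobytes."""
    return cache_kb // avg_file_kb


def venus_conf_lines(cache_kb, avg_file_kb):
    """Format both settings so the default calculation is overridden.

    Assumes cacheblocks is in 1KB units and a key=value file syntax.
    """
    files = cacheable_files(cache_kb, avg_file_kb)
    return f"cacheblocks={cache_kb}\ncachefiles={files}"


if __name__ == "__main__":
    # 300MB cache, 24KB average file size.
    print(cacheable_files(300 * 1024))  # 12800

    # Hypothetical 100GB cache holding files of about 600MB each.
    print(venus_conf_lines(100 * 1024 * 1024, 600 * 1024))
```

With large files the file count comes out tiny compared to the default
ratio, which is why both values have to be set explicitly.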

BTW. It could be that RFC822 doesn't allow characters like č in email
headers. My email client refused to insert "René Rebe <rene.rebe@..." as
the reply-to address. Mutt typically handles this stuff pretty well, so
I expect it is some override to avoid breaking standards.

Jan
Received on 2001-10-09 15:30:19