Coda File System

Re: CODA kernel module limitations...

From: Roland Mainz <Roland.Mainz_at_informatik.med.uni-giessen.de>
Date: Tue, 17 Oct 2000 18:39:51 +0200


Jan Harkes wrote:

> > Think about an ftp site (using CODA kernel-module and podfuk+ftpfs) which
> > has large INDEX files (~60MB). A file manager wants to determine the
> > datatype, reads the first two bytes, and thereby triggers the download
> > of all 60MB.
> > Quite efficient, isn't it ? ;-(((
> >
> > (Possible) solution:
> > Allow random-access (read/write etc.) to remote files.
> > What about an interface to the SYSV-VFS layer, e.g. moving more stuff
> > from kernel to userspace layer ?
>
> This would actually introduce far too much overhead in context switches
> between the application and the cache-manager. Besides, as far as Coda
> is concerned, it makes it impossible to guarantee consistency. However,
> there is another solution which we've been thinking of.
>
> When a large file is opened, CODA_OPEN could return early, f.i. when
> the first 8-64KB have been fetched. The kernel would get a `lease' on
> accessing these first pages (both read and write), while the cache
> manager pulls in the rest of the file.
>
> When the application accessing the file seeks or reads past its `lease'
> it is blocked until the data is available, and the cachemanager has
> returned a new lease. However when the application is done and closes
> the file before everything has been fetched, the ongoing fetch can be
> aborted.
>
> In this model, we can keep streaming data into the container files as
> efficiently as possible, while at the same time allowing some early
> access to the containerfile. One of the big problems is that most
> applications don't handle read/write errors very well, so an interrupted
> transfer (disconnection) would lead to silently truncated files. Mostly
> due to user `error', when someone opens a file in an editor, makes some
> changes but doesn't notice the end of the file was lost and saves it back.
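
The lease model described above could be sketched roughly like this. It is only a toy model in Python, not Coda code: the class name LeasedFile, the 8KB chunk size, and the in-memory "remote" buffer are all assumptions made for illustration, and where a real cache manager would block the reader while fetching in the background, this sketch simply fetches the missing chunks synchronously.

```python
class LeasedFile:
    """Toy model of a container file that is readable before the fetch ends.

    Hypothetical sketch: 'remote' stands in for the file on the server and
    'lease' is the number of leading bytes the reader may access so far.
    """

    def __init__(self, remote: bytes, chunk: int = 8 * 1024):
        self._remote = remote
        self._chunk = chunk
        self._container = bytearray()  # local container file contents
        self.aborted = False
        self._fetch_chunk()            # "open" returns after the first chunk

    @property
    def lease(self) -> int:
        # Everything fetched so far is covered by the lease.
        return len(self._container)

    def _fetch_chunk(self) -> None:
        # Stream the next chunk from the server into the container file.
        start = len(self._container)
        self._container += self._remote[start:start + self._chunk]

    def read(self, offset: int, size: int) -> bytes:
        end = min(offset + size, len(self._remote))
        # A real implementation would block here until the cache manager
        # hands out a new lease; the sketch fetches synchronously instead.
        while self.lease < end:
            self._fetch_chunk()
        return bytes(self._container[offset:end])

    def close(self) -> None:
        # Closing before the whole file has arrived aborts the ongoing fetch.
        if self.lease < len(self._remote):
            self.aborted = True


if __name__ == "__main__":
    index = bytes(range(256)) * 1024   # 256KB stand-in for a large INDEX file
    f = LeasedFile(index)
    magic = f.read(0, 2)               # file manager peeks at the first bytes
    f.close()
    print(len(magic), f.lease, f.aborted)
```

In this sketch the file-manager case (read two bytes, then close) only ever pulls in the first chunk, and the rest of the transfer is dropped at close.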

I was thinking about a completely different solution: what about letting venus/podfuk decide at OPEN whether to download the complete file to the local system (e.g. into the "cache") or to handle each operation (read/write/seek/locking etc.) individually and map it to the remote file? This would give us control over whether file operations are executed on the "cached" file or on the remote file. It would also be a more universal approach, since it supports both your idea and mine. For example: venus decides at OPEN to access the remote file read-only but triggers the download into the cache in _parallel_; once the download is complete, it switches from the remote to the cached file.
And it would allow other projects to use the same interface, using cached operations, direct operations, or both.
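
A rough sketch of what I mean, in Python. Nothing here is an existing venus or podfuk interface: the names Mode, choose_mode, OpenFile and download_step, as well as the 1MB threshold, are made up for illustration, and the parallel download is modelled as explicit steps instead of a real background transfer.

```python
from enum import Enum


class Mode(Enum):
    CACHED = "cached"   # operations go to the local container file
    REMOTE = "remote"   # operations are mapped to the remote file


CACHE_THRESHOLD = 1024 * 1024  # hypothetical cutoff, not a Coda constant


def choose_mode(size: int, threshold: int = CACHE_THRESHOLD) -> Mode:
    """Decide at OPEN time: small files are fetched whole, large ones are not."""
    return Mode.CACHED if size <= threshold else Mode.REMOTE


class OpenFile:
    """Toy file handle that starts remote and switches to the cache later."""

    def __init__(self, remote: bytes, threshold: int = CACHE_THRESHOLD):
        self._remote = remote                      # stands in for the server copy
        self.mode = choose_mode(len(remote), threshold)
        # In CACHED mode the whole file is downloaded before OPEN returns.
        self._cache = bytearray(remote) if self.mode is Mode.CACHED else bytearray()
        self.remote_reads = 0

    def download_step(self, nbytes: int) -> None:
        """Advance the parallel download; switch modes once it is complete."""
        if self.mode is Mode.CACHED:
            return
        start = len(self._cache)
        self._cache += self._remote[start:start + nbytes]
        if len(self._cache) == len(self._remote):
            self.mode = Mode.CACHED

    def read(self, offset: int, size: int) -> bytes:
        if self.mode is Mode.CACHED:
            return bytes(self._cache[offset:offset + size])
        self.remote_reads += 1                     # one round trip per remote read
        return self._remote[offset:offset + size]


if __name__ == "__main__":
    f = OpenFile(bytes(100), threshold=10)         # "large" file: starts remote
    f.read(0, 2)
    f.download_step(60)
    f.download_step(60)
    print(f.mode, f.remote_reads)
```

The point of the design is that the caller of read() never needs to know which mode is active; only the code behind OPEN decides, and it can change its mind while the file is open.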


Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) Roland.Mainz_at_informatik.med.uni-giessen.de
  \__\/\/__/  gisburn_at_informatik.med.uni-giessen.de
  /O /==\ O\  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
 (;O/ \/ \O;) TEL +49 641 99-13193 FAX +49 641 99-41359
Received on 2000-10-17 12:41:14