Coda File System

Re: [dhowells@redhat.com: [PATCH] CacheFS - general filesystem cache]

From: David Howells <dhowells_at_redhat.com>
Date: Wed, 01 Sep 2004 06:45:30 -0400
> > I am not sure about persistency across reboots. Also it assumes that the
> > cache is completely managed by an in-kernel filesystem. So we would
> > need a lot of hooks and changes before venus can put anything in there.

Not necessarily. You should be able to do it relatively easily from within the
kernel. It needs you to declare your indexes (fill_super/put_super) and files
(iget/clear_inode), and to make calls from readpage() and writepage() and
releasepage(). Coda would then own its own pages, which would be backed by
CacheFS; CacheFS reads/writes directly from/to the netfs's pages.

I presume Coda loads a whole file at a time into its cache, messes around with
it and writes the whole thing back? I could support that; and, in fact, I
probably need to for my AFS client.

In my AFS client, I want to add an ioctl() that allows you to pull an entire
file into the cache. I'll do it by pinning the inode, then calling
do_generic_mapping_read() to pull the whole file into the cache (readpage()
invokes cachefs).

I also want to add cache pinning (which is implicit anyway for any active
inode). I'd probably also have to add the ability to build a cache file from
userspace, but Linus is interested in that because he'd like to see some tmpfs
type thing on top of it.

> > It also makes its own decisions on when to purge objects from the cache,
> > probably not such a big problem.

Like I said, I want to add hard cache-pinning anyway, and you get implicit
cache pinning on active objects. But since you'd be sharing the cache with
AFS, NFS or whatever else, it'd probably be best to leave that to CacheFS.

This can even be done in userspace. Go into wherever the cache is mounted and
you can use unlink and rmdir to eject files and indexes from the cache.
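
As a rough illustration of the userspace route, ejecting from the cache could be wrapped in two small helpers. This is only a sketch: the helper names are made up for this example, the paths passed in are whatever lives under the cache mount point, and whether rmdir() succeeds on an index that still has entries is not stated here, so the code simply reports what the kernel returns.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: eject one cached file by unlinking its entry
 * under the cache mount. Returns 0 on success, -1 on failure. */
int eject_cache_file(const char *path)
{
    assert(path != NULL);
    if (unlink(path) != 0) {
        perror(path);   /* report why the kernel refused */
        return -1;
    }
    return 0;
}

/* Hypothetical helper: eject an index, which appears as a directory
 * under the cache mount. Returns 0 on success, -1 on failure. */
int eject_cache_index(const char *path)
{
    assert(path != NULL);
    if (rmdir(path) != 0) {
        perror(path);
        return -1;
    }
    return 0;
}
```

From a shell the same thing is just rm and rmdir on the entries in the mounted cache.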

> > I don't know how it deals with writes, it could very well be write-through
> > only and assumes that anything in the cache can be thrown out at any time
> > without losing data.

I have a plan to make a write-back journal. This would allow a filesystem to
record an extent of a cache file that needs writing back. This journal would
be "replayed" when the cache inode is "opened".

Having a mark in the write-back journal would pin an inode in the cache.
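
To make the idea concrete, here is a toy userspace model of such a journal. It is not CacheFS code and none of these names exist in the patch; the struct layout, the fixed slot count and the callback signature are all assumptions made for the sketch. It shows the three behaviours described above: recording an extent, an inode staying pinned while it has a mark, and replaying the marks when the inode is opened.

```c
#include <assert.h>
#include <stddef.h>

#define WB_JOURNAL_SLOTS 16          /* arbitrary size for the sketch */

struct wb_mark {
    unsigned long ino;               /* cache inode the extent belongs to */
    unsigned long start;             /* first byte needing write-back */
    unsigned long len;               /* extent length in bytes */
    int in_use;
};

struct wb_journal {
    struct wb_mark slots[WB_JOURNAL_SLOTS];
};

/* Record that [start, start+len) of inode ino needs writing back.
 * Returns 0 on success, -1 if the journal is full. */
int wb_record(struct wb_journal *j, unsigned long ino,
              unsigned long start, unsigned long len)
{
    assert(j != NULL);
    for (size_t i = 0; i < WB_JOURNAL_SLOTS; i++) {
        if (!j->slots[i].in_use) {
            j->slots[i] = (struct wb_mark){ ino, start, len, 1 };
            return 0;
        }
    }
    return -1;
}

/* An inode counts as pinned while any mark for it is in the journal. */
int wb_is_pinned(const struct wb_journal *j, unsigned long ino)
{
    assert(j != NULL);
    for (size_t i = 0; i < WB_JOURNAL_SLOTS; i++)
        if (j->slots[i].in_use && j->slots[i].ino == ino)
            return 1;
    return 0;
}

/* Callback supplied by the filesystem; returns 0 once the extent is safe. */
typedef int (*wb_write_fn)(const struct wb_mark *m, void *ctx);

/* "Replay" on open: offer each pending extent of ino to the callback and
 * drop the mark only if it succeeds. Returns the number written back. */
int wb_replay_on_open(struct wb_journal *j, unsigned long ino,
                      wb_write_fn write_back, void *ctx)
{
    int done = 0;
    assert(j != NULL && write_back != NULL);
    for (size_t i = 0; i < WB_JOURNAL_SLOTS; i++) {
        struct wb_mark *m = &j->slots[i];
        if (!m->in_use || m->ino != ino)
            continue;
        if (write_back(m, ctx) != 0)
            continue;                /* mark stays, inode stays pinned */
        m->in_use = 0;
        done++;
    }
    return done;
}

/* Demo callback: just totals the bytes that would be written back. */
int wb_sum_bytes(const struct wb_mark *m, void *ctx)
{
    *(unsigned long *)ctx += m->len;
    return 0;
}
```

A real journal would live on disk and survive a crash; the in-memory array here only models the bookkeeping.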

> Cachefs is persistent across reboots. As for the other questions, I'm
> guessing dhowells has a better answer than I.

Cachefs is persistent across reboots, yes; it's also journalled to make sure
the cache doesn't get into a bad state.

> > It was probably designed for an NFS-like client that is completely in
> > the kernel and caches on a page granularity. The last time I checked
> > David Howells' AFS client was an NFS client that happened to use
> > AFS-compatible rpc calls to talk to the server. But that was a while ago
> > so I'm not sure if it has changed much since then.

It's not an NFS client; it's an AFS client. I do deal with files on a
page-by-page granularity, but the filesystem you're storing your cache on
probably does too.

> > In any case, I believe even if it is useful, we will still need our own
> > cache for the metadata and directories

I can do some of that. CacheFS doesn't care what data resides in a cache file,
just that it's fed in pages. It also doesn't care what index structure you
select, as long as the index entry key matching function you provide knows
what you're looking for. You can also store quite a bit of auxiliary metadata
in an index entry (index entries have to fit into journal entries along with
the journal header, so you get on the order of 400 or so bytes to play with).
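
A toy userspace model of that arrangement might look like the following. None of it is the CacheFS interface: the struct names, the 400-byte limit (taken loosely from the "400 or so bytes" figure) and the demo key of volume plus vnode number are all assumptions for illustration. The point is only that the cache stores opaque bytes and the netfs supplies the function that decides whether an entry matches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ENTRY_PAYLOAD_MAX 400   /* assumption: "400 or so bytes" */

/* The cache sees an entry as opaque bytes; the netfs defines the layout
 * (key first, auxiliary metadata after it, in this sketch). */
struct index_entry {
    unsigned char data[ENTRY_PAYLOAD_MAX];
    size_t used;
};

/* Netfs-provided matcher: nonzero if entry e is the one target names. */
typedef int (*match_fn)(const void *target, const struct index_entry *e);

/* Store a netfs record in an entry; fails if it will not fit. */
int entry_fill(struct index_entry *e, const void *rec, size_t len)
{
    assert(e != NULL && rec != NULL);
    if (len > ENTRY_PAYLOAD_MAX)
        return -1;
    memcpy(e->data, rec, len);
    e->used = len;
    return 0;
}

/* Scan an index using only the netfs's matcher; NULL if nothing matches. */
const struct index_entry *index_lookup(const struct index_entry *tbl,
                                       size_t n, const void *target,
                                       match_fn match)
{
    assert(tbl != NULL && match != NULL);
    for (size_t i = 0; i < n; i++)
        if (match(target, &tbl[i]))
            return &tbl[i];
    return NULL;
}

/* Hypothetical netfs layout: a two-part key plus one auxiliary field. */
struct demo_key    { unsigned volume; unsigned vnode; };
struct demo_record { struct demo_key key; unsigned data_version; };

int demo_match(const void *target, const struct index_entry *e)
{
    const struct demo_key *k = target;
    struct demo_record r;

    if (e->used != sizeof r)
        return 0;
    memcpy(&r, e->data, sizeof r);   /* copy out to avoid alignment issues */
    return r.key.volume == k->volume && r.key.vnode == k->vnode;
}
```

The auxiliary field (a data version here) rides along in the entry and comes back on lookup, which is the sort of thing a netfs could use to check whether cached contents are still current.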

> > and probably for local mutations that are waiting to be reintegrated.

As I said above, I'm also going to need to implement a write-back journal.

> My initial reaction is the
> existing cachefs support for files with holes could be used with a
> couple of hooks to request more of the file from venus.

If you can supply page-sized reads, then do_generic_mapping_read() will do the
trick.

Tell me what you'd like to be able to store in CacheFS, and I'll see what I
can do to accommodate you.

David
Received on 2004-09-01 09:29:41