Coda File System

Re: Plan to revise documentation

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Fri, 29 Apr 2005 14:39:33 -0400
On Fri, Apr 29, 2005 at 10:14:00AM -0700, John Anderson wrote:
> >Some of these I know, but there are probably many I don't even know.
> >
> >   directory size:	256KB
> >	- this doesn't easily translate to # of files, because it
> >	  depends on the average filename size, padding, space for the
> >	  file identifiers, etc.
> >	  a ballpark figure would be between 2048 and 4096 files.
> 
> Is there any way to modify or override this limitation?  I think this was 
> the wall I was running into while trying to load all those .gif's into a 
> coda directory.

Not easily. Right now the directory format is something like a fixed
array of 128 pointers to 2048-byte blocks (or maybe 256 pointers to 1KB
blocks). It is somewhat similar to a Unix directory, but without the
indirect blocks.
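
As a rough sanity check on those numbers, here is a back-of-the-envelope
sketch. It is not taken from the Coda sources; both layouts are the
guesses given above, and the names are made up for illustration.

```python
# Back-of-the-envelope check of the directory size limit.
# Both candidate layouts are guesses, not values read from the Coda sources.
KB = 1024

layouts = {
    "128 pointers x 2048-byte blocks": (128, 2048),
    "256 pointers x 1KB blocks": (256, 1 * KB),
}


def max_dir_bytes(num_pointers, block_size):
    """Largest directory a fixed pointer array can address without indirect blocks."""
    return num_pointers * block_size


for name, (pointers, block_size) in layouts.items():
    print(f"{name}: {max_dir_bytes(pointers, block_size) // KB}KB")

# Average space per entry implied by the ballpark of 2048 to 4096 files.
limit = 256 * KB
for files in (2048, 4096):
    print(f"{files} files -> about {limit // files} bytes per entry")
```

Either layout gives the same 256KB ceiling, and the ballpark file counts
work out to an average of roughly 64 to 128 bytes per entry for the name,
padding and file identifier together.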

Scattered through the code are actually 3 different representations of
the directory data.

The first is used between the Coda client and the kernel. It is mostly a
flat file that I believe is identical to a directory on BSD FFS. I guess
the first kernel implementations didn't parse the directory contents and
let the kernel's readdir handle it. On Linux it had to be parsed from the
start because of subtle differences, but it still uses the same layout.

Because that layout has no room for FIDs or hashed lookups, it is
actually a suboptimal format. So the second, the RVM representation on
the client and the servers, is the array of pointers to blocks along
with a hash to speed up lookup-by-name.

The final format is used between the client and server: the indexing
array is dropped and all the individual blocks are coalesced into one
big chunk. This is because SFTP doesn't do scatter-gather, so this is
the only way a directory can be sent through SFTP.
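
To make the last step concrete, here is a minimal sketch of coalescing
the blocks into one buffer and rebuilding the pointer array on the other
side. It is only an illustration: the function names, the block size and
the assumption that blocks are allocated densely from the front of the
array are mine, not the actual Coda code.

```python
# Illustrative sketch only: none of these names or sizes come from the Coda sources.
BLOCK_SIZE = 2048   # assumed block size
MAX_BLOCKS = 128    # assumed length of the fixed pointer array


def coalesce(block_index):
    """Pack the allocated blocks into one contiguous buffer, dropping the index.

    block_index is a fixed-size list whose slots hold either a bytes object of
    BLOCK_SIZE bytes or None for an unused slot. Unused slots are assumed to
    sit at the end of the array, so skipping them loses no information.
    """
    used = [block for block in block_index if block is not None]
    for block in used:
        if len(block) != BLOCK_SIZE:
            raise ValueError("every block must be exactly BLOCK_SIZE bytes")
    return b"".join(used)


def rebuild(blob):
    """Split a received buffer back into blocks and rebuild the pointer array."""
    if len(blob) % BLOCK_SIZE != 0:
        raise ValueError("blob is not a whole number of blocks")
    count = len(blob) // BLOCK_SIZE
    if count > MAX_BLOCKS:
        raise ValueError("directory exceeds the fixed pointer array")
    blocks = [blob[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE] for i in range(count)]
    return blocks + [None] * (MAX_BLOCKS - count)
```

The join in coalesce() is one full copy of the directory. That is cheap
at 256KB, but the cost grows with the directory size.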

In any case, changing the basic directory structure will affect all
servers and clients. There is no simple way to make things backward or
forward compatible. The other problem is that when directories do get
significantly larger than 256KB, allocating, constructing and unpacking
the blob that is sent by SFTP might become an issue.

Ideally we'd have a single format that is stored on disk on both the
client and the servers, instead of in RVM, that can be fetched the same
way as files (i.e. no special-case packing of a directory into a memory
blob or a local tmpfile), and that the kernel module would parse into
whatever readdir needs to return to userspace.

Of course any modifications to the file would still have to be logged in
RVM to avoid inconsistencies, and some care has to be taken with how the
directory files are updated with respect to concurrent readdir and
block/page boundaries.

Jan
Received on 2005-04-29 14:40:33