Coda File System

coda & reasonable cache sizes

From: <shivers_at_ccs.neu.edu>
Date: Wed, 2 May 2007 09:24:41 -0400 (EDT)
> 10GB is small, if you use the proper unit of measurement: it's $4 worth of
> storage. I have a little over a terabyte of disk on my personal office
> computer, so dedicating 10GB to my cache is trivial.

In fact, if you think about it, you can make an argument that *most* of your
disk-drive capacity should be given over to a coda cache.

First, coda is a global file system, like AFS. The amount of data "out there"
dwarfs the amount of data that lives locally on your computer. So your disk
blocks should be split the same way: most of them caching the files "out
there," with only a small share left for local data.

Second, data stored in coda should be safer than data stored on your local
drive: it can be kept in geographically distinct, replicated servers that live
in real machine rooms, with conditioned power, fat pipes to the net,
industrial air-conditioning and attentive acolytes seeing to the needs of the
hardware. Your office or home workstation does not have it so good. So the
only data that *ought* to live privately on your computer is the operating
system and possibly /usr -- that is, stuff that came off an install CD and is
completely replaceable in the event of a hardware failure.

These tiny caches that people use for coda (and that the install scripts tend
to set up) are not in tune with the 2007 state of the world. Coda *ought* to be
able to handle a 1TB or larger cache. 1TB is *one* hard drive today; that's
not *serious* storage, just *reasonable* storage.
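
To put rough numbers on that, here is a quick back-of-the-envelope sketch. The
only figure taken from this thread is the price ($4 for 10GB); the function
names, the decimal definition of a gigabyte, and the 1TB disk are assumptions
made for illustration, not anything coda itself defines.

```python
# Back-of-the-envelope cache sizing.
# Assumption: storage costs $4 per 10GB, the figure quoted in this thread.
# Assumption: decimal units (1GB = 10**9 bytes), as drive vendors count.

GB = 10**9
TB = 1000 * GB
PRICE_PER_GB = 4.0 / 10  # dollars per GB


def cache_cost_dollars(cache_bytes: int) -> float:
    """Cost of the disk space behind a cache of the given size."""
    return cache_bytes / GB * PRICE_PER_GB


def cache_fraction(cache_bytes: int, disk_bytes: int) -> float:
    """Fraction of a disk that a cache of the given size would occupy."""
    return cache_bytes / disk_bytes


# Hypothetical cache sizes, compared against a single 1TB drive.
DISK = 1 * TB
for label, size in [("10GB", 10 * GB), ("300GB", 300 * GB), ("1TB", 1 * TB)]:
    cost = cache_cost_dollars(size)
    share = cache_fraction(size, DISK)
    print(f"{label:>6}: ${cost:,.2f}, {share:.1%} of a 1TB disk")
```

At that price a 10GB cache is a rounding error on one drive, which is the
point: the cache size is limited by what the client can handle, not by what the
disk costs.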

Hey, just my music collection is 300GB -- I abandoned lossy compression around
1998. If I owned a TV, we'd be talking serious storage -- most of my
technically proficient friends buffer Netflix movies on their hard drives and
use MythTV to do TiVo-like capture of their favorite TV series off of cable.
Coda *should* be perfect for a large media store, because that data is
typically read-only.

Note that if coda were rock solid and scaled like this, then you would never
have to waste disk space setting up RAID arrays on every disk system in your
life -- you'd only need to RAID your coda *server*, while all your client
boxes could run un-RAIDed.
    -Olin
Received on 2007-05-02 09:25:30