Coda File System

Re: Backing up coda volumes

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Tue, 24 Feb 2004 11:56:17 -0500
On Tue, Feb 24, 2004 at 09:54:14AM -0600, Jason A. Pattie wrote:
> Jan Harkes wrote:
> | It works pretty much the same as backing up normal volumes with dump,
> | except that the Coda backups are in Coda's 'voldump' format, so
> | restoring is still a bit more work and it isn't possible to pull
> | individual files out of the dumps.
> 
> Does this mean that Amanda is actually calling the "backup-coda"
> executable that gets installed by the coda-backup .deb package?  Or is
> it doing something entirely different?

Not really. The server has an administrative 'volutil' RPC2 interface to
manage volume-related things. The backup-coda program uses this to go
through the various steps of cloning a volume, dumping the contents and
then marking the backup as successful.

The wrapper scripts for sendsize and sendbackup simply do the same
things by using the standalone volutil command. The sequence is as
follows.

Amanda first calls sendsize on all clients to get a current estimate for
the size of full and incremental backups. If the volume hasn't been
backed up in the last 4 hours the sendsize.coda wrapper automatically
locks and clones the volume to make sure we have a fresh backup.
It then uses volutil dumpestimate to get accurate sizes of a volume
dump at various incremental levels (before compression).
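
The freshness check described above could be sketched roughly as follows.
This is only an illustration, not the actual sendsize.coda code: the
function name and the way the last clone time is obtained are assumptions,
and only the 4 hour threshold comes from the description.

```python
from datetime import datetime, timedelta

# Threshold taken from the description above; everything else is hypothetical.
MAX_CLONE_AGE = timedelta(hours=4)

def needs_fresh_clone(last_clone, now, max_age=MAX_CLONE_AGE):
    """Return True when the backup clone is missing or older than max_age.

    last_clone is the time of the most recent clone (None if there is none),
    however the wrapper happens to track it.
    """
    if last_clone is None:
        return True
    return now - last_clone > max_age
```

A wrapper would lock and clone the volume only when this returns True, and
ask for the dump size estimates afterwards in either case.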

Amanda then internally plans which volumes to back up at what level and
remotely calls sendbackup on all the Amanda clients. Again, if the
backup type is 'coda' we catch the request in our sendbackup.coda
wrapper, locate the correct backup volume and use volutil dump to stream
the data (optionally through gzip) back to the Amanda server. If the
'record' flag was set we then mark the backup as successful.
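
As a rough sketch of that streaming step (not the actual sendbackup.coda
wrapper), the following copies the output of some dump command to an output
stream, optionally through gzip, and reports whether the backup should then
be marked as successful. The function name and parameters are hypothetical,
and the dump command is passed in by the caller rather than spelled out
here, since the exact volutil dump arguments are not given above.

```python
import gzip
import subprocess

def stream_dump(dump_cmd, out, compress=True, record=False):
    """Run dump_cmd and copy its stdout to out (gzip-compressed if asked).

    Returns True only when the dump succeeded and 'record' was set, i.e.
    when the caller should go on to mark the backup as successful.
    """
    proc = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    sink = gzip.GzipFile(fileobj=out, mode="wb") if compress else out
    # Copy in fixed-size chunks so a large dump is never held in memory.
    for chunk in iter(lambda: proc.stdout.read(65536), b""):
        sink.write(chunk)
    if compress:
        sink.close()  # writes the gzip trailer; does not close 'out'
    proc.stdout.close()
    return proc.wait() == 0 and record
```

Tying the "mark as successful" decision to the exit status of the dump is a
design choice of this sketch; the description above only says the mark
depends on the 'record' flag.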

The real difference is that Amanda gets to schedule when full dumps are
made (instead of relying on a fixed day of the week). So for archival a
separate Amanda configuration is necessary which only makes full dumps
and doesn't record the fact that backups occurred.

The other difference is that backup-coda only makes full and level-1
incremental backups. Amanda will actually make dumps at any level that
it deems optimal and these have to be merged before we can restore.

Restoring a Coda volume is a bit different:

    Let's say we want Friday's backup restored, but that day there was
    a level-2 incremental. So we pull this level-2 off the tape. We then
    have to look back in the log to find the most recent level-1 before
    this level-2 backup, which might have been on Wednesday. So we pull
    Wednesday's backup off tape and then find the level-0 (full) against
    which Wednesday's level-1 was made. This might have been Monday.

    So now we have a full dump and a level-1 against this full dump. We
    merge these first:
	merge wednesday-0 monday-0 wednesday-1

    And then we apply the level-2 against that:
	merge friday-0 wednesday-0 friday-2

    Finally we can restore Friday's full dump:
	cat friday-0 | volutil -h <servername> restore /vicepa <volumename>

    (catting to stdin is a sneaky way to get around a 2GB file limit
    in RPC2, it prevents sftp from seeking to the wrong offset)
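
The walk back through the log and the resulting merge order can be sketched
like this. It is only an illustration under stated assumptions: the Dump
record and both function names are hypothetical, the log is assumed to be a
list ordered oldest first, and an incremental is assumed to be made against
the most recent earlier dump of a lower level (the usual dump convention).
The file naming and the argument order of merge follow the example above.

```python
from collections import namedtuple

# Hypothetical log record: a label for the day and the dump level.
Dump = namedtuple("Dump", "label level")

def restore_chain(log, target_label):
    """Return the dumps needed to rebuild target_label, full dump first.

    Assumes log is ordered oldest first and that each incremental was made
    against the most recent earlier dump of a lower level.
    """
    idx = next(i for i, d in enumerate(log) if d.label == target_label)
    chain = [log[idx]]
    while chain[0].level > 0:
        need = chain[0].level
        for i in range(idx - 1, -1, -1):
            if log[i].level < need:
                idx = i
                chain.insert(0, log[i])
                break
        else:
            raise ValueError("no base dump found for %s" % chain[0].label)
    return chain

def merge_commands(chain):
    """Build 'merge <output> <full> <incremental>' steps, oldest first."""
    cmds = []
    base = "%s-0" % chain[0].label
    for d in chain[1:]:
        out = "%s-0" % d.label
        cmds.append(["merge", out, base, "%s-%d" % (d.label, d.level)])
        base = out
    return cmds
```

For a log with a full dump on monday, level-1 dumps on tuesday and
wednesday, and level-2 dumps on thursday and friday, asking for friday gives
the chain monday, wednesday, friday and the same two merge steps as in the
example above.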

This restored volume is read-only and not replicated. There is probably
some hidden flag that turns it into a read-write volume replica that can
be used to bring a full-blown replicated volume back to life, but I
haven't figured out which bits to flip.

The merge tool might be part of the coda-backup package, I'm not sure.
And at the moment it probably doesn't really protect you against
applying an incremental to the wrong full dump, which could lead to a
corrupted full dump.

Clearly it is not completely trivial, but it does work.

Jan
Received on 2004-02-24 11:57:54