Coda File System

Re: large volume

From: Phil Nelson <phil_at_cs.wwu.edu>
Date: Thu, 18 Apr 2002 20:14:52 -0700 (PDT)
Hi Yonatan,

2 Tb?  That is quite a bit.  :)

I have done some work with larger coda servers.  I am running
one machine that has two coda servers running on it at two
different IP addresses.  One server is serving up a 20 Gb partition
and the other one a 13 Gb partition.  Each server has 1Gb of RVM.
That allows my one machine to serve 33G of space.  I'd be serving
more space, but my disk is not big enough. :)

The documentation for servers says that RVM should be about 5% of data
served.  So we know you can run servers with 1Gb of RVM and that would
mean you should be able to serve 20Gb.  This really depends on file size.
Since meta data is stored in RVM, large files use up the server RVM
space at a much lower ratio than the 5%.  A single 1G file would take
very little space in RVM and use 1G of file space.  This size of file
is most likely not the best kind of file to store on coda.  On the
other hand, I suspect that e-mail files may be smaller than average
files so the 5% may be optimistic.  But then again, if people mail a lot
of pictures, the average size may be larger.
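
To make the 5% rule concrete, here is a small Python sketch of the
arithmetic.  The function name and the alternative ratios (2.5% and 10%)
are only illustrations I made up to show how sensitive the result is to
file size; they are not figures from the Coda documentation.

```python
def servable_gb(rvm_gb, rvm_percent=5.0):
    """Data (in GB) one server can hold if RVM must be rvm_percent of the data.

    rvm_percent=5.0 is the rule of thumb quoted above; other values are
    hypothetical stand-ins for workloads with larger or smaller files.
    """
    return rvm_gb * 100.0 / rvm_percent


if __name__ == "__main__":
    # Larger files -> lower metadata ratio -> more data per GB of RVM.
    for pct in (2.5, 5.0, 10.0):
        gb = servable_gb(1.0, pct)
        print(f"RVM at {pct}% of data: about {gb:.0f} GB served per 1 GB of RVM")
```

With the default 5%, 1 GB of RVM works out to the 20 GB mentioned above.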

So using the above method, I would think you should be able to run up to 5 
server processes on recent server class machines.  That would provide
100 Gb per machine and then you would need only 20 machines to get to
your 2 Tb.  Of course, if you have 2 Tb of data, you may want a larger
available space so you aren't too close to limits.  
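
The machine count above can be sketched like this in Python.  It assumes
2 TB means 2000 GB (as in my numbers), 5 server processes per machine,
1 GB of RVM each, and the 5% rule.  The headroom_percent parameter is a
hypothetical knob for the "larger available space" idea, not something
Coda defines.

```python
import math


def machines_needed(total_gb, servers_per_machine=5, rvm_gb_per_server=1.0,
                    rvm_percent=5.0, headroom_percent=0.0):
    """Estimate how many physical machines are needed to serve total_gb."""
    # Capacity of one machine: each server process serves RVM / (rvm_percent/100).
    gb_per_machine = servers_per_machine * rvm_gb_per_server * 100.0 / rvm_percent
    # Optionally provision extra space so the cell is not run close to its limits.
    target_gb = total_gb * (100.0 + headroom_percent) / 100.0
    return math.ceil(target_gb / gb_per_machine)


if __name__ == "__main__":
    print(machines_needed(2000))                       # the estimate above
    print(machines_needed(2000, headroom_percent=25))  # with 25% spare space
```

The first call reproduces the 20 machines; how much headroom is sensible
is a judgment call.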

I'm sure nothing like what you want to do has been done yet.

In thinking about this, I was wondering if using a statically linked
binary might allow the RVM for a single process to get as big as 2Gb.
I haven't tried this.  It may also depend on which OS your server
is running.  If you could get a 2Gb RVM mapped, that would allow
your projected size to be 40Gb per server.
  
The other question is how much replication you would want.  I would
recommend that each volume be on at least 2 different machines.  I
don't replicate between server processes running on the same physical
machine.  Using replication would double the number of machines you
need to use. 
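
Putting replication into the same back-of-the-envelope estimate, as a
Python sketch: it assumes every volume is replicated the same number of
times, 5 server processes per machine, and the 5% RVM rule.  The 2 GB RVM
case is the untested idea from above, so treat that number as speculative.

```python
import math


def machines_with_replication(total_gb, replicas=2, servers_per_machine=5,
                              rvm_gb_per_server=1.0, rvm_percent=5.0):
    """Estimate physical machines needed when each volume has `replicas` copies."""
    gb_per_machine = servers_per_machine * rvm_gb_per_server * 100.0 / rvm_percent
    # Every replica stores a full copy, so the raw space needed is multiplied.
    count = math.ceil(total_gb * replicas / gb_per_machine)
    # Replicas should sit on different physical machines, so never fewer
    # machines than replicas.
    return max(count, replicas)


if __name__ == "__main__":
    print(machines_with_replication(2000))                         # 1 GB RVM
    print(machines_with_replication(2000, rvm_gb_per_server=2.0))  # 2 GB RVM
```

With 1 GB of RVM per server, double replication takes the 20 machines to
40; if a 2 GB RVM turned out to work, it would come back down to 20.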

I'd love to play with even larger servers, but I don't currently
have the resources.

--Phil

-- 
Phil Nelson                       NetBSD: http://www.netbsd.org
e-mail: phil@cs.wwu.edu           Coda: http://www.coda.cs.cmu.edu
http://cs.wwu.edu/faculty/nelson 
Received on 2002-04-18 23:17:29