Coda File System

CODA and High Volume Websites

From: Ray Taft <searchtraffic2005_at_yahoo.com>
Date: Mon, 14 Feb 2005 16:22:16 -0800 (PST)
I am considering a deployment of CODA for a large
website. I have a few questions that I was hoping some
CODA gurus could lend an ear to.

The details are as follows:

- The website pushes in excess of 1.2Gbps of
bandwidth to the internet.

- We use 30 FreeBSD 4.11 servers and load
balancing to balance web traffic to those servers.

- Each webserver has a 100Mbps connection to the NET,
and a private 100Mbps connection to a Storage Network.

- Each server is a P4 3.2GHz, 1GB RAM, 160GB ATA Hard
Drive (2 x 100Mbps NICs). Each one of these servers
would run the CODA client and Apache 1.3.

- Our file server is a DUAL AMD 2.4Ghz (64bit)
platform. It has 8GB RAM, 4 Intel 1Gbps Interfaces
(Trunked / Etherchannel) running SuSE 9.1. Lots of
Fast RAID 10 Storage. All 4 Gig ports are connected to
a private storage network.

- Our file operations are READ ONLY. Our webservers
only request content from the file server (CODA /
NFS). They NEVER write. They can even be mounted as
read only. 
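
To put the numbers above side by side, here is a rough
back-of-the-envelope sketch in Python. It assumes the
load balancer spreads traffic evenly across all 30 web
servers, which is an idealisation, and it treats 1.2Gbps
as the figure even though the site pushes somewhat more.
The variable names are made up for illustration.

```python
OUTBOUND_MBPS = 1200        # total traffic pushed to the internet (lower bound)
WEB_SERVERS = 30
NIC_MBPS = 100              # each web server's public NIC and storage NIC
FILE_SERVER_TRUNK_MBPS = 4000  # 4 x 1Gbps trunked interfaces

# Average outbound traffic per web server, assuming an even split.
per_server_mbps = OUTBOUND_MBPS / WEB_SERVERS

# Share of each web server's public NIC used by that traffic.
nic_utilisation = per_server_mbps / NIC_MBPS

# Most the web servers could pull from storage if every storage NIC were saturated.
max_storage_demand_mbps = WEB_SERVERS * NIC_MBPS

print(f"per web server: {per_server_mbps:.0f} Mbps")
print(f"public NIC utilisation: {nic_utilisation:.0%}")
print(f"max storage demand: {max_storage_demand_mbps} of {FILE_SERVER_TRUNK_MBPS} Mbps")
```

Without any client side caching, every byte served to a
surfer is also a byte pulled over the storage network,
so the file server has to sustain the full outbound
rate.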

Comments (and hopes)

The client side servers have a max of 160 Gigabytes
each (local ATA hard disk). The total volume of the
site is 360 Gigabytes and growing on the file server.
The web server / coda client will never have enough
storage to hold all of the files on the coda server. 
 
The reason for the smaller local storage is that not
all 360 Gigabytes are accessed frequently at any one
point in time. Roughly 200 Gigabytes of the total
would be requested on any particular day.
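
To make the sizing concrete, here is a small Python
sketch of how much of that data a single client cache
could hold. It uses the 360 and 200 Gigabyte figures
above plus the roughly 140GB of cache per client that I
mention in my questions below. The function name is
hypothetical, and it assumes every web server sees the
same daily working set, which may not hold behind a
load balancer.

```python
def cache_coverage(cache_gb, data_gb):
    """Fraction of data_gb that fits in a cache of cache_gb (capped at 1.0)."""
    return min(1.0, cache_gb / data_gb)

TOTAL_SITE_GB = 360         # total content on the file server
DAILY_WORKING_SET_GB = 200  # content requested on a typical day
CLIENT_CACHE_GB = 140       # assumed cache allocation per web server

daily = cache_coverage(CLIENT_CACHE_GB, DAILY_WORKING_SET_GB)
total = cache_coverage(CLIENT_CACHE_GB, TOTAL_SITE_GB)

print(f"daily working set that fits locally: {daily:.0%}")
print(f"whole site that fits locally: {total:.0%}")
```

Since the cache is smaller than even one day's working
set, some files will always have to be evicted and
fetched again, which is why the flushing behaviour
matters so much to us.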


Right now, we use NFS to deliver files from the file
server to the web servers and on to the surfer. This
works, but it is really killing our NFS server once
transfers get above 100Mbytes a second (1000Mbps). We
have tried squid, but it does not perform anywhere
near the level we require, nor does it have the
security functionality (authentication) we need in
Apache.


I have a few questions (and hopes):

Our hope is that by implementing CODA, the most
frequently requested files (average size is 100
Mbytes, so big files) would be cached on the client
side webserver, alleviating some load on the NFS file
server. By caching the files locally, we would also be
able to achieve higher transfer rates from the
webserver to the surfer, since each file operation is
a read from a locally cached file system rather than a
request over a backend NFS pipe.

In theory, it all seems like it should work, so before
I spend 100 hours rebuilding it, I've got a few
questions.

Questions:

How does an administrator set the cache size on the
local hard disk? How does the flushing algorithm work?
Does it have a flushing algorithm?

Will the CODA clients "tank" when all 140GB +/- of
local storage assigned to cache (assuming you can
assign a cache space on the webserver) is consumed?
Will coda flush the least requested files on some sort
of timer, keeping only frequently requested data in
the local disk cache?
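
To illustrate the kind of behaviour I am hoping for,
here is a toy Python sketch of a size-bounded cache
that evicts the least recently used file when it runs
out of room. This is a generic LRU illustration only. I
do not know whether the CODA client works this way, and
the class and method names are invented for the
example.

```python
from collections import OrderedDict

class LRUFileCache:
    """Toy size-bounded cache that evicts the least recently used file first."""

    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self.used_mb = 0
        self.files = OrderedDict()  # path -> size in MB, least recent first

    def access(self, path, size_mb):
        """Return True on a cache hit, False if the file had to be fetched."""
        if path in self.files:
            self.files.move_to_end(path)  # mark as most recently used
            return True
        if size_mb > self.capacity_mb:
            return False  # larger than the whole cache, so never cached
        # Evict least recently used files until the new one fits.
        while self.used_mb + size_mb > self.capacity_mb:
            _, evicted_mb = self.files.popitem(last=False)
            self.used_mb -= evicted_mb
        self.files[path] = size_mb
        self.used_mb += size_mb
        return False

cache = LRUFileCache(capacity_mb=300)
cache.access("/a.bin", 100)
cache.access("/b.bin", 100)
cache.access("/c.bin", 100)
cache.access("/a.bin", 100)  # hit: /a.bin becomes the most recent entry
cache.access("/d.bin", 100)  # cache is full: /b.bin is evicted to make room
print(list(cache.files))
```

The point of the sketch is that a full cache keeps
serving requests and simply drops the coldest file,
rather than failing once the disk allocation is used
up.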

When a surfer http requests a file that does not exist
in the CODA client side cache, and the coda client has
to go back to the coda server to fetch the file, does
the surfer have to wait until the entire file has
been cached on the coda client before taking delivery
(before the file transfer starts)?
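
To show why this matters to us, here is a quick Python
estimate of the delay a surfer would see on a cache
miss if the client had to fetch the whole file first. I
do not know that it does, which is exactly the
question. The sketch ignores protocol overhead and
assumes the 100Mbps storage NIC is otherwise idle, so
the real figure would be higher under load.

```python
def transfer_seconds(size_mbytes, link_mbps):
    """Seconds to move size_mbytes over a link of link_mbps, ignoring overhead."""
    return size_mbytes * 8 / link_mbps

AVG_FILE_MBYTES = 100    # average file size on the site
STORAGE_LINK_MBPS = 100  # private storage NIC on each web server

wait = transfer_seconds(AVG_FILE_MBYTES, STORAGE_LINK_MBPS)
print(f"wait before the first byte on a cache miss: {wait:.0f} seconds")
```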

Are there any other snafus you can think of that might
present themselves?

Thanks for your help in advance. Working out a
scalable architecture such as this has its challenges.
Received on 2005-02-14 19:25:56