Coda File System

From: Jan Harkes <jaharkes_at_cs.cmu.edu> Date: Thu, 6 Jun 2002 12:35:34 -0400

On Thu, Jun 06, 2002 at 04:44:51PM +0100, Chris Shiels wrote:
> We're consistently running into problems with our codasrv processes dying
> with the following error message in /vice/srv/SrvErr:
> 
> kossoff# tail SrvErr
>    RVMLIB_ASSERT: error in rvmlib_malloc

I know what it is...

It is basically caused by fragmentation of RVM, aggravated by a poorly
judged change I made that went into 5.3.18 or 5.3.19.

The quick fix is already committed in CVS; it is a simple one-line
change to the vnode allocation stride in coda-src/vice/srv.cc. I am still
working on the cleanups needed for the real fix, which is to replace
the ever-growing vnode array with a fixed-size hashtable.
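
To illustrate why a fixed-size hashtable sidesteps the problem, here is a
minimal sketch in Python. It is not the Coda code: the class name, the
bucket count and the method names are all hypothetical. The point is only
that the bucket array is allocated once and never grows, so adding vnodes
never needs a larger contiguous chunk.

```python
class VnodeTable:
    """Hypothetical fixed-size hashtable keyed by vnode number."""

    def __init__(self, buckets=1024):
        # Allocated once and never resized, unlike a growing vnode array.
        self.buckets = [[] for _ in range(buckets)]

    def _chain(self, vnode_number):
        return self.buckets[vnode_number % len(self.buckets)]

    def insert(self, vnode_number, vnode):
        chain = self._chain(vnode_number)
        for i, (number, _) in enumerate(chain):
            if number == vnode_number:
                chain[i] = (vnode_number, vnode)  # replace existing entry
                return
        # In a real server each chain entry would be a small fixed-size
        # node; a Python list stands in for that linked chain here.
        chain.append((vnode_number, vnode))

    def lookup(self, vnode_number):
        for number, vnode in self._chain(vnode_number):
            if number == vnode_number:
                return vnode
        return None
```

Lookups get slower as the chains grow, but no allocation ever has to be
bigger than a single entry.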

> We've now reached what seems to be the maximum RVM log and data sizes 
> available on our platform and we are unable to detect when this will
> happen next or resize to higher values as none seem available.  Can
> you please help with this?

If you check out and build the CVS version you should be able to store
about 4 to 8 times the number of files in RVM before this hits you
again.

> Incidentally we don't think we're storing that much data - each volume
> contains approx. 15000 files for a total size of approx. 180Mb per volume.

I found that I typically hit the limit around 30K files in a volume on a
server with 200MB of RVM data. So 15000 does seem rather low.

> With 2M(log)/90M(data) we'd see the 'RVMLIB_ASSERT: error in rvmlib_malloc'
> error whilst trying to populate the first volume.
> 
> With 20M(log)/200M(data) we'd see the 'RVMLIB_ASSERT: error in rvmlib_malloc'
> error whilst trying to populate the fourth volume.

Log size doesn't really matter, except for the defragmentation step that
sometimes seems to succeed when RVM is exhausted. In a way these numbers
seem to match my experimental data pretty well. By using several volumes
instead of one you've managed to store about 60000 files before hitting
the severely fragmented case, which is about twice what I got with a
single volume.

> I'm guessing the RVMLIB_ASSERT error is being caused by filling up all
> available space in the RVM log or data.  Is this correct?

Yes (and no). You didn't really run out of space, but there isn't a
large enough contiguous chunk to satisfy the allocation. A volume
consists of an array of pointers to vnodes (file objects) and the actual
vnodes.

    AAVVVV

When the array is filled, it is resized and new vnodes can be allocated.

    AAVVVVAAAA     (new array allocated, old data is copied)
    __VVVVAAAA     (old array is freed)
    __VVVVAAAAVVVV (new vnodes are allocated)

This is repeated, and gaps start to appear:

    __VVVV____VVVVAAAAAAVVVV

Defragmenting doesn't work because the vnodes sit on either side of each
gap. The way both kinds of RVM allocation are satisfied, together with
the allocation pool that is used to speed up vnode allocation and
freeing, simply leads to a lot of unused space in RVM.
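
The pattern in the diagrams can be reproduced with a small toy model in
Python. This is only a sketch under simplifying assumptions, not the RDS
allocator: the heap is a row of cells, the pointer array is placed
first-fit, and vnode chunks are always carved from the end of the used
region (standing in for the allocation pool). The function names, heap
size, stride and vnode size are made up for illustration.

```python
FREE = "_"


def first_fit(heap, size):
    """Return the start of the first run of `size` free cells, or -1."""
    run = 0
    for i, cell in enumerate(heap):
        run = run + 1 if cell == FREE else 0
        if run == size:
            return i - size + 1
    return -1


def largest_free_run(heap):
    best = run = 0
    for cell in heap:
        run = run + 1 if cell == FREE else 0
        best = max(best, run)
    return best


def simulate(heap_size=60, stride=2, vnode_cells=2):
    """Grow the array until an allocation fails; report the final state."""
    heap = [FREE] * heap_size
    array_at, array_len = -1, 0
    chunk = stride * vnode_cells
    while True:
        # Resize the array: allocate the new one, copy, then free the old.
        new_len = array_len + stride
        new_at = first_fit(heap, new_len)
        if new_at < 0:
            requested = new_len
            break
        heap[new_at:new_at + new_len] = ["A"] * new_len
        if array_at >= 0:
            heap[array_at:array_at + array_len] = [FREE] * array_len
        array_at, array_len = new_at, new_len
        # Assumption: new vnodes always go after the last used cell.
        end = max(i for i, c in enumerate(heap) if c != FREE) + 1
        if end + chunk > heap_size:
            requested = chunk
            break
        heap[end:end + chunk] = ["V"] * chunk
    return "".join(heap), requested, heap.count(FREE), largest_free_run(heap)


layout, requested, free_cells, largest = simulate()
print(layout)
print("failed request:", requested, "free:", free_cells, "largest run:", largest)
```

With these defaults the model stops when the array needs 12 cells: 30 of
the 60 cells are free, but the largest free run is only 10. Calling
simulate(heap_size=24) ends in the same layout as the last diagram above.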

> According to the announcement for 5.3.19 this is done by 'volutil 
> printstats' but I just can't see this information.

After printstats is run, it should be at the end of /vice/srv/SrvLog
with the header 'RDS statistics'.
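
If the log is long, something like the following Python sketch can pull
out the most recent block. It only assumes what is stated above: that the
statistics start at a line containing 'RDS statistics' and run to the end
of the log. The function names are made up, and nothing here depends on
the format of the statistics themselves.

```python
def rds_section(lines, header="RDS statistics"):
    """Return the lines from the last line containing `header` to the end."""
    start = None
    for i, line in enumerate(lines):
        if header in line:
            start = i  # keep the most recent occurrence
    return [] if start is None else lines[start:]


def print_rds_stats(path="/vice/srv/SrvLog"):
    # errors="replace" so stray bytes in the log don't abort the read.
    with open(path, errors="replace") as log:
        print("".join(rds_section(log.readlines())), end="")
```

Run 'volutil printstats' first, then call print_rds_stats() on the server.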

>    release_segment unmap failed RVM_ENOT_MAPPED
>    rds_zap_heap completed successfully.
>    rvm_terminate succeeded.
> 
> Whilst the rdsinit was running I could see it had been mapped at 0x50000000
> by looking at /proc/pid/maps.  Additionally strace -p pid indicated that the
> call to munmap() was successful with exit status 0.
> What's going wrong here?

Possibly colliding with shared libraries or something. We did have an
extremely large server running on NetBSD, which has a slightly different
initial offset for RVM. It might even have been statically linked to
create a more compact binary and to avoid shared library issues.

Jan

Re: RVMLIB_ASSERT and RVM data sizes.