Coda File System

Re: RVMLIB_ASSERT and RVM data sizes.

From: Chris Shiels <chris_at_taglab.com>
Date: Thu, 06 Jun 2002 19:26:33 +0100
Hi Jan.


Firstly, thank you for the very prompt reply to my email - much
appreciated.


When we did our install we were lazy and went with the 5.3.19 RPMs.  Will
it be OK to just build and then deploy the /usr/sbin/codasrv binary and
run it with the files from the 5.3.19 RPMs, or are there additional files
or issues that I need to consider?

[ I'm hoping (fingers crossed) that it will be as simple as deploying the
new /usr/sbin/codasrv and keeping my existing installations. ]


Also, a few things I'm not clear on:

We're currently running 30M(log)/500M(data) and I've found that I can't go
to 30M(log)/1G(data).  Are there any other RVM data sizes available above
500M?  If so, how are the sizes calculated?

[ The workaround each time I've hit this problem has been to use
'norton-reinit ... -dump', then vice-setup-rvm with bigger sizes, then
'norton-reinit ... -load'.  It would be good to know if there is a size we
can use the next time we need to work around this problem. ]


Best Regards,
Chris Shiels.

Senior Systems Architect
Taglab Ltd.



Jan Harkes wrote:

>On Thu, Jun 06, 2002 at 04:44:51PM +0100, Chris Shiels wrote:
>
>>We're consistently running into problems with our codasrv processes dying
>>with the following error message in /vice/srv/SrvErr:
>>
>>kossoff# tail SrvErr
>>   RVMLIB_ASSERT: error in rvmlib_malloc
>>
>
>I know what it is...
>
>Basically this is caused by fragmentation of RVM, aggravated by a poor
>judgement change I made that went into 5.3.18 or .19.
>
>The quick fix is already committed in CVS; it is a simple one-line
>change to the vnode allocation stride in coda-src/vice/srv.cc. I'm still
>working on the necessary cleanups to do the real fix, which is to change
>the ever-growing vnode array to a fixed-size hashtable.
>
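A rough sketch of what the stride change amounts to (the names and the
stride value below are invented, and plain malloc/free stand in for
rvmlib_malloc/rvmlib_free, so this is not the actual srv.cc code):

    /* Illustrative only: hypothetical names, plain malloc/free standing
     * in for the RVM allocator. */
    #include <stdlib.h>
    #include <string.h>

    #define VNODE_STRIDE 2048          /* grow in larger steps than before */

    struct vnode;                      /* opaque for this sketch */

    struct vnode_index {
        struct vnode **slots;          /* the ever growing pointer array */
        int            nslots;
    };

    /* Every grow allocates a bigger array, copies the old one and frees
     * it; the freed copy is exactly the hole that fragments RVM.  A
     * larger stride makes this happen far less often. */
    static int grow_vnode_index(struct vnode_index *vi)
    {
        int new_n = vi->nslots + VNODE_STRIDE;
        struct vnode **new_slots = calloc(new_n, sizeof(*new_slots));
        if (!new_slots)
            return -1;
        if (vi->nslots)
            memcpy(new_slots, vi->slots, vi->nslots * sizeof(*new_slots));
        free(vi->slots);
        vi->slots  = new_slots;
        vi->nslots = new_n;
        return 0;
    }
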
>>We've now reached what seems to be the maximum RVM log and data sizes 
>>available on our platform and we are unable to detect when this will
>>happen next or resize to higher values as none seem available.  Can
>>you please help with this?
>>
>
>If you check out and build the CVS version you should be able to store
>about 4 to 8 times the number of files in RVM before this hits you
>again.
>
>>Incidentally we don't think we're storing that much data - each volume
>>contains approx. 15000 files for a total size of approx. 180MB per volume.
>>
>
>I found that I typically hit the limit around 30K files in a volume on a
>server with 200MB of RVM data. So 15000 does seem rather low.
>
>>With 2M(log)/90M(data) we'd see the 'RVMLIB_ASSERT: error in rvmlib_malloc'
>>error whilst trying to populate the first volume.
>>
>>With 20M(log)/200M(data) we'd see the 'RVMLIB_ASSERT: error in rvmlib_malloc'
>>error whilst trying to populate the fourth volume.
>>
>
>Log size doesn't really matter, except for the defragmentation step that
>sometimes seems to succeed when RVM is exhausted. In a way these numbers
>seem to match my experimental data pretty well. By using several volumes
>instead of one you've managed to store about 60000 files before hitting
>the severely fragmented case, which is about twice what I got with a
>single volume.
>
>>I'm guessing the RVMLIB_ASSERT error is being caused by filling up all
>>available space in the RVM log or data.  Is this correct?
>>
>
>Yes, (and no). You didn't really run out of space, but there isn't a
>large enough consecutive chunk to satisfy the allocation. A volume
>consists of an array of pointers to vnodes (file objects) and the actual
>vnodes.
>
>    AAVVVV
>
>When the array is filled, it is resized and new vnodes can be allocated.
>
>    AAVVVVAAAA     (new array allocated, old data is copied)
>    __VVVVAAAA     (old array is freed)
>    __VVVVAAAAVVVV (new vnodes are allocated)
>
>This is repeated, and gaps start to appear:
>
>    __VVVV____VVVVAAAAAAVVVV
>
>
>Defragmenting doesn't work because there are vnodes on either side of
>the gaps. The way both kinds of RVM allocation are satisfied, combined
>with the allocation pool that is used to speed up vnode allocation and
>freeing, simply leads to a lot of unused space in RVM.
>
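To make the failure mode above concrete, here is a toy bump-allocator
simulation of that growth pattern ('A' = live array cells, 'V' = live
vnodes, '_' = space freed when an old array copy is discarded).  The real
RDS allocator does reuse freed chunks, but as described above the holes
end up too small and scattered; the toy below simply never reuses them so
the pattern stays visible:

    /* Toy simulation of the array-grows-then-frees pattern; nothing
     * here reflects how RDS actually manages its heap. */
    #include <stdio.h>
    #include <string.h>

    #define HEAP 64

    static char heap[HEAP + 1];
    static int  next = 0;                       /* bump pointer */

    static int toy_alloc(int n, char tag)
    {
        if (next + n > HEAP)
            return -1;                          /* rvmlib_malloc would assert */
        memset(heap + next, tag, n);
        next += n;
        return next - n;
    }

    int main(void)
    {
        memset(heap, '_', HEAP);
        heap[HEAP] = '\0';

        int arr_off = toy_alloc(2, 'A');        /* initial vnode array */
        int arr_len = 2;

        for (;;) {
            toy_alloc(4, 'V');                  /* vnodes fill the array */
            int new_len = arr_len * 2;
            int new_off = toy_alloc(new_len, 'A');   /* grow the array */
            if (new_off < 0) {
                printf("grow to %2d failed: %s\n", new_len, heap);
                break;                          /* lots of '_' left, but unusable */
            }
            memset(heap + arr_off, '_', arr_len);    /* free the old copy */
            arr_off = new_off;
            arr_len = new_len;
            printf("after grow to %2d: %s\n", new_len, heap);
        }
        return 0;
    }

The last line it prints shows an allocation failing while most of the toy
heap is still '_', which is essentially the picture in the diagram above.
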
>>According to the announcement for 5.3.19 this is done by 'volutil printstats'
>>but I just can't see this information.
>>
>
>After printstats is run, it should be at the end of /vice/srv/SrvLog
>with the header 'RDS statistics'.
>
>>   release_segment unmap failed RVM_ENOT_MAPPED
>>   rds_zap_heap completed successfully.
>>   rvm_terminate succeeded.
>>
>>Whilst the rdsinit was running I could see it had been mapped at 0x50000000
>>by looking at /proc/pid/maps.  Additionally, strace -p pid indicated that
>>the call to munmap() was successful with exit status 0.
>>What's going wrong here?
>>
>
>Possibly colliding with shared libraries or something. We did have an
>extremely large server running on NetBSD, which has a slightly different
>initial offset for RVM. It might even have been statically linked to
>create a more compact binary and to avoid shared library issues.
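For what it's worth, one way to check whether that fixed range is still
free on a particular machine is to pass the address to mmap() as a hint
(no MAP_FIXED) and see whether the kernel actually honours it.  This is
just a standalone test using the 0x50000000 address from the report
above, not anything from the Coda sources, and on Linux the hint is
normally only honoured when the range is unoccupied:

    /* Standalone check: can we get an anonymous mapping at the address
     * RVM wants?  If the kernel returns a different address, something
     * (e.g. a shared library) already occupies that range. */
    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        void  *want = (void *)0x50000000;
        size_t len  = 500UL * 1024 * 1024;      /* roughly the RVM data size */

        void *got = mmap(want, len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (got == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        if (got != want)
            printf("0x50000000 is taken, kernel mapped %p instead\n", got);
        else
            printf("got the requested range at %p\n", got);

        munmap(got, len);
        return 0;
    }
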
>
>Jan
>
Received on 2002-06-06 14:28:31