Coda File System

Re: volutil getvolumelist

From: Jan Harkes <jaharkes_at_cs.cmu.edu>
Date: Mon, 30 Oct 2000 14:25:02 -0500
On Mon, Oct 30, 2000 at 06:38:14PM +0000, Barney Sowood wrote:
> Hi,
> 
> I'm suffering the same problem with volutil getvolumelist returning
> corrupted characters as described by Stephan Koledin
> (http://www.coda.cs.cmu.edu/maillists/codalist/0687.html). After I add
> a certain number of volumes, the last one starts getting corrupted and I
> have problems creating replicated volumes. Anyone have any ideas?
> Stephan, did you manage to resolve this?

This is fixed, but I have to check whether the fix actually made it into
5.3.10. It is caused by a buggy snprintf implementation in some libcs
(it hit us on a NetBSD 1.3 machine). Argh, I found it about one week
after the release of 5.3.10.

Excuse the expletives, but I was pretty annoyed once I finally figured
out what was going wrong. Here is the patch.

========================================================================
2000/10/12 13:58:27     <jaharkes_at_chios.odyssey.cs.cmu.edu>

	getvolumelist sometimes returned a corrupted list because
	snprintf sometimes f***s up royally and returns success instead
	of -1 when the output is truncated.

------------------------------------------------------------------------
--- volume.cc   2000/09/20 14:09:23     4.44
+++ volume.cc   2000/10/16 10:15:57     4.46
@@ -526,6 +526,9 @@
                 V_parentId(vp), V_creationDate(vp), V_copyDate(vp),
                 V_backupDate(vp), volumeusage);
 
+    /* hack, snprintf sometimes f***s up and doesn't return -1 */
+    if (n >= (int)(*buflen - *offset)) n = -1;
+
     if (n == -1) {
        *buflen += 1024;
        *buf = (char *)realloc(*buf, *buflen);
@@ -553,6 +556,10 @@
 retry:
        n = snprintf(*buf + *offset, buflen - *offset, "P%s H%s T%x F%x\n",
                     part->name, ThisHost, part->totalUsable, part->free);
+
+       /* hack, snprintf sometimes f***s up and doesn't return -1 */
+       if (n >= (int)(buflen - *offset)) n = -1;
+
        if (n == -1) {
            buflen += 1024;
            *buf = (char *)realloc(*buf, buflen);
========================================================================
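
For anyone hitting the same thing elsewhere, here is a minimal sketch of
the same idea as a standalone helper. It is not code from the Coda tree:
append_fmt and MAX_BUFLEN are hypothetical names made up for this
example. It treats both return conventions as "did not fit": -1 from
older libcs, and a value greater than or equal to the available space,
which is what C99 specifies for truncated output.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical upper bound so a genuine formatting error cannot loop forever. */
#define MAX_BUFLEN (1024 * 1024)

/* Hypothetical helper, not from the Coda source: append formatted text to
 * a growable buffer at *offset, enlarging the buffer in 1024-byte steps
 * until the output fits. Returns 0 on success, -1 on failure. */
static int append_fmt(char **buf, size_t *buflen, size_t *offset,
                      const char *fmt, ...)
{
    for (;;) {
        size_t avail = *buflen - *offset;
        va_list ap;
        char *tmp;
        int n;

        va_start(ap, fmt);
        n = vsnprintf(*buf + *offset, avail, fmt, ap);
        va_end(ap);

        /* Success only if n is non-negative AND strictly smaller than the
         * space we offered; n >= avail means the output was truncated. */
        if (n >= 0 && (size_t)n < avail) {
            *offset += (size_t)n;
            return 0;
        }

        /* Either -1 (old-style truncation or error) or n >= avail (C99
         * truncation): grow the buffer and retry, up to the limit. */
        if (*buflen + 1024 > MAX_BUFLEN)
            return -1;
        tmp = realloc(*buf, *buflen + 1024);
        if (tmp == NULL)
            return -1;
        *buf = tmp;
        *buflen += 1024;
    }
}
```

A caller would start with buf = NULL, buflen = 0 and offset = 0, call
append_fmt once per line of output, and free(buf) at the end. Unlike the
patch above, the sketch keeps the old pointer when realloc fails, so the
caller can still free it.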


> Also I'm still having a problem with venus which causes it to die with
> "RVMLIB_ASSERT: error in rvmlib_malloc" if I build several thousand
> directories in swift succession.

There are some limitations associated with a large number of directories
per volume. The size of a volume resolution log is limited (the default
was 2048 entries and is now 4096) and cannot grow without
administrative intervention. Each directory has at least one resolution
log entry, so the max # of directories would be about 4095, but then
resolution doesn't have the room to work anymore.

If I remember correctly, you were automatically generating user home
directories in one volume, which is a bad idea in any case. Locking,
conflict resolution, quotas, relocation of volumes (once finished),
etc. are all on a per-volume basis, so it is much better to create a
volume for each user.

Between 5.3.8 and 5.3.9 the `volume storage groups' (VSGs) were
temporarily taken out of the clients to help the restructuring of the
volume-related code. As a result, 5.3.9 and 5.3.10 have a lot of overhead
on directories with a lot of volume mountpoints, as new connections for
each volume are set up with the servers. This should get fixed by
reintroducing VSGs in one of the next releases.

Jan
Received on 2000-10-30 14:27:33