Coda File System

14. Coda Source Layout

This chapter is a guide to using and modifying Coda source code at CMU-SCS. It is mainly intended to help new members of the project come up to speed. But it may also help sites outside CMU that receive a Coda distribution to devise procedures appropriate to their environments.

14.1 The Big Picture

All files relating to Coda, with the sole exception of RCS files (see Section XXX), are stored in /coda/project/coda. Let's call this $CODA. The directories under $CODA correspond to different releases of Coda, such as alpha, beta, and so on. A release is a complete set of mutually consistent sources, libraries and binaries. "Mutually consistent" means, for example, that the Venus and server from a release will work with each other. It also means that you can freely use the libraries and include files to compile new binaries in that release. There are no guarantees of any kind across releases.

Before a new instance of a release is made, careful thought must be given as to how current users of that release will be affected. Will all such users have to be upgraded in one fell swoop? Or can they switch over at their own leisure? As the deployed Coda system grows, upgrading everyone in one fell swoop gets harder. After all, some of the users may be in hiding with their disconnected laptops!

By convention, the omega release corresponds to the version of Coda in production use. Most users' Venii, as well as the servers on production machines, are from this release. Hence the value of /usr/coda/SYMLINK on those users' workstations is $CODA/omega.

How versions evolve

The alpha release of software corresponds to a lightly tested version of the system. Individual developers have done a significant amount of testing of alpha software, but it hasn't yet been stressed heavily, nor has it been tested in real use. Hence alpha server code should not be run on servers that hold real user data. It should only be run on the alpha servers (also known as "test" servers). The same applies to other alpha software. When you are doing code development, you usually pick up include files and libraries from the alpha release. Thus, the private versions of software you build are, at best, of alpha quality.

The beta release corresponds to a version of the system that is expected to be released soon for production use. A small number of users, typically members of the Coda project, depend upon this release. Actual use by such users is part of the testing of the beta release. Eventually, when we have enough confidence in the stability and correctness of the beta release, we promote it to omega.

beta is really a symlink to a volume mount point beta-<unique>, where <unique> is an identifier made up by the makebeta script described below. Upgrading the beta release consists of creating a new volume, mounting it at beta-<unique>, populating it using the steps described in Section XXX, and finally changing the beta symlink to point to beta-<unique>. The old beta volume is preserved for a while, until it is clear that the new release hasn't triggered any serious problems. It can then be purged to reclaim space, or recycled for a future beta release.

omega is also a symlink to a volume mount point beta-<unique>. After a beta release has been running robustly for a while, it is upgraded to omega by merely changing the value of the omega symlink. No recompilation is involved.
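The promotion described here amounts to re-pointing a symlink. The following is a minimal sketch of that idea, not the procedure actually used at CMU: ordinary directories in a scratch area stand in for Coda volume mount points, and the second release name is made up for illustration.

```shell
# Sketch only: plain directories stand in for volume mount points.
demo=$(mktemp -d)
cd "$demo" || exit 1

# Two hypothetical beta instances, named after the beta-<unique> pattern.
mkdir beta-3Feb1993_43696 beta-1Mar1993_51200

# omega names the older, well-tested instance; beta names the newer one.
ln -s beta-3Feb1993_43696 omega
ln -s beta-1Mar1993_51200 beta

# Promote: make omega point at the instance that beta currently names.
# Nothing is copied or recompiled; only the symlink changes.
target=$(readlink beta)
rm omega
ln -s "$target" omega

ls -l beta omega
```

Removing and recreating the link is not atomic, so a real promotion would presumably be done while no one is switching releases.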

When a beta-<unique> release is created, the makebeta script creates an RCS branch named beta-<unique> for every file in the release. Normally no development or changes are done along this branch. However, it provides a way to introduce an emergency fix in a beta or omega release if the need ever arises.

How code is developed

Let's illustrate this with a specific example. Suppose I want to upgrade the module rpc2 and build a new Venus to use the upgrade. I begin by creating two private directories, say /coda/usr/satya/src/rpc2 and /coda/usr/satya/src/venus, and populating them with source files from the alpha release of these two modules. Section XXX tells you how this step actually gets done. Now I proceed to modify files in rpc2 and to compile and test the module using standalone test programs. Then I modify files in venus to use the upgrade and then build a Venus. When building Venus, I must make sure that the version of rpc2 used is the one I just built. Section XXX tells you how to do this. Now I have a Venus that I can test. As bugs are found in the new Venus, I iterate the above procedure. When I am confident that my changes to rpc2 and venus are right, I update the alpha release of these two modules.

The process of updating the alpha release is known as installation. Installation is the point at which work done by a Coda project member becomes visible to others in the project. Prior to this point, all work is done in that user's own private directory. Installation always occurs at the granularity of entire modules. In other words, one never installs an individual library, include file, etc. Rather, the entire source code for a module and all relevant files compiled from it are installed together. Modified files are automatically checked into RCS as part of the installation procedure. At some future time, these changes along with many others that were installed will make it into a beta release. Section XXX shows you how this is done. After some use as beta, it will be promoted to omega.

14.2 Layout of Source Code

Layout of a Release

Each release has the structure shown in Figure XXX . Note that this layout is identical for all releases.

Underlined names are symlinks to machine-specific directories. On a 386 machine for example, bin is a symlink to i386_mach/bin . Although only two machine-specific directories ( pmax_mach and i386_mach ) are shown, there can be many more.

The source tree for the release is in src. A copy of the header files from src is in include. Both src and include are machine-independent. For each supported machine type, there is a directory containing binaries (bin) and libraries (lib). The source tree in a release is fully self-contained. In other words, if you started out with empty include, lib, and bin directories, you could completely populate them by compiling the source code in src. The only exceptions to this are the files in include-special, lib-special, src-special, and bin-special. These directories contain a very small number of files that have to be copied in by hand. The sources for these files are not in src.

Layout of the source directory

The src directory in each release is organized as in Figure XXX. The MAKECODA script simplifies the compilation of the entire release. The Makeconf file defines in a single spot many key variables and paths used by the makefiles in individual modules. Those makefiles inherit these definitions automatically when the CMU-SCS make is used. The SOURCEME file contains a minimal set of environment definitions. By sourcing this file before you compile anything, you can be sure that you aren't obtaining binaries, libraries etc. from non-standard places. This is especially important if your .login or .cshrc files define elaborate PATH, CPATH, LPATH variables.

Underlined names are symlinks to RCS directories. For example, RCSLINK is a symlink to /afs/cs/project/coda/rcs ; auth2/RCS is a symlink to ../RCSLINK/auth2 ; and so on. Indirecting via RCSLINK makes it simple to relocate the RCS directories without changing lots of individual symlinks. Such relocation might happen, for example, when Coda sources are used outside CMU.
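The benefit of indirecting through RCSLINK can be seen in a small experiment. This is a sketch under stated assumptions: scratch directories (rcs-old, rcs-new, src) stand in for the real RCS area and source tree, and only the auth2 module is shown.

```shell
# Sketch only: scratch directories stand in for the real source and RCS trees.
demo=$(mktemp -d)
mkdir -p "$demo/rcs-old/auth2" "$demo/rcs-new/auth2" "$demo/src/auth2"
cd "$demo/src" || exit 1

# One top-level link names the RCS area; each module links through it.
ln -s "$demo/rcs-old" RCSLINK
ln -s ../RCSLINK/auth2 auth2/RCS

# Relocating the RCS area means changing RCSLINK only.
rm RCSLINK
ln -s "$demo/rcs-new" RCSLINK

# auth2/RCS itself is untouched, but now resolves into the new area.
(cd auth2/RCS && pwd -P)
```

Because every per-module link is relative and passes through RCSLINK, the number of links to edit stays at one no matter how many modules exist.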

Layout of a typical module

The structure of a typical module is shown in Figure XXX .

In this particular module, rpc2 , the underlined name RCS indicates a symlink to ../RCSLINK/rpc2 . Indirection via RCSLINK makes relocating the RCS directories a simple matter.

Notice that only the source files are located in this directory; there are no object files. The CMU-SCS make facility puts all compilation targets elsewhere, thus allowing the source directory to be readonly. This has two advantages. First, the source directories are uncluttered. Second, it simplifies building Coda for multiple machine types, since the target directories can be different for each machine type.

Also notice the presence of multiple makefiles in this module. Although this is not characteristic of all Coda modules, it is typical of some. The true dependencies are captured in Makefile.real, and the others, like Makefile.coda and Makefile.misc, invoke Makefile.real after defining environment variables appropriately. This simplifies the use of these modules outside Coda. If you are compiling by hand, you have to say "make -f Makefile.coda <target>", rather than just "make <target>".

14.3 Module Dependencies

Most Coda modules rely on files installed by other Coda modules. It is therefore important to install modules in the correct order. Otherwise you could get yourself into real trouble. If you are compiling from scratch, the out-of-order installations will simply fail. But if you are modifying an existing release, you could end up with mysterious bugs because obsolete versions of header files and libraries may be used.

You are strongly urged to use the script MAKECODA, described in Section XXX. The correct precedences of modules are wired into the script, so you don't have to deal with them.

For the curious, the correct order of compilation of modules is given below. In principle, modules of the same precedence (i.e. in the same set) can be compiled in parallel. But I haven't actually tried that yet.

  1. scripts Miscellaneous scripts, including alphaci on which installs of all other modules rely
    • mlwp Lightweight process package
    • dir Directory package used by Venus and server
    • sys Miscellaneous routines
    • sunrpc Sun Microsystems public domain XDR code and interface for device driver/venus interaction
    • igmp Internet multicast support for old RFC (dummied out currently; someone should fix these to use the new RFC)
    • util Utility routines
    • rpc2 RPC package
    • camstuff Header files that allow runtime choice of RVM or VM for persistence on servers
    • blurb Program to adjust copyrights
    • pdbstuff Protection database management
    • rp2gen Stub generator for RPC2
    • comm Communication layer above RPC2 for connection management (not yet in use)
    • libal Access list package.
    • vicedep Header files and RPC2 interface definition files put here to break circular dependencies
    • fail Network failure and variable speed emulator
    • auth2 Authentication server
    • login Implements clog, cunlog, ctokens, etc.
    • cfs The VFS driver; most of this code is linked into the Mach kernel
    • vv Version vector routines
    • mond Coda usage data collector
    • resolve Library used by repair (should get integrated into repair )
    • vol Volume package used by server
    • res VM-based directory resolution algorithms
    • repair Repair tool
    • venus Cache manager
    • volutil Volume utilities
    • vtools Miscellaneous Venus tools
    • rvmres RVM-based directory resolution algorithms
    • vice Server code
    • update The daemon which updates server databases
    • norton A Coda server, RVM debugger
    • asr Application Specific Resolver package
    • egasr ASR examples

14.4 Building a Release

Prerequisites

To compile Coda "out of the box" you need the following compilation tools:

It should be possible to modify the code to use other versions of C++ (such as g++), or to use standard Unix make. But we haven't tried this, and don't plan to.

Special directories

Before you can compile Coda, you need to populate the {include,lib,bin,src}-special directories. These contain files that are (a) needed but aren't in the Coda sources or (b) standard Mach header files, with slight modifications. The MAKECODA script contains an up-to-date list of what these files should be. Here is a list that was current at the time of writing this document:

RVM files

RVM is a lightweight transactional package that is used on servers and clients. It is a package that is independent of Coda. The current set of files from this package are:


include-special/{rvm.h,rvm_lwp.h,rvm_statistics.h,
                rvm_segment.h,rds.h}

lib-special/{librvm.a,librvmlwp.a,libseg.a,librds.a}

bin-special/{rdsinit,rvmutl}

src-special/{Makeconf,Makefile,READ_ME,plumber,rds,
                rvm,seg}

Tracing facility files

These files pertain to a file-tracing facility, dfstrace , used at CMU. To function as a trace-driven simulator, Venus requires the following files:


include-special/tracelib.h

lib-special/libtrace.a

If you don't intend to use Venus as a simulator, you could construct a dummy libtrace.a with empty routines to avoid unresolved references. You do need tracelib.h though.

Plumber malloc

The Coda makefiles allow you to build versions of venus and codasrv that use a special malloc to help detect memory leaks. The following files are needed for this:


include-special/newplumb.h

lib-special/{libplumber.a,libnewplumb.a}

If you don't plan to build the plumbing facilities, you can just create zero-length files with these names to keep the MAKECODA script happy.
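Creating the zero-length placeholders takes only a few commands. This sketch works in a scratch directory that stands in for the root of a release; in practice you would run the loop from the release root itself.

```shell
# Sketch only: a scratch directory stands in for the root of a release.
release=$(mktemp -d)
cd "$release" || exit 1
mkdir -p include-special lib-special

# Create empty placeholders for the plumber files named above.
for f in include-special/newplumb.h \
         lib-special/libplumber.a \
         lib-special/libnewplumb.a
do
    : > "$f"
done

ls -l include-special lib-special
```

Note that `: > file` truncates an existing file, so it should not be run where real plumber files are already installed.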

Modified Mach headers

The changes in these files had to be made because of compilation errors from C++, or (as in the case of assert.h ) to define different behavior for standard macros:


include-special/{assert.h,cthreads.h,setjmp.h,sys/inode.h,
                i386/fpreg.h,i386_mach/endian.h}
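Before starting a long build it can be handy to see which of the special files listed in this section are absent. The function below is a hypothetical helper, not the check that MAKECODA performs, and its file list is simply copied from the lists above (MAKECODA holds the authoritative list).

```shell
# Sketch only: report which of the special files listed above are absent
# under a release root. The function name is hypothetical.
missing_special_files() (
    root=$1
    for f in \
        include-special/rvm.h include-special/rvm_lwp.h \
        include-special/rvm_statistics.h include-special/rvm_segment.h \
        include-special/rds.h include-special/tracelib.h \
        include-special/newplumb.h include-special/assert.h \
        include-special/cthreads.h include-special/setjmp.h \
        include-special/sys/inode.h include-special/i386/fpreg.h \
        include-special/i386_mach/endian.h \
        lib-special/librvm.a lib-special/librvmlwp.a \
        lib-special/libseg.a lib-special/librds.a \
        lib-special/libtrace.a lib-special/libplumber.a \
        lib-special/libnewplumb.a \
        bin-special/rdsinit bin-special/rvmutl \
        src-special/Makeconf src-special/Makefile src-special/READ_ME \
        src-special/plumber src-special/rds src-special/rvm src-special/seg
    do
        # -e accepts both plain files and directories (e.g. src-special/rvm).
        [ -e "$root/$f" ] || echo "$f"
    done
    true
)

# Example: list what is missing under a scratch release root.
missing_special_files "$(mktemp -d)"
```

An empty result means every listed file is present under the given root.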

Using the MAKECODA script

Once you have taken care of the prerequisites and special files, you can compile a release. The simplest way to do this is to run the MAKECODA script in src. This script takes one required and two optional arguments. The required argument is OBJECTDIR, which is the pathname of the directory where the object files should be placed. It is sensible to specify a different directory for each machine type, and the @sys facility of AFS and Coda lets you do this.

So, for example, to compile the beta release, I would do the following:


cd /coda/project/coda/beta/src
./MAKECODA OBJECTDIR=/coda/usr/satya/OBJS/@sys

The MAKECODA script will first check to make sure that all necessary special files are present. If any are missing it will prompt you. It then goes through the Coda modules in the correct order and does a make install on each. The usual checkin to RCS that is done by alphaci as part of make install (see Section XXX) is suppressed. If all goes well, everything in Coda will be compiled, and the bin, lib, and include directories will be populated. You will have to repeat this once for each machine type.

Sometimes, you will run into a problem part-way through MAKECODA. After you have fixed the problem, you'd probably like to continue where you left off rather than redoing everything from the beginning. Hence MAKECODA lets you specify the name of the module to start from. Here is an example:


./MAKECODA OBJECTDIR=/tmp/@sys FIRSTMODULE=vol

Finally, the pathname of the root of the release you are compiling is specified by the variable ROOT in MAKECODA . You can change this by editing MAKECODA , and this is in fact what the makebeta script does for you when you create a new release. But you can also override it on the command line thus:


cd /tmp
MAKECODA OBJECTDIR=/tmp/@sys ROOT=/coda/project/coda/alpha
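
To make the three arguments concrete, here is a sketch of how a driver script could interpret them. It is not the real MAKECODA: the function names, the default value of ROOT, and the shortened module list are all assumptions made for illustration.

```shell
# Sketch only: not the real MAKECODA. Function names, the ROOT default,
# and the shortened module list are hypothetical.

# Print the modules to build, in order, starting at the module named in $1.
# With an empty $1, every module is printed.
modules_from() (
    first=$1
    shift
    started=
    [ -z "$first" ] && started=yes
    for m in "$@"; do
        [ "$m" = "$first" ] && started=yes
        [ -n "$started" ] && echo "$m"
    done
    true
)

# Parse NAME=value arguments of the kind shown in the examples above.
OBJECTDIR= FIRSTMODULE= ROOT=/coda/project/coda/alpha   # placeholder default
parse_args() {
    for arg in "$@"; do
        case $arg in
            OBJECTDIR=*)   OBJECTDIR=${arg#OBJECTDIR=} ;;
            FIRSTMODULE=*) FIRSTMODULE=${arg#FIRSTMODULE=} ;;
            ROOT=*)        ROOT=${arg#ROOT=} ;;
            *) echo "unknown argument: $arg" >&2; return 1 ;;
        esac
    done
    # OBJECTDIR is the one required argument.
    [ -n "$OBJECTDIR" ] || { echo "OBJECTDIR is required" >&2; return 1; }
}

parse_args OBJECTDIR=/tmp/objs FIRSTMODULE=vol
modules_from "$FIRSTMODULE" scripts mlwp rpc2 vol venus vice
```

With these arguments the sketch prints vol, venus, and vice, one per line: everything before the restart point is skipped.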

14.5 Making Incremental Changes

Once the alpha release has been built, you can start code development. We have already seen the general procedure in Section XXX. Let's look at an example in more detail.

Suppose I am working on the module vtools and need to change the files cmon.c and codacon.c. Here's what I would do:


# Create a scratch directory for my work
mkdir /coda/usr/satya/src
cd /coda/usr/satya/src

# Set up links to RCS directory, and root of release
ln -s /afs/cs/project/coda/rcs/coda-1.0 RCSLINK
ln -s /coda/project/coda/alpha/src SRCROOT
ln -s SRCROOT/Makeconf

# Create directory for this module
mkdir vtools
cd vtools
ln -s ../RCSLINK/vtools RCS

# Lock and check out the file(s) to be modified
rcsco -l cmon.c codacon.c

# Now edit cmon.c and codacon.c

# Then compile this module
# Source the standard environment file to make sure your
# compilation environment (PATH, LPATH, CPATH, etc.) is set up right
source ../SRCROOT/SOURCEME
make OBJECTDIR=/coda/usr/satya/OBJS/@sys cmon codacon

# Test cmon & codacon, then iterate on edit/debug cycle above

# Now you are ready to install your changes.
# You must first create a file called RCSMSG, and enter text
# in it that will become the RCS log message for the checkin.
# It will also be posted to the changelog bboard, so that
# others in the group will know of your installation

echo "Fixes to annoying bugs ... blah blah blah ..." > RCSMSG
make OBJECTDIR=/coda/usr/satya/OBJS/@sys install

# You are now done!
# The install step automatically released the write locks on cmon.c
# and codacon.c

# Repeat the install step for each of the other supported platforms.

Notice that many other files may be needed for compilation, but you don't have to check them out. This is because the CMU-SCS make knows to check out any needed files automatically into the compilation target area.

The alphaci script

The automatic checkin is done by a script called alphaci that is invoked as the last step of make install. alphaci will give you an error if the RCSMSG file is missing; it moves it to RCSMSG.old once it has checked in files. alphaci is smart enough to discover new files that aren't mentioned in RCS, and prompts you to ask if you want them checked in too. Often you may not want this, because the files in question were just test files created for debugging. alphaci assumes that only writable files should be checked in; it prompts you about what to do with files that are not writable.
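The RCSMSG handling can be sketched in a few lines. This is only an imitation of the behavior described in the paragraph above: it is not alphaci, it performs no RCS checkin and no prompting, and the function name is hypothetical.

```shell
# Sketch only: mimics the RCSMSG handling described above. It is not
# alphaci and does no RCS checkin. The function name is hypothetical.
checkin_candidates() (
    dir=${1:-.}
    cd "$dir" || exit 1

    # Refuse to proceed without a log message.
    if [ ! -f RCSMSG ]; then
        echo "error: RCSMSG is missing" >&2
        exit 1
    fi

    # Only writable regular files are treated as candidates for checkin.
    for f in *; do
        case $f in RCSMSG|RCSMSG.old) continue ;; esac
        [ -f "$f" ] && [ -w "$f" ] && echo "$f"
    done

    # After the (omitted) checkin, set the message aside.
    mv RCSMSG RCSMSG.old
)
```

Calling `checkin_candidates` in a module directory lists the files that would be considered, then renames RCSMSG so that a stale message is not reused by accident.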

The RCSMSG file is also used by alphaci for posting on the changelog bboard ( cmu.cs.proj.coda.changelog at CMU). You should follow the posts on this bboard closely, so that you are aware of changes to Coda modules by other project members. Here are some typical posts from the bboard:


03-Feb-93 18:56   M Satya   Installed scripts for IBMRT
Posted by alphaci from STRAUSS.CODA.CS.CMU.EDU
No files checked into RCS

03-Feb-93 18:50   M Satya   Installed scripts for I386
Posted by alphaci from WEBER.CODA.CS.CMU.EDU
No files checked into RCS

03-Feb-93 18:22   M Satya   Installed scripts for PMAX
Posted by alphaci from MOZART.CODA.CS.CMU.EDU
Files checked into RCS:      Makefile makebeta restartserver restore.sh
RCS message follows:
Created new script, makebeta, to make a clone of the entire alpha
source tree, and to create branches in RCS for all files.

Notice how the installations for machine types I386 and IBMRT caused no RCS files to be checked in. This is because the installation for the first machine type, PMAX, did the checkin.

14.6 Promoting Releases

As we have seen earlier, there are 3 important releases of software: alpha, beta, and omega . Servers are also classified by this scheme. At the time of writing this document, there were 3 omega servers ( rossini , puccini , and scarlatti ), 3 beta servers ( grieg , haydn , and wagner ), and 4 alpha servers ( schumann , gershwin , mahler , and vivaldi ).

The omega servers hold real data, so software let loose on them should have been tested very well. Having to reinitialize and to restore many gigabytes on the omega servers because of storage corruption is not an experience you are likely to forget! The beta servers hold some real data, but the number of users depending on that data is restricted to a few Coda project members. Further, that data is of a kind that can be easily reconstructed. The alpha servers are used by project members for testing. Obviously, only one person can test their software on a server at a time, so you should coordinate use of the servers with the other members.

Promoting alpha to beta

When the alpha release has diverged substantially from beta , and is relatively stable, we will decide to make a new beta release. One member of the Coda project will serve as release coordinator for the promotion.

The first step in this process is for the release coordinator to post on cmu.cs.proj.coda.general, asking project members to check in changes and to drop all RCS locks on the main line of development by a certain deadline. You should release all mainline locks as soon as possible. It is OK to hold on to locks on private branches. Once the deadline expires, the release coordinator will feel free to break mainline locks.

In the second step, the release coordinator runs the makebeta script to create a clone of the current state of alpha. makebeta first synthesizes a unique identifier of the form date_xxxxx where xxxxx is the number of seconds since midnight. The name of the new release is then beta-<unique identifier>; for example, beta-3Feb1993_43696. This name is used to tag a new RCS branch for every file in the release, so that emergency fixes are possible long after the alpha release has diverged from this beta release.
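An identifier of this form is easy to compute from the system clock. The sketch below shows one way to do it; makebeta's own code is not reproduced in this chapter, so the function names and the exact date format are assumptions based on the example name above.

```shell
# Sketch only: builds a name of the form beta-<date>_<seconds since midnight>.
# Function names are hypothetical; this is not taken from makebeta.

# Seconds since midnight for a given HH MM SS (leading zeros allowed).
seconds_since_midnight() {
    h=${1#0} m=${2#0} s=${3#0}    # strip one leading zero to avoid octal
    echo $(( ${h:-0} * 3600 + ${m:-0} * 60 + ${s:-0} ))
}

# Name for a new beta release created right now.
new_beta_name() {
    # One date call, so all fields describe the same instant.
    set -- $(LC_ALL=C date '+%d %b %Y %H %M %S')
    echo "beta-${1#0}${2}${3}_$(seconds_since_midnight "$4" "$5" "$6")"
}

new_beta_name
seconds_since_midnight 12 08 16
```

The last line prints 43696, so the example name beta-3Feb1993_43696 would correspond to a release created at 12:08:16.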

The makebeta script works as follows:

The above procedure is quite slow, since many RCS interactions are involved. It usually takes a few hours. If all goes well, makebeta requires little babysitting. Note that the script does no locking. In other words, the $ALPHA tree had better remain frozen for the entire duration of the promotion.

In the third step, the release coordinator cds into $BETA/src, and runs MAKECODA once for each machine type. Note that the coordinator doesn't have to specify anything other than OBJECTDIR, since makebeta has already set ROOT correctly in the MAKECODA script. This step guarantees that the binaries in $BETA were indeed compiled from the sources in $BETA.

In the final step, the release coordinator makes the symlink beta point to $BETA . This is the "commit" step that blesses the newly-cloned release as the beta release.

Promoting beta to omega

After the beta release has been stressed for some time, it will be promoted to omega. This is a much simpler procedure than the alpha to beta promotion. The release coordinator merely has to make the symlink omega point to beta-xxxx, where xxxx is the unique identity of the release being promoted. This "commit" step blesses what was hitherto beta as omega.

14.7 Copyright Notices

Since Coda is distributed outside CMU, it is important that every source file contain a copyright notice. The blurb program simplifies adding and changing copyright notices. blurb expects to find the copyright notice at the very beginning of a file, so make sure you don't move it when modifying a file.

The only modules without Coda copyright notices are sunrpc, which is public-domain code from Sun, and cfs, which is mostly Mach kernel code.

Some files in Coda are derived from the 1986 version of AFS-2.0. Since AFS-2.0 is owned by IBM, these files have an IBM copyright notice in addition to the Coda copyright notice.

When you modify a file or add a new file, you should pay attention to the copyright notices. Here are some simple rules to follow:

  1. If you add new files, just include the Coda copyright notice. By definition, new code is not derived from AFS-2.0 and should not receive an IBM copyright notice.
  2. If you modify an existing file, don't do anything with the copyright notice. All else being equal, add new code to new files, or to existing files without the IBM copyright notice. But use good judgement here -- if it is clear that the correct home for a piece of code is a file with an IBM copyright notice, go ahead and put it there.
  3. If you make a substantial copy of a file with an IBM copyright notice then the copy also acquires the IBM copyright notice. The interpretation of "substantial copy" is, of course, subjective. If you copy a variable declaration, it is not a "substantial copy". If you copy a whole bunch of procedures or macros, it is definitely a "substantial copy".

A simple way to think of this is that some files in Coda are "tainted" (i.e., they are derived from AFS-2.0). Tainted files can infect other files if enough of their innards are copied. Existing untainted files, and new files, stay untainted forever. In general, do your best to keep the number of tainted files to a minimum. Note that this discipline regarding copyrights is not intended to be onerous or constraining --- just common sense and a little self-discipline.

14.8 Using RCS for Revision Control


1. Branch overview

One common problem in managing large software projects comes
into play when several people are working on the same set
of files. RCS helps by providing per-file locking to guarantee
that no two users modify the same file at the same time. If
several users are modifying the same line of development,
however, this locking does nothing to guarantee that one user's
changes in one file won't interfere with another user's changes
in another file.

To combat this problem, RCS provides the notion of a "branch".
A branch is a separate line of development carried out in parallel
to the main line of development. Versions found on branches have
more than two fields in their version numbers. All fields make up
the version number of an element on a branch. For example, version
1.2.1.1 is an element in the first branch off of version 1.2.
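
The numbering rule can be checked mechanically. The two helpers below are
hypothetical (they are not part of RCS or of the Coda scripts); they only
inspect the dotted version string.

```shell
# Sketch only: helper names are hypothetical and not part of RCS.

# Succeeds if the version number has more than two fields, i.e. it
# names an element on a branch rather than on the main line.
is_branch_revision() {
    case $1 in
        *.*.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Print the version a branch element's branch was created from,
# by dropping the last two fields (1.2.1.1 -> 1.2).
branch_point() {
    rev=${1%.*}         # drop the element number
    echo "${rev%.*}"    # drop the branch number
}

is_branch_revision 1.2.1.1 && branch_point 1.2.1.1
```

For 1.2.1.1 this prints 1.2, the mainline version that the branch is off of.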

As branch numbers are hard to deal with, it is recommended that you
assign symbolic names to branches.


2. How to use branch

For example, a typical command to create and name a branch would be

        rcsci -b -nKUDO_STATISTICS foo.c
                where KUDO_STATISTICS is a symbolic name of branch
        * This command will create a branch named KUDO_STATISTICS
          off the last version on the mainline of foo.c

If you would like to specify the version that the branch should
start from, the command would be

        rcsci -b -r1.2 -nKUDO_STATISTICS foo.c
                where 1.2 is the version that the first branch
                element is off


The name is assigned to the branch (not the branched element), so
the subsequent commands such as 

        rcsco -l -rKUDO_STATISTICS foo.c

will refer to the latest element in that line of development.


3. What I checked in using branch

I checked in the following files using branch with the symbolic name
"KUDO_STATISTICS".

        vicedep/mond.rpc2
        venus/  fso.h
                fso0.c
                fso1.c
                hdb.c
                sighand.c
                venus.c
                venus.private.h
                venusresov.h
                venusutil.c
                venusvol.c
                venusvol.h
                venusvm.c
                venusvm.h
                vol_vsr.c
                vproc.c

The purpose of the modifications is to collect session statistics 
and send them to the mond data collector. Currently, venii built 
with these files are running on Brahms and faust, safely (I think so).

14.9 Coding Tips

<TO BE COMPLETED>
