moonbase / module and code testing / qa

Thu Oct 2 12:04:17 GMT 2003

Hello all module developers.

Feedback on these ideas, or on other ways to make test/release easier, is
solicited! Sorry this is a long email.

I wish to propose adopting guidelines for deciding the degree of testing we 
want developers to use in determining when a module is ready for release into
moonbase.

We're all volunteers on this project and I think we all do pretty damn
good work. 

However, going back to the SGL days, there have been problems when modules
with system-wide impact are released into moonbase and trigger unexpected
problems. Db4, glibc, gcc, and gnome have all been examples.

There's a principle in engineering that's also well known/proven in
software. The cost of fixing bugs/flaws grows exponentially as
they move from R&D to alpha to beta to production. I believe we'll
save ourselves time in the long run if we can stop dropping untested
updates into critical sections of the moonbase.

Personally, I will promise time for testing updates and would rather
dedicate some time to that upfront than spend (more) time fixing
things after they've gone out to the wider userbase.

Suggestions:

1. For complex modules, testing on the development machine isn't
adequate. Dependencies which most users won't have installed are
certain to have been installed (perhaps a few times) on the target
machine. Because Lunar doesn't have a rollback capability, returning
to tabula rasa isn't practical.

2. For modules with wide impact (Qt, libgnome, gtk+*, glibc<shudder> ...)
the effects of an update may vary widely depending on what's installed.

3. Lunar users are not even close to in-sync on software versions. It's
up to the system owner to decide what to hold or update. Therefore
it's *certain* that the userbase is running a wide variety of configs.

4. The 'profile' of the average Lunar install is very different from
that of the average developer's system. The best guess at our userbase is
400-600 systems. *Most* of these people in fact don't update more often than
every 30-60 days. I imagine there are a few that go a year or more without
general updates.

I think we could adopt a rating system where required degree of testing is 
determined by a formula, something like: (criticality*delta) / impact.

criticality is high for security flaws, low for feature enhancements,
somewhere in between for everything else.

delta = deltaV*10 + delta-majorV + delta-minorV/10 (this probably needs
adjusting per-module because different projects have different policies
for release / compatibility: openssl allows API changes between
e.g. 0.9.6-7, while gcc (to my knowledge) is object code compatible
across e.g. 3.2 - 3.3 versions).
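
As a rough illustration of the delta term, here is a minimal Python sketch.
It assumes versions have exactly three numeric components (V.major.minor,
e.g. 0.9.6) and takes the absolute change in each component. The function
name version_delta and the parsing rules are assumptions for illustration
only, not anything moonbase does today.

```python
def version_delta(old, new):
    """Weighted distance between two versions of the form V.major.minor."""
    old_parts = [int(p) for p in old.split(".")]
    new_parts = [int(p) for p in new.split(".")]
    if len(old_parts) != 3 or len(new_parts) != 3:
        raise ValueError("expected versions of the form V.major.minor")
    # Absolute change in each of the three components (an assumption).
    d_v, d_major, d_minor = (abs(n - o) for o, n in zip(old_parts, new_parts))
    # delta = deltaV*10 + delta-majorV + delta-minorV/10
    return d_v * 10 + d_major + d_minor / 10


print(version_delta("0.9.6", "0.9.7"))  # openssl-style bump in the last component
print(version_delta("3.2.0", "3.3.0"))  # gcc-style bump in the middle component
```

Per-module adjustment, as suggested above, could be done by passing in
different weights instead of the fixed 10, 1 and 1/10.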

impact is high for anything that creates libraries in the ld.so.conf paths
or is likely to affect critical operations for users.

We may want to build some variation on this formula into moonbase. 

To facilitate getting testing done with minimal dev overhead, perhaps
(rather than informally passing around dev-modules via email) we can add a 
directory to dbguin's download area for modules in development, and
scripts to grab that dev area into /zlocal.

I can dedicate a couple of vmware boxes which are easy to restore to
known-state as well as my primary dev system to this work.

Feedback on this, or on other ways to make test/release easier, is solicited!

Here's a preliminary list of modules which I believe ought to be tested
on at least 3-8 systems prior to release to moonbase. 

glibc
gcc
readline
bash
coreutils
ncurses
dialog
db
db4
perl
openssl
openssh
krb5
zlib
gawk
gnome2
gettext
qt3
kde3
nfs-utils
samba
util-linux
nasm
modutils
pkgconfig
Linux-PAM