The value of a network is equal to the square of the number of its
nodes only if you assume that all nodes, and all connections between
nodes, are of equal value. For most real-world networks, and certainly
for software development, this is not true. And we also need to
consider DeLong's Law, which points out that in any real-world
network, the most valuable connections will be made first. I'd guess
that the simple application of Metcalfe's Law might work for up to
around 15% of nodes, but beyond that, I don't think you could count
on it to give answers valid to an order of magnitude, let alone
anything more precise.
The empirical evidence tends to support this view. Roy T. Fielding,
UC Irvine researcher and Apache team member, has published his research
in the paper "A Case Study of Open Source Software Development"
(ACM membership or payment required -- I'll try to find an accessible
link). His conclusions are that within the Apache development
community, the core team has consisted over the years of about 15
people, with about six active at any one time. Another 400 people have
contributed code at some point or another. A total of 3,060 people
submitted bug reports. Development effort is clearly concentrated
among this base; even within these parameters, over 85% of the code
submitted comes from the top 15 developers. Clearly, not all nodes
in the Apache development network are equal.
Thanks also for the pointer to DeLong's Law; I wasn't familiar with
it, though its significance is clear.
And the whole thing needs to be reality-checked anyway. If this law
worked, free software would indeed provide "thousands of times the
value". But GNU/Linux isn't even one thousand times as good as
commercial Unix. In many senses, it's worse. Apache might, on a
good day, be considered twice as good as IIS. The numbers
don't add up. No real-world application (and certainly not
GNU/Linux or fetchmail) has given any particularly good evidence
that debugging is massively parallelisable.
There are, however, a few problems with your analysis.
First: a network's value is equal to some constant
k times the square of the number of its nodes -- V = k * n^2 -- where
k reflects the typical value of a connection (again, stats primer -- I
need to check whether the mean is a valid predictor here). As a network
grows in nodes, one can expect the value of k to change.
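As a rough illustration of that last point, here's a minimal Python
sketch, using a purely hypothetical 1/i decay in connection value to
stand in for DeLong's Law (the decay rate is an assumption chosen for
illustration only, not a measured quantity):

    # If the i-th most valuable connection is worth 1/i (hypothetical),
    # total network value grows far more slowly than n^2.
    def total_value(n):
        links = n * (n - 1) // 2                        # possible connections
        return sum(1.0 / i for i in range(1, links + 1))

    def implied_k(n):
        # The k that would make V = k * n^2 hold for this network.
        return total_value(n) / n ** 2

    for n in (10, 100, 1000):
        print(n, implied_k(n))   # implied k falls by orders of magnitude

Under that assumption, total value grows only logarithmically in the
number of links, so the n^2 term overstates things more and more as
the network grows.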
Second: but this isn't our problem. We're not talking about a
growing network of developers, but about a partitioning of an
existing network. So the issue isn't one of adding lower-value
nodes, but of divvying up a set of existing nodes of (reasonably) fixed
value -- which, as the sketch below shows, is costly in itself.
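To make the cost of partitioning concrete, here's a minimal sketch
under the idealized V = k * n^2 valuation with equal node values
(which, as noted above, overstates the case, but the direction of the
effect holds):

    # Splitting one network into m isolated subsets divides its
    # Metcalfe value by m, before any other effects are considered.
    def metcalfe_value(n, k=1.0):
        return k * n ** 2

    def partitioned_value(parts, k=1.0):
        return sum(metcalfe_value(n, k) for n in parts)

    n = 3000                                   # roughly Apache's bug-reporter pool
    whole = metcalfe_value(n)
    gated = partitioned_value([n // 10] * 10)  # ten equal "gated communities"
    print(whole / gated)                       # -> 10.0, a tenfold loss

All the connections that would have crossed partition boundaries are
simply gone; an uneven split changes the numbers but not the direction.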
Third: you're measuring the worth of the wrong output with
your GNU/Linux vs. proprietary Unix example. The network I'm
referring to is a developer network, and what's at issue is its
valuation, not the valuation of any one output of this network.
While GNU/Linux may or may
not be a match for proprietary Unix in any one application -- say,
running an E10K Starfire server from Sun -- it offers a breadth of
application which no single proprietary Unix can match.
I don't see Solaris positioned for the embedded or handheld markets,
and it barely competes with GNU/Linux on x86 hardware. Similarly,
GNU/Linux is favored for clustering applications in large part because
of the software cost factor, above and beyond the technical flexibility
which allowed projects such as Mosix and Beowulf to proceed in the first
place.
You might also ask whether you'd prefer, say, Sun's set of
userland utilities over the GNU tools, the Sun C compiler over gcc,
or CDE over GNOME, KDE, or WindowMaker. Beyond the OS kernel itself,
the quality, richness, and diversity of GNU/Linux as a whole is far
greater than that of proprietary Unix, in my experience (SunOS, Solaris,
HPUX, Irix, and others). Sun, IBM, and Hewlett-Packard make good
high-end servers and decent specialized workstations, but from a
price-performance and flexibility perspective, GNU/Linux wins,
particularly on low-end commodity hardware, and in the embedded space.
Moreover, by providing a uniform environment over a broad range of
hardware and hardware configurations -- embedded, handheld, portable,
PC, small server, cluster, large server -- GNU/Linux offers a rich and
diverse application deployment environment. The total value
isn't merely the kernel, but (to borrow a word from myself), the
gestalt: kernel, OS, development tools, userland, free software
applications, proprietary apps, hardware platforms, configurations....
This is actually an old idea born again -- IBM's OS/360 project of
the 1960s (of Mythical Man-Month fame) was the first time a
range of disparate hardware was unified under a single operating system.
In this sense, IBM's adoption of GNU/Linux is a return to a very
successful strategy from its past.
Fourth: Parallel debugging has two aspects. Under the "all
bugs are shallow" model, you're simply increasing the probability that
someone will both see anomalous behavior, and use the source to narrow
down possible sources, or even provide a fix (or hints to same). Under
the "most bugs are shallow, some are deep" (which I tend to subscribe
to), you get the above, plus the ability for tiger teams of detailed
code auditors to independently review sources. Within the GNU/Linux
and *BSD communities, there are at least two such efforts I'm aware of.
One is a code audit sponsored by Red Hat, the other is the OpenBSD
development effort, which is proactively secure. Evidence from BugTraq
and from several published papers suggests that bug and security flaw
identification and resolution rates are higher for free software than
for proprietary projects.
Incidence, now that's another question. ;-)
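For what it's worth, the "shallow bugs" claim is easy to make
concrete. A minimal Python sketch, assuming each reviewer
independently has some small fixed chance p of spotting a given bug;
the p values are illustrative assumptions, not measurements, and the
N values echo the Apache figures above:

    # Probability that at least one of N independent reviewers finds a bug.
    def p_found(p, reviewers):
        return 1 - (1 - p) ** reviewers

    for p in (0.01, 0.001):            # a shallow bug vs. a deeper one
        for n in (15, 400, 3000):      # core team, contributors, bug reporters
            print(f"p={p}, N={n}: {p_found(p, n):.3f}")

Shallow bugs (p = 0.01) are a near-certain catch by N = 400; deeper
ones (p = 0.001) still want the full reporter pool -- which is,
roughly, the "most bugs are shallow, some are deep" position.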
Fifth: As mentioned above, you've misapplied DeLong's Law
to a network partitioning phenomenon, not a growth phenomenon. The
question then
becomes: what is the nature of this partitioning? The scheme proposed
by enterfornone applies an arbitrary, externally imposed, set of
partitions on an existing network. While there may be some selection
toward higher-quality nodes (users of software will tend to be those
who've legitimately acquired it, have familiarity with it, and will be
inclined and qualified to contribute to its development), no such
benefit attaches to the partitioning across multiple user subsets
itself. Even with a degree of self-selection, you're still applying a
heavily arbitrary partition to the network, which, as previously
described, tends strongly to reduce the real size, and Metcalfe value,
of the development community. Other "gated community" partitions may
not be as drastic, but I believe the ultimate, long-term effects are
the same.
It's been said (by Tim O'Reilly, link posted above) that the GPL may
itself be a form of gated community, selecting against those who don't
agree with the terms of the GPL. However, this is an internally
imposed restriction -- on the part of an individual developer.
There's no external authority preventing the developer from changing
his or her mind down the road.
The power of the GPLd partition is that it doesn't
arbitrarily exclude participation. High-value nodes, er, developers,
are welcome to contribute without bias. Which leads to the next point:
One has to consider that the "gated community" model is pretty much
exactly that which led to the development of commercial Unix, which
has many faults, but very few that aren't shared by most free Unixes.
The history of Unix development actually is an argument in favor of
open, not gated,
development. During the period of maximum growth of Unix systems
development -- the 1970s and early 1980s -- the OS competed with
development processes which were far more closed: OS/360, VMS, and the
fledgling DOS and Apple systems. Though Unix code wasn't fully open, in
the sense we think of today, it was eminently practical for Joe Random
Grad Student at UC Berkeley, MIT, and other leading technical
universities, to get their hands on, muck around in, and extend the
code. And this was where many of the best minds gathered -- partly because
they could, partly because they liked the freedom. Today, the historic
Unix development model is no longer the least inhibiting one available
(cf. Minix, Plan 9, Solaris academic licensing); instead, that
distinction belongs to GNU/Linux and the *BSDs. Concomitantly, OS
development effort has become strongly focused on the hobby project of a
Finnish grad student. The comparison of the gated model to Unix's
history fails to capture this key point. In many ways, free software
has replaced the "think tank" or technical incubator of the past.
 Note the distinction between "commercial" (GNU/Linux is a
commercial product) and "proprietary". Pedantic, but relevant.
Karsten M. Self
SCO -- backgrounder on Caldera/SCO vs IBM
Support the EFF!!
There is no K5 cabal.