To be or not to be, that is the question:
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take arms against a sea of troubles,
And, by opposing, end them.
-- Hamlet, Act 3, scene i.
It is my contention that a particular project where I work has reached this
point. It has taken a team of four people over a week to come to understand
the flow of data through the program, and even now we do not fully
understand it. Nor is it at all easy to answer the questions we have, such as
those on the life cycle of objects (in this case, because there are many
unrestrained global variables and functions), necessary for our `port' to a
three-tier system (for the curious, we want to go from client-database to
client-CORBA objects-database to allow object sharing and updates).
We have presently only considered the inductive approach, that is, going
from the specific to the general. We determine how the system works, and its
abstractions, and how to change the existing system to make it work as desired.
We consider which is the best way to make this change so that effort is
minimal and chance of introducing new errors is least, while still delivering
the required solution with some modicum of efficiency.
The problem with this is that it is still a band-aid; we merely add another
impenetrable layer of cruft to layers that need not be added to, but rather
replaced (see below under Justification for Rebuild).
A Deductive Approach
Behold, I make all things new.
-- Revelation 21:5 (yes, I know the original context, but it fits here too)
Certainly, if one is to begin again, one cannot begin from the very
beginning and discard the many man-years of work that have gone into a project;
nothing so dramatic is suggested. Instead, we wish to salvage as much as we can.
The deductive solution considers the particular needs of the client
(`client' here meaning the users of the interface); in this case, our clients
are two: graphical user interface objects, and a calculation engine. It is
thus that objects are created: determine what interfaces are needed and
build them. In so doing, remove the current dependencies on global variables
and refactor as much as possible.
For this project, as an example, we know that we need a class to represent a
river (let us call it river_c, following the existing convention) within
a hydrological system; in fact, we need a whole list of rivers, because often a
river is queried by its ID. From that, we also know that we need to be able to
perform an ID-based lookup, which dictates the presence of a class with
properties like the Standard Template Library (STL) std::map.
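As a sketch of what this deduced interface might look like (the names and
members here are hypothetical illustrations, not the project's actual
declarations -- the real river_c carries more state than clients have demanded
so far):

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical minimal river_c: only the properties the clients actually
// query (ID and name), deduced from usage rather than from the table schema.
class river_c {
public:
    river_c() = default;  // std::map's mapped type must be default-constructible
    river_c(int id, std::string name) : id_(id), name_(std::move(name)) {}
    int id() const { return id_; }
    const std::string& name() const { return name_; }
private:
    int id_ = 0;
    std::string name_;
};

// The ID-based lookup the clients need, as an std::map keyed by river ID.
using river_map_t = std::map<int, river_c>;

// Retrieve a river_c& by ID; throws rather than silently inserting.
inline river_c& river_by_id(river_map_t& rivers, int id) {
    auto it = rivers.find(id);
    if (it == rivers.end())
        throw std::out_of_range("no river with that ID");
    return it->second;
}
```

Note the lookup uses find() rather than operator[], so querying a bad ID is an
error instead of a quiet insertion of an empty river.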
The existing code that (somewhere, somehow) loads the current version of
river_c must be pulled out, keeping the raw database access classes
unchanged, and its functionality duplicated, rewriting where necessary, merely
moving code otherwise. In our middle tier (CORBA server) initialization, we
realize we need to initialize (for example) a std::map<int,river_c>
mapping river IDs to actual river objects. Furthermore, also based on
usage--the specific in our deductive logic--we see that we need to expose a
function to retrieve a river_c& by ID, and that this river_c
object needs to have the properties ID and name. Note that we don't base these
decisions on the database table structure, although of course one must be able
to derive an (improper) subset of our object and its attributes from the
database or we cannot create the objects, but rather on client usage
requirements. Next, we would recurse into the requirements of the classes
contained by river_c, such as a list of (power) plants, and from there a
list of generating units, etc.
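The containment hierarchy that this recursion produces might be sketched as
follows (again, these are hypothetical names following the project's `_c'
convention; the real classes would carry whatever members their own clients
demand):

```cpp
#include <string>
#include <vector>

// Hypothetical result of recursing into client requirements:
// a river owns power plants, a plant owns generating units.
struct unit_c {
    int id;
    double capacity_mw;  // capacity in megawatts (illustrative attribute)
};

struct plant_c {
    int id;
    std::string name;
    std::vector<unit_c> units;  // deduced from clients that iterate units
};

struct river_c {
    int id;
    std::string name;
    std::vector<plant_c> plants;  // deduced from clients that list plants
};
```

Each level of the hierarchy is justified by a concrete client need, which is
exactly what keeps the recursion from sprouting speculative classes.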
river_c should contain its own code to create itself from a row in
the raw database table object (e.g. via a constructor), and a static method to
create the aforementioned map; currently this code is split among a
variety of unrelated global functions. In many cases, multiple tables and
conversions are required, and naturally this code would be brought in as well.
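A minimal sketch of that arrangement, assuming a hypothetical river_row_t
standing in for the project's raw database access classes (which stay
unchanged):

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical raw-row type, a stand-in for the unchanged database layer.
struct river_row_t {
    int id;
    std::string name;
};

class river_c {
public:
    river_c() = default;
    // The class knows how to build itself from one raw table row...
    explicit river_c(const river_row_t& row) : id_(row.id), name_(row.name) {}
    int id() const { return id_; }
    const std::string& name() const { return name_; }

    // ...and a static method gathers the rows into the ID-keyed map,
    // replacing the scattered, unrelated global loader functions.
    static std::map<int, river_c> load_all(const std::vector<river_row_t>& rows) {
        std::map<int, river_c> out;
        for (const auto& row : rows)
            out.emplace(row.id, river_c(row));
        return out;
    }

private:
    int id_ = 0;
    std::string name_;
};
```

The loading logic now has exactly one home, so a question like "where does a
river_c come from?" has exactly one answer.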
This constructive approach has the advantage of eliminating any excess
baggage, in that there will never be any reason to write any more code than
will be used.
It is already acknowledged by the management here that this project must be
rearchitected--the only question is the `when.' The amount of time (short-term)
to add the third tier as a `bag on the side' should be close to the time
required to rearchitect as we go (since with a better structure we can save a
lot of time when writing the new code), and the gains in clarity improve future
maintenance in manifold ways for the long-term.
Any radical (to management) approach such as this is naturally viewed with
suspicion and even incredulity at first. But based on the current state of
the program, the desired changes, and the fact that this method can
feasibly yield the desired result within the same or better time as the
alternatives, it is clear that it is the best of all possible choices. As
Brooks said, "Plan to throw one away; you will anyhow." The time is now.
Chemical engineers learned long ago that a process that works in
the laboratory cannot be implemented in a factory in one step. An intermediate
step called the pilot plant is necessary.... In most [software]
projects, the first system is barely usable. It may be too slow, too big,
awkward to use, or all three. There is no alternative but to start again,
smarting but smarter, and build a redesigned version in which these problems
are solved.... Delivering the throwaway to customers buys time, but it does so
only at the cost of agony for the user, distraction for the builders while they
do the redesign, and a bad reputation for the product that the best redesign
will find hard to live down. Hence plan to throw one away; you will, anyhow.
-- Brooks, Frederick P., Jr. The Mythical Man-Month: Essays on Software
Engineering. Reading, Mass.: Addison-Wesley, 1975 (revised 1995).
ISBN 0-201-83595-9. Should be required reading for anyone going anywhere near code.
Justification for Rebuild
Some of this is specific to my company's project, but it is probably
true for many more as well.
This project (name withheld to protect the guilty) is a clear example of
code that has succumbed to what Brooks (above) calls the "second-system
effect."
Global variables and functions make it very difficult to trace the flow of
control, especially when these variables are used as default parameters to
constructors (for example), making tracking down their usage--as they are
aliased--a veritable nightmare. In many cases, merely creating an object
automatically adds it to a global list, but this is not at all obvious, since
this addition is through a default parameter to the constructor and the adding
is done in the base class.
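The anti-pattern can be distilled into a few lines (the names here are
invented for illustration; the project's actual classes are larger but follow
this shape):

```cpp
#include <string>
#include <vector>

class node_c;
std::vector<node_c*> g_all_nodes;  // unrestrained global state

class node_c {
public:
    // The default argument aliases the global list, so every construction
    // mutates g_all_nodes -- a side effect no call site ever mentions.
    explicit node_c(std::vector<node_c*>& registry = g_all_nodes) {
        registry.push_back(this);  // hidden registration in the base class
    }
};

class river_node_c : public node_c {
public:
    explicit river_node_c(std::string name) : name_(std::move(name)) {}
    // Nothing in this class mentions g_all_nodes, yet constructing one
    // grows the global list via the implicitly invoked base constructor.
private:
    std::string name_;
};
```

Reading river_node_c alone gives no hint that objects register themselves,
which is precisely why the data flow took four people a week to trace.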
The code is verbose in the extreme, because often it is copied pedantically
over and over, e.g. for error checking, when it would have been better to
encapsulate these checks in (then) a macro which could return a string
description of the error or (now) an inline function that can throw an exception
if an error is encountered. There are often ten or so (large) classes that are
exact copies of each other with literally four lines that differ. Refactoring
is a necessity.
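The inline-function replacement might look like this (a sketch, assuming a
hypothetical integer status code from the database layer; the real checks
cover more conditions):

```cpp
#include <stdexcept>
#include <string>

// Hypothetical replacement for the copy-pasted error checks: one inline
// function that throws, instead of the same dozen lines repeated per class.
inline void check_db_status(int status, const std::string& context) {
    if (status != 0)
        throw std::runtime_error(context + ": database error " +
                                 std::to_string(status));
}
```

A call like check_db_status(rc, "loading rivers") then replaces each
duplicated block, and error handling changes in one place instead of hundreds.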
Much of the data is loaded multiple times in different ways. A deductive
"from the ground up" loading of needed data would eliminate a lot of
this duplication.
We depend on many libraries that are no longer needed (e.g. replace any use
of MFC strings or containers with STL) or are not supported any more (either
because the supporting company has gone out of business or just discontinued
the library in question) and are unintuitive (e.g. using operator() to advance
a list iterator). The dependencies on these libraries could be removed; the
STL is (a) widely taught, (b) well documented, and (c) standardized as part of
any conforming C++ implementation (International Standard ISO/IEC 14882:1998).
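For contrast with the operator() oddity described above, the standard idiom is
shown below (the legacy API is paraphrased from the description; only the STL
side is shown as code):

```cpp
#include <list>

// Standard iteration over an STL list: ++it advances the iterator, as it
// does for every standard container, with no vendor library required.
inline int sum_list(const std::list<int>& xs) {
    int total = 0;
    for (std::list<int>::const_iterator it = xs.begin(); it != xs.end(); ++it)
        total += *it;  // ++it, not it(), is how a standard iterator advances
    return total;
}
```

Anyone who has read any C++ text from the last few years can maintain this
loop; the same cannot be said for a defunct vendor's idiosyncratic iterators.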
IT managers won't like it, but often it makes far more sense to scrap a bad
design than to attempt to maintain it and hope it keeps working. The time it
takes for someone new to the project to get `up to speed' on a project with a
million lines of code (excluding libraries) is high enough without the project
being a maze of brain-damaged cruft.
How do projects get this way? In part, I've tried to confine the blame to the
many years (over ten, in the case of the project I was talking about) that the
application has been under development, or to the time constraints that make a better
fix for various problems impossible. But frankly, many of the problems also
arise from pure and simple badly written code, because of the ignorance and
cluelessness of the writers thereof. Many of them in my project's case were
engineers seconded into the role of programmers (the term `engineer code' isn't
spoken with fear and loathing for no reason). Some code was written by co-op
students and never audited. Some of the mess is, again, due to sheer
incompetence.
Writing solid code is not an easy task. It comes from a combination of a
sound theoretical background and experience reading and writing code. And not
everyone can do it; with luck, those people that can't will be fired (or
promoted to management, or have an unfortunate run-in with a Mack truck),
before they can do too much damage.
How can we stop bad code from getting into projects, commercial or not? One
supposes that open source projects are free from bad code because many eyes can
see it, but this is only the case if many eyes do see it and it becomes
taboo to add such, and someone can and does say "No, this code is horrible and
isn't going into this project", a phenomenon more common in larger projects
(KDE, the Linux kernel, Mozilla, etc.) than the morass of smaller endeavours.
As I see it, the solution is the same for commercial or open projects: code must
be reviewed, changes must be tracked (source code control--which we don't have
here, even now, although I've been pushing for it--is an absolute necessity).
The tenets of Extreme Programming (XP) and other such methods go even further,
to pair programming and continuous integration. To a software company, though,
pair programming appears to divide their employee base in two without tangible
gain, so having code reviews is a good compromise that managers should find
acceptable. As to the type of reviews, this can vary: two or three people (one
the programmer) congregate around a machine to talk about the code after being
sent the changed files to look at individually first, perhaps, or a more formal
`panel' sitting around a table looking at printouts. Whatever works for your
team.
Programming may be art and may be science and is often both, but the quality
of the code we write is a reflection on us and our profession, just as the
design, robustness, lifespan, and ability to handle load of a bridge reflect on
the civil engineers and crew that build it. Let us take pride in our work.
(Plug for a future article: should software developers need certification, like
the P.Eng. certification that engineers usually require--a P.Dev.
designation? Now, any idiot churned out by DeVry or ITT Tech [one-year college
programs] can call himself a programmer, and companies aren't all that adept at