
Optimization and the Bazaar

By carlfish in Technology
Wed Dec 20, 2000 at 01:27:43 AM EST
Tags: Software (all tags)

Donald Knuth coined the phrase "Premature optimization is the root of all evil." In the development stages of a software project, the accepted wisdom is that it is most important to code for clarity first, to make sure the design is logical and the code easy to understand, and not to worry about speed until the project is feature-complete. But where does this leave a project operating under the Bazaar model, one that has to release un-optimized code to the public?


Well-factored code is beautiful. When you sit down in front of someone else's program for the first time and, in short order, you've not only found the piece of functionality you're interested in but also understood exactly how it works, then the first and most difficult step on the road to making a fix or contributing a feature has already been done for you.

Coding for clarity is important when you sit in a cubicle, because the guy down in the next office will have to maintain whatever you've just written, and any extra time he takes because of your ugly code costs your employer money. However, coding for clarity is even more important in an Open Source project, because the big advantage of Open Source is the number of eyeballs you are going to have looking at your code. Given the huge density of Open Source projects out there, if those eyeballs can't see what you're doing, there's a good chance they won't bother looking any more.

The more readable your code, the better chance that a complete stranger will mail you a patch, rather than a bug-report. The better designed your code, the more chance that a complete stranger will be willing to add a feature.

The problem with well-factored, well-designed code is that, to start with, it's less efficient. In the early stages of a project, you trade away the chance to use a faster but more complicated algorithm in favour of something that can be understood at a glance. You design encapsulated components, knowing that in some cases it would be faster to mix the code around and break encapsulation. Later in development, of course, this turns around: once you have a fully functional system, well-factored code is easier to profile accurately, it's easier to locate the optimizations that will noticeably affect the speed or size of the overall program, and the code itself is more malleable, and thus easier to tighten in the right places. The problem is that software written using the Bazaar model doesn't have the luxury of hiding the program until it's ready.
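
To make the encapsulation trade-off concrete, here is a small, made-up Java sketch (the class names are invented purely for illustration, and a good JIT may well shrink the gap - which is exactly why it pays to profile before breaking encapsulation):

// Encapsulated version: callers go through an accessor and a checked method.
class Account {
    private long balanceInCents;

    public long getBalanceInCents() { return balanceInCents; }

    public void deposit(long cents) {
        if (cents < 0) throw new IllegalArgumentException("negative deposit");
        balanceInCents += cents;
    }
}

class Payroll {
    // Clear, and safe for a stranger to extend.
    static void payEveryone(Account[] accounts, long cents) {
        for (int i = 0; i < accounts.length; i++) {
            accounts[i].deposit(cents);
        }
    }
}

// "Optimized" version: the hot loop pokes a public field directly, skipping
// the method call and the sanity check. Possibly a shade faster, definitely
// harder to keep correct as the program grows.
class BareAccount {
    public long balanceInCents;
}

class FastPayroll {
    static void payEveryone(BareAccount[] accounts, long cents) {
        for (int i = 0; i < accounts.length; i++) {
            accounts[i].balanceInCents += cents;
        }
    }
}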

There are several problems with releasing un-optimized code. Firstly, the majority of users of Open Source don't read the source. They either download binaries, or download the source and look at the options for ./configure. There are several million lines of source code making my Linux workstation run, and I don't have the time or inclination to look at most of it, even when something goes wrong. The practical upshot of this is that when something runs slowly, or uses up a lot of memory, or both, you uninstall it. You never see that the program is doing everything right; it just hasn't reached the stage of being tightened up.

As a culture, hackers tend to idolize great feats of efficiency. Mel the Real Programmer is our hero, never mind that (or more likely, because) the narrator in that story took two weeks just to figure out how Mel's optimization hack worked. There is nothing more evil to us than inefficient code, and "bloat" is almost the most unspeakable obscenity.

When a program is missing features, it's easy to explain to users that the feature will be included later. When a program is slow, or uses up lots of memory, people are much less likely to believe that this is something that will also be fixed in the course of development.

As a result, many projects are labeled "slow" and "bloated" before they've even reached the phase of heavy optimization (three that spring to mind are early GNOME, Mozilla and Enlightenment), and even when they do speed up, they hang on to the damning labels for much longer than they deserve. Mozilla is still slow in a number of critical areas, but despite what Netscape might have us believe, it's still pre-release. Compared with the milestones of six months ago, it's flying, and the profiling and optimization work continues. To take another example, Enlightenment's window manager went through a ground-up rewrite, but people still judged it on the performance of a very early development release, and wouldn't touch it with a bargepole.

So, the question is, do projects that release early and often open themselves up to unjustified labels, because they dared distribute unfinished work? How can Open Source projects protect themselves, without having to submit to the evils of premature optimization?

Optimization and the Bazaar | 61 comments (58 topical, 3 editorial, 0 hidden)
Optimization vs. Code Efficiency (3.83 / 18) (#4)
by Dougan on Wed Dec 20, 2000 at 12:02:14 AM EST

I think it's somewhat of a fallacy to imply that the faster code is, the less readable it is, or conversely that easily-readable code is doomed to be slow. Interfaces are the places where code readability is paramount, and interfaces are made readable through diligent use of... commenting.

This is the place most open source projects fall short; they are usually solidly coded, but have extremely poor internal documentation. (A generalization, I know, but I'd argue that on the whole it's true.)

In the absence of code comments, it's usually true that a program which uses a faster (and thus usually more complicated) algorithm is going to be more difficult to read. However, if your interfaces and code are well documented it shouldn't matter whether you use Hoare's algorithm for finding the k-th smallest element of your array, or the naive one that anyone can understand at a glance -- the task is the same.
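
For instance (a rough Java sketch; the names here are made up), a caller reading only the documented contract needn't care which implementation sits underneath:

import java.util.Arrays;

/** Selects the k-th smallest element of an array (k is 0-based, 0 <= k < a.length). */
final class Selection {

    /** Naive version: sort a copy, then index into it. O(n log n), obvious at a glance. */
    static int kthSmallestNaive(int[] a, int k) {
        int[] copy = (int[]) a.clone();
        Arrays.sort(copy);
        return copy[k];
    }

    /** Hoare's selection (quickselect): average O(n), but trickier to follow. */
    static int kthSmallestQuickselect(int[] a, int k) {
        int[] copy = (int[]) a.clone();
        int lo = 0, hi = copy.length - 1;
        while (lo < hi) {
            int pivot = copy[(lo + hi) / 2];
            int i = lo, j = hi;
            while (i <= j) {                       // partition around the pivot
                while (copy[i] < pivot) i++;
                while (copy[j] > pivot) j--;
                if (i <= j) {
                    int tmp = copy[i]; copy[i] = copy[j]; copy[j] = tmp;
                    i++; j--;
                }
            }
            if (k <= j) hi = j;                    // k-th element lies in the left part
            else if (k >= i) lo = i;               // ...or in the right part
            else break;                            // ...or is already in place
        }
        return copy[k];
    }
}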

Cheers,
Greg

Wow, I should really sleep... (2.40 / 5) (#5)
by Dougan on Wed Dec 20, 2000 at 12:04:26 AM EST

The subject should read "optimization vs. code readability."

Cheers,
Greg

[ Parent ]

A counter-example. (4.14 / 7) (#8)
by carlfish on Wed Dec 20, 2000 at 12:30:56 AM EST

I agree, efficiency doesn't always imply unreadability. However, there are quite a few situations in which you will reduce performance - usually because you're adding a layer of indirection. It's usually a case of design rather than algorithm choice. Some examples (from randomly glancing through Martin Fowler's Refactoring):

  • Replacing a confusing block of inline code with a well-named function call.
  • Replacing inline code that is repeated in many places with a function call.
  • Replacing a temporary variable with multiple calls to some query function.
  • Replacing some kind of type flag with subclasses, and replacing conditionals with polymorphism (a rough sketch of this one follows the list).
  • etc...
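
For the type-flag one, here's a rough Java sketch (entirely made up, not quoted from Fowler) of what the trade looks like:

// Before: a type flag plus a switch. Flat and fast, but every new kind of
// employee means editing this conditional.
class EmployeeFlag {
    static final int ENGINEER = 0, SALESPERSON = 1;
    final int type;
    final double base, commission;
    EmployeeFlag(int type, double base, double commission) {
        this.type = type; this.base = base; this.commission = commission;
    }
    double pay() {
        switch (type) {
            case ENGINEER:    return base;
            case SALESPERSON: return base + commission;
            default: throw new IllegalArgumentException("unknown type " + type);
        }
    }
}

// After: the conditional is replaced with polymorphism. Each call now goes
// through a virtual dispatch - the layer of indirection mentioned above -
// but adding a new kind of employee no longer touches existing code.
abstract class Employee {
    final double base;
    Employee(double base) { this.base = base; }
    abstract double pay();
}

class Engineer extends Employee {
    Engineer(double base) { super(base); }
    double pay() { return base; }
}

class Salesperson extends Employee {
    final double commission;
    Salesperson(double base, double commission) {
        super(base);
        this.commission = commission;
    }
    double pay() { return base + commission; }
}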

Off on a tangent, I'm mostly a Java hacker. One of the things in Java that is rather inefficient is String handling - strings are stored entirely in Unicode, so things like toLowerCase() can be pretty costly because they're working on multi-byte characters, and work differently in different locales. On top of that, String objects are immutable, so every time you want to change the contents of one, you take the hit of creating an entirely new object.

From the point of view of efficiency, you'd be better off using byte arrays. This makes code Really Ugly, and should only ever be attempted in time-critical sections. However, I still have people telling me regularly that you should never use strings in Java because they're inefficient. My response is that you should use strings exclusively, and only use byte arrays after you've done your profiling - because in 95% of the places you "optimize" by not using strings, you're not going to even notice the difference in performance, but you're really going to notice the difference in maintainability.
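
Something like this, say (a made-up sketch just to show the shape of the trade-off):

class CaseDemo {
    // Clear: locale-aware, works for any Unicode text, creates a new String.
    static String lowerReadable(String s) {
        return s.toLowerCase();
    }

    // "Optimized": mutates a raw ASCII byte array in place. Faster in a tight
    // loop, but wrong for anything outside ASCII and much uglier to maintain.
    static void lowerAsciiInPlace(byte[] b) {
        for (int i = 0; i < b.length; i++) {
            if (b[i] >= 'A' && b[i] <= 'Z') {
                b[i] += ('a' - 'A');
            }
        }
    }
}

Profile first, and reach for the second version only in the handful of places where it actually shows up.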

Charles Miller
The more I learn about the Internet, the more amazed I am that it works at all.
[ Parent ]

Optimising compilers (3.66 / 3) (#11)
by Spinoza on Wed Dec 20, 2000 at 02:41:10 AM EST

If you use a compiler that tries to be smart about inlining functions (ie. gcc with -O3) your first two points may not be a problem -- there's a chance that they'll be inlined by the compiler anyway.

Your second paragraph makes a good point: the section of a program needing optimisation is not always where you expect to find it. Assuming that higher-performance data types will speed up a program significantly is not always justified.

[ Parent ]

Inlining Considerations (none / 0) (#61)
by Morn on Thu Jan 04, 2001 at 01:26:34 PM EST

If you use a compiler that tries to be smart about inlining functions (ie. gcc with -O3) your first two points may not be a problem

Not, I believe, if they're in separate compilation units (i.e. separate '.o' files), which is often the case in well-organised code.

[ Parent ]
OT: Java and Strings (5.00 / 1) (#20)
by jacob on Wed Dec 20, 2000 at 10:38:11 AM EST

(Yep, this is off-topic. Sorry.)

You said:

...On top of that, String objects are immutable, so every time you want to change the contents of one, you take the hit of creating an entirely new object.

Actually, Java provides a way around this because Sun realized early on that it would be a big performance hit. The StringBuffer class is like a String, but mutable. When you're ready, you call toString() and you get an immutable String out of it. Problem solved!
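
A tiny, made-up illustration:

class Join {
    // Readable but allocation-heavy: each += builds a brand-new String.
    static String joinNaive(String[] words) {
        String result = "";
        for (int i = 0; i < words.length; i++) {
            result += words[i] + " ";
        }
        return result;
    }

    // Same task with StringBuffer: one mutable buffer, one final immutable String.
    static String joinBuffered(String[] words) {
        StringBuffer buf = new StringBuffer();
        for (int i = 0; i < words.length; i++) {
            buf.append(words[i]).append(' ');
        }
        return buf.toString();
    }
}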

However, I think you make a pretty good point otherwise-- as I like to say, not all O(n^3) algorithms with O(n log n) replacements are created equal. The one that makes code more readable and only ever gets executed once probably ought to stay; the one that happens in a tight loop with n > 1,000 every time needs to go.



--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
Addendum. (4.64 / 17) (#6)
by carlfish on Wed Dec 20, 2000 at 12:04:34 AM EST

After submitting the article, I discovered a very good short piece by Bill Harlan called A Tirade Against the Cult of Performance that I really should have linked to from within the story.

Charles Miller
The more I learn about the Internet, the more amazed I am that it works at all.

Getting a 404 on this. (nomsg) (none / 0) (#33)
by marlowe on Wed Dec 20, 2000 at 12:37:13 PM EST


-- The Americans are the Jews of the 21st century. Only we won't go as quietly to the gas chambers. --
[ Parent ]
Google Cache (4.25 / 4) (#47)
by farmgeek on Wed Dec 20, 2000 at 03:12:14 PM EST

Google's cache of A Tirade Against the Cult of Performance

[ Parent ]
I know, I know! (2.75 / 16) (#7)
by Friendless on Wed Dec 20, 2000 at 12:15:55 AM EST

You are almost right that efficiency and good design are at odds with each other. I am a programmer on a piece of Java code (90KLOC) which must be high performance. To make things interesting, when I inherited it, it didn't work and it was outrageously slow. For a long time, I could make efficiency and correctness enhancements by rewriting the code so it didn't do stupid things. In those cases, efficiency and good design went together, e.g. by grouping commonly used objects into classes where I could also put optimised methods which worked on the group, and cache results, and so on. Unfortunately, just this afternoon I have come to a situation where my good design allowed extension in one direction, an efficiency requirement calls for good design in another direction, and I cannot figure out how to merge the two. Anyway, back to the problem; it's not going to solve itself.

Later, later, later... (3.91 / 24) (#9)
by Sunir on Wed Dec 20, 2000 at 12:43:24 AM EST

Optimizing

The ideal. After code is well factored, it can be optimized by replacing an implementation of an abstraction with a more efficient implementation of that abstraction. For instance, if you need a priority queue, you can quickly write one by using insertion sort. However, after a while you may discover that the priority queue is a bottleneck. Then, you can replace your priority queue implementation with a min heap with no effect on the calling code. In this case, then, the code remains readable and efficient.
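
A rough Java sketch of the idea (the names are made up): callers depend only on the interface, so the implementation can be swapped without touching them.

interface IntPriorityQueue {
    void insert(int value);
    int removeMin();          // assumes the queue is non-empty
    boolean isEmpty();
}

// First cut: keep the elements sorted on insert. Quick to write, easy to
// read; insert is O(n), removeMin is O(1).
class SortedListQueue implements IntPriorityQueue {
    private final java.util.LinkedList list = new java.util.LinkedList();
    public void insert(int value) {
        int i = 0;
        while (i < list.size() && ((Integer) list.get(i)).intValue() < value) i++;
        list.add(i, new Integer(value));
    }
    public int removeMin() { return ((Integer) list.removeFirst()).intValue(); }
    public boolean isEmpty() { return list.isEmpty(); }
}

// Later, if profiling shows the queue is a bottleneck, drop in a binary min
// heap (O(log n) insert and removeMin). Calling code never changes.
class MinHeapQueue implements IntPriorityQueue {
    private int[] heap = new int[16];
    private int size = 0;
    public void insert(int value) {
        if (size == heap.length) {
            int[] bigger = new int[size * 2];
            System.arraycopy(heap, 0, bigger, 0, size);
            heap = bigger;
        }
        int i = size++;
        heap[i] = value;
        while (i > 0 && heap[(i - 1) / 2] > heap[i]) {     // sift up
            int parent = (i - 1) / 2;
            int tmp = heap[i]; heap[i] = heap[parent]; heap[parent] = tmp;
            i = parent;
        }
    }
    public int removeMin() {
        int min = heap[0];
        heap[0] = heap[--size];
        int i = 0;
        while (true) {                                     // sift down
            int left = 2 * i + 1, right = 2 * i + 2, smallest = i;
            if (left < size && heap[left] < heap[smallest]) smallest = left;
            if (right < size && heap[right] < heap[smallest]) smallest = right;
            if (smallest == i) break;
            int tmp = heap[i]; heap[i] = heap[smallest]; heap[smallest] = tmp;
            i = smallest;
        }
        return min;
    }
    public boolean isEmpty() { return size == 0; }
}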

Optimize later!

Pre-releases

In order for the bazaar to work, one needs to release early, release often. However, this only works within the development community. If the product reaches its end market early, naturally the end market will turn on it. This is the same with commercial software that reaches One-Point-Oh prematurely.

With proper labeling, this may be averted. Name the development stream the "development stream", and mark it "alpha", "beta", or "unstable". If you outright claim the software sucks, people's expectations will be lower.

In that vein, perhaps one shouldn't publish unstable binaries. At least not on the project front page; put them in a separate developers' release section instead.

I believe it's naive to think the programmers are the same as the market and vice versa. The market doesn't care a damn about the source code, and rightly so. They just want to solve their problems. If your solution doesn't work, then they shouldn't care any more. But if you don't pitch a development version as a solution, you're safe.

For a sufficiently popular open source project, enough developers will play with the nightly builds anyway. I don't think the loss of potential beta testers from the non-development sector outweighs the loss of potential users from that same sector.

I guess the lesson is Ship later!

"Look! You're free! Go, and be free!" and everyone hated it for that. --r

Why not optimise milestones? (2.83 / 6) (#10)
by enterfornone on Wed Dec 20, 2000 at 02:13:38 AM EST

I'm thinking Mozilla here. Most of the developers, I imagine, work from CVS. There would be a few who follow development closely and just go by nightly builds. But the target audience for milestones was always going to be end users.

Why not fork for a milestone, optimise as much as possible, and make a public release?

Yeah, it's fairly pointless and it's not going to help development in itself (since development will be happening from the unoptimised trunk) but it will give the end user a much better idea of what the finished product will be like.

Otherwise, do away with end user builds of alpha software. CVS still counts as releasing early/often, and you know with CVS that for the most part only serious developers are going to bother touching it.

--
efn 26/m/syd
Will sponsor new accounts for porn.
[ Parent ]
Optimize later, but not too late (4.50 / 6) (#16)
by Sunir on Wed Dec 20, 2000 at 05:08:57 AM EST

I will, I will, I will resist the urge to troll, despite the fact that hunk of crap keeps slicing MeatballWiki to ribbons. So, let's not use Mozilla as an example. Instead, let's use the hypothetical Goodzilla browser that is developed using decent, lightweight application development practices. Thus, it ships its iterations on time, is relatively bug-free, and has good feature karma. However, because it's an application, it's well-factored, not optimal.

If you can divert enough development effort into the go-nowhere, marketing-happy optimization work, cool. Go for it! It couldn't do any damage except slow the main development effort down for a bit. Moreover, since the entire codebase may not churn between iterations, the effort could be salvaged.

On the other hand, if the application is obviously too slow, especially after adding something new, you should optimize it then and there. You already know what's wrong. Since you know it's only one module's fault, you firewall the optimization damage to one section. Or, if the module can't be optimized, it shows you that the existing system design is flawed. It's best to fix the design earlier than later. Consider that if you build the entire crystal palace before optimizing, when you knock down the walls the whole building will crash down around you.

Thus, if Goodzilla is coming up to an iteration release and it notices that loading pages heavy with animated, transparent GIFs is slow, then it should look into that and fix it like the bug that it is. Since Goodzilla is always well-factored, being refactored mercilessly, the damage should be minimal.

Indeed, this is one of the purposes of iterations: to discover usability flaws. If the application is unusable because it is bloated or too slow, then that is a showstopper bug. Fix it!

Consider the counterexample of the telecom company (which shall remain nameless) that tried to rewrite its end-loop Plain Old Telephone Service software in an object-oriented style. Being real software engineers with real software architects (architects play golf!), they designed it the Right Way. However, after two years, they tried it out and it sucked. It took 10 seconds to get a dial tone. 100 times too long.

What failed? During development, they never ran the system, stopped, realized it was too slow, and fixed it. They just chugged along until they had built their glass house, and by then the system was far too complex to fix.

This is why it's important to get an end-to-end system up first and iterate quickly (one to three week iterations) afterwards to get feedback over what's working and what's not.

Optimize later, but not too late.

"Look! You're free! Go, and be free!" and everyone hated it for that. --r
[ Parent ]

I think that the way this is solved in practice .. (3.66 / 21) (#12)
by streetlawyer on Wed Dec 20, 2000 at 02:56:28 AM EST

is that Open Source software, whatever Eric Raymond says, doesn't innovate. (This is the best point at which to rate this post 0 and post a "TROLL ALERT" reply, if that's what you're planning to do, btw). Most software developed under the Bazaar model aims either to replicate existing commercial software, or to provide small applications for highly specialised uses. In both cases, the nature of the problem solves most of the questions regarding algorithm choice, and it also gives readers of the code a plentiful source of clues as to what a given piece of code is trying to do.

This isn't a bad thing, necessarily; the purpose of this comment is to point out that the author of the article may have identified a genuine weakness in the Bazaar model, and perhaps gone some of the way toward explaining why so many Open Source projects are either "Yet Another/ Is Not" copyware (pick your favourite example), or not really developed on Bazaar lines at all (remember that the original Raymond paper was actually a criticism of Stallman, and only later recast as a critique of Microsoft).

Finally, I'd note that medieval cathedrals weren't built in the way that Raymond says they were, and that Microsoft's "daily build" is actually much closer to Raymond's bazaar model than most open source projects. I'm afraid that the article was a classic example of what happens when someone intelligent attempts to practice social sciences without a licence; they tend to fall in love with one particular idea (in Raymond's case, free markets) and force everything to fit.

--
Just because things have been nonergodic so far, doesn't mean that they'll be nonergodic forever

So where is innovation happening? (4.50 / 10) (#13)
by Spinoza on Wed Dec 20, 2000 at 03:17:16 AM EST

I'll begin by offering the obligatory pointless counter-example to the "open source doesn't innovate" section of your comment. Today's counter-example is: ReiserFS.

That said, I don't think you are particularly wrong about lack of innovation in open source, but I think it would be closer to the truth to say that open source is no more innovative than closed source, and that real innovation depends on the people involved in a project. Innovative ideas for software don't just happen because of your development model.

There's also a problem of defining innovation. To some people, innovation may entail wildly new, world changing software. In this case, innovations are limited to such things as the web, the GUI, the first word-processor, and so on. To others innovation may occur as a new method for handling an old job.

I'm willing to speculate that no matter which way you define innovation, you won't find a working development model that encourages it any more than any other. Certainly not for the first definition.

[ Parent ]

I'll restate (2.85 / 7) (#14)
by streetlawyer on Wed Dec 20, 2000 at 03:24:05 AM EST

I think that what I should have said is that the Bazaar model does not innovate. Since most open source projects are actually not developed on the bazaar model, this is probably a more defensible position.

--
Just because things have been nonergodic so far, doesn't mean that they'll be nonergodic forever
[ Parent ]
Innovation in software is marketroid-speak (2.75 / 4) (#22)
by 0xdeadbeef on Wed Dec 20, 2000 at 10:58:27 AM EST

because even the most piecemeal advancement in the state of the code is an "innovation". The only real innovation happens in research labs and universities. It's a reflection of goals and priorities, not of development models or ideologies.

Every time someone checks out a piece of code, it becomes a cathedral of one. It doesn't matter how the code got to the state it is in; the same opportunity for "innovation" exists. The bazaar is not mindless evolution. You really are stretching the metaphor past the point of usefulness.

[ Parent ]
No, that's wrong (3.40 / 5) (#25)
by streetlawyer on Wed Dec 20, 2000 at 11:13:36 AM EST

The fact that a figure of speech comes from marketing doesn't mean that it is useless or meaningless. An innovation is either a new product, as seen from the point of view of the user, or a new implementation of an old product, using a different algorithm or programming technique, and giving a noticeable improvement in performance. Trivial improvements in coding (and most stages of optimisation are individually trivial) should not count as innovations in anything other than a technical, uninteresting sense. And if you think that the bazaar model is a useless metaphor, your criticism is better addressed toward Eric Raymond than me; I agree.

--
Just because things have been nonergodic so far, doesn't mean that they'll be nonergodic forever
[ Parent ]
It's right and you know it. (2.75 / 4) (#28)
by 0xdeadbeef on Wed Dec 20, 2000 at 12:03:14 PM EST

The problem with arguing with a known troll is I don't know if you're serious or taking me for a ride...

It is not meaningless, but as marketing jargon it is empty propaganda, about as meaningful as "New and Improved!" on dishwasher soap. If you consider every new product an innovation, from the users' perspective, then you've really destroyed your own argument. I see innovation every time a project has changed to address anything more than a flaw. From my point of view, the bazaar model is far more responsive at fulfilling my needs than software written in any other style.

[ Parent ]
no need for insults (2.50 / 4) (#29)
by streetlawyer on Wed Dec 20, 2000 at 12:18:38 PM EST

If you consider every new product an innovation, from the users' perspective, then you've really destroyed your own argument

In what way? Some innovations are useful, some are not; the bazaar model produces fewer of each type.

I see innovation every time a project has changed to address anything more than a flaw.

But this robs a useful concept of its usefulness, with no very obvious benefit in return other than whatever pleasure you get from referring insultingly to "marketroids". Why do you do this, particularly when it is so far out of line with common usage?

From my point of view, the bazaar model is far more responsive at fulfilling my needs than software written in any other style.

Thank you for sharing that with the group; but I don't recall ever arguing otherwise.

--
Just because things have been nonergodic so far, doesn't mean that they'll be nonergodic forever
[ Parent ]

Streetlawyer is a sophisticated troll-bot (2.42 / 7) (#34)
by 0xdeadbeef on Wed Dec 20, 2000 at 12:54:02 PM EST

I simply altered my definition of "innovation" to conform to yours, that is, that "innovation" happens whenever the user perceives something new. A word taken out of its useful context, and presented with a new definition to weaken the meaning of the old, is dishonest propaganda. It is poetic justice to use similar properties of language to punish those who do this, so I stand by my insult. An honest marketeer should take no offense.

Give evidence for your assertion that the bazaar model produces fewer innovations. I define a bazaar model product to be any piece of software that conforms to the open source definition. I define an innovation as any enhancement to an existing product that increases functionality, or a new product that provides functionality not already available in the same market niche (ie. other open source software). Count the innovations in this and any other development model. Your results will be scaled to reflect the total number of man-hours invested into the products of each type.

Until you can do this, you're just talkin' trash.

[ Parent ]
I've no idea what I've ever done to offend you (2.00 / 3) (#35)
by streetlawyer on Wed Dec 20, 2000 at 01:07:33 PM EST

and I don't understand what you're saying here.

I simply altered my definition of "innovation" to conform to yours, that is, that "innovation" happens whenever the user perceives something new

I don't think that you did. Your definition was to do with changes in programs other than to address faults; mine was to do with noticeable changes from the POV of the user. This is the ordinary language meaning of the word "innovation", I submit, and your functional definition captures too many changes which would not be recognised as innovations. Your insult is completely out of place (and I am not a marketeer).

Give evidence for your assertion that the bazaar model produces fewer innovations

Not until I get a bit of evidence that I wouldn't be wasting my time.

I define a bazaar model product to be any piece of software that conforms to the open source definition.

I cannot accept this; it is much more inclusive than Raymond's original definition, which, remember, excluded most of the GNU project from the definition.

I define an innovation as any enhancement to an existing product that increases functionality, or a new product that provides functionality not already available in the same market niche (ie. other open source software).

Nor can I accept this; you seem to want me to count open source clones of commercial software as innovations; a truly Orwellian redefinition.

Your results will be scaled to reflect the total number of man-hours invested into the products of each type

Nor should this be accepted; the whole point of the bazaar model was to *increase* the number of man-hours available.

I fear that with such a ridiculous request for me to do work, while offering none yourself, you are arguing in bad faith. Indeed, you are talking not trash, but shit.

--
Just because things have been nonergodic so far, doesn't mean that they'll be nonergodic forever
[ Parent ]

You ensnare innocents in your ruthless troll web (1.50 / 2) (#49)
by 0xdeadbeef on Wed Dec 20, 2000 at 04:17:42 PM EST

You make an assertion, but cannot demonstrate that reality backs that assertion. I am picking on you because you like to carry on like this, and you should know better. (When did I call you a marketer? I called you a troll-bot.)

You keep switching between a conventional definition of innovation and one defined by the user's perception and experience. That is the most Orwellian redefinition in this thread. I maintain that real innovation is more than an incremental improvement, and will have little correlation with the development model because it is primarily a creative act.

The bazaar is simply a communication model for distributed development, and can exist between people sharing cubicles or between people half-way across the world. The unique feature is the fact that the code is open, which is why I use that as the discriminating factor between a cathedral of many and a bazaar of one. A bazaar can scale because it has no barriers. Maybe that doesn't jibe with Raymond's statements, oh well.

You may scratch my market niche requirement. I simply regard a product with code available as having a unique feature when compared to a product without, so according to your user-based definition, it is an innovation in its own right. The man-hours scaling factor is to account for the fact that closed development far outnumbers open development, so a simple count would be misleading.



[ Parent ]
** unsubscribe ** (none / 0) (#55)
by streetlawyer on Thu Dec 21, 2000 at 02:17:21 AM EST

You keep switching between a conventional definition of innovation and one defined by the user's perception and experience.

The conventional definition of an innovation, which I have consistently used, is one defined by the user's perception, and it is now abundantly clear from this thread which one of us is the "troll-bot". Please don't make claims you're not prepared to support in future.

--
Just because things have been nonergodic so far, doesn't mean that they'll be nonergodic forever
[ Parent ]

Adding new functionality is NOT innovation (2.66 / 3) (#36)
by Carnage4Life on Wed Dec 20, 2000 at 01:16:31 PM EST

Give evidence for your assertion that the bazaar model produces fewer innovations. I define a bazaar model product to be any piece of software that conforms to the open source definition. I define an innovation as any enhancement to an existing product that increases functionality, or a new product that provides functionality not already available in the same market niche (ie. other open source software).

This is probably one of the poorest definitions of innovation in software I have ever seen. Simply adding new functionality to a product is not innovative, nor is adding functionality that merely does not exist in other Open Source projects.

I define innovation in software as either creating functionality that nobody else (not just other Open Source projects) has created before or combining elements in a way that no one has done before (thus Amazon's 1-click shopping can be considered innovative).

That said, I do not agree with streetlawyer that Open Source projects are not innovative. Many people simply look at KDE and GNOME and assume that since they both copy some form of Windows functionality, there isn't any innovation going on in Open Source software. Sendmail, Apache, the Linux kernel, Mozilla, Freenet, Perl, Emacs, Python, the BSD operating systems, etc. are all projects that have done things that have never been done before, and thus can be considered innovative. I also agree with you that if a comparison is done based solely on man-hours spent developing, it is unlikely that there will be a big difference in the number of innovations created in closed source software vs. open source software.



[ Parent ]
I concur. (3.00 / 1) (#50)
by Spinoza on Wed Dec 20, 2000 at 04:41:26 PM EST

I can see what you're getting at now. The bazaar model may be effective at refining software, but once the initial release has been made, and the bazaar model takes over, innovation is difficult. All the innovation happens when one person, or a small team, is working on the program, before the release to the open source world.

[ Parent ]
Cathedral/Bazaar (3.66 / 6) (#15)
by stuartf on Wed Dec 20, 2000 at 04:12:57 AM EST

Finally, I'd note that medieval cathedrals weren't built in the way that Raymond says they were, and that Microsoft's "daily build" is actually much closer to Raymond's bazaar model than most open source projects. I'm afraid that the article was a classic example of what happens when someone intelligent attempts to practice social sciences without a licence; they tend to fall in love with one particular idea (in Raymond's case, free markets) and force everything to fit.

Even ESR's own example - Fetchmail - doesn't conform to the bazaar model. It quite strictly follows a chief architect model, and would be a perfect example for "The Mythical Man Month". I can't quite see why he's such a hero.
What his example demonstrates is a few open source principles that a lot of people fail to grasp (see SourceForge for many, many examples):
1. You need to release something working before anyone else will help - a good idea is not enough to attract most people
2. You need project management skills to make it work - classic examples being Linus & ESR. The instant Linus took his mind off the Linux kernel, that's when the delays started.

[ Parent ]
Well put (3.00 / 1) (#32)
by sugarman on Wed Dec 20, 2000 at 12:36:13 PM EST

Thank you. You've managed to express what's been bugging me about the OSS model since I became aware / involved with it a couple years ago. It's always been hanging at the tip of my tongue, but I haven't been able to express it quite as well.

As for the comment about practicing social sciences without a license, I think it follows along these lines: those with an exposure to social sciences tend to be bombarded with several different ideologies, all of which seem to have their good and bad points. A brief exposure to any one discipline can give someone unfamiliar with the field a myopic view and the feeling that what they've learned is the One True Way.

As you spend more time in the field, in college for example, spending time in more courses, with different profs with diverse backgrounds and agendas, you see the field from a broader perspective. It is rare to be exposed to only one PoV in college.

Yeah, not 'insightful' or anything. I know. But like I said, it has been bugging me for a while.

--sugarman--
[ Parent ]

I have to disagree with this ... (4.41 / 24) (#17)
by StrontiumDog on Wed Dec 20, 2000 at 06:29:25 AM EST

The problem with well-factored, well-designed code, is that to start with, it's less efficient

Horse doo-doo. Well-factored, well-designed code is the most efficient thing there is. The problem is, you're lumping two different concepts together under the term "efficiency". These concepts are bloat and performance.

Bloat is a design problem. Code bloat arises either because of poor initial design (very common in Open Source failures), or because the bounds of the original design are overstepped to include features that weren't originally planned for (very common in successful Open Source projects: look at the monster that is the current Linux kernel).

Bloat is not limited to Open Source. In commercial software, the causes are: marketing hype ("we must add e-Foo and i-Baz to this!"), late customer requests ("I know it's almost finished but we've discovered we simply must have Bar functionality too!"), and plain old bad planning ("Shit, is this thing supposed to fly? We forgot to add wings!").

Performance is a scaling problem. All projects are fast enough in the beginning, when the data set used is small. Performance only becomes a problem when the input is scaled to real-world proportions. Everybody can write a 3D engine that can rotate a cube at 120 FPS -- few write a 3D engine that can rotate 10,000 cubes at 120 FPS. Similarly, that free text editor you cobbled up in your spare time works perfectly on 3K source code files, but barfs while trying to regexp a 3MB file.

Real life example: on a project I worked on some time ago, we decided to use Windows NT for a file-cache server in a very large distributed project. A large number of processes on different machines, ranging from mainframes to PCs, communicated via temp files on this machine. Speed was not essential, or so we thought, so temp files were chosen because of their simplicity. Well, I found out the hard way that a directory with 10,000 files overflows NT's sort buffer. The equivalent of dir *.foo takes forever, even on a machine with 512MB, and the results returned are unreliable -- more often than not garbage. See, we never had a problem during test runs -- no more than a few hundred temp files were generated -- and everything seemed fine. The problems only showed up when the project was scaled to production levels. Fixing this otherwise seemingly trivial problem did require ugly kludges to several separate chunks of existing code: added performance, at the expense of added bloat. The lesson? Pay careful attention to scaling issues from the beginning: therein lies the solution to almost all performance problems.

I have to disagree with you disagreeing with this. (4.60 / 5) (#18)
by carlfish on Wed Dec 20, 2000 at 07:42:06 AM EST

That's mostly true in the case of server applications. The vast majority of problems you'll get with servers are scaling problems. On the last project I worked on, which funnily enough involved passing files around between clients and servers, we were told in advance what the expected maximum file size would be, and what the expected throughput of the system was. So from the start we had functional tests reflecting these requirements that would warn us if we were running into problems that stemmed from design, because the tests would go red.

On the other hand, in the last week of production our functional tests all ran green, but various unit tests were going really slowly. Further investigation showed that while we were within the acceptable limits, we were still spending an inordinate amount of time doing EJBHome lookups. It wouldn't have been a huge scaling problem - it was linear - but it was a performance hit. So we knew it was worthwhile to implement a cache for EJBHome objects instead of just looking them up on the fly. We could have written the cache when we started, but we didn't know whether it was worth the effort until we had real data to back up our assumption.
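
Something along these lines is all it takes (a minimal, made-up sketch; real code would also narrow remote home interfaces with PortableRemoteObject):

import java.util.HashMap;
import java.util.Map;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

public class HomeCache {
    private final Map cache = new HashMap();   // JNDI name -> home object
    private final Context ctx;

    public HomeCache() throws NamingException {
        ctx = new InitialContext();
    }

    // The first call for a name does the (relatively expensive) JNDI lookup;
    // later calls are served straight from the in-memory map.
    public synchronized Object lookupHome(String jndiName) throws NamingException {
        Object home = cache.get(jndiName);
        if (home == null) {
            home = ctx.lookup(jndiName);
            cache.put(jndiName, home);
        }
        return home;
    }
}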

In the case of desktop applications, and there's a large number of them out there (the examples I used in the article were deliberately desktop apps), the load you put on a program when you're sitting at home playing with it in testing is going to stay within an order of magnitude of the load that Bob User is going to put on it while he loads up X, launches a few applications and browses the web. Developers of desktop applications tend to "eat their own dogfood" these days as a matter of pride.

Take a look at the Mozilla weekly reports, especially the Architecture section, and some of the performance fixes that get put in - they tend to involve things similar to the EJBHome example above: adding an internal cache here, sharing data structures between components there, all things that could probably have been designed into the browser from the ground up. But they add complexity to the implementation, so it's only after profiling under normal loads that the developers know where best to expend the effort.

Charles Miller
The more I learn about the Internet, the more amazed I am that it works at all.
[ Parent ]
You dumped 10,000 tmp files in a shared directory? (2.50 / 6) (#21)
by maynard on Wed Dec 20, 2000 at 10:46:53 AM EST

And you're arguing that the Linux kernel is code bloat at its worst? While using a single network-shared directory for a huge number of temp files as an interprocess communications mechanism isn't what I would call "bloat", I certainly wouldn't call it "good design" either. I find it highly ironic that here you're labeling certain specific Free Software projects as poorly designed, when the only example you give of a simple "scaling" problem was such a glaring mistake from the get-go. You honestly thought that would work? "Ugly kludges..." Are you kidding me?

I'm sorry, I think your response completely misses the point of this article, and your examples show little insight. JMO. --M

Read The Proxies, a short crime thriller.
[ Parent ]

Yep, 10,000 files. (4.25 / 4) (#23)
by StrontiumDog on Wed Dec 20, 2000 at 11:06:38 AM EST

The files had varying ages. The average lifespan of a temp file could vary between a few milliseconds to a few weeks. Test runs did not indicate that more than about 1000 files would be in the directory at any given time. Production showed otherwise: more long lived temp files were created in the production environment than according to the original (flawed) estimate.

And for the record, it was not my idea. I was working on a project involving teams from Oracle, CMG, Cap Gemini and a number of other smaller consultancies: this was the design (originally from a Cap Gemini team, I believe). This is reality.

The problem was not with the original estimate of 1000 files. In the real world, estimates are off all the time. The problem was failure to check if the system would still be robust with 10,000 files. That's scalability.

[ Parent ]

I just can't resist a cheap shot. (1.80 / 5) (#39)
by marlowe on Wed Dec 20, 2000 at 01:37:57 PM EST

If you're concerned about scalability, why are you even using NT?

-- The Americans are the Jews of the 21st century. Only we won't go as quietly to the gas chambers. --
[ Parent ]
I'd have to disagree with some of this (4.00 / 13) (#19)
by RangerBob on Wed Dec 20, 2000 at 10:08:46 AM EST

First off, well-designed code isn't inefficient; POORLY designed code is. A well-designed system is one in which there was careful design and planning for everything, including efficiency. I know some software engineering "purists" who would disagree about planning for efficiency, but I think they're also way wrong. The best designers are those who think about efficiency as they're doing their designs.

In my opinion, one of the biggest mistakes I've seen in Open Source projects is that there's no real plan or design. Many don't follow good software engineering practices. The commonly used "scratching an itch" analogy is good to some extent, but sometimes the itch can be awfully big and just hacking code without a path can take forever to get there, if it gets there at all.

Adding bells and whistles before the core functionality has been completed is also a problem. Mozilla and Kdevelop 2 are some of the apps that have kinda gone down this path. Mozilla has gotten a lot better, in my opinion, now that they have focused on bug fixes instead of adding features. Kdevelop 2 is also going this way in that they're adding all sorts of cool features while the core functions still don't work.

There have been some good successes though. I think the Linux kernel is one. Linus controls what goes in or out. He makes feature freezes where the only work that can be done is bug fixes (although this does slip). Qt is a very good example; I personally think it has a very well designed and thought out system. It's far more useful and powerful than, say, MFC (I primarily do Unix work but sometimes have to do Windows junk).

Amen (1.50 / 2) (#40)
by pianoman113 on Wed Dec 20, 2000 at 01:38:40 PM EST

I think a professor at Carnegie Mellon University published a book about writing good code. I read about it on some post or another (I think it was K5). In any case, you are right on about MFC. At my company we use Insure++ to debug some troublesome features and it has pointed out bugs and memory leaks in the MFC code (not to mention some in itself, what an honest program). I highly recommend Insure++ if you've got the thousands of dollars to spend on it.


A recent survey of universities nation-wide yielded astounding results: when asked which was worse, ignorance or apathy, 36% responded "I don't know," and 24% responded "I don't care." The remaining 40% just wanted the free pen.
[ Parent ]
Similar discussion on #kuro5hin (2.40 / 5) (#24)
by maketo on Wed Dec 20, 2000 at 11:11:41 AM EST

The other day we had a similar discussion on #kuro5hin. Basically, I would rather have a highly-optimized program with two options that are well debugged and work all the time than a program that is slow and not well debugged but has 300 options. Many a user settles for the latter kind of program and thus encourages bad design and programming practice. The author/company has to be _told_ that their way of doing things is wrong and has to be _persuaded_ to change the faulty ways.
agents, bugs, nanites....see the connection?
The Point (none / 0) (#38)
by pianoman113 on Wed Dec 20, 2000 at 01:33:42 PM EST

I think your comment misses the point of the article. Feature bloat is different from unoptimized code. During the development process, a feature shouldn't be optimized until you are sure that it is working and written in a way that can be understood. Once that feature works, optimize it. If a program runs slowly and has 300 undebugged features, it needs to be further developed. Rather than complaining about it, DEBUG IT, if you've got the source. However, in a corporate world, the company needs to take responsibility for that.


A recent survey of universities nation-wide yielded astounding results: when asked which was worse, ignorance or apathy, 36% responded "I don't know," and 24% responded "I don't care." The remaining 40% just wanted the free pen.
[ Parent ]
Mel the real programmer not my hero (2.77 / 9) (#26)
by speek on Wed Dec 20, 2000 at 11:40:33 AM EST

I pretty much hate Mel. I don't think very highly of the performance junkies. If your code is well-written, it should explain itself. If your code needs comments, it's poorly written. If it runs very slowly, you shouldn't have too much trouble finding the trouble spots, and optimizing them as you go. I don't think the code needs to be frozen to reach a point where it makes sense to do an optimization.

As far as planning up front: all in moderation. Too much planning is a waste of time. It's more important to write code that is easily modified (meaning understandable, non-duplicated code), so that, when you do learn some real facts about how your system should be designed, it's not much trouble to go in that direction. Yes, sometimes your effort will be duplicated, and you will move things multiple times, but to expect that you could have predicted everything correctly up front is a fantasy.

--
al queda is kicking themsleves for not knowing about the levees

The value of comments (4.66 / 3) (#30)
by joto on Wed Dec 20, 2000 at 12:33:16 PM EST

I pretty much hate Mel

I love Mel. It's such a fun story. He is not my hero though. Actually he is quite the opposite. But the story is so well-written, funny, and most importantly: right on target. There are a lot of Mels out there. Mel has a huge ego. Mel is clever, very clever. He understands his field better than almost anyone else. But he is a bitch to work with, because he makes the assumption that everyone should be as clever as him, when in reality there can only be one person that clever at exactly what Mel does. And that is Mel. If Mel could understand that, maybe others could understand Mel's code. But Mel is absorbed with his great ego, and doesn't want anyone else to understand. In my opinion, he is an asshole!

If your code is well-written, it should explain itself. If your code needs comments

If your code needs a comment, that means it is probably doing something tricky. And it is very appropriate to do tricky things, whether you are writing OS kernels, compilers and code generators, DBMSs, proof assistants, signal processing code, graphics algorithms, simulations, agents, whatever... But you need to explain what you do. A high-level comment at the top of each source file, explaining what it contains, literature references to algorithms, explanations of data structures and their purpose, perhaps a diagram explaining tricky parts, preconditions, postconditions, invariants, that stuff. And, not forgetting the all-important "what is the purpose of all this stuff?".
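
To make that concrete, here's a made-up example (not from any real project) of the kind of header I mean:

/*
 * SymbolTable.java -- maps identifier names to symbol records for the
 * compiler front end.
 *
 * Implemented as a chained hash table rather than a balanced tree, because
 * lookups vastly outnumber ordered traversals here; see any standard
 * compiler text for the technique. Invariants: a name is entered at most
 * once per scope, and scopes are pushed and popped strictly LIFO.
 */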

But I agree that mindless comments à la:

int i; /* an integer */
i = *a; /* set i to the value of what a points to */
i++; /* increase i */

only serve as obfuscation. High-level comments, on the other hand, are very useful. They are something I always try to write, and something I very often miss in other people's code. Hunting around with a debugger just to work out the purpose of some obscure function is not my idea of fun. And unless you already know the algorithm used, chances are you will have a lot of trouble understanding it if it does something tricky. Of course you have to aim at a target audience that has basic knowledge of what you are trying to do. But don't expect your code to be so clean that the algorithm you spent 3 weeks (or months) tuning is readily understood by everyone, unless you can explain it in plain English.

Self-explaining code is only possible if you are doing something simple. And if you do something simple, I agree that code should not be obfuscated by explaining the obvious.

[ Parent ]

The Mel Paradox (3.66 / 3) (#42)
by ubu on Wed Dec 20, 2000 at 01:53:45 PM EST

I love Mel...In my opinion, he is an asshole!

Precisely. This is the Mel Paradox.

Ubu
--
As good old software hats say - "You are in very safe hands, if you are using CVS !!!"
[ Parent ]
high level comments (4.00 / 2) (#45)
by speek on Wed Dec 20, 2000 at 02:20:41 PM EST

If you are commenting a method, explaining how it works - that's wonderful. However, if you are commenting a section of code within a method, then odds are 9 billion to 1 you should break that section of code out into its own well-named method, which goes a long way toward documenting what it does. I have nothing against commenting really, but excessively helpful comments like you mention are a sign the code is not clear in its own right, and opportunities probably exist to improve its clarity. The comments can stay as method-level comments, however. Although sometimes those get in the way too.
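
A made-up before/after sketch of what I mean, in Java:

class InvoiceMath {
    // Before: a comment marks off a chunk of code inside a longer method.
    static double totalOwed(double[] invoices, double taxRate) {
        double subtotal = 0;
        for (int i = 0; i < invoices.length; i++) {
            subtotal += invoices[i];
        }
        // add tax
        return subtotal + subtotal * taxRate;
    }

    // After: the commented sections become well-named methods; the comments are
    // now largely redundant because the names say what the code does.
    static double totalOwedRefactored(double[] invoices, double taxRate) {
        double subtotal = sumOf(invoices);
        return subtotal + taxOn(subtotal, taxRate);
    }

    static double sumOf(double[] invoices) {
        double subtotal = 0;
        for (int i = 0; i < invoices.length; i++) {
            subtotal += invoices[i];
        }
        return subtotal;
    }

    static double taxOn(double subtotal, double taxRate) {
        return subtotal * taxRate;
    }
}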

--
al queda is kicking themsleves for not knowing about the levees
[ Parent ]

Mel IS my hero. (2.66 / 3) (#31)
by simmons75 on Wed Dec 20, 2000 at 12:35:21 PM EST

Mel comes from an era when optimization was VERY important. Highly optimized code can mean the difference between software running FAST on an XT with 640K of RAM, and a program running SLOW on a P4 with 512MB of RAM (and constantly hitting swap, at that).
poot!
So there.

[ Parent ]
Doubtful (5.00 / 1) (#41)
by Jim Dabell on Wed Dec 20, 2000 at 01:46:45 PM EST

Highly optimized code can mean the difference between software running FAST on an XT with 640K of RAM, and a program running SLOW on a P4 with 512MB of RAM

No, optimisation will probably not get you anything like that difference in performance. A change of algorithm might, however.
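
For instance (a made-up Java sketch): no amount of tweaking the first routine lets it catch the second on large inputs, because the gap is algorithmic rather than a matter of tuning.

import java.util.Arrays;

class SortDemo {
    // Bubble sort with a hand-tuned early exit: still O(n^2) comparisons, so
    // the micro-optimization barely dents the running time on big arrays.
    static void bubbleSortTweaked(int[] a) {
        for (int limit = a.length - 1; limit > 0; limit--) {
            boolean swapped = false;
            for (int i = 0; i < limit; i++) {
                if (a[i] > a[i + 1]) {
                    int tmp = a[i]; a[i] = a[i + 1]; a[i + 1] = tmp;
                    swapped = true;
                }
            }
            if (!swapped) break;   // classic micro-optimization: quit if already sorted
        }
    }

    // Changing the algorithm: the library's O(n log n) sort wins by orders of
    // magnitude on large arrays, and it's one readable line.
    static void sortFast(int[] a) {
        Arrays.sort(a);
    }
}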



[ Parent ]
Marketing and Open Source Projects (3.00 / 6) (#27)
by slaytanic killer on Wed Dec 20, 2000 at 11:41:49 AM EST

I think this article would be better labelled "Marketing and Open Source Projects." As I understand it, your article is about the bad reputation an open source project may get because of users trying out early builds that haven't been optimized.

Therefore I think the solution to this is more social than technical. The maintainers should be honest people, with the attitude that It is done when it's done.

On the other hand, it is clear that such reasonable notions don't fly for businesses. They want promises; and when builds suck, the maintainers feel pressure, and a sense of importance, because businesses need them to optimize the software. Welcome to the world of Worse is Better, with RedHat boxes that lead people to believe all is well (except perhaps on your flaky computer).

It's all about balance. (3.33 / 3) (#37)
by rebelcool on Wed Dec 20, 2000 at 01:27:08 PM EST

There are two extremes to every issue, and the key is to find the middle way (how Buddhist of me). For one keen example, look at Mel the Real Programmer. That is optimization taken to the extreme. At the other end, he could've written highly legible code... but it wouldn't be nearly as fast. So the key is to find the right balance in there.

The same is true with releasing software. "Release early, release often" inevitably ends up with many people taking it to heart and releasing bloated, slow code that barely works. Of course, you don't want to wait too long either, only to discover a bug once your complex optimizations are in. I think it's best to optimize the parts you are most comfortable with, and then release. In fact, you should never release something you're not comfortable with. It's not only bad for your image, but bad for the poor souls who need to use your software and then discover it doesn't work well.

COG. Build your own community. Free, easy, powerful. Demo site

Ambiguity of "optimization" (4.54 / 11) (#43)
by pmk on Wed Dec 20, 2000 at 02:00:36 PM EST

After reading some of the comments here, I think that there is some confusion resulting from using the word "optimization" to mean two different things.

Optimization in the sense of Mel the Programmer is what Knuth was complaining about: taking an existing program and pounding on its implementation to speed it up, making tons of tradeoffs for performance against readability, portability, reliability, scalability, extensibility, and memory efficiency. At the extreme, it results in something that runs faster on one machine with one compiler and can be maintained by one programmer alone. Of course that's a bad thing in and of itself, and yet it is often entirely justifiable -- as opposed to tradeoffs made gratuitously, which were the real topic of Knuth's complaint. The worlds of embedded programming and lowest-level kernel coding need this kind of optimization; I want my kernel's TLB miss handler to use as few cycles as possible, and I don't care how ugly it has to be if it has to be ugly to be fast.

Optimization in the sense of good design is something else altogether. This is the optimization of taking the time to choose the right data structures and algorithms up front, perhaps after taking the time to build a prototyping framework in which different alternatives can be explored and measured. This form of optimization is not contrary to a development process, unlike the first form; it is a development process.

The most valuable form of optimization, however, is the kind that you don't get to see. It takes place when an experienced programmer makes the right design and implementation choices the first time, instead of going with inferior alternatives that will have to be expensively replaced later. A large part of a good programmer's value is little different from that of a good engineer in any field: wide knowledge of alternatives and their tradeoffs.

Regarding comments: when somebody tells me that good code needs no comments, I hand them a lollipop and tell them to go out and play. I don't expect to see comments in a symbol table routine explaining how a binary tree works, but I do expect to see comments explaining anything that is unusual about the specific binary tree implementation, as well as why the binary tree is not a trie, a hash table, a balanced tree, or some other plausible alternative. Any code that's made a tradeoff against portability also deserves a comment. The question I ask myself as I read code is: did the programmer know about the alternatives and the tradeoffs, or did the programmer just use the one technique that he or she understood? It is not the case that some algorithms and data structures are more "readable" than others, but it is true that some algorithms and data structures are more widely known than others.



Another form of optimization (2.33 / 3) (#46)
by kagaku_ninja on Wed Dec 20, 2000 at 02:37:03 PM EST

...is optimizing the time available to a programmer. Having a highly modular design that is easy to modify. Using existing class libraries, rather than hand-coding a more "optimal" one (in my experience, many coders write optimized code that isn't, but I digress).

These things may result in slower code, but if the programmer gets more done in a shorter period of time, the organization benefits.

[ Parent ]
Re: Another form of optimization (4.00 / 2) (#48)
by jfpoole on Wed Dec 20, 2000 at 03:28:55 PM EST

These things may result in slower code, but if the programmer gets more done in a shorter period of time, the organization benefits.

While having a more productive programmer is generally a good thing, there are times when valuing the developer's time over the user's time can be a bad thing. For example, if you're developing an application that's going to be run by tens of thousands of users (if not more), then paying attention to performance is a good idea. On the other hand, if there are only a couple of people using an application (say, internally at a company), then the developer's time should become more of a consideration.

-j

[ Parent ]

Re: Another form of optimization (2.00 / 1) (#51)
by kagaku_ninja on Wed Dec 20, 2000 at 05:15:13 PM EST

Perhaps in the open source world, this attitude makes sense. Try working for a startup and you will understand what I mean. There is never enough time, the money is running out, and it can be very hard to find good people even if you can afford them.

Anyway, if you follow the advice of Knuth and others, you could theoretically optimize it "later".

[ Parent ]
Re: Another form of optimization (4.00 / 1) (#53)
by jfpoole on Wed Dec 20, 2000 at 06:15:31 PM EST

Perhaps in the open source world, this attitude makes sense. Try working for a startup and you will understand what I mean. There is never enough time, the money is running out, and it can be very hard to find good people even if you can afford them.

It doesn't just make sense in the open source world -- it should make sense to anyone who writes software. I don't want to waste my users' time, especially when they pay for the software I write.

Anyway, if you follow the advice of Knuth and others, you could theoretically optimize it "later".

IIRC, Knuth said premature optimization is the root of all evil. That doesn't mean you can wait until everything's written, start optimizing then, and end up with something as efficient as code written with efficiency in mind from the get-go.

-j

[ Parent ]

heh..the old saying.. (4.00 / 1) (#54)
by rebelcool on Wed Dec 20, 2000 at 07:15:38 PM EST

A good programmer knows how to write his own stuff; a great programmer knows when to reuse.

COG. Build your own community. Free, easy, powerful. Demo site
[ Parent ]

Optimization and Open Source (3.00 / 8) (#44)
by Brandybuck on Wed Dec 20, 2000 at 02:12:37 PM EST

The typical Open Source project has more than one developer, so why make your code unreadable for your peers? You are not Mel, you will never be Mel, so stop trying to outdo him with your bizarre constructs. In my never-to-be-humble opinion, readability should *never* be sacrificed for optimization -- even after a stable release. Do you really expect users to spontaneously offer up patches if they can't read your code?

That said, there are good reasons for optimization. But use it sparingly. Contrary to the implicit belief of Open Source developers (as demonstrated by their actions if not by their words), comments do not slow down your code. You can put reams of comments in your program and not see one cycle of performance hit. So thoroughly comment all optimizations.
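
As a made-up but typical example of what a commented optimization looks like (hypothetical C; the names are mine):

    #include <stdio.h>

    #define NBUCKETS 256   /* must stay a power of two -- see below */

    /* Refuses to compile if NBUCKETS is not a power of two. */
    typedef char nbuckets_must_be_power_of_two
        [(NBUCKETS & (NBUCKETS - 1)) == 0 ? 1 : -1];

    static unsigned bucket_of(unsigned hash)
    {
        /*
         * OPTIMIZATION: the obvious form is "hash % NBUCKETS", but this
         * is on a hot path and NBUCKETS is a power of two, so a mask
         * does the same job without the divide.
         */
        return hash & (NBUCKETS - 1);
    }

    int main(void)
    {
        printf("%u\n", bucket_of(1234567u));
        return 0;
    }

The comments cost nothing at runtime; leaving them out costs the next reader an afternoon.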

Just a thought... (3.00 / 4) (#52)
by depsypher on Wed Dec 20, 2000 at 05:50:18 PM EST

How about creating a separate "optimized" version, the same way you have separate debug and release versions (one with built-in validity checking via asserts, the other without)? The un-optimized version could serve as a test stage to get the functionality working, which would then be refined in the optimized version. It'd be tougher to manage a project this way, but maybe worth the effort.
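
In C you already get half of this for free from assert() and NDEBUG. A sketch of the mechanism (hypothetical code, standard C):

    #include <assert.h>
    #include <stdio.h>

    /*
     * Debug build:    cc -g file.c             -> asserts enabled
     * Release build:  cc -O2 -DNDEBUG file.c   -> asserts compiled out
     */

    static int average(const int *v, int n)
    {
        int i, sum = 0;
        assert(v != NULL && n > 0);   /* validity check, gone in release */
        for (i = 0; i < n; i++)
            sum += v[i];
        return sum / n;
    }

    int main(void)
    {
        int data[] = { 2, 4, 6 };
        printf("%d\n", average(data, 3));
        return 0;
    }

The same program, two behaviours, one source tree -- the split is selected at build time rather than maintained by hand.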

a good idea. (4.50 / 2) (#56)
by cloudnine on Thu Dec 21, 2000 at 04:07:08 AM EST

But unfortunately, it doesn't work out in real life (at least, I have found it doesn't). It's hard enough to manage one software tree, let alone two. No matter what, some fundamental change in the code always occurs, and developers are stuck syncing and debugging two trees of code that do the same thing, but in different fashions. It's a nightmare that I wouldn't wish on anyone. Also, I think that with open source stuff, contributions, excluding those from the core developers, are sporadic and made by many different people (correct me if I'm way off base), which only adds more grief to the equation.
But after saying all that, I certainly don't have a solution for this article's quandary! :-)
ttyl, -9
Happy holidays, all.

[ Parent ]
It's not as hard as you claim (4.00 / 1) (#58)
by The Welcome Rain on Thu Dec 21, 2000 at 05:03:40 PM EST

Most reasonable revision control systems allow one to perform branching operations pretty easily. It is common for development efforts to have mainline and release branches, and for fixes to propagate along both. That's what release engineering is good for.

[ Parent ]

Protecting your project (5.00 / 4) (#57)
by amokscience on Thu Dec 21, 2000 at 02:09:30 PM EST

As in anything that deals with people: manage their expectations. Drill that into your head. Stencil that into your eyes. As I see it, you can't design perfect performance into your project (if you can, please call me -- I want to work for you =), so a great deal of your work is not directly related to optimization, but to controlling user fallout as they run up against performance problems.

I can't say that enough. If there's one thing that people routinely fail at, it is managing expectations. I see it time and again when games are being hyped, when software is being advertised, at sporting events, in volunteer organizations, in relationships with co-workers, and on and on down the list.

Here are some suggestions. I'm positive they go against marketing rules and everything the major industries (tobacco, defense, movies, sports, etc.) would do, but you are different. You are running an open source project, so be open with your information!

1. Good Documentation. Have a plan, and spell out how you intend to deal with problems. A roadmap and a battleplan are a good way to get rid of most people's anxieties. Also create a FAQ and be as specific as possible. Saying "optimizations will be taken care of once the product is mature" is a dumb way to deal with a problem -- especially when the product still isn't mature a couple of years later.

2. Admit your failures and problems. If a problem comes up that is important, immediately take steps to address it. Don't keep it quiet hoping no one notices (perhaps an exception for security bugs?). This will also save you time when a gazillion people ask you if you noticed that problem XYZ is occurring on their systems.

3. Be responsive. Ignoring user input is the single worst thing you can do. Why? The people who care enough to respond are exactly the people you need to keep using your project. The people who never report have probably already moved on to something else. A quick, friendly (or at least neutral) response will brighten a user's day, and he will think even more positively of the project. This leads to good word-of-mouth advertising.

4. Do an optimization release every once in a while. When the timing seems right (after a .0 release?), get some optimizations done along with the bugfixes. This will also help scratch the itches you've noticed yourself while testing your project.

5. Don't hesitate to consider a redesign. When a project starts to bog down, it's a good sign that you haven't designed enough. Also, check your design after each major release milestone. If you are beginning to exceed your design requirements, that is a gigantic RED FLAG telling you to recheck your additions. Poor design will only get in the way of good optimizations.


If you can't do these things then you probably shouldn't be releasing to the public. Either that, or accept that you will have massive turnover and backlash throughout your development cycle. Those are my experiences from both ends of the dev cycle, as both a user and a developer.

Just an opinion, sort of orthogonal. (4.50 / 2) (#59)
by Crutcher on Thu Dec 21, 2000 at 11:48:15 PM EST

In my coding thus far, I have consistently found that the following is true:

"All problems have simpler representations at higher levels."

By this, I do not mean abstraction levels, but cognitive ones. It is the problem of synthesis, or as a coder I respect greatly puts it, "one maniac alone can do what 20 together cannot." Not all code comprehension problems are to be found at the communication level -- though Brooks was right, and many are there. But sometimes the best answer to a problem cannot be represented cognitively below a certain level, so a coder who can fit it all in their head can solve it there and write it back out, yet the solution /cannot/ be broken into smaller cognitive pieces.

Coding in single blocks often makes possible space/time efficiencies that aren't even expressible in a design by parts, and this is not being obtuse. Coders at that level can read it just fine; it's just that many people are not at that level.

Remember, math is hard. If it weren't, coding wouldn't pay worth crap.
Crutcher - "Elegant, Documented, On Time. Pick Two"
Courage! (2.00 / 2) (#60)
by elenchos on Thu Dec 28, 2000 at 05:17:40 AM EST

While Open Source projects may ultimately depend on widespread acceptance for their success, you cannot let ill-formed public perception push you into programming or design decisions that you know are wrong. Falling into that trap destroys any hope an Open Source project has of producing superior-quality software. If you care only about being popular with the masses and look only as far ahead as next quarter, you might as well be writing closed-source commercial shrinkwrap.

You should be forthright and completely honest about what stage your project is at and what its limitations are, including all known bugs, especially security-related ones. It's unfair when the ignorant give your project a bad reputation because they can't understand what to expect from an early release, but if you stick to your principles, you will be vindicated in the long term. Throw a sop to public opinion by prematurely optimizing your code, and your bad reputation might then prove to be well-deserved.

Adequacy.org
