Kuro5hin.org: technology and culture, from the trenches

What about CWEB?

By klash in Technology
Thu Feb 08, 2001 at 12:13:11 PM EST
Tags: Software (all tags)

When Donald Knuth, regarded by many as the world's greatest Computer Scientist, says something as bold as "The CWEB system...makes programming better than any other method known in the world, by far," you'd think people would listen. Yet I can't think of a single time I've heard of anyone using CWEB. Why?

From the same interview: "I simply have to be honest and say that it's the greatest thing that's there." From the same guy who brought you The Art of Computer Programming and TeX, why doesn't anyone talk about CWEB? Very few people have even heard of it! So I embarked on a quest to find out more about this magical productivity doubler, and see what it could teach me.

So what exactly is CWEB? A search on Google for "cweb" brings up only one interesting result in the first 30, and it's only a brochure page about the book The CWEB System of Structured Documentation by Knuth and Silvio Levy. The best description I could find was "WEB is a software system that facilitates the creation of readable programs...The main idea is to regard a program as a communication to human beings rather than as a set of instructions to a computer. Your program is also viewed as a hypertext document, rather like the World Wide Web." It seems to combine Knuth's TeX with software development somehow, though trying to discover just how is no easy task. This web page makes the same bold claims that we saw in the interview: "If you are in the software industry and do not use CWEB but your competitors do, your competitors will soon overtake you---and you'll miss out on a lot of fun besides."

So if CWEB will really allow you to do all the things a smart guy like Donald Knuth says it will, things like:

  • Write programs of superior quality;
  • Produce state-of-the-art documentation;
  • Greatly reduce debugging time;
  • Maintain programs easily as conditions change.
is this something we should be checking out, or at least talking about?


What about CWEB? | 30 comments (21 topical, 9 editorial, 0 hidden)
Re: CWEB (4.28 / 7) (#1)
by j on Tue Feb 06, 2001 at 04:53:23 PM EST

I remember WEB, the predecessor of CWEB; the only real difference seems to be that, while WEB was geared towards working with Pascal, CWEB is intended to be used in conjunction with C.
The basic idea is the following: You write a special kind of source code that contains both the actual code and the documentation for this code. Then, you run it through a precompiler that splits it into pure source code and pure documentation; both can then be compiled separately. I wonder whether CWEB files compile directly - that would certainly be a good idea.
The idea is basically the same as the idea behind JavaDoc or PerlPOD: By keeping the documentation close to the code, you make it easier for the developer to document it as [s]he writes it.
Yes, it makes documenting your code a bit easier. Certainly not more fun. It does improve maintainability if you work in a team or write libraries that are going to be used by other programmers. But I think Knuth got a little carried away in his description of CWEB's benefits.

POD is not literate programming (4.40 / 5) (#6)
by Luke Francl on Tue Feb 06, 2001 at 05:08:52 PM EST

And neither is JavaDoc. Check out the article "POD is not Literate Programming" by Mark-Jason Dominus on perl.com.

The difference is that while POD, JavaDoc and similar systems allow you to write documentation in your code -- a very useful feature -- they do not allow you to separate error-catching code from the rest of the program. This is called "tangle", and it makes the code much easier to read. Just think how much C code is devoted to checking return values. In literate programming, that code is stored elsewhere in the program, then "tangled" into the appropriate place at compile time.
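A toy sketch of that mechanism, in C, for illustration only. It is not ctangle: the `chunks` table, the chunk names, and the functions `expand` and `tangle_chunk` are all invented for this example, and the real tool does considerably more. It only shows how an error check can live in its own named chunk and be spliced in wherever `@<name@>` appears.

```c
#include <stddef.h>
#include <string.h>

/* A named chunk of code. Names and bodies are made up for this sketch. */
struct chunk {
    const char *name;
    const char *body;
};

static const struct chunk chunks[] = {
    { "copy the file",
      "FILE *in = fopen(path, \"r\");\n"
      "@<bail out if the file did not open@>\n"
      "copy_lines(in);\n" },
    { "bail out if the file did not open",
      "if (in == NULL) return -1;" },
};

/* Look a chunk up by a name that is `len` characters long. */
static const struct chunk *find_chunk(const char *name, size_t len)
{
    for (size_t i = 0; i < sizeof chunks / sizeof chunks[0]; i++)
        if (strlen(chunks[i].name) == len &&
            memcmp(chunks[i].name, name, len) == 0)
            return &chunks[i];
    return NULL;
}

/* Append n characters of s to the NUL-terminated buffer out. */
static int append(char *out, size_t cap, const char *s, size_t n)
{
    size_t used = strlen(out);
    if (used + n + 1 > cap)
        return -1;                       /* would overflow the buffer */
    memcpy(out + used, s, n);
    out[used + n] = '\0';
    return 0;
}

/* Append text to out, replacing every @<name@> with that chunk's
   expansion. No cycle detection: chunks must not refer to themselves. */
static int expand(const char *text, char *out, size_t cap)
{
    const char *p = text;
    while (*p != '\0') {
        const char *ref = strstr(p, "@<");
        if (ref == NULL)                 /* no more references */
            return append(out, cap, p, strlen(p));
        const char *end = strstr(ref + 2, "@>");
        if (end == NULL)
            return -1;                   /* unterminated reference */
        if (append(out, cap, p, (size_t)(ref - p)) != 0)
            return -1;
        const struct chunk *c = find_chunk(ref + 2, (size_t)(end - ref - 2));
        if (c == NULL || expand(c->body, out, cap) != 0)
            return -1;                   /* unknown chunk or overflow */
        p = end + 2;
    }
    return 0;
}

/* Write the fully expanded chunk `name` into out. Returns 0 on success. */
int tangle_chunk(const char *name, char *out, size_t cap)
{
    const struct chunk *c = find_chunk(name, strlen(name));
    if (c == NULL || cap == 0)
        return -1;
    out[0] = '\0';
    return expand(c->body, out, cap);
}
```

Calling `tangle_chunk("copy the file", buf, sizeof buf)` leaves three lines of plain C in `buf`, with the NULL check sitting between the `fopen` call and `copy_lines`, even though the two chunks were written separately.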

[ Parent ]

It certainly isn't. (3.50 / 4) (#7)
by j on Tue Feb 06, 2001 at 05:32:25 PM EST

The wording of my reply was poorly chosen. I did not mean to say that either POD or JavaDoc represented Literate Programming. I just wanted to give examples of approaches where the documentation for program code is directly mixed with the code. I did neglect to mention that WEB goes beyond that.
As to the increase in readability: I'm not sure. It's probably my fault, but I just can't find WEB files all that readable. That is at least partially due to the fact that my TeX is a bit rusty. I looked at this example that came with cweb (in Debian/unstable) and I think it would have been easier for me to figure out how it works if it wasn't for all the documentation.
On the other hand, things might be different with a life-size project and if it wasn't about five years since I last used TeX.

[ Parent ]
That example sucks! (3.00 / 1) (#20)
by Luke Francl on Thu Feb 08, 2001 at 01:33:23 PM EST

I just looked at the code you linked to, and I would have to agree with you. It seemed like too much formatting. I find JavaDoc-style comments pretty easy to read, because they are simple. You can use HTML and there are a few @ tags, but that's all you have to worry about; the javadoc program handles all the formatting for you, really.

Honestly, I've never tried literate programming. I would enjoy giving it a whirl, though; it seems like a decent idea.

[ Parent ]
Javadoc is a hassle (3.71 / 7) (#8)
by jabber on Tue Feb 06, 2001 at 05:37:29 PM EST

It is. Not only do you worry about writing your code, you also worry about keeping your API documentation up to date. Geeks hate documentation, and only do it as a necessary evil. Javadoc is a good idea in theory, but in practice, it's even more of a hassle than comments - because of the specific @tags and markup conventions it requires. Maintaining documentation in source or separately is still maintaining two separate resources. Synchronization of the two is a hassle.

I use Javadoc, but I do not enjoy doing so. I recognize and appreciate the benefits, but I consider it a chore. I write the documentation first, like a good edumacated CS graduate should. Then I write my code to do what the documentation says it ought to. Then, once the product slips into maintenance phase, I find myself needing to update documentation to reflect the code, and that's very unpleasant to do. One more thing to worry about on a tight schedule.

If what you say about CWEB is all that there is to be said about it, then maybe it is better that it isn't more popular. If CWEB is something that is the "greatest thing since sliced bread", academically speaking, but a pain to use in practice, then I'm not surprised that the non-academics are not familiar with it. Let's leave it to the ivy-covered Computer Scientists, I say!

But, Knuth is brilliant, and he wouldn't shackle us to yet another inconvenient convention. TeX makes lives easier. Computers make lives easier. Knuth doesn't strike me as one of those people who invent a convention or a process, just to get his name on yet another academic publication (unlike Booch, for example).

So I think that there's got to be something special about CWEB. Something that makes the line between documentation and instructions very, very, well, nonexistent. If CWEB blurs source and documentation, so that one is derived from the other, or so the language is itself a hybrid of both - simultaneously readable to people and parsable as instructions to machines, then CWEB really ought to peek out into the real world.

Hey! Maybe CWEB is the language in which the OS for the Ginger is written.. ;)

[TINK5C] |"Is K5 my kapusta intellectual teddy bear?"| "Yes"
[ Parent ]

Literate programming & wine buying (4.75 / 4) (#13)
by slaytanic killer on Wed Feb 07, 2001 at 07:31:07 AM EST

Literate programming is not for every situation, like with startups. Startups cut corners all the time, including performance, security, and readability. Oftentimes, the corners cut end up slowing a project down if they're not done with foresight. So there's no wonder why people say, "Leave it all to the academic world!"

I actually find using Javadoc/Literate Programming as interesting as the programming. On most of my projects, the commenting takes more time than the coding, but that's part of the deliverable. My deliverable would be less useful and dependable without it.

So everyone knows what we're talking about, here's the form of a Javadoc comment, taken from code:

/**
 * Constructs a debug graphics context from an existing graphics
 * context that supports slowed down drawing.
 * @return just an example, there is no return val
 * @param graphics the Graphics context to slow down
 */

So, for Javadoc you gotta use the funny comment style, but your tools either make that take half a sec, or it's really simple to do using cut 'n paste. The @return tag says what you return, the @param tag says what your parameters mean. Just a couple other tags, and again these can be autogenerated so you don't actually have to remember what a tag looks like.

You can even delete them -- the point of Javadoc is to turn your documentation into HTML form, and even the Sun programmers sometimes just leave them out. If your documentation is obvious, then perhaps you don't need the tags.

When I have to go to the store to buy wine in a foreign country ("You better bring back Rioja, you can't fucking go there and not bring it back!"), I look at the documentation on the back to find out the best one. If it's competently done and is truly informative (not just "We step on the finest grapes," but temperatures & lifecycles), then I buy it. You can expect the maker to at least know what wine is used for. Same with commenting; I want the programmer to foresee the usage.

That said, Literate Programming is for saner times than these... Knuth expected novels to be created around code.

[ Parent ]
Very well said (4.33 / 3) (#15)
by jabber on Wed Feb 07, 2001 at 09:26:26 AM EST

I absolutely agree with the value that Javadoc (and similar creatures) add to code. In being a maintenance developer at times, I've needed to rely on someone's documentation, and that gives me an appreciation of keeping it up to date as a primary coder.

As you say, today it is difficult to keep the value of proper software development in perspective. Proper SE includes control of requirements, good design, complete documentation and the TESTING that's too often cut short to make a delivery date. I suppose the field needs to mature some more before management stops budgeting and scheduling around the 'best case' scenario, and stops making the penny-wise, pound-foolish decisions about getting a product out the door sooner rather than better.

Code reuse is a much hyped thing these days and documentation is important there, but it is still too easy to neglect for the short-term gain. Once we get comfortable with systems reuse it will be absolutely crucial to have complete and correct documentation, and documenting while in the same frame of mind as when coding seems to deliver better information.

I appreciate your well thought-out response. I can always use one more variation on the theme to present to the powers that be when it comes to putting together the next Great Schedule. Of course items such as documentation and testing are usually dismissed as 'padding'. *sigh*

[TINK5C] |"Is K5 my kapusta intellectual teddy bear?"| "Yes"
[ Parent ]

Well, I'm hooked (3.00 / 3) (#2)
by Remmis on Tue Feb 06, 2001 at 04:57:36 PM EST

So what is it? :P As a programmer I like to at least find out if wild claims like "...better than any other method known in the world, by far", have any base to them. I've never heard of CWEB. It sounds vaguely familiar but I can't place why. Maybe nobody uses it because they've never heard of/can't find it? Maybe?

Literate programming is alive and well (4.66 / 9) (#5)
by tmoertel on Tue Feb 06, 2001 at 05:05:26 PM EST

While it might be somewhat difficult to find a ton of information on CWEB, it's much easier to find out about Literate Programming in general. Also, some programming languages such as Haskell have support for literate programming as part of their specifications.

And so while much of the programming industry overlooks literate programming, it is still alive and well, and growing every day.

And Knuth is right: Literate programs are more fun. ;-)

My blog | LectroTest

[ Disagree? Reply. ]

Alive maybe, but on life support (none / 0) (#30)
by Ross Patterson on Sun Feb 11, 2001 at 11:44:22 PM EST

Half of the sites on the web ring are 404'ed, and the only tool development since I stopped paying attention in 1995 is Microsoft's AutoDuck. It's a shame, because with the WWW and HTML, we finally have the tools we need to ensure that literate programming can really prosper. But there's no innovation.

[ Parent ]
Use of CWeb... (2.50 / 4) (#12)
by mystic on Wed Feb 07, 2001 at 03:45:10 AM EST

I am taking a course, Design and Implementation of Software Tools, where one of our projects is to hack on a JCASE software using FleXor. We will be using Tangle and Weave to help in our coding and documentation.

That is all I know so far ! Hehe :)

The Trouble with Literate Programming... (4.00 / 5) (#17)
by Morn on Wed Feb 07, 2001 at 01:31:19 PM EST

...is that it's hard to do well enough to be useful to others.

I've had to expand a literate programming project for a project in CS before, and it was a nightmare. I was writing modules to interface with existing code, but had to wade through the intricacies of the implementation of the existing modules just to find out how to interface with it. If you too want a taste [hint: you probably don't], take a look at the document I've linked to and try to figure out the calling conventions for spawning a new process in Abstract Machine code, and the C code required to generate AM code for the sequence.

It takes a very good author to write a literate program which is easily useful for others to interface to at code level. Only a moron can't write Doxygen or JavaDoc comments.

Just so that I'm not too one-sided, I will say that if you want (or need) to understand the internal functioning of a piece of code, a well-written literate piece of code is about the best kind you could hope for.

I only know one program that uses this technique (3.00 / 1) (#21)
by Ricdude on Thu Feb 08, 2001 at 02:22:57 PM EST

It's the IFMapper program for the Palm Pilot. Searching google will return many hits, but the main site appears to be abandoned. Maybe it's a sign...

C-Web is great (2.00 / 3) (#22)
by PresJPolk on Thu Feb 08, 2001 at 07:06:34 PM EST

For me, it was a tough choice between Chris Webber and Rasheed Wallace for who to vote alongside Tim Duncan for the Western Conference All-Star forwards.

Jason Williams' flashy passes wouldn't be very fun if there weren't someone equally flashy to catch them, would they? Of course not.

If C-Web ends up in New York next year, only Miami, if they have a healthy Alonzo Mourning, would be able to stop the Knicks from reaching the finals.


Literate Programming (3.00 / 1) (#23)
by arnald on Thu Feb 08, 2001 at 08:39:51 PM EST

For an explanation of what's going on and what it's all about, look at http://www.literateprogramming.com.

It's a lovely way to write. But as Knuth acknowledged in a recent lecture of his that I attended, it's a rare person who makes a great programmer AND a great expositor.

not necessarily the best of both worlds (5.00 / 1) (#24)
by drgerg on Fri Feb 09, 2001 at 10:16:34 AM EST

I screwed around with CWEB a bit when I was in grad school and I was somewhat less than impressed.

As others have already stated, the idea of CWEB is that your code and documentation are intimately interwoven with each other. In this specific case, you are mixing C code with TeX formatted documentation.

In principle this is a nice idea:

  • you can extract nicely formatted docs from your code in a straightforward manner
  • you can have richly formatted documentation which lives in close proximity to the code it's documenting. This would be great for complicated mathematical code.
The problem I had with CWEB is, I think, inherent in anything which allows rich formatting: there is way too much markup in the comments. When I am working on code, I don't want to either have to parse TeX in my head (in order to be able to understand the documentation) or to keep a separate window with a DVI (or whatever) viewer up in order to be able to read the comments. This requires way too much work (and wasted brain cycles) on my part. I also didn't much like having to mentally switch modes from C to TeX in order to be able to add documentation to my code.

So I think that literate programming is a great idea, but Web/CWeb is an awkward implementation. Given how far things have come in the last N years (can't remember what N is at the moment) since Web was introduced, I'm pretty sure that someone could come up with a more developer-friendly alternative.


Some thoughts (2.00 / 1) (#25)
by krlynch on Fri Feb 09, 2001 at 02:00:35 PM EST

I think that there are a number of reasons, all connected, that no one uses it (at least in open source software; don't know about in industry); I think this applies to literate programming in general:

  • "Real programmers don't need documentation": if you can't convince programmers to integrate comments in their code, you aren't going to be able to convince them to write lots of documentation.
  • "Real programmers don't need IDEs": writing literate code without tool support is even more difficult than writing uncommented code without tool support...and as we know, no self-respecting programmer uses an editor more advanced than vi.
  • "Real programmers write to the metal": these techniques seem to be most applicable when you are writing at a higher level of abstraction; when most of your code is in "portable assembly", you tend to have to write more lines of code, which leaves you with less time to write all the documentation you should be writing.

Of course, to be fair, I don't do any of these things in my own coding either :-)

Literate Programming is Hard (4.00 / 1) (#26)
by pfaffben on Fri Feb 09, 2001 at 03:50:21 PM EST

Literate programming is more difficult than other types of programming, because you need to be able to write a coherent essay, or even a book, as well as write a program. I personally like literate programming, and am currently rewriting my library for binary trees and balanced binary trees (GNU libavl) as a literate program.

Literate programming is what you use when you want the finished product to be a polished gem, a beautiful crystalline structure. You don't use it for little scripts that you whip up and for programs that need to be done right now.

My thoughts on CWEB and literate programming (5.00 / 2) (#27)
by azul on Sun Feb 11, 2001 at 11:46:09 AM EST

I was asking myself the same thing about half a year ago: "Wow, it's Knuth here and he is saying such strong words!" I then decided to give literate programming, incarnated in his CWEB, a try.

I have written two free programs using Knuth's CWEB:

  • Real-time telnet game with ASCII animations where each player has a ship and is supposed to destroy others.
  • Internet super-server. Actually, the name is wrong in that it isn't connected in any way to email spam. Basically, a program to send as many ``requests'' (using whatever TCP/IP protocol: it could be HTTP, SMTP, XWindows, whatever) to a given internet daemon as fast as the underlying operating system allows.

In the first, I didn't use CWEB properly. I didn't take the time to write comments properly. And I don't regret that. CWEB was, however, useful because of the macro system it provides.

In SpamBot things were different. I took a lot of time to actually write down what I was doing. You can see the resulting documentation as PostScript (268 KB) and as compressed PostScript (68 KB).

I have not received any code contributions at all for any of those programs but, on the other hand, neither have I received any code contributions for many other free programs I have released.

There is something I would change in CWEB: I would make it possible to specify parameters to the @<macros@>. In his papers, Knuth says he thinks this is not required since you already have C's #define macros (available in CWEB using the @d construct). Yes, that's right, there are, and you could live using nothing but C's #defines. Eventually, you'll run into two problems.

The first is that you can't call @<macros@> inside of them. That's okay, you don't need @<macros@> but just @d macros. Fixed. Just remove the @<macros@> from CWEB, they are not needed.

The second problem, however, is that __LINE__ is defined ``wrong'' inside of C #defines (at least when compiling with GCC). I use ASSERTs heavily in most of my code. If you take a look at Matanza or HB, you will see there are functions with far more lines with ASSERTs than actual code. When one of those ASSERTs is triggered, the __LINE__ where the error took place is reported to be the line where the definition was used, not the actual line inside the definition where the __LINE__ was written. This may not seem that big a problem, but when you have #defines that use other #defines that use other #defines that use other #defines, this is a very big problem: there is no easy way for you to know which was the line on which the error happened.

A priori, this would seem easy to fix: just make it so the compiler will report the real line where __LINE__ was written, but that would make things even worse as all those __LINE__s were written in the definition of my ASSERT macro! So whenever an error takes place, no matter where (in a @<macro@> or a @d (#define)), it would point me to the definition of my ASSERT macro.
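The `__LINE__` behaviour described above can be reproduced with a few lines of C. This is a sketch under assumptions: `CHECK`, `CHECK_POSITIVE`, `CHECK_INDEX`, `failed_line` and `reported_offset` are hypothetical names standing in for the ASSERT macros mentioned in this comment, and the check records the line number instead of aborting so that the result can be inspected.

```c
/* Line number recorded by the most recent failed check (0 if none). */
static int failed_line = 0;

/* Hypothetical stand-in for an ASSERT macro: when the condition is
   false it stores __LINE__ instead of aborting the program. */
#define CHECK(cond) \
    do { if (!(cond)) failed_line = __LINE__; } while (0)

/* Two more layers of #define stacked on top of CHECK. */
#define CHECK_POSITIVE(x)  CHECK((x) > 0)
#define CHECK_INDEX(i)     CHECK_POSITIVE((i) + 1)

/* Returns how many lines below `here` the failure was reported. */
int reported_offset(void)
{
    int here = __LINE__;   /* line N */
    CHECK_INDEX(-5);       /* line N + 1: where the macro is used */
    return failed_line - here;
}
```

`reported_offset()` returns 1: the failure is attributed to the line where `CHECK_INDEX` is used, and nothing in the report says which of the three definitions contained the failing test. That is the loss of information complained about here once macros are nested several levels deep.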

But Knuth is not interested in my opinions on CWEB anyway.

Why haven't I ``fixed'' CWEB to have its @<macros@> accept parameters? Even though it would make my life easier in this sense, I would introduce yet another incompatible tool for literate programming. People wanting to study my software would need to learn it, not standard CWEB, to be able to understand/modify it. They would also have to install my own version, not standard CWEB, to be able to test their changes. So I'm still sticking to standard CWEB.

I know there are many other alternatives to CWEB. I guess some of them fix this concern I have. Could anyone suggest one? I don't want to spend way too much time looking at different alternatives.

I think that rather than TeX, a literate programming system ought to use HTML (or perhaps XML? Umm) these days. The cweave program ought to spit a set of HTML files for every macro (or a CGI script in Perl or CGI application in C or Java servlet or HB code or PHP code or whatever). Each HTML file would have the following information:

  1. Comments in HTML.
  2. Code of the macro/section.
  3. Links to all the other sections that use this one.
  4. Links to all the sections used by this one.
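The four items above map onto a small data structure. A minimal sketch in C, assuming invented names throughout (`struct section`, `MAX_SECTIONS`, `build_used_by`); it is not part of cweave. Each section stores what it uses, and the "used by" links of item 3 are derived by reversing those references.

```c
#include <stddef.h>

#define MAX_SECTIONS 8   /* arbitrary limit chosen for this sketch */

/* One section of a literate program, holding what each page would show. */
struct section {
    const char *name;
    const char *comment;          /* item 1: the prose */
    const char *code;             /* item 2: the code of the section */
    int uses[MAX_SECTIONS];       /* item 4: uses[j] != 0 if it refers to section j */
};

/* Item 3: derive the reverse links. After the call, used_by[j][i] is 1
   when section i uses section j, and 0 otherwise. n must not exceed
   MAX_SECTIONS. */
void build_used_by(const struct section *s, size_t n,
                   int used_by[][MAX_SECTIONS])
{
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < n; i++)
            used_by[j][i] = (s[i].uses[j] != 0);
}
```

A page generator would then walk `uses` and `used_by` for a section and emit one link per nonzero entry. Storing only the forward references and deriving the reverse ones keeps the two lists from drifting apart.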

So why not take this one step further and design an entire IDE based on the browser? One would make it possible to modify the code and comments using HTML forms. The IDE would keep versioning information for both the code and the comments (who changed what when and why). I would even add a way to make assertions that can be incorporated as code at the beginning and end of the sections/macros to make programming by contract easier. And, of course, I would add useful views such as the whole list of sections, the sections last modified by a given programmer, a tree of sections and so on. The IDE would also require locking a section, to make it possible for any given number of persons to work on the code simultaneously. Finally, it should be able to export the final code in any number of compressed formats. Is anyone interested in codeveloping said system?

In my experience, literate programming does make programming more enjoyable.

One problem literate programming has is that rewriting the software means rewriting the documentation. This might not be that relevant in some contexts, but in others it certainly is. When you are talking about a program on which you'll be throwing big portions of your code away every now and then, rewriting the documentation adds a significant burden on the programmer. There are times when changing a single line of code requires you to change three paragraphs explaining what you are doing.

Basically, I recommend literate programming for software that you don't think will be changing much in the future. Examples of this are TeX and CWeb, which have not changed in a long time. Some persons argue that all the software ought to be written that way: so one can talk about whether a program is finished or not. That discussion is outside of the scope of this comment. In other cases, in software that is continually evolving, I don't recommend literate programming that much. But when you have clearly defined what your program will and won't do, in a way such that thinking about future versions (of the program itself, this does not include packages that include the program combined with others: you can think of a program as a module of a bigger application) makes no sense, I recommend the use of literate programming very much.



Comments are always good (none / 0) (#29)
by Ross Patterson on Sun Feb 11, 2001 at 11:08:31 PM EST

One problem literate programming has is that rewriting the software means rewriting the documentation. This might not be that relevant in some contexts, but in others it certainly is. When you are talking about a program on which you'll be throwing big portions of your code away every now and then, rewriting the documentation adds a significant burden on the programmer. There are times when changing a single line of code requires you to change three paragraphs explaining what you are doing.

Literate programming doesn't require you to write documentation, it allows you to write documentation that co-resides with the code and therefore stands some chance of being correct. You can still write uncommented programs "literately", using CWEBish facilities to organize the program chunks in a logical fashion (as opposed to the fashion required by the language) if you don't believe in comments.

That said, any program you're not going to immediately destroy needs some degree of commentary, and if you can do it in-line and in a literate fashion, you're better off doing so. Even more so if you work for a commercial programming operation (software vendor, in-house application team, etc.), because then the only thing standing between you and a badly updated program is the previous programmer's willingness to document their work and keep the documentation current.

[ Parent ]
Knuth&Levy - All program, little book (3.00 / 1) (#28)
by Ross Patterson on Sun Feb 11, 2001 at 11:00:43 PM EST

Don't bother trying to learn "literate programming" (as Knuth calls this programming model) from Knuth&Levy - it's an eleven-page book with 203 pages of printed CWEB etc. program. Now, admittedly, the program is written "literately", but most sections contain only a few lines of commentary and a lot more pretty-printed code. Knuth's "The TEX Book" was a much better example, but even then didn't do the topic real justice.
