Here is the traditional paradigm of computer use:
A person, called a programmer, prepares a file, called a
program, using a text editor.
A pre-existing program, called a compiler, converts the
programmer's new program into an executable file.
The user, often a different person from the programmer,
thinks of the executable as being the program he uses. The
user prepares a file, called data, using a text editor.
Then he uses the program that the programmer has provided
to process his data.
In the traditional paradigm the input files to computer
programs are prepared by humans using text editors. This
applies both to the data, which is processed by programs
written specially to process it, and to the source files
from which those programs have been compiled.
Scratch beneath the surface and you find middleware. The
compiler does not actually produce an executable; it writes
assembler code, which is converted to an executable by an
assembler. The word processing program does not know about
all the printers on the market. It writes out its data in a
page description language such as PostScript. The PostScript
interpreter generates the file of commands for the
printer. The browser reads HTML files. Some people type
HTML using a text editor, but others use a web authoring
package. In the latter case HTML is the output language of
the authoring package in addition to being the input
language of the browser. Assembler, PostScript, and HTML are
examples of middleware.
Once upon a time programmers actually wrote assembler code
themselves. The syntax of the assembler has various
elaborations intended to help the human preparing its input
by hand. It might have been better to design the syntax to
suit the human preparing the source files for the program
that generates the assembler code. The emphasis would have
been rather different. Working at one remove, writing a
program that generates files in a particular format, one
wants the format to be simple and consistent, and one cares
very little if it is tedious and repetitive to write,
because it is the computer that will be writing it.
The law of migration to middleware is that most programs
intended to process files prepared by humans using text
editors end up processing files written by other programs
earlier in the processing chain. This has four implications
for file syntax.
1) The files need to be readable by humans. When things go
wrong, someone has to look inside the intermediate files
to determine whether things went wrong earlier or later
in the chain.
2) There is little benefit from syntactic complications aimed
at letting humans abbreviate the input. One will only be
typing short test cases in manually.
3) There is a cost from syntactic complications. Since humans
only look at the files occasionally they cannot be
expected to remember the subtleties of the syntax. Any
cleverness means that the person trying to track down a
problem has to spend hours relearning a syntax he only
uses when things go wrong.
4) There is another cost from syntactic complications. The
program that reads the input is much more complicated, but
all for nothing. The program that writes the input
produces 'plain vanilla' files that make no use of the
features included for human convenience.
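The plain-vanilla point can be sketched in a few lines. The
file format and the emit_store helper below are invented for
illustration; the point is only that a generator has no use
for abbreviations:

```python
# A generator writing a hypothetical assembler-like format.
# It emits the same fully explicit three-token form every time:
# no shorthand, no defaults, no clever abbreviations.

def emit_store(lines, dest, value):
    # Always "store <dest> <value>", spelled out in full.
    lines.append(f"store {dest} {value}")

lines = []
for i in range(4):
    emit_store(lines, f"r{i}", i * 10)

print("\n".join(lines))
# store r0 0
# store r1 10
# store r2 20
# store r3 30
```

Tedious and repetitive for a human to type, but the computer
does not mind, and the program that reads it back stays simple.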
When the C programming language was designed, it was natural
to want a high level assembler, with a view to using it
directly as a programming language: a person sitting at a
keyboard, typing it in.
Starting with a blank sheet today, designing an alternative
high level assembler to replace C, the biggest change is
that one is designing middleware. I would look to Lisp for
inspiration. Programmers writing in Common Lisp today
usually write part of their code indirectly, as
macro-expansions that expand into the Lisp code that gets
compiled. Thus Common Lisp has partly migrated to being
middleware, its source written mechanically, that is, as the
output of other programs. This is an important point, but
one that is easy to overlook. The peculiar feature of
Common Lisp that makes one overlook it is that, having
learnt the base language, one does not
have to learn a macro-language layered on top of it. Lisp
program source is a textual representation of Lisp data
structures, which helps greatly when it comes to writing one
Lisp program to automatically write another less
sophisticated program. Consequently a Lisp source file looks
like it is just Lisp, even when an important part of it is
code that writes Lisp: code that might be thought of as
meta-Lisp, and that one might expect to look different in
the source file.
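This is Common Lisp's own territory, but the idea can be
sketched in Python: represent a program fragment as a nested
list, as Lisp does, and a "macro" becomes an ordinary
function that builds a new fragment. The unless_macro below
is a made-up illustration, not a real Common Lisp macro:

```python
# Code as data: a program fragment is just a nested list,
# so a macro is an ordinary function over lists.
# unless_macro is hypothetical, standing in for Lisp's unless.

def unless_macro(test, body):
    # Expand (unless test body) into the more primitive
    # (if (not test) body) -- code writing code by list surgery.
    return ["if", ["not", test], body]

expanded = unless_macro(["zerop", "n"], ["print", "n"])
print(expanded)
# ['if', ['not', ['zerop', 'n']], ['print', 'n']]
```

Because the expansion is a value of the same kind as the
source, the meta-level code never has to look different from
the base language.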
My radical suggestion is that New-C should have a Lisp-like
syntax, and have Lisp or Scheme as its macro-language. The
most important goal of New-C should be an active acceptance
of its place as middleware. Programs such as YACC and Lex,
which emit C code, are the inspiration here. They point the
way to higher programmer productivity, but along a path
little trodden. Writing C code that writes C code is
hard. It could be very much easier if New-C were designed to
make it so.
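What a YACC-style generator does can be sketched briefly.
The token table and the TOK_ naming convention below are
invented for illustration; real Lex output is far more
elaborate, but the shape is the same:

```python
# A sketch of the YACC/Lex pattern: a program that emits C source.
# The token table and TOK_ names are hypothetical.

tokens = [("PLUS", "+"), ("MINUS", "-"), ("STAR", "*")]

def emit_lexer_fragment(tokens):
    # The generated C is deliberately plain vanilla: one fully
    # explicit case per token, nothing abbreviated.
    out = ["switch (c) {"]
    for name, ch in tokens:
        out.append(f"case '{ch}': return TOK_{name};")
    out.append("}")
    return "\n".join(out)

print(emit_lexer_fragment(tokens))
# switch (c) {
# case '+': return TOK_PLUS;
# case '-': return TOK_MINUS;
# case '*': return TOK_STAR;
# }
```

The generator's job is easy precisely because the output
format asks nothing clever of it; a New-C designed as
middleware would keep it that way.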