Kuro5hin.org: technology and culture, from the trenches
What's wrong with C++ templates?

By jacob in Technology
Tue May 27, 2003 at 11:47:08 AM EST
Tags: Software (all tags)
Software

If you've read The Hitchhiker's Guide to the Galaxy and its sequels, you probably remember the Vogons, the incredibly ugly, disgusting, and bad-tempered aliens charged with destroying Earth to make way for a hyperspace bypass. The Vogons' brains, it turns out, were "originally a badly deformed, misplaced and dyspeptic liver" -- and that explains their demeanor. In this article, I'll explain why I think C++ has a badly deformed, misplaced and dyspeptic liver of its own: its template system.


Before I make my case, I want to make sure my position on templates is clear. For the record: if you're going to program in C++, templates are unquestionably useful, and I hope you won't mistake me for one of those people who say that templates aren't necessary and we should all be using inheritance instead or some gobbledygook like that -- I'm not. If you want to program in C++ there are lots of times when templates are absolutely the best option the language gives you for writing generic, reusable code.

On the other hand, just because templates are the best C++ has to offer doesn't mean they're good. It turns out that all the problems templates solve had already been solved better, before templates were ever introduced, by other languages. In fact, the kludgey way templates work cuts C++ programmers off from a lot of the good stuff that nicer solutions provide.

What are templates, and why should I care?

Templates are a widely-used C++ feature with simple behavior but complicated consequences. They let you mark a code fragment that might otherwise look like a function, method, or class as being, in fact, a template for such functions, methods, or classes: a pattern with holes in it, into which other people can later plug certain interesting compile-time constants, like type names or numbers. For example, the function


int myCleverFunction() {
    return 4;
}

is just a regular function, but


template <int N>
int myCleverFunction() {
    return N;
}

isn't a function at all, but a pattern for many different functions that the user can make just by supplying a concrete value for N.

Sound useless? It's not. There are a couple of very useful things people do with templates: one is writing code that is abstracted over types, and another is a clever trick called template metaprogramming in which programmers use templates to actually make decisions or perform calculations during compilation, sometimes with amazing effects.

In the rest of this article, we'll look at the various ways people use C++ templates and we'll see what features other languages provide to let programmers achieve the same effects. We'll look at basic and advanced topics, starting with the original and simplest use of templates: generics, also known as parametric polymorphism or just "writing the same code to work with multiple types of data."

Generics: write once, run anywhere

Most programmers who use templates use them to write data structures like lists and other containers. Templates are a natural match for lists (and container classes in general) because they not only let programmers write one List implementation rather than many different List kinds for all the different types of values that lists will need to store, but also let them write down statically-checkable rules like "this particular list must contain ints only."

For instance, in C++ you could write a simple linked-list as follows:


template <class T>
class ListNode {
public:
    ListNode(T it, ListNode* next) {
        this->it = it;
        this->next = next;
    }
    T getItem() { return it; }
    ListNode* nextNode() { return next; }
private:
    T it;
    ListNode* next;
};

When the compiler sees this code, it remembers the definition but emits no assembly instructions. Later, when it sees a use of the template instantiated with a particular type (say, int) that it hasn't seen before, it generates a fresh code fragment by replacing T with int everywhere in the body of the class definition and changing the class name to be unique, and then rewrites the usage to refer to the newly-generated code. So the code would allow you to write type-safe lists of any type:


// fine
ListNode<int>* il = new ListNode<int>(2,
                        new ListNode<int>(4,
                            NULL));
// also fine
ListNode<string>* sl = new ListNode<string>("hi",
                           new ListNode<string>("bye",
                               NULL));
// type error
ListNode<int>* il2 = new ListNode<int>(3,
                         new ListNode<string>("hi",
                             NULL));
// fine
int i = il->getItem();
// also fine
string s = sl->getItem();
// type error
string s2 = il->getItem();

This is a very handy trick, and one that you can't get any other way in C++ (even using void pointers or single-rooted class hierarchies, neither of which provide type-safety).

So handy, in fact, that it's hard to believe that nobody had thought of the idea before C++ templates were introduced in the mid-80's. You might know that C++ got the idea from Ada, but what you may not know is that the idea predates both -- in fact, the earliest versions of ML in the mid-seventies used a type-inference scheme that explicitly allowed functions to have polymorphic types. The notion had been around in research literature earlier than that, but ML was the first real programming language to have the feature.

ML's approach to writing a function that works on arbitrary types is very different from C++'s. In ML, polymorphism isn't a pre-type-checking phase that generates a new copy of the code for every different type of value the code gets used with; instead, it's a feature of ML's type-checker that allows it to make clever deductions about how functions behave. The type-checker infers types for functions from their source code: for instance, if the SML/NJ compiler sees the function definition

fun f(a,b) = a + 2*b

it is smart enough to realize that a and b must be ints and the result must be an int too (SML/NJ resolves the overloaded arithmetic operators to int by default) -- even though the programmer didn't have to type that in, the type-checker realizes it anyway. On the other hand, if it sees

fun g(x) = x

it will conclude that x can be anything and the return type will be whatever was input. This is a perfectly sensible type, called a polymorphic type, and the type-checker can reason about it just fine. For example, if somewhere else in the same program it sees the code fragment

fun h(a,b) = g(f(a,b))

it will know that h takes two numbers and returns a number.

ML's type-checker gives ML programmers every bit as much power to write type-independent programs as C++ templates give C++ programmers: for example, we could write the SML/NJ version of the linked-list template above like so:


datatype 'a List = ListNode of 'a * 'a List | Empty
exception EmptyList

fun getItem (ListNode (i,_)) = i
  | getItem (Empty) = raise EmptyList

fun nextNode (ListNode (_,rest)) = rest
  | nextNode (Empty) = raise EmptyList

and the same lists will typecheck:


- val il = ListNode(2, ListNode(4, Empty));
val il = ListNode (2,ListNode (4,Empty)) : int List

- val sl = ListNode("hi", ListNode("bye", Empty));
val sl = ListNode ("hi",ListNode ("bye",Empty)) : string List

- val il2 = ListNode(3, ListNode("hi",Empty));
stdIn:3.1-3.36 Error: operator and operand don't agree [literal]
  operator domain: int * int List
  operand:         int * string List
  in expression:
    ListNode (3,ListNode ("hi",Empty))

- val i = getItem(il);
val i = 2 : int

- val s = getItem(sl);
val s = "hi" : string

- val s2 : string = getItem(il);
stdIn:5.1-5.30 Error: pattern and expression in val dec don't agree [tycon mismatch]
  pattern:    string
  expression: int
  in declaration:
    s2 : string = getItem il

Aside from the syntactic differences and the fact that ML deduces types on its own, the C++ version and the SML/NJ version appear pretty similar. But the ML way offers a few tangible benefits: first, the ML compiler can check to ensure that a polymorphically-typed function has no type errors even if you never call it (C++ can't check your template for type errors until you instantiate it, and must check each instantiation separately), which is a big advantage for incremental development and for library authors. Furthermore, if ML says your polymorphic function is safe, then it's guaranteed to be safe no matter what types anybody uses with it: in C++, just because your template compiles with types A, B, and C doesn't say anything about whether it will compile if you instantiate it with type D later. This strategy also allows an ML compiler's code generator to make tradeoffs between the size of the code it generates and that code's efficiency -- C++ compilers get no such choice.

Interfaces: the virtue of (implementation) ignorance

As cool as ML's polymorphic functions are, you may have already realized that they have a pretty major drawback compared to templates: you can't rely on universally-quantified types to have any properties at all. That means that while you could write a function that took a list of any type and computed its length (because the length of a list doesn't depend on anything about the type of elements it holds), you couldn't write a function that sorted that list in any meaningful way without doing some extra work (because the proper sorting of a list depends on properties of the elements it holds).

You never have to worry about this problem when you write C++ templates. You just use any functions, methods, or operators you want, and when C++ fills in the template for you and recompiles the template body you'll automatically get the right thing (provided it exists, of course). For instance, you could add the following method to the C++ list example above with no problem:


template <class T>
class ListNode {
    // ... as before ...
    ostream& print(ostream& o) { return o << it; }
    // ...
};

What happens when you apply this thing to a type? Well, if the type you used has an appropriate << operator defined for it, exactly what you'd expect happens, and print works fine. On the other hand, if it doesn't, you'll get an explosion of template error messages that don't really indicate the source of the problem. The worst thing about this situation is that it can cause insidious lurking bugs: the code works fine for a year, then one day the new guy uses it in a way that's not quite expected and all of a sudden everything breaks for no obvious reason.

This is the sort of bug type systems were invented to catch, so it's not surprising that there are type systems that will catch it. The one you've most likely heard of is Java's interface system. That system lets a programmer declare a bundle of method signatures apart from any class and then use that bundle as a type, meaning that a method can accept objects of any class that announces it implements every method signature in the bundle.

This system works well, but unfortunately it requires that every class you want to use declare ahead of time that it implements a particular bundle of functionality. ML's functor system (no relation to the function objects C++ calls functors, which in ML terms are just first-class functions) deals with this problem nicely using the concept of a parameterized module.

What's that? It's like a normal C++ library, but with some details (like types of things, for instance) sucked out so they have to be specified later. That may not make much sense, but hopefully an example will clarify: to add a print feature to the ML version of the list introduced above, we could rewrite it as a functor in the following way:


signature PRINTABLE = sig
    type t
    val printT : t -> unit
end

functor List(T : PRINTABLE) =
struct
    datatype List = ListNode of T.t * List | Empty
    exception EmptyList
    (* ... other functions as before ... *)

    fun print (ListNode (i,_)) = T.printT i
end

Now we can make new lists more-or-less on the fly, even holding types that don't have a printT function explicitly declared for them, like so:


- structure L = List(struct
                       type t = int
                       val printT = print o Int.toString
                     end);
structure L :
  sig
    val getItem : List -> T.t
    val nextNode : List -> List
    val print : List -> unit
    exception EmptyList
    datatype List = Empty | ListNode of T.t * List
  end

- L.print (L.ListNode (5, L.ListNode(6, L.Empty)));
5
val it = () : unit

This system gives you more abstraction power than C++ templates or Java interfaces while providing type-safety.

Metaprogramming: the art of good timing

Another purpose for which particularly devious programmers can use C++ templates is "template metaprogramming," which means writing pieces of code that run while the main program gets compiled rather than when it runs. Here's an example of a program that computes the factorials of 4, 5, 6, and 7 (which are 24, 120, 720, and 5040) at compile-time:


#include <stdio.h>

template <int n>
class Fact {
public:
    static const int val = Fact<n-1>::val * n;
};

template <>
class Fact<0> { public: static const int val = 1; };

int main() {
    printf("fact 4 = %d\n", Fact<4>::val);
    printf("fact 5 = %d\n", Fact<5>::val);
    printf("fact 6 = %d\n", Fact<6>::val);
    printf("fact 7 = %d\n", Fact<7>::val);

    return 0;
}

If you look at the assembly code g++ or any other reasonable compiler produces for this code, you'll see that the compiler has inserted 24, 120, 720, and 5040 as immediate values in the arguments to printf, so there's absolutely no runtime cost to the computation. (I really encourage you to do this if you never have before: save the code as template.cc and compile with g++ -S template.cc. Now template.s is assembly code you can look over.) As the example suggests, it turns out that you can get the compiler to solve any problem a Turing machine can solve by means of template metaprogramming.

This technique might sound like some strange abuse of C++ that's primarily useful for code obfuscation, but it turns out to have some practical applications. For one thing, you can improve the speed of your programs by doing extra work in the compile phases, as the example shows. In addition to that, it turns out that you can actually use the same technique to provide convenient syntax for complicated operations while allowing them to achieve high performance (matrix-manipulation libraries, for instance, can be written using templates). If you're clever, you can even get effects like changing the order in which C++ evaluates expressions for particular chunks of code to produce closures or lazy evaluation.

Again, it turns out that this ability was old before templates were a glimmer in Bjarne Stroustrup's eye, in the form of Lisp macros. You may recoil at the use of that name, but don't worry: Lisp macros are much more pleasant to work with than their higher-profile cousins. At about the same time Kernighan and Ritchie were working on C and its preprocessor macros, a group at MIT's Project MAC was building MacLISP, which introduced a totally different implementation of the macro concept -- one that survives to this day in Common Lisp and Scheme as well as in a number of offshoots and related languages.

As they exist today, Lisp macros and C macros do similar things: they allow the programmer to substitute one fragment of code with another before the program gets run. The big difference between the two is that while C macros work by scanning for and replacing literal text phrases within source code, Lisp macros replace portions of a parse-tree instead. That might not sound revolutionary, but it turns out to be the difference between a system that gurus recommend you never use and one that goes a long way towards defining a language.

Lisp macros offer a kind of compile-time computation that goes one step above C++ template metaprogramming by allowing you to actually write your compile-time programs in Lisp itself. The same code you'd use to write a regular program, put in the proper place, runs at compile-time instead, and its result gets inserted into the source code of your program. For instance, here's how you could write a normal factorial function in Scheme (this is PLT Scheme version 203):


(define (fact n)
  (cond
    [(= n 0) 1]
    [else (* n (fact (- n 1)))]))

If you wanted to make a version of the same function that was guaranteed to run at compile-time, you could just write:


(define-syntax (ctfact stx)
  (define (fact n)
    (cond
      [(= n 0) 1]
      [else (* n (fact (- n 1)))]))
  (syntax-case stx ()
    [(_ n) (datum->syntax-object stx (fact (syntax-object->datum #'n)))]))

Aside from some mumbo-jumbo telling Scheme that this is a macro and how to read its arguments, it's exactly the same as the original Scheme function. That's true in general of Lisp macros: they're just regular functions that you tell the language to run at compile-time rather than runtime. While that may not sound all that important, it makes a huge practical difference: it allows your macros to use parts of your code that also run at runtime, to load libraries and make library calls at compile time -- in PLT Scheme, you could even write a macro that popped up a GUI dialog box asking the user how to compile a particular expression! -- and more, all with no extra effort. C++ templates can't use normal run-time C++ code in the process of expanding, and suffer for it: for instance, the C++ factorial program is limited to producing 32-bit integers rather than arbitrary-length bignums. In a Lisp system, that would be no problem: just load a bignum package and rewrite your macro to use it, and everything works out (and still all happens at compile-time). In C++, though, the bignum library is no use to us at all, and we'd have to implement a separate "compile-time bignum" library to make the fix.

Metaprogramming is good for more than computing little mathematical functions at compile-time, and Lisp macros have quite a few other uses too. In fact, they were made specifically for extending Lisp's syntax with new constructs. For instance, PLT Scheme has no equivalent of C++'s while loop, but you can add one in just a few lines of code:


(define-syntax (while stx)
  (syntax-case stx ()
    [(_ test body)
     #'(letrec ((loop (lambda () (if test (begin body (loop))))))
         (loop))]
    [else (raise-syntax-error 'while "Illegal syntax in while loop" stx)]))

Notice in that code fragment how obvious it is what's happening, even if you don't know Scheme: whenever Scheme sees (while <test> <body>) for any code fragments test and body, it should replace that bit with


(letrec ((loop (lambda () (if test (begin body (loop))))))
  (loop))

which is Scheme code that performs the loop properly. Otherwise, the user has used bad syntax, and the macro signals a syntax error.

Even with a simple example like while, you can begin to see how these macros are more powerful than C++ templates. Since they perform the same copy-and-paste function templates perform, they can clearly fill the same role, but they also have much more built-in support for making your metaprograms play well with your normal program. User-defined syntax errors, for example, would have been an easy way for the STL authors to produce helpful, meaningful error messages rather than the notoriously unhelpful ones C++ compilers print now.

In fact, whole large syntax systems can easily be built out of this mechanism, particularly when you remember that you can transform your syntax trees not just with pattern matching but with any arbitrary code you want. A good example of a big system built using macros is PLT Scheme's object-oriented programming system: it's a normal, unprivileged library that adds seamless object-oriented programming to PLT Scheme, which has no built-in object-oriented features. You get syntax forms, natural error messages, and everything else an in-language system would provide. In the Lisp world this is standard, and many large-scale Lisp and Scheme projects use macros -- a quick check of the standard libraries included with the PLT Scheme distribution shows 292 uses of define-syntax in about 200,000 lines of Scheme. What's more amazing, this count doesn't include the many macros PLT Scheme uses to define the core of the language itself: cond, define, let, and so on are all macros in PLT Scheme. It might surprise you to learn that the only syntax forms in the PLT Scheme examples in this article that are not macros expanding into some simpler form are the begin and if forms I used to implement the while loop above.

So what?

So some other languages invented some features before C++, and they implemented them in arguably better ways. So what?

For one thing, templates hurt the C++ compiler's ability to generate efficient code. It might surprise you to hear that, considering that C++ is "efficient" while functional languages like ML and Scheme are "inefficient," but it's true. C++ templates hide your intentions from the compiler: did you mean type abstraction? Did you mean a syntax extension? Did you mean constant-folding or compile-time code generation? The compiler doesn't know, so it has to just blindly apply the copy-and-paste strategy and then see what happens. In ML or Scheme, though, aside from the individual benefits described above, the simple fact that you're telling the compiler what you want to achieve lets it optimize for you much more effectively.

Another reason to care is that if you understand the context in which templates exist, you'll be able to make more effective use of them and you'll be able to make more intelligent decisions about when to use them.

But from a broader perspective, realizing that templates are really just a C++ version of Lisp macros geared towards generating type declarations rather than extending C++'s syntax helps you understand the wider history of programming languages rather than just knowing the flavor of the month (which is rapidly becoming the flavor of last month!).

Poll
What's the handiest feature of C++ templates?
o Generics 37%
o Interfaces 2%
o Metaprogramming 7%
o Other 2%
o I program in C++ but don't ever use templates 16%
o I don't ever program in C++ 33%

Votes: 95

Related Links
o SML/NJ
o MacLISP
o Common Lisp
o Scheme
o PLT Scheme
o Also by jacob


What's wrong with C++ templates? | 324 comments (253 topical, 71 editorial, 0 hidden)
What's cool about templates (3.00 / 1) (#5)
by hobbified on Mon May 26, 2003 at 10:35:29 PM EST

I can't find it right now, but there was a post fairly recently on the perl6-language mailing list all about what's so cool about C++ templates and what's not quite right about them. If I get some time I'll find it, but for now I'll give you a chance to display some initiative, as I'm ready to get to sleep. :)


Perhaps you mean (5.00 / 1) (#76)
by Three Pi Mesons on Tue May 27, 2003 at 09:05:46 AM EST

This post by Luke Palmer, in April?

Background: the Perl community is gradually building version 6 of the language, and discussing at great length what changes should be made. It seems that type-checking is going to get stronger, though this will be optional (if you don't use type declarations, it's just like Perl 5; but once you start saying my @array of Int the system will enforce that). There has been an enormous amount of discussion about whether or not this is a good idea, how exactly it should work, and how to implement it.

One of the interesting points about this discussion is that C++'s lack of explicit interfaces is regarded as a feature - since it's possible to instantiate your template with some unknown object. A class doesn't have to say that it implements IMyLovelyInterface, as long as it actually does provide methods with the right names and types. I'm not convinced that this really is a good idea; I rather like the ML way of inferring most ordinary types, while requiring declarations for higher-level constructions - the opposite of C++.

:: "Every problem in the world can be fixed with either flowers, or duct tape, or both." - illuzion
[ Parent ]

Yeah (none / 0) (#138)
by hobbified on Tue May 27, 2003 at 04:10:52 PM EST

that looks like the one.
I'm not so sure whether it's the hottest idea, but it's definitely a very perlish option to be able to have.

[ Parent ]
what's wrong with geeks? (1.04 / 45) (#13)
by BankofNigeria ATM on Mon May 26, 2003 at 11:22:33 PM EST

you elitist jerks, what are you doing masturbating to templates? there are people starving in this world.

1. S 2. V 3. PREP 4. V 5. N 6. PRO 7. N 8. PREP 9. V 10. V 11. V 12. PRO 13. PRO 14. V 15. N 16. V 17. PREP 18. ADV 19. N 20. ADV

Wow, 30+ 1.00 votes. That's talent. (nt) (5.00 / 1) (#170)
by coderlemming on Tue May 27, 2003 at 06:40:44 PM EST




--
Go be impersonally used as an organic semen collector!  (porkchop_d_clown)
[ Parent ]
hmm (1.25 / 4) (#14)
by tang gnat on Mon May 26, 2003 at 11:25:40 PM EST

Yes, I believe that's the approach Windows NT is taking.

basically (2.00 / 2) (#15)
by jacob on Mon May 26, 2003 at 11:30:59 PM EST

but they have some object-oriented in their solution, and some .NET also.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
RAM is cheap and all, but... (2.50 / 2) (#26)
by tang gnat on Tue May 27, 2003 at 01:05:10 AM EST

We can't afford the transaction overhead.

[ Parent ]
well, only if (3.66 / 3) (#72)
by jacob on Tue May 27, 2003 at 08:46:34 AM EST

we're using old-style DFA/PDA implementation strategies. Upgrading to TTL will largely mitigate the problems you're thinking of, as will the halting problem. The only real problem with the strategy is the close coupling with portlet-based solutions and e-mindshare.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
all i can say is (none / 0) (#209)
by tang gnat on Tue May 27, 2003 at 11:59:19 PM EST

Trying to build a team behind that technology would be a staffing nightmare.

[ Parent ]
Good article, bad thesis? and comments (4.75 / 4) (#22)
by sesh on Tue May 27, 2003 at 12:41:40 AM EST

Unfortunately my understanding of lisp and ML are so minimal that I cannot assess your comparison of the languages, but I do have a few questions/comments.

  • Do the mechanisms you mentioned extend to the OO paradigm (eg, class templates)?
  • Your examples all seem algorithm based - do they extend to concepts like policies and traits?
  • I can't see how you can avoid the mentioned optimisation problems when implementing templates if they are to remain completely generic. Also, do you have any benchmarking references indicating the benefits of ML generics over C++?
What is your point? Are you suggesting that C++ should use the same implementation mechanics and syntax as these other languages? Is it valid to compare major features of two languages with completely different architectures and purposes in such a small article?

I enjoyed the article, by the way, but I think your thesis didn't represent your article - the article had far too narrow a scope (or perhaps it was the lack of examples), and the proposed thesis was for a very broad subject (templates are a major feature in C++).

Also, I think your attitude towards C++ templates will unfortunately bring the rabid knee-jerk C++ critics out of the woodwork.

Answers (4.00 / 2) (#81)
by jacob on Tue May 27, 2003 at 09:34:26 AM EST

  • Yes, straightforwardly. Interfaces are well-known in OO languages, and macros don't interact with the OO paradigm at all. Polymorphic types are also fine, though type-inference becomes more difficult in the presence of subclasses. Still possible, though.
  • I'm not really qualified to talk intelligently about policies or traits, but from what I can see all you really need for those are higher-order functions or, if you think risking that the compiler will use a calculated jump is too expensive, macros.
  • No, no benchmarks (and of course benchmarks would be inconclusive, because there's a whole lot more going on in C++ and ML than just templates and typechecking, respectively) but neither has any associated runtime speed cost.
As for my point: take what you will away from it, but I'm hoping that the article helps you place C++ templates in a larger context of programming language features.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
Answers (none / 0) (#253)
by trixx on Wed May 28, 2003 at 12:31:43 PM EST

> Do the mechanisms you mentioned extend to the OO paradigm (eg, class templates)? Yes. Check Eiffel, an OO language with genericity. I just posted a long comment on it. > Is it valid to compare major features of two languages with completely different architectures and purposes in such a small article? That's why I commented about Eiffel, being a language quite closer in paradigm and purpose to C++.

[ Parent ]
Answers (none / 0) (#254)
by trixx on Wed May 28, 2003 at 12:32:27 PM EST

[sorry for the previous misformat]

> Do the mechanisms you mentioned extend to the OO paradigm (eg, class templates)?

Yes. Check Eiffel, an OO language with genericity. I just posted a long comment on it.

> Is it valid to compare major features of two languages with completely different architectures and purposes in such a small article?

That's why I commented about Eiffel, being a language quite closer in paradigm and purpose to C++.


[ Parent ]

What's wrong with bashing of C++ templates? (4.77 / 9) (#29)
by i on Tue May 27, 2003 at 01:57:34 AM EST

Let me concede one point right now: not everything. It is true that error reporting is abysmal. It is true that a mechanism akin to SML functors is badly needed. It is true that the syntax leaves much to be desired.

So what's wrong with it?

Let us recall what C++ templates are. They are a Turing complete macro system geared toward types. That is, they eat typed code fragments (types or typed expressions) and produce typed code fragments (declarations, types or expressions).

Now this is true that other languages have better implementations of similar features. Languages of ML family have polymorphism, which can be thought of as a macro system geared toward types. Lisp and relatives have their own highly sophisticated Turing complete macro systems. What's so special about C++ templates?

Precisely their being both Turing complete and geared toward types. This makes them unique. No other mainstream language has such a combination of features. (Some experimental languages provide dependent types, a feature more powerful and complete than C++ templates, ML-style polymorphism and probably even Lisp macros.)

And it is precisely this combination of features that makes C++ templates suitable for generating efficient, sophisticated, type safe code, such as found in libraries like Blitz++ or SIunits.

Finally I would like to address the question of efficiency. Quoth the author:

For one thing, templates hurt the C++ compiler's ability to generate efficient code. It might surprise you to hear that, considering that C++ is "efficient" while functional languages like ML and Scheme are "inefficient," but it's true. C++ templates hide your intentions from the compiler: did you mean type abstraction? Did you mean a syntax extension? Did you mean constant-folding or compile-time code generation? The compiler doesn't know, so it has to just blindly apply the copy-and-paste strategy and then see what happens. In ML or Scheme, though, aside from the individual benefits described above, the simple fact that you're telling the compiler what you want to achieve lets it optimize for you much more effectively.

Now this doesn't strike me as particularly convincing. Templates just generate perfectly ordinary C++ code. It is not entirely clear why such code should present fewer opportunities for optimisation than type-abstracted ML code. The only plausible reason is that typically there is more generated C++ code than generated ML code. Yes, the infamous template bloat. However, with a combination of a reasonable compilation strategy and some help from the programmer this, too, can be mitigated.

One day I just might write an article about all this.

and we have a contradiction according to our assumptions and the factor theorem

well said... (none / 0) (#258)
by han on Wed May 28, 2003 at 04:03:54 PM EST

Indeed, the way the template system works with, and extends the C++ type system is its strongest suit, which is not surprising since that's what it was designed for.  Nevertheless, the Turing completeness of C++ templates isn't something I would advertise, given the contortions you need to go through to actually take advantage of it.

But using the Turing-complete lisp macro systems isn't all roses either: While you can easily represent types as data objects at compile time, you still can't implement STL-like efficient generic algorithms and data structures as easily and efficiently as you can in C++, because lisp doesn't have a suitable compile-time type system to work with.  Even though most lisp compilers do type inferencing for efficiency, interfacing to the inferred types is compiler-specific and you aren't guaranteed to get the information you'd need.

So there's still plenty to improve on the generic programming front...

[ Parent ]

From a recent email (2.00 / 2) (#33)
by KWillets on Tue May 27, 2003 at 02:18:31 AM EST

i.e. a friend operator taking a template class object as a parameter. In the compilers from a couple of years back (at least Watcom's), this used to work; with recent gcc and MS Visual C++ versions it doesn't.

This type of problem seems to come up all the time. I worked on a heavy C++ project a few years ago, and we skipped templates because of compiler differences.

The explanation of the principles of compile-time code execution makes a lot of sense. Most explanations of templates spend a lot of time on how versatile they are, without explaining the more theoretical issues and limitations.

Templates are not quite Turing complete. (4.33 / 3) (#35)
by President Saddam on Tue May 27, 2003 at 02:43:12 AM EST

ANSI C++ only requires that compilers be able to nest templates to 17 levels (why 17? ask ANSI).

So your example of computing factorials is only guaranteed to work up to 17 factorial, and the same can be said of any similar recursive/iterative metaprogramming performed with templates.

<editorial>
I'll abstain; this article isn't quite good enough.

Arguing against templates on 'efficiency' grounds is meaningless. Do you mean speed or code size? Templates are fast, but they do cause bloat.

As other people have mentioned, their Achilles' heel is the fact that they increase code size considerably. Partial specialisation should help with that, but in practice I haven't seen huge gains. Also, template members are only instantiated if they are used. Often, it seems that using the STL gives your code an extra megabyte.

---
Allah Akbar

Standard limits (4.25 / 4) (#78)
by Three Pi Mesons on Tue May 27, 2003 at 09:24:32 AM EST

Most compilers, though, allow a greater nesting depth, possibly with an option (gcc has -ftemplate-depth-n). It's likely that the limit will be increased or removed in the next revision of the standard, as more and more people are getting interested in "exotic" uses of templates.

:: "Every problem in the world can be fixed with either flowers, or duct tape, or both." - illuzion
[ Parent ]
'efficiency' => code size (4.00 / 2) (#82)
by jacob on Tue May 27, 2003 at 09:47:27 AM EST

None of the techniques mentioned in this article impose any runtime speed penalty except in the presence of separate compilation, which C++ templates don't support anyway.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
separate compilation (3.50 / 4) (#97)
by codemonkey_uk on Tue May 27, 2003 at 12:36:07 PM EST

Oh Jacob, you keep confusing the state of play of C++ compilers last year with the C++ language as specified:
"... separate compilation, which C++ templates don't support anyway"
*cough* export *cough* EDG *cough* Dinkumware *cough*


---
Thad
"The most savage controversies are those about matters as to which there is no good evidence either way." - Bertrand Russell
[ Parent ]

How does this work? (3.00 / 1) (#288)
by frabcus on Thu May 29, 2003 at 09:13:50 PM EST

Can you explain how this works? Does it let us put template definitions in .cpp files, rather than having to have them in the headers?

[ Parent ]
export (4.00 / 1) (#289)
by hymie on Thu May 29, 2003 at 10:56:34 PM EST

Yes. In the header, you precede the template declaration with the keyword 'export'. You put the definition of the method templates and static data in a .cpp file, preceding those with 'export' as well. The compiler may require that you compile this file ahead of any files which actually use these methods.

[ Parent ]
Bloat equals speed penalty (5.00 / 1) (#226)
by statusbar on Wed May 28, 2003 at 03:24:15 AM EST

Bloat does cause a speed penalty because you now have way more code that needs to be copied into the CPU's cache. Bloat applies pressure to the cache and causes more cache misses, causing slower execution.

There have been some people researching duplicate code removal for this very purpose. After linking your executable, the system would compute an MD5 checksum or something like that over the machine code of each function in the executable. Functions with matching checksums are compared, duplicates are discarded, and pointers to them are reassigned to the surviving copy. No compiler does this yet.

--jeff++

[ Parent ]

17! > 2^32 (none / 0) (#298)
by blp on Sun Jun 01, 2003 at 02:28:05 PM EST

The example only works up to 12 factorial on most (all?) machines/compilers; 13! already overflows a 32-bit integer.

I can no longer sit back and allow: Communist Infiltration, Communist Indoctrination, Communist Subversion and the International Communist Conspiracy to sap and inpurify all of our precious bodily fluids.
[ Parent ]

-1 language wars (4.66 / 6) (#43)
by hugues on Tue May 27, 2003 at 03:05:17 AM EST

ML and Lisp are great languages as you demonstrate, there are others too: haskell, ocaml, etc.

I think that most practitioners of C++ are well aware of its hackish, complex and somewhat ugly nature. To me, however, it looks like an oboe: very complex, temperamental and hard to play, but beautiful in its complexity and unique.

Eventually most practitioners of C++ get to appreciate its good sides: the huge support across the industry, the fine, standard libraries (STL anyone?), the sheer speed. Your hand-wavy arguments about optimization are less than convincing. Low-level optimization does at least 90% of the work. There is not much that high-level optimization can do that a rethink of the algorithm can't. Let's see some real-life benchmarks, shall we?

Language advocacy is fine but there is nothing in your piece that can't be found anywhere else, and it's not short or to the point.

ahhhh STL? (none / 0) (#189)
by QuantumG on Tue May 27, 2003 at 08:31:25 PM EST

How can you refer to STL as a good library?

Gun fire is the sound of freedom.
[ Parent ]
By never having had to use it :) (nt) (none / 0) (#229)
by Simon Kinahan on Wed May 28, 2003 at 03:51:01 AM EST



Simon

If you disagree, post, don't moderate
[ Parent ]
You've Argued Your Point Well (4.27 / 11) (#50)
by OldCoder on Tue May 27, 2003 at 04:28:22 AM EST

C++ is a research language that has gone mainstream. If C++ had been designed with the knowledge that was available in, say 1980, when it was first dribbling out of AT&T Bell Labs, it would be a better language than it is.

A basic historical flaw in C++ is that it was designed based on a flawed concept of compatibility with C. The good idea of compatibility was that C programmers could readily adapt to it and use it as "A better C". The mistake was deciding that C++ should compile and execute all the standard C programs (like all the code in K&R), as if C compilers would no longer be available after the switch.

C++ has too many features that were added to solve "Local problems" that were basically caused by the language. Templates are one example, invented to support container development. Then, too much cleverness was applied and the feature over-used to do things it was never intended to do.

Likewise, the hack of using << and >> as I/O operators was extremely clever but ran into the problem of needing friend functions that aren't in the inheritance tree. Come to think of it, friend functions are another hack to solve problems caused by the C++ language.

Another clever innovation that never quite worked out right is Smart Pointers. To get it right, the committee has decided to alter the next release of the language.

C++ has proven to be a decent place for some programming language experimentation and innovation but shows its research roots. You can tell from the design of C# or Java, for example, that more time, effort and thought went into the language design with regard to the community of programmers who would use the language. C++ was designed by a very small group that didn't have a large support staff to work out bugs. The small staff couldn't consider programming-language alternatives from the perspective of knowing a great many other languages well.

The same critique of solving a language problem rather than a customer problem applies to the "curiously recurring template pattern", where a base class has its derived class as a template parameter. The base class is only instantiated when defining the derived class. This is an extremely clever trick that solves problems. Unfortunately, the problems it solves are the problems created by the C++ language, and not the end-user problem the programmer is supposed to be working on.

Even the much praised "Traits" feature is an off the cuff too-clever invention that was created to solve problems with the C++ programming language rather than the problems of the end-user.

The book C++ Gems is a cornucopia of solutions to programming-language problems pretending to be programming solutions. The authors are very proud of their inventions.

Using cleverness to trick a compiler or a programming language produces code that is hard to understand and maintain. This all reminds me of early FORTRAN: users of the FORTRAN EQUIVALENCE feature applied a lot of cleverness to produce brilliant kludges to solve problems that shouldn't even exist. Smart people wrote mathematical papers on theory a compiler could use to implement EQUIVALENCE efficiently. The whole topic is justifiably gone from programming.

Having said all that, I must be a little fair and vent about C# and Java: the choice of programming language shouldn't also fix your choice of whether to use garbage collection, C++ destructors, COM release semantics or whatever. Languages should be flexible enough in this day and age to permit some choice. Only C++ even tries.

The premature release of OOP upon the innocent world of programming was sold as the way to achieve "Code Reuse" and "Productivity". It is only recently that the problems introduced by the new programming paradigm have begun to be solved in convincing ways. Take a look at the C# keywords override and new as applied to class declarations for a nice solution to a problem that vexed C++ for at least a decade (foreign base-class upgrades that define methods synonymous with existing methods in domestic derived classes).

After all these years, there is still no silver bullet.

--
By reading this signature, you have agreed.
Copyright © 2003 OldCoder

Smart pointer change is library, not language (none / 0) (#101)
by jongleur on Tue May 27, 2003 at 01:20:54 PM EST

Boost's shared_ptr<> has been preliminarily added. No language changes were needed for smart pointers (though there probably will be some).
--
"If you can't imagine a better way let silence bury you" - Midnight Oil
[ Parent ]
My Excessively Verbose Opinion (none / 0) (#323)
by Dragon Lord on Tue Jul 01, 2003 at 08:34:25 PM EST

I pretty much agree with most of what you are saying, but being descended from a long-winded father I'm going to go ahead and throw in my own verbiage as well.

I would say that C++ has done more than provide a "decent place for some programming language experimentation and innovation".  There are LOTS of research languages, most of which never make it out of academia.  For whatever reason C++ has become very widespread, and been used for a great deal of real software.  The designers of C++ must have done SOMETHING right in order for the language to have survived and spread as widely as it has.

I have no real idea how much time and effort were invested in the design of C# and Java vs. C++, but I would say that the situation is much more complicated than just saying that C++ wasn't as well thought out as newer languages.

Java and C# were not designed with the same goals in mind as C++, certainly they have chosen different tradeoffs, and you could argue that the entire lifetime of C++ was part of the design of Java and C#.

Java had the experience of C++ and other languages to work from, and "fixes" some C++ problems, but introduces its own (some of which are not in C++, or exist directly because of the "fixes" to C++).  Likewise C# had Java and C++ to learn from, and "fixes" problems of both, but I'm sure we will find that it too has made bad decisions, and has problems of its own.

We also have to consider that the Target Audience, "the community of programmers who would use the language", has changed.  When C++ was created the programmer community was much more strictly divided.  There were the "Real coders" who used ASM & C, and there were the people who "Played with BASIC" (which was slightly more distasteful than playing with yourself :-))

Now we have a different breed of programmer who want to be more productive with less training, who don't want to deal with complexity, and (in many cases at least) who don't really want to have to "think" in order to program, or at least don't want a language that forces them to think.

Considering that the community of programmers who would use C++ was the C programmers, I think C++ was at least as good a fit as Java and C# are for the current community.  Many of the "complaints" about C++ from the current programming community stem precisely from the fact that it WAS an excellent fit for its target community, which is simply NOT the current community.

You also have to take into consideration that not only do languages evolve, but also what we want to do with them, and how we want to do it evolves as well.

We face a different market situation where we need to do vastly more complicated things, with more developers who have had less experience and all in less time.

We face a different hardware landscape where everything is bigger, processors are faster, memory is less limited and hard drives are vast, and where everything gets bigger, faster and cheaper every month.  Where investing in hardware is now often cheaper than investing in programmers, and where efficiency matters less and less.

In short, we have a completely different software landscape than when C++ was conceived, and while C++ may not be a perfect fit for much of the current software development landscape, neither are many of the currently "superior" languages a good fit for the landscape C++ was designed for (ever tried to run Java or C#.NET on an 80286 machine with 640K RAM?  Granted, current C++ compilers probably wouldn't run there either, but that is implementation- more than language-dependent, and in any case it is possible to "scale back" C++ so that it will operate in those environments, something which is more difficult with newer languages requiring hefty runtime environments.)

In any case, comparing an older language to newer languages which have had much more cumulative experience (including that gained by using the older language) to build from AND were designed with different goals in mind is not only unfair, it is in many ways ludicrous.  The reality is that there is no perfect language.  Every language makes tradeoffs, every one has good things and bad things, and every one is better for solving some problems, and worse at solving others.

There is no silver bullet, partly because there are so many targets and the targets keep changing.  You don't use a BB to shoot an elephant, or an elephant gun to kill a fly, and to think that there should be one "silver bullet" language is IMHO a misconception of the entire computing problem domain.

Admittedly, C++ has flaws, but so does every language.  Other languages may "fit" better with the way we think about programming now, but the way we think about programming has changed (newer languages are more interested in "pure" object orientation, whereas C++ was more interested in interoperability with C, low-level manipulation and raw speed).

And let us not forget that C++ was a much simpler beast when it started out.  It is always dangerous (in my opinion) to compare old languages which have been forced to mutate and evolve to new languages which have not.  Current languages such as Java and C# have not had to evolve as much as C++, and I doubt they will adapt as well to future changes as vast as those that have occurred since C++ was designed.  If they do survive, I highly doubt that they will be much "prettier" than C++ is now.  You can already see a certain amount of this in Java, which is much younger than C++ (by language standards).

Despite the massive change in the software landscape and the "inferiority" of C++ compared to other languages, you still find instances where companies are using C++, and in some cases rewriting large portions of codebases from other languages (usually Java) in C++ because they have "hit a wall" with their language of choice (often over acceptable speed).

No matter what Java people say (and I'm not really anti-any-language; I believe you should choose the best tool for the job from the tools you have on hand), Java isn't a good fit for end-user, UI-intensive desktop programming.  I HAVE desktop programs which use Java, and while they work and have many nice features, they are noticeably slow, sometimes painfully so.

Developers using C# and .NET are still shaking out the wrinkles, and haven't had a chance yet to encounter all the flaws, although several people using .NET have already expressed serious doubt in the viability of the current generation, at least, for producing shrink-wrap software.  In fact, I would go so far as to say C#'s primary claim to design superiority is that it is built on things which have worked in other languages, rather than any particularly revolutionary design advances.

The fact that C++ has adapted as well as it has to the current software landscape, even if much of that adaptation isn't very pretty, is, to my mind, a testament to the quality with which it was designed.  In retrospect, with the hindsight of several decades, we can point to things which should have been done differently, but I have to assume that, like all of us, the designers of C++ did the best they could with what they had, made a good guess at the time, and took their best shot.

It reminds me of the "Ape Language" question.  Who is smarter, the ape who managed to learn "only" 100 human words/signs, or the researcher who managed to learn NO ape words/signs in the same amount of time.  Which is ultimately the superior language, the one which has adapted to a drastically altered landscape, or the one which was designed only for that landscape.  That depends on your definition of "superior", and the landscape in which you work, of course.

And, although I don't have specific details, I have the impression that C++ has evolved to fit divergent needs better than languages which pre-dated it, such as machine code, ASM, COBOL & FORTRAN.  At the same time, as you pointed out, a major part of C++'s problem has been the attempt to adapt it to do a little bit of everything, which makes it, by definition, a "jack of all trades, ideal for none" type of language.

Perhaps C++ is no longer a good fit for the current software landscape, and will soon be relegated to the same position as ASM (a specialty language used for specific tasks, and to create low level modules which are only used by other languages), but let's give credit where credit is due.

C++ has evolved tremendously and is still capable of solving real-world problems, even with all its wrinkles.  It has had a dramatic impact on newer languages, including contributing experience to the design of those languages, and with newer languages being more and more domain-specific, with wider areas of inappropriateness, it is still very difficult to find a language which is as generically applicable to as wide a range of variation, both in terms of style and project type, as C++.

So while it may not hold up when compared to a domain-specific language in that language's domain, it is still more widely applicable than many newer languages.  In the end, I always come back to the same conclusion: use the best tool from what you have available.

[ Parent ]

Taking the Vogon analogy further (4.75 / 4) (#54)
by arvindn on Tue May 27, 2003 at 04:39:12 AM EST

It turns out that all the problems templates solve were already solved better, before templates were ever introduced, by other languages.

Curiously, this has a close parallel in the Hitchhiker's guide as well. It turns out that all the problems that the destruction of the earth solved were already solved better, before the Vogons ever reached the earth, using other technologies. Indeed,

"...a wonderful new form of spaceship drive was at this moment being unveiled at a government research base on Damogran which would henceforth make all hyperspatial express routes unnecessary." (Chapter 5).

So you think your vocabulary's good?

Bah. C++ itself (2.55 / 9) (#61)
by porkchop_d_clown on Tue May 27, 2003 at 08:01:24 AM EST

Is a bastard language, gluing OO onto what was intended to be a highly efficient, low-level language.

If you want OO, use something that was designed for OO from the ground up. If you want efficiency, use C. If you want to suffer, use C++.


--
I only read Usenet for the articles.


Okay (5.00 / 1) (#63)
by codemonkey_uk on Tue May 27, 2003 at 08:08:37 AM EST

So what if you want efficiency and OO?
---
Thad
"The most savage controversies are those about matters as to which there is no good evidence either way." - Bertrand Russell
[ Parent ]
Use Visual Basic (nt) (2.66 / 3) (#65)
by reklaw on Tue May 27, 2003 at 08:19:51 AM EST


-
[ Parent ]
Real Soon Now (tm) n/t (2.00 / 4) (#66)
by SanSeveroPrince on Tue May 27, 2003 at 08:19:59 AM EST



----

Life is a tragedy to those who feel, and a comedy to those who think


[ Parent ]
ever looked at ocaml? (4.00 / 4) (#75)
by jacob on Tue May 27, 2003 at 09:01:42 AM EST

OCaml. Functional/object-oriented, high-tech, very efficient (beat C++ in Doug Bagley's Computer Language Shootout), geared towards solving practical problems.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
C# (nt) (1.50 / 2) (#124)
by Run4YourLives on Tue May 27, 2003 at 02:23:19 PM EST



It's slightly Japanese, but without all of that fanatical devotion to the workplace. - CheeseburgerBrown
[ Parent ]
C# (none / 0) (#217)
by ucblockhead on Wed May 28, 2003 at 01:16:27 AM EST

C# is nowhere near as efficient as C++.
-----------------------
This is k5. We're all tools - duxup
[ Parent ]
well that depends (none / 0) (#223)
by Run4YourLives on Wed May 28, 2003 at 02:29:08 AM EST

on the programmer... not the C# one, the C++ one.

It's slightly Japanese, but without all of that fanatical devotion to the workplace. - CheeseburgerBrown
[ Parent ]
yes and no (none / 0) (#244)
by ucblockhead on Wed May 28, 2003 at 11:13:00 AM EST

A poor programmer can screw up the efficiency in both cases. Though you are correct in that it is certainly easier to do so in C++.
-----------------------
This is k5. We're all tools - duxup
[ Parent ]
Lol. The more abstract the data structures (4.00 / 1) (#137)
by porkchop_d_clown on Tue May 27, 2003 at 03:58:40 PM EST

the more work the compiler and the CPU have to do. It's a basic trade off. Nothing is as efficient as hand-coded assembler, but nothing is harder to write in. The "higher level" the language, the easier it is to code complex concepts and structures, but the harder the CPU has to work to make the thing dance.

Heh. Right now I spend part of my day writing in FORTH and part of my day writing in C#. I've got a very keen awareness of the difference between computational efficiency and developer efficiency right now...


--
I only read Usenet for the articles.


[ Parent ]
Have you tried Eiffel [nt] (none / 0) (#252)
by trixx on Wed May 28, 2003 at 12:28:20 PM EST



[ Parent ]
Objective-C (3.00 / 1) (#74)
by Random Number Generator Troll on Tue May 27, 2003 at 09:01:38 AM EST

I have a friend at work who knows C++ very well, and is interested in learning ObjC. I spent about 3 days trying to learn C++ but I got distracted and never got very far; however, I know ObjC quite well now. But this friend keeps asking whether ObjC deals with templates. Before I read this article I didn't have a clue because I didn't know what templates were, but now I think I can hazard a guess...

I guess that because I can just pass in, and return id to a function (id is a pointer to an object of unspecified type), and then just say [incomingObject someAction], there is no need for templates? If not, how does ObjC go about solving the templates problem?

compilation-time type checking (5.00 / 2) (#80)
by Chep on Tue May 27, 2003 at 09:26:56 AM EST

I'm no ObjC wizard, but IIRC, [incomingObject someAction] allows you to pass an incomingObject which lacks said someAction method (in which case it will throw an exception), and still produce an executable. Assuming you have:

template<typename T> void foo(T& incoming_object) {
    incoming_object.someAction();
}

then calling foo(bar) will *compile* only if said bar object is of a type which includes a (const or non-const), (volatile or not volatile) method called someAction with no non-default parameters.

The id you mention is little more than a void pointer (actually, it's a pointer to the common single root class, from which all ObjC classes derive implicitly and mandatorily).

The closest equivalent in C++ to mimic ObjC behaviour (or the belief I have about its behaviour) would be using the following construct:

void foo(CommonClass& incomingObject) {
    dynamic_cast<HasSomeActionClass&>(incomingObject).someAction();
}

(this assumes that all possible incomingObject are of classes derived from CommonClass (which you have to define), and that all objects which have a method called someAction() are derived of class HasSomeActionClass (which you made deriving from CommonClass as well)).


--

Our Constitution ... is called a democracy because power is in the hands not of a minority but of the greatest number.
Thucydide II, 37


[ Parent ]

No no no... (3.00 / 1) (#87)
by Random Number Generator Troll on Tue May 27, 2003 at 10:42:03 AM EST

I think you misunderstood...

What I meant, was that because you can pass in (id) to any function and return (id), it kinda means that every Obj-C function that uses (id) is basically a template. However, because Obj-C is truly dynamic in terms of type, you don't get any overhead from creating a function for each possible type you pass in when you come to compile.

Is this true?

[ Parent ]
My bad (none / 0) (#88)
by Random Number Generator Troll on Tue May 27, 2003 at 10:50:25 AM EST

you don't get any overhead from creating a function for each possible type you pass in

Oops, of course you get overhead because you still need a function for each type. But the main point is correct?

[ Parent ]
nope. (5.00 / 2) (#93)
by Chep on Tue May 27, 2003 at 11:18:18 AM EST

AFAIK, ObjC does not do on-the-fly compilation of instance bodies when it finds out one needs to be instantiated with another type. From what I remember, what the ObjC runtime really does is similar to a dynamic_cast<T&> before each method invocation, to check that said type is indeed able to support this or that message.

ObjC's "id" mechanism is really Polymorphism (we have lots of classes which relate to a common ancestor, about which we know something everywhere -- but this happens *at runtime*), not Genericity (we have lots of classes which might have common traits, which we may or may not be able to process the same, all we know about these classes is that we're able to process them if they have the traits we need during processing -- *at compile time*)

Though I don't see what would forbid it, I don't think any "widespread" ObjC implementation out there attempts to provide implementations of each method using every possible type (which would actually be /worse/ than C++'s template bloat). I don't see what prevents the compiler from using profile-based or source-flow-analysis-based methods to know which method bodies to provide a static-typed implementation for, but the base mechanism is really dynamic type checking at run time (as opposed to C++ templates' static type checking at compile time).

--

Our Constitution ... is called a democracy because power is in the hands not of a minority but of the greatest number.
Thucydide II, 37


[ Parent ]

Compilers for Java and Self ... (5.00 / 1) (#103)
by Simon Kinahan on Tue May 27, 2003 at 01:31:44 PM EST

... actually do this. They use partial-evaluation based techniques at run-time to find the most commonly executed specialisations of methods and optimise for those cases.

Regarding polymorphism: Genericity is actually a kind of polymorphism, generally called parametric polymorphism. The "normal" kind of polymorphism in OO languages is interface polymorphism. You can combine the two: Some languages with parametric polymorphism allow the type parameters to be constrained to have a particular interface. This is called bounded polymorphism.

Simon

If you disagree, post, don't moderate
[ Parent ]

Yep, but the're interpreted/JITed (none / 0) (#116)
by Chep on Tue May 27, 2003 at 02:13:23 PM EST

Unlike the usual ObjC case, which is usually compiled (with a heavy runtime linked in).

Now, on bounded polymorphism, you can do that in C++ as well, using more template magic <grin>.

--

Our Constitution ... is called a democracy because power is in the hands not of a minority but of the greatest number.
Thucydide II, 37


[ Parent ]

Yep (none / 0) (#133)
by Simon Kinahan on Tue May 27, 2003 at 03:28:16 PM EST

Unfortunately it's very hard to do the necessary analysis for partial evaluation on languages that allow pointer aliasing, which includes most assemblers, however much processor architects might wish it didn't.

On bounded polymorphism, how would you do it in C++ ? You can certainly use a static assertion to constrain the type parameter, but unfortunately that doesn't show up in the type of the template instance (although I guess you could make it), and you can't leave the type parameter unbound and pass values of the interface type back and forth, as most generic Java variants allow.

Simon

If you disagree, post, don't moderate
[ Parent ]

Bounded polymorphism & C++ (none / 0) (#220)
by Chep on Wed May 28, 2003 at 01:25:19 AM EST

Oh, that comment was a joke. That's the usual C++ answer: "yes, we can do that, but you need 2kloc of unreadable template magic to achieve it". I am positive I saw something along these lines in an issue of CUJ, circa 10-18 months ago, but it probably has more restrictions and caveats than what you're thinking of.

I haven't yet read Alexandrescu's MC++D, and I must.

--

Our Constitution ... is called a democracy because power is in the hands not of a minority but of the greatest number.
Thucydide II, 37


[ Parent ]

Objective C is more dynamic than that. (5.00 / 1) (#285)
by thoran on Thu May 29, 2003 at 04:58:47 PM EST

I don't see what prevents the compiler from using profile-based or source flow analysis-based methods to know which method bodies to provide a static-typed implementation for
[...]


The semantics of the language forbid such optimisations. Even if the compiler is perfectly sure that the argument of a function is of type Foo, it cannot generate a direct call to a Foo method, because it is possible to override some methods of Foo at runtime (the posing mechanism).

[ Parent ]
Tell your friend "no" (5.00 / 2) (#173)
by epepke on Tue May 27, 2003 at 06:55:40 PM EST

Seriously, if he's asking about whether Objective C does templates or not, it means that he is so entrenched in the C++ mindset that he is used to using C++ templates to solve problems.

Saying that the kind of problems that templates are supposed to solve in C++ don't normally arise in Objective C due to selector dynamism, etc., or whatever isn't going to convince him, and you might as well save your breath.


The truth may be out there, but lies are inside your head.--Terry Pratchett


[ Parent ]
How to deal with it... (5.00 / 1) (#214)
by joto on Wed May 28, 2003 at 12:39:59 AM EST

The simple answer to that question would be: "No, Objective C does not support templates". Then before he has time to respond, say "but Objective C++ does", then walk away before he has any more time to ask questions (especially since you know that whatever he asks next, you will have no clue about...)

[ Parent ]
C++ Articles (3.66 / 3) (#90)
by CaptainSuperBoy on Tue May 27, 2003 at 11:06:49 AM EST

Every successive "What's wrong with C++" article makes me a little more confident in my decision never to learn C++. Luckily I have never needed to use the language in the past, and all my code nowadays is pretty much in a high level language. I'm a firm believer that unless there is an honest need to do it low level, high level and/or RAD is probably better.

--
jimmysquid.com - I take pictures.
I agree (none / 0) (#99)
by Kalani on Tue May 27, 2003 at 01:12:25 PM EST

I'm a firm believer that unless there is an honest need to do it low level, high level and/or RAD is probably better.

Of course, you can't really make that judgement unless you have a pretty thorough understanding of what low level programming involves. I don't think that it's too much to ask for people to understand computational concepts at least down to the level of digital logic on circuit boards.

It certainly is a waste of time to rebuild critical tools from the ground up just because they're incidentally related to your project and your development software didn't happen to include them (by the way, I once had to write my own implementation of regular expression based text parsing for Delphi because it wasn't included). So I guess I'm saying that I agree with you as long as you learn about the low level details by some other means than through C++.

-----
"I have often made the hypothesis that ultimately physics will not require a mathematical statement; in the end the machinery will be revealed
[ Parent ]
Sure (none / 0) (#102)
by CaptainSuperBoy on Tue May 27, 2003 at 01:29:10 PM EST

Yes, one should be aware of how the low-level implementations work. I am humble enough to admit that someone else has already done it better than I would given my project's timeframe.

--
jimmysquid.com - I take pictures.
[ Parent ]
I am boggled... (none / 0) (#100)
by Control Group on Tue May 27, 2003 at 01:17:29 PM EST

...by your implication that C++ isn't a high level language. What, pray tell, is a high level language, in your mind?

***
"Oh, nothing. It just looks like a simple Kung-Fu Swedish Rastafarian Helldemon."
[ Parent ]
perhaps he is talking of...... (none / 0) (#104)
by modmans2ndcoming on Tue May 27, 2003 at 01:36:03 PM EST

logo? :-)

[ Parent ]
Programming elitism (2.00 / 1) (#107)
by CaptainSuperBoy on Tue May 27, 2003 at 01:42:30 PM EST

I fire people who think like that. Language elitism gets you nowhere. If logo were the right tool for the job, I'd use it.

--
jimmysquid.com - I take pictures.
[ Parent ]
On this one, (none / 0) (#117)
by Control Group on Tue May 27, 2003 at 02:14:25 PM EST

I'm with you completely. Use what works for you...if you can get it done with a DOS batch file, more power to you.

***
"Oh, nothing. It just looks like a simple Kung-Fu Swedish Rastafarian Helldemon."
[ Parent ]
oh my god...... (none / 0) (#201)
by modmans2ndcoming on Tue May 27, 2003 at 10:46:06 PM EST

it was a joke!!!! I mean c'mon... give me one thing Logo is good for other than putting young children in the programmer's mindset?

The reason I said what I said was because Logo is very easy to program... it is abstracted to uselessness.


[ Parent ]

High-level (none / 0) (#106)
by CaptainSuperBoy on Tue May 27, 2003 at 01:41:11 PM EST

I guess where you put C++ in the high-low range depends on your background. I'm strictly a software guy, a hardware guy would probably put C++ on the high end of the spectrum. I would call Java, C#, VB or perl high level languages. C or C++ would only be useful to me if I had to write compiled code that worked very close to the level of the hardware, which is why I'd classify them as low level languages. If you're saying assembly-level languages are the only low-level languages, I have no use for them with the type of programming I do.

--
jimmysquid.com - I take pictures.
[ Parent ]
See my reply to Work, above (none / 0) (#113)
by Control Group on Tue May 27, 2003 at 02:05:06 PM EST

But a bit more to address this particularly: I might be willing to grant that Java & C# are, to some extent, higher-level than C++, since they don't actually compile to assembly. VB, though, is exactly the same distance from the metal as C++. The only difference is in the coding environment (I've only ever seen VB written/compiled in robust, GUI IDEs) - for one thing, it allows you to draw forms, buttons, and whatnot, and attaches the necessary code without you typing it. Which is certainly neat, and is the reason I use VB from time to time.

However, that's just a function of the environment, not of the language. You could do exactly the same thing with C++, given the proper IDE (in fact, isn't this what VC++ does? I've never used it, so I'm not certain, but I always sort of assumed...).

WRT Java/C#, as I said, I'm tempted to let you have that. I'm up in the air on that one, since Java, in my (limited) experience, is pretty much C++ sans pointers and multiple inheritance... and I'm unconvinced that just removing features from a language can make it "higher level." I can't speak to C#, since I've never seen even a line of code in it.

Out of curiosity, how high level would you put Pascal/Delphi?

***
"Oh, nothing. It just looks like a simple Kung-Fu Swedish Rastafarian Helldemon."
[ Parent ]

Pascal/Delphi (none / 0) (#120)
by CaptainSuperBoy on Tue May 27, 2003 at 02:15:29 PM EST

What I am seeing is that we all have completely different definitions of high/low level. The more I think about it, I define it by the language's reliance on an API. I would hesitate to call Visual C++ anything, since you have big libraries like MFC but you could also write code that hits the hardware (almost) directly. The reason I say VB is higher level than C++ is because it wraps Windows messages with its own event model that is much easier to use. It also hides the ugly details of COM from the user, most of the time.

I haven't used Delphi or Pascal since freshman year in college, so I don't know... Due to the size of their API and the way the languages are most likely going to be used I would say they are high level as well, although I understand Delphi is very flexible.

--
jimmysquid.com - I take pictures.
[ Parent ]

Define it by what the language doesn't let you do? (none / 0) (#162)
by curien on Tue May 27, 2003 at 05:56:04 PM EST

C++ is very flexible, much more so than any other language I've even a passing familiarity with. It can be used to write close-to-the-machine system code, and it can be used in just as abstract a way as C# or Java.

Oh, and the comment about Java not having pointers is way off. Java "references" are just pointers by another name. They're not quite as "dumb" as vanilla C pointers, but they're close. It seems to make Java programmers feel safer to pretend that they're something different, though.

Anyway, back to my point. A lot of the problems with C++ come when you use it as a systems programming language or use the associated features (which are often messy). But if the programmer can exercise a little cunning and a bit of restraint, C++ can be used at as high a level as you care to imagine.

If C++ isn't "high level" because it refuses to force you into it, then so be it. I, however, prefer to define the height of a language's level by what it allows you to do, rather than what it prevents you from doing.

--
All doctors do is support weak genes. Might as well be communists. -- sigwinch
[ Parent ]

Pointer aliasing (none / 0) (#230)
by Simon Kinahan on Wed May 28, 2003 at 04:02:36 AM EST

Having Java-style references rather than C-style pointers is actually a very important difference, one Java shares with most other languages. It not only simplifies programming, but it enables compile-time and run-time optimisations that are impossible for C, and it actually makes processor architecture much easier too.

The important point is that in Java (as in most other languages) arbitrary arithmetic is never performed on pointers. At the very most, they are used in simple pointer+offset indexing operations, which are supported directly by most instruction sets. Since pointers are always to the start of objects, and indexes are constrained to be within the same object, checking pointers for equivalence is very straightforward. Mention pointer aliasing to a compiler writer some day. But leave a few hours for the ensuing rant.
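A tiny invented example of what aliasing costs the optimizer: because the two parameters below may point at the same int, the compiler must re-read *a after the store through *b instead of keeping it cached in a register.

```cpp
// If 'a' and 'b' may alias, the store through *b can change *a,
// so *a has to be reloaded afterwards rather than held in a register.
int store_then_sum(int* a, int* b) {
    *b = 5;          // may write through an alias of *a
    return *a + *a;  // *a must be reloaded: it might now be 5
}
```

With distinct pointers the store leaves *a alone; with aliased ones it does not, and a C compiler has to assume the worst at every such store unless it can prove otherwise.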

Simon

If you disagree, post, don't moderate
[ Parent ]

Aliasing (none / 0) (#235)
by curien on Wed May 28, 2003 at 07:24:46 AM EST

That's what the restrict keyword is for in C, so technically, Java doesn't have any such advantage over "C-style" pointers. C++ doesn't have that keyword, but in C++, you shouldn't be using vanilla pointers unless you have to; use smart pointers instead. This is why I said that C++ is as high level as the programmer is willing to make it. If you do dumb things like use regular pointers where you should use smart ones, you deserve what you get.

--
All doctors do is support weak genes. Might as well be communists. -- sigwinch
[ Parent ]
I must admit ... (none / 0) (#261)
by Simon Kinahan on Wed May 28, 2003 at 04:33:46 PM EST

... I'd forgotten about "restrict", but I think it only solves part of the problem. The main focus seems to be on making loop-unrolling optimisations for fine-grained parallelism possible. You might be able to use it to do some other compile-time stuff too, but since the compiler can't actually check the restriction, widespread use would be pretty error-prone.

Languages that don't allow arbitrary pointers have an advantage over C (even with restrict) in that there is a rapid run-time check for whether two pointers refer to the same thing. With pointer arithmetic, that check is much harder (NP-something IIRC - not something you want to do at runtime). This enables dynamic optimisation a la Self and HotSpot, and also makes it much easier to write a decent garbage collector. Even processor architecture is simplified if pointer arithmetic is removed from the instruction set (although admittedly this is hard to guarantee at that level).

Pointer arithmetic is one of those "powerful features" that are worth giving up because of what you get in exchange. C++ smart pointers don't help, by the way, because neither the compiler nor the runtime knows the constraints you've placed on your "smart" pointer type.

Simon

If you disagree, post, don't moderate
[ Parent ]

Ah... I see now (none / 0) (#265)
by curien on Wed May 28, 2003 at 05:18:20 PM EST

restrict is a complete solution, as far as C goes. It doesn't actually change anything about the pointer (it can still alias just like normal pointers), but it is a signal to the compiler to generate optimized code that assumes the pointer isn't aliased. In traditional C fashion, it's then left up to the programmer to actually enforce this requirement. This isn't usually as hard as it seems, but when not taken into account, it can cause horrible run-time failures.
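A minimal sketch of the idiom. Note the hedge: restrict is C99 only; the __restrict spelling used below is a widespread but non-standard extension that C++ compilers such as gcc, clang and MSVC accept.

```cpp
// '__restrict' (non-standard in C++; 'restrict' in C99) promises that,
// within this function, dst and src never refer to the same memory,
// so the compiler may optimize the loop without overlap checks.
// The promise is NOT verified: calling scale(p, p, n) is undefined.
void scale(float* __restrict dst, const float* __restrict src, int n) {
    for (int i = 0; i < n; ++i)
        dst[i] = 2.0f * src[i];
}
```

As the parent says, the compiler generates code that assumes no aliasing, and it is entirely the programmer's job to keep the promise true.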

Yes, I see what you mean about smart pointers not helping with the aliasing problem.

--
All doctors do is support weak genes. Might as well be communists. -- sigwinch
[ Parent ]

I call it middle level. (none / 0) (#109)
by Work on Tue May 27, 2003 at 01:50:33 PM EST

It has low level functionality (pointers), but high level concepts (classes).

A strictly high level language is something like LISP or Java. Neither of those has much in the way of direct hardware interface.
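For what it's worth, the two levels sit side by side in ordinary C++. A contrived sketch (Pixel and clear are invented for illustration): Pixel is the high-level side, data plus behaviour in a class; clear is the low-level side, the same objects viewed as raw bytes and walked with pointer arithmetic.

```cpp
// High-level: a class bundling data with behaviour.
struct Pixel {
    unsigned char r, g, b;
    unsigned char brightness() const {
        return static_cast<unsigned char>((r + g + b) / 3);
    }
};

// Low-level: reinterpret the same objects as raw bytes and zero them
// with pointer arithmetic, struct layout and all.
void clear(Pixel* first, Pixel* last) {
    for (unsigned char* p = reinterpret_cast<unsigned char*>(first);
         p != reinterpret_cast<unsigned char*>(last); ++p)
        *p = 0;
}
```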

[ Parent ]

I suppose I could be wrong (none / 0) (#110)
by Control Group on Tue May 27, 2003 at 01:54:48 PM EST

But, to me, "high level language" means a language which is not inherently hardware-dependent. Theoretically, you can compile the same C++ on widely disparate hardware - it doesn't depend on machine-specific instructions.

Obviously, this is something of an idealistic view of the language which doesn't actually hold up in real life, but the definition is still valid.

***
"Oh, nothing. It just looks like a simple Kung-Fu Swedish Rastafarian Helldemon."
[ Parent ]

i suppose... (none / 0) (#112)
by Work on Tue May 27, 2003 at 02:02:45 PM EST

but then you're reducing it to a binary choice between 'low' and 'high'. There's too much variety in languages today for that. What about the in-line assembler supported by many C compilers? Clearly a low level functionality. You won't see anything like that in Lisp or Java.

While you can take C (and C++) and compile across multiple platforms, you can still directly access the hardware via pointers. To me, this makes them somewhere in between the low level of assembler and the high level of Java or Lisp.

This pointer functionality also ties many programs to specific hardware architectures. The way to access a piece of hardware directly (a la drivers) is through reserved memory addresses. By writing data to those specific memory addresses, you're tying the code directly to the hardware architecture it's built for. Thus a driver written for an Intel architecture won't work on SPARC. And of course, most drivers are written in C.
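A sketch of what such direct access looks like. Everything here is hypothetical for illustration: the register, the function, and especially the 0x10000000 address, which on a real board would come from that board's memory map - and that hard-coded constant is exactly what ties a driver to one machine.

```cpp
#include <cstdint>

// A memory-mapped register write. The driver writes through a
// volatile pointer so the compiler neither caches nor elides the
// store; on hardware, the store itself is the I/O operation.
void uart_putc(volatile std::uint8_t* tx_reg, char c) {
    *tx_reg = static_cast<std::uint8_t>(c);
}

// On an actual board the address is baked in, e.g. (hypothetical):
//   volatile std::uint8_t* const UART_TX =
//       reinterpret_cast<std::uint8_t*>(0x10000000);
```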

[ Parent ]

Granted, but (none / 0) (#115)
by Control Group on Tue May 27, 2003 at 02:12:28 PM EST

You don't have to use inline assembly in C++ for it to be C++. The implication of your logic is that if you somehow inlined assembly into Java, you would make the language lower level. I don't think I can buy into that - how does adding a feature make the language closer to the metal?

Pointer functionality is a much tougher one for me to address - I can't begin to imagine programming in C++ effectively without using pointers. But I think I could get away with programming without tying those pointers to the hardware, necessarily...I'll have to think about that. Still, using drivers as your example is a bit iffy, to me. Is it even theoretically possible to write machine-independent drivers? If not, and if the possibility of being machine-dependent is what separates high- and higher-level languages, then the clear implication is that you can't write drivers in a sufficiently HLL...which means that, by definition, higher level languages are less functional than lower.

This result may be true, I suppose, but it's certainly a result I don't intuitively care for.

***
"Oh, nothing. It just looks like a simple Kung-Fu Swedish Rastafarian Helldemon."
[ Parent ]

drivers. (none / 0) (#123)
by Work on Tue May 27, 2003 at 02:21:31 PM EST

It depends on your definition of a 'driver'. There are, for example, JDBC Java 'drivers' written entirely in Java. But those merely connect a Java program to database software: a software -> software connection.

For hardware drivers though, I don't see how you could make one really machine independent. At some point you have to specify the physical address by which to access the hardware directly. You might be able to do some kind of layering, or design it to make it EASIER to cross-platform develop drivers, but you still have to have physical addresses compiled or read in somewhere. And then you start sacrificing efficiency with all those layers.

'Less functional' as opposed to 'more functional' is irrelevant. In fact, we should focus less on making languages jacks-of-all-trades that do nothing WELL, and more on making it easier for languages that specialize in certain functions to work well with each other.

Never send a hammer to do a screwdriver's job.

[ Parent ]

I fear change (none / 0) (#128)
by Control Group on Tue May 27, 2003 at 02:33:48 PM EST

Perhaps I'm becoming stodgy in my old age (he said from his lofty height of 25 years), but there's something about your special-purpose language proposal that strikes me the wrong way. It could just be fear of change - after all, I had to learn C++, all those young whippersnappers ought to, too!

On the other hand, it would certainly be a fascinating step in the evolution of programming languages. You start out with a hardware-specific paradigm; each machine has its own method of programming it. You move to more abstraction, and languages are conceptually machine-independent. The next step is to use languages that are both conceptually and functionally cross-platform. Then we'd move into a region where languages became task-specific. Interesting.

There's a niggling doubt in my head regarding Turing-completeness, but I haven't thought it all the way through, so it might or might not be relevant.

***
"Oh, nothing. It just looks like a simple Kung-Fu Swedish Rastafarian Helldemon."
[ Parent ]

going low-level in lisp (none / 0) (#259)
by han on Wed May 28, 2003 at 04:26:54 PM EST

As a matter of fact, you will see in-line assembler in just about any natively compiling lisp.  It isn't standardized, but it is invariably there.  In practice, interfacing to the high-level lisp object model will require studying the lisp system internals -- any assembler experience you may have won't help much there.

However, the fact that in-line assembler is reasonably usable in lisp is not directly for the benefit of users, but because it makes it easier for the compiler implementor to write special-case low-level optimizations.  Usually users are expected to achieve the last bits of speed optimization by calling "foreign" C or assembler functions, or more interestingly, by constructing specialized lisp code sequences and passing them through the compiler at run-time.


[ Parent ]

arch-specific drivers? (none / 0) (#268)
by cwitty on Wed May 28, 2003 at 06:15:52 PM EST

Linux uses the same driver source code for Intel and SPARC (and many other architectures).

Very few drivers have memory addresses hard-coded.  For example, the address for accessing a PCI card may depend on the card, the motherboard, the BIOS, which slot the card is in, how many of that kind of card are present, etc.

[ Parent ]

Hm. (none / 0) (#114)
by jacob on Tue May 27, 2003 at 02:05:53 PM EST

It seems to me more of a question of what the underlying programming model of the language in question is. Assembly is low-level because the model you use for computations is very machine-like: that of a processor with big array of memory on the side and some registers. (There's nothing inherently hardware-specific about this. SPIM, for example, is a hardware-independent MIPS-architecture assembly interpreter, so programs written in MIPS assembly language are to some extent cross-platform compatible in a perverse way.) C has a somewhat more abstracted view, but still it's basically that you've got a bunch of statements and a big bag of memory cells and a stack somewhere.

Scheme, on the other hand, mostly discourages that point of view, encouraging you to think of programs as expressions that get rewritten by a series of abstract rules into a value. Memory, processors, stacks and the like aren't supposed to be what you're thinking about as you write your programs. That's what I think of as a high-level language.

It's tempting to say C++ falls in the middle, but really it doesn't. It falls squarely on the C side of the divide, encouraging you to keep exactly the same model of computation that works for C, so in my mind it's pretty clearly a low-level language (albeit with a bunch of concepts you find more often in high-level languages).

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]

ive used SPIM (5.00 / 1) (#118)
by Work on Tue May 27, 2003 at 02:14:40 PM EST

quite handy. When you think about it, C structures are similar to SPIM's pseudoinstructions, in that it's fairly easy to translate them directly into the real hardware assembler instructions.

[ Parent ]
Fair enough. (none / 0) (#122)
by Control Group on Tue May 27, 2003 at 02:20:10 PM EST

This, to me, is an entirely valid point of view - and not one I had considered before. I'd always thought about it in terms of what you could do with the language, rather than the sorts of things the language leads you into doing. I didn't put that very well, but I hope you take my meaning.

If nothing else, this little discussion has demonstrated that the weight of popular opinion is certainly against me.

***
"Oh, nothing. It just looks like a simple Kung-Fu Swedish Rastafarian Helldemon."
[ Parent ]

What about Java bytecode? (none / 0) (#280)
by ajf on Thu May 29, 2003 at 11:16:31 AM EST

But, to me, "high level language" means a language which is not inherently hardware-dependent.
This fits your definition, but I don't think it's fair to call it "high-level":
Method java.lang.String testStringBufferChained()
0 new #2 <Class java.lang.StringBuffer>
3 dup
4 invokespecial #3 <Method java.lang.StringBuffer()>
7 astore_1
8 aload_1
9 ldc #4 <String "this ">
11 invokevirtual #5 <Method java.lang.StringBuffer append(java.lang.String)>
14 aload_0
15 ldc #6 <String "is ">
17 invokespecial #7 <Method java.lang.String makeString(java.lang.String)>
20 invokevirtual #5 <Method java.lang.StringBuffer append(java.lang.String)>
23 ldc #8 <String "a ">
25 invokevirtual #5 <Method java.lang.StringBuffer append(java.lang.String)>
28 aload_0
29 ldc #9 <String "test">
31 invokespecial #7 <Method java.lang.String makeString(java.lang.String)>
34 invokevirtual #5 <Method java.lang.StringBuffer append(java.lang.String)>
37 pop
38 aload_1
39 invokevirtual #10 <Method java.lang.String toString()>
42 areturn
(Original code here, in case anyone reading this is interested in string concatenation in Java.)

"I have no idea if it is true or not, but given what you read on the Web, it seems to be a valid concern." -jjayson
[ Parent ]
You Didn't Ask Me, But (none / 0) (#111)
by Lagged2Death on Tue May 27, 2003 at 01:58:21 PM EST

Whereas it's fairly clear that Assembler is low-level and Lisp is high-level, C and C++ are uniquely difficult to fit into a high-level / low-level spectrum. They can be either or both, and consequently can be applied to a vast array of real-world problems, although it's not always pretty.

Starfish automatically creates colorful abstract art for your PC desktop!
[ Parent ]
For lack of a better parent (none / 0) (#125)
by Control Group on Tue May 27, 2003 at 02:26:15 PM EST

I'll attach this to the original post.

I've officially changed my stance on this issue (and it only took like 30 minutes...this might be some sort of record for me). Based on the response from a variety of people (most notably jacob and CaptainSuperBoy), I've expanded my definition of high/low level.

I still wouldn't refer to C++ as a low level language, but I can certainly now see it as either middle level, or a hybrid (as compared to languages such as Scheme, Java, or C#). The remaining difference, in my mind, is that C++ doesn't force you to take the hardware into account, so it isn't low-level - but it certainly leads you down that path more than a higher-level language does.

So much for making an incisive point on K5 today. ;)

***
"Oh, nothing. It just looks like a simple Kung-Fu Swedish Rastafarian Helldemon."
[ Parent ]

That's a shame (none / 0) (#190)
by sesh on Tue May 27, 2003 at 08:32:18 PM EST

If you aren't willing to put a lot of time into learning a language (and this is completely valid) then C++ (and certainly templates) are not for you.

I don't know anyone who has used C++ for a significant amount of time who has any of the problems described in the "what's wrong with C++" posts here, however. It's a powerful language, and templates are an incredibly powerful tool. Of course, it is really easy to shoot yourself in the foot if you don't know what you are doing.

While Scheme, Lisp and other functional languages may implement many of the features offered by templates, they are completely different languages, and you are kidding yourself if you believe they can be comprehensively compared as 'better' or 'worse'.

It's a shame that you would decide never to learn a language because a bunch of people on a blog don't like it.

[ Parent ]

Haskell classes (5.00 / 2) (#96)
by carlossch on Tue May 27, 2003 at 12:11:02 PM EST

I liked the article, but I think you could have mentioned Haskell classes, especially when you talk about the problems with universally quantified types and needing to know properties of those types. The Haskell type system allows just that. You define a class:

class Eq a where
  equal :: a -> a -> Bool

and then define instances of that class for each type in which you want to let the system know about the properties. For example:

instance Eq Int where
  equal 0 0 = True
  equal 1 1 = True
  -- ... and so on for every equal pair ...
  equal _ _ = False

(Obviously the real function does not work by enumerating all equal pairs.) Having defined a class, a function can use the operations this class provides by annotating the type accordingly:

sort :: Eq a => [a] -> [a]
(sort uses 'equal' at some point)

By the way, in all but some hairy cases, this annotation can be inferred by the type-checker.

There are many other nifty things you can do with Haskell's type system, but its original purpose is to solve exactly the sort of problem you describe up there.

Nice article, nevertheless.

Carlos
He took a duck in the face at two hundred and fifty miles an hour.

Haskell types (5.00 / 1) (#166)
by Three Pi Mesons on Tue May 27, 2003 at 06:22:47 PM EST

There are many other nifty things you can do with Haskell's type system
Ain't that the truth. The Haskell type system and its extensions are tremendously useful for playing with types. On the subject of templates, Simon Peyton-Jones presented an interesting "template" proposal for Haskell last year - it's very much like the macro system in Lisp, but with the difference that Template Haskell macros are first-class objects free from side effects, just like Haskell functions. This makes some optimisations significantly easier, and some algorithms easier to express.

Probably the highlight is a way to allow printf-style format strings to be type-checked. The system will actually emit a compile-time error if you write $(printf "%s") 17 (the $() signals macro evaluation). It's all done with regular Haskell parsing code, just at compile time rather than run time.

The need to support printf lies behind some of the more horrible parts of C and C++; variadic functions in particular are a real pain, very messy. Avoiding them is a huge advantage of C++'s I/O syntax, whatever you may think of the "abuse" of the shift operator. It's a big win if Haskell can do printf-style formatting while still retaining type-safety.
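For comparison, later C++ gets type-checked formatting without a format string at all: a rough sketch using C++11 variadic templates (which postdate this discussion; the format helper is invented for illustration).

```cpp
#include <sstream>
#include <string>

// Base case: nothing left to format.
std::string format() { return ""; }

// Each argument is rendered through operator<<, so its type is checked
// at compile time; there is no format string to fall out of sync.
template <typename T, typename... Rest>
std::string format(const T& first, const Rest&... rest) {
    std::ostringstream out;
    out << first << format(rest...);
    return out.str();
}
```

Passing a type with no operator<< is a compile-time error, where printf("%s", 17) only misbehaves at run time.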

:: "Every problem in the world can be fixed with either flowers, or duct tape, or both." - illuzion
[ Parent ]

How much for a link? [nt] (none / 0) (#171)
by i on Tue May 27, 2003 at 06:41:59 PM EST



and we have a contradicton according to our assumptions and the factor theorem

[ Parent ]
How's this? (none / 0) (#176)
by Three Pi Mesons on Tue May 27, 2003 at 07:00:47 PM EST

Paper by SPJ and Tim Sheard (forgot about him, whoops).
http://research.microsoft.com/Users/simonpj/papers/meta-haskell/

:: "Every problem in the world can be fixed with either flowers, or duct tape, or both." - illuzion
[ Parent ]
Links (none / 0) (#177)
by carlossch on Tue May 27, 2003 at 07:02:23 PM EST

The paper

Section from GHC, the Haskell compiler used to implement Template Haskell.

Google Template Haskell, you'll find plenty of stuff.

Carlos
He took a duck in the face at two hundred and fifty miles an hour.
[ Parent ]

Confused by ghc reference (none / 0) (#181)
by Three Pi Mesons on Tue May 27, 2003 at 07:32:23 PM EST

That's not in my manual (ghc 5.04.3) though the page says 5.04 at the top. Is it perhaps something for the next release?

:: "Every problem in the world can be fixed with either flowers, or duct tape, or both." - illuzion
[ Parent ]
I really don't know (none / 0) (#208)
by carlossch on Tue May 27, 2003 at 11:45:10 PM EST

I found that on Google, and figured it'd be part of the regular compiler documentation (as it says so). But I can't find it in my local copy of the ghc docs either (also 5.04.3). The 'prev' and 'next' links and the inconsistent section number seem to indicate that it is indeed an unpublished version of the docs.

Carlos
He took a duck in the face at two hundred and fifty miles an hour.
[ Parent ]

OK, mystery solved (5.00 / 1) (#242)
by Three Pi Mesons on Wed May 28, 2003 at 10:40:08 AM EST

It's for ghc 6.0: the manual draft has that in section 7.5 - with version number updated. The release may be fairly soon, according to the development mailing list.

:: "Every problem in the world can be fixed with either flowers, or duct tape, or both." - illuzion
[ Parent ]
GHC 6.0 was just released (5.00 / 1) (#290)
by carlossch on Fri May 30, 2003 at 12:12:32 PM EST

Just thought you'd like to know :)

Carlos
He took a duck in the face at two hundred and fifty miles an hour.
[ Parent ]

Thanks! (none / 0) (#302)
by Three Pi Mesons on Mon Jun 02, 2003 at 11:27:01 AM EST

*grabs it*

:: "Every problem in the world can be fixed with either flowers, or duct tape, or both." - illuzion
[ Parent ]
You don't need TH for type-safe printf (none / 0) (#174)
by carlossch on Tue May 27, 2003 at 06:58:36 PM EST

Olivier Danvy wrote a paper entitled Functional Unparsing, which uses continuation-passing style to construct a type-safe printf look-alike in Haskell. Very cool.

Carlos
He took a duck in the face at two hundred and fifty miles an hour.
[ Parent ]

Isn't that something different? (none / 0) (#182)
by Three Pi Mesons on Tue May 27, 2003 at 07:33:54 PM EST

Danvy describes a printf-a-like whose format specifier is a sequence of functions. This produces a printf with the required type, which may be checked. The point about the Haskell example is that it operates on a format string, carrying out parsing operations in order to derive the types, which is where the template business comes in.

:: "Every problem in the world can be fixed with either flowers, or duct tape, or both." - illuzion
[ Parent ]
true (none / 0) (#204)
by carlossch on Tue May 27, 2003 at 11:23:27 PM EST

The need to support printf lies behind some of the more horrible parts of C and C++; variadic functions in particular are a real pain, very messy. This is a huge advantage of C++'s I/O syntax, whatever you may think of the "abuse" of the shift operator. It's a big win if Haskell can do printf-style formatting while still retaining type-safety.

I simply wanted to point out that it was possible without resorting to TH, but after re-reading your sentence, I realize I misunderstood it as saying that there was no way to design something like printf in a type-safe manner in Haskell. My bad.

But I like the unparsing solution better, maybe because it does not rely on a compile-time step, which makes it much more flexible. I suppose Template Haskell will be great for stuff like type-safe XML parsing and transformation, automatic parser generation, etc.

Carlos
He took a duck in the face at two hundred and fifty miles an hour.
[ Parent ]

Oh, it is rather nice (none / 0) (#243)
by Three Pi Mesons on Wed May 28, 2003 at 10:41:31 AM EST

I hadn't come across the unparsing thing before; thank you for pointing it out. The Template way seems to have a lot more "mechanism" behind it; I certainly think it would be possible to do some fairly mind-bending things. I'm not too familiar with it - I'll have a play around once the new ghc comes out. Unparsing, on the other hand, looks to depend on nothing more than the already-existing Hindley-Milner type system, so it's more "pure", and looks a lot easier to understand.

:: "Every problem in the world can be fixed with either flowers, or duct tape, or both." - illuzion
[ Parent ]
Hard to read (4.66 / 6) (#105)
by egg troll on Tue May 27, 2003 at 01:37:54 PM EST

Is it just me, or is anything written with the code tag really hard to read? Rusty, why oh why did you elect to make said tag use a five-point font?!

He's a bondage fan, a gastronome, a sensualist
Unparalleled for sinister lasciviousness.

I second this (none / 0) (#126)
by thekubrix on Tue May 27, 2003 at 02:29:27 PM EST

I have good vision (20/20) and have never had a problem with my sight, but reading that font makes my eyes strain terribly.

Maybe we can have an option to select how we view the code tags?

oh god.........my eyes are bleeding

[ Parent ]

Browser dependent (none / 0) (#134)
by s8n on Tue May 27, 2003 at 03:32:11 PM EST

I have a feeling this is to do with your browser. In a good browser it renders perfectly fine or can be changed easily in others (look at the monospace font size). In worse browsers though it looks like you can't do much about it :(. Maybe rusty should investigate a better CSS setting to use for tt and code :).

[ Parent ]
Template bloat (4.50 / 2) (#119)
by lauraw on Tue May 27, 2003 at 02:15:22 PM EST

[I posted this as an editorial comment last night, but I decided to re-post most of it now that the article has been voted up....]

I'm a fan of templates when they're used appropriately, but I don't think this article places enough emphasis on one of the major problems of C++ templates: "bloat" from template code generation. This can be a huge problem in large, complex programs with lots of instantiations.

This is a hot-button for me because I worked at Taligent, the company that was trying to build a new, object-oriented operating system in C++. It could be argued that templates were the main reason that Taligent didn't succeed. (It could also be argued that nobody really wanted an overdesigned, object-oriented operating system. :-) After the Taligent system had templates inflicted on it, well over 1/2 of the object code in the system was the result of template instantiations, and it caused a huge performance degradation and ridiculous resource requirements. We ended up inventing all sorts of ugly hacks to get rid of template code. Yuck.

The main reason for the bloat was that the C++ template system was basically designed as a glorified macro preprocessor. It would "paste" in the type-specific code for a particular instantiation when it was needed. A good contrast is the new "generics" that will be in Java 1.5. They designed it in such a way that new code isn't generated for each instantiation, which is a big improvement. Of course, this is much easier to do in Java because of all the run-time type information that the VM has available. And the downside is that you can't use generics on primitive types.

Still, C++ templates can be very useful in some circumstances. One of the techniques I used a lot was to create a templatized wrapper around a non-typesafe implementation class, which minimizes bloat and still gives you type safety. But my C++ is now rusty; I've been doing Java for the last few years.
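A minimal sketch of that wrapper idiom (all names here are hypothetical, not Taligent's actual classes): one void*-based implementation class holds all the real code, and the template layer is nothing but inline casts, so each new instantiation adds essentially no object code while still giving callers a type-safe interface.

```cpp
#include <cassert>
#include <vector>

// Non-typesafe core: only one copy of this code exists, no matter
// how many element types the wrapper below is instantiated with.
class PtrStackImpl {
public:
    void push(void* p) { items_.push_back(p); }
    void* pop() { void* p = items_.back(); items_.pop_back(); return p; }
    bool empty() const { return items_.empty(); }
private:
    std::vector<void*> items_;
};

// Thin typesafe wrapper: every member is a trivial inline cast, so
// instantiating it for a new T costs almost nothing in code size.
template <typename T>
class Stack {
public:
    void push(T* p) { impl_.push(p); }
    T* pop() { return static_cast<T*>(impl_.pop()); }
    bool empty() const { return impl_.empty(); }
private:
    PtrStackImpl impl_;
};
```

Callers get the type checking (`Stack<int>` won't accept a `double*`), but the linker only ever sees one stack implementation.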

-- Laura

Depends on the implementation (5.00 / 1) (#127)
by enkidu on Tue May 27, 2003 at 02:33:48 PM EST

Well, that depends on the implementation of the compiler. In the Tru64 compiler, cxx, the compiler sends all template-generated code to a cxx-repository where it is (theoretically) shared by all code. Of course this makes the dependency detection REALLY slow, but at least you don't have the same template code duplicated in every object file. Also, some compilers cull out template code during library link time.

Of course, this doesn't get around the fact that templates are an abomination, created by the fact that C++ objects aren't objects, but glorified structs.

[ Parent ]

Couple of questions (none / 0) (#121)
by yamla on Tue May 27, 2003 at 02:18:40 PM EST

I have two questions.  I'm not claiming that templates are wonderful, just clearing up a couple of possible misunderstandings I have about your article.

First, you claim that all the problems solved by templates were solved better by other languages (without using templates).  You use the example of ML that allowed polymorphic types without templates.  Is ML's type-inference scheme done at compile time or at runtime?  I'm assuming compile-time because templates are a compile-time feature of C++.

Second, the interfaces problem.  This is indeed problematic in C++ with templates.  However, it seems to me that there is nothing stopping you from defining an interface class and inheriting from that.  You can then use templates (specifically, partial template specialisation and template metaprogramming) to ensure your templated functions only operate on classes that inherit from your interface class, thus guaranteeing your templated functions are never used on types that don't fit the bill.  You may be able to do this even without defining an interface, though my metaprogramming skills aren't up to that task.  Certainly, C++ doesn't require you to use templates like this, but doesn't the fact that you can diminish your argument somewhat?
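One old-style way to sketch that constraint (the names `Printable`, `MustBePrintable`, and `use` are hypothetical, for illustration only): a helper template whose instantiation fails unless T* converts to the interface pointer type, i.e. unless T actually derives from the interface.

```cpp
#include <cassert>

// Hypothetical interface that the templated code requires.
class Printable {
public:
    virtual ~Printable() {}
    virtual int value() const = 0;
};

// Compile-time check: instantiating this fails to compile unless T*
// converts to Printable*, i.e. unless T derives from Printable.
template <typename T>
struct MustBePrintable {
    static void constraint(T* p) { Printable* b = p; (void)b; }
    MustBePrintable() { void (*f)(T*) = constraint; (void)f; }
};

// A templated function that rejects types outside the interface.
template <typename T>
int use(const T& t) {
    MustBePrintable<T> check;
    (void)check;
    return t.value();
}

class Three : public Printable {
public:
    int value() const { return 3; }
};
```

Calling `use` on a class that doesn't inherit from `Printable` produces a (reasonably pointed) error at the constraint line, rather than deep inside the function body.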

Couple of answers (none / 0) (#130)
by jacob on Tue May 27, 2003 at 02:52:59 PM EST

First: yes, that's right, ML handles type-checking entirely at compile-time. There's no runtime speed penalty associated with any of the technologies I mentioned.

Second: people do that. I think templates can be favorable in this case because they don't require any prior access to the source code or mixin-style extension to retroactively graft the proper operations onto a class, and could be slightly more efficient at runtime (no virtual method table).

There may be a further reason having to do with operations that require friend-level access, but I'm having trouble formulating a specific problem, so I might be wrong on that.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]

Meta-programming (5.00 / 2) (#129)
by The Writer on Tue May 27, 2003 at 02:37:57 PM EST

I've been fascinated with metaprogramming for a while now. Although C++ templates may not be the best place for this (though you can argue both ways), I think metaprogramming should be a fundamental part of modern programming languages.

I've often written programs which are driven by tables, or other such data, which is constructed at compile-time rather than runtime. A common example is parsers and lexers. The reason tools like flex and bison exist is because it is very tedious to hand-code these tables. You want a higher-level representation (eg., grammar rules and lexical regexes) that a human can read and modify easily; you don't want to have to recompute state changes and rule numbers every time you tweak something in your grammar.

Other examples include game rules and customizations which are built into a game engine at compile-time: you don't want to wait to runtime to compute these things, 'cos it's unnecessary overhead. Especially if you have scriptable features. Or, you have a set of optional modules which you can compile into your webserver core; but you need to hand-tweak various module tables and lists before you can do it, or sprinkle tons of #ifdef's all over the code to make it work.

Meta-programming makes these problems much more manageable. For example, some of my programs have perl scripts that run at build-time, to generate from specification files large tables that would be very time-consuming and error-prone if done by hand. One such program has a flex input file that contains over 200 tokens, each returning a bitfield-encoded token number. The rules are extremely repetitive (they are essentially all permutations of a set of symbols), but the program is very sensitive to incorrect token numbers. Coding this by hand is ridiculously tedious, and can easily introduce errors that are hard to detect. The solution? A spec file that describes the set of symbols to permute, the rules for computing the token number, and a Perl script to transform this into a flex input file at compile-time. Essentially, the spec file is a kind of "meta-language" that describes the program at a higher level; the Perl script then translates this into a lower-level format that the compile tools understand.

This is all fine and dandy, except for one flaw: using Perl scripts, or other external means of meta-programming, is outside the domain of the compiler; so a lot of compiler features like type-checking, etc., are not directly accessible. This means you can get type errors because the tool you use failed to check for some boundary condition. Just like in C++ templates, the errors are very cryptic because they come from a different layer than the high-level meta-source that you're working with.

It would be great if compilers (or languages for that matter) could be built with meta-programming support in mind. It isn't really that hard: you could just have a meta section in the same programming language (or a suitable subset thereof), which is executed at compile time by the compiler, to produce (all or part of) the program text to be compiled. The compiler could then perform full syntax and type-checking for you.
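C++ templates can already do a crude version of this table-generation idea: the entries below are computed entirely by the compiler during instantiation, so they land in the binary as constants (a toy sketch of the principle, not the parser-table generator described above).

```cpp
#include <cassert>

// Compile-time factorial: the compiler evaluates this recursion
// while instantiating the template, so Fact<5>::value is a constant.
template <int N>
struct Fact {
    enum { value = N * Fact<N - 1>::value };
};

template <>
struct Fact<0> {
    enum { value = 1 };
};

// A "table" whose entries were all computed at compile time, with
// no runtime initialization cost.
const int fact_table[] = {
    Fact<0>::value, Fact<1>::value, Fact<2>::value,
    Fact<3>::value, Fact<4>::value, Fact<5>::value
};
```

The same pattern scales (painfully) to state tables and the like, which is exactly why in-language meta-programming support with real type checking would be so much nicer than either templates or external Perl scripts.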

Funny you use lex & yacc as examples (5.00 / 1) (#131)
by jacob on Tue May 27, 2003 at 03:01:34 PM EST

PLT Scheme comes with a parser-tools collection that does exactly what you're describing: it provides lex and yacc macros that take token and grammar rules and expand at compile time into lexers and parsers. Because it's implemented as macros, you don't have to do anything special to make any of your old program analysis tools work on your lex and yacc declarations -- they just work.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
Interesting (none / 0) (#186)
by 0xdeadbeef on Tue May 27, 2003 at 08:29:28 PM EST

The same exists in C++.

Compiling with it makes my computer cry.

[ Parent ]

My problems with C++ templates (5.00 / 2) (#132)
by bhurt on Tue May 27, 2003 at 03:23:30 PM EST

I have two problems with templates as they're implemented in C++:

1) They break LALR(1) parsing of the language.  Consider the sequence of tokens:
    a < b , c > d ;
In C, how to parse this expression is clear: it's two comparisons joined by the comma operator, and is parsed like:
    ( a < b ) , ( c > d ) ;
Slightly odd, but perfectly legal and understandable.  C++ allows that parse too, but in addition the above sequence of symbols could be a variable declaration, introducing variable d of type a<b,c>.  You can't parse C++ with classic compiler tools like Lex and Yacc.

This is a minor nit in some ways, but it displays (I think) a larger flaw in C++ in general: that it was standardized without first being implemented.  An implementation before the template syntax was standardized would have brought up the LALR(1) parsing problem, which could then have been solved easily by changing the < > to some other symbol combination, say <[ ]>.

2) Code generation.  In general, when ML or Ocaml or the like generate a function with an abstract type, it only needs to generate one version of that function.  In Java terms- everything, even ints and booleans, are full objects, derived from class Object (not exactly, but close enough to get the idea across).  So the abstract function, at the assembly code level, only needs to handle Objects (or void*'s for C programmers).  To prevent this from being a performance problem, the compiler often "unboxes" small data types like ints and booleans.

With C++ every instantiation of the template with different arguments requires a different instantiation of the code.  If you have a template foo<>, foo<int> is one implementation, foo<char> another, foo<double> a third, and so on.  All of a sudden 2K of code becomes 20K, or even 200K (as the compiler finds it hard to know that it only needs one implementation for foo<bar*> and foo<baz*>).  Not to mention how many times I've seen stupid template definitions- like the array template which took not only the type of objects to hold, but also the number of items to hold.  So arrays of length 3 had a different implementation than arrays of length 4.  

This code specialization comes at a cost: larger executables, larger memory footprints, and a larger code working set that makes the cache less useful.

Brian

Way off base (none / 0) (#135)
by Vulcannis on Tue May 27, 2003 at 03:56:19 PM EST

  1. A programming language is meant to be written and read by humans first.  Restricting a language based upon a restrictive grammar class simply to ease the burden of compiler writers is ridiculous.
  2. Modelling an array parameterized by a size is not stupid.  It is simply acknowledging and modelling that the array's size affects its type.  Whether the choice of modelling is efficient or otherwise a good idea given C++'s implementation of templates is another question.
If you actually care about 200k of memory then instantiating large numbers of templates is something you shouldn't be doing in the first place.

---
If it's not black and white, you're not looking close enough.
[ Parent ]
Oft repeated, but not true (5.00 / 1) (#141)
by jacob on Tue May 27, 2003 at 04:42:26 PM EST

Grammar simplicity and human readability are often not at odds. The example given:


A<12,34> c;

is ambiguous not just for parsers but for actual humans too. This is not usually a sign of a pellucid language.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]

Not a common problem (none / 0) (#143)
by DodgyGeezer on Tue May 27, 2003 at 04:57:04 PM EST

It might be ambiguous, but after 8 years programming C++ full time, this is the first time I've encountered it.  Most people are sane enough not to write their code like that, letting the rest of us make the natural assumption that we're looking at a parameterised variable declaration.  Furthermore, most of us good programmers don't use names like "A" for classes, and so would recognise the class, especially if it's scoped with a namespace.  Finally, this is just another example of why Hungarian notation is a good idea, especially when using C++.  ;)

Although you have a fair point about there being an ambiguity in the language, the reality of the situation is that it's unlikely to cause people problems.

[ Parent ]

Uncommon problems are the worst ones (4.00 / 1) (#146)
by jacob on Tue May 27, 2003 at 05:06:26 PM EST

Because when you encounter them they make no sense. codemonkey_uk posted a great one a while back in his diary (this is just copied from there):

Can you explain why the line marked "!!!" does not compile? Answers on a postcard...


#include <iostream>

template <int N, int M>
class A
{
public:
      static inline double f() { return N*M; }
};

void
func(double i, double j)
{
      std::cout << "i = " << i << " , j = " << j << std::endl;
}

#define TEST_MACRO(a,b) func(a,b)

int main()
{
      TEST_MACRO(5.0, A<3,4>::f());     // !!!

      return 0;
}

There is a lesson to be learnt here, but it's debatable what the lesson actually is! :)
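For what it's worth, the failure is a preprocessor problem rather than a template problem: macro arguments are split at every top-level comma, so the comma inside A<3,4> makes TEST_MACRO appear to receive three arguments. A sketch of the usual workaround, with printing replaced by recorded values so the behavior is easy to check (the helper names are illustrative):

```cpp
#include <cassert>

template <int N, int M>
struct A {
    static double f() { return N * M; }
};

double last_i = 0, last_j = 0;  // recorded instead of printed

void func(double i, double j) { last_i = i; last_j = j; }

#define TEST_MACRO(a, b) func(a, b)

// TEST_MACRO(5.0, A<3,4>::f()) does not compile because the
// preprocessor sees three arguments: "5.0", "A<3", and "4>::f()".
// An extra pair of parentheses hides the comma from the preprocessor:
void demo() {
    TEST_MACRO(5.0, (A<3, 4>::f()));
}
```

The parenthesized argument expands to a perfectly ordinary expression, so the template machinery never even notices the detour.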




--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
Lovely, preproc and templates (none / 0) (#153)
by DodgyGeezer on Tue May 27, 2003 at 05:19:48 PM EST

I was going to say it violated one of my coding standards: unnecessary use of the preprocessor.  But then I thought I could quote you instead ;)

[ Parent ]
heh :) (none / 0) (#155)
by jacob on Tue May 27, 2003 at 05:22:23 PM EST

I'm not saying it's a good thing to do, I'm just saying it could come up.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
Testing, testing, testing ;) (n/t) (none / 0) (#158)
by DodgyGeezer on Tue May 27, 2003 at 05:26:59 PM EST



[ Parent ]
I never said they were (none / 0) (#161)
by Vulcannis on Tue May 27, 2003 at 05:47:31 PM EST

The original comment quite clearly criticized the syntax solely due to its grammar conflict.  I responded because that argument is irrelevant.

I did not endorse the C++ syntax as particularly human-readable, nor did I state that there wasn't any relation between the concepts of grammar ambiguities and ease of parsing for humans.

So what exactly did I say that is "not true?"

---
If it's not black and white, you're not looking close enough.
[ Parent ]

Also off base (none / 0) (#160)
by coderlemming on Tue May 27, 2003 at 05:42:25 PM EST

  1. Look at C and C++. Plenty of the structures in C are there because they're simple to break down into efficient assembly code. C was designed to be a very efficient yet readable medium-level language which is suitable for time-critical programming tasks. C++ is its object-oriented child, and it's kept many of the same ideals. Since both C and C++ are not really defined with readability and beauty in mind, I don't think the original comment is that far off base.
  2. You're just misinterpreting the original comment so you can make a scathing reply. It's pretty obvious the original comment's author meant exactly what you said: that modelling an array parameterized by a size is a grossly inefficient design in light of C++'s implementation of templates.



--
Go be impersonally used as an organic semen collector!  (porkchop_d_clown)
[ Parent ]
Oh please (none / 0) (#165)
by Vulcannis on Tue May 27, 2003 at 06:22:45 PM EST

How a particular language syntax "breaks down" into assembly code, whether efficiently or not, has nothing to do with whether its grammar is LALR.  Neither do readability or beauty.  A programming language is a tool for use by humans, and regardless of how well C or C++ meet any such criteria, criticism of a language feature because of the difficulty of writing parsers for it is just specious.

I did not say parameterizing arrays by size was "grossly inefficient," or even inefficient.  I cannot make such a judgement without more information than he presented, and neither can you.  Similarly, while he may believe that such a design is "stupid," the information he provided leads to no such conclusion.

It is these misattributions that I took exception to and led to my "scathing reply."  But I'm sure nothing I can say will mean anything to you once subjected to your "interpretation."

---
If it's not black and white, you're not looking close enough.
[ Parent ]

Wrong (5.00 / 1) (#203)
by sigwinch on Tue May 27, 2003 at 11:10:15 PM EST

Restricting a language based upon a restrictive grammar class simply to ease the burden of compiler writers is ridiculous.
1. Ten years and hundreds of millions of dollars have been flushed into the sewer that is C++, and we still don't have a single correct compiler. Moreover, even if we were to pick one and declare it correct by fiat, it still produces worthless diagnostic messages.

2. Automated code analyzers would be a major help in improving the correctness of programs, especially with respect to information security. Given the experience writing compilers, C++ analyzers are not even a lost cause.

3. A language's syntax does not have to be baroque to be powerful and flexible. Consider the D programming language, which has syntax regularity and simplicity as one of its major goals. In fact, Walter Bright says this:

"Isn't ease of use for the user of the language more important? Yes, it is. But a vaporware language is useless to everyone. The easier a language is to implement, the more robust implementations there will be. In C's heyday, there were 30 different commericial C compilers for the IBM PC. Not many made the transition to C++. In looking at the C++ compilers on the market today, how many years of development went into each? At least 10 years? Programmers waited years for the various pieces of C++ to get implemented after they were specified. If C++ was not so enormously popular, it's doubtful that very complex features like multiple inheritance, templates, etc., would ever have been implemented. I suggest that if a language is easier to implement, then it is likely also easier to understand. Isn't it better to spend time learning to write better programs than language arcana? If a language can capture 90% of the power of C++ with 10% of its complexity, I argue that is a worthwhile tradeoff."
D is becoming substantially as powerful as C++, and a few people working almost as a hobby project have made a D compiler better than the early commercial C++ compilers.

Along those lines, Linus Torvalds says this about C++:

"Another reason was related to the above, namely compiler speed and stability. Because C++ is a more complex language, it also has a propensity for a lot more compiler bugs and compiles are usually slower. This can be considered a compiler implementation issue, but the basic complexity of C++ certainly is something that can be objectively considered to be harmful for kernel development."

--
I don't want the world, I just want your half.
[ Parent ]

Yes, you are (none / 0) (#218)
by Vulcannis on Wed May 28, 2003 at 01:19:39 AM EST

  1. The difficulty with compiler construction for C++ has nothing to do with syntax and everything to do with semantics. Basic syntax parsing of a language like C++ is a solved problem and has been for decades. Other classes such as GLR grammars are quite sufficient for modern compiler use; there is no need to place arbitrary restrictions on your syntax by going to more restrictive and less expressive classes such as LALR. The C++ export feature is a good example, as implemented by EDG:
    "The export feature alone took more than three person-years to code and test (not including design); by comparison, implementing the entire Java language took the same three people only two person-years." ("Export" Restrictions, Part 2)
    The export feature is hardly a complex piece of syntax. Moving to a LALR grammar would have no effect on how difficult it is to implement difficult language features.
  2. Of course. And moving analyses into the compiler or even the language definition itself would be even better. But what does this have to do with anything I was talking about?
  3. Whoever said it did? And what does being LALR have to do with being regular and simple to a human? LALR is simply a class of grammars that can be parsed easily and efficiently. It has no relation at all to whether a human finds the language syntax regular or simple. And while D looks interesting, I fail to see how pulling these quotes out does anything for your argument.


---
If it's not black and white, you're not looking close enough.
[ Parent ]
about the benefits of simplicity (none / 0) (#213)
by joto on Wed May 28, 2003 at 12:29:17 AM EST

A programming language is meant to be written and read by humans first. Restricting a language based upon a restrictive grammar class simply to ease the burden of compiler writers is ridiculous.

If that was the only consideration, I would agree, but it is not!

Restricting the language based upon a sane grammar doesn't only help compiler writers. It also helps people writing: code-analysis tools, refactoring tools, semantic checkers (e.g. lint), third party preprocessors (Qt's moc, the SQL pre-processor), developers using more than one language (mock up your own ad-hoc interface generator fast with a standard grammar), and so on, the list could continue indefinitely...

[ Parent ]

And I'll say it again... (none / 0) (#221)
by Vulcannis on Wed May 28, 2003 at 01:28:29 AM EST

Parsing syntax is the easy part.  Anyone writing complex tools that require parsing of a language such as C++ will probably be doing far more difficult things than writing an at-worst-case GLR class parser.

---
If it's not black and white, you're not looking close enough.
[ Parent ]
and i'll say it again too (none / 0) (#270)
by joto on Wed May 28, 2003 at 07:05:17 PM EST

Parsing syntax is the easy part

No, it isn't. Trust me, I've tried, on several occasions for several purposes. You might be able to buy a working C++ grammar from someone, but that's not something you will do unless you intend to sell your tool commercially, and think it will succeed. You might be able to write one yourself, but it will take a long time and an equally huge investment, so that's not a good solution either. Finally, you can mock up something that covers some subset of the most common cases you expect to deal with. For some tasks this works OK; for others, it is far from optimal.

Anyone writing complex tools that require parsing of a language such as C++ will probably be doing far more difficult things than writing an at-worst-case GLR class parser.

This might be true now, but that is only because parsing C++ requires a huge investment up front. If parsing it was simple, simple tools would exist. And many of the tools I mentioned in the previous post could be made extremely simple, if there only was a simple way to parse C++ code.

Here's another example: I was once thinking about a better TAGS for C++, that could be used in e.g. emacs, or in a special tool written with the sole purpose of browsing the database. This would have been ridiculously simple, if I didn't have to write the parser. It would also have been very useful. Having to write the parser made me just give up on it, though...

[ Parent ]

Re: and i'll say it again too (none / 0) (#272)
by antc on Wed May 28, 2003 at 08:04:49 PM EST

This might be true now, but that is only because parsing C++ requires a huge investment up front. If parsing it was simple, simple tools would exist. And many of the tools I mentioned in the previous post could be made extremely simple, if there only was a simple way to parse C++ code.

You're right. If only there were an easy way to parse C++ code.

[ Parent ]

yeah, another promising but half-baked solution. (none / 0) (#273)
by joto on Wed May 28, 2003 at 08:57:53 PM EST

Quoting the FAQ:

Why are C++ function bodies not dumped in XML?
The original sponsors of the project had no need for function bodies. So far the authors have not had time or funding to implement the support. Contact the mailing list if you are interested in contributing this support or providing funding to have it implemented.

This surely doesn't make it useless for every project. There are a lot of useful things you can still do with it, but don't think it's a full parser.

[ Parent ]

Oh really? (4.00 / 1) (#275)
by Vulcannis on Wed May 28, 2003 at 10:15:09 PM EST

Gee, I guess 10 minutes of googling is too much for some people.

http://www.cs.berkeley.edu/~smcpeak/elkhound/sources/elsa/index.html
http://www.antlr.org/
http://www.stack.nl/~dimitri/doxygen/
http://www.empathy.com/pccts/download.html
http://gcc.gnu.org/

All of those links provide source for C++ parsers or products containing C++ parsers, of varying quality.

---
If it's not black and white, you're not looking close enough.
[ Parent ]

Exactly...varying quality... (5.00 / 1) (#278)
by joto on Thu May 29, 2003 at 05:41:42 AM EST

Varying quality is the keyword here. Trust me, I know how to use google, although it was some time ago, and I haven't seen Elsa before, which seems quite nice, although certainly not perfect. Maybe the most realistic starting point these days, but I haven't tried it.

Quote from the doxygen page: "Doxygen is not a real compiler, it is only a lexical scanner." Good enough for generating documentation, not good enough for much else (well, SWIG-like tasks could probably be handled easily).

ANTLR documentation says: Please note that we have used this grammar almost entirely for analysing C programs so we have very little experience of using it for parsing C++. For instance, it was issued in 1994 and doesn't know about 'namespaces'. I think this speaks volumes...

PCCTS is an older version of ANTLR.

gcc is probably the best option, but please don't tell me that it doesn't require a huge investment of time to get it to do anything useful besides being a compiler. See this page for more on that.

Now, do you see a trend here? There is no limit to the kind of cool tools we could have, if mocking up a good C++ parser was easy. But it isn't. As a result, automated source code tools are scarce, and not trustworthy. If you were to add a third argument to a function globally throughout a project today, would you use an automated tool, or would you use a text-editor?

A good parser should be perfect in interpreting the standard, and it should be simple enough to modify for understanding dialects (which there certainly is no scarcity of in the C++ world). Making it simple to work with the parse-tree is also of importance. gccxml is certainly the best idea I've seen in this regard, being based on a free and common compiler, and if it was finished, I would shut my mouth up.

(And lastly, this rant of course applies to any language that is non-trivial to parse, which basically means anything other than S-exprs or XML. But C++ is certainly worse than just about any other mainstream computer language. And yes, you are allowed to call me a lazy whiner, because with enough effort it certainly would be possible to do what I want anyway; trouble is, that effort is much more than I'd be willing to put into it. I am no Linus!)

[ Parent ]

One more suggestion (none / 0) (#281)
by cwitty on Thu May 29, 2003 at 01:40:57 PM EST

Have you looked at OpenC++? The home page says: "OpenC++ is also a perfect code base for projects requiring C++ parser and static analyzer." (I don't know if that's actually true.)

Note that I'm not disagreeing with the idea that C++ syntax is too complex. Here's a quote I like on the topic, from http://compilers.iecc.com/comparch/article/91-07-037; it discusses the author's efforts to create a C++ parser which complied with the then-available (pre-ANSI) standard, and also matched the behavior of cfront, the original C++ implementation:

It should be noted that my grammar cannot be in constant agreement with such implementations as cfront because a) my grammar is internally consistent (mostly courtesy of its formal nature and YACC verification), and b) YACC generated parsers don't dump core. (I will probably take a lot of flack for that last snipe, but.... every time I have had difficulty figuring what was meant syntactically by some construct that the ARM was vague about, and I fed it to cfront, cfront dumped core.)


[ Parent ]
Misunderstanding (none / 0) (#286)
by Vulcannis on Thu May 29, 2003 at 05:53:36 PM EST

I think we're talking about slightly different things. The original poster only mentioned grammar-level concepts, such as LALR and token choices. There is far more to writing a compiler than simply parsing tokens, and thus my original reply that simplifying the grammar in isolation was irrelevant. I still hold that transforming the character stream into a basic AST is fairly easy.

It sounds like you're referring to the larger process of going from the character stream right up to a very richly annotated AST, symbol table, etc. I'm not arguing with this, as anything that involves wading through C++ semantics will be complicated. But I simply think that falls outside the bounds of the original argument. For example, you could take an LALR subset of the C++ language, but you would still have to deal with difficult concepts such as Koenig lookup. I don't think adding namespace support to the grammars mentioned in the links would be hard, but supporting them in the symbol table, AST, etc., would be a royal pain. But again that's a lot later in the game than the original poster was talking about.

When it comes right down to it though, difficulty is simply subjective.

I think there is only one existing example of a parser that meets your definition of "good", though: the Comeau compiler, which is based upon EDG. And since no other major compiler seems to have such complete C++ Standard support, I don't think an analysis tool would need it right now either.

To answer your question about adding a third argument to a function, I would prefer to use a refactoring browser. But since I don't have that option at work, yes, I'd end up resorting to plain old text editing to do the job.

---
If it's not black and white, you're not looking close enough.
[ Parent ]

... standardized without first being implemented (none / 0) (#136)
by illegal eagle on Tue May 27, 2003 at 03:58:27 PM EST

Also applies to Standard Pascal. I'm really glad that Borland decided to extend the language, the original design is nearly unusable.

[ Parent ]
Uhhh. (none / 0) (#163)
by awgsilyari on Tue May 27, 2003 at 06:04:21 PM EST

C is also not LALR, so what? There is a syntactic difference between a typename and an identifier, e.g.:

typedef int foo;
foo bar;

In this case 'foo' is lexed as a TYPENAME, not an IDENT. How is this managed? When the first typedef is encountered and parsed, the compiler notes that 'foo' is no longer an IDENT, but a TYPENAME. A little hanky-panky goes on between the lexer and parser, nothing more.

Similarly, in your a<b,c> d example, the compiler will notice that 'a' has not yet been declared as a variable, sees that the next token is '<', and rather easily concludes that 'a' must be a template typename, not an identifier.

Sure, it isn't strictly LALR, but it isn't any worse than C, which has precisely the same problem.


--------
Please direct SPAM to john@neuralnw.com
[ Parent ]

C++ was never LALR (none / 0) (#200)
by Pseudonym on Tue May 27, 2003 at 10:40:13 PM EST

C++ was never LALR(1), though for a different reason than C. C is LALR(2) because of the typedef issue, but C++ is actually inherently ambiguous. Consider this code:

A B();

This can either be a function declaration or a declaration of a variable "B" of type "A" initialised with a default constructor. The only way to tell is by looking at the types.

LR-parsability is, IMO, somewhat overrated these days, especially in newer languages which support declaring new operators with their own precedence and associativity. The extra expressiveness is worth it.



sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
[ Parent ]
function declaration or variable... (5.00 / 1) (#212)
by joto on Wed May 28, 2003 at 12:20:43 AM EST

I thought it was only a function declaration. Could you give me an example of when it is possible to have the C++ compiler accept this code without changes as a variable definition?

[ Parent ]
Oops (none / 0) (#216)
by Pseudonym on Wed May 28, 2003 at 01:13:57 AM EST

You're right. My mistake.

The kinds of ambiguities I really meant are detailed elsewhere. Note that the disambiguation is entirely syntactic (e.g. if you can interpret it as a declaration, then it is; otherwise, if you can interpret it as an expression, then it is), but the grammar is still inherently ambiguous.



sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
[ Parent ]
Use 'em all the time, couldn't live without 'em. (5.00 / 2) (#139)
by alyosha1 on Tue May 27, 2003 at 04:16:08 PM EST

I'm in the process of implementing a medical imaging library that would have been vastly more complicated without the powerful features of C++ templates. The library implements the DICOM (Digital Imaging and Communications in Medicine) standard, a huge document several thousand pages long. DICOM has its own type scheme, declaring types such as 'UL' (unsigned long), 'PN' (patient name), and so on.

To interface with DICOM images in C++, there needs to be a mapping between DICOM types and C++ types. (Similar issues arise in other problem domains: any database application needs to map SQL data types to C++ types.)

Templates allow me to state these relationships explicitly and simply:

typedef CppTypeFromDicomType<PN>::Type CppType ;
CppType data;
cout << data;

In the above (simplified) snippet, 'PN' is an enum value specified in the DICOM standard, and CppTypeFromDicomType is a template meta-function that looks up the enum value 'PN' at compile time and tells the compiler (in this case) "the variable 'data' has type 'string'". If we'd fed in the enum value 'UL', then 'data' would have type 'unsigned long', and so on.

I haven't explained how the meta-function works here but you can see it in the VR.hpp file in dicomlib.zip on the linked site.

Since I implemented this, I have not run into ANY type-mismatch bugs or problems in my library between DICOM types and C++ types. Because this all happens at compile time, the code will refuse to compile if I make a type error - for example, trying to write a C++ integer onto a DICOM string.

Another place where I've seen something similar is the excellent libpqxx C++ client API for PostgreSQL. A value returned from an SQL query can obviously have any SQL data type, and libpqxx makes it very easy to extract that value into a C++ type. Here's a snippet from one of my applications that uses the library:

void SomeFunction(const Result::Tuple& Row)
{
    string Host;
    int Port;
    Row["host"].to(Host);
    Row["port"].to(Port);
}


Note how libpqxx automagically figures out what type I want and handles all the data type conversion, using some clever behind-the-scenes template technique that I, the end programmer, don't even need to worry about.

So no, templates are not perfect; the syntax can be ugly at times and the learning curve steep. But here's one programmer who finds them indispensable for churning out robust, usable library code.

Why C++ truly sucks. (1.33 / 3) (#140)
by Phillip Asheo on Tue May 27, 2003 at 04:35:26 PM EST

Because people try and use it for the WRONG things...

They use it where they should use Java, they use it where they should use Perl; hell, they use it where they should use C!

If I had a dollar for every line of shit C++ code I have had to read during my employment, I would have a lot of cash by now.

C++ is a truly shitty language. However, if people want to pay me $$$s to clean up their 'mission critical' shitty C++ mess after them, that's fine by me.

--
"Never say what you can grunt. Never grunt what you can wink. Never wink what you can nod, never nod what you can shrug, and don't shrug when it ain't necessary"
-Earl Long

Java (none / 0) (#159)
by Arevos on Tue May 27, 2003 at 05:37:28 PM EST

Under what circumstances would it be right to use Java? The only thing Java seems to be good for is cross platform portability. Besides which, Java is basically just C++ with less rope to hang yourself with, but not enough rope to make it a better language.

And yes, I do rather have a grudge against that hell-spawned language :)

[ Parent ]

Whenever... (none / 0) (#211)
by joto on Wed May 28, 2003 at 12:15:49 AM EST

Under what circumstances would it be right to use Java?

Whenever:

  • Your programming team consists of people who are not all C++ gurus.
  • Performance is somewhat important, but not all-important.
  • There exist useful Java libraries that help you finish the task faster.
  • You don't inherit a large body of old code, but have a chance to start afresh.
  • An object-oriented approach will help.
The only thing Java seems to be good for is cross platform portability.

I deliberately did not include portability among my reasons above. That, at least, is my experience...

And yes, I do rather have a grudge against that hell-spawned language :)

Well, so do I. It's a language designed for average programmers, not as a cool hacker's toy. This is important in the real world, but when programming just for fun, I would never choose Java.

[ Parent ]

Bitching (none / 0) (#234)
by Arevos on Wed May 28, 2003 at 06:41:38 AM EST

Your programming team consists of people who are not all C++ gurus.

Personally, I find designing a GUI program in a toolkit like Qt far easier than using Swing. Though, of course, Qt costs money, and the Windows version needs Visual Studio IIRC. So perhaps Java could be used under such circumstances, although there are probably other libraries about which could do a better job.

Performance is somewhat important, but not all-important.

Use something like Perl or Python, then :)

There exists useful java libraries that help you finish the task faster.

Ok, true.

You don't inherit a large portion of old code, but have a chance to start afresh.

Again, true.

An object-oriented approach will help.

I wouldn't say that Java is a good object-oriented language, but there you go.

Ok, perhaps Java can be used under some circumstances. I'd be reluctant to use the language in a major project though, especially after my experience with it during my degree.

Yes, I'm bitching :)

[ Parent ]

Templates are only a big deal to C++ newbies (5.00 / 2) (#142)
by DodgyGeezer on Tue May 27, 2003 at 04:46:42 PM EST

"On the other hand, if it doesn't, you'll get an explosion of template error messages that don't really indicate the source of the problem. The worst thing about this situation is that it can cause insidious lurking bugs: the code works fine for a year, then one day the new guy uses it in a way that's not quite expected and all the sudden everything breaks for no obvious reason."

If you get errors, the problem can't hide away for a year, because the code won't compile. If it's a bunch of warnings and they're ignored, the programmer is either bad or inexperienced. The "explosion of template error messages" isn't such a big deal for somebody who isn't new to the language. Experience and common sense track them down... and if you're writing new code, then you know where the problem probably lies already.

Unrelated to templates, but as for the point about Java interfaces: this has always been possible in C++ through the use of pure virtual classes.  It's not quite as neat as Java and can easily be changed without warning by somebody later on, but it's still there.  It's used extremely heavily by implementations of things like COM.

It seems most people who have a problem with templates really have a problem with writing them. I can understand that... but then I rarely have to write new templates. Instead I VERY HEAVILY use the STL; with that, the template syntax is only evident during variable declaration and compilation errors. Debugging into the STL is an endeavour of patience and experience, but that's nothing to do with templates, just the very terse code that would be hard to follow even if it wasn't parametric.

my fault (5.00 / 1) (#144)
by jacob on Tue May 27, 2003 at 04:58:18 PM EST

That paragraph seems to be the result of a somewhat unfortunate juxtaposition. What I meant was something along the lines of, "You can write a template and it will work fine for a year because every type you provide it with has the << operator defined for it, but then one day a year later you may provide the template with a type that doesn't have << defined for it and when that happens you'll get a compiler error."

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
Ahhh, I see (none / 0) (#148)
by DodgyGeezer on Tue May 27, 2003 at 05:11:27 PM EST

I guess I don't have a problem with that. That could add some unexpected implementation time to a project. The template writer has to make some assumptions about what's reasonable behaviour in other classes, and determine what might cause extra effort. For example, the STL documentation is clear about what operators are required for some of the collections; e.g. occasionally I have to write an operator<. The first time I encountered this, I did have some weird behaviour... but that's what unit testing is for. I should have RTF[STL]M properly for that template the first time though! Definitely a weakness in the language. But no worse than the pointer problems C programmers have ;)

[ Parent ]
As good as it gets (4.00 / 2) (#145)
by glauber on Tue May 27, 2003 at 05:01:41 PM EST

Templates are not wrong. In fact, they're as good as you can get with a C/C++ type of language. Granted, you can do better in ML and in Lisp, but C++ isn't ML and it isn't Lisp either.

You can see the need that C++ templates fill as soon as you start programming in Java. Java is supposed to be safer than C because it's type safe, but as soon as you start using its collections framework, Java is no better than C, because you have to cast everything. C++ solves this elegantly through templates.

On the other hand, use of templates for compile-time optimization is a hack that should be punished by death through unending debugging sessions.


glauber (PGP Key: 0x44CFAA9B)

Sigh. (2.00 / 3) (#147)
by jacob on Tue May 27, 2003 at 05:07:41 PM EST

Or, you could use polymorphism.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
Sigh. (none / 0) (#149)
by Skywise on Tue May 27, 2003 at 05:14:43 PM EST

Not if you're doing identical operations over dissimilar object collections.

[ Parent ]
By 'polymorphism' (2.00 / 2) (#152)
by jacob on Tue May 27, 2003 at 05:19:32 PM EST

I mean parametric polymorphism, the kind I describe in the first section of the article.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
In C++? (4.00 / 2) (#156)
by glauber on Tue May 27, 2003 at 05:23:04 PM EST

That's what I said: you could do better in ML or in Lisp, but not in a C/C++-type language. Parametric polymorphism involves ML's type system, which is very different from the type system used in C/C++. We could posit a C++ language with ML's type system, but it would violate C++'s goal of being a superset of C.


glauber (PGP Key: 0x44CFAA9B)
[ Parent ]

More than posit ... (5.00 / 1) (#164)
by spcmanspiff on Tue May 27, 2003 at 06:14:23 PM EST

These are both C-like with ML-inspired type systems (to a greater/lesser extent). Download and enjoy!

 

[ Parent ]

One more (none / 0) (#233)
by MfA on Wed May 28, 2003 at 06:12:08 AM EST

Cforall

"C++ extended the C type-system using an object-oriented approach with templates. Cforall extends the C type-system using overloading, parametric polymorphism, and type generators."

[ Parent ]
you could certainly throw (none / 0) (#169)
by jacob on Tue May 27, 2003 at 06:39:55 PM EST

type polymorphism into a superset of C. C++ threw in subtyping, after all ...

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
Polymorphism doesn't cut it (none / 0) (#151)
by glauber on Tue May 27, 2003 at 05:18:20 PM EST

You can't implement a generic collections framework using polymorphism. Or rather, you can, but it's meaningless. In Java, all Lists are lists of "Object", the root class of Java's hierarchy. These "Object"s are almost useless until you cast them back to something more specific further down the tree.


glauber (PGP Key: 0x44CFAA9B)
[ Parent ]

If that were what I meant (2.00 / 3) (#154)
by jacob on Tue May 27, 2003 at 05:21:03 PM EST

rather than the kind of polymorphism discussed in the article (parametric polymorphism), you'd have a good point. But I don't.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
OK (none / 0) (#157)
by glauber on Tue May 27, 2003 at 05:24:04 PM EST

Allrighty, then!


glauber (PGP Key: 0x44CFAA9B)
[ Parent ]

Oh yes... (none / 0) (#210)
by joto on Wed May 28, 2003 at 12:02:47 AM EST

There are a number of things that could be done to make templates in C++ "better":
  • Not use < and > as brackets. This would simplify syntax immensely.
  • Clarify rules for name lookup (ever tried to use templates and friends at the same time? It's very hard to get right).
  • Avoid syntactic clutter. I'm sure there is a better way than to prefix absolutely everything with template <class T>. Either the compiler should already know they are there, or the syntax should be toned down, so that you can still see your + operator inside that Matrix<T>::operator +() function.
  • Allow (or even better: "force") you to use some kind of contract for specifying what can be done with template arguments in a template function/class. This would make it much easier to typecheck things, and thus would produce saner error messages. Today we use traits for the same purpose, but with few of the benefits.
  • Abandon the raw object files, and move to something more akin to a module/unit/package/etc to avoid libraries in header files and/or the export keyword.
  • Have a typeof() operator similar to sizeof(). This would make it possible to have sane templatized mixed arithmetic, among other benefits.
  • Change some of the rules for template instantiation, to avoid bloat. E.g. there is no need for separate code to be generated for a vector of int, long, float, or char *, when they are all the same size on the machine.
...and so on. I'm sure you can find more...

[ Parent ]
Some Reasons (none / 0) (#260)
by SuperSheep on Wed May 28, 2003 at 04:29:39 PM EST

Not use < and > as brackets. This would simplify syntax immensely.

C++ uses operators where a lot of other languages use words. It is the language's style. Unfortunately this means that they don't have a lot of choices when it comes to operators.

Clarify rules for name lookup (ever tried to use templates and friends at the same time? It's very hard to get right).

I have tried, and it took a little while, but it does make sense given the entire context that vector<int> and vector<float> are different classes. If you think of them as just vector_int and vector_float you'll be fine.

Avoid syntactic clutter. I'm sure there is a better way than to prefix absolutely everything with template <class T>. Either the compiler should already know they are there, or the syntax should be toned down, so that you can still see your + operator inside that Matrix<T>::operator +() function.

All the parts of that line are necessary.

  • template<class T> specifies that it is a template function and what symbols below are template types (here T).
  • Matrix specifies the name of the class.
  • The <T> specifies that you're doing this with a generic type and not a specific type (<int> or <float>). I actually used this specialization ability in templates, along with inline, to produce a 0-overhead OO wrapper for OpenGL. Something that no other language I know of could have done.
  • The operator specifies that it's an operator and not a named function; the '+' specifies the operator symbol.
  • The parameters (here they were left out) specify, well, the parameters.

Allow (or even better: "force") you to use some kind of contract for specifying what can be done with template arguments in a template function/class. This would make it much easier to typecheck things, and thus would produce saner error messages. Today we use traits for the same purpose, but with few of the benefits.

I'm not sure if this would provide any benefit, since the compiler would instead just have to check that the code does what you said it could do, in addition to checking the overall syntax. The compilers will get better.

Abandon the raw object files, and move to something more akin to a module/unit/package/etc to avoid libraries in header files and/or the export keyword.

Some people, myself included, like header files over packages. I find it a lot easier to determine what a class can do by looking directly at the declarations without having to search the file trying to match brackets.

Have a typeof() operator similar to sizeof(). This would make it possible to have sane templatized mixed arithmetic, among other benefits.

GCC does provide this as the typeof extension (it isn't in the standard), and I've used it. Templates plus polymorphism actually make it mostly unnecessary.

Change some of the rules for template instantiation, to avoid bloat. E.g. there is no need for separate code to be generated for a vector of int, long, float, or char *, when they are all the same size on the machine.

int, long, float, and char * don't always use the same sizes on the machine. They can, but they definitely don't have to. And I'd rather they didn't use the same class, because they aren't acted on the same way. And if you did use the same class, then the resulting functions called from within the class would all be called with the same prototypes (maybe int). I wouldn't want my floats and char *'s interpreted as ints for display or arithmetic. Some code could be consolidated, but it would be complicated to do so. You could call this a limitation of the language, but I'm sure there were reasons behind it.

SuperSheep



[ Parent ]
more reasons (none / 0) (#271)
by joto on Wed May 28, 2003 at 07:58:26 PM EST

C++ uses operators where a lot of other languages use words. It is the language's style. Unfortunately this means that they don't have a lot of choices when it comes to operators.

Here are two choices that are better. Normal parens: () , or square brackets: [].

I have tried, and it took a little while, but it does make sense given the entire context that vector<int> and vector<float> are different classes. If you think of them as just vector_int and vector_float you'll be fine.

Well, yeah, most of the time. But oops, how could I forget to throw namespaces into the mix? That's when it gets really funny, because then we have two different rules for name lookup, depending on whether it's a friend declaration or not.

All the parts of that line are necessary.

Not really...

The compiler already knows that it is a template class, and it knows which template arguments are required for each member function, so it should not be necessary to write template above each member function definition.

If you insist on local placeholder names (which I would agree is a good idea), then, since they are already ordered, they can be gotten from the Matrix[T], and the compiler will know which template argument the T refers to. You still don't need any template above the definition.

The <T> specifies that you're doing this with a generic type and not a specific type (<int> or <float>).

I suppose this is a typo, as I am sure you know the T is only a placeholder, and it is the class or typename keyword that makes the T a placeholder for a typename, instead of a placeholder for a built-in type or template.

I'm not sure if this would provide any benefit, since the compiler would instead just have to check that the code does what you said it could do, in addition to checking the overall syntax. The compilers will get better.

Huh, do you know why we have static checking at all? To be able to check more things of course...

Even something as ancient as the STL is full of these "concepts". Now why, oh why, can't we just codify these concepts, since they've been used since the very beginning of template support in C++?

Some people, myself included, like header files over packages. I find it a lot easier to determine what a class can do by looking directly at the declarations without having to search the file trying to match brackets.

This is a typical braindead C++-centric approach. Just because you are used to writing header files does not mean that they do anything even remotely useful. If you want to read them yourself, they could just as well be automatically generated by a tool (e.g. javadoc in Java). The only reason they are still there is as a relic and historical reminder of the age of C, and of the fact that the preprocessor still exists.

On the other hand, I would say that I prefer simple object files, to more complex compiler-dependent packages. But since the C++ linker has already been different from the standard linker for as long as I can remember, I see no reason why we need to fool ourselves any longer.

int, long, float, char * don't always use the same sizes on the machine

Neither did I say anything that could have implied that. So why object to it?

And I'd rather they don't use the same class because they aren't acted on the same way

Yes, in a vector, they certainly are.

And if you did use the same class, then the resulting functions called from within the class would all be called with the same prototypes (maybe int). I wouldn't want my floats and char *'s interpreted as ints for display or arithmetic.

This is a debugger problem. I'm sure debuggers could be updated to support yet another name-mangling scheme.

Some code could be consolidated, but it would be complicated to do so. You could call this a limitation of the language, but I'm sure there were reasons behind it.

Probably. I suspect it is also a red herring, and that compilers already do this (or at least plan to) regardless of what the standard says. There is at least no way it could be detected by simply running the program.

[ Parent ]

Specialization (none / 0) (#306)
by wuzzeb on Mon Jun 02, 2003 at 10:00:49 PM EST

In _The C++ Programming Language_ by Bjarne Stroustrup (the creator of C++), he talks about specialization (in section 13.5, on page 341 of the Special Edition).

consider a vector template like
template<class T> class Vector {
  T *v;
  int size;
public:
  Vector();
  T &elem(int i) { return v[i]; }
  //..
};

"The default behavior of most C++ implementations is to replicate the code for template functions.  This is good for run-time performance, but unless care is taken it leads to code bloat in critical cases such as the Vector example.
  Fortunately, there is an obvious solution.  Containers of pointers can share a single implementation.  This can be expressed through specialization.  First, we define a version (a specialization) of Vector for pointers to void:"

template<> class Vector<void *> {
  void **p;
  int size;
public:
  Vector();
  void *&elem(int i);  // must be accessible to the derived Vector<T*> below
  //..
};

template<class T> class Vector<T*> : private Vector<void *> {
public:
  typedef Vector<void *> Base;
  Vector() : Base() {}
  T *&elem(int i) { return reinterpret_cast<T*&>(Base::elem(i)); }
  //..
};

The following is no longer a quote from the book... just a shortened overview of the next few pages....

The first class is a specialization for Vectors of void *'s.  We then use a partial template specialization in the second class to define a template that will be used for all Vectors of pointers.  All the functions in the second class are inline functions that call the Vector<void *> functions and will be optimized away.

So when you write Vector<Shape *>, the compiler takes the second template (template<class T> Vector<T*> ...).  It takes T to be Shape and generates a new class.  But that new class does not have any functions of its own; any functions it calls are just passed through to Vector<void *>.  So when you call Vector<Shape *>::elem(5), the Vector<void *>::elem(5) function is called.  Only the Vector<void *> version of the elem function will be generated and present in the binary.

Again a quote
"It is important that this refinement of the implementation of Vector is achieved without affecting the interface presented to users.  Specialization is a way of specifying alternative implementations for different uses of a common interface.  Naturally, we could have given the general Vector and the Vector of pointers different names... In this case, it is much better to hide the crucial implementation details behind a common interface."

The rest of the section talks about other details, like the order of lookup for specialization selection, specialization of functions.

But the idea here is that a given version of the STL (say, libstl for HP-UX) can define specializations specific to its architecture, while a different STL can define different ones.  Say the HP-UX version has vector<int>, vector<long long>, and vector<T *> all use the same runtime code, while the STL for i386-linux might have vector<int> and vector<long> use the same code.

So as an example: by default, say vector<char> and vector<int> don't use the same code, because an int is bigger than a char.  But say you wanted vector<char> and vector<int> to use the same code.  Somewhere in a header file, you could write something like

template<> class vector<char> : private vector<int> {
public:
  typedef vector<int> Base;
  vector() : Base() {}
  char elem(int i) {  // return by value: we can't hand out a char&
                      // referring into an int's storage
    int temp = Base::elem(i);
#ifdef DEBUG
    if (temp > 127 || temp < -128)
      throw Overflow();
#endif
    return static_cast<char>(temp);
  }
  // ...
};

So in your code vector<char> and vector<int> would still be two totally different classes... you would not be able to add a char to a vector<int> and you would not be able to add a int to a vector<char>.  All type checking still takes place.  But the runtime code generation will only emit functions for vector<int> and any code for vector<char> will use the vector<int> functions.  And you can even add optional checking.

partial template specialization is used a lot... say for example
template<class T>
void swap(Vector<T> &a, Vector<T> &b) {
  a.swap(b);
}

the Vector<T>::swap function on the vector would presumably only swap the internals (T *v and int size in our example).  Thus the Vector<T*>::swap function has already been specialized to the Vector<void *>::swap function.  Now all calls to swap(Vector<Shape *> &a, Vector<Shape *> &b) will also use the Vector<void *>::swap function (after a few inlines are performed).  Again only one function is present in the generated binary file, but this is transparent to the user.

I believe almost every C++ compiler supports this; otherwise the code bloat would be too huge, on the order of several megabytes for some programs.  Bjarne writes:

"This techniqe proved sccessful in curbing code bloat in real use.  People who do not use a technique like this (in C++ or in other languages with similar facilities for type parameterization) have found that replicated code can cost megabytes of code space even in moderately-sized programs.  By eliminating the time needed to compile those additional versions of the vector operations, this technique can also cut compile and link times dramatically.  Using a single specialization to implement all lists of pointers is an example of the general technique of minimizing code bloat by maximizing the amount of shared code."

PS.  C++ does have a typeid() operator (declared in <typeinfo>)... you can write something like so
template<class T>
int somefunc(T a) {
  if (typeid(a) == typeid(int)) {
    // ...
  }
  cout << typeid(a).name();
  cout << typeid(T*).name();
  return 0;
}
but again you can use specialization instead
template<class T> int somefunc(T a) {
  // general function if T != int
}

template<> int somefunc(int a) {
  // specialized function if T = int
}


[ Parent ]

You can do better than C++ in similar languages (none / 0) (#251)
by trixx on Wed May 28, 2003 at 12:21:31 PM EST

Check my post on Eiffel genericity. No, C++ is not Eiffel, but they are much closer to each other than C++ and Lisp are.

[ Parent ]
Polymorphism doesn't do it (none / 0) (#150)
by glauber on Tue May 27, 2003 at 05:16:45 PM EST

You can't implement a generic collections framework using polymorphism. Or rather, you can, but it's meaningless. In Java, all Lists are lists of "Object", the root class of Java's hierarchy. These "Object"s are almost useless until you cast them back to something more specific further down the tree.


glauber (PGP Key: 0x44CFAA9B)

How is it meaningless ? (none / 0) (#227)
by Simon Kinahan on Wed May 28, 2003 at 03:41:39 AM EST

Almost all uses of Java collections in real code have an implied element type. The code just casts the elements to the appropriate type when they are retrieved. Generic Java variants, including the one that has been slated for implementation in 1.5, simply automate the insertion of these casts.

Simon

If you disagree, post, don't moderate
[ Parent ]
Do you know about reflection? (none / 0) (#238)
by porkchop_d_clown on Wed May 28, 2003 at 09:37:03 AM EST

It's trivial in Java to take an arbitrary object and dynamically retrieve its actual class.


--
I only read Usenet for the articles.


[ Parent ]
Good article. (4.00 / 3) (#167)
by reflective recursion on Tue May 27, 2003 at 06:29:44 PM EST

I have only briefly used C++ and have never gotten as far as using templates. Interesting stuff.

Lisp/Scheme vs. C++ macros/templates...
The big difference between the two is that while C macros work by scanning for and replacing literal text phrases within source code, Lisp macros replace portions of a parse-tree instead. That might not sound revolutionary, but it turns out to be the difference between a system that gurus recommend you never use and one that goes a long way towards defining a language.
If I'm reading that right, you're saying C++ gurus recommend staying away from templates. The funny thing is that most Lisp gurus say the same about their macros. Macros as specified in the Common Lisp standard can be extremely dangerous to use because they can inadvertently capture variables that exist outside the macro definition. The only remedy is using gensym-style functions. While that may seem hackish, it is quite a bit nicer IMO than what Scheme does.

Scheme has what are called "hygienic" macros, which are designed to prevent that inadvertent variable capture of Common Lisp style macros (defmacro and cousins). While they are okay for "typical" macros, they do pose a serious problem with defining macros which create definitions themselves (such as creating object-oriented structures or C-style structs/records). Scheme macros are also very ill-specified and leave a couple of semantically ambiguous situations. And then you have the situation that they are a complete bitch to implement "correctly" (depending on how you interpret the R5RS spec), whereas CL style macros are a simple quasiquote transformation with a repeated evaluation. Which is why most Scheme implementations will usually implement CL-style macros, if they implement a macro system at all.

I don't have much to say about your conclusion. Except, perhaps, I hope I'm never in a situation where I have to use them (or C++ for that matter).

macros (none / 0) (#168)
by jacob on Tue May 27, 2003 at 06:37:48 PM EST

If I'm reading that right, you're saying C++ gurus recommend staying away from templates.

No, that C gurus recommend staying away from C preprocessor macros.

The funny thing is that most Lisp gurus say the same about their macros.

Not the ones I'm familiar with ... I've definitely heard it said you should never use a macro when you could use a function, and that's God's honest truth, but it's also a far cry from never using macros period. About the hygiene issue, not much to say except 'me too!' PLT Scheme (apparently following Chez, though I don't know much about Chez) uses a syntax-object-based system ('object' in this sense meaning 'blob of syntax' rather than the OOP meaning) that is hygienic by default, but allows you to deliberately introduce shadowing if you'd like. Worth checking out if you're interested in Lisp-style macros.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]

Whoops.. (1.00 / 1) (#178)
by reflective recursion on Tue May 27, 2003 at 07:04:10 PM EST

Yeah, something did seem odd after reading that. I'll have to check out PLT Scheme macros. I've mostly used guile (*bleh*), but I do have mzscheme (PLT) though I have never used their macros. I was in the process of writing a 100% compliant Scheme compiler, but then I discovered the macro problem and ditched that effort.

[ Parent ]
Look up 'datum->syntax-object' (none / 0) (#179)
by jacob on Tue May 27, 2003 at 07:18:14 PM EST

or 'syntax-quasiquote' in the PLT docs; those terms will point you in the right direction on how to break hygiene in PLT.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
More info? (none / 0) (#191)
by p3d0 on Tue May 27, 2003 at 08:39:09 PM EST

I suddenly wish I hadn't pissed you off in that other thread, because you sound like the first person I have ever encountered that knows more about Lisp macros than just to say they're better than C's macros.

Do you have any suggestions for where I could learn more about the pitfalls of Lisp macros, and the difference between Scheme's and CL's macros?
--
Patrick Doyle
My comments do not reflect the opinions of my employer.
[ Parent ]

Try (5.00 / 1) (#197)
by jacob on Tue May 27, 2003 at 09:19:30 PM EST

here. You can download PLT Scheme and play around with that macro system, which is documented in their help system and described in Dybvig's "Writing Hygienic Macros in Scheme with Syntax-Case".
I'd recommend Shriram Krishnamurthi, Will Clinger, and Matthias Felleisen's papers on the subject, as well as Matthew Flatt's for a more modern improvement to the field (disclaimer: I'm pimping the research group to which I belong here; all four of those guys have at least some affiliation with the PLT).

Also look at the slides from Shriram's Lightweight Languages 1 (LL1) talk.

Another article to read for a different perspective is the <bigwig> project's paper (ps, pdf) "Growing Languages with Metamorphic Syntax Macros," which includes an overview of many different macro/metaprogramming techniques as implemented in various languages.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]

yeah.. (3.00 / 2) (#202)
by reflective recursion on Tue May 27, 2003 at 10:51:26 PM EST

The Syntactic Extension chapter from The Scheme Programming Language, 2nd Edition. There is also On Lisp chapters 7, 8, 9. This book is written by Paul Graham, a Lisp advocate and general macro-nut.

The pitfalls with CL style macros are basically variable capture and possibly redefining macros, depending on implementation, etc. With a strictly standard-compliant Scheme the pitfalls are basically 1.) you can't easily create macros which define any top-level symbols, which is useful for creating data structures, and 2.) the semantic ambiguity associated with Scheme's ellipsis counting and pattern matching algorithm. There is also the problem of redefining macros in Scheme too. Then you have the macro systems which are in the middle, like PLT Scheme's. The pitfall there is that, of course, it is non-standard.

[ Parent ]
Which gurus? (none / 0) (#207)
by joto on Tue May 27, 2003 at 11:43:43 PM EST

I've never seen a C++ guru recommend that you never use templates. Anyone who did would most likely not be considered a real guru, any more than one who recommended never using, e.g., pointers (or any other arbitrary feature of the language that the "guru" did not properly understand).

Now, most C++ gurus try to tell you that you should limit your use of the preprocessor as much as possible. Which is mostly good, because the preprocessor is evil (no namespaces: if you happen to have a macro defined that matches a variable name, it can take you quite some time to figure out what went wrong). And also because of unintended side effects (e.g. MY_MACRO(i++); - how many times will i++ be evaluated, and will the result even be valid C++?). On the other hand, not using the preprocessor at all would mean you couldn't use any header files, which is no good either.
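The MY_MACRO(i++) hazard can be pinned down without invoking undefined behavior by handing the macro a side-effecting function call instead. The names MAX and next_value are invented for this sketch:

```cpp
#include <cassert>

int calls = 0;
int next_value() { ++calls; return calls; }  // side effect: counts invocations

// A textbook function-like macro: the winning argument is evaluated twice.
#define MAX(a, b) ((a) > (b) ? (a) : (b))

// The function version evaluates each argument exactly once.
int max_fn(int a, int b) { return a > b ? a : b; }
```

MAX(next_value(), 0) expands to ((next_value()) > (0) ? (next_value()) : (0)), so the function runs twice and the macro yields the second call's result, while max_fn evaluates its argument exactly once.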

About lisp macros, I've never seen a lisp guru recommend to not use macros either. Of course, you should be careful about them, but most lisp gurus I've seen flame schemes hygienic macros for not allowing intended variable capture. Go figure...

[ Parent ]

read previous comments.. (1.00 / 1) (#215)
by reflective recursion on Wed May 28, 2003 at 01:08:56 AM EST

I've never seen a C++ guru recommend you to never use templates.
That was a bit of a mistake.. should have been C macros. Wrote that sentence in a bit of a rush.
About lisp macros, I've never seen a lisp guru recommend to not use macros either. Of course, you should be careful about them, but most lisp gurus I've seen flame schemes hygienic macros for not allowing intended variable capture. Go figure...
It goes both ways, actually. Some knowledgeable people want the power of CL macros, while some prefer having the less knowledgeable (or even themselves) not screw things up by having too much power. This is much like the arguments surrounding the Lisp1/Lisp2 namespace issue.

I didn't mean that "gurus" recommend staying away from macros completely. Much like I don't believe C gurus recommend staying away from preprocessor macros completely. Hell, I consider the authors of GTK+ and GIMP C gurus. Just look at what they did with preprocessor macros. There is a definite "use only when necessary" attitude with Lisp macros, though (Scheme or CL). This is unfortunate because they are so powerful and blend into Lisp perfectly. This is mostly because of the variable capture issue.
Now, most C++ gurus try to tell you that you should try to limit your use of the preprocessor as much as possible. Which is mostly good, because the preprocessor is evil (no namespace, if you happen to have a symbol defined that matches a variable name, it can take you quite some time to figure it out).
Heh. Funny though, Scheme macros are identical to C preprocessor macros in that they have no internal state. The truly odd thing with regard to CL and Scheme is that CL pairs the "safe" namespace with the "unsafe" macros, whereas Scheme pairs the "unsafe" namespace with the "safe" macros. Yet the flames continue on both sides about very similar issues. It is quite an odd sight when a CL advocate bashes Scheme for its namespace and praises CL's macros, or vice versa.

[ Parent ]
using macros (none / 0) (#264)
by han on Wed May 28, 2003 at 05:14:15 PM EST

I didn't mean that "gurus" recommend staying away from macros completely. Much like I don't believe C gurus recommend staying away from preprocessor macros completely. Hell, I consider the authors of GTK+ and GIMP C gurus. Just look at what they did with preprocessor macros. There is a definite "use only when necessary" attitude with Lisp macros, though (Scheme or CL). This is unfortunate because they are so powerful and blend into Lisp perfectly. This is mostly because of the variable capture issue.
I don't know where you find those gurus... I concur with joto, in that the lisp gurus I've read don't have anything against using macros, as long as you don't use them where mere functions would do. Some advise care with macros on the grounds that macro design is language design, since macros circumvent the usual evaluation rules.

As for the hygiene issue, that's something only Scheme people obsess about -- it's not a problem in CL in practice. You just need to be aware of the issue, make your share of mistakes on the first few macros you write, and then get on with it. As Kent Pitman says, avoiding unintended variable capture in macros is no harder than remembering the correct use of = vs == in C or C++. For maximum convenience, use the common helper macros rebinding and with-unique-names.

[ Parent ]

gurus.. (1.00 / 1) (#267)
by reflective recursion on Wed May 28, 2003 at 06:12:49 PM EST

What exactly do you mean by "guru?" I have yet to see these C gurus which have anything against C macros (to the extent that you hold me to claiming about Lisp gurus).
As for the hygiene issue, that's something only Scheme people obsess about
Not entirely true. You will find Scheme users talk more of them, but that is only because they have them. If you look on Google, it seems that half the talk is in comp.lang.lisp which is predominantly CL.
it's not a problem in CL in practice
Just as it's not a problem in C in practice. What I'm saying is that both have pitfalls, whether the programmer has learned to overcome those pitfalls or not. The "guru" will advise you to use macros only when you need them (C or Lisp or Scheme). I consider every author who has their name on the R5RS spec a "Lisp guru." I'm sure at least one person on there has an issue with CL style macros (and at least one has an issue with their own hygienic macros, a hole I'm sure KMP can fill).

[ Parent ]
CL macro usage (none / 0) (#303)
by voodoo1man on Mon Jun 02, 2003 at 02:51:48 PM EST

"There is a definite "use only when necessary" attitude with Lisp macros, though (Scheme or CL). This is unfortunate because they are so powerful and blend into Lisp perfectly. This is mostly because of the variable capture issue."

When it comes to CL, I don't think variable capture is the big problem with macro usage. It's a widely known and documented problem, and avoiding it is not that tough. With a few utilities like with-unique-names in one's utility kit, it's even easier.

The far more annoying thing about macros is run-time redefinition - functions that use macros aren't changed when the macros are redefined at run-time. Since most Lisp compilers don't keep track of a usable who-calls database (never mind one for macros), it's oftentimes easiest just to recompile a whole system after a macro redefinition. Of course, this is also a problem with inlined functions. This also makes it more difficult to find the source of problems when something unexpected happens in the middle of a macro-using function. Besides hampering debugging if you have the source code, this totally screws you over if you don't.

Generally, you shouldn't use macros or inlined functions when you don't need them, because avoiding them affords maximum flexibility at run-time (it also doesn't bloat the code size, but that's a different story).

[ Parent ]

What about implementation-hiding? (none / 0) (#172)
by coderlemming on Tue May 27, 2003 at 06:50:49 PM EST

It's interesting that, in this article and the entire set of replies so far, I haven't seen this important problem with templates:

What if I want to write a library with classes that are not type-specific, and I want to sell this library in a closed-source form? It's not possible. In C++ sans templates, I can just pass out the .h files to provide the public interface, and distribute the compiled libraries. But C++ requires that the entire template class is available at compile-time... and what's more, it's got to be included verbatim, meaning the entire class definition has to be stuffed into a .h file (or included .c file, shudder). That's a pretty big problem, in my mind, because it detracts from information hiding and modularity.


--
Go be impersonally used as an organic semen collector!  (porkchop_d_clown)
Good point. (none / 0) (#175)
by jacob on Tue May 27, 2003 at 06:59:18 PM EST

I should've mentioned that (though Lisp macros have some pretty interesting problems on this front, as you might imagine). However, there is an attempt, pointed out to me by codemonkey_uk and others, to make templates work with separate compilation via a feature called export.

Apparently it's very difficult to implement and so nobody does, but it exists in the standard.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]

C++ defines a solution for exporting templates (5.00 / 1) (#180)
by beavan on Tue May 27, 2003 at 07:30:43 PM EST

While writing closed source libraries is a bad practice in my opinion, the C++ standard has a solution for that; however, I don't know of any compiler that supports it. The keyword export was made just for that.
For example:

// t.h
export template<typename T> class myTemplate { /* ... */ };

should declare all of myTemplate<T> as exported, meaning you can write the implementation in a C++ file and build a library from it, thus not exposing your C++ code (again, I don't like it!). I was told HP's C++ compiler supports this keyword. gcc 2.95.3-3.1 (and possibly higher) complains:
warning: keyword `export' not implemented, and will be ignored
Microsoft's VC doesn't even know this keyword. The problem in this case is not with C++, it's with the compilers.

I love burekas in the morning
[ Parent ]
it's not about closing the source (none / 0) (#183)
by jacob on Tue May 27, 2003 at 07:39:41 PM EST

it's about allowing separate compilation. How many CPU cycles have been spent compiling the same STL code over and over and over again?

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
With precompiled headers? (none / 0) (#185)
by Skywise on Tue May 27, 2003 at 07:51:46 PM EST

Probably not much...

[ Parent ]
precompiled headers don't work (none / 0) (#192)
by jacob on Tue May 27, 2003 at 09:06:16 PM EST

when you can't compile code until you see its use.

[test1.cc]
#include <list>
#include <iostream>
using namespace std;

int main() {
  list<char> a;
  list<char> b;
  list<char> c;
  list<char> d;

  a.push_front('a');
  b.push_front('b');
  c.push_front('c');
  d.push_front('d');

  return 0;
}

jacob@ninjapants:~$ time g++ -o t test.cc

real    0m2.931s
user    0m2.850s
sys     0m0.080s
jacob@ninjapants:~$ time g++ -o t test.cc

real    0m2.928s
user    0m2.840s
sys     0m0.090s
jacob@ninjapants:~$ time g++ -o t test.cc

real    0m2.929s
user    0m2.830s
sys     0m0.100s
jacob@ninjapants:~$ time g++ -o t test.cc

real    0m2.926s
user    0m2.870s
sys     0m0.050s

[test2.cc]
#include <list>
#include <iostream>
using namespace std;

int main() {
  list<char>   a;
  list<int>    b;
  list<int *>  c;
  list<char *> d;

  a.push_front('a');
  b.push_front(4);
  c.push_front(NULL);
  d.push_front("hi there");

  return 0;
}

jacob@ninjapants:~$ time g++ -o t test.cc

real    0m3.438s
user    0m3.340s
sys     0m0.090s
jacob@ninjapants:~$ time g++ -o t test.cc

real    0m3.445s
user    0m3.290s
sys     0m0.140s
jacob@ninjapants:~$ time g++ -o t test.cc

real    0m3.432s
user    0m3.350s
sys     0m0.080s
jacob@ninjapants:~$ time g++ -o t test.cc

real    0m3.492s
user    0m3.370s
sys     0m0.130s

Curious, eh?

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]

(heh. where of course (none / 0) (#193)
by jacob on Tue May 27, 2003 at 09:07:36 PM EST

the compilations are of files containing the code listed above them, not some third file test.cc.)

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
Which would explain... (none / 0) (#196)
by Skywise on Tue May 27, 2003 at 09:18:31 PM EST

an interesting phenomenon I've noticed among Microsoft's programmers and their WTL library... Everything and I mean EVERYTHING is in an .H file and uses #pragma once liberally throughout.

WTL = Windows Template Library (the orphaned step-child successor to MFC)

[ Parent ]

Precompiled headers (none / 0) (#248)
by p3d0 on Wed May 28, 2003 at 11:54:30 AM EST

I don't understand--is this example actually using precompiled headers?
--
Patrick Doyle
My comments do not reflect the opinions of my employer.
[ Parent ]
A Problem with the Language or the Compilers? (3.50 / 2) (#194)
by OldCoder on Tue May 27, 2003 at 09:13:41 PM EST

A language that provides a "Solution" to a programmer's problem that is as hard to implement as the C++ export directive has a problem. The compiler writers would have implemented it if they were able to.

The problem of creating and selling proprietary template classes (a problem created by the template feature of the language) without giving away the source code was "Solved" by providing an export keyword. The export keyword is a problem in itself. Possibly time to start from scratch?

In the extreme, the language designers could have solved all the problems with a "dwim" keyword that instructed the compiler to simply "Do What I Mean" instead of depending on complicated text representations of solutions. This would have been implemented the same way export was...

--
By reading this signature, you have agreed.
Copyright © 2003 OldCoder
[ Parent ]

ehh, well... (5.00 / 1) (#206)
by joto on Tue May 27, 2003 at 11:32:05 PM EST

As a matter of fact the export keyword has been successfully implemented in some compilers. DWIM has never been successfully implemented, and probably never will be.

But yes, it's ok to blame the C++ standardization committee for making the language too complex. On the other hand, they were pretty much doomed from the start, if you wanted them to produce a nice simple clean language.

[ Parent ]

Good point, but 'dwim' is a bit far fetched. (none / 0) (#224)
by beavan on Wed May 28, 2003 at 02:38:25 AM EST

Indeed, the gcc folks claim implementing the export keyword is too hard for them to deal with right now.
The gcc message boards are filled with suggestions and comments about this keyword dating back to '98.
I for one am willing to pay the compile time penalty for the benefits of templates.
As for closed source projects - there are tricks you can play, but there's no true way to hide your code if your compiler doesn't support exporting templates.
As for the dwim keyword - I think the C++ committee does a good job.
It's too slow in my opinion, but I don't think you can find many examples of features too complex for compiler vendors to support.
Let's not forget that gcc is an open source project, and its developers don't really care about anyone hiding his code (although the export keyword has many advantages besides hiding your code).
Microsoft's VC is not a good example, because that compiler doesn't support templates very well anyway.


I love burekas in the morning
[ Parent ]
Information hiding is a red herring here (none / 0) (#187)
by p3d0 on Tue May 27, 2003 at 08:29:52 PM EST

Distributing things in source form isn't information hiding. IH is about hiding information in one module from another module; it's not about hiding information from developers.
--
Patrick Doyle
My comments do not reflect the opinions of my employer.
[ Parent ]
Oops--"in binary form". Sorry. [n/t] (none / 0) (#188)
by p3d0 on Tue May 27, 2003 at 08:30:57 PM EST


--
Patrick Doyle
My comments do not reflect the opinions of my employer.
[ Parent ]
he said "implementation-hiding" n/t (2.50 / 2) (#195)
by Bill Melater on Tue May 27, 2003 at 09:15:58 PM EST



[ Parent ]
No he didn't (4.00 / 2) (#237)
by p3d0 on Wed May 28, 2003 at 08:34:35 AM EST

Last sentence: "That's a pretty big problem, in my mind, because it detracts from information hiding and modularity."
--
Patrick Doyle
My comments do not reflect the opinions of my employer.
[ Parent ]
He said both (4.00 / 1) (#245)
by Bill Melater on Wed May 28, 2003 at 11:29:52 AM EST

The rest of his post is talking about implementation-hiding. I think it's safe to assume that he's not talking about "information-hiding" in the OOP sense.

[ Parent ]
I didn't think it was safe to assume that (5.00 / 1) (#247)
by p3d0 on Wed May 28, 2003 at 11:49:57 AM EST

Perhaps, but he said "information hiding and modularity". If he didn't mean IH in the OOP sense, then this is a particularly poor choice of words. I still think it was worthwhile for me to make the distinction between hiding information from modules versus from programmers.
--
Patrick Doyle
My comments do not reflect the opinions of my employer.
[ Parent ]
Quite true (4.00 / 1) (#249)
by Bill Melater on Wed May 28, 2003 at 12:01:18 PM EST

Rereading it, I think he's got a great point right up to the last sentence, at which point he starts crossing the streams

... and you definitely don't want to be crossing the streams in C++ [rimshot]

[ Parent ]

templates and operator overloading (5.00 / 3) (#184)
by ZorbaTHut on Tue May 27, 2003 at 07:46:16 PM EST

for instance, if the SML/NJ compiler sees the function definition

fun f(a,b) = a + 2*b

it is smart enough to realize that a and b must be numbers and the result type must also be a number

Which would be rather nice if a and b were, necessarily, numbers, but it's worth pointing out that with C++, no such restrictions exist.

C++ also supports operator overloading - put simply, I can write my own custom classes that support basic arithmetic completely transparently. "a + 2 * b" could be integers or floats, or it could be Bob's Custom Big Integer Class, or it could be an arbitrary-precision floating-point class. Or, for that matter, it could even be a calculation involving vectors or matrices.

The assumption "oh, it involves math, it must be numbers" is one that C++ just plain can't make, because it's not true - and yes, this is a feature that I've used and abused beyond all reason, so I consider it a Very Good Thing (tm).

As for the error reporting, I'll admit it's been a weak point for quite a while - however, compilers are getting much better about this now, and the compiler I use at work gives much more useful error messages . . .

poly.cpp(642): error: expression must have class type
        in.test();
        ^
          detected during:
            instantiation of "void interiorfunk(Data) [with Data=int]" at line 646
            instantiation of "void funk(Data) [with Data=int]"

This is a template function "funk", which calls another template function "interiorfunk", which tries to call the member function "test" on the data it's been given. I've passed funk an int (which obviously doesn't have a member function named "test") and it's giving me a somewhat useful error message. I say "somewhat" because it's still not telling me where the line that started the whole chain was - it knows that funk is calling interiorfunk at line 646, but it doesn't know where the first call of funk is (it's line 651, if you're curious).

So it's not perfect - but it is getting much better.

(Incidentally, the compiler is the Intel compiler - version 7.0)

Type classes (5.00 / 1) (#199)
by Pseudonym on Tue May 27, 2003 at 10:13:58 PM EST

Haskell also supports operator overloading in a non-ad-hoc way.

The type of the + operator, in Haskell, is:

(+) :: (Num a) => a -> a -> a

That is, if "a" is a type which is a member of the Num class, it takes two "a"s and returns an "a". Note that "a" is a polymorphic type variable, not a concrete type, so anything can be defined to be a number, such as machine words, arbitrary precision integers, floats, complex numbers or sets (which can in turn contain parametric types).

The Num class, for the purposes of this example, you can think of as like the ML signature given in the article.

This has the potential to be even more flexible than C++'s overloading, because in C++ you can't do this:

int randomNumber();
float randomNumber();

No such problem in Haskell:

class Random a where
    randomNumber :: IO a

instance Random Int where {- etc -}

instance Random Float where {- etc -}

The only problem in Haskell is that you have to be careful to design your type classes to begin with to support later overloading. In the above example, the Num typeclass was designed before the need for different kinds of overloading was obvious (and before functional dependencies, which make overloading a little more sane in some circumstances), so unfortunately you can't do the equivalent of this in Haskell:

vector operator*(matrix& lhs, vector& rhs);

I should stress that this is a problem of the Haskell standard library, not of the base language, and will probably be fixed in the next iteration of the standard.



sub f{($f)=@_;print"$f(q{$f});";}f(q{sub f{($f)=@_;print"$f(q{$f});";}f});
[ Parent ]
The real problems with C++ templates (4.00 / 2) (#198)
by vladpetric on Tue May 27, 2003 at 10:11:37 PM EST

1. Time of compilation.

If your template resides in a header file, everything's fine - the compiler can do "text replacement" and generate fairly efficient code. This is not the case if you want to compile incrementally (it's so much tougher to do optimizations at link time)

2. Too much syntax.

Templates should have been limited to types, and only to classes.

Recursive templates, like the one described in the article, are obfuscation to the highest degree. I don't see why you can't just use the preprocessor.

Function templates are nice for stuff like min(a, b), but if you combine function templates with class templates stuff can become really messy.

I've seen, in a very respectable book, a class template which had a template-ized constructor (a different template parameter than the main one). It achieved a cool thing, but it paid a heavy price in readability/maintainability.

In that respect, the new Java coming from Sun does much better - you can only do so much (you can't shoot yourself in the foot)

3. C++ references.

C++ references are actually what I consider the worst feature of the language. They're dumb (they can't be NULL, and they can't be rebound after initialization), so you can't, for instance, build higher-level structures with them, like AVL/red-black trees. Furthermore, they break the nice invariant of C that pointers are passed as pointers and everything else is copied. In other words, when you program with references, you've got to be very careful or you risk getting different argument-passing semantics than you expected, or even worse, a reference into dead stack space.
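What "can't be rebound" means in practice: assignment through a reference always writes to the original referent, whereas a pointer can be reseated. A small sketch (the function names are invented):

```cpp
#include <cassert>

// A reference is bound once at initialization; later assignments go
// through it to the original object rather than rebinding it.
int rebinding_demo() {
    int a = 1, b = 2;
    int& r = a;   // r is bound to a, permanently
    r = b;        // copies b's value into a; r still names a
    r = 42;       // a is now 42; b is untouched
    return a * 100 + b;
}

// A pointer, by contrast, can be reseated (or made null).
int reseating_demo() {
    int a = 1, b = 2;
    int* p = &a;
    p = &b;       // reseat p to b
    *p = 42;      // b is now 42; a is untouched
    return a * 100 + b;
}
```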

Most template libraries (including STL) work with references, instead of pointers. What behaviour should a programmer expect from the following class:

set<myclass>

In what way should this be different from:

set<myclass&> or set<myclass*>?

What if a method of set returns a reference ?

(the answers to these questions are pretty clear for the stl set, the obfuscation problem still remains, though)
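Those answers can be demonstrated directly with std::set; the myclass below is a minimal stand-in (std::set requires operator< for ordering). Note set&lt;myclass&amp;&gt; isn't shown because standard containers cannot hold references at all: elements must be assignable, and references aren't.

```cpp
#include <cassert>
#include <set>

// Minimal stand-in for "myclass".
struct myclass {
    int v;
    explicit myclass(int v) : v(v) {}
    bool operator<(const myclass& o) const { return v < o.v; }
};

// set<myclass> copies elements in; set<myclass*> shares the originals
// (and orders by pointer value, not by operator<).
bool demo() {
    myclass m(1);
    std::set<myclass> by_value;
    by_value.insert(m);            // stores a copy of m
    std::set<myclass*> by_ptr;
    by_ptr.insert(&m);             // stores the address of m
    m.v = 99;                      // mutate the original
    return by_value.begin()->v == 1       // the copy is unaffected
        && (*by_ptr.begin())->v == 99;    // the pointer sees the change
}
```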

4. Developers

Templates and operator overloading are the 2 most abused features of C++, because of their coolness factor (they make people feel "in control"...). Abusing OO features usually makes an OO program much worse in terms of maintainability than a non-OO one. It's one of the reasons OO projects fail.

Some of the #4 people would argue that all these points are bullshit, and if you're careful enough, you can avoid them. I'd remind them that coding is only 35% of the development cycle - the rest is debugging and maintaining.

Conclusion: C++ - the most obfuscated real language ever (only brainfuck does worse, but it's a synthetic one).

hmmm (none / 0) (#205)
by joto on Tue May 27, 2003 at 11:26:07 PM EST

1. Time of compilation.

Yes

2. Too much syntax.

Obviously, yes...

Templates should have been limited to types, and only to classes.

Not at all. It permits many useful things. Ever looked at e.g. the SI unit library?

I've seen, in a very respectable book, a class template which had a template-ized constructor

Yes, this is a pretty normal thing to do, get used to it.

3. C++ references. C++ references are actually what I consider the worst features of the language.

Why do you think they are so bad? They really shouldn't be that confusing. Think of them as a pointer type that doesn't need indirection. In addition, they have some limited lifetime guarantees, but if you stay conservative and treat it like a pointer to something, then you are safe. Ok, it's more syntax, but not in the horrible way template syntax is...

What behaviour should a programmer expect from the following class: set<myclass> In what way should this be different from: set<myclass&> or set<myclass*>?

This should be obvious if you understand C/C++ pointers and their behaviour. That you need to understand that to use C++ is bad, but it's not the references' fault.

Templates and operator overloading are the 2 most abused features of the C++, because of their coolness factor (makes people feel "in-control" ...). Abusing OO-features [..]

Templates and operator overloading are not OO features. But yes, they can be abused.

Conclusion: C++ - the most obfuscated real language ever

Probably true...

[ Parent ]

On references (4.50 / 2) (#232)
by Shano on Wed May 28, 2003 at 04:18:58 AM EST

I don't see a big issue with references, just the way they're used. Personally I feel all references should be declared const. That way you get the speed improvements from not copying objects all over the place, and don't have unexpected changes to your objects. That's what pointers are for.

At least, my C++ code is consistent with that, I just wish everyone else would do the same. Pain in the backside to debug that sort of thing.

(And don't even get me started on operator overloading. I know someone who overloaded + to transparently change one or both of the operands.)



[ Parent ]
these are only problems if you misuse them (none / 0) (#269)
by alkaline on Wed May 28, 2003 at 06:23:44 PM EST

2. Too much syntax.
Templates should have been limited to types, and only to classes.

Ok, well, what if you wanted to do list<int>? int is not a class. Limiting templates to using UDTs would require stupid wrapper classes a la Java:
class Integer
{
int i;
// ...
};
That's just silly and redundant. And there really cool uses of allowing basic types as template parameters. Say you want to define a multi-dimension array type, e.g.:
template<typename type, unsigned dimensions> class array
{
// ...
array<type, dimensions - 1> operator[] (unsigned);
};
Try to do *that* without allowing basic types as template parameters.
3. C++ references.
C++ references are actually what I consider the worst features of the language.

That's because you don't know why they were introduced. Everything you mentioned can be done with pointers, and more explicitly at that. References were introduced in C++ primarily for use as function arguments. Of course they can't be set to NULL, that wouldn't make sense. A reference is basically a pointer that is dereferenced each time you access it. It's syntactic sugar, that's all. I lets you write more readable code in many circumstances. To allow references to have all the functionality of pointers would be silly. And as far as the STL goes, the standard makes very clear what can and cannot be passed as template arguments for an STL container. It explicitly states every requirement for contained types.
I'm not saying C++ is the perfect language, but most of its features were well thought out, and there are good reasons behind most of its ideosyncracies. I do think its C backward-compatibilitiness is stupid, and the ISO still has lots (and lots) of kinks to iron out. But its not a lost cause (as many seem to think). And I don't know of any other language that provides as many features, and is still nearly as fast as regular handcoded C.

[ Parent ]
wrt to doing calculations at compile-time... (4.50 / 2) (#222)
by goonie on Wed May 28, 2003 at 01:44:16 AM EST

Isn't there a technique called "partial evaluation" that allows a compiler to identify bits of code that can be evaluated at compile time, evaluate them, and then insert the results into the generated code *without* the need to perform all this template/macro wackiness?

It's not the same (3.00 / 1) (#240)
by trixx on Wed May 28, 2003 at 10:01:32 AM EST

Partial evaluation usually acts on constants and operators; it can substitute things like 2+a+3 with 5+a. What partial evaluation cannot do is evaluate functions and substitute, like this: int f (int x) {return x+1;} int main (void) { int a ; a = f(5) ; a = 1+f(a) ; } Partial evaluation will _not_ substitute main with a=6; or a=a+2. So you can't define things like a factorial function (nor do everything turing-complete)

[ Parent ]
you're thinking of constant folding (5.00 / 1) (#241)
by jacob on Wed May 28, 2003 at 10:08:32 AM EST

Partial evaluation is the idea of taking a program and some subset of its input and producing another program that takes the rest of the input and then produces the value the original program would've produced. Here's a more thorough description.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
yes it would... (5.00 / 1) (#274)
by joto on Wed May 28, 2003 at 09:38:15 PM EST

It would be a very sad partial evaluator that didn't evaluate a program fully when it doesn't depend on external input. Since the program also lacks output, I suspect the partial evaluator would come up with an empty main function as a replacement for the whole program.

[ Parent ]
problem (3.25 / 4) (#225)
by Cruel Elevator on Wed May 28, 2003 at 02:55:26 AM EST

with one of your example. The correct example is given below:

int myCleverFunction() {
return 42;
}

It's 42, not 4, OK?

Sincerely,

Cruel Elevator.

My views (5.00 / 2) (#228)
by statusbar on Wed May 28, 2003 at 03:43:27 AM EST

Very good article, however it does not go far enough.  

I first started with c++ by porting gnu g++ v1.35 to my 4 meg Atari Mega ST with a 40 meg hd back in 1990.  My god what a mess, took 24 hours to build too!  I've always been a c++ nut, excited about every new feature and syntax that was added.

Until now.

A further problem is how the c++ standard has evolved.  One can argue that it is a better, more complete language now than it was in 1990.  But programming c++ was always like trying to hit a moving target.

A common misconception:  "This code is object oriented and therefore reusable!"

The reality:  "This code is organized for how c++ circa 1990 works.  It needs to be refactored again, otherwise it will not work with our new libraries - it isn't const correct or exception safe at all and our containers are now STL."

The complexity of c++ makes it even tougher for compilers to be correct.  I am amazed at how complete some of the compilers are now. But almost all of them are still incomplete.

Portability is a big problem.  I understand the need for ANSI/ISO to pass on specifying anything about multithreading behavior, but in reality many programs (right or wrong) need threads.  Some versions of STL were not thread safe, in obscure ways. Some versions of g++ did not have thread safe exceptions.  These are really tricky problems to solve when your code is correct but the compiler or library is the one spitting out thread-unsafe code.

My personal objective now is to use c and c++ to implement an embedded scripting language parser, like www.lua.org or scheme, or maybe even python.  Write the heavy duty algorithms in c++ and make them callable by the scripting language. Then write all the procedural code in the scripting language. Embedded all in one executable.  Anyone who thinks this would be too slow should analyze the design of Quake3 for a bit.

But what do I know, I'm still using C++ for embedded web-apps as well as for firmware on TI's 6701 DSP's.

--jeff++

Not only the templates are ugly (4.00 / 2) (#231)
by jope on Wed May 28, 2003 at 04:06:12 AM EST

... it is the whole language that is a terrible ugly hack and anachronism. The only excuse for using it is it's somewhat-backwardcompatibility to C, which is even more ugly and anachronistic, but - alas - *very* widespread. If you want beauty of concept take one of the many modern languages, some of them even useful for real developing, like Ocaml (www.ocaml.org). A whole universe of easyness and things you can do like you would expect will open up :)

Functors != Interfaces (none / 0) (#236)
by det on Wed May 28, 2003 at 07:47:55 AM EST

Functors and interfaces do different things. For example, pretend you wanted to write a 3d engine that had a list of objects that knew how to render themselves. With C++ you could have an abstract class called Renderable with a single pure virtual function named render and just have a STL list paramaterized by Renderable and any class that was derived from Renderable could be inserted. Similarily in haskell you could have a a type class named Renderable and have a List paramaterized by that. With ML however it doesnt seem possible. You could have a signature called RENDERABLE and a functor List(T: RENDERABLE) but then you have to initiate that to a paticular type like say:
structure L = List(struct
type t = sphere
val renderT = renderSphere
end);
Then you can only call L.render which is specific to sphere. You can never say RENDERABLE.render(foo). I personally think ocaml would be the coolest language in the world if it allowed some way to do interfaces while ditching it's crumby class system. Please correct me if there is a way to do this with functors!

Hm. (none / 0) (#239)
by i on Wed May 28, 2003 at 09:55:26 AM EST

You can't do that in Haskell 98 (no extensions) either. Type classes are not types. You can't have a list of Renderable.

More to the point: there's no single language-independent meaning of the word interface. Functors and type classes can't be used to define heterogenous collections, but otherwise they are very similar to Java interfaces or C++ ABCs. It's not wrong to call them interfaces too.

and we have a contradicton according to our assumptions and the factor theorem

[ Parent ]

Yes, but... (none / 0) (#304)
by jemfinch on Mon Jun 02, 2003 at 04:23:09 PM EST

Haskell 98 is 5 years old. You should at least mention that many Haskell compilers do support lists of equivalently typeclassed items. Jeremy

[ Parent ]
Whack template (3.33 / 3) (#246)
by darthaya on Wed May 28, 2003 at 11:41:11 AM EST

And you lose STL, the best part about C++.

Then you go back to the ugly C, where everyone does his/her own implementation of string/vector/list/etc., what a waste of brain cells.


Other way to solve it, in an OO language: Eiffel (5.00 / 5) (#250)
by trixx on Wed May 28, 2003 at 12:09:59 PM EST

The article mentions ML and Lisp variants as languages with proper solutions of the problem that templates try to solve. One could argue that it's comparing apples and oranges, so I would like to extend the article with a comparison to a language that tries to be in a similar niche to C++ (compiled OO-imperative language): Eiffel

Don't get me wrong, I'm not against the functional paradigm (I've coded some ML/Lisp and a lot of Haskell), but I believe they are useful just for a small problem set, while I see OO more general-purpose.

Nobody in their right mind will think "Well, do I use C++ or Lisp for this project?", because the problem will probably dictate clearly one of the two. On the other hand, I wanted to show you about Eiffel, a language that you could say "Do I use C++ or Eiffel for this project?", in most cases that C++ is used.

Eiffel is not much newer than C++ (first language revision dates from 1986), but gets a lot of things done right wrt C++. It is a pure OO compilable language, and compilers generate code with similar efficiency to C++ (with method dispatch usually faster). It's a simple language, with a set of small powerful features instead of the C++ philosophy of "one feature for everybody"

One of the features is type genericity, and it's used to solve most of the same problems that templates try to solve. If you don't use it, you get things like class INTEGER_LIST_NODE
-- This class is not complete, only for illustrative purposes
creation
make
feature
make (i: INTEGER; n: INTEGER_LIST_NODE) is
do
item := i
next := n
end
feature
item: INTEGER
next: INTEGER_LIST_NODE
end

The above is what you would do in C... A list node of a linked list with a fixed type. e know the problems of that, so let's use inheritance to make something like thee Java solution class UNTYPED_LIST_NODE
creation
make
feature
make (i: ANY; n: UNTYPED_LIST_NODE) is
do
item := i
next := n
end
feature
item: ANY
next: UNTYPED_LIST_NODE
end

Note than Eiffel's ANY is like Java's Object, the parent class of everything.

This is a little better in the sense that I don't have to rewrite code for each type. Since anything (even INTEGER) inherits from ANY, i could code: new_list,other_list: UNTYPED_LIST_NODE
....
create new_list.make (42, other_list)
create new_list.make ("Hello world", new_list)
create new_list.make (new_list, new_list)

Here we can see that UNTYPED_LIST_NODE is... well, untyped. What if we want lists of an specific type? We can't. The example above connects node with an INTEGER, a STRING, and an UNTYPED_NODE_LIST. That's the Java solution. Besides, when you want to take something out of a list, you have to cast it into an useful types (because there's not much you can do with an Object/ANY). You could do that... there's a form of cast in Eiffel (not exactly a cast, and always safe), but there are better ways to do things. Actually, Eiffel casts are rarely used (usually when interacting with the outside nontyped world, like files), and I've used them less than once per programming project.

So, what's the right solution? we can use the genericity mechanism. At first sight you can find it similar to templates, but let's compare a little. First, the generic class example: class LIST_NODE [T]
creation
make
feature
make (i: T; n: LIST_NODE [T]) is
do
item := i
next := n
end
feature
item: T
next: LIST_NODE [T]
end

I could now declare things like LIST_NODE[ANY] (equivalent to UNTYPED_LIST_NODE), LIST_NODE[INTEGER] (equivalent to INTEGER_LIST_NODE), or LIST_NODE[any type you have in Eiffel].

This looks very similar to a template with parameter T, with expansion of T with its actual generic parameter. But there are differences:

  • Type checking: The 'generic class' (That is what things like LIST_NODE are called) are processed once, and safely type checked independently of its instantiations
  • No code is replicated. The code is generated once, usually. There are exceptions when you use expanded types (expanded types are types that are usually used by copy instead of by reference, like INTEGER or BOOLEAN); in that case different code is generated to make the internal representation of LIST_NODE objects more tight (so, internally, LIST_NODE is usually a pair of pointers, but if you use LIST_NODE[INTEGER], you get a pair with an int and a pointer).
  • Generic classes is a method only for classes. Eiffel has no equivalent to template functions. Generic parameters can only be types. That is more limited, but helps to have good type checking, and I have never needed more.

OK. Now i know that if I have a l: LIST[INTEGER], l.item is INTEGER (the compiler knows that statically, so i could write l.item+l.next.item w/o worring about casts (like C++, unlike Java). What if i need special properties of the generic parameter (the [T]) inside the class? for example class VECTOR2 [T]
-- Addable vector with 2 T components
-- Does NOT work
creation
make
feature
make (xx, yy: T) is
do
x := xx
y := yy
end
feature
x,y: T
feature
infix "+" (other: VECTOR2[T]) is
do
create Result.make (x+other.x, y+other.y)
end
end

Note that we can define operators as methods, and are dinamically dispatched like any normal method (all Eiffel methods are 'virtual'); there is no static overloading in Eiffel (the thing known in C++ as overloading is static overloading).

The above class VECTOR2 won't compile? what's wrong with it? The compiler says: ****** Error: Current type is ANY. There is no feature infix "+"
in class ANY.
Line 16 column 45 in VECTOR2 (./vector2.e) :
create Result.make (x+other.x, y+other.y)
^

Note that the message is quite clear... given that we know nothing about 'T', the static type checker assumes it is an ANY, and ANY doesn't have an infix '+' operator. so what we wanted to do actually was: class VECTOR2 [T -> NUMERIC]
-- Addable vector with 2 components
creation
make
feature
make (xx, yy:T) is
do
x := xx
y := yy
end
feature
x,y: T
feature
infix "+" (other: VECTOR2[T]): VECTOR2[T] is
do
create Result.make (x+other.x, y+other.y)
end
end

Note that above we said T->NUMERIC instead of just T. With that, we mean that we cannot use any T, but only those that inherit NUMERIC (NUMERIC is a standard abstract class describing ring-like structures: operators '+' and '*', features 'zero' and 'one' that are neutral elements, and things like that), so we now can declare VECTOR2[INTEGER] and VECTOR2[REAL]. Note that we can not declare VECTOR2[STRING], even when STRING has a + operator (for concatenation), because STRING is not numeric. So type checking is not based in the names of the operators but in its semantics instead (a NUMERIC + is not the same that a STRING +) C++ cannot make that difference.

We could also improve the previous class like this: class VECTOR2 [T -> NUMERIC]
-- Addable vector with 2 components
inherit
NUMERIC
creation
make
feature
make (xx, yy:T) is
do
x := xx
y := yy
end
feature
x,y: T
feature
infix "+" (other: VECTOR2[T]): VECTOR2[T] is
do
create Result.make (x+other.x, y+other.y)
end
-- Other operators, zero, one defined here ...
end

and then we can now even do a: VECTOR2[VECTOR2[MATRIX[INTEGER]]]

(assuming that MATRIX is a class inheriting NUMERIC).

Note that we achieved the same results of ML or LISP, in an OO language that could be a replacement for C++ in real-world situations. Learning Eiffel is not very difficult if you already know C++ or Java (It's a simpler language, so you have only to get a couple of powerful concepts).

Also note that all this can be achieved without type inference. Type inference is a nice feature of several functional languages, but it's not needed to get generic types, as I've shown above.

Well, it is long but I hope to show that there's a lot more Object-Orientation than what you see in C++, and that it can be done much more elegantly.

Eiffel would be a nice language (none / 0) (#256)
by i on Wed May 28, 2003 at 03:31:59 PM EST

  • if it had a standard library worth mentioning;
  • if it featured actual, as opposed to proclaimed, type safety;
  • if its model of multiple inheritance weren't so hopelessly broken while being peddled as the best thing since sliced bread;
  • and if it weren't so mindbogglingly verbose.


and we have a contradicton according to our assumptions and the factor theorem

[ Parent ]
Re: Eiffel would be a nice language (none / 0) (#262)
by trixx on Wed May 28, 2003 at 04:52:52 PM EST

There is a standard library. It's small, but based on tested things. There are bigger defacto standard libraries like Gobo

If you have a hole in the Eiffel library, you can always resort to cross-calling C++ (or C), which is quite transparent. So it's not much worse than C++

Type safety has a couple of caveats, and I never said that Eiffel has a perfect type system (haskell, for example, is much nicer). But compared with C++, with its unsafe casts, it's much closer to type-safeness. I've ran into a type problem only once, and because of bad design; I've never seen really useful code that breaks the Eiffel type system (despite the theorethical examples).

What you don't like about Eiffel's multiple inheritance? I don't find any problems with it, nad have had much better experiences with it than others like C++ or Python.

And verbosity... well, I won't deny it, and it's a matter of taste.

[ Parent ]

My semi-famous C++ rant (3.25 / 4) (#255)
by Eric Green on Wed May 28, 2003 at 01:58:50 PM EST

I posted this to rec.arts.sf.written some time in the past when we were discussing what computer languages would look like in the future, and it has circulated around the world in various quotes files ever since:

"C++ is an atrocity, the bletcherous scab of the computing world, responsible for more buffer overflows, more security breaches, more blue screens of death, more mysterious failures than any other computer language in the history of the planet Earth. It is pathetic, pitiful, a bag of disparate bolts on the side of "C", a fancy preprocessor that attempts to make "C" look like an object-oriented language and ends up merely being pathetic. If there was any mercy in this world, we would all have adopted Objective "C" as our standard object-oriented "C" follow-on and left C++ to the garbage bin of history where it belongs. Instead, we have a language more bloated than PL/1 or Ada, whose runtime library has all the coherency of a madman cutting pieces of books out and pasting them together into the documentation for the inconsistent drivel that comprises the standard C++ library, we have binding and linkage conventions that are utterly ridiculous in a supposedly "object-oriented" language, and otherwise a pathetic, ridiculous, drooling moronic abortion of computer science that should have been given a decent burial long ago (and would have been, if Microsoft had not mysteriously decided to standardize upon C++ to write their operating systems).

As for what languages are better than C++, gosh, what languages are NOT better than C++? Basically, any language whose basic design eliminates the possibility of memory leaks, whose semantics are simple enough for mere mortals to not have to peruse the 12,000 pages of Stroustrup to understand, that has a coherent and consistent and well-documented runtime library and a well-thought-out syntax, that has "real" objects instead of a wrapper around "C" structs, that does not allow buffer overflows to crash or, worse, subvert your program. What language is that? Oh, pretty much anything, actually, other than C++. Python, Ruby, Java, Objective CAML (which, BTW, has a compiler that actually generates faster code than many "C" compilers!), and many, many other languages that actually have a design that makes sense, which nobody has accused C++ of doing. C++ is a kludge, a hack, a bag on the side of "C", and always will be, and nothing we say or do will ever make that different."

-Eric Lee Green (eric@badtux.org) in rec.arts.sf.written
--
You are feeling sleepy... you are feeling verrry sleepy...

Sillyness (5.00 / 2) (#257)
by Peaker on Wed May 28, 2003 at 03:48:17 PM EST

While I agree that C++ is not too great a language, mainly for its backwards compatability and horrible syntatic and ABI choices, I think this rant is based on ignorance and/or sillyness.

we would all have adopted Objective "C" as our standard object-oriented "C" follow-on and left C++ to the garbage bin of history where it belongs.

Come on, Objective C defeats its own purpose. Its a C addon, worse off than C++. C is useful for low-level close-to-hardware programming. A portable assembly that allows low-level description of algorithms. C++ adds mainly structuring features around this, but remains low-level. Objective C loses complete contact with C. Its dynamic method invocation is horribly slow. It seems Objective C is only useful at taking advantage of people's knowledge of C. I don't think that justifies a language, but instead combines the slowness of Smalltalk with the difficulty of C.

we have binding and linkage conventions that are utterly ridiculous in a supposedly "object-oriented" language

As far as I know, C++ doesn't pretend to be an "object-oriented language", but a general-purpose language with specific object-oriented features. What that has to do with its crappy ABI choices - it seems only you know.

and would have been, if Microsoft had not mysteriously decided to standardize upon C++ to write their operating systems

You give way too much credit to Microsoft's engineers influential power over the many programmers around the world.

what languages are NOT better than C++?

Bah. Languages don't have a real-number property of "goodness". They maybe have mostly-subjective "goodness" in many different fields. In many fields, C++ is really good. It allows writing well-structured programs that implement algorithms in a low-level efficient way - for cases you do not trust the optimizer.

Basically, any language whose basic design eliminates the possibility of memory leaks

This statement is just evidence of your ignorance of memory handling issues. Automatic memory management eliminates the possibility of accessing freed/reallocated memory, but not the possibility of memory leaks. In fact, those other languages you mention make leaking memory quite easy.

Note that since memory leakage is a logical bug that cannot be identified automatically and not a low-level bug that can, no language can eliminate it.

How can the language know if you indeed meant to keep that memory alive or not? Eliminating memory leakage will not happen any time soon.

whose semantics are simple enough for mere mortals to not have to peruse the 12,000 pages of Stroustrup to understand,

Bah, I understand C++ syntax and semantics very well, and I did even after just a few months of working with it. I have learned a quirk or two over the years, but nothing that was substantial to my understanding of C++.

that has a coherent and consistent and well-documented runtime library and a well-thought-out syntax,

I agree the standard library stinks and its documentation may well stink. I wouldn't know, because when I use C++, I do not use the standard library. I use Qt, for example. Its a very useful, well-documented and coherent C++ library that provides most if not all the standard library functionality.

that has "real" objects instead of a wrapper around "C" structs,

Bah, define a real object. That's just silly. You could claim that all objects in all languages just boil down to a "c struct".

that does not allow buffer overflows to crash or, worse, subvert your program.

In order to completely prevent buffer overflows to crash or subvert your program, runtime checks must be forced everywhere. Unless there is a way to prove the runtime tests aren't required at runtime (and the compiler can only sometimes do this), they will be generated and slow down execution. That's why C++ does not force those tests' generation. C++ does allow you to use buffers that check their access, preventing buffer overflows.

Also, it would seem ignorant of you to not know that Common Lisp (In its standard, or perhaps just in most implementations), has compiler directives that allow turning bounds checking off for performance, allowing buffer overflows to crash and subvert your program.

What language is that? Oh, pretty much anything, actually, other than C++

Many low-level languages support pointers, and that includes C, Pascal, ADA, and others. Pointer support without bounds-checking (the only way to allow fast code in some cases and inner loops) will result in the phenonema you suggest. Therefore it seems most of your claims can only be justified by ignorance, and not real C++ issues.

[ Parent ]

Dude, it's a rant, not an academic tome (3.00 / 1) (#263)
by Eric Green on Wed May 28, 2003 at 04:55:11 PM EST

It's only intended to be semi-serious, and yes, it exaggerates things for dramatic rant effects (for example, Stroustrup is only 900 pages, not 12,000 pages :-).

Regarding C, Pascal, ADA, and other languages of that ilk, none of them claim to be object-oriented languages. I was thinking more of languages such as Java, Python, etc. I will point out that I have implemented Python programs that were every bit as fast as your typical C++ program, albeit by coding criticial sections as new Python classes written in "C". C++ itself is an abomination -- a language that is neither a low level language nor a high level language, but some unholy union of both with the worst features of each.
--
You are feeling sleepy... you are feeling verrry sleepy...
[ Parent ]

C++ is low-level (5.00 / 1) (#283)
by Peaker on Thu May 29, 2003 at 02:51:47 PM EST

C++ is definitely a low-level language.

Classes are merely a code organization tool in C++. That's what the "Don't use C++ as a poor man's Smalltalk" is all about. Aside for templates and exceptions, any C++ program is an arguably-superior way to express a C program with the only difference being some syntax sugar. With templates and exceptions, there is a bit more difference than just syntax, but not much.

C++ is comparable to C and not to Python or Java. Most C code can be written well with OO techniques and organization. OO code written in C, and generally well-organized code in C can use some C++ syntax sugar to be more concise or shorter (Notably automatic construction/destruction, vtable management, template generalization, etc.)

[ Parent ]

re: Sillyness (none / 0) (#284)
by thoran on Thu May 29, 2003 at 03:33:10 PM EST

Objective C loses complete contact with C.
Objective-C is strictly conservative extension of the C language. That's the least we can ask for an "object extension" of the C language. And C++ is not. A lot of perfectly correct C programs fail to compile on a C++ compiler.
IMHO, if you want to create a new language that takes its roots in C, there are two sane positions:
  • You just want to have a syntax close to the C syntax (to no frighten old C coders). But for the other aspects of the language, you want something high level, clean and sound. Thus, you remove all the dangerous aspect of C. You get java and this is nice.
  • The priority is to be as close to C as possible (for performance or compatibility reasons). But you would like to add some idiomatic object oriented constructions. You get objective C. It is still dangerous and low level because you inherit all the aspects of C, but you earn a very powerfull object oriented layer. Thus, this is also good, but maybe not as good as java. It should be reserved to applications where performance is critical
But the choice of the conceptor of C++ is completely insane. He wanted something as close to C as possible which is not a bad idea at first. But he also wanted a strongly typed and high level language. It ends up in something that is not compatible with C anymore but still has (almost) all its the dangerous constructions and without providing a powerfull object layer.

Let me give an example among others. C doesn't have a bool type. Integers stands for boolean in control flow statements. C++ adds a bool type, which is nice for a high level language, but they also add an automatic conversion between integers and booleans, which is the stupidest ever conceived idea. Why, did they do this? Because they didn't want to break very common idiomatic constructions among C coders like

if (x) /* where x is an integer or a pointer */
...


But this has terrible consequences for a language that claims to be strongly typed. For an example of this fact, I urge you to look at c++ pitfalls. Read the example whose title is "The stream classes support a type conversion basic_ios ----> void* for testing if a stream is happy:".

Its dynamic method invocation is horribly slow. It seems Objective C is only useful at taking advantage of people's knowledge of C. I don't think that justifies a language, but instead combines the slowness of Smalltalk with the difficulty of C.

This is completely wrong. You are just talking about the cost of method invocation. That's true it is slower that a C++ invocation of a virtual method (something between 2 or 3 times slower). But will it make your programs 3 times slower? of course not. If you write a C++ program that just loops and call a virtual method that does nothing, then you will get a program 3 times faster than the corresponding objective C one. But obviously this program is not very interesting.

In places where a virtual call is costly (because it is deep inside a very inner loop), you will try to turn the method into a non-virtual one or cache the method adress in a function pointer. Same is true in objective C.

And in other situations, the cost of what the method does, is usually far more important that the cost of the method invocation. It turns out that if you try to directly convert a well written and optimized C++ programs into an objective C one, you will get something that is at most 5% slower. I think this is an acceptable tradeoff. Let me remind that NeXTStep was a very advanced and powerfull graphic environment and it was running on 486 at 33 Mhz.

On the other side, the method invocation system of objective C is far more powerfull than the C++ one. And it makes the translation of an Objective C program into a C++ one almost impossible unless you rewrite a dispatch mecanism as powerfull (QT writers did this). This is a real weakness of C++, I mean, a weakness that is very obvious, even if your are not an expert in language theory. For instance, you cannot simply have distributed objects (the possibility to access an object that is living in another process), or dynamically load classes that live in independant modules.

In C++, you have to explicitly write stubs for every class and every method you want to make accessible from the outside. The Mozilla people had to write XPCOM for this purpose. If you read their sources, you realize that you cannot call a method of an external object simply using the usual C++ syntax. You must go through the XPCOM dispatch machinery, and you end up with a terrible mess uglier than what you would have written in plain C. And in the end, this is even slower than an Objective C method dispatch...

This statement is just evidence of your ignorance of memory handling issues. Automatic memory management eliminates the possibility of accessing freed/reallocated memory, but not the possibility of memory leaks. In fact, those other languages you mention make leaking memory quite easy.

Obviously, you absolutely don't know what you're talking about.

[ Parent ]
Take it personally (none / 0) (#294)
by Peaker on Sat May 31, 2003 at 06:14:00 AM EST

"This statement is just evidence of your ignorance of memory handling issues. Automatic memory management eliminates the possibility of accessing freed/reallocated memory, but not the possibility of memory leaks. In fact, those other languages you mention make leaking memory quite easy."

Obviously, you absolutely don't know what you're talking about.

I can only assume you are Eric Green, or at least hold the same opinions and therefore took offense. Well, from the memory leak claim it was obvious there was ignorance about memory handling issues. Your claim that I don't know what I'm talking about is obviously based on a grudge and opinions :)

I basically disagree that C++ does not provide an "object layer". It definitely does provide one, even if a thin one. It's still very useful, and it's much nicer to write a C++ program with the STL containers and algorithms, for example, than to try to re-implement them in C, or to use a C implementation with void*'s thrown all over the place.

Also, Objective C is only a little slower because programmers won't use it for the performance-sensitive parts of the program. It's still a silly language, because if you use C for the important parts, you can use Python or something much more powerful and convenient for the less sensitive parts.

[ Parent ]

GC (none / 0) (#296)
by thoran on Sat May 31, 2003 at 01:36:05 PM EST

I am not Eric Green, but you said something wrong in response to something he said that was 99.9% true:
Garbage collectors prevent memory leaks.

A memory leak is, by definition, a block of memory that is unreachable to the program because it has no pointer to it.
The goal of a garbage collector is to free every unreachable block. As a corollary, you get the property that you cannot access freed/reallocated memory. But first and foremost, you cannot have any memory leaks.
There are very specific cases where memory leaks can still happen. For example, they may occur with the Boehm GC, because it makes a conservative assumption, treating anything that looks like a pointer as a pointer. It may wrongly take a plain integer for a pointer, adding a false reference to an unreachable block.
Apart from that kind of GC, it never happens.

[ Parent ]
memory leaks (none / 0) (#297)
by jacob on Sun Jun 01, 2003 at 01:43:44 AM EST

Your definition of garbage is actually subtly incorrect, and your mistake is causing you not to recognize an important category of memory leaks. The most general definition of garbage is allocated memory that no longer has any effect on the computation, not memory that is unreachable by pointers.

People use the "unreachable by pointers" metric because it can be implemented and basically works, while the "no longer affects computation" metric would be undecidable. But that doesn't mean that memory that's pointed to but never used again isn't garbage that could be reused. There are lots of situations where valid pointers exist to memory that has, in fact, no further effect on the computation, and these are legitimately considered memory leaks: pointers to large data structures held by an object but used only for initialization, for instance, can cost you a lot of otherwise-useful memory, as can the famous stack-like memory leak (i.e., the lack of tail-call optimization: keeping memory allocated for a stack even though every function on it made a tail call to another function and will never be returned to).

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]

I agree (none / 0) (#299)
by thoran on Sun Jun 01, 2003 at 04:07:42 PM EST

with this point perfectly. I also wanted to write something about this in my previous post, but I was too lazy to do it.

The term "memory leak" designates two things for me. One is the point I was talking about. The other is what you are talking about, i.e., the things we no longer care about for the rest of the computation. Let's call the latter general garbage. General garbage obviously contains the memory leaks, as you said. My English is not so good, so maybe there are two different terms for these concepts.
Anyway, there is no hope that any general tool (GC, static analysis, whatever) can manage the problem of general garbage. It is up to the person who writes the code.

I'm sure that Eric Green was only talking about the simple memory leaks. But they are the most critical ones, and it is very good to have an efficient, automatic way to catch them all.
Memory leaks can make programs grow bigger and bigger until they crash the process or the computer. That is why they are so evil and deserve a concept of their own (IMHO).

General garbage is not so evil, first because its existence is semantic: it exists because you have a handle on it, thus it is your fault. Memory leaks, on the contrary, are a really low-level, technical concept, and should not be seen by the programmer.
And second, general garbage does not grow forever, like memory leaks in inner loops do. I'm considering the case (which you may find too restrictive) of interactive programs with a main loop. All your unneeded-but-referenced blocks are pointed to by variables that are global or on the stack, above the main loop. You have a bounded number of such variables, hence a bounded quantity of unreclaimed memory. If the garbage collector is smart enough, like a generational one, those blocks will sooner or later fall into the old generation and may eventually be swapped to disk. The GC will almost never traverse them again. Thus, I consider those leaks not so harmful.
There is still the possibility of constantly adding things to a global structure like a hashtable and forgetting to remove them in time. I know this kind of problem occurs in real programs. But it is in any case easier to find those bugs than a memory leak in a C program. (Not to mention that this last kind of bug also appears in C programs.)

When Peaker says
Automatic memory management eliminates the possibility of accessing freed/reallocated memory, but not the possibility of memory leaks. In fact, those other languages you mention make leaking memory quite easy.

Apart from the fact that the first sentence is nonsense to me, I think this is a really unfair statement. A garbage collector solves 99% of memory-related problems. Those languages become really much safer to use than C (and its variants). Automatic memory management is a real advance and should not be criticized with such weak arguments.

[ Parent ]
How automatic memory management can cause leaks (none / 0) (#312)
by jello on Tue Jun 03, 2003 at 11:12:00 AM EST

When Peaker says
Automatic memory management eliminates the possibility of accessing freed/reallocated memory, but not the possibility of memory leaks. In fact, those other languages you mention make leaking memory quite easy.

Apart from the fact that the first sentence is nonsense to me, I think this is a really unfair statement. A garbage collector solves 99% of memory-related problems. Those languages become really much safer to use than C (and its variants). Automatic memory management is a real advance and should not be criticized with such weak arguments.

Peaker's first sentence makes sense in the context of what you term 'general garbage.' Automatic memory management CAN make disposing of general garbage more difficult. The problem lies with the assumption taken by most automatic memory managers that all pointers should act as an anchor on the object pointed to; that is, a pointer to an object should always prevent the target object from being destroyed. In many cases, though, the semantics of a pointer should indicate only that the pointer should be nulled when the target object is destroyed.

For example, let's say you have a Client object and a Server object. The Client object registers itself for some callback service provided by the Server object. In response, the Server stores a pointer to the Client so that it can call the Client back. In a non-garbage-collected language, the Client object can deregister itself from the Server in its destructor. In a garbage-collected language, however, the pointer to the Client stored in the Server prevents the Client from being destroyed even when it is garbage. For an application that creates thousands or millions of Client objects over its lifetime, this becomes a serious issue. The only way to prevent this type of memory leak in a GC language is to place explicit deregistration code everywhere the Client could possibly become garbage (from the application perspective).

The ideal solution is to have two types of pointers. One type acts as an anchor that prevents its target from being destroyed. The other type permits its target to be destroyed but is marked as invalid afterwards. This type of system can be created in a language like C++ that gives the programmer access to raw pointers, but it is impossible to do in a GC-only language.

[ Parent ]

You're talking about weak references, (5.00 / 1) (#313)
by jacob on Tue Jun 03, 2003 at 01:38:40 PM EST

which most languages with GC support. FYI.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
Weak pointers do not eliminate memory leakage (none / 0) (#317)
by Peaker on Thu Jun 05, 2003 at 02:55:48 PM EST

Let me start by making it clear that I always talked about what was coined as "general garbage".
Unreachable data being freed is a solved problem and not very interesting anymore.

Weak pointers are indeed a useful tool to allow the correct freeing of memory.
However, they only work if you use them, obviously: you must properly classify the pointers you use as weak wherever that is required.

I personally use Python heavily, and I've had my programs leak insanely in some cases after building large interconnected trees.
Some of those links existed for more technical or short-term purposes and would have required special care (such as using weak references). Finding that there was a memory leak was extremely difficult, and finding where the leak originated was even more difficult (since dynamic memory allocation and freeing are so common and spread all over the code).

In C++, on the other hand, only a small subset of memory allocations lack built-in freeing relationships (static allocation leads to automatic destruction, as do auto_ptr's and other such abstractions).

The potential for memory leaks in most dynamic languages lies in every allocation done anywhere, while the potential for memory leaks in C++ lies only where the freeing semantics are not simple, automatic and deterministic. This means that in C++ I only need to scan "new" allocations that are not in auto_ptr's and other pointer abstractions, while in dynamic languages I must scan all instantiations of all objects. The conclusion is that it is probably easier to hunt down memory leaks, at least using the method of going through all program allocations, in C++.

[ Parent ]

optimizing away bounds checking (none / 0) (#308)
by Delirium on Tue Jun 03, 2003 at 01:50:49 AM EST

In order to completely prevent buffer overflows from crashing or subverting your program, runtime checks must be enforced everywhere. Unless there is a way to prove that the runtime tests aren't required (and the compiler can only sometimes do this), they will be generated and will slow down execution.

This isn't necessarily the case, if the language is designed with that in mind. For example, Cyclone is a variant of C that includes bounds-checked arrays and NULL-checked pointers, but is carefully designed so that a program coded to take advantage of its features suffers a minimal speed hit, as most of the checks can be optimized away (to ease porting, a lot of plain-C syntax is allowed, and those programs will obviously suffer a more substantial hit). Bounds-checking in C or C++ suffers a much bigger hit because the language is not designed to give the compiler enough hints to optimize away very many of the checks, so they all have to be performed. But this is only true of C and C++, not of programming languages in general.

[ Parent ]

And its All True (none / 0) (#287)
by jefu on Thu May 29, 2003 at 08:40:39 PM EST

Actually, C++ is rather worse than that. Templates calling templates that do overloading on templates, with memory management (not garbage collection, mind you) behind the scenes... I once ported something that took about 10000 lines of code in C++ to Sather and ended up with something under 3000 lines that was very much more readable and ran about as fast.

But I do understand that those who have learned C++ are reluctant to try anything else: after all, if another language is as hard as C++ to get right, it will take a year or two to get proficient in it.

[ Parent ]

I wish Ocaml had C++-style templates (none / 0) (#266)
by cwitty on Wed May 28, 2003 at 05:36:31 PM EST

I hope my comment subject is sufficiently provocative. :-)

Often, when I'm programming in C++, I wish that it had ML-style generic polymorphism or Lisp-style macros.  Sometimes, when I'm programming in Ocaml, I wish it had C++-style templates.

For data structures, I agree that ML-style polymorphism is usually a better choice, with a couple of caveats.

First, it's easier to write code that is portably efficient in C++ than in ML. An ML implementation of a vector type (based on generic polymorphism) will be very inefficient when applied to characters in some implementations of ML: it will use at least 4 times as much memory as it "should", and maybe more. Some ML compilers may automatically generate a specialized implementation without this overhead (I'm not sure here; do real ML compilers do this specialization?), but many won't. The C++ template version doesn't have this problem.

Second, the issue of interfaces is not as one-sided against C++ as you imply.  The exact problem you describe (where a List type has an output routine, but calling the output routine only works if the base type also has an output routine) can also be seen as a feature.  It means that you can provide an output routine as an optional feature of your container class, but still use the class with objects that can't be printed.  This is a useful, powerful feature of C++ that I don't see how to easily emulate with ML (the approaches I can think of lose modularity or run-time efficiency).  (I would prefer a way of having this feature that still let you completely typecheck the library code.  Maybe something like ML functors with optional arguments.)

I'm not sure what "This system [ML functors] gives you more abstraction power than C++ templates ..." means.  It seems to me that any ML functor can be translated (almost automatically) into a C++ template.

As far as comparing C++ templates to Lisp/Scheme macros, I would say that they're not really comparable; there are many things which you can do with one but not the other.  The main thing you can do with templates that you can't do with any other macro-style system I know of is compile-time dispatch based on the compile-time types of the chunks of program being manipulated.  Lisp can't really do this, since Lisp doesn't really have a compile-time type system.  (Maybe you could add a type inference system into your macro expansion, but the result would be clumsy and fragile.)  It's difficult in ML (or any other language that uses type inference) as well, since you would have to mix macro expansion and type inference somehow.

Here's a template example:

  #include <iostream>   // for std::cout
  using std::cout;

  template<typename T>
  void print_elements(T first, T last) {
    while (first != last) {
      cout << *first << ' ';
      first++;
    }
  }

This will print the elements of a C++ array, an STL vector, an STL linked list, or any of a wide variety of other containers, and it will do so whether the elements are integers, floating-point numbers, or some user-defined type. And the container element access will not involve run-time virtual function calls or type dispatch. I don't know of any other language where such code could be written so concisely and be so efficient (which is not to say that such languages don't exist).

In conclusion: I like OCaml much better than C++.  Basically, the only C++ feature I miss when writing OCaml code is templates.  Most uses of templates could be replaced by ML generic polymorphism or ML functors, but the result would likely be significantly less efficient with many (perhaps all) ML compilers, and would in some cases be far more verbose.  I would love to have a language in the ML family that supported C++-style templates (but this would be difficult, since the more interesting features of templates are difficult to combine with type inference).

Two words: Dynamic Typing (3.33 / 3) (#277)
by PolyEsther on Thu May 29, 2003 at 04:35:42 AM EST

It seems to me that dynamic typing is a much nicer alternative to C++'s templates.  

It would have been nice if Objective C had won out as the popular object-oriented flavor of C. C++ is a monstrosity, and templates only add to the complexity of the beast (at best, they are a poor way of emulating the power of dynamic typing).

Dynamically typed languages like ObjC, Ruby and Smalltalk allow for much greater flexibility in program design. In those languages, if an object doesn't respond to a particular message then it's not the correct type: type is determined by what messages an object will respond to. But even then, all is not lost: by defining a 'method_missing' method in your class (in Ruby, but both Smalltalk and ObjC have similar facilities) you can define what happens when your object receives a previously unknown message. I've heard it put this way: when you walk down the street and someone says 'Blarg?' to you, you don't just drop dead in your tracks because you don't know the meaning of 'Blarg?'. But that's what happens in C++: the program drops dead.

In dynamically typed languages, you can define several classes with the same interface (responding to the same messages, or a subset of them), use them in, for example, a collection, iterate through the collection sending the same message(s) to each object, and never worry whether those objects are somehow related in a class hierarchy. It's a very freeing concept, though perhaps a little scary at first if you're coming from a statically typed mindset; once you get used to it, you hate having to go back to the confines of C++.

ohh, that's biased... (5.00 / 2) (#301)
by joto on Mon Jun 02, 2003 at 03:02:34 AM EST

Sure, dynamic typing is sometimes nice. But we have at least two good reasons for sometimes preferring static typing:
  1. Catch errors
  2. Speed
The first reason is pretty clear. Static typing really catches errors, especially on larger projects with several programmers. Some languages, such as C, have a pragmatic approach to static typing and can only catch the simpler errors. Others, such as ML or Haskell, make it possible to design your types in such a way that it is pretty hard to write code that compiles but is wrong. If you don't think this is important, you haven't tried it.

The second reason is also important, of course. While Lisp pundits can brag about how their favourite Lisp implementation creates code that runs as fast as Fortran, it isn't really true. They only get this after tweaking their code by adding type declarations everywhere, thus losing whatever type safety, maintainability, or readability the original Fortran code had. And there is a reason they chose Fortran to compare against in the first place... You will not see any Fortran implementation bragging about being as fast as Lisp.

As for the "method_missing" method, well, I'm sure it can be useful. Personally, I prefer to see errors as early as possible. I like the compiler to tell me that there is no such method, not to do some arbitrary other thing. If you walk down the street in a statically typed world and someone says 'Blarg?' to you, you don't drop dead. Because in a statically typed world, nobody would ever shout 'Blarg?' at you. This is guaranteed by the compiler. The program wouldn't compile; in other words, you wouldn't be walking down the street at all!

Sure, this loses some flexibility. And to overcome that, we need to recreate some flexibility with dynamic dispatch (OO), templates, type classes, or whatever you can think of. The idea is to be as safe as possible and as flexible as possible at the same time. Sometimes this gets too complex, so sometimes we need to trade safety for simplicity or flexibility too. But it shouldn't be necessary. ML and Haskell are examples of entirely safe languages (you can't fool the type system) that don't feel restricting, and experienced programmers use the type system to its maximum potential to help guarantee correctness.

But yes, collections have traditionally been troublesome in statically typed languages. C++ overcomes this with templates, and it is a good solution, if not a perfect one. Dropping type safety isn't perfect either. But basically, creating good collection libraries in statically typed languages is a solved problem; ML and C++ programmers have known that for ages. With templates/genericity/whatever, you don't need a common type hierarchy.

The OO approach of loosening up the type system has always been a bit more troublesome, because traditionally types have almost always been tied to classes in OO. In hindsight, that was probably a mistake. It stems from Simula (the first OO language), and has somehow managed to remain almost unchallenged ever since. Smalltalk (the "other" first OO language) does not do it, but it has dynamic typing. There is no reason we couldn't do something similar in a statically typed language. For a better approach, look at how OO is done in e.g. OCaml.

That being said, it is quite clear that dynamic typing is cool too. Sometimes, when you haven't yet made up your mind about what to create, it is easier to battle with a debugger than with a restrictive type system. Unit testing can give us the same (or better) safety as type correctness can. Speed isn't always important, and for many people dynamic typing is more intuitive (especially for non-programmers).

There is room for both approaches, and there is no "best" language for all purposes. As you can tell, I like both, but mostly prefer to work with statically typed languages.

[ Parent ]

Static typing (none / 0) (#305)
by statusbar on Mon Jun 02, 2003 at 08:23:40 PM EST

What kind of errors can there be in your dynamically typed code if your code passes all your unit tests?

Most people who use the 'compiler static type checking' as a test, do not write their own unit tests.

Quite often, a statically typed language is used as an excuse not to fully test the behavior of the code. Of course this is not sufficient, so you need the unit tests ANYWAY.

It compiled! Ship it!

Too many programmers write C++ code without unit tests.

--jeff++

[ Parent ]

Unit tests (none / 0) (#309)
by Cro Magnon on Tue Jun 03, 2003 at 09:48:21 AM EST

Unit testing is necessary, but anything that helps catch errors is an advantage. Static typing doesn't replace good unit testing, but it sure helps the process. And programmers using dynamically typed languages are under the same pressures and time constraints as everyone else, and may not unit test well enough.
Information wants to be beer.
[ Parent ]
Hmm... (5.00 / 2) (#311)
by joto on Tue Jun 03, 2003 at 10:46:22 AM EST

What kind of errors can there be in your dynamically typed code if your code passes all your unit tests?

You have already answered your own question. The answer is: anything your unit tests didn't test for. It might be a silly typo, a bad design, or anything else. Who knows?

One of the things I really like about ML is the ability to create types that catch errors in the design of the program. Let's take a simple example: a stack:

datatype 'a stack = Empty
 | Something of 'a * 'a stack

When using this datatype, it forces you to think about whether the stack could be empty, because the only way to get the top element is to pattern match against Something. And if you don't have a separate case for Empty, the compiler will warn you about it (loudly!). In addition to being concise, this can save you a lot of errors and (I shouldn't really say this, but...) a lot of unit tests for trivial things.

Here, by the way, is how you would write a function to extract the top element of a stack in ML (not that one would usually do that, but it illustrates how types help in catching errors: the case for Empty really needs to be there or the compiler will issue a warning, though you don't necessarily need to raise an exception):

exception EMPTY
fun top (Something (hd,tl)) = hd
 | top Empty = raise EMPTY

I don't know about you, but when I write unit tests, I usually only test for non-trivial things, and in a bottom-up manner. To continue the silly stack example: if I were to write unit tests, I would write them to check the stack implementation (not that I really have to in ML), but I would not write unit tests everywhere I used a stack. If you do, well, good for you, but I'm not sure I would want to wade through all your unit tests when reading your code.

Most people who use the 'compiler static type checking' as a test, do not write their own unit tests.

True. And while it might seem despicable, it's not necessarily crazy. Many people writing in dynamically typed languages don't write unit tests either. It depends on many factors, such as how much a bug will cost you, how trivial the code is, how much testing it will receive later, and so on. If you write code in C or C++ you really do need to write unit tests, because these languages aren't designed to catch silly errors the way e.g. ML is (or Python, for that matter, in a different way).

But thinking that unit tests are the answer to every problem is as crazy as thinking that static typing will solve every problem. You can't write unit tests for everything, and you can't typecheck everything. But both can be used as tools to get more correct code. We also need proper design, documentation, testing (not just unit tests), user feedback, and lots of other things. Static code checkers also help, and of course any kind of formal method will help you design a working program; static typing is a subset of that...

[ Parent ]

Unit Tests (none / 0) (#314)
by statusbar on Wed Jun 04, 2003 at 07:21:19 PM EST

The answer is: anything your unit tests didn't test for. It might be the silly typo, a bad design, or anything else. Who knows?

In my opinion, the unit test would be incomplete then.

The tests should exercise every code path in your object's methods. If they do, and the code passes them, then the compiler's static typing diagnostics were not necessary.

Static typing, in C++ at least, DOES make your code more complex, and it also gives you a false sense of security that the code is correct if it compiles.

C++'s template features are not just a nifty feature; they are REQUIRED because of the statically typed nature of C++.

All of the things C++ templates do for you can be done in Python (for example), or Smalltalk, etc., without the need for 'generics' or 'template' extensions.

Why? Because they are dynamically typed languages: a class is itself an object. In C++, a class is not an object in its own right.

example:
class XX:
  pass

class YY:
  pass

def my_function( T ):
  obj = T()
  # do something with obj

my_function( YY )
my_function( XX )

--jeff++

[ Parent ]

Perfect world (none / 0) (#315)
by Cro Magnon on Thu Jun 05, 2003 at 11:04:25 AM EST

In my opinion, the unit test would be incomplete then. The tests should exercise every code path in your object's methods. If they do, and they pass the tests, then the compiler's static typing diagnostics were not necessary. static typing, in C++ at least, DOES make your code more complex and also gives you a false sense of security that the code is correct if it compiles.
Unfortunately, most unit tests, IME, ARE incomplete! I doubt that programmers using dynamic languages do any better testing than statically typed programmers. The same programmers who get a "false sense of security" when their C++ program compiles will get a false sense of security when their Smalltalk/Python program runs right once. Static typing isn't perfect, but it DOES catch some types of errors, and until more programmers get the time/skills to do full testing, SOME errors caught beats NO errors caught.
Information wants to be beer.
[ Parent ]
Static errors (none / 0) (#318)
by statusbar on Thu Jun 05, 2003 at 05:06:17 PM EST

You are right, of course; reality is a bitch. But it is also my experience that static typing ends up making code more complex and therefore more likely to contain errors and funky workarounds.

--jeff++

[ Parent ]

Dynamic typing is excellent for small programs (none / 0) (#310)
by Per Abrahamsen on Tue Jun 03, 2003 at 10:12:10 AM EST

As long as the program is small enough that you can keep the design in your head (my limit is around 10k lines), dynamic typing is more efficient (programmer-time wise) than static typing.

Whenever I pass that limit, I start missing the extra structure imposed by a static type system. The static types are an alternative description of (a subset of) the program, so in a sense you have two descriptions of the same program. When used right, static typing can work as a design document that is automatically checked for consistency with the actual program code whenever the program is compiled.

Having two descriptions of the same program is obviously more work, but in my experience it starts to pay off when the program reaches a certain size and complexity. But only as long as the two descriptions are checked for consistency; otherwise they will start to diverge as time passes.


[ Parent ]

boost (4.50 / 2) (#279)
by mitch0 on Thu May 29, 2003 at 08:08:49 AM EST

I read most of the comments, yet I didn't see any reference to the boost library.

It's a wonderful library with lots of neat things done with templates. Many of the complaints mentioned in the comments are addressed.

I'll just mention the Concept Check Library for those who complain about unintelligible error messages.

I still think that C++ is one of the most powerful languages nowadays. Sure, you have to watch out, but at least you CAN do almost anything.

anyway, check out the boost site.
cheers, mitch

FORTH.... (5.00 / 1) (#282)
by Alhazred on Thu May 29, 2003 at 02:19:25 PM EST

I'm simply amazed that out of all the 1000's of comments here nobody mentioned anything about FORTH...

The Forth "Outer Interpreter" is a beautiful example of accomplishing the same goals by turning the system inside-out.

For those who don't know anything about FORTH, it is a hybrid system. An "Outer Interpreter" parses an input stream (usually, by default, a console in the old days) and executes each token (called a word in FORTH parlance, but equivalent to a function) it finds.

This interpreter has a second state, compile mode, in which it generates (virtual) machine code. A programmer can thus initiate compilation simply by changing this flag. Normally this is accomplished by the word ":" (colon), which adds a new header to the dictionary (the linked list of available words), sets compile mode on, and parses the next token from the input stream to use as the name of the newly defined word. Thus:

: MYWORD + . ;

defines a new word (function) called MYWORD which calls the words '+' (plus), and '.' (dot) in sequence. The last word ';' (semi-colon) is a 'compiling' word. It is marked in the dictionary as such and thus the FORTH compiler simply executes it (just as it would in interpreter mode). Semi-colon flips the state back to interpreting, and may do some clean up as well in some FORTH implementations.

The beauty of the whole system is that awkward constructs like templates or LISP style macros are totally unneeded. One can simply dodge back and forth between compiling and interpreting as required.

For instance, assuming we had a word factorial, we could put 3! into another function as a compile-time constant quite easily.

: MYWORD
   [ 3 factorial ] LITERAL
   dosomethingelse
;

Notice how brutally simple FORTH is. This is the strength of simplicity. You just plain tell the compiler/interpreter what to do next.

With these simple tools it is quite easy (almost trivial) to create an object-oriented FORTH or almost any other kind of extension. As with most LISPs, FORTH is almost always written in itself, except for a small subset of core functions. An entire working FORTH compiler and language can be implemented in less than 10,000 lines of code, including the core!
That is not dead which may eternal lie And with strange aeons death itself may die.

10000 lines!!! (none / 0) (#291)
by the on Fri May 30, 2003 at 08:31:46 PM EST

I'm sure you can do it in a lot less. I implemented most of FORTH in machine code (i.e. typing in bytes in hex, one at a time...) and I certainly wouldn't have had the patience for anywhere near 10,000 bytes of code! And that included routines for multiplication and division on hardware that didn't support them.

But yes! FORTH is beautiful. Implementing it can get weird because in some implementations the core interpreter, in some sense, is only a dozen or so bytes long. After that you're writing a mixture of code and data which are intertwined in a way I've never seen in any other language.

And you forgot to mention CREATE ... DOES>!

--
The Definite Article
[ Parent ]

Yeah, well I couldn't document ALL of FORTH (none / 0) (#292)
by Alhazred on Fri May 30, 2003 at 08:58:41 PM EST

in a post, hehehe, though the scary thing is that you can pretty much document it all in 10 pages.

I still have the original printouts of FIG FORTH floating around here somewhere in my library (along with a few dozen copies of 'Thinking FORTH' by Leo Brodie; remember him? hehe).
That is not dead which may eternal lie And with strange aeons death itself may die.
[ Parent ]

Is one of those copies of... (none / 0) (#293)
by the on Sat May 31, 2003 at 12:44:38 AM EST

...Thinking FORTH for sale?

--
The Definite Article
[ Parent ]
brodie (none / 0) (#316)
by iangreen on Thu Jun 05, 2003 at 02:03:41 PM EST

Yeah, I've seen them around, both in the library and on amazon.com. I guess those texts are fairly outdated, but still very awesome. I have 'Starting Forth' and it's damn good.

[ Parent ]
Not REALLY outdated (none / 0) (#319)
by Alhazred on Sat Jun 07, 2003 at 07:30:16 PM EST

Personally I think it's the classic work on factoring. Anyone who hasn't read it probably doesn't understand factoring. The problem sets and such may be a bit dated, and FORTH itself might in some ways be considered dated as well, but the ideas seem more current now than ever!
That is not dead which may eternal lie And with strange aeons death itself may die.
[ Parent ]
Where is the difference from this: (none / 0) (#295)
by jtra on Sat May 31, 2003 at 01:26:25 PM EST

See the following Lisp fragment:

;;; badly recursive factorial function, please ignore :-)
(defun fact (n) (if (zerop n) 1 (* n (fact (1- n)))))

(defun foo ()
#.(fact 5))

The #. reader macro performs read-time evaluation, so the factorial in foo is computed only once, before the code is even compiled.

Can FORTH do more with those compile and interpret words?

--Get textmode user interface for Ruby, http://klokan.sh.cvut.cz/~jtra/
[ Parent ]

Forth Lives (none / 0) (#321)
by billglover on Thu Jun 12, 2003 at 08:01:55 PM EST

And in a lot less than 10,000 lines! Chuck Moore is still at it and has only refined his approach over the years. The latest incarnation is ColorForth and it has an active developer group with new stuff happening all the time. Imagine an OS/Editor/Compiler/Everything in 2 kilobytes... http://www.colorforth.com
--
-Bill


[ Parent ]
Missed a key feature of templates. (none / 0) (#300)
by metalotus on Sun Jun 01, 2003 at 04:10:33 PM EST

The author mentioned generic and reusable components, and then discussed the various ways programming languages support type features. C++ templates create reusable binary code. For example, STL binary components are significantly different than modules in the Java library. A discussion about language support for type features does not capture the idea of reusable binary components. I think binary components are the most significant feature of the STL, and this was not understood in the above article.

hmm? (none / 0) (#307)
by Delirium on Tue Jun 03, 2003 at 01:09:22 AM EST

The whole point of templates is that, unlike generics in some languages, they don't allow for reusable binary code. They're essentially an enormously complex macro system for generating multiple specialized pieces of binary code. If you use vector&lt;bool&gt; and vector&lt;int&gt; in your program, the compiler uses the template to generate two entirely separate pieces of code, a "bool vector" and an "int vector." That's why they're called templates.

[ Parent ]
Generative Programming and C++ template LISP (none / 0) (#320)
by Leimy2k on Tue Jun 10, 2003 at 04:13:29 PM EST

The authors of the book Generative Programming had an implementation of Lisp using the recursive properties of C++ templates. I think such a meta-configuration language is an interesting way to use templates.

Also, doesn't Java 1.5 have some templatey thing coming out soon? The syntax looks awfully familiar.

At Los Alamos National Labs there is a template metaprogramming library called PETE (Portable Expression Template Engine), which is pretty powerful stuff. It can help optimize the heck out of your template code by eliminating unnecessary copies and so on.

C++ often gets a bad review, to the point where it almost seems trendy to dislike it for very strange reasons. I have heard "I don't like using the bit shift operator for cout/cerr", but these are the same people who like the [] method syntax of Objective-C (which to me is an alien syntax that takes getting used to).

I suppose if they said operator overloading in general was a bad and unnecessary thing, I could agree with them more, since it's nothing but syntactic sugar anyway... and maybe that sugar does indeed add more complexity than is necessary to get work done.

I mean, just compare the size of Bjarne's book to K&R's. There is definitely more to understand and more to get wrong about C++. It just takes a different kind of person to like that sort of thing, I suppose.



Templates? Yes. Metaprogramming? Ick! (none / 0) (#322)
by ksandstr on Sat Jun 14, 2003 at 10:57:51 AM EST

The following is just my opinion, of course.  Couldn't be any other way.

Back in the day, I thought that C++ templates were a pretty good thing. I mean, type-agnostic collection/mapping/whathaveyou classes written in such a way that the compiler can inline most of the iteration primitives that you're going to use, type-safely no less? Gimme!

But these days, you have to wonder if the whole C++ template metaprogramming thing isn't just some elaborate, bizarre form of mental masturbation. I mean, sure, it's pretty cool that you can instruct the compiler (although in a syntactically clunky way) to generate this many iterations of this function, and that when constructing classes with particularly high performance requirements you can parameterize things like array sizes, invariants and so on. But run-of-the-mill unrolling? The Boost lambda module? "Traits" classes? Maybe it's just my lack of understanding (which, honestly, I'm not planning to improve at any specific time in the future), but shouldn't these things be provided by a sufficiently smart combination of language and compiler that would, among other things, treat any function, class or other language construct as generic and instantiate it as desired?


Fin.

Metaprogram! (none / 0) (#324)
by the on Fri Feb 06, 2004 at 01:01:09 PM EST

"shouldn't these things be provided by a sufficiently smart combination of language and compiler"
If enough people do template metaprogramming and publish their work, then eventually we'll see a language that does what template metaprogramming does without the crap. But if everyone keeps quiet about it, the compiler developers will think we're happy with the current state of affairs. Right now there is no language that provides the performance of C++ with the abstraction of metaprogramming without the crazy syntax.

--
The Definite Article
[ Parent ]
What's wrong with C++ templates? | 324 comments (253 topical, 71 editorial, 0 hidden)
Display: Sort:

kuro5hin.org

[XML]
All trademarks and copyrights on this page are owned by their respective companies. The Rest © 2000 - Present Kuro5hin.org Inc.
See our legalese page for copyright policies. Please also read our Privacy Policy.
Kuro5hin.org is powered by Free Software, including Apache, Perl, and Linux, The Scoop Engine that runs this site is freely available, under the terms of the GPL.
Need some help? Email help@kuro5hin.org.
My heart's the long stairs.

Powered by Scoop create account | help/FAQ | mission | links | search | IRC | YOU choose the stories!