Kuro5hin.org: technology and culture, from the trenches

A functional language compiler that creates really fast code?

By ff in Technology
Thu Mar 14, 2002 at 09:22:08 AM EST
Tags: Software

Why don't we have a compiler that generates assembly language directly without generating C code first?


I'm fairly new to functional languages, having picked up Erlang only about 3 or 4 months ago. I really like many of the features afforded by functional languages: type inference, bounds safety, pattern matching, and the other features they often have are very useful. The problem is that I have been unable to find a functional language compiler that creates fast enough binaries. Do not get me wrong, I know speed is not everything, and some of those features have a speed cost (and I'm OK with that). I like Perl, Python, and other high-level languages that are not very fast compared to C, and they work for what they do. My main problem is that I do not feel the functional languages should be as slow as they are. A lot of the programs I like to write need every bit of speed they can get: games, graphics/CAD applications, and anything else with heavy user interactivity demand maximum performance from the system. And yes, I would argue that C/C++ is not the proper language to write those types of applications in; nobody likes corrupted memory and the other stupid things you have to deal with in C/C++. I am using C's speed as the benchmark since I have not seen any other language besides assembly that is as fast.

Although most functional languages feel high-level when you are using them, if you look at the syntax and the operations, they are actually similar to C in their simplicity. I would think that they could be compiled directly to assembly language, generating code that runs at least as fast as C (assuming the same functionality). For instance, in my own testing I have found that accessing arrays in O'Caml code compiled with the "-unsafe" option is at least 50% slower than the equivalent C code. That should not be; unsafe array access should be as fast as (or faster than) it is in C.
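
For concreteness, here is a minimal OCaml sketch of the kind of loop such a comparison would time (the array size is arbitrary; compiling the same code with and without "-unsafe" toggles the bounds checks):

  (* sum over an int array; under ocamlopt -unsafe, a.(i) compiles
     without bounds checks, as if it were Array.unsafe_get *)
  let sum (a : int array) =
    let s = ref 0 in
    for i = 0 to Array.length a - 1 do
      s := !s + a.(i)
    done;
    !s

  let () =
    let a = Array.init 1_000_000 (fun i -> i) in
    Printf.printf "%d\n" (sum a)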

As I see it, the problem is that every language I have looked at uses C as a middle layer before compiling to binary. I can see the appeal: C makes for a portable, fairly fast subsystem that keeps the compiler writer from having to think about too many things. But it is also the downfall of these compilers; they all end up slower than they should be because they are layered on top of C. It is not that C is slow; the problem is too many layers.

Imagine eliminating all the cruft that gets added by the C compiler. The registers, stack, and everything else could be optimized for the functional language. I think this is completely possible, but no one has taken the time to write a compiler that works this way. I would approach the problem just as I imagine they did when creating the first C compiler: the compiler has to emit the assembly code directly; no middle language is going to provide the speed you need to make your language as fast as it can be.

Another problem area is calling external APIs that have a C interface (OpenGL, for example). This is always much slower than I think it could be, because you have to go through multiple layers of C code to get to the external API.

Now, I could be completely off base. I know very little about language design, and maybe it is the way it is for a reason. My gut tells me otherwise: my gut tells me the compiler writers saved time and/or effort by using another high-level language as a middle layer on the way to the final product. As a developer I completely understand this, but I think the situation could be better with a little work. There are some projects out there, like HLA (high-level assembly), that could really help.

Most functional languages will also compile to byte-code, yet I have never seen a JIT compiler for any of them. Why is that? I think Erlang will have one soon based on the HIPE work, but having used the HIPE compiler, I can say it is nowhere near as fast as C. I don't know if it's possible to JIT-compile byte-code into something that runs as fast as C, but that could be another option. An O'Caml JIT compiler? I still don't think that would be as fast as creating assembly code directly.

Just some thoughts on my newly found and fun languages.

A functional language compiler that creates really fast code? | 156 comments (143 topical, 13 editorial, 0 hidden)
The reason (4.33 / 3) (#2)
by jacob on Wed Mar 13, 2002 at 08:37:30 PM EST

The traditional argument against compiling straight to assembly is this: a) the onus of writing native code backends for every platform on Earth would then fall on each and every functional programming language writer; and b) every optimization that an optimizing C compiler does for every platform would have to be duplicated by every functional compiler if it wanted to generate code comparable to the equivalent C code.

Functional programming language compilers generally don't want to do those things, as they've already been done quite well, so they usually stick to compiling to C.

That doesn't mean that functional languages have to be slow, though: the ML family of languages seems to do quite well for itself in terms of speed, actually. Check out O'Caml.



--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

What about... (3.00 / 1) (#4)
by enterfornone on Wed Mar 13, 2002 at 08:41:31 PM EST

if we compiled to CLI instead?

--
efn 26/m/syd
Will sponsor new accounts for porn.
[ Parent ]
Being done (3.00 / 1) (#10)
by jacob on Wed Mar 13, 2002 at 08:59:55 PM EST

Many functional programming language compilers are starting to target the CLI, actually. See Microsoft Research's page, for example, or Mondrian.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
What about... (2.00 / 2) (#12)
by Ken Pompadour on Wed Mar 13, 2002 at 09:07:58 PM EST

If you compile to Java bytecode? Sheesh. What's with all the "why don't we compile language X to CLI?" questions... it's like the new beowulf cluster :(

...The target is countrymen, friends and family... they have to die too. - candid trhurler
[ Parent ]
Answer: (4.00 / 1) (#17)
by jacob on Wed Mar 13, 2002 at 09:21:53 PM EST

Read this [note that the authors are Microsoft Research employees; however, their arguments stand on their own]. The most important reason is that the JVM does not support tail call optimization while the CLI does; that alone is enough to make the CLI a more attractive target for functional language compilers than the JVM.



--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
Not to sound like an idiot but... (2.00 / 2) (#25)
by Ken Pompadour on Wed Mar 13, 2002 at 09:50:18 PM EST

Couldn't Java bytecode support tail recursion if there was an emulator written in Java which the functional language would run on? Not that it would be fast, but at least tail recursion would be supported... *

silly 'in theory you could' question, I know, but from all I've heard, languages geared towards the CLI have all been gutted in one way or another...

*I feel dirty now. Ignore this paragraph, I'm not a teenager, I swear :(

...The target is countrymen, friends and family... they have to die too. - candid trhurler
[ Parent ]
Err ... (3.00 / 1) (#52)
by jacob on Thu Mar 14, 2002 at 12:05:15 AM EST

Yes, you could implement another common language runtime on top of the JVM that would support whatever you wanted. No, that would not be a good idea. :)

As far as languages being gutted for .NET, the only serious complaints I've heard have been about the exact properties of primitives. That doesn't bother me all that much, but of course your mileage may vary. Have you heard of any other serious compromises that have had to occur to get things running on that platform?



--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
Well (2.00 / 2) (#98)
by Ken Pompadour on Thu Mar 14, 2002 at 10:14:23 AM EST

If it would support tail recursion, why not? My understanding was that the tail-call optimization means new stack frames don't have to be allocated for each call to the function.
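
For illustration, here is the standard textbook contrast, written in OCaml since nothing about it is JVM- or CLI-specific. The first version keeps a stack frame per call because work remains after the recursive call returns; in the second, the recursive call is in tail position, so a compiler may reuse the frame, effectively turning the recursion into a loop:

  (* not a tail call: the addition happens after sum returns,
     so each call needs its own stack frame *)
  let rec sum n = if n = 0 then 0 else n + sum (n - 1)

  (* tail call: nothing is left to do after the recursive call,
     so the current frame can be reused *)
  let rec sum_acc acc n = if n = 0 then acc else sum_acc (acc + n) (n - 1)

  let () = Printf.printf "%d %d\n" (sum 10) (sum_acc 0 1_000_000)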

Anyways, no, I haven't heard of any compromises for CLI, except that it only runs on Windows platforms. I've heard it's rather good, albeit an unapologetic Java clone with a few things changed or added.

...The target is countrymen, friends and family... they have to die too. - candid trhurler
[ Parent ]
Why not: (none / 0) (#143)
by jacob on Fri Mar 15, 2002 at 11:09:36 AM EST

No reason other than that it's much easier and less convoluted to just implement a virtual machine that supports tail calls from scratch rather than having an infinite chain of emulators, all else being equal. If you're stuck with the JVM, it's possible to create a tail-call optimization mechanism (I posted a link elsewhere to a paper by Odersky and somebody else about how they did it with a language called Funnel); it's not pretty, though.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
What's the deal? (3.00 / 1) (#116)
by blamario on Thu Mar 14, 2002 at 12:38:49 PM EST

I don't get all the importance Microsoft gives to having the tail-call instruction in the bytecode.

What is a tail call? It's simply an ordinary call followed by a return. It's trivial for any JIT compiler to recognize the sequence and optimize it any way it likes. So what's the point of inserting a special new tail-call instruction to tell the JIT compiler "hey, this is not an ordinary call, it's actually the end of the function, so you can optimize it"? If the compiler is so shitty it can't do a one-instruction lookahead, it has bigger problems than optimizing tail calls.

In my opinion adding unneeded features to a product (CLR) just for the purpose of being different from competitors (JVM) is bad engineering. But then again, marketing can be more important sometimes.



[ Parent ]
Harder than you think (4.50 / 2) (#120)
by jacob on Thu Mar 14, 2002 at 01:16:33 PM EST

While I'm not familiar with the details (I'm looking now), I know that general tail-call optimization (particularly so-called "sibling calls" -- tail calls to functions that aren't the caller) is pretty hard, and no JVMs I know of do it. Besides, the ability to force it even in cases where the compiler couldn't prove it was safe would certainly be useful.

Moreover, many functional languages use recursion as their primary means of repetition. For those languages, the expectation of tail-call optimization isn't enough -- you need a guarantee, which the CLR provides and the JVM does not.

Incidentally, that feature was put in at language developers' request, not as a Microsoft marketing ploy.

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
I think it's easy (3.00 / 1) (#123)
by blamario on Thu Mar 14, 2002 at 01:30:11 PM EST

While I'm not familiar with the details (I'm looking now), I know that general tail-call optimization (particularly so-called "sibling calls" -- tail calls to functions that aren't the caller) is pretty hard, and no JVMs I know of do it.

That's because they rarely need to. The code they are JIT-compiling is usually coming from a Java compiler and contains few optimizable tail calls. But that doesn't mean they couldn't do it if Sun thought about the "language coexistence" marketing story.

Besides, the ability to force it even in cases where the compiler couldn't prove it was safe would certainly be useful.

The whole point of having the JIT is that the compiler checks all the code. The bytecodes are not trusted by default. If the tailcall instruction breaks the security model as you say, then it's unsafe. If it's unsafe, then any IL code generated by a functional language is unsafe and can't be run in a safe environment. Wasn't this feature supposed to be a gift to the functional language developer community?

Moreover, many functional languages use recursion as their primary means of repetition. For those languages, the expectation of tail-call optimization isn't enough -- you need a guarantee, which the CLR provides and the JVM does not.

The guarantee need not be in the form of yet another bytecode instruction; it would be sufficient to say that any compliant JIT compiler must optimize tail calls away. Scheme does it that way and it works.



[ Parent ]
good for you (4.00 / 1) (#137)
by jacob on Thu Mar 14, 2002 at 04:26:32 PM EST

That's because they rarely need to. The code they are JIT-compiling is usually coming from a Java compiler and contains few optimizable tail calls. But that doesn't mean they couldn't do it if Sun thought about the "language coexistence" marketing story.

Yeah, they could. They don't. Therefore, the JVM isn't guaranteed to be properly tail-recursive. Therefore, language implementors have to implement it themselves. See Michael Schinz and Martin Odersky, "Tail Call Elimination on the Java Virtual Machine."

The whole point of having the JIT is that the compiler checks all the code. The bytecodes are not trusted by default. If the tailcall instruction breaks the security model as you say, then it's unsafe ...

Sorry, I was unclear. By "safe" I don't mean "secure," I mean "the program won't crash" (though interestingly, tail-call optimization can have security impacts in Java; see here). It's very possible in a program-analysis context that looking at the original code you can prove things that you can no longer prove once you compile down to the bytecode level, particularly in the CLR which allows unsafe pointer-manipulating segments. For that reason, it's important for compiler writers to be able to specify that a particular function application is definitely a tail-call and it's safe to optimize as though it were.

The guarantee need not be in the form of yet another bytecode instruction; it would be sufficient to say that any compliant JIT compiler must optimize tail calls away. Scheme does it that way and it works.

Yes, Scheme does it that way. For Scheme, the property of being an optimizable tail-call is a static property of the source code (though the place to jump to as a result is a dynamic property). That does not appear to be true of CLR assembly language. So, again, it's important for compiler-writers to be able to explicitly mark that certain calls are definitely optimizable tail calls.

I must point out again that I'm hardly an expert on this subject, though. Don't believe me.



--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
There's already a CLI Scheme compiler (3.00 / 1) (#109)
by epepke on Thu Mar 14, 2002 at 11:55:46 AM EST

It's incredibly hideous and guts the language.


The truth may be out there, but lies are inside your head.--Terry Pratchett


[ Parent ]
Reference? (none / 0) (#139)
by jacob on Thu Mar 14, 2002 at 04:42:53 PM EST

I couldn't find reference to it anywhere. Could you post a link or something?

--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
I think we need a better OSS c compiler (4.00 / 4) (#5)
by autonomous on Wed Mar 13, 2002 at 08:49:52 PM EST

I really think we need a better open source C compiler. GCC sucks ass. It is slow, it doesn't optimize for anything recent in any meaningful way, and it is quite unreliable. I personally have spent countless hours compiling and recompiling and recompiling just to find the right combination of -foo's that make my code not crash (and when the exercise is done I generally only have a <10% speed improvement with all the optimization). However, I look at many of the commercial compilers around and they ROCK; I've seen >200% speed improvement over GCC.
Why can't GCC do the same? (And why can't it be BSD licensed so we can get more commercial buy-in from hardware companies?)
-- Always remember you are nothing more than a collection of complementary chemicals worth not more than $5.00
gcc aims at portability (3.00 / 1) (#36)
by Delirium on Wed Mar 13, 2002 at 10:23:27 PM EST

While I agree that more optimizations in GCC are certainly desirable, I think GCC succeeds at its main goal, which is to be a widely portable compiler. There are compilers which are much faster -- Intel's ICL is 15-30% faster, I believe -- but they don't run on everything the way GCC does.

So there's no reason not to further optimize GCC, but I don't think it's as horrible a compiler overall as you make it sound.

[ Parent ]

portability need not == dog slow (3.00 / 1) (#57)
by autonomous on Thu Mar 14, 2002 at 12:21:04 AM EST

I agree that it is a nice, widely portable compiler, but the open source community REALLY REALLY (and I mean really) needs modern compiler technology to make things happen. I think there should be hardware-vendor-assisted backends.
-- Always remember you are nothing more than a collection of complementary chemicals worth not more than $5.00
[ Parent ]
software patents and gcc (3.66 / 3) (#59)
by arjan de lumens on Thu Mar 14, 2002 at 12:38:32 AM EST

AFAIK, several optimization techniques that are missing from GCC are being kept out because they collide with existing software patents held by e.g. Intel and Microsoft.

[ Parent ]
hmm well.. (2.00 / 1) (#105)
by QuantumG on Thu Mar 14, 2002 at 11:28:14 AM EST

As an educational (read: dick size) project, I was gonna write a C compiler the other day. The sane way to do this would be to use flex/bison or some other compiler-compiler, but I wanted to demonstrate bootstrapping, so I was going to hand-craft it in C. Anyways.

Gun fire is the sound of freedom.
[ Parent ]
BSD? (none / 0) (#140)
by blankbox on Thu Mar 14, 2002 at 07:47:33 PM EST

I would assume that GCC will never have a BSD license because... well... it's made by the GNU people. They frown on permissive licenses, preferring the GPL.

[ Parent ]
Isn't the problem... (4.00 / 2) (#6)
by skim123 on Wed Mar 13, 2002 at 08:54:49 PM EST

I'm fairly new to functional languages, having picked up Erlang only about 3 or 4 months ago. I really like many of the features afforded by functional languages. Things like type inferencing, bounds safety, pattern matching, and all the other features they often have are very useful. The problem is that I have been unable to find a functional language compiler that creates fast enough binaries

Isn't the speed problem here not the fact that the functional language is compiling to C first, but that functional languages rely on a model of computation (lambda calculus) that is different from the model imperative languages rely on (Turing machines)? This in itself is not a problem, but the von Neumann architecture of modern computers (well, of essentially every commercial computer since the '50s) is designed for computing Turing-machine-style problems as opposed to lambda calculus problems... that's my understanding at least; I may be wrong.

Money is in some respects like fire; it is a very excellent servant but a terrible master.
PT Barnum


possibly (3.00 / 1) (#21)
by ff on Wed Mar 13, 2002 at 09:30:58 PM EST

Very possibly, I don't know though. This is part of the reason I posted on the subject, I want to find out things like this.

[ Parent ]
Well (none / 0) (#51)
by skim123 on Thu Mar 14, 2002 at 12:02:34 AM EST

You may want to read this diary entry from today. I thought it funny to read this diary entry and then to see your submission...

Money is in some respects like fire; it is a very excellent servant but a terrible master.
PT Barnum


[ Parent ]
von Neumann machines not a problem (4.00 / 1) (#71)
by Booj on Thu Mar 14, 2002 at 03:51:02 AM EST

Actually, it turns out that there's an elegant way to convert functional code into a sort of pseudo-assembly of exactly the type that von Neumann machines like modern computers run. Basically, it's based on the concept of continuations: the compiler transforms the code into a series of nested continuations, each representing a fairly simple step (e.g. add a number to a temporary). Reverse the order, working from the inside out, and you essentially have an intermediate assembly code. From that point you pass it to a back end which optimizes it and emits native assembly code. This transformation into nested continuations, called Continuation Passing Style (CPS), is, as I understand it, the basis of almost all functional language compilers.
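
As a rough hand-written illustration in OCaml (a sketch of the idea only, not the output of any particular compiler), here is a direct-style function and its continuation-passing equivalent; in the CPS version every intermediate result is named and control flow is explicit, which is close to what a back end wants:

  (* direct style *)
  let f x y = (x + 1) * (y + 2)

  (* continuation-passing style: each step takes a continuation k
     that says what to do with its result *)
  let add1 x k = k (x + 1)
  let add2 y k = k (y + 2)
  let mul a b k = k (a * b)

  let f_cps x y k =
    add1 x (fun a ->
    add2 y (fun b ->
    mul a b k))

  let () =
    Printf.printf "%d\n" (f 3 4);
    f_cps 3 4 (Printf.printf "%d\n")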

One of the course handouts from an old prof of mine (the one who taught me about this in the first place, actually) has more information on it and is fairly accessible.



[ Parent ]
CPS (3.00 / 1) (#82)
by ff on Thu Mar 14, 2002 at 08:34:17 AM EST

Interesting, thanks.

It must not be that simple, though, otherwise the current crop of functional compilers would create code that runs as fast as C.

[ Parent ]

Because it's not worth it... (4.33 / 3) (#7)
by seebs on Wed Mar 13, 2002 at 08:55:37 PM EST

The main issue here is that the C compiler is *VERY* good at register and stack manipulation. For your hypothetical compiler to make sense, it has to be *better*, and it has to be better on a variety of systems.

Modern compilers are very good at reoptimizing machine-generated code, and essentially, the best you could hope for is to get comparable results while duplicating a huge amount of effort.

Consider, if you will, the huge development effort involved in optimizing for PPro and later x86 systems which support out-of-order execution. How many different compilers needed to do this? About three, rather than one for every single language.

C works very well as a moderately-portable underlying language to use for retargeting things. In the end, you'd just reinvent it.


maybe (3.50 / 2) (#49)
by ff on Wed Mar 13, 2002 at 11:42:09 PM EST

The main issue here is that the C compiler is *VERY* good at register and stack manipulation. For your hypothetical compiler to make sense, it has to be *better*, and it has to be better on a variety of systems.

It doesn't have to be better, it only has to be as good as the C compiler. Making something that creates faster-than-C code would indeed be very difficult, but that's not what I'm asking for. I'm asking for a functional compiler that generates code at least as fast as C code with the same functionality.

I feel your other points are true, but it doesn't change the fact that I haven't found a functional language compiler that uses C as an intermediate language and creates code that runs at a comparable speed. They are close, but not there yet. What I'm wondering is whether you can get that speed using a direct source-to-assembly-to-binary approach.

[ Parent ]

No, it has to be better. (4.00 / 2) (#58)
by seebs on Thu Mar 14, 2002 at 12:31:21 AM EST

If it's just as good as the C, you've just put a *LOT* of development time into getting *NO* improvement - and ensuring that your language is only usable on the specific processors you targeted.

The nice thing about C is that reasonably clean C is already runnable on PowerPC, G3, G4 (possibly even with Altivec optimizations), x86, Athlon, Itanium, Alpha, SPARC, Mips, MC68k, and so on.

I don't think that direct source to assembly would help any; what would probably help a lot more is more careful consideration of what the generated source looks like. A lot of machine-generated code is horrendously inefficient - and the same inefficiencies would be there whether you were producing assembly or C.


[ Parent ]
Functional language faster than C, called K. (4.56 / 16) (#9)
by Cal Jayson on Wed Mar 13, 2002 at 08:59:50 PM EST

There is a little-known language called K, produced by a company called Kx. K is the most elite language that I know of. It has its roots in APL and has added features from languages such as Lisp and Scheme, but it has some very interesting features of its own.

K is very, very fast both to write and to run; it blazes in both categories. There is a full relational database written in K, called KDB. It crushed Oracle on the TPC-B and TPC-D benchmarks in both speed and storage size, requiring only a few percent above the dataset size in overhead. It has native clustering and replication that allowed it to run on a 50-CPU Linux cluster loaded with 2.5 billion stock trades and quotes and have simple table scans (such as "select max price from trade") take under a second and multi-dimensional aggregations (such as "100 first desc select sum size*price by sym from trade") take only 10 seconds. Starting the database cluster took a tenth of a second. It is SQL92 compliant, has an extended, ultra-powerful query language called KSQL that makes writing queries very simple, and the stored-procedure languages are K and C.

In bwk's language benchmarks, even though this is not K's strong point, the total execution times were: K at 32 seconds, Perl at 95, Java at 300, and Tcl above 1400. The lines of code to implement were: K at 9, awk at 95, Perl at 96, Tcl at 105, Scheme at 170, VB at 200, and Java at 350.

Yes, K can look like line noise, but unlike Perl, you get a lot in return. First, you get extreme code density and can see the entire problem on the screen at once. I came from a Scheme background and Perl hurt my eyes, so I was very skeptical, but after my roommate persuaded me to look at K harder, I realized that this high code density made it very easy to write and debug code. It is rumored that KDB is written in 26 files of code, each consisting of a single screen of code, labeled a to z. Try doing that in any other language. The language is exceptionally regular; it is so logical and consistent that it takes a little getting used to. You never have to remember any baroque language rules: anything that makes sense, you can do. Also, even though it looks difficult, it is extremely easy to learn, because K is directly translatable to English; in fact there is a K program that will do this automatically. For example, to split a line by tabs you could write:
cut:{1_'(&x=*x)_ x:"\t",x}
And this is read:
cut gets function, 1 drop each, where x equals first x quantity, cut x. When X gets tab join x.
It may take a little getting used to, but after a month of K, my roommate and I were able to converse this way when describing K, and you could see the picture developing in your head. It was amazing.

A unique feature of K is what is called the K tree. Unification is a very strong idea in K, so it unifies the ideas of object, variable, attribute, namespace, and dictionary. A dictionary is a native K type. Each variable lives in a dictionary (somewhat like Python). These dictionaries are joined hierarchically and can be removed and added dynamically. All variables are on the K tree, too, so a new namespace is really just a dictionary on the K tree! This means that you can rearrange the K tree and change what functions get called. This is the most reflective language that I have ever seen (Python, Scheme, and Common Lisp come in very close behind). All variables have attributes. Attributes are just a special dictionary attached to the variable (the language is so regular that this is really a namespace with a blank name, so to refer to the attributes of a variable you say ns.var..attrib). And, of course, each attribute is just a variable, so each of those can have attributes, too.

K also has dependencies and triggers built into the language. If a..d:"1+b" then referring to a will dynamically calculate 1+b, but only when necessary (if you refer to a multiple times and b does not change between those references, a will only be calculated once and stored; K figures out the dependency graph for you). There are also triggers: if b..t:"a:b-1" then whenever b is assigned or modified, a will get the appropriate value. A trigger can be anything, such as a network operation or a GUI command.
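
The closest cheap analogy in a more conventional functional language is a memoized thunk. A minimal OCaml sketch (OCaml's Lazy module caches the result after the first force; unlike a K dependency, it will not recompute when b later changes, which is exactly the bookkeeping K adds):

  let b = ref 5

  (* a's value is computed from b at first use, then cached *)
  let a = lazy (1 + !b)

  let () =
    Printf.printf "%d\n" (Lazy.force a);  (* computes 1 + !b = 6 *)
    b := 100;
    Printf.printf "%d\n" (Lazy.force a)   (* still 6: cached, unlike K *)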

The language has some other unique features, like an interesting callback-oriented interprocess communication system and an on-the-fly optimizing VM.

Of course, since it inherits some background from APL, it has bulk operators, called adverbs, that modify functions in every conceivable way (much more powerful than APL or Perl). One of the signs of a good K programmer is knowing how to use these and avoiding loops entirely (KDB, the relational database, is written without any loops).

From functional languages K inherits higher-order functions and projections, both of which are very standard practice, especially when combined with the bulk operators. b f[a;;c;]'d takes the four-argument function f, fixes the first and third arguments (projecting a function of two arguments), then applies it pairwise down the lists of arguments b and d.
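
In more conventional terms, a projection is just partial application, and the adverb corresponds to mapping over the argument lists. A rough OCaml analogy (the function and values here are made up, and this is not K semantics, just the shape of the idea):

  (* a four-argument function *)
  let f a b c d = a + b * c - d

  (* "fix the first and third arguments": a projection, in K terms *)
  let g b d = f 10 b 20 d

  (* apply g pairwise down two lists, like an each adverb *)
  let () =
    let results = List.map2 g [1; 2; 3] [4; 5; 6] in
    List.iter (Printf.printf "%d ") results;
    print_newline ()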

When you use K you truly are standing on the shoulders of giants. The person who wrote it, Arthur Whitney, has an amazing ability to identify the important pieces of a problem and simplify away the rest. The performance of K and KDB is incredible; the simplicity and power of the language and the database are incredible.

K runs on various flavors of Unix and NT, so people should keep an open mind (I didn't have one at first and was very skeptical) and really try the language and a new style of programming. Your code and your thoughts on developing will never be the same.

-j
--
kx.com: 2.5 billion trades
select max price from trade takes 1 second
My eyes! My eyes! (2.00 / 2) (#11)
by Ken Pompadour on Wed Mar 13, 2002 at 09:05:22 PM EST

Oh my god... could this be a contender for worst looking language, ever?

A functional language that looks like Perl... I'm going to have nightmares for weeks :)

That being said, can you explain in what way K trees are more than syntactic sugar?

...The target is countrymen, friends and family... they have to die too. - candid trhurler
[ Parent ]
Worst ever? Hardly. (3.00 / 1) (#32)
by ghackmann on Wed Mar 13, 2002 at 10:10:38 PM EST

That title, I think, goes to Befunge. Where most languages are satisfied to execute from top to bottom, Befunge is the only language I know of whose control flow can move in an arbitrary number of dimensions.

[ Parent ]
Not even ugliest functional language (3.00 / 1) (#75)
by zakalwe on Thu Mar 14, 2002 at 04:43:14 AM EST

And if he wants an even worse looking functional language, there's always unlambda

(some code from the above page)
```s``s``sii`ki
  `k.*``s``s`ks
  ``s`k`s`ks``s``s`ks``s`k`s`kr``s`k`sikk
  `k``s`ksk


[ Parent ]

No, it doesn't even come close (none / 0) (#156)
by joto on Thu Mar 21, 2002 at 09:04:22 AM EST

Actually, Befunge is quite a nice language, if you compare it to the other obfuscated languages.

If you want the absolutely worst, use Malbolge

[ Parent ]

The K tree (4.00 / 2) (#60)
by Cal Jayson on Thu Mar 14, 2002 at 12:53:35 AM EST

Yes, K is rough on the eyes before you get used to it. I came from the impeccably clean world of Scheme, so it was a huge change for me; I hated staring at K. But after I learned to read K as English and became accustomed to seeing the whole problem on the screen (or even on one line) at once, I now have a rough time with low-density languages such as Java.

Unification is a very strong idea in K, so the K-tree unifies the ideas of object, variable, attribute, namespace, and dictionary. A dictionary is a native type in K. Each variable lives in a dictionary (somewhat like Python). These dictionaries are joined hierarchically to form the K-tree. So a variable that exists in a namespace is really just a node on the K-tree, and a namespace is also just a node on the K-tree that happens to be a dictionary! At this level the K-tree just helps you to uniquely prefix your variables. Most packages define a dictionary of symbols that attach themselves to the K-tree when they are loaded. If you want, you can always reattach the namespace/branch to another location, effectively renaming the namespace dynamically. Whenever code is evaluated, it is done with respect to a branch of the K-tree, so referring to the variable s really refers to the.containing.namespace.s. There are two ways to add dynamism to this. First, you can decide to evaluate an expression in any dictionary you want (the dictionary doesn't even need to be on the K-tree), so calling function f calls a different function depending on where you evaluate it. Second, the K-tree can be changed just like any other value in the system, so you could detach a branch (and save it or throw it away), attach a new branch, then evaluate the expression. This will also affect the evaluation of the expression. These two things mimic dynamic binding in an OO language.

All values can have an optional attribute dictionary attached to them. This attribute dictionary is given a blank name, so you really refer to an empty namespace to refer to these attributes, such as the.containing.namespace.s..attribute; an attribute is just a K value, so an attribute can also have attributes. There are some predefined attributes, such as h, which holds a help string that some development environments respect. The K-tree and attributes lead to a very elegant GUI.

In the K GUI, each variable can have an attribute named c (for class), which can take certain values like `table, `check, `radio, `button, and others (the backtick ` is how you make a symbol). Let's take radio as an example. You would then have another attribute o (for option) with possible values:
r..o:`zero`one`two`three`four
r:r..o[1]
r..c:`radio
`show$`r
These four lines would create a radio box with five choices, zero through four; every time you evaluated r, it would evaluate to whatever the radio was set to, and whenever the value of r changed, the GUI would update, too. Basically, each variable has a direct on-screen representation (they default to `data) and is directly manipulable.

Since the K-tree is as mutable as any other value in the system, this allows you to do very funky things like create a dictionary of symbols and values, attach it to the K-tree, and then treat it as if you had syntactically defined those values initially. This lets you run a very dynamic system, generating code and values from K code and then treating those values as if they had always been there. In most other languages you would need to change the way you executed that dynamically created code/data (in Lisp, for example, you would need to use eval), but in K you treat it the same as statically created code. Or you could create a sandbox to run a function in, and then remove that branch of the K-tree and analyze it, trash it, or save it for a future invocation of the function.

The dependencies and triggers in K are implemented via attributes.
--
kx.com: 2.5 billion trades
select max price from trade takes 1 second
[ Parent ]
K (5.00 / 1) (#14)
by ff on Wed Mar 13, 2002 at 09:13:01 PM EST

Yes, I have benchmarked K. And I've seen what amounts to this same comment posted all over the place. hehe

K is not as fast as C in the benchmarks I tested. Take Ackermann's function, for example: the K version was much slower (by a large factor). Sorry, I don't have my benchmarks in front of me, but it was very slow compared to the C version. I also tested a memoizing K version, which was a whole lot faster, but as we all know, a memoizing version is not practical because it uses too much memory. Plus, even the non-memoizing version would quickly blow out and crash with "large" numbers (stack overflow). BTW, a memoizing C version absolutely crushed the memoizing K version.
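
For reference, the benchmark in question is tiny. A plain OCaml version of the standard definition looks like this, and the memoizing variant just adds a table keyed on (m, n); both are sketches of the textbook definitions, not the actual benchmark code:

  (* Ackermann's function: deeply recursive, a stress test for
     function-call overhead *)
  let rec ack m n =
    if m = 0 then n + 1
    else if n = 0 then ack (m - 1) 1
    else ack (m - 1) (ack m (n - 1))

  (* memoized variant keyed on (m, n): trades memory for speed *)
  let tbl : (int * int, int) Hashtbl.t = Hashtbl.create 1024
  let rec mack m n =
    match Hashtbl.find_opt tbl (m, n) with
    | Some v -> v
    | None ->
      let v =
        if m = 0 then n + 1
        else if n = 0 then mack (m - 1) 1
        else mack (m - 1) (mack m (n - 1))
      in
      Hashtbl.add tbl (m, n) v;
      v

  let () = Printf.printf "%d %d\n" (ack 3 4) (mack 3 8)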

Even if K were fast, it's not free. I've never understood why you would make a good language and not let people use it (Clean). It stands to reason that if people can't use it, they won't :)

[ Parent ]

Ackermann's is the worst benchmark (4.00 / 1) (#22)
by Cal Jayson on Wed Mar 13, 2002 at 09:31:37 PM EST

Ackermann's function is the worst possible world for K, and it is also a very poor benchmark. Most real-world problems can be vectorized, and the ones that cannot be are usually not speed-critical (GUI code, for example). In K you operate over bulk data, not scalars. There was a discussion about non-vectorizable problems on the K mailing list, and nobody could really come up with a completely non-vectorizable problem, though there was one problem somebody had that was not completely vectorizable. Explicit function calls are also a weakness in K, but that is usually not a problem, since when you have an array of data you typically make a single explicit function call augmented by an adverb. Take the relational database implemented in K as a real-world example of K programming productivity and operational speed.
--
kx.com: 2.5 billion trades
select max price from trade takes 1 second
[ Parent ]
OK (3.00 / 1) (#26)
by ff on Wed Mar 13, 2002 at 09:51:48 PM EST

Understandable.

How fast is K at calling external APIs? Say OpenGL, for example.

Are there any matrix-multiplication benchmark-like programs for K? I might want to test those. I actually have a very specific use I'm working towards: interactive 3D games, graphics/CAD, etc. That is why I need real-time speed close to C.

[ Parent ]

matrix multiplication and OpenGL (4.00 / 1) (#53)
by Cal Jayson on Thu Mar 14, 2002 at 12:11:00 AM EST

Actually, I did some code to provide an OpenGL-from-K module. It worked very well, but I wasn't doing anything incredibly high-performance, just some data visualization. For things like that your bottlenecks are going to be in the OpenGL library, not the application code. There are, however, two ways to do matrix multiplication in K. First, you can use the standard * operator, but that will do matrix multiplication the same way it does vector multiplication and scalar multiplication: the naive method. The second method is the _mul operator, which I believe uses a more optimized algorithm. However, if your performance is strictly limited by how fast you can multiply matrices, then you would probably want to use LAPACK integrated inside K. The K-C interface is very minimal and fast.
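
For reference, the "naive method" mentioned here is the textbook O(n^3) triple loop, which looks like this in OCaml (dimensions are assumed compatible; an optimized library would block and vectorize this rather than write it this way):

  (* naive matrix multiplication: c.(i).(j) = sum over k of a.(i).(k) * b.(k).(j) *)
  let matmul a b =
    let n = Array.length a in
    let m = Array.length b in
    let p = Array.length b.(0) in
    let c = Array.make_matrix n p 0.0 in
    for i = 0 to n - 1 do
      for j = 0 to p - 1 do
        for k = 0 to m - 1 do
          c.(i).(j) <- c.(i).(j) +. a.(i).(k) *. b.(k).(j)
        done
      done
    done;
    c

  let () =
    let a = [| [| 1.0; 2.0 |]; [| 3.0; 4.0 |] |] in
    Printf.printf "%g\n" (matmul a a).(0).(0)  (* 7 *)
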
--
kx.com: 2.5 billion trades
select max price from trade takes 1 second
[ Parent ]
K will kill any other functional language (4.00 / 1) (#64)
by Cal Jayson on Thu Mar 14, 2002 at 01:17:22 AM EST

If all you are doing is matrix multiplication, then you probably cannot beat tightly coded C (or ASM). However, if you have other considerations in the program (such as memory and data layout) then K will kill any other language out there. In K all memory allocation is hoisted as high as possible. The smallest memory allocation is a page, and it lays out all the data sequentially, so it is extremely cache- and pipeline-friendly. My roommate did some cache analysis of executing K code, and he found that his code almost never had a cache miss or a branch misprediction because of this sequential access pattern. K is an absolute natural for any numerical processing, including graphics.
--
kx.com: 2.5 billion trades
select max price from trade takes 1 second
[ Parent ]
Saw it already (2.00 / 1) (#47)
by inerte on Wed Mar 13, 2002 at 11:29:53 PM EST

Are you the same Jayson? I hope so. I gave you a 1 before realizing the posters' names were the same. Since I can't go back on a comment rating here, I am giving you a 5. But I am not sure if it's you.

Original comment on Slashdot:

http://developers.slashdot.org/comments.pl?sid=28597&cid=3075088

--
Bodily exercise, when compulsory, does no harm to the body; but knowledge which is acquired under compulsion obtains no hold on the mind.
Plato
[ Parent ]

yes (2.00 / 1) (#54)
by Cal Jayson on Thu Mar 14, 2002 at 12:14:52 AM EST

Yes, I am the same as the Slashdot Jayson. Because K is so ruling and elite, I try to spread the kospel when it is topical. I am confused, though: did you want to give me a 1 and accidentally give me a 5?
--
kx.com: 2.5 billion trades
select max price from trade takes 1 second
[ Parent ]
1 because I thought you copy/paste (2.00 / 1) (#106)
by inerte on Thu Mar 14, 2002 at 11:32:55 AM EST

I gave you a 1 because I thought you had just copied and pasted a post I saw on Slashdot. Then it hit me: it could be the same person! So I went there, did a little search, and yes, it was the same. So the 1 was unfair, and I had to change it....

--
Bodily exercise, when compulsory, does no harm to the body; but knowledge which is acquired under compulsion obtains no hold on the mind.
Plato
[ Parent ]

Price? (3.00 / 1) (#119)
by dennis on Thu Mar 14, 2002 at 01:03:08 PM EST

I looked but I couldn't find any pricing, and the download says noncommercial only. If I wanted to use K to write an application, what would it cost me? What if I wanted to embed KDB?

[ Parent ]
price (5.00 / 1) (#148)
by Cal Jayson on Fri Mar 15, 2002 at 03:43:26 PM EST

I think that KDB is $25,000 to $50,000 depending on the license. There is also a parallel, distributed version of KDB that might be priced slightly higher. You would save that on hardware costs alone. I do not know if there is a K only license, though. Arthur is very lenient on pricing and has even expressed interest in open-sourcing K if there was enough demand and community.
--
kx.com: 2.5 billion trades
select max price from trade takes 1 second
[ Parent ]
Hmmm (none / 0) (#133)
by greenrd on Thu Mar 14, 2002 at 03:22:20 PM EST

Looks like a write-only language to me.

Not that I know what K programs would look like to a seasoned K programmer - but you have to take into account the cost of finding/training programmers who will be able to understand and maintain K code. TCO for programming languages?

I'm not convinced that extremely terse languages are a good idea. When I was a teenager I had trouble understanding some old QBasic code I'd written when returning to it a few months later - and QBasic is not especially terse!

Code input should be shortened with keyboard shortcuts etc., but code output (i.e. display, printing etc.) should be - to an extent - optimised for readability.


"Capitalism is the absurd belief that the worst of men, for the worst of reasons, will somehow work for the benefit of us all." -- John Maynard Keynes
[ Parent ]

The benefit of terse code. (none / 0) (#145)
by Cal Jayson on Fri Mar 15, 2002 at 02:12:34 PM EST

As I said, I came from a Scheme, Java, and C background so I was used to very verbose code. However, once you become used to seeing K it becomes easier to read.

First, as shown in the original post, K is directly translatable to English and can be read aloud. My roommate and I were able to carry on conversations and describe algorithms to each other in spoken K after a month or two of hacking in it. Since there is less to say, there is less to keep juggling in your head, and ultimately you begin to visualize a very clean picture.

Second, being able to see all the specifics of an algorithm on one screen makes things much easier to build, read, and debug. I can stare at a line of K and zone out while trying to understand it; everything is in front of me. My concentration is not broken by flipping between pages of code; I can see all the variables and functions being used and how they are defined. This led me to write much simpler code, as I didn't write unnecessary code but only the code that I needed, since I could always see exactly what I needed.

Third, even if you do not believe in the benefits of code compression, there is still a point at which you write so much less code that you are willing to put up with it. If you had to write 10 times less code, or 20 times less, would you put up with it? My roommate took 1200 lines of Java code and squished it into 6 lines (of 120 columns, not 80; he also got an order-of-magnitude performance increase). Now you have less code to manage.

Fourth, with less code you need fewer people. Most K projects are very large-scale systems, but most K project teams are only a handful of people, because there is so much less code to write. Would you rather have to find 10 good programmers to write and maintain your code, or just find 1 K programmer who could write the code and spend only 10% of his time maintaining it (or you could just hire Arthur -- the K creator -- for a day to do the entire project, probably)?

Fifth, K4, the next version of K, will allow you to write English-like code if you want.
--
kx.com: 2.5 billion trades
select max price from trade takes 1 second
[ Parent ]
Thanks (none / 0) (#146)
by greenrd on Fri Mar 15, 2002 at 03:09:21 PM EST

Fifth, K4, the next version of K, will allow you to write English-like code if you want.

Now that will be cool. I'll bookmark it and be sure to check back on it later.


"Capitalism is the absurd belief that the worst of men, for the worst of reasons, will somehow work for the benefit of us all." -- John Maynard Keynes
[ Parent ]

I'll post when it happens (none / 0) (#147)
by Cal Jayson on Fri Mar 15, 2002 at 03:38:02 PM EST

Don't worry, I'll post a K5 story when K4 comes out.
--
kx.com: 2.5 billion trades
select max price from trade takes 1 second
[ Parent ]
Having written a Fortran compiler, (4.00 / 1) (#16)
by acronos on Wed Mar 13, 2002 at 09:20:18 PM EST

It is not nearly as easy as you make it sound. C was specifically written to mirror how the problem would be solved in assembly: almost all assembly commands are available in C, and most of the time you are working with the real way the data is represented in memory. High-level languages have a different purpose: they are written to provide the easiest human programming interface. They simplify data typing. Even simple objects such as strings are not easily represented in computer memory. Compilers simply are not as smart as humans most of the time; people who get their hands dirty with the details can almost always outsmart the compiler. The more distance you put between how the computer thinks (assembly) and how a human thinks (a very high-level language), the more the compiler is doing your thinking for you.

Another serious problem with your post is JIT. Precompiled languages will always be significantly faster than languages that compile at run time. Surely it is obvious that the compiler takes processing time.

I agree that programmers of high-level languages could do better. But why dedicate the tremendous resources? Put the speed work into languages that are already optimized for speed. Put the time, when developing high-level languages, into making them easy to use. Most programmers use the correct tools for the project. If you need speed, write in C or assembly. If you need fast development time, then choose the high-level language that best fits your needs. Learning a new language is easier than making the wrong tool fit a square hole. It is not uncommon for a programmer to know 15 languages, with 3 or 4 that they know REALLY well. C and assembly are good languages to know well, IMHO, because they teach you how the computer thinks. Visual Basic, Java, and a good scripting language are also high on my list. Each serves a different purpose, and I use them all.


Low Level Rider.... (3.66 / 3) (#18)
by VoxLobster on Wed Mar 13, 2002 at 09:22:50 PM EST

C is fast because it doesn't take care of all the stuff that you should be doing yourself. Perl does everything for you, and that's why it's slow; C doesn't, and it's fast. The main thing is good programmers. High-level languages tend to promote less skilled programmers, because the language shields you from the stuff that makes programmers really good. If you want a fast program, you're going to have to buckle down and learn a low-level language.

VoxLobster
I was raised by a cup of coffee! -- Homsar

for some values of 'really good' (4.00 / 2) (#20)
by jacob on Wed Mar 13, 2002 at 09:29:18 PM EST

Your argument rests on the idea that eking out all possible performance from a program is the only worthwhile skill a programmer can have. Very high-level programming languages encourage other good qualities in programmers: writing code that is well-abstracted, easily understood, modified, and extended -- and writing it quickly. Sometimes (most of the time, I'd say) that's more important than getting every last drop of efficiency from your code.



--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
the problem (3.00 / 1) (#27)
by ff on Wed Mar 13, 2002 at 09:57:08 PM EST

The problem is that I actually am trying to get every last bit of performance. I have a very narrow range of software I'm targeting: interactive 3D games, graphics/CAD, etc. I need real-time performance. I want to replace C with a functional language so I can get the high-level constructs (and safety) where I need them. I know I could use the high-level language to call C functions for speed, but having to call from the higher language into C is an even more serious performance hit (at least in O'Caml calling out to, say, OpenGL).

[ Parent ]
Functional language to C interfaces (4.00 / 3) (#44)
by jacob on Wed Mar 13, 2002 at 11:09:59 PM EST

Have you considered organizing your program to minimize the number of these calls necessary? For instance, you could build a "scene" data structure in OCaml, then pass the whole thing to a native function that's responsible for 100% of the rendering. Alternately, the program could start as a C program and call into OCaml to do some of the more abstract stuff.
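
A minimal sketch of what that boundary might look like in OCaml. The scene type, the stub name "caml_render_scene", and the C function behind it are made up for illustration; this will not link without the corresponding C stub written against OCaml's C interface:

  (* OCaml side: declare the C stub once, then hand a whole scene
     across the boundary in a single call *)
  type vertex = { x : float; y : float; z : float }
  type scene = vertex array

  (* hypothetical C stub registered under the name "caml_render_scene" *)
  external render_scene : scene -> unit = "caml_render_scene"

  let () =
    let scene =
      Array.init 3 (fun i ->
        let f = float_of_int i in { x = f; y = f; z = 0.0 })
    in
    render_scene scene  (* one crossing per frame, not one per vertex *)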

You may already be doing this, in which case I'm not sure what more to tell you.



--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
hmmm (3.00 / 1) (#28)
by ff on Wed Mar 13, 2002 at 09:59:23 PM EST

I don't think I need to "buckle down and learn a low level language". I've been programming in C for more than 15 years and assembly for longer (we're talking Z80 days).

I consider functional languages fairly low-level, but better than C. That's been my whole point: I think functional languages could replace C, both in terms of speed and in terms of better code.

[ Parent ]

pointless comment from a guy who knows 0 (4.50 / 2) (#19)
by Anonymous 242 on Wed Mar 13, 2002 at 09:23:27 PM EST

I can see how one could argue that compiling to C imposes an artificial speed limitation on functional languages. Intuitively (rightly or wrongly) it seems reasonable that no language that compiles to C will ever be faster than C.

On the other hand, there is no prima facie reason I'm aware of that would dictate that a language that compiles to C ought to be slower than pure C. If most functional languages are slow when compiled via C, that would indicate to me a problem with the implementation and optimization of the compiler, not with the language itself or with the fact that the language compiles to C.

If I recollect correctly, most Eiffel implementations compile to C and are comparable to C in speed. (Of course, Eiffel is not a functional language.)

As a secondary point, on most computers and for most purposes, the speed of compiled code relative to C is a straw-man argument. Speed is in the eye of the beholder: if a program is "fast enough" for a given user, it is irrelevant how fast it would be if written in this language or that language in comparison to C. Past the threshold of usability, speed only matters in certain situations.

Regards,

Lee Irenæus Malatesta

ps: it's a shame this article is getting voted down. Whatever its merits may or may not be, it is an interesting topic for discussion.

true but (none / 0) (#24)
by ff on Wed Mar 13, 2002 at 09:47:10 PM EST

Does using C not impose artificial limitations on the language? Things like recursive calls are very slow in C; that's a serious limitation for an intermediate language. I know you should be able to get around this by using iteration, but how many other things are there like this? Like you said, though, this may just be a problem with the compiler that is creating the C code.

I have benchmarked Eiffel and found it to be much slower than C. Of course I was using the se compiler and not a commercial product. The language shootout has some benchmarks that show how slow se can be. :( I'm not a big fan of Eiffel BTW... hehe

True, speed depends on what you are doing; that's part of the reason I mentioned that I also use Perl and other "slow" languages when appropriate. I actually don't care if a functional language is slow in some areas. But when I'm running a simulation or a game, it needs to be fast, C fast, in the engine. Of course I could use C just for those areas, but the interface between C and <pick your other language> tends to be an additional layer of slowness. In my own testing the O'Caml OpenGL stuff is 2-3 times slower than the same C code, almost solely due to all the language layers. 2-3 times slower is too much when doing interactive 3D stuff that is pumping billions of bits around in real time. I would like to replace C with a functional language.

[ Parent ]

OpenGL usage? (none / 0) (#34)
by arjan de lumens on Wed Mar 13, 2002 at 10:15:50 PM EST

Sounds a little weird to me that OpenGL performance should be held back so much by function call overhead - in particular if you are using features like vertex arrays and display lists..? OpenGL should normally be bottlenecked by the graphics card unless you do a really large amount of CPU calculation per frame.

[ Parent ]
OpenGL (none / 0) (#38)
by ff on Wed Mar 13, 2002 at 10:37:41 PM EST

OpenGL is probably a bad example of an external API, but... For the most part you are right; however, at certain times you take a big hit when making lots of GL calls. Building a large display list is one example. Hopefully you don't do that too often, but it will slow down when you do. It's a user-perception thing, and it really depends on what you are doing. I tend to work with software that has a lot of user interaction and pumps billions of bits around (graphics, etc.). I have not benchmarked vertex arrays; I will soon, and see how they are (I'm wondering how fast O'Caml will actually be able to update the memory). I haven't had a whole lot of time to play lately.

[ Parent ]
Not using vertex arrays !?!? (none / 0) (#39)
by arjan de lumens on Wed Mar 13, 2002 at 10:44:37 PM EST

Pretty much all graphics cards have their OpenGL drivers strongly optimized around vertex array usage - with vertex arrays and proper locking (NV_vertex_array_range on nvidia cards) you might well be looking at a 20x performance increase; up to 40 MPolys/sec should be attainable on a GeForce4 this way.

[ Parent ]
Not over a display list (none / 0) (#42)
by ff on Wed Mar 13, 2002 at 11:03:35 PM EST

I've found display lists to be even faster for static data.

I do use vertex arrays, but I also use display lists.

Since I'm new to the functional scene I have only had time to benchmark the display list stuff, eventually I'll get to the vertex arrays in a functional language (probably O'Caml).

[ Parent ]

Vertex arrays and display lists (none / 0) (#48)
by arjan de lumens on Wed Mar 13, 2002 at 11:30:33 PM EST

You could still keep call overhead down by using vertex arrays during the construction of a display list, if you aren't doing this already.

In any case, if you are not using vertex arrays wherever possible, you will suffer substantial function call overhead whether you work in C or OCaml (albeit more severely in OCaml, due to the OCaml->C translation)..

[ Parent ]

se? (none / 0) (#35)
by Anonymous 242 on Wed Mar 13, 2002 at 10:15:51 PM EST

Small Eiffel? The GPL Eiffel?

How about Tower Eiffel or one of the other commercial ventures?

[ Parent ]

Right (none / 0) (#37)
by ff on Wed Mar 13, 2002 at 10:29:28 PM EST

Yeah, like I said, I haven't used a commercial Eiffel. It could very well be much faster.

There could even be a functional language out there that is as fast as C and uses C as an intermediate language. I wouldn't rule out the possibility, but I haven't seen it yet. I don't know how much work has been done on commercial functional languages, and I have not benchmarked or tried Clean (I will soon, but I don't like the fact that it's not open/free).

[ Parent ]

By the same logic (none / 0) (#79)
by QuantumG on Thu Mar 14, 2002 at 07:06:00 AM EST

As every language compiles to machine code, no language can ever claim to be faster than any other language. That's a pretty silly claim, you must agree. So what is it that makes one language "faster" than another? It's the level of optimisation that can be performed during the translation of source to machine code. In the case of functional languages compiling to C, you can do more analysis than you can if you are starting from C, so really the code should go faster. I think the reason functional languages are not fast is simply that the compilers are not doing many optimisations.

If you want to make the-fastest-language-on-earth then I think you should start with the fastest C compiler you can find (because that's what we can compile fast these days) and ask, "now, if only I knew [whatever] about the source code, I could perform this optimisation, resulting in faster code", and then add language support for the programmer to tell you that detail. This is what we do in our C compiler: we give the programmer the opportunity to give us "hints" which let us do optimisations that may otherwise change the meaning of the program. Taking this to its logical conclusion, a super-fast language would be one where you just specify what you want to do, not how you want to do it, and the compiler selects the fastest algorithm based on hints you put in the source.
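
O'Caml already has a small example of this kind of hint: array accesses are bounds-checked by default, and the programmer can assert that an index is known to be safe. A sketch:

  (* Default: a.(i) checks i against the array length on every access. *)
  let get a i = a.(i)

  (* The programmer's "hint" that the index is in range: the check is
     skipped, which changes the meaning of the program if the hint is wrong. *)
  let get_unchecked a i = Array.unsafe_get a i

Array.unsafe_get is a real O'Caml primitive; the point is that the hint trades safety for speed in exactly the way described above.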

Gun fire is the sound of freedom.
[ Parent ]
That isn't the same logic (none / 0) (#103)
by Anonymous 242 on Thu Mar 14, 2002 at 11:11:04 AM EST

I did not make a claim comparable to your argument to absurdity. I claimed that any language compiled to C is limited by the speed of execution of C. This is true. A comparable claim is that any language that compiles to assembler is limited by the speed of execution of assembler. This is also true.

This does not mean that some languages that compile to C (or assembler) cannot be faster than others. This only means that a language that compiles to C (or assembler) will never be faster than C (or assembler) because it must compile to C (or assembler).

This doesn't mean that the C spit out by the compiler of the other language might not be faster than some implementations of the same program written in C. It only means that the absolute fastest code produced by the compiler is still C code and therefore can never be faster than C code because it is C code.

Regards,

Lee Irenæus Malatesta

[ Parent ]

Sorry (none / 0) (#107)
by QuantumG on Thu Mar 14, 2002 at 11:33:07 AM EST

I thought you were implying that code written in a language that compiles to C can never be faster than the same code written natively in C. My point was that a compiler that compiles to C can generate more optimised code than if that code had been written in C originally. Simply because the code can be optimised at a higher level of abstraction.

Gun fire is the sound of freedom.
[ Parent ]
The difference is illusory (none / 0) (#141)
by Anonymous 242 on Fri Mar 15, 2002 at 12:23:01 AM EST

I thought you were implying that code written in a language that compiles to C can never be faster than the same code written natively in C. My point was that a compiler that compiles to C can generate more optimised code than if that code has been written in C originally.
A "native C" program can theoretically achieve the same optimizations as any program written in a higher level that compiles to C. The higher level language must be able to express itself in C, which means that a "native C" program is capable of the same optimizations.

Certainly it is quite conceivable that a compiler would be better at optimizations than many (or, more accurately, most) programmers. That said, in a world where hand-tuned assembler by an assembler guru is almost always faster than assembler produced by a compiler . . .

Bear in mind, that I am not one of those programmers that writes insanely fast C. At best I'm a mediocre programmer.

Regards,

Lee Irenæus Malatesta

[ Parent ]

That's a different problem (none / 0) (#142)
by QuantumG on Fri Mar 15, 2002 at 06:08:04 AM EST

Optimising assembly by hand is a different level of optimisation from what a C programmer must perform to increase the speed of a program. For example, replacing an O(n**2) algorithm with an O(n log n) algorithm is not something anyone expects a compiler to perform, yet if the language it was written in were high-level enough, perhaps it could.

Gun fire is the sound of freedom.
[ Parent ]
Regardless (none / 0) (#144)
by Anonymous 242 on Fri Mar 15, 2002 at 12:00:13 PM EST

It is a problem of implementation of a program in a given language by either the compiler or the programmer, not a problem inherent to the language being compiled to.

[ Parent ]
Slowness reputation (4.00 / 1) (#23)
by marx on Wed Mar 13, 2002 at 09:45:01 PM EST

I think the slowness reputation comes from the recursive programming, not the code generation. A typical recursive function can look like this:

fun f(list l) =
 if(empty(l))
  emptyList
 else
  makeList(
   someTransformation(head(l)),
   f(tail(l)))

For every level of recursion, a list must be created, i.e. there must be a lot of "temporaries".

I don't know if there are some smart ways to optimize this away, but I don't think so. You could write an iterative version which could be reduced to the optimal assembly code by a compiler, but if you can't use recursion there isn't much point in using a functional language.

Join me in the War on Torture: help eradicate torture from the world by holding torturers accountable.

Slowness reputation (1.50 / 2) (#29)
by Qarl on Wed Mar 13, 2002 at 09:59:51 PM EST

I'm not sure it's "functional languages" that have a "slowness reputation" here...

Cheap shot, but I can't resist a play on words.

Simple recursion like you've just posted, where the only recursive call is the very last step of the iteration, is called "tail recursion". Every decent, self-respecting functional language implementation optimizes this as simple, for-loop-style iteration.

More importantly -- and this is addressed not just to the poster I'm replying to, but to everyone -- if you don't know what tail recursion is, or if your knowledge STOPS there, do NOT post to discussions on functional languages. You simply do not know what you are talking about. Which is fine. On other topics, I don't know what I'm talking about either.

(Of course you CAN post, just please formulate posts as questions and not assertions.)


--Carl
[ Parent ]

tail recursion (none / 0) (#33)
by Delirium on Wed Mar 13, 2002 at 10:12:14 PM EST

Do functional language compilers actually optimize tail recursion themselves, or does the programmer have to do it? It would seem that from an elegance perspective tail recursion is sort of against the spirit of functional programming (i.e. treating functions somewhat like math functions), but if a programmer can write code without dealing with it and then have the compiler optimize it when possible, that would be nice.

FWIW I believe GCC will do tail-recursion optimization in some limited cases for recursive C code. The one functional language I've used ("rex," some research language based on lists) didn't, but obviously that's not much of a sampling.

[ Parent ]

Scheme requires tail-recursion elimination (none / 0) (#63)
by Cal Jayson on Thu Mar 14, 2002 at 01:05:30 AM EST

The Scheme R*RS requires that implementations be properly tail recursive -- that is, eliminate all tail calls, not just self-recursive ones. Almost all implementations do this (there is a Java-based system that still has problems being completely compliant, though). Any good Lisp system will do this, too, but it is not required of them.
--
kx.com: 2.5 billion trades
select max price from trade takes 1 second
[ Parent ]
Tail recursion (5.00 / 1) (#40)
by marx on Wed Mar 13, 2002 at 10:46:05 PM EST

I know what tail recursion is. I've never looked at functional language compiling or interpreting though.

I'm surprised by what you say. In my experience with functional languages, I think the only time I've had to use anything other than tail recursion is for operating on trees, or for computing the Fibonacci series.

This would mean that practical functional language programming could be compiled to near-optimal assembly.

Since you posted the way you did, you obviously are an expert on functional language compilers. It would be nice if you could then explain why functional languages cannot compete in speed with imperative languages, instead of biting my head off.

Join me in the War on Torture: help eradicate torture from the world by holding torturers accountable.
[ Parent ]

Re: Tail recursion (none / 0) (#129)
by Qarl on Thu Mar 14, 2002 at 03:07:52 PM EST

Okay, so I overreacted (and as pointed out below, screwed up my analysis as well). But I stand by my point, both in your case and in general. As you say, you have not looked at functional language compiling or interpreting. So please don't say things like "I don't think there's any optimization that will remove the need for a recursive stack here" -- for all you know, there is, and it's used everywhere. The specific case you gave isn't quite obviously tail-recursive, but it can be optimized away -- I'll discuss how in my response to Jacob.

And on my end, I'll refrain from biting people's heads off in the future.


--Carl
[ Parent ]

You want to know something funny? (5.00 / 1) (#46)
by jacob on Wed Mar 13, 2002 at 11:23:57 PM EST

... if you don't know what tail recursion is, or if your knowledge STOPS there, do NOT post to discussions on functional languages. You simply do not know what you are talking about.

Are you sure you mean that? Because if you are, you'd better not post to this thread anymore ... :)

The function the original poster made did recur on itself, but it wasn't tail recursion -- see the "makeList" call waiting to compute until after the recursion is finished? Yep. That's what's in tail position, not the recursive call. It's true that you could convert it to a fully tail-recursive function, but such a conversion wouldn't get rid of the inefficiency the original poster ascribed to it (partly because that inefficiency doesn't in fact exist for reasons completely divorced from tail calls, as I'll explain in a followup to his message shortly).
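
To make that conversion concrete, here is roughly how it looks in O'Caml, keeping a stand-in for the original post's someTransformation placeholder:

  let someTransformation x = x + 1  (* stand-in for the placeholder *)

  (* Not tail recursive: the cons still has work to do after
     the recursive call returns. *)
  let rec f l =
    match l with
    | [] -> []
    | x :: rest -> someTransformation x :: f rest

  (* Tail recursive: the recursive call is the last thing that happens,
     so it can be compiled as a loop -- at the cost of building the
     result backwards and reversing it at the end. *)
  let f' l =
    let rec loop acc = function
      | [] -> List.rev acc
      | x :: rest -> loop (someTransformation x :: acc) rest
    in
    loop [] l

Note the tail-recursive version still allocates one cell per element, plus a List.rev pass on top -- which is why the conversion doesn't buy you what you might expect.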



--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
Re: You want to know something funny? (none / 0) (#131)
by Qarl on Thu Mar 14, 2002 at 03:14:08 PM EST

Oops.

The function given there is basically just "map" (for those unfamiliar with it, a general-purpose function that shows up in pretty much every functional language and iterates over a list, applying a transformation function). Since I've seen map in so many places, I'm used to assuming it has a linear-time, no-stack implementation. I forgot that this is not because of simple tail-recursion elimination, but because the makeList function referred to above is generally a language built-in, so even if it is part of the call stack, the whole thing can be optimized into a function that simply creates the new list iteratively.

Oh well, put my foot in my mouth. At least I didn't get all the way to my ankle this time.


--Carl
[ Parent ]

Not a problem (5.00 / 1) (#50)
by jacob on Wed Mar 13, 2002 at 11:51:38 PM EST

Actually, the problem of temporaries doesn't exist here at all. A typical implementation of MakeList will define a "List" as either a pair (roughly, a two-pointer-wide structure) whose first pointer points to the given item and whose second pointer points to another List, or a special value representing the empty list. MakeList, then, just allocates memory for a new one of these structures, copies its first argument into the first pointer and its second argument into the second pointer. So, in the example you give, every MakeList call is necessary because it allocates one cell of the result list -- though it seems like you're making lots of temporary lists that you're just throwing away, in fact you're not.

So your problem statement isn't actually a problem, but there is, as others have pointed out, a way to convert recursive code to iterative code anyway. It won't help alleviate the problem you're talking about, though: if you did in fact use a Copy-The-Whole-List-And-Prepend-This-Element function instead of the implementation I've described, it'd still be slow even in assembly language.

Interestingly, there are a couple of other reasons why this code you've posted will very likely always be slower than the equivalent C program. They are both arcane, and both stem from the fact that C uses arrays wherever possible, whereas functional languages tend to use linked lists. First, linked lists cost more to traverse in order than arrays (an extra dereference, plus possible cache issues). Second, the list based version allocates memory "as it goes": if there are N elements in the list, there will be N separate calls to the memory allocator. C code using arrays would likely just allocate all the memory up front by precomputing how long the result would be, or (more likely for C people) just modify the array in-place and destroy the original value in favor of the new one. Either of these options would be faster.
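
In O'Caml terms, the representation I'm describing is just this (the names are mine; the built-in list type works the same way):

  (* A list is either empty or a single cell: one value plus a pointer
     to the rest of the list. *)
  type 'a mylist =
    | Empty
    | Cell of 'a * 'a mylist

  (* makeList allocates exactly one cell -- no copying of the tail. *)
  let make_list x rest = Cell (x, rest)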



--
"it's not rocket science" right right insofar as rocket science is boring

--Iced_Up

[ Parent ]
What's your array test? (5.00 / 1) (#30)
by Lord of the Wasteland on Wed Mar 13, 2002 at 10:08:33 PM EST

I'd be interested in seeing the O'Caml and C programs you used for your array access comparisons. Also, I haven't used O'Caml all that much -- if you can get the intermediate C that it generates, that would be interesting for comparison purposes as well. I would expect that O'Caml, even with unsafe array accesses, is still doing a lot of other things not represented in your C version.

Also, what do you mean when you talk about "going through multiple layers of C code" to interface to external programs? Are you saying you think direct translation to assembly would allow a more efficient interface?

PS

I gave the article a -1 because I predicted that most of the posts that actually responded to your article (as opposed to all the ones that just defended functional languages against attacks you weren't making) would focus on the advantages of reusing existing optimizing C compilers. How much you lose by compiling to C first is highly dependent on the quality of the C that you generate and the strengths/weaknesses of your optimizing C compiler. I didn't think enough people would be able to make substantive contributions to that kind of discussion. Since this article is on its way up now, please prove me wrong.

Array tests (none / 0) (#41)
by ff on Wed Mar 13, 2002 at 10:59:21 PM EST

For the array tests I was using the ary3 (gcc ocaml) and matrix multiplication (gcc ocaml) tests from the language shootout. Note that in my own testing I've found the shootout benchmarks to not always be accurate, and I believe the reason is that he doesn't run the tests long enough to be able to compare anything. In my own benchmarking I've found that you need at least 10-20 seconds of run time to make any useful comparisons. The OpenGL tests I did were using my own benchmark; I haven't posted them anywhere.
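
Something like this minimal O'Caml harness is enough (a sketch, not my exact test code; work is whatever you're measuring):

  (* Repeat the workload until at least ~20 seconds of CPU time have
     elapsed, then report the time per iteration. *)
  let benchmark work =
    let start = Sys.time () in
    let iters = ref 0 in
    while Sys.time () -. start < 20.0 do
      work ();
      incr iters
    done;
    let elapsed = Sys.time () -. start in
    Printf.printf "%d iterations in %.2fs (%.6fs each)\n"
      !iters elapsed (elapsed /. float_of_int !iters)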

I also would like to look at the C code O'Caml generates. I haven't looked very hard into how, or whether, you can do that. I assume there must be a way.

When I said "multiple layers of C code" I was referring to the C code that gets generated from your orginal source which has to call more C code that was generated from the O'Caml to C interface stuff, which then calls the OpenGL API (if you're lucky, there may be more layers in there). This is a problem in just about every language (other than C/C++ or assembly). Most of the time you don't worry about it because whatever language you are using is so much slower than C that it doesn't matter, but in this case (O'Caml) it's annoying because the language is otherwise very fast. I don't know if this issue would benefit much from the direct assembly code generation.

[ Parent ]

Interface stuff (none / 0) (#43)
by Lord of the Wasteland on Wed Mar 13, 2002 at 11:08:44 PM EST

When I said "multiple layers of C code" I was referring to the C code that gets generated from your orginal source which has to call more C code that was generated from the O'Caml to C interface stuff, which then calls the OpenGL API (if you're lucky, there may be more layers in there). This is a problem in just about every language (other than C/C++ or assembly). Most of the time you don't worry about it because whatever language you are using is so much slower than C that it doesn't matter, but in this case (O'Caml) it's annoying because the language is otherwise very fast. I don't know if this issue would benefit much from the direct assembly code generation.

Since O'Caml already compiles to C I would be very surprised if there were much overhead in making an external API call. After all, since it's already running in C, a C runtime (heap, stack) is already set up. The side effects of the call shouldn't require any processing in the O'Caml environment.

[ Parent ]

overhead (none / 0) (#45)
by ff on Wed Mar 13, 2002 at 11:13:47 PM EST

Yeah, I was surprised too. The OpenGL calls were 2-3 times slower than the C versions. I need to test some other APIs; it could be because of the way the OpenGL hookup is written.

[ Parent ]
OCaml : probably the best you can get (5.00 / 2) (#55)
by arjan de lumens on Thu Mar 14, 2002 at 12:15:16 AM EST

Apparently, from looking at the source code of the OCaml compiler, it compiles straight to assembly code rather than going through a C compiler (yes, I know I referred to an OCaml->C translation elsewhere - error on my part). It seems to support a lot of platforms besides x86, too, like IA64, Alpha, ARM, PowerPC, etc. And the entire compiler - which seems to be doing a lot of optimizations - is written in OCaml as well.

It appears from the discussion here that OCaml is a bit slow at calling external C libraries and doing arrays, compared to C. I haven't yet looked at the actual code that the compiler is producing...

I don't know (none / 0) (#61)
by ff on Thu Mar 14, 2002 at 12:55:35 AM EST

I had thought the same thing when I first started using O'Caml. Then I found the "-cc" and "-ccopt" options and figured it was using the C compiler like everything else I had seen. So now I'm confused. The -Ox options (and others) do have an impact on the resulting O'Caml executable. How does it optimize using the C compiler if it is creating assembly directly? Is there a way to optimize object files? (I didn't know that was even possible, but I'm willing to learn.)

If it's not using C as an intermediate then that would definitely be a huge plus in my book. Again, I'm not one who knows a lot about language design, so I didn't bother looking at the O'Caml compiler source (although I guess I should).

Maybe I should post on the O'Caml list. hehe

[ Parent ]

OCaml (none / 0) (#69)
by i on Thu Mar 14, 2002 at 03:11:30 AM EST

can compile to C and straight to assembly. Isn't life wonderful? :)

and we have a contradicton according to our assumptions and the factor theorem

[ Parent ]
How? (none / 0) (#83)
by ff on Thu Mar 14, 2002 at 08:37:30 AM EST

How to compile to one or the other?

[ Parent ]
Sorry. (none / 0) (#92)
by i on Thu Mar 14, 2002 at 09:31:59 AM EST

I lost the OCaml distro when switching machines; it's been a while since I played with it. From their page it appears that it does not compile to C, only to various assemblies. Just to check, I downloaded the distro and -- lo and behold! -- it is indeed the case. Just look in the compiler source, e.g. in the ocaml-3.04/asmcomp/i386 directory.

and we have a contradicton according to our assumptions and the factor theorem

[ Parent ]
the OCaml compiler (none / 0) (#70)
by arjan de lumens on Thu Mar 14, 2002 at 03:18:59 AM EST

Apparently, it invokes the C compiler for any *.c files you might specify on its command line - "-cc" selects this compiler/linker and "-ccopt" passes an option to it. This is useful for compiling and linking complete mixed C/OCaml programs with a single command line. It still appears that *.ml files are compiled directly to assembly without any intermediate C stage, though.

I am a little bit surprised that -Ox affects the compiler, as no such switch seems to be documented anywhere - but the compiler seems to have a bunch of undocumented switches anyway.
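
So a single command line like this (file names hypothetical) should compile the C stubs with gcc -O2, compile the *.ml files natively, and link the result:

  ocamlopt -cc gcc -ccopt -O2 stubs.c main.ml -o prog

The -ccopt option only ever reaches the C compiler, i.e. it affects stubs.c and the link step, not the code generated for main.ml.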



[ Parent ]

no (none / 0) (#81)
by ff on Thu Mar 14, 2002 at 08:29:24 AM EST

If you compile your O'Caml program with "-ccopt -O2" the executable will be faster than if you compile without that option. If you use the option "-cc icc" (using Intel's compiler) it will also be faster.

It is using the C compiler for something.

[ Parent ]

linking (none / 0) (#95)
by hading on Thu Mar 14, 2002 at 10:02:03 AM EST

According to the ocamlopt documentation, a C compiler is needed for doing the final linking step to make an executable (or of course, as has been pointed out, if any *.c files are to be compiled and included). I don't know if it's doing any more than that. As a test, install OCaml on a machine that doesn't have a C compiler, write a simple program, and compile it with ocamlopt. When I tried this, it couldn't produce a final executable, but did produce an asm file which appeared to represent my program.

In general, I think implementations that use C as an intermediate form will say so, so I don't think it's good to assume that they do when that's not the case.



[ Parent ]
alright (none / 0) (#100)
by ff on Thu Mar 14, 2002 at 10:27:34 AM EST

This is mostly due to misinformation on my part. All the shootout tests are passing (apparently pointless) options to the C compiler.

Combine that with the fact that it takes C compiler options at all and down the path of ignorance I go. :)

[ Parent ]

OCaml and gcc (none / 0) (#97)
by arjan de lumens on Thu Mar 14, 2002 at 10:11:07 AM EST

Did some testing - if the default C compiler (gcc on my system) is removed/renamed, the OCaml compiler can still be made to produce assembly code using the "-S" switch, but it then stops, complaining that gcc could not be found during the linking stage. And compiling with and without "-ccopt -O2" results in exactly the same executable file (as compared with diff) with the same performance on my system...

[ Parent ]
true (none / 0) (#99)
by ff on Thu Mar 14, 2002 at 10:25:03 AM EST

That seems to be true; I think I was getting confused between using ICC and GCC.

I think that is pretty cool. So now I'm thinking the difference in speed is due to O'Caml not being able to optimize as well as a C compiler. That is something that can be fixed, so that is very good news.

BTW, after playing with Clean I've found it seems to produce assembly directly as well. Too bad their target is mostly Windows. Well, that and the compiler isn't free.

[ Parent ]

Forth (4.00 / 1) (#56)
by ucblockhead on Thu Mar 14, 2002 at 12:16:27 AM EST

You might want to look into Forth. While not, strictly speaking, a functional language, it is very similar in style to one. Variables are rare, most everything is done on the stack, and recursion is king. It is both a very low-level language (lower level than C even) and also a faster one (significantly faster than C). If I were going to compile a functional language into an intermediate stage, it would be Forth, not C.

Forth is simple enough that a subset can even be burned on silicon. HP calculators are programmed with a subset of Forth.
-----------------------
This is k5. We're all tools - duxup

Forth (none / 0) (#62)
by ff on Thu Mar 14, 2002 at 01:00:58 AM EST

I have looked at Forth in the past. I have not benchmarked anything in Forth that was even remotely close to C speed. However, I have not used a commercial Forth compiler, only bigforth and gforth.

The language shootout has some big/gforth benchmarks if you are interested. Take these with a grain of salt because they are not always accurate. Do the tests yourself and use inputs that make it run for more than 10 seconds or so.

[ Parent ]

The ok> prompt in Sun machines... (none / 0) (#80)
by Tezcatlipoca on Thu Mar 14, 2002 at 08:01:29 AM EST

is ready to accept Forth.

Sun machines are running a fully functional Forth interpreter while in the ok> prompt.
---
"At eighteen our convictions are hills from which we look;
at forty-five they are caves in which we hide." F. Scott Fitzgerald.
[ Parent ]
Forth faster... (none / 0) (#155)
by joto on Thu Mar 21, 2002 at 08:18:22 AM EST

I've heard Forth types claim this, but it is completely beyond any kind of scientific reasoning, and certainly not supported by any benchmarks. I've yet to see a good example of well-written Forth code outperforming well-written C code.

Forth is typically implemented by a threaded interpreter. That would at most give you approximately 50% of C's speed, but you will get very compact binary code, meaning it can all live in cache. So far, so good.

The alternative is to compile to various degrees of real machine code to speed things up. The more you approach a real machine-code compiler, the more inlining, etc., happens, and you will have approximately the same code size as what a C compiler would output, and about the same speed.

One problem with compiling Forth is that you artificially limit yourself to a stack-based model instead of the register-based one that all modern computers use (though of course a real compiler can allocate registers for you inside a colon definition, and apply smart algorithms to minimize or avoid stack manipulation).

And then there are the differences in calling conventions. Most C calling conventions involve putting some of the variables in registers, meaning less messing around with the stack. The semantics of Forth basically dictate that everything should be on the stack between calls. But sure, you can have smarter calling conventions in Forth as well (such as letting the top 10 elements of the stack be allocated to registers).

In the end, there is no reason why Forth shouldn't be as efficient as C (both are languages artificially constrained to a machine model that was reasonable in the 60s or 70s, but hardly typical of today's computers). But there is certainly no reason for it to be more efficient either, and the competitive market for C compilers makes it much easier to find a good optimizing C compiler than an equally good Forth compiler. When you look at actual results from real benchmarks, this is typically what you see.

[ Parent ]

FP languages optimize the programmer (5.00 / 6) (#66)
by tmoertel on Thu Mar 14, 2002 at 01:23:43 AM EST

Most functional programming languages are designed not for execution speed but for computer-language research and, ultimately, programmer efficiency. While in the past this design emphasis made functional programming languages too slow and resource intensive for all but academic work, today it makes them great for most real-world problems.

Think about it. Ten years ago the scarce resources were processor time and memory. It wasn't uncommon to write programs in assembly language in order to use these resources effectively. Today it's a different story. It's hard to buy a CPU slower than 800 MHz, and memory is dirt cheap. Today, very few programs are written in assembly languages because overall it's more efficient to use a higher level language. Why waste programmer months coding in assembly when C++, Perl, and Java are more than fast enough for most jobs?

With recent advances in CPUs and memory, the following question has likewise become relevant: Why waste programmer months coding in C++, Perl, and Java when functional languages are more than fast enough for most jobs?

Now, functional languages aren't fast enough for all jobs, but that's okay. Functional languages let programmers solve problems quickly so that there is lots of time left over to performance tune the few bits where speed actually matters. Additionally, most functional languages have foreign function interfaces that let them incorporate low-level code. In the same way that C++ programmers can rewrite hot functions in assembly, functional-language programmers can rewrite hot functions in C++.

So, in conclusion, let's not be so hasty to criticize FP languages for speed. After all, when you step back and look at the bigger picture, FP languages optimize the scarcest resource of all -- the programmer.

--
My blog | LectroTest

[ Disagree? Reply. ]


Yes (none / 0) (#87)
by ff on Thu Mar 14, 2002 at 09:00:26 AM EST

We are in agreement but your angle is different.

My thought is that functional languages could in fact be as fast as C, but they are not, because current compilers do not generate the assembly source directly.

I agree that you don't need 100% machine speed all the time, especially if you're using a nice, high-level language. There are times when you do need that speed, though; my focus is on games, graphics/CAD, and other applications with high user interactivity and requirements to move lots of data. While it's true that you can rewrite some areas in C/C++ for speed, you then have the problem of interfacing your high-level language to the low-level C/C++ code. This always has a high cost, as you have to convert parameters and pass through multiple layers of code. For example, in my own testing I've found OpenGL calls from O'Caml to be 2-3 times slower than the same calls from C. The tests I did were clean, with little language overhead (simple loops).

By the way, I think the current set of functional compilers are quite good and produce some of the best code out there, but again, my point is that I feel they should be at least as fast as the equivalent C code for the same functionality.

[ Parent ]

Hmm (3.66 / 3) (#102)
by Ken Pompadour on Thu Mar 14, 2002 at 10:42:53 AM EST

Your argument presupposes that functional languages are superior to imperative languages for large-scale software development.

Please post your empirical evidence that this is true before I can consider the rest of your argument. Thanks.



...The target is countrymen, friends and family... they have to die too. - candid trhurler
[ Parent ]
Ask, and ye shall receive (5.00 / 4) (#125)
by tmoertel on Thu Mar 14, 2002 at 01:42:25 PM EST

Ken Pompadour wrote:
Your argument presupposes that functional languages are superior to imperative languages for large-scale software development.
No it doesn't. Its claims stand regardless of project size -- small or large.
Please post your empirical evidence that this is true before I can consider the rest of your argument. Thanks.
Here's a short list for starters:
  • Prototyping Real-Time Vision Systems: An Experiment in DSL Design (1998). Abstract: We describe the transformation of XVision, a large library of C++ code for real-time vision processing, into FVision (pronounced "fission"), a fully-featured domain-specific language embedded in Haskell. The resulting prototype system substantiates the claims of increased modularity, effective code reuse, and rapid prototyping that characterize the DSL approach to system design....
  • Four-fold Increase in Productivity and Quality: Industrial-Strength Functional Programming in Telecom-Class Products (PDF). Abstract: The AXD 301 ATM Switch is the flagship in Ericsson's line of Datacom products. A fault tolerant and highly scalable backbone ATM switch, AXD 301 has enabled Ericsson to take the lead in the migration of public telephony networks to becoming true multiservice networks, offering both quality voice and broadband data services on the same backbone.... This paper demonstrates how the development of such systems is supported by the Erlang/OTP technology. The Erlang [functional] programming language was developed by Ericsson specifically for the purpose of building fault tolerant, distributed and complex systems.... The paper demonstrates how Erlang supports the characteristics mentioned, while offering unusually high productivity.
  • Haskell vs. Ada vs. C++ vs. Awk vs. ...: An Experiment in Software Prototyping Productivity. Abstract: We describe the results of an experiment in which several conventional programming languages, together with the functional language Haskell, were used to prototype a Naval Surface Warfare Center (NSWC) requirement for a Geometric Region Server. The resulting programs and development metrics were reviewed by a committee chosen by the Navy. The results indicate that the Haskell prototype took significantly less time to develop and was considerably more concise and easier to understand than the corresponding prototypes written in several different imperative languages, including Ada and C++.
  • Functional languages in microcode compilers (ACM Digital Library). Abstract: This paper discusses the advantages of using high-level languages in the development of microcode. It also describes reasons functional programming languages should be considered as the source language for microcode compilers. The emergence of parallel execution in microarchitectures dictates that parallelism must be extracted from the microcode programs. This paper shows how functional languages meet the needs of microprogrammers by allowing them to express their algorithms in natural ways while allowing the microcode compiler to extract the parallelism from the program.
You can find more such papers by tracing references and by reasonable application of Google and CiteSeer. For first-hand reports, just ask around on comp.lang.functional. I'm sure some of the fine folks at Ericsson would be more than happy to share their experiences with large-scale programming in Erlang.

--
My blog | LectroTest

[ Disagree? Reply. ]


[ Parent ]
We don't want native code compilation. (5.00 / 1) (#68)
by i on Thu Mar 14, 2002 at 03:07:53 AM EST

What we want instead is a portable assembler that is better suited as an FPL backend than C is. One such backend is C--. Note that there are a lot of projects that use this name; I believe the relevant one is at cminusminus.org.

and we have a contradicton according to our assumptions and the factor theorem

Online Book (5.00 / 1) (#72)
by Booj on Thu Mar 14, 2002 at 03:56:55 AM EST

This might be MLP, but since it's relevant to the poster's question, I thought I'd point out that the online book An Introduction to Scheme and its Implementation is a fairly easy read and has a nice section on writing a simple Scheme compiler. Most of the information is relevant to compiling other functional languages as well, so it might be worth a look to better understand how FP compilers operate in the first place.



Thanks (none / 0) (#86)
by ff on Thu Mar 14, 2002 at 08:51:28 AM EST

Very interesting, I like it, thanks!

[ Parent ]
-1 because of this bit ... (3.40 / 5) (#74)
by vrai on Thu Mar 14, 2002 at 04:42:18 AM EST

Games, Graphics/CAD applications, etc. require maximum performance. And yes, I would argue that C/C++ is not the proper language to write those types of applications in. Nobody likes corrupted memory and other stupid things you have to deal with in C/C++.

Yeah, because no one writes games in C/C++ ... wait a sec ... just about every bloody game on the market is written in C or C++!

The idea that C/C++ results in memory problems is a myth. Bad programmers result in memory problems. It's like blaming Java or C# because the code you wrote is dog slow; it's not the language, it's you! A good programmer can get the best out of whatever language they are using. A good programmer can get to grips with C/C++ memory and write fast, maintainable, robust code. Yes, there is bad C code out there, but that's because the f**kwits that wrote it should have stuck to VB rather than pretending to be programmers.



C/C++ (5.00 / 4) (#84)
by ff on Thu Mar 14, 2002 at 08:48:17 AM EST

While I agree the mistakes are the programmer's fault, we are still human. Everyone makes mistakes.

I've been programming in C for more than 15 years (C++ for the last 9 years) and I still make little mistakes. That's why I'm tired of it. No one can be perfect. Once you start using a safe language, you won't want to go back to agonizing over every little detail and fixing bugs that take forever to track down. Other people have written really good articles on this same subject; I'll see if I can find the links again.

As far as performance goes, sometimes you can't make it any faster than it is, but it's still slower than the equivalent C program. I've run into this many times, and I've spent much time trying to optimize out all the slow areas. What I've learned is that it's impossible to make something as fast in any language other than C or assembly.

[ Parent ]

I see your point ... (5.00 / 2) (#91)
by vrai on Thu Mar 14, 2002 at 09:29:33 AM EST

... but don't entirely agree :)

While I program in C/C++ at work, for my own stuff I also use Perl, Python, Java and PHP. For string mangling and general scripting I use Perl, for front-end / lightweight work I use Python/Java, and for web stuff I use PHP/Python. However, for serious development (servers, games, encryption tools) I almost always use C/C++. Why? Because while high-level languages are nice, I love the absolute control over everything that I get with C/C++. Because you have control over memory allocation you can do lots of clever optimisations that are impossible in languages like Java and Python. C++ is a nice compromise as it lets you develop using nice and maintainable OO patterns, gives you the STL for general work, but lets you get down to the metal when you have to. I can honestly say that the vast (99%+) majority of the C++ code I produce has no memory leaks when Purified at the testing phase, and this is code that uses loads of C-style memory mangling and pointer arithmetic. Whether this is because of my fantastic coding methods, or because I'm some kind of C++-empathic freak, I don't know :)

If you can get the programmers, then C/C++ is still the best way to go for big, computation/memory-intensive applications. The trick is to find ones like me who delight in the complexity rather than one of the legions of Javabots that universities seem to be producing.

[ Parent ]

Not true (4.00 / 2) (#89)
by Betcour on Thu Mar 14, 2002 at 09:23:01 AM EST

The idea that C/C++ results in memory problems is a myth

It is not! A language that requires the programmer to use pointers intensively is just begging for memory leaks (besides the pain of using pointers for almost everything).

[ Parent ]
True (5.00 / 2) (#93)
by vrai on Thu Mar 14, 2002 at 09:38:23 AM EST

Good programmers don't produce leaky code, regardless of the language they are using.

Some programmers (like myself and many of my techie friends) actually enjoy the control that manual memory management provides. As I said in another reply, I also use Java, Perl, Python, and PHP: all good languages, and all useful in their own way. But for most projects (which for me tend to be servers and encryption tools) C/C++ still reigns supreme. If you don't like C/C++'s low-level approach then cool (less competition for jobs for me :) but don't blame the language for problems that are caused by lack of skill amongst many developers.

[ Parent ]

Real world (4.00 / 2) (#94)
by Betcour on Thu Mar 14, 2002 at 09:59:17 AM EST

Good programmers don't produce leaky code, regardless of the language they are using.

Good programmers are human beings. No matter how good they are, only God can make 100% leak-proof code with C. For humans, there'll always be memory leaks as long as the programmer is in charge of allocating and freeing memory. And the fact that C uses pointers so extensively doesn't help.

I've coded extensively in ASM (a full OS), Object Pascal, and PHP, as well as a few projects in C/C++, Ada and Smalltalk, and frankly, for anything that requires low-level coding I prefer Object Pascal any day over C/C++. It is more readable and higher level than C, yet lets you drop in assembler or use pointers if you want to. I've coded everything from disk partitioning tools to a multithreaded web server. (That was my 3 lines of proselytism for today ;)

[ Parent ]
Uhm, no (5.00 / 2) (#104)
by Anonymous 242 on Thu Mar 14, 2002 at 11:15:58 AM EST

only God can make 100% leak-proof code with C
Making leak-proof code (for large projects) merely takes superhuman effort, not omnipotence. And mere mortals can make leak-proof C code (and prove it is leak-proof) for certain classes of problems.

Although, in general, I agree. Memory management can be and is a pain in most circumstances, and it is all too easy to slip up.

[ Parent ]

Leaky code. (none / 0) (#110)
by katie on Thu Mar 14, 2002 at 12:09:31 PM EST


C++ memory management is easy.

What you do is this: write a proper reference counting system, factory system, and garbage collection/recycling system[1].

Stick it in a library and make everyone in the company use it by having pointers in code defended in peer review at gunpoint.

Real-world code falls over because people don't do ANY of those things at all.




[1] By the way, THIS is where it wins out over Java without any extras - your garbage collection can actually be "re-use things if they're expensive to build" or other esoteric happenings...

This is hard to do if you can't change the inner behaviour of references, which you can in C++ because they're implemented in C++ itself. In Java they're implemented by the VM...



[ Parent ]
you are conflating 2 issues. (5.00 / 1) (#122)
by toddg on Thu Mar 14, 2002 at 01:29:06 PM EST

What you do is this: write a proper reference counting system, factory system, garbage collection/recycling system[1]. Stick in a library and make everyone in the company use it by having pointers in code defended in peer-review at gun-point.
Excellent. You have now created a high-level language. Add first class functions, and some optimizations, and you have a functional language on your hands.

This is hard to do if you can't change the inner behaviour of references, which you can in C++ because they're implemented in the C++. In Java they're implemented by the VM...
You're just the victim of a shitty garbage collector, such as Java has by design. Real GCs allow you to manually control prefetch & allocation, or manipulate object lifetimes. But it turns out that you don't need that very often if your allocator & GC are any good. I'm afraid your reference counter just isn't going to cut it.

C'mon. Do you really think no one thought of this in the 4 decades since garbage collection was invented? Please.

[ Parent ]

Not not thought of... (none / 0) (#152)
by katie on Wed Mar 20, 2002 at 04:08:37 AM EST


It's more... well. Unfortunately, I live in the real world (unlike, apparently, a large percentage of Kuro5hin readers. The saving grace is that that percentage is lower than the /.ers.)

In this strange real world, C++ is the only tool we get given. Yeah, I know. It's dumb. So I spend months writing things that would be 30 lines of Perl. But I'm not allowed to deploy Perl. I've recently been forbidden from deploying software involving TCL, although I can use it as a test tool.

I never said it made any sense.

It's funny, none of these companies insist their drivers put custard into their truck's fuel tanks, or that we stop having the dealers get trades measured in pounds and switch to chocolate buttons as a currency measure, but software engineers get told the equivalents...


Hence, the only tool I have is a hammer, and so I'm very good at persuading a hammer to be a good screwdriver...


[ Parent ]
GC isn't easy in a library (5.00 / 1) (#124)
by SIGFPE on Thu Mar 14, 2002 at 01:31:51 PM EST

garbage collection/recycling system
That's not easy in C++. You can use libgc but that's pretty scary and not easily portable - AFAIK it delves directly into the stack and does all kinds of mucky stuff (I've written code to do this myself but I don't approve of it :-). If you don't do stuff like that I think you have to use classes with quite a bit of overhead.

Reference counting can be made very elegant in C++, however.
SIGFPE
[ Parent ]

recipe for failure (5.00 / 1) (#130)
by avdi on Thu Mar 14, 2002 at 03:10:12 PM EST

That's a good way to piss off your good programmers, keep your mediocre programmers from improving, bog down your code, and cause memory leaks from failure to understand what goes on underneath.

If you really want to get away from leaky code, stop allowing your coders to write C, VB, or Java code in C++. Train them to use C++ right instead, and memory leaks will largely vanish. For more detail, see my previous comment; or buy a copy of "The C++ Programming Language".

--
Now leave us, and take your fish with you. - Faramir
[ Parent ]

Actually... (none / 0) (#153)
by katie on Wed Mar 20, 2002 at 04:16:09 AM EST

On the code front, that's what I meant.

But what actually happens is none of that. No code reviews. Well, not proper ones anyway. Ones where people whinge about indentation for an hour are quite common. "Look, line 516 isn't indented properly /EITHER/!"

Training people to use C++ properly is fucking hard. Especially since they don't actually seem to want to learn. Almost all the places I've worked at tend not to use inheritance very much because it scares them. Templates are a complete anathema - "Oh, they don't work properly", "Oh, the error messages are incomprehensible", "Oh, no one needs things like that anyway". I've worked at companies where people have forbidden me to use multiple inheritance because the MD's cousin told him it wasn't a good technique.

In the real world, people use a stack of floppy disks as a revision control system and think vi is a good editor for engineers to use to write code. Nothing less than the very, very latest version of Word for the managers, but nothing better than vi for the engineers.

Most of the C++ people learnt the language on a 3-day course or from a "learn in 21 days" book. They haven't got better since, and never seem to. They need all the help they can get.



[ Parent ]
Not a C++ user, are you? (5.00 / 3) (#112)
by avdi on Thu Mar 14, 2002 at 12:14:32 PM EST

Anyone who thinks that C++ requires the programmer to "use intensively" pointers isn't a serious C++ user. Firstly, simply using pointers is not "begging for memory leaks" - dereferencing a pointer cannot, in any way, introduce a memory leak. Memory leaks are by definition caused by code that allocates memory and never deletes it. Good C++ programmers hardly ever write allocation/deallocation code, because in good C++ code all allocation is cleanly encapsulated. Memory is allocated as needed, and deallocated as soon as it is not needed. Kind of like garbage collection, only completely controlled and deterministic.

IME, most people who think that C++ leads to leaky code are also afraid/unaware of the STL, have no idea what a "smart pointer" is, and can't remember just what it was that "resource acquisition is initialization" means.

--
Now leave us, and take your fish with you. - Faramir
[ Parent ]

Good functional language compilers (4.00 / 1) (#78)
by StrontiumDog on Thu Mar 14, 2002 at 06:37:24 AM EST



Clean (none / 0) (#88)
by ff on Thu Mar 14, 2002 at 09:12:01 AM EST

Yes, I've seen those. ghc was slower than O'Caml in my tests, although I really like the Haskell syntax.

Clean seems very interesting, but since it's not very free and it didn't seem to support very many platforms (the new version only runs on Windows), I haven't tried it yet. I will, though; maybe it's faster than O'Caml.

[ Parent ]

Computer Language Shootout (none / 0) (#90)
by bob6 on Thu Mar 14, 2002 at 09:23:48 AM EST

link
The methodology may not be strictly scientific, but it gives an idea. gcc takes first place, but several compilers for functional languages rank high, especially some ML and Lisp dialects.

Cheers.
oops (none / 0) (#96)
by bob6 on Thu Mar 14, 2002 at 10:08:21 AM EST

Ignore the message, there's nothing new... sorry.

Cheers.
[ Parent ]
Another couple of links you might find interesting (4.50 / 2) (#101)
by hading on Thu Mar 14, 2002 at 10:38:14 AM EST

Just for functional language advocacy in general, the following is interesting: Functional Programming in the Real World. In particular, for the type of application you seem to be interested in, some of the numerical projects might be interesting. FFTW adopts another possible way to use functional languages to get fast code: instead of solving the problem directly in the functional language (and getting native code either directly or via C), write a functional program that writes highly optimized C code.
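
As a toy illustration of the generator idea (nothing to do with FFTW's actual code), a few lines of O'Caml can emit specialized, fully unrolled C for a fixed-size dot product:

  (* Emit C source for a dot product unrolled for a fixed size n. *)
  let emit_dot n =
    Printf.printf "double dot%d(const double *a, const double *b) {\n" n;
    print_string "  return";
    for i = 0 to n - 1 do
      if i > 0 then print_string " +";
      Printf.printf " a[%d]*b[%d]" i i
    done;
    print_string ";\n}\n"

  let () = emit_dot 4

The functional language does the thinking; the C compiler does the last-mile code generation.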



Lisp compilers (none / 0) (#111)
by dmcnaught on Thu Mar 14, 2002 at 12:10:45 PM EST

I haven't seen anyone mention Lisp in a non-tangential way, so I'll jump in.

The Common Lisp spec requires a compiler as part of the environment. CLISP compiles to byte code, but CMUCL and the commercial implementations (Allegro, Lispworks, MCL) have excellent optimizing native-code compilers. CMUCL in particular can do a lot of type-inferencing to find possible bugs and speed up computation.

type-inferencing? (none / 0) (#128)
by klamath on Thu Mar 14, 2002 at 02:52:47 PM EST

CMUCL in particular can do a lot of type-inferencing to find possible bugs and speed up computation.
I was aware that CMUCL produces optimized code (for instance, this interesting story), but I didn't know that it can do significant type inference. Since Lisp is (very) dynamically typed, how is this possible? Sure, if you add type declarations then you get some compile-time checks, but I thought that was the case with all ANSI-compliant CL implementations.

P.S. Hey doug -- it's Neil Conway from pgsql-hackers ;-)

[ Parent ]

some cmucl info and link (none / 0) (#134)
by hading on Thu Mar 14, 2002 at 03:22:38 PM EST

Lisp implementations have a lot of freedom to ignore declarations (cf. section 3.3.1 of the HyperSpec). In particular, I don't think they ever have to look at type declarations. There's plenty of good information about what CMUCL does with type declarations in its documentation. It does do a very good job of optimizing the things that it can and pointing out where you might be able to make more declarations to allow even better optimizations, and I believe it can also catch type errors.



[ Parent ]
type inference (none / 0) (#154)
by joto on Thu Mar 21, 2002 at 07:29:00 AM EST

Well, without type-inference it wouldn't be able to produce that highly optimized code.

Type-inference is most commonly known from statically typed functional languages such as ML, Miranda, Haskell and others, but there is no reason why it should be limited to those.

If, for example, the compiler is able to prove that a certain variable is a fixnum or a floating-point value, then it can generate efficient code for that type and avoid any tests to see which type the variable has at the point of access.

While testing for and executing code for the common case (i.e. fixnum) can be made surprisingly fast, it can't be made completely transparent without hardware support. There is no way to avoid that test unless you either tell the compiler explicitly (static typing), or the compiler is able to infer it itself. Thus type-inference is needed.
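
O'Caml is a nice illustration of the statically typed case: in the function below, the use of +. alone tells the compiler that a must be a float array, so the loop can be compiled with unboxed float loads and no per-element type tests (a sketch; the details of code generation of course vary):

  (* The compiler infers a : float array from the +. operator, so no
     runtime type tags need to be checked inside the loop. *)
  let sum a =
    let s = ref 0.0 in
    for i = 0 to Array.length a - 1 do
      s := !s +. a.(i)
    done;
    !s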

Even statically typed functional languages often use tagged variables, however. This is in order to make the garbage collector simpler to implement.

While compiling C is today a pretty well understood topic, compiling functional languages efficiently is in many cases still a research topic. It depends on which constructs are seen as problematic or important to make fast, and especially on where the ideal balance lies between making just the common case blazingly fast, trying to make everything fast (and thus slowing down the "common case", whatever that is), and keeping the implementation readable, maintainable and correct.

[ Parent ]

compilation is allowed to be rather minimal (none / 0) (#135)
by hading on Thu Mar 14, 2002 at 04:02:25 PM EST

While of course implementations typically have good compilers, note that the spec really doesn't require them to do very much:

The minimum semantic requirements of compilation are that it must remove all macro calls and arrange for all load time values to be resolved prior to run time.

As far as I can tell, that's about all the compiler has to do.



[ Parent ]
Intermediate language (4.00 / 1) (#113)
by hardburn on Thu Mar 14, 2002 at 12:18:18 PM EST

Nearly every compiler out there (including GCC and MSVC) compiles to some sort of intermediate language. Now, whether C is the right language to use as the intermediate language for functional language compilers is up for debate, but the fact is that almost all compilers do this, for any language.

If you didn't use an intermediate language, and you wanted to port the entire compiler collection, you would have to rewrite the C compiler to generate the new ASM for the new platform, rewrite the C++ compiler to generate the new ASM, rewrite the Pascal compiler to generate the new ASM, rewrite the FORTRAN compiler to generate the new ASM, and so on. Using an intermediate language means you only have to change the compiler that translates the intermediate language to the new ASM, and everything else just works with only minor issues. That makes things much easier to port and encourages code reuse, resulting in fewer bugs.

I believe there are more advantages to an intermediate language, but I can't think of them ATM.


----
while($story = K5::Story->new()) { $story->vote(-1) if($story->section() == $POLITICS); }


Intermediate language (none / 0) (#151)
by awgsilyari on Mon Mar 18, 2002 at 04:27:03 PM EST

The intermediate language for GCC is called RTL and is basically a souped-up "quads" representation. If you haven't studied compilers at a formal level, a "quad" is basically an abstract machine instruction like "add x, y, z" -- add vars x and y, store the result in z.

The purpose of the intermediate language is threefold:

1) Optimization occurs at this level -- it does not occur in the tree structure (many compilers don't even build a tree).
2) It is target-independent and so serves as a "middle end", which is what makes GCC so wonderfully retargetable.
3) This is what you mentioned -- different language front-ends all compile into the intermediate language so that the optimization engine can optimize all the languages and not just one of them. Of course, some languages are more easily optimized than others, even once translated into RTL.
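
If it helps, a quad-style IR is the kind of thing you could write down in a few lines of, say, O'Caml (purely illustrative -- GCC's actual RTL is far richer):

  (* A toy quad IR: Add (x, y, "z") means z := x + y. *)
  type operand = Var of string | Const of int

  type quad =
    | Add of operand * operand * string
    | Mul of operand * operand * string
    | Move of operand * string

  (* t1 := x + y; t2 := t1 * 2 *)
  let example =
    [ Add (Var "x", Var "y", "t1");
      Mul (Var "t1", Const 2, "t2") ]

Optimization passes then become functions from quad lists to quad lists, independent of both the source language and the target machine.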

--------
Please direct SPAM to john@neuralnw.com
[ Parent ]

Did you read... (none / 0) (#114)
by SIGFPE on Thu Mar 14, 2002 at 12:20:37 PM EST

...my diary entry for yesterday on a related topic. I'd love to use a functional language instead of C/C++ if I could get the performance I believe they are capable of.
SIGFPE
Um... (none / 0) (#115)
by trhurler on Thu Mar 14, 2002 at 12:31:39 PM EST

I hate to say this, but assuming an appropriately written runtime and a decent code generator for the C backend, I seriously doubt that you or anyone you know or have heard of can significantly improve upon C-targeted languages. You might think that optimizing use of the stack and registers for your language would make a huge difference, but odds are the C compiler already does it better than you can.

Not only that, but it doesn't make sense. The effort involved is huge (easily into the thousands or tens of thousands of lines of code per platform supported, and much more if you want good optimization), and at best you might see a fractional performance improvement, if you hire the god of compiler writers.

There are tricks you can use to get extra speed when targeting C. The real question is, are they using them, and if not, why not?

--
'God dammit, your posts make me hard.' --LilDebbie

C targeting sucks (none / 0) (#117)
by SIGFPE on Thu Mar 14, 2002 at 12:51:13 PM EST

C is a crap language for optimization. C++ is better because of the 'restrict' keyword, but even then Fortran will always leave C/C++ in the dust. ANSI C is unable to make any guarantees about which pointers affect which variables - so it has to play conservatively and leave a great many variables in memory rather than in registers. No such problem in Fortran. Check any benchmarks on the web.
SIGFPE
[ Parent ]
Um... (none / 0) (#118)
by trhurler on Thu Mar 14, 2002 at 01:02:46 PM EST

I'm pretty sure a version of restrict made it into the latest standard, didn't it? That was the only thing that made having a "new" standard worthwhile, IMO. The rest was all crap.

Also, the truth is that if pointer aliasing is that big a deal to you in "current C", you don't do it, and then you get yourself a compiler option that says "no pointer aliasing." They exist.

Finally, I'll point out that you have to do a lot more than limit your use of pointers if you want to beat C on optimization. Fortran compilers have been around for a very long time and are very, very good at what they do - but they're also not very portable, good ones are expensive, and so they're a lousy target for another language.

--
'God dammit, your posts make me hard.' --LilDebbie

[ Parent ]
Aliasing (none / 0) (#121)
by SIGFPE on Thu Mar 14, 2002 at 01:26:19 PM EST

and then you get yourself a compiler option that says "no pointer aliasing."
It's kinda scary to use because it's 'global'. In Visual C/C++, say, you can turn it on and off for individual functions but you can't control which variables are affected. Just turning it on can therefore have unforeseen consequences - especially as real code that needs optimization still does use pointers to refer to some variables.

I think you're right about the latest C standard though. I hope that stuff makes it into compilers soon. In fact I think gcc already has __restrict.
SIGFPE
[ Parent ]

I don't know about significantly (none / 0) (#126)
by a humble lich on Thu Mar 14, 2002 at 01:45:25 PM EST

but often programs written in Fortran can be faster than C. With modern compilers this may be less true, but the reduced set of operations allowed in Fortran lets the compiler make more optimizations. In the simple case I just ran (I can post the code if you really want), the Fortran version seemed to run about 20% faster using gcc and g77, which is surprising since g77 and gcc share the same back end.

[ Parent ]
New information (5.00 / 3) (#127)
by ff on Thu Mar 14, 2002 at 01:51:50 PM EST

Even though my premise is wrong, I am glad I posted this article or I never would have gotten the information I now have. Thanks!

There are functional compilers that generate assembly code directly. In fact, there are at least a few of them that used to create C code but now do almost everything in assembly. Although I had spent a lot of time looking at the different compilers, most of the information I had was outdated or I was using outdated versions of the compilers (or I was just plain wrong in the case of O'Caml).

As it turns out, several of the compiler writers are headed in the direction I am talking about (O'Caml and MLton in particular). It seems the reason performance lags behind C is that their generated assembly is not as optimized as what a modern C compiler produces. For me this is good news, because I believe they will eventually have optimized assembly that approaches C's speed everywhere. Very nice.

The next problem I'll be thinking about is whether there will ever be a functional language that approaches the level of standardization and portability of C. As it is now there is a lot of fragmentation. There are a whole lot of different functional languages and I don't know if any are source compatible (I have not seen any that are). For the most part they all support different functionality.

Teaching programmers (5.00 / 2) (#132)
by Paul Johnson on Thu Mar 14, 2002 at 03:21:36 PM EST

Unfortunately most programmers in the world can't handle the abstraction needed to do good functional programming. I recall the move from structured languages to OO, and that was pretty difficult for many people. For a long time people were writing bad C programs in C++ and throwing in classes because they felt they ought to.

Now think about teaching these people about functional programming, especially in a lazy evaluation language like Haskell. Just about everything they know is wrong. There are no loops: use recursion instead. Endless recursion is a good thing, some of the time. Functions are first-class objects which are slung around with complete abandon. In fact functions are all that exists.

Programmers coming to functional programming for the first time have to forget everything they ever learned about how to structure software. It's a big leap.

Now imagine you are a project manager considering using a functional language for a project. There is very little tool support beyond the compiler. None of your expensive diagramming tools make any sense in this environment anyway. None of your coders can think in a functional way. In a few months some of them will learn, and a few will be amazingly productive. The rest will have to be fired.

Not an attractive picture, is it?

Paul.
You are lost in a twisty maze of little standards, all different.

This still happens (none / 0) (#138)
by epepke on Thu Mar 14, 2002 at 04:41:01 PM EST

For a long time people were writing bad C programs in C++ and throwing in classes because they felt they ought to.

People still do this; I've seen it many times. To be fair, though, part of the reason is that C++ is really C with "Object" written on it in Magic Marker. It isn't built like an O-O language and, what's worse, the object elements are only taught after the normal imperative stuff.


The truth may be out there, but lies are inside your head.--Terry Pratchett


[ Parent ]
Curly brackets (none / 0) (#149)
by Paul Johnson on Mon Mar 18, 2002 at 05:13:46 AM EST

It's true that C++ makes it easier to write awful non-OO classes, but thereby hangs a tale.

There are many OO languages, almost all of which are better than C++, but it was C++ which became dominant because it looked like an easy upgrade. All your existing C code would still compile (though that is no longer quite true) and programmers only had to learn the OO bits of C++ instead of a whole new language.

Since then Java and C# have come along. Both languages have more in common with Smalltalk than C, but Java succeeded and C# may well do so. This was partly because they are backed by powerful companies, but also because they both have a C-like syntax, including curly brackets.

I have become convinced that no language can succeed unless it has curly brackets. Curly brackets make C programmers and their managers feel confident that they can easily understand the new language because it looks familiar.

So now apply this to functional languages. Haskell will never make it because it lacks curly brackets. What is needed is obviously a functional C which has curly brackets and the ability to drop into unsafe side effects at the drop of a declaration.

Guess how it will get used.

Paul.
You are lost in a twisty maze of little standards, all different.
[ Parent ]

Haskell has curly braces. (none / 0) (#150)
by i on Mon Mar 18, 2002 at 06:50:46 AM EST

They are almost always optional though. Unsafe side effects are supplied by unsafePerformIO and the like (not a part of the standard language but a common extension).

If you want a functional C with curly brackets and the ability to drop into unsafe side effects, there's more than one language that fits the bill. One alternative is called C++ (seriously).

and we have a contradiction according to our assumptions and the factor theorem

[ Parent ]

Not too long ago (5.00 / 1) (#136)
by epepke on Thu Mar 14, 2002 at 04:20:36 PM EST

It used to be that most compilers produced linkable machine code (NOT assembly -- I don't know where people got the idea that they are the same thing). The most general such machine-language compilers used an intermediate generalized machine model which was translated into the target machine language, which is one of the reasons that RISC is easier to compile to than CISC. People generally compile to C these days because reasonably good C compilers exist, and it's a lot easier to treat C as a glorified assembler (which is pretty close to what it was designed to be).

The problem with C, of course, is that it is so deeply based on imperative programming. It's hellishly difficult to optimize C for vector machines and SIMD machines, while it is relatively easy in FORTRAN or APL. I do think that much better work could be done for functional languages. There would be two benefits:

  1. Functional programs are exactly what you want for multiprocessor architectures. Because a properly constructed function in a functional language has no side effects, every expression is a candidate for parallel processing. This requires lexical binding, and of course some functions (like I/O) will have side effects, but it is utterly trivial to analyze such a piece of code in a LISP variant and damned near impossible in C, unless the C compiler does structure processing like LISP -- but that's kind of silly. Even when compiler-generated C looks somewhat like human-written code, it occupies a ridiculously small subset of idea space, and not one that is easy to detect directly.
  2. One word: Itanium. It's a brilliant design, but it will never be used to its full extent until compilers have architecture-specific information and algorithms. A compiler from a functional language, I feel, would work well, especially if the language had parallel primitives built in.

There's nothing here that couldn't be solved with infinite hubris. I kind of miss the days when this was common amongst developers.

Disclaimer: I'm in the middle of writing a Scheme interpreter (not compiler yet, though that may come later). While it is rather easy to get an almost-Scheme mapping onto C (one can hammer together a functional LISP in a couple of days), doing the Full Monty with continuations and zero-space tail recursion is quite another matter. You can't just tack it on as an afterthought, either. My experience is that the C code for just the interpreter winds up looking like assembler anyway. That is, you might as well forget about using the automatic C stack and the nice, juicy C control structures. I rather suspect the C emitted by a compile-to-C compiler would look about like this, and would be as hard to optimize and understand as the output of yacc/bison.
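To make that concrete, here is a minimal trampoline sketch (my own toy illustration, not code from my interpreter): the tail call is reified as a value, state lives in explicit variables instead of stack frames, and a driver loop bounces from step to step so the C stack never grows.

    #include <stdio.h>

    /* A step is "the next function to run", returned as data
       instead of being called directly on the C stack.        */
    typedef struct step step;
    struct step { step (*fn)(void); };

    static long n, acc;    /* explicit state, not C stack frames */

    static step halt(void) { step s = { NULL }; return s; }

    static step loop(void)
    {
        if (n == 0) { step s = { halt }; return s; }
        acc += n;
        n -= 1;
        step s = { loop };           /* the tail call, as a value */
        return s;
    }

    int main(void)
    {
        n = 1000000; acc = 0;
        step s = { loop };
        while (s.fn != NULL)         /* trampoline: bounce until halt */
            s = s.fn();
        printf("%ld\n", acc);        /* sums 1..1000000 in constant stack */
        return 0;
    }

Multiply this by every control structure in Scheme and you can see why the result reads like assembler.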


The truth may be out there, but lies are inside your head.--Terry Pratchett

