Kuro5hin.org: technology and culture, from the trenches
A Peer-to-Peer Programming Language

By nile in Internet
Wed Jul 11, 2001 at 08:08:06 PM EST
Tags: Software

Most programming languages are statically defined: C++, C, Java, and other languages cannot become richer over time once their compilers or interpreters are built. This model of building programming languages is pre-Internet, mirroring how books, magazines, and journals were published before the appearance of Web pages, dynamic content, and hyperlinking. Imagine instead a programming language that is defined on the Internet and, more importantly, becomes richer over time as more programmers add to it. This is the idea behind BlueBox, a browser that runs a scalable peer-to-peer programming language, which we are releasing today.


dLoo was formed over a year ago to create a browser, BlueBox, that could download language structures from the Internet and dynamically assemble them into languages. We needed the language structures to be modular so that the languages could scale in richness across non-communicating parties. We needed developers to be able to extend languages simply by linking to existing structures. The goal was to create a language that lived on the Internet and that grew in richness as developers created new pieces.

Although a scalable peer-to-peer programming language is conceptually simple, implementing it took over a year. We tried over a hundred variations before finding a programming unit that would fit our criteria for a language structure and be simple to use. The structure that we chose, and that is used in BlueBox today, is called a "word." Words are scalable, linkable, and allow for language inheritance and polymorphism. They model Web pages in their ability to scale and link to one another to solve new problems.

With words, one programmer can post HTML words, another Calculus words, and a third could create an HTML/Calculus language by changing one of the links in an HTML word to point to Calculus. A fourth programmer could then come along and add a graphing language by adding a link from a graph word to a Calculus word. This simple ability to link one language to another allows programmers to create very dense syntaxes to cleanly solve complicated problems. Individuals can build off of the words that have already been posted on the Internet to create richer and richer languages without ever talking to one another. The language is defined on the Internet and can be extended by anyone with a minimal amount of work.
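The article never shows what a word concretely looks like. As a rough illustration of the linking idea above, here is one hypothetical way to model it in Python (BlueBox's one dependency); the `Word` class, its fields, and the example words are invented for illustration and are not BlueBox's actual API:

```python
from dataclasses import dataclass, field

# Hypothetical sketch: a "word" is a named language structure that
# links to the words allowed to follow it, much as Web pages link
# to one another.
@dataclass
class Word:
    name: str                                   # e.g. "html", "integral"
    pattern: str                                # surface syntax the word matches
    links: list = field(default_factory=list)   # names of follow-on words

# One programmer posts HTML words...
html = Word("html", "<html>", links=["body"])
body = Word("body", "<body>", links=["table", "img"])

# ...another posts a Calculus word...
integral = Word("integral", "integrate", links=["expression"])

# ...and a third creates an HTML/Calculus language just by adding a link.
body.links.append("integral")

print(body.links)  # ['table', 'img', 'integral']
```

The point of the sketch is that extending a language is a one-line change to a link, not a change to a compiler.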

In addition to linking, words allow programmers to inherit languages and override their syntax and semantics on a word-by-word basis. Language inheritance adds phenomenal power to the programmer's toolkit. One of the modules included in BlueBox is an abstract programming language written in words that programmers can inherit from to create specific programming languages like Perl and Python. Because this language already supports inheritance, standard conditional structures, and other basic features of programming languages, programmers do not have to implement them when creating a new language. They simply inherit from the abstract language and override its syntax where appropriate.
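The inheritance scheme described above can be pictured with a minimal Python sketch; `AbstractIf` and `PerlIf` are invented names standing in for the abstract language and a language that inherits from it, not BlueBox's actual classes:

```python
# Hedged sketch of language inheritance: an abstract language supplies
# the semantics of a conditional, and a concrete language inherits it,
# overriding only the surface syntax (and here, the sense of the test).
class AbstractIf:
    keyword = "if"          # syntax hook that subclasses may override
    def run(self, cond, then, otherwise=None):
        return then() if cond else (otherwise() if otherwise else None)

class PerlIf(AbstractIf):
    keyword = "unless"      # Perl-style override of the syntax...
    def run(self, cond, then, otherwise=None):
        return super().run(not cond, then, otherwise)  # ...inverting the test

result = PerlIf().run(False, lambda: "ran")
print(result)  # 'ran': unless(False) executes its block
```

The semantics live in the parent; a new language only supplies the syntax it wants to change.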

Language polymorphism means that if programmers specify that a document is written in an abstract language, they can use all of the languages that inherit from it. For example, the abstract programming language can compile programs written using syntax from any of the languages that inherit from it. As syntaxes like Perl, Python, C++, and Java are created by inheriting from the abstract language, it becomes possible to write programs that mix all of these languages together (see bluebox/src/tests/code). More importantly, language polymorphism means that new developers can add new words to a language simply by inheriting from existing ones. If HTML were written in words, for example, anyone could add new widgets and domains to the language.
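The dispatch behind that mixing might look something like the following sketch; the registry, the two keywords, and the `compile_line` helper are hypothetical stand-ins, not BlueBox's mechanism:

```python
# Sketch of language polymorphism: a document declared to be in the
# abstract language may use syntax registered by any inheriting
# language, and dispatch happens per word.
REGISTRY = {}

def register(keyword, handler):
    REGISTRY[keyword] = handler

# Two "languages" inherit from the abstract one by registering words
# that compile down to the same underlying operation.
register("say",   lambda arg: f"print({arg!r})")   # Perl-flavoured syntax
register("print", lambda arg: f"print({arg!r})")   # Python-flavoured syntax

def compile_line(line):
    keyword, _, arg = line.partition(" ")
    return REGISTRY[keyword](arg)    # polymorphic dispatch on the word

# A single document mixing both syntaxes still compiles:
mixed = ["say hello", "print world"]
print([compile_line(l) for l in mixed])
```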

All of the features mentioned in this paper are working today. After a year of development, the product has been released to the open source community under the GPL. A community site set up around BlueBox with access to the source can be found at http://www.dloo.org. The main features in this release are:
  • Implementation of a natural language database so that words can be dynamically downloaded, compiled, and assembled into a language
  • Full support for language inheritance and polymorphism
  • BlueBox itself written in words
  • Full support for words that are written in other words
  • The ability to download and compile words from the Internet
  • The architecture of a software translator for transforming one technology or language into another
  • Python as the only dependency (previous releases, which were solely for educational purposes, had several difficult dependencies)
A more detailed description of BlueBox's architecture is available here. We're looking for early adopters to post some of the first words on the Internet. We also want to invite anyone who's interested to jump in and get their hands dirty on a peer-to-peer natural programming language.

The Web broke the convention that information should be formally organized and indexed. One of the main criticisms of the early Web was that it was unorganized: anyone could post and link to other articles. BlueBox brings the same roller-coaster ride to language design. In place of formal language committees and hundred-page specifications, we have a language that can be added to and linked to by anyone. In the next few weeks, we are going to implement the code for database sharing in BlueBox and launch the network for sharing words. We do not know what this Internet of words is going to become, but we think it is going to grow fast and with a wild creativity beyond anything we can imagine today.

A Peer-to-Peer Programming Language | 90 comments (72 topical, 18 editorial, 0 hidden)
Interesting concept. (4.00 / 3) (#1)
by deefer on Wed Jul 11, 2001 at 12:35:41 PM EST

But pretty scary to implement, I'd think. How do you deal with versioning/naming conflicts?
I see no mention of a debugger in the story. If I was mucking about with a compiler, I'd like to at least be able to debug it if it went wrong... Sometimes there's only so far a printf ("%s %i", localStr, localInt) can get you...


Kill the baddies.
Get the girl.
And save the entire planet.

We resolve ambiguities through lookahead (4.66 / 3) (#2)
by nile on Wed Jul 11, 2001 at 12:45:16 PM EST

On a very physical level, since words can link to each other like Web pages, they know what words they are calling. Thus, if one caches the words in the natural language database under unique ids, ambiguities can be eliminated. In the future, though, we want to eliminate this location dependency and replace it with infinite rule/relationship lookahead. Each word specifies what words it manages in rule/relationships. Even if two words match, then, one can simply keep matching new words until one of them fails. In the rare case of a true ambiguity (i.e., both words fit perfectly in a document), the user would have to choose which word to use, as we do in normal conversation.
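The keep-matching-until-failure idea can be sketched in a few lines of Python; the candidate table, token lists, and `resolve` helper are illustrative inventions, not BlueBox code:

```python
# Hedged sketch of rule/relationship lookahead: when two word
# definitions both match the current token, keep consuming tokens
# under each candidate until one fails; a sole survivor wins.
def resolve(candidates, tokens):
    """candidates: {word_name: list of expected follow-on tokens}."""
    survivors = {
        name: follow for name, follow in candidates.items()
        if tokens[1:1 + len(follow)] == follow   # look ahead past the match
    }
    if len(survivors) == 1:
        return next(iter(survivors))
    return None   # a true ambiguity: defer to the user, as in conversation

# Two "integral" words both match the token "int",
# but only one expects "dx" further on.
candidates = {"c_int": ["x", ";"], "calculus_int": ["x", "dx"]}
print(resolve(candidates, ["int", "x", "dx"]))  # calculus_int
```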

cheers,

Nile

[ Parent ]
Buzzword.... Overflow! (none / 0) (#6)
by kmon on Wed Jul 11, 2001 at 02:05:12 PM EST

Wow, you must have a pretty nice product, but I don't appreciate the verbal obfuscation. My big question is this, though: how much is this a real programming language vs. a scripting language? It seems to me that if a programming language can do everything a Turing machine can do, you're done. Beyond that, there is no need to 'extend' the language. On the other hand, if you're working with a scripting language, this is a pretty brilliant idea, because you'll be able to grab modules from the web.

Aside from the fact that this is dynamically linked over a network, how is this different from something like the CPAN libraries?
ad hoc, ad hominem, ad infinitum!
Difference from the CPAN libraries (none / 0) (#10)
by nile on Wed Jul 11, 2001 at 02:49:31 PM EST

The CPAN libraries for Perl are a wonderful feature that I have always appreciated, but this is actually something very different. Imagine if Perl itself were defined on the Internet and anyone could add to it by posting new words. Or, if they wanted, they could create C++ or a math language, or any other language. Further imagine that at any time, someone could mix these languages together to solve specific problems, much as HTML and Javascript were mixed together to give us DHTML.

As far as whether it's a scripting language or a compiled language, it is more difficult to classify it because of the backend. A short answer is that the words are being developed to compile down to the technology of your choice (Perl, Python, C++, C, Java, etc.) but that when they are run in the browser they run as a scripting language.

I apologize for the verbal obfuscation. We had to use some programmer terms like "inheritance" and "polymorphism" to convey what BlueBox does. If there were any particularly obscure sections, though, please let me know.

cheers,

Nile

[ Parent ]
Ok, let me get this straight... (4.60 / 5) (#7)
by BigZaphod on Wed Jul 11, 2001 at 02:10:06 PM EST

Ok, maybe I'm not totally clear on this, but I seem to get the impression that dLoo allows you to essentially write programs using natural language. The browser (compiler/interpreter/engine?) searches a database of registered words online to figure out what the heck the user means and then pulls together the code to dynamically build the application, based somewhat on context and the words available to use. So the programmer only needs to concern themselves with what the actual problem is and not all the little details. (A good example of what I mean: as far as I can tell, a programmer in dLoo would not need to concern themselves with actually, say, plotting the lines, shapes, etc. that make up a graph, but instead only with what the graph will be graphing; the "words" that have been created to deal with graphing take care of the busy work.)

The whole thing can grow and expand because programmers can then use already-defined words and expand their meanings by linking them with new functionality. So the whole system can grow and change like a real natural language does. It is also possible for islands of words to develop that deal in very specific fields, in the same manner that there are totally unique English words used in specific areas of thought (such as particle physics).

If that is the case, I'm VERY impressed. :-) I've been working on a somewhat similar idea for the past year or so as well, but apparently you actually figured out how to implement it. Congrats! Of course, if I have it all wrong, let me know.

"We're all patients, there are no doctors, our meds ran out a long time ago and nobody loves us." - skyknight
You got it! (5.00 / 1) (#8)
by nile on Wed Jul 11, 2001 at 02:20:16 PM EST

That's exactly it. It's really encouraging to hear that you got it. To learn how we implemented it, there are some graphical architecture pictures on www.dloo.org in the architecture section and "The Word Model" in the documentation section is a thorough introduction to the programming model.

Feel free to join the project by the way. The fact that you are already thinking in this direction is really encouraging,

Nile

[ Parent ]
If that's what "the word model" has been (none / 0) (#69)
by afeldspar on Thu Jul 12, 2001 at 02:05:39 PM EST

... then I am more convinced than ever that it is lunacy. It might be that I still haven't managed to understand any of your descriptions of this fantastic "word model" but if so, at this point I don't believe that it's my capacity to understand that's lacking.

You're trying to use natural language as the glue code to tie together existing programming languages and you're expecting the result to have all the flexibility of natural language and all the power and precision of every programming language.

You know what you'll get instead? The unimpressive precision of natural languages garbled up with the maddening inflexibility of programming languages.

I sympathize with you; it sounds like you have a compelling vision of how computers Should Be, and you want to see it happen. But it also sounds like you've never heard of the old saying -- never proved truer than in debugging -- that the devil is in the details.


-- For those concerned about the "virality" of the GPL, a suggestion: Write Your Own Damn Code.
[ Parent ]

Not quite it (none / 0) (#75)
by nile on Thu Jul 12, 2001 at 04:36:53 PM EST

You're trying to use natural language as the glue code to tie together existing programming languages and you're expecting the result to have all the flexibility of natural language and all the power and precision of every programming language.

That's not quite it. Natural language is the programming language. Words are building blocks by which you can create languages. You can create computer programming languages, of course, but it is equally possible to create any other language as well like mathematics, chemistry, etc.

I recommend reading some practical examples of words in action. This link goes through some complete examples of word-oriented programming and should put it more in perspective.

I sympathize with you; it sounds like you have a compelling vision of how computers Should Be, and you want to see it happen. But it also sounds like you've never heard of the old saying -- never proved truer than in debugging -- that the devil is in the details.

That's why I haven't posted in over two months. Several people wanted to see an implementation of what I was talking about. I invite you to look at bluebox itself at this point which implements all of the features previously promised.

cheers,

Nile

[ Parent ]
Tower of Babel (4.00 / 2) (#9)
by ipinkus on Wed Jul 11, 2001 at 02:40:59 PM EST

Imagine a DNS corruption. Now sort of translate that to dloo. Mix with "viral content" and reflect on Snow Crash. The language/thought viruses described in that book seem to translate directly to this word based programming you speak of.

I love the idea behind dloo though and hope that this thing evolves without babel and without any serious harm to its users :)

(BTW, if you haven't read Snow Crash, read Diamond Age first.)

Hmm... (none / 0) (#16)
by spacejack on Wed Jul 11, 2001 at 03:28:23 PM EST

Not wrapping my head around this too well... how is this much better than Perl's eval?

Not an eval, a language (none / 0) (#18)
by nile on Wed Jul 11, 2001 at 03:33:51 PM EST

Actually, this is very different from Perl's eval, which allows you to directly execute a Perl expression represented as a string in Perl code.

Words are discrete language chunks that can be distributed across the Internet. BlueBox runs documents by downloading the words a document uses and dynamically assembling them into a programming language to run it. Thus, for an HTML page, there would be "HTML," "Table," "Img," and other words posted all over the Internet, and BlueBox would run the page by dynamically downloading and assembling them into an HTML language that can read and display Web pages.
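The download-and-assemble flow can be sketched like so; a dictionary stands in for the network here, and the URLs, word format, and `assemble` helper are invented for illustration rather than taken from BlueBox:

```python
# Hedged sketch of assembling a language from distributed words.
# In the real system a word would be fetched over HTTP; a dict of
# "posted" words keeps the sketch self-contained.
POSTED_WORDS = {
    "http://example.org/words/html":  {"name": "html",
                                       "links": ["http://example.org/words/table"]},
    "http://example.org/words/table": {"name": "table", "links": []},
}

def assemble(url, language=None):
    """Recursively download a word and everything it links to."""
    language = {} if language is None else language
    word = POSTED_WORDS[url]              # in reality: an HTTP GET
    language[word["name"]] = word
    for link in word["links"]:
        if POSTED_WORDS[link]["name"] not in language:   # avoid re-fetching
            assemble(link, language)
    return language

lang = assemble("http://example.org/words/html")
print(sorted(lang))  # ['html', 'table']
```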

Does this make sense?

Nile

[ Parent ]
/me rolls eyes. (none / 0) (#62)
by ajaxx on Thu Jul 12, 2001 at 11:28:48 AM EST

posit the existence of some magic chunk of perl code. this chunk of code scans a document, determines its type, fetches other bits of perl off the web dependent on document type, gloms these bits together and evals them, with the document as input. these bits are written in such a way as to render the document to your screen or a printer or into PDF or whatever else.

now s/perl/word/g above, and please tell me how this is different from bluebox.

i understand the concept that you're pushing (that the syntax of the language need not be a static thing). basically this is to traditional languages as lisp is to C; lisp has weak data typing, and BlueBox has weak syntax typing. however, this fails to impress me. the object concept already lets you define your own words. operator overloading lets you mangle syntax, where present. put in those terms, all you've really done is modify ld.so to resolve symbols across the network. (actually, now that i mention it, that's not a bad idea, i may steal it...)

it's been said elsewhere, but, why would i want to buy this?

just as a final jab, it's a fundamental statement of information theory that all Turing-complete languages are isomorphic; anything that can be expressed in one can be equivalently expressed in another. therefore, all you're (potentially) doing in compiling a language on the fly is creating a potentially Turing-incomplete language for a possible gain in programmer convenience. even if we state that all the target languages will be TC, i still seriously doubt the gain in programmer convenience.



[ Parent ]

Missing inheritance and polymorphism (none / 0) (#70)
by nile on Thu Jul 12, 2001 at 02:58:41 PM EST

First of all, if you set up all of Perl that way, you are going in the right direction. However, you need to add a couple of things to make it complete.

In addition to matching text, words also specify what words can come after them in a structure called rule/relationships, which I am renaming to syntax/semantics. These structures can be inherited and are polymorphic. To see more on this, download BlueBox and look in the tests directory, or read the longer explanation here.

Have you read any of Larry Wall's writings on language design, by the way? I only mention this because I have been greatly influenced by his philosophies regarding the design of programming languages.

In terms of Turing completeness, the languages that words define can scale and anyone can add to them. So, if a programmer is using a Turing incomplete language defined in words, they can simply link it to a Turing complete language or add additional words.

cheers,

Nile

[ Parent ]
Why would I want to buy this? (3.00 / 1) (#20)
by jabber on Wed Jul 11, 2001 at 04:07:07 PM EST

As another poster already said, this seems like a subtle sales pitch veiled in an article submission. Granted, all informational material by someone with a vested interest is a sales pitch of sorts, but it still feels a bit unpleasant.

Second, why would a freely extendable language be a good thing? It seems that something like this would bloat exponentially, swell with duplication and get so top-heavy in features that keeping up with the capabilities of the language would suck up more time than actually using it.

I could see it getting to a point where rather than developing a solution, one would spend weeks digging through the existing API to find a ready-made one. Without the administrative nightmare of keeping a clean, reference to a 'blessed' standard language, this would grow into a hydra that would eat itself in little time.

Interesting as an academic exercise, but it just doesn't seem practical. Then again, the same has been said about the telephone, the computer, probably even the wheel.

[TINK5C] |"Is K5 my kapusta intellectual teddy bear?"| "Yes"

It's the language, not the libraries that grow (none / 0) (#22)
by nile on Wed Jul 11, 2001 at 04:16:37 PM EST

Thanks for posting. I think there's a little misunderstanding here.

First of all, this isn't a sales pitch. BlueBox is released under the GPL. We're open source developers who have been working on a project for a year that does things radically different and we're telling other members in the community about it.

Second, word-oriented languages are very different from object-oriented ones, so you wouldn't spend weeks digging through an API. Instead, you would simply write in the language that was appropriate for that domain. For example, if you were doing mathematics, you would use the natural mathematical symbols, as defined by mathematical words, to write mathematics. Similarly, if you were doing banking, you would use the words that bankers use (not C, C++, etc.). Words are a form of natural language programming that can scale on the Internet.

Does this make sense? It's the language not the libraries that grow,

Nile

[ Parent ]
Language management (none / 0) (#23)
by jabber on Wed Jul 11, 2001 at 04:27:07 PM EST

I'll be visiting the site as soon as I have enough time to give it a fair read. Seems like an interesting idea, except that rich languages don't have a very good track record. Still, this is a different concept entirely, and I take back my criticism. At second glance, it seems that there is huge potential for ambiguity unless the language is properly managed and compartmentalized - again, the overhead on that must be formidable.

Have you looked at the way Ada handles expandability via annexes? Any similarity? I'm currently at work, but once done, I look forward to learning more.

[TINK5C] |"Is K5 my kapusta intellectual teddy bear?"| "Yes"
[ Parent ]

Ambiguity, and words (none / 0) (#43)
by nile on Wed Jul 11, 2001 at 09:11:38 PM EST

Actually, words are great at ambiguity because they encapsulate all of the syntax rules and semantic relationships a word has with other words in the word itself.

In short, words make language creation scalable (by coupling syntax and semantic relationships) and offer a new type of inheritance and polymorphism as a result of this coupling. In a way, Lisp and Pliat are to words what C (which has data and methods) is to C++ (which has inheritance and polymorphism by coupling them).

There is more on words in the Word model on the dloo.org site here.

Feel free to ask more questions,

cheers,

Nile

[ Parent ]
Ambiguity and Ada (none / 0) (#45)
by nile on Wed Jul 11, 2001 at 09:22:15 PM EST

We're in the process of implementing infinite rule/relationship lookahead with the language. This should eliminate most of the potential ambiguity problems. Where there are ambiguities (i.e., true ambiguities like we encounter in English), developers - not users - will have to choose the word that they want.

I have to take off for a few hours. Thanks for checking out the site. There is a lot more information there.

cheers,

Nile

[ Parent ]
I'm just Not Feeling It(tm) (4.50 / 2) (#25)
by Gat1024 on Wed Jul 11, 2001 at 06:11:35 PM EST

Like others here, I need a living, breathing, concrete instance of how you've applied the technology. The article itself reads almost like a sales pitch disguised as a white paper. You really don't explain why the word-oriented way is an improvement over existing methods.

After both this article and your site, I think word programming won't be as easy as you make it out to be. It'll be a nightmare. There is an enormous amount of context in languages and domains. A C program isn't just a collection of words that appear in a certain order. And its semantics can't be neatly wrapped into individual words, because there are a lot of cross-cutting concerns. That's why a C compiler (or any compiler, for that matter) needs to keep track of its symbols, scope, type, etc. And then you want to be able to merge C syntax with Pascal or Perl?

It's the same for other domains. In many cases, the context of a language is a complex chain of relationships and data. How do you initialize this context? Who's responsible for its initialization? How do you pass it around? If I'm parsing HTML and Calculus at the same time, how do I merge contexts so that I don't get garbage on my screen? After all, the words need to know where to render themselves. And what if I include these words in a system that has no concept of graphics output at all?

From your post, it seems you need to have a "change over" word to let the system know what language domain to use. So my Calculus language gets fired up when the HTML language comes across a word that has been redefined for Calc. Doesn't this introduce a dependency that may break the Calculus language if it isn't mixed in with an HTML language domain?

Lastly, how do other words discover just what data can be passed to a receiving word? This question arises from the calculator example on your site. What if the two words don't understand each other's data types? As in, the "number" word stores its data in Joe's Binary Coded Decimal format (JBCD) and the "plus" word only understands IEEE floating point. Who does the translation? And if the author of one of the words does it, then will that introduce yet another dependency?

Anyway, don't think I'm down on your work, dude. It does look interesting. It's just that I'm from New York. Here, if you say "xyz is the best thing since peanut butter and jelly," we tend to say "prove it, you sonuvabitch." Okay, sometimes we leave out "sonuvabitch."

Why the word model is better (none / 0) (#27)
by nile on Wed Jul 11, 2001 at 06:41:52 PM EST

Like others here, I need a living, breathing, concrete instance of how you've applied the technology. The article itself reads almost like a sales pitch disguised as a white paper.

There's a paper on the dloo.org site called "The Word Model" that explains why it is better from a computer science perspective. This was also published on Kuro5hin a few months ago.

It's the same for other domains. In many cases, the context of a language is a complex chain of relationships and data. How do you initialize this context? Who's responsible for its initialization? How do you pass it around? If I'm parsing HTML and Calculus at the same time, how do I merge contexts so that I don't get garbage on my screen? After all, the words need to know where to render themselves. And what if I include these words in a system that has no concept of graphics output at all?

Words have rule/relationships in their structures that relate them to other words. In HTML, for example, the top level HTML word has a rule saying that the BODY word comes after it. The BODY word in turn says that Tables, Imgs, and other HTML words can come after it. Calculus and HTML wouldn't get jumbled up because there would be an exact rule/relationship saying where Calculus could be embedded. I really recommend downloading the source here and looking at it since BlueBox itself is written in words. No changeover word is needed.
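A minimal sketch of the rule/relationship idea just described: each word declares exactly which words may follow it, so HTML and Calculus cannot get jumbled. The rule table, document shape, and `valid` helper below are illustrative, not BlueBox's representation:

```python
# Hedged sketch: rule/relationships as a table of which words may
# appear under which. Calculus is embeddable at exactly one spot.
RULES = {
    "html": ["body"],
    "body": ["table", "img", "calculus"],   # the one place Calculus may embed
    "table": [], "img": [], "calculus": [],
}

def valid(tree):
    """tree: (word, [child trees]) -- every child must be allowed here."""
    word, children = tree
    return all(child[0] in RULES[word] and valid(child) for child in children)

doc = ("html", [("body", [("img", []), ("calculus", [])])])
print(valid(doc))                            # True: Calculus in its place
print(valid(("html", [("calculus", [])])))   # False: wrong spot, rejected
```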

Lastly, how do other words discover just what data can be passed to a receiving word? This question arises from your site where you have a calculator example. What if the two words don't understand each other's data types? As in the "number" word stores it data in Joe's Binary Coded Decimal format (JBCD) and the "plus" word only understands IEEE floating point. Who does the translation? And if the author of one of the words does it, then will that introduce yet another dependency?

Words don't pass data to each other in the way you're thinking of here. They are openly posted on the Web like Web pages and they link to each other like Web pages. BlueBox then downloads the words needed and dynamically assembles them into a programming language to run a document. The document is then translated into a generic higher level language. This language is then compiled into the technology of your choice: C, C++, Python, Java, etc.

cheers,

Nile

[ Parent ]
security? (none / 0) (#38)
by delmoi on Wed Jul 11, 2001 at 08:42:17 PM EST

Um, how in God's name do you handle security in a system like that?
--
"'argumentation' is not a word, idiot." -- thelizman
[ Parent ]
Two ways (none / 0) (#44)
by nile on Wed Jul 11, 2001 at 09:18:24 PM EST

There are two ways to handle security in such a system. The first is to have a sandbox like the Java VM provides which protects the users.

The second and - in my opinion - much better way is to have each domain specific set of words verify the words that it reads. That is, a set of math words could run any math document, but they will only run math documents that are true on a user's machine and only true math documents will be allowed in the database. We call this semantic security to distinguish it from the purely syntactical security methods of signing components, restricting sets of APIs, etc.
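The "semantic security" idea might be pictured like this: a set of math words that checks a document's claims before admitting it, rather than sandboxing it afterwards. The claim format and `verify_math_document` helper are hypothetical, invented to make the idea concrete:

```python
# Hedged sketch of semantic security: only documents whose mathematical
# claims actually hold are admitted to the database and run.
def verify_math_document(claims):
    """claims: list of (lhs, op, rhs) assertions the document makes."""
    ops = {"=": lambda a, b: a == b, "<": lambda a, b: a < b}
    return all(ops[op](lhs, rhs) for lhs, op, rhs in claims)

true_doc   = [(2 + 2, "=", 4), (1, "<", 2)]
forged_doc = [(2 + 2, "=", 5)]
print(verify_math_document(true_doc))    # True: admitted
print(verify_math_document(forged_doc))  # False: rejected before running
```

Contrast this with a sandbox, which runs untrusted content and merely limits the damage; here the domain words refuse untrue content outright.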

Good question by the way.

cheers,

Nile

[ Parent ]
So far all you have is declarations (none / 0) (#49)
by Gat1024 on Wed Jul 11, 2001 at 10:47:56 PM EST

You declare what the syntax is. You declare what the semantic relationships are. You even specify what code the word should generate when it is instantiated. It's like a template system on steroids.

But declarations are easy. The implementation (the code that will ultimately be generated and exec'd) is where the rubber meets the road. And that's what I'm trying to figure out.

Take your HTML example. Your BODY tag creates a window, initializes a font and blasts some text to the screen. It hits an IMG tag and hands off the responsibility for that to the IMG word. Where does the IMG tag paint its image? How does it know it should paint its image where it does? What about the TABLE word? In fact, the BODY tag doesn't even know how big to make its virtual canvas until it has obtained the widths and heights of images and tables, which are the domains of the IMG and TABLE words respectively.

These are cross-cutting concerns that can't be neatly packaged in individual words. How do the HTML words communicate the current state of the graphics context to each other? If you have to model this context so that all the words within a domain can work together, what happens when you introduce a new set of words with its own domain and own concept of rendering? My Calculus language might not understand the rendering model for HTML.

Also, I tried to get the download but I got a checksum error. I'll prolly try again. But couldn't you just put up a webby front end to the CVS repository so that we can browse without installing Yet Another Open Source Project? Or just put the unpacked project on FTP so we can browse that way.

Thanks.

[ Parent ]

You also have context and links to words (none / 0) (#77)
by nile on Thu Jul 12, 2001 at 05:46:36 PM EST

You declare what the syntax is. You declare what the semantic relationships are. You even specify what code the word should generate when it is instantiated. It's like a template system on steroids.

But declarations are easy. The implementation (the code that will ultimately be generated and exec'd) is where the rubber meets the road. And that's what I'm trying to figure out.


I'll set up the source code on the Web as you requested in the next couple of hours.

Take your HTML example. Your BODY tag creates a window, initializes a font and blasts some text to the screen. It hits an IMG tag and hands off the responsibility for that to the IMG word. Where does the IMG tag paint its image? How does it know it should paint its image where it does? What about the TABLE word? In fact, the BODY tag doesn't even know how big to make its virtual canvas until it has obtained the widths and heights of images and tables, which are the domains of the IMG and TABLE words respectively.

These are cross-cutting concerns that can't be neatly packaged in individual words. How do the HTML words communicate the current state of the graphics context to each other?

The HTML word has references to all of the words that are matched under them. So, first it would tell them to calculate what their sizes were, then it would draw a window that could contain them, and finally it would tell all of its children to draw themselves. In earlier versions of BlueBox, an HTML language was actually implemented that did all of the size work in your example.
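That two-pass scheme (measure the children, then draw) can be sketched as follows; the `Body` and `Img` classes are invented stand-ins for the words in the earlier HTML language, not code from BlueBox:

```python
# Hedged sketch: a parent word first asks each child for its size,
# then opens a window big enough to hold them, then tells the
# children to draw themselves.
class Img:
    def __init__(self, w, h):
        self.w, self.h = w, h
    def size(self):
        return (self.w, self.h)
    def draw(self):
        return f"img {self.w}x{self.h}"

class Body:
    def __init__(self, children):
        self.children = children
    def layout_and_draw(self):
        sizes = [c.size() for c in self.children]   # pass 1: measure
        width = max(w for w, _ in sizes)
        height = sum(h for _, h in sizes)           # stack children vertically
        window = f"window {width}x{height}"
        return [window] + [c.draw() for c in self.children]  # pass 2: draw

print(Body([Img(100, 20), Img(60, 40)]).layout_and_draw())
# ['window 100x60', 'img 100x20', 'img 60x40']
```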

If you have to model this context so that all the words within a domain can work together, what happens when you introduce a new set of words with its own domain and its own concept of rendering? My Calculus language might not understand the rendering model for HTML.

That's true. I actually meant for the Calculus words to be understood as an additional, Javascript-like language that could be embedded in the pages to perform work. One would then make traditional Javascript calls to display the results.

Also I tried to get the download but I got a checksum error. I'll prolly try again. But couldn't you just put up a webby front end to the CVS repository so that we can browse without installing Yet Another Open Source Project? Or just put the unpacked project on FTP so we can browse that way.

Sure. I'll work on it right now,

cheers,

Nile

[ Parent ]
Links to additional information (4.66 / 3) (#32)
by nile on Wed Jul 11, 2001 at 07:26:59 PM EST

There are several links to additional information in the editorial comments and it has just been pointed out to me that these will disappear when the article is posted. So, for the curious, you can find more information in:

The Word Model on K5
How BlueBox works (very simple one page explanation for nonprogrammers)
Architecture Pages at dloo.org, consisting of six pages of graphical introductions to BlueBox's architecture.
FAQ (answers dozens of questions) at dloo.org
BlueBox source with examples: BlueBox itself is written in words and has examples of everything discussed above. In addition, there are README.design documents in all of the directories explaining how the source works.

cheers,

Nile

text/plain? (none / 0) (#54)
by Ubiq on Thu Jul 12, 2001 at 02:18:54 AM EST

Your web server seems confused:

% curl -I http://www.dloo.org/download/bluebox-0.5.tgz
HTTP/1.1 200 OK
Date: Thu, 12 Jul 2001 05:20:31 GMT
Server: Apache/1.3.19 (Unix)
Last-Modified: Wed, 11 Jul 2001 14:05:48 GMT
ETag: "6b95c-155e17-3b4c5d3c"
Accept-Ranges: bytes
Content-Length: 1400343
Content-Type: text/plain



[ Parent ]
Fixed. (none / 0) (#74)
by nile on Thu Jul 12, 2001 at 04:16:16 PM EST

Apache wasn't mapping the mimetype correctly.

Thanks for pointing that out,

Nile

[ Parent ]
grammar rules = specialized methods (4.50 / 2) (#33)
by sayke on Wed Jul 11, 2001 at 07:38:41 PM EST

so why do you draw a line between "grammar rules" and "methods"? methods can do lots of things, including recursive descent parsing, which encapsulates "grammar rules" quite nicely already. if i were to make a python class with some parsing methods, some other methods, and some data, i'd have encapsulated grammar rules, normal methods, and data in one space.

i quite probably don't understand, but, well, nobody else seems to either - and not for lack of your trying. it looks to me like you're either on crack, or reinventing the wheel in one of the most obtuse ways imaginable. not to mention that, if you were successful, you would have just created a DLL hell for language syntax! f00kin great... that's just what we need.

i like carefully designed minimalist frameworks logically taken to all-encompassing conclusions. as far as i can figure, your project is everything i stand against ;)


sayke, v2.3.1 /* i am the middle finger of the invisible hand */

Grammar rules are not methods (none / 0) (#34)
by nile on Wed Jul 11, 2001 at 07:55:00 PM EST

so why do you draw a line between "grammar rules" and "methods"? methods can do lots of things, including recursive descent parsing, which encapsulates "grammar rules" quite nicely already. if i were to make a python class with some parsing methods, some other methods, and some data, i'd have encapsulated grammar rules, normal methods, and data in one space.

I don't really understand the statement that grammar rules are equivalent to methods. I think we are using the same words in different ways if you believe this. By grammar rules, I mean the code that logically defines how different pieces of language can be put together and what those groupings mean. One can use methods to define grammar rules, and one can set some very simplistic grammar rules by what variables are passed to a method, but there is no equivalence relation by this definition. They are very different things.

i quite probably don't understand, but, well, nobody else seems to either - and not for lack of your trying. it looks to me like you're either on crack, or reinventing the wheel in one of the most obtuse ways imaginable. not to mention that, if you were successful, you would have just created a DLL hell for language syntax! f00kin great... that's just what we need.

This stuff is difficult, but there are a number of people who understand it. People learn in different ways and I am trying a number of different methods for explaining word-oriented programming to them. If you are a programmer, you might want to look at the BlueBox source, for example. Please feel free to ask more questions too. Sometimes it takes a dialogue back and forth before someone understands.

As to DLL hell, documents are written against the groups of words that a user chooses. These words are then stored in the database with the document so that when a document is requested, all of the relevant words defining the document can be retrieved by the client. In this way, we avoid the conflicts that you are worried about.
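A minimal sketch of how this avoids versioning conflicts. The storage layout and all names here are my guesses for illustration, not BlueBox's actual database schema:

```python
# Sketch: a document is stored together with the exact words it was
# written against, so a client fetching the document gets the right
# words with it and never resolves word names against a global pool
# (which is where DLL-hell-style conflicts come from).
database = {}

def publish(doc_id, text, words):
    """Store the document and its defining words as one record."""
    database[doc_id] = {"text": text, "words": dict(words)}

def fetch(doc_id):
    """Retrieve the document; the relevant words come back with it."""
    record = database[doc_id]
    return record["text"], record["words"]

publish("math-1", "Sum 1 2 3", {"Sum": "word-id-42", "Number": "word-id-7"})
```

Because each document carries its own word bindings, two documents can depend on different versions of a "Sum" word without ever colliding.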

Nile

[ Parent ]
Actual Information? (4.00 / 1) (#35)
by evin on Wed Jul 11, 2001 at 08:20:46 PM EST

I read the article and many pages on dloo.org, but all I find are arguments saying "our hype is better than their hype," not discussions of what the language actually is. It seems to be some sort of XML-based meta-language, but it's not clear why I would want this. One of the FAQs asks for a simple example, and it is left unanswered (generalities about great/different things you could do are not an answer; a link to well-explained code is).

What interesting things have you done with BlueBox other than writing BlueBox itself? Where can I get some real documentation that explains how to use the language (or the languages it creates, if I'm not supposed to use it directly)? As is, I have a better chance of getting work done in INTERCAL.

If I'm missing something obvious here, point me in the right direction.

Different programming model, not XML (4.00 / 1) (#37)
by nile on Wed Jul 11, 2001 at 08:31:54 PM EST

Part of the difficulty of understanding BlueBox is that it is not a language, but a browser that can dynamically assemble the words that make up the language of your choice on the Internet. If you're looking for the BlueBox language on the site, you are going to be disappointed.

With that in mind, all that matters are "Words" which are the fundamental units of language in BlueBox. Words have methods, data, symbol recognizers (i.e., they can match themselves), and rule/relationships (i.e., they specify what words come after them).

The more words there are, the more languages you can use to write documents. The more they link to one another, the richer those languages will be.

Right now, we've written the following words/languages:

+ the word "word" language -- i.e., the basic language for writing words in. See the word specification on the site in the documentation section at dloo.org. This is what we have spent most of our time on and is the focus of this paper.

In addition, we've also written:

+ a subset of Perl
+ a subset of a generic programming language
+ XML

You can use all of these languages, and (by linking and/or polymorphism) combinations of these languages, in any programs you write using BlueBox.

Does this make sense?

Nile

[ Parent ]
quick notes... (4.50 / 2) (#36)
by jason on Wed Jul 11, 2001 at 08:31:01 PM EST

The idea of constructing a program out of words is the core of Forth. Forth also has similar ideals, in a way. Your idea of words and Chuck Moore's idea are quite different, however, and the terminology overlap is confusing.

Also, have you read Graham's On Lisp? If not, do so. You'll also need to dig up some explanation of reader macros; I don't know of a good one. Explaining the differences between your "word model" and Lisp's methods may help you differentiate what you've done.

You'll also want to compare / contrast with constraint languages. Again, you share some terminology with this community. I'm having a hard time picking apart the differences. See Mozart and GNU Prolog for some starting points. Some of the constraint database systems for agents also share your terminology / philosophy.

Jason, +0

Re: Quick notes (none / 0) (#39)
by nile on Wed Jul 11, 2001 at 08:49:05 PM EST

Warning: this is technical, but I think you might get it so I'm going to try:

Here's a quick definition of words:

Words have methods, data (like objects), symbol-recognizers (i.e., they can match themselves), and syntax rules/semantic relationships that specify how they connect to other words.

An HTML word then would have methods and data related to creating a Window, a symbol matcher that recognized "<HTML>" and a syntax rule/semantic relationship that specified that the Body word came after it and what methods to call/data to set on the Body word when it matched. A word that inherited from the HTML word would inherit all of these properties and could override the syntax rule/semantic relationship with a more specific word or add new relationships. Furthermore, if the HTML language was run against a document, it would match all of the standard HTML words and all of the ones that inherited from them through word polymorphism.
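That anatomy can be sketched in ordinary Python. Everything below (class names, patterns, the matching helper) is an illustration of the idea, not BlueBox's real implementation:

```python
# Sketch of a word: a symbol recognizer (pattern), a syntax rule
# (follows), and inherited behavior.  A subclass inherits all of these
# and is still matched wherever its parent would be -- the "word
# polymorphism" described above.
import re

class Word:
    pattern = None          # symbol recognizer
    follows = []            # syntax rule: which words may come next

    @classmethod
    def matches(cls, text):
        return cls.pattern is not None and re.match(cls.pattern, text) is not None

class BodyWord(Word):
    pattern = r"<BODY>"

class HtmlWord(Word):
    pattern = r"<HTML>"
    follows = [BodyWord]    # "the Body word comes after it"

class StrictHtmlWord(HtmlWord):
    # inherits the matcher and the syntaxx rule are NOT redefined here:
    # both come from HtmlWord, and either could be overridden
    pass

def candidates(words, text):
    """Collect every word whose recognizer fires, including words that
    inherit from a matched word (word polymorphism)."""
    hits = []
    for word in words:
        for cls in [word] + word.__subclasses__():
            if cls.matches(text):
                hits.append(cls.__name__)
    return hits
```

Running `candidates([HtmlWord], "<HTML>")` matches both the standard word and the one that inherits from it, which is the polymorphism claim in miniature.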

Lisp, Pliant, and other languages do allow you to override their compilers and add additional syntax. Words make this scalable (by coupling syntax and semantic relationships) and offer a new type of inheritance and polymorphism as a result of this coupling. In a way, Lisp and Pliant are to words what C (which has data and methods) is to C++ (which has inheritance and polymorphism by coupling them).

I need to stop using "rule/relationships." I meant to stop using it a while ago because I realized it made it sound like logic programming, but it slips in now and again. When I say "rule/relationships," I mean syntax rules/semantic relationships -- i.e., what are the legal ways that words can be grouped together, and what do those groupings mean.

Thanks for the suggestions. I'll eliminate the use of rule/relationships as quickly as possible.

Nile

[ Parent ]
Or should I make it more specific? (none / 0) (#41)
by nile on Wed Jul 11, 2001 at 08:52:20 PM EST

Rather than eliminating rule/relationships, maybe I should always say syntax rules/semantic relationships instead. Is that clearer?

Nile

[ Parent ]
"syntax/semantics" (none / 0) (#48)
by mbrubeck on Wed Jul 11, 2001 at 10:38:25 PM EST

Rather than eliminating rule/relationships, maybe I should always say syntax rules/semantic relationships instead. Is that clearer?
How about just "syntax/semantics" when you need to abbreviate?

[ Parent ]
Good suggestion! (none / 0) (#60)
by nile on Thu Jul 12, 2001 at 04:01:00 AM EST

My only worry is that people won't know that it is part of a word.

For example, if one says words have data, methods, symbol recognizers, and syntax/semantics, is it clear that the last is a specific programming structure like the others are?

It's clearly better than rule/relationship, though, given the former's use in logic programming. Perhaps "syntax/semantic structures" is better. It's not clear what those are, but it is clear that they are something like methods and data.

I think this is headed in the right direction,

Nile

[ Parent ]
Still needs relationship to current work... (5.00 / 1) (#64)
by jason on Thu Jul 12, 2001 at 11:55:22 AM EST

Lisp allows you to do much more than overriding the 'compiler'. It allows you to interfere with almost every stage from reading input to running machine code. You can even change away from the Lisp syntax. (I've yet to see the point behind Pliant, outside of someone's fun hack.)

And as far as I can see, your system is essentially a semantically directed grammar. These are not uncommon in reverse engineering / support codes. They tend to use Earley parsers and modify the grammar on the fly. This is also a common technique in natural language processing. What you've described above seems an almost trivial application of existing systems. If you check various XSL archives, you might see something similar... And IIRC, there's already an Earley parser in Python. Have you investigated using it?

Some programming projects go even further and merge the scanning pass. This also contains elements of intentional programming, but that's from a UI perspective.

Keep in mind that I'm not trying to trivialize your work. Very few people are working on these systems, so I (and probably others who aren't) need context to understand what's different. Solid related work sections and bibliographies lead to wider acceptance in leading-edge groups, as well. The marketroid speak and poor domain name (dLoo => "the loo") made me wonder if this is a hoax.

The next stage beyond these may be to use logic / constraint techniques to analyze the code. There are some advances in this direction for statically typed languages; see papers on usage polymorphism and set constraints. AFAIK, no one's merging the work between these areas.

Jason

P.S. Your message was not technical.

[ Parent ]

Relating it to current work (none / 0) (#72)
by nile on Thu Jul 12, 2001 at 03:45:09 PM EST

First, thanks for taking the time to examine this.

Out of curiosity, have you read "The Word Model" on dloo.org? It has a toned-down C.S. description of the general theory behind BlueBox. You can find it here. There is also a FAQ on the site in the same section that answers about 20 different questions relating words to different technologies.

The closest relationship to current work is monadic combinators from the land of functional programming. Words, however, couple state and methods and allow the syntax/semantic structures in them to be inherited and polymorphic. There is more on this in this reply. There are also several examples of this in the source code as well.

Intentional programming and words both fall under the general framework of natural language programming, but they have very different philosophies and completely different frameworks. The creator of intentional programming - Charles Simonyi - believes that syntax needs to be factored out, or better abstracted out to a syntaxless intention. In IP, as a result, one operates on the abstract syntax tree after a program has been parsed and can add additional nodes to the tree. These nodes are then translated into specific implementations.

The philosophy behind words is that the syntax of language is one of its most overlooked features. The goal isn't to abstract out different statements to an invariant representation but to manage the syntactical and semantic relationships between concepts in a domain. This management of relationships, not invariance of intentions, is seen as the key to natural language programming.

Now, syntactical relationships express the legal ways that concepts in a domain can be grouped together (e.g., the '+' token can follow the '2' token). Semantic relationships express what those groupings mean (e.g., if '+' follows '2', two is being added to something). In OO, there is a spaghetti-like relationship between syntactical and semantic relationships. Words couple these two relationships together in the same way that objects couple data and methods. In fact, words are a new programming unit that couples data, methods, symbol recognizers, and syntax/semantic relationships together.

This coupling allows for a new type of inheritance and polymorphism that can be seen in the example I pointed to above. In this way, words are very different from the xmethods of intentions. This difference carries over to the development environment. In words, one writes programs using language just as one does today. Operating on an abstract syntax tree graphically never enters the equation.

Relating words to Earley's algorithm is a little more difficult because it is a parsing algorithm, not a programming model. The only relationship I can think of is that the way Earley matches right-handed non-terminals is similar to how the polymorphism of words works. However, this is not uncommon in parsing, so the similarity is only superficial. There really isn't a relationship here from what I can tell.

good questions, by the way,

Nile

P.S. I realize that my message was not technical for you and I hope you weren't offended. Many readers of Kuro5hin are intelligent people who are just starting programming. As a result, the most frequent request is to tone down the technical terminology.

[ Parent ]
Uh, how about explaining what it actually does (4.50 / 2) (#40)
by delmoi on Wed Jul 11, 2001 at 08:50:09 PM EST

"peer to peer"?

There really is hardly any real information in your story. How are "words" different from procedures, or objects, or whatever else? You never really define what you mean when you say 'word'. Why did you choose such an ambiguous word as 'word' rather than something closer to what you actually implemented (like 'semantics pak' or something)?

As it is, this reads like a bunch of zero-content marketing hype. Why don't you put up a concrete example, and not BlueBox itself, which is over 4,000 files? How would you write a "hello world" program in BlueBox?
--
"'argumentation' is not a word, idiot." -- thelizman
This might help (5.00 / 1) (#42)
by nile on Wed Jul 11, 2001 at 09:03:01 PM EST

I actually have a paper I posted earlier on K5 that explained the word model. Here's a link to it on the site at dloo.org. It contains a CS description of how words differ from objects with several pictures to clarify the model.

I chose the word "word" for the same reason that OO enthusiasts chose the word "object." I need to convey the idea of a modular linguistic entity, and "word" is what we normally use to express that. I would love some other ideas, though. "Phrase" won't work because that conveys the idea of several modular linguistic entities.

I do need to write a tutorial, but in the meantime please look at the word model and the information at dloo.org and feel free to ask more questions.

cheers,

Nile

[ Parent ]
Props to nile (4.60 / 5) (#46)
by ScrO on Wed Jul 11, 2001 at 09:46:35 PM EST

I just want to say that nile rocks for sticking around and answering all the questions and clarifying things about this interesting topic.

Offtopic, yes, but I feel it's deserved. (=

ScrO!

This is NOT a Programming Language (3.77 / 9) (#47)
by exa on Wed Jul 11, 2001 at 10:24:57 PM EST

I'm a CS grad. student who has read more than the necessary amount of pl/compiler/semantics textbooks and papers to know what is a PL and what is not.

I've read the descriptions and examples. All this "language" is, is what Software Engineering people would call a "Component Language," which comes to mean "wrappers around other languages." In fact, this is not even a component language, because a component language lets you build compositionally (whether by encapsulating intermediate/object code or source code doesn't really matter).

Just to make the point clear: if I write a shell script that compiles source codes in different languages according to their filename extensions I will _not_ have implemented a programming language. Yet, if I have a formal language with semantics that has equivalent power of a Turing Machine, without invoking the power of any other abstract machine (or human), I will have.

In that respect any realization of lambda calculus is a PL. However, an XML document is NOT a programming language unless you give it formal semantics on its own. (This is left as an exercise to the reader. Additional study: Observe why a markup language does not lead to a good syntax for a real life PL.)

To nile: being a programmer is not enough; you are either naive or making the most elaborate hoax in Internet history. Please get a PL textbook, study denotational and operational semantics, learn how PLs are specified and implemented formally, and then come back. Then I will have the utmost respect for you. When I was a little high school student I used to code assembly madly, and I was convinced that I could write anything. But I surely couldn't know about complexity classes, formal languages or semantics theories without reading the proper textbooks, could I?

BTW, a component language that abstracts away and fully interfaces host PLs is _not_ a bad idea. However, the way you represent it and the implementation you refer to are awful. I suggest you work from the bottom up, without making up nonsense phrases about programming languages and natural languages. Forget about the Internet, and first try to make it work locally. Practice like a scientist, and you'll succeed.

Look at what other people did: Here is one that I found, a concrete thing people did to generate multiple PL bindings to a 3d code.

http://www.isogen.com/papers/software.html

If you search for software component languages and XML you'll probably find more stuff about it. For instance, w3c seems to have a software package description language. You might like that. I also checked out researchindex.com but I couldn't find relevant papers.

Your basic insight is right: you can make software packages written in different languages work together with a good lang-independent and easy to use, easy to share, component language. Unfortunately your conclusions are wrong. You have not invented or found anything in the field of programming languages. Though, as a fellow k5 reader I encourage you to learn and research seriously in this field. Let's then make a head-to-head match between yours and mine! (Yes, it's been in the design stage for the last 4 years)

Thanks,

PS: If this sounds too harsh don't mod it down. I have to let k5 readers know the truth about this issue. And I'm frankly trying to help nile with this project of his because he seems to spend a lot of time on it. Please respond gracefully if you have assessed and found any faults with my arguments.
__
exa a.k.a Eray Ozkural
There is no perfect circle.

Ok.. (4.00 / 3) (#50)
by BigZaphod on Wed Jul 11, 2001 at 11:25:42 PM EST

First off, please don't take this the wrong way, but the tone of your comment sort of annoyed me. It had a "I know more than you because I'm a grad student" feel to it. Perhaps that wasn't intended, but that's how it reads. Besides, how do you know nile isn't a grad student himself? (I sure don't...)

I basically have two nits to pick with what you've said. I should note, however, that I really don't know much about any of this and I'm not affiliated with this project in any way. I'm sure nile will respond when the time presents itself.

Anyway, the first problem I have is this quote:

"But I surely couldn't know about complexity classes, formal languages or semantics theories without reading the proper textbooks could I?"

That is totally wrong. How do you suppose those books were written in the first place? Someone clearly had to work from the ground up and learn all the details. Eventually they got written down. Pretty simple. And there's no reason a person couldn't do the exact same thing again. Perhaps nile chose to start fresh so as not to be bogged down by current theories and ideas. Perhaps not. I don't know. But you can't assume that he hasn't read about this stuff and you also can't assume he can't learn about it himself without the aid of a book.


"However, the way you represent it and the implementation you refer to are awful."

In what ways? Perhaps more information would allow nile to refine his methods or figure out where the trouble areas are. Telling him he has a problem is a nice gesture, but without more details it isn't really very useful.


You clearly seem to know what you're talking about, but less chest-pounding and more hand-offering might be better received. Nile is trying to do something new here, so why not try to figure out exactly how his ideas work before claiming that it is all wrong because a couple of books didn't do it this way.

"We're all patients, there are no doctors, our meds ran out a long time ago and nobody loves us." - skyknight
[ Parent ]
Good God (4.00 / 2) (#52)
by LukeyBoy on Thu Jul 12, 2001 at 12:47:35 AM EST

This is an open-source project, with real human beings working in their spare time to make something that they personally believe in, and you come in and have the nerve to tell them that they're wasting their time??? And that we should take your opinion as gospel because you're a computer science graduate? When someone tells me their actual job or position, it tells me a hell of a lot more about their capabilities than just what program they took in school. Thanks for letting the "k5 reader know the truth about this issue"; it's really appreciated. Please show people the consideration and respect they deserve.

PS: If this sounds too harsh, then you got the point.



[ Parent ]
Putting the post in context (4.00 / 1) (#53)
by nile on Thu Jul 12, 2001 at 01:57:22 AM EST

I just returned from a night out and am writing a full response. First of all, the conversation should probably be put in context since exa and I have talked before when I first introduced the word model.

First open the word model in a new window here

Exa's first post is five messages down, his second is four messages down, and his last is at the top of the page.

Our conversation ended here.

BlueBox is here

I'll do a full response shortly,

cheers,

Nile

[ Parent ]
A detailed response (none / 0) (#57)
by nile on Thu Jul 12, 2001 at 02:58:06 AM EST

This is a detailed response to exa's post.

Exa claims:

1. An XML document is NOT a programming language
2. Words are a component language that abstracts away and fully interfaces host PLs, as can be found at http://www.isogen.com/papers/software.html. The core point is that different languages can be made to work together through a good language-independent and easy to use, easy to share, component language.

Response to 1: XML is not a programming language. This is true. However, words are not XML. They are a programming model. They have methods, data, rule/relationships and symbol matchers and are Turing complete. Nowhere do I claim that XML is a programming language.

Response to 2: Component languages have been around for a long time. Again, this is true. Words are not a component language, though.

So, neither of the claims you present is one that I make. Let's go over words now. All of BlueBox, by the way, is written in words, complete with a word compiler, and several examples.

Here is a simple example of a Sum word:

Match "Sum"

method addNumber(n):
     self.sum += n

method finishedParsing():
    for number in NumberWords:
         addNumber(number.getValue())

rule/relationship: Number.word

And here is a simple example of the Number word:

Match "\s[0-9]"

method getValue:
    return atoi(self.match)

The Sum word is instantiated in a document when the string "Sum" is parsed. Then its rule/relationship is called to match the Number word several times, matching what the Number word says to match (in this case, any digit from 0 to 9). When the Sum word is finished with its rule/relationships, it calls its finishedParsing method, which iterates over all of the words that were matched, summing the numbers.

In this way, when "Sum 0 4 5 6 7 8" is entered, the program will run the document, matching all the words and printing out their sum.

This is clearly a Turing-complete programming language.
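For readers who want to run something, the Sum/Number example above can be approximated in plain Python. The hand-rolled matching loop below is a deliberate simplification of BlueBox's engine, and the names are only illustrative:

```python
# Sketch of the Sum/Number words: the Number word's recognizer matches
# a digit, the Sum word's rule/relationship matches Number words
# repeatedly, and finishedParsing sums the matched values.
import re

class NumberWord:
    pattern = r"\s*([0-9])"        # symbol recognizer: a single digit

    def __init__(self, matched):
        self.matched = matched

    def getValue(self):
        return int(self.matched)

class SumWord:
    def __init__(self):
        self.sum = 0
        self.numbers = []          # words matched by the rule/relationship

    def addNumber(self, n):
        self.sum += n

    def finishedParsing(self):
        for number in self.numbers:
            self.addNumber(number.getValue())

def run(document):
    """Instantiate SumWord when 'Sum' is parsed, then match NumberWords."""
    m = re.match(r"\s*Sum", document)
    if not m:
        return None
    word = SumWord()
    rest = document[m.end():]
    while True:
        m = re.match(NumberWord.pattern, rest)
        if not m:
            break
        word.numbers.append(NumberWord(m.group(1)))
        rest = rest[m.end():]
    word.finishedParsing()
    return word.sum

print(run("Sum 0 4 5 6 7 8"))     # prints 30
```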

Nile

[ Parent ]
Um (none / 0) (#71)
by delmoi on Thu Jul 12, 2001 at 03:38:15 PM EST

The above statement is something like "This bicycle is yellow, therefore it is clearly an automobile."

The example you posted does not indicate Turing completeness in any way I can see.
--
"'argumentation' is not a word, idiot." -- thelizman
[ Parent ]
Words are a superset of objects (none / 0) (#73)
by nile on Thu Jul 12, 2001 at 03:55:48 PM EST

Sorry, I thought this was clear in the example. You can write words without syntax/semantic structures, and in BlueBox they can be directly referenced by ID rather than being instantiated by name. So, if you write words without matchers and syntax/semantic structures, they reduce down to objects. In this way, word-oriented programming languages can be reduced to OOP languages by decreasing their expressiveness. Since OOP languages are Turing-complete, words can also be Turing-complete.

cheers,

Nile

[ Parent ]
Hmm, let me see (none / 0) (#86)
by kubalaa on Fri Jul 13, 2001 at 08:32:03 AM EST

I want to see if I understand you fully. Words are not a "language," they are a "paradigm." So even though you're talking about writing a language with syntactic sugar for words, it sounds like I could implement words in python by doing something like this:
  1. Have all my domain objects inherit from a base class which defines a property, "match", and an abstract method, "onMatch", which receives the tokenizer.
  2. Write a little tokenizer which attempts to match all objects in my system one at a time against a string.
  3. Now, instead of writing foo.bar(1,2,3), I can write a script like "foo 1 2 3".

What have I gained by this? Does this have any applicability beyond writing really primitive scripting languages? What is inadequate about python's expressive syntax that the primitive syntax of token streams fixes?
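Concretely, steps 1-3 might look something like this minimal sketch (all names here are made up for the sake of the example):

```python
# Step 1: a base class with a `match` property and an abstract `onMatch`.
class WordBase:
    match = None                      # the token this object answers to

    def onMatch(self, args):          # receives the remaining tokens
        raise NotImplementedError

class Foo(WordBase):
    match = "foo"

    def onMatch(self, args):
        return self.bar(*[int(a) for a in args])

    def bar(self, a, b, c):
        return a + b + c

# Step 2: a little tokenizer that tries each object against the string.
def run_script(objects, script):
    tokens = script.split()
    for obj in objects:
        if obj.match == tokens[0]:
            return obj.onMatch(tokens[1:])
    raise ValueError("no word matched %r" % tokens[0])

# Step 3: instead of foo.bar(1, 2, 3), write the script "foo 1 2 3".
print(run_script([Foo()], "foo 1 2 3"))   # prints 6
```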

[ Parent ]

5 benefits of word-oriented programming (none / 0) (#88)
by nile on Fri Jul 13, 2001 at 03:37:47 PM EST

First of all, thanks for taking the time to understand the basics. I'm currently writing a tutorial to make all of this more clear.

There are several benefits to using words. To avoid being vague about these benefits, I'll make formal claims so that you can hold me to what I say. Claims 1, 2, and 4, by the way, are duplicates of what I said in response to your first post.

CLAIM 1: Words allow fewer relationships than objects. In particular, by weeding out bad object-to-object relationships, they protect the programmer from bad design decisions.

This is probably the most difficult part of words to understand. Words appear on the surface to be adding to the number of relationships that a programmer can make. This appearance of additional freedom is just an illusion, though. What words do is couple together the syntactical and semantic relationships that elements have with each other, in the same way that objects couple data and methods.

In object-oriented programming, if you are solving a problem there is no guarantee that this coupling will exist. The programmer, for example, might make it legal for there to be two '+' symbols next to each other, unintentionally causing the addition method to be called twice when it should only be called once. This bug would be hard to find because the lack of coupling between syntactical and semantic relationships means that the addition method could be anywhere in the code.

CLAIM 2: Words make it easier to integrate different libraries together.

Methods and objects give the illusion of simplicity, but this illusion is shattered when one tries to integrate two libraries that cover different domains. In the object world, there are far too many relationships that objects can have with each other. It's possible for there to be a third object, for example, that manages the syntactical relationships between two objects and a fourth that defines what those relationships mean. A programmer looking to add more objects to the system can cause unintended semantic side effects as a result by changing the legal relationships between objects and missing what those relationships mean.


To sum up, words, by coupling syntactical and semantic relationships together, force programmers to write good code and, as a result, make it much easier to scale projects.

CLAIM 3: Words allow for language inheritance.

Words allow languages to inherit from each other on a word-by-word basis. If you look in the word source under bluebox/src/translator/input/code, you will see a directory of words defining a generic computer language. Look carefully and you'll notice that there are roughly five root words in the directory: program, code, assertion, declare, and condition. These form the root language of most programming languages.

Creating a specific programming language, then, is simply a matter of inheriting from this root language and creating language-specific conditions, declarations, etc. Look in bluebox/src/translator/input/perl for an example of this.

CLAIM 4: Words make relationships between elements in a domain explicit.

If a person who knew no language had to learn one, it would be easier for that individual to learn C than the languages we use in everyday life. After all, C has only a few reserved words in it.

But using C to solve math problems is much harder than using standard mathematical notation. C does not concisely express the relationships between elements in a domain. As a result, a user of the language has to do more bookkeeping in their head. Compare this:

function * f = constructFunction(X, 3)
f = applyFormula(f)
f = Integrate(f)

with:

f(X) = Integral(sin(X + 3) - cos(X+3))

In the former, the user is forced to do the bookkeeping of relationships in their head. They have to remember that the original function was X + 3. Then, they have to remember that passing it to applyFormula changes the function to sin(X + 3) - cos(X + 3). The relationships between elements are hidden behind methods.
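The contrast can be sketched in Python (the helpers mirror the article's invented constructFunction and applyFormula; the integration step is omitted for brevity):

```python
import math

# Imperative style: the relationship "f is sin(X+3) - cos(X+3)" is
# assembled across several calls, so the reader tracks it mentally.
def construct_function(offset):
    return lambda X: X + offset

def apply_formula(f):
    return lambda X: math.sin(f(X)) - math.cos(f(X))

g = apply_formula(construct_function(3))

# Declarative style: the whole relationship is visible in one place.
h = lambda X: math.sin(X + 3) - math.cos(X + 3)

assert abs(g(1.0) - h(1.0)) < 1e-12   # both compute the same function
```

Both functions compute the same values; the difference is only in how much of the relationship the reader can see at once.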

CLAIM 5: Words make it possible to build a Web of software

Before the Web, books were released in discrete bundles consisting of hundreds of pages. These pages could not be linked to each other. If a person borrowed a book from a library that referenced another, they would have to return to the library to get the other book. The Web ripped books open and freed their pages. Then it jumbled them together so that one page could instantly transport a reader to the next through a link. Today, anyone can extend the existing content simply by linking to existing Web pages on the Internet.

Like the publishing world before the Internet, software today is released in discrete bundles consisting of hundreds of objects. The body of open source software is analogous to a worldwide library of books that writers can check out and use to create new books. Words can rip open the existing libraries into discrete words that can link to one another on the Internet. In this way, the entire body of open source software can move online, and the hundreds of domain-specific libraries that do not interact with each other, the dozens of languages that are incompatible with each other, can merge into a natural language that everyone can use and extend.

Thanks for the questions. They were right to the point,

Nile

[ Parent ]
The only thing I understand is... (none / 0) (#84)
by EraseMe on Thu Jul 12, 2001 at 11:55:30 PM EST

...exa is a complete dick.

[ Parent ]
What is the browser's actual function? (none / 0) (#55)
by mold on Thu Jul 12, 2001 at 02:41:38 AM EST

I apologize if any of this is on your site; I haven't had a chance to view it yet, and I have a few questions. Okay, so maybe more than a few :-)

What is the actual job of the browser? Is it a compiler? Do we just run our code through the browser and get a binary output, once it has had a chance to download all of the words that it needs? Or will the programs actually run in the browser?

Are the languages we create merely scripting languages? Or compiled? Will we be able to embed a language that we've written into a program we've written? i.e. While the user is running my program, if it's run in the browser, would it run the user's script? Or if it's compiled, is there a way to call the browser? Would there be a way to limit the power the user would have, while still being flexible with control for other users?

Sorry about having so many questions, I'll check your website the first chance that I have. Although I hope that you'll answer here, as well.

---
Beware of peanuts! There's a 0.00001% peanut fatality rate in the USA alone! You could be next!
The browser's function is to ... (none / 0) (#58)
by nile on Thu Jul 12, 2001 at 03:17:40 AM EST

... dynamically assemble language structures called words into complete languages that can run documents.

That was a lot of buzzwords, so let me break it up a little bit better. Before doing so, though, it should be noted that when I say "word," I mean a programming unit that has data, methods, and other programming properties, not the colloquial meaning.

Let's start with an analogy. Consider an English teacher and a dictionary. When a student hands the teacher a document, the teacher goes to the dictionary to get all of the words that the student uses in the document, and then uses the grammar rules specified next to the words to "glue" those words together to make a complete language. The teacher then "runs" this language on the student's document as he checks it for errors.

Now, the browser functions the same way. It can be thought of as a language assembler. When the browser opens a document it does not understand, it downloads the words it needs from the Internet. So, programmers put modular pieces of language on the Internet, and the browser collects them, glues them together, and assembles them into a language. It then uses this language to run the document, in the same way the teacher uses a dictionary's definitions and grammar rules to grade the student's paper.
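As a toy illustration of that assembly step (this is not BlueBox's actual code; the in-memory dictionary here stands in for words fetched over the Internet):

```python
# A document names the word it uses; the "browser" looks the word up,
# glues it to the document's arguments, and runs the result.
DICTIONARY = {
    "Sum": lambda args: sum(int(a) for a in args),
    "Max": lambda args: max(int(a) for a in args),
}

def run_document(document):
    head, *rest = document.split()
    word = DICTIONARY[head]      # stands in for downloading the word
    return word(rest)            # run the document with the assembled language

print(run_document("Sum 1 2 3"))   # 6
print(run_document("Max 1 2 3"))   # 3
```

The real system assembles grammar rules as well as behavior, but the lookup-then-run shape is the same.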

Do we just run our code through the browser and get a binary output, once it has had a chance to download all of the words that it needs? Or will the programs actually run in the browser? Are the languages we create merely scripting languages? Or compiled?

Right now, all of the downloaded words are stored in a database and dynamically run by an interpreter (in this case Python) from that database. The backend of BlueBox is also being designed to translate documents to different languages like C/C++/Java/etc. that can be compiled.

Will we be able to embed a language that we've written into a program we've written? i.e. While the user is running my program, if it's run in the browser, would it run the user's script? Or if it's compiled, is there a way to call the browser?

If you choose to compile the program down rather than run it in the interpreter, you will be able to embed it in a program that you have written and you will not need to call the browser to do so since it can be a library or a component. Currently, words only compile down to Python, but we have begun work to make them compile down to Java and other languages.

Would there be a way to limit the power the user would have, while still being flexible with control other users?

I'm not sure I understand this last question, so could you expand on it a little more? If you are asking about the security model of BlueBox, it has two different security models: the traditional restricted execution environment, and using words to restrict new words from being uploaded to the database.

Good questions! Please feel free to ask more,

cheers,

Nile

[ Parent ]
Quick question. (none / 0) (#56)
by mold on Thu Jul 12, 2001 at 02:49:17 AM EST

When you say that you compile the language into the technology of choice, and then list a few languages, what do you mean exactly? If I want it to print out a C++ source file, does it do that? Or does it actually create the binary files? I'm a little confused on what the actual output is.

---
Beware of peanuts! There's a 0.00001% peanut fatality rate in the USA alone! You could be next!
It compiles down to source (none / 0) (#59)
by nile on Thu Jul 12, 2001 at 03:23:45 AM EST

The translator in BlueBox compiles languages down to the source language that you are interested in, not a binary. That source can then be compiled into a binary by conventional means.

We're actually pretty excited about this part of our technology, since it uses word inheritance to create a base language reader that all of the new compilers inherit from to read the generic language that they compile. If you're interested in seeing some neat stuff, check out bluebox/src/translator and then run the tests in bluebox/tests/input_technologies and bluebox/tests/output_technologies. Read the README.designs to see what's going on.

Good question, by the way. There are significant differences between source-to-source and source-to-binary translators. We intentionally chose the former because it offers a number of benefits to the developer.

Nile

[ Parent ]
Problems (none / 0) (#61)
by rusty on Thu Jul 12, 2001 at 09:55:14 AM EST

I'd like to try out BlueBox, but I can't get it to compile.

First, you need to mention somewhere that it requires wxPython. Ok, figured that out, but now I get:

make[2]: Entering directory `/home/rusty/bluebox/src/units'
/home/rusty/bluebox/src/bluebox/bluebox.py --dictionary=/home/rusty/bluebox/src/bluebox --load=word.word
Dictionary Path: /home/rusty/bluebox/src/bluebox
Load file: word.word
Traceback (innermost last):
  File "/home/rusty/bluebox/src/bluebox/bluebox.py", line 435, in ?
    app = BlueBox(0)
  File "/usr/lib/python1.5/site-packages/wxPython/wx.py", line 1607, in __init__
    _wxStart(self.OnInit)
  File "/home/rusty/bluebox/src/bluebox/bluebox.py", line 343, in OnInit
    self.Cache(self.LoadFile)
  File "/home/rusty/bluebox/src/bluebox/bluebox.py", line 225, in Cache
    loadString = self.getInclude("load")
  File "/home/rusty/bluebox/src/bluebox/bluebox.py", line 234, in getInclude
    includePath = self.getIncludeWordPath(wordName)
  File "/home/rusty/bluebox/src/bluebox/bluebox.py", line 259, in getIncludeWordPath
    fileList = os.listdir(currentPath)
OSError: [Errno 2] No such file or directory
make[2]: *** [load-all] Error 1
make[2]: Leaving directory `/home/rusty/bluebox/src/units'

Any suggestions?

____
Not the real rusty

Getting BlueBox to compile (none / 0) (#65)
by nile on Thu Jul 12, 2001 at 12:26:35 PM EST

Hi Rusty,

If you check out from CVS, you need to also download the tarball from http://www.dloo.org/download/bluebox-0.5.tgz, untar it, and copy bluebox/src/bluebox/memory from the tarball to bluebox/src/bluebox.

The natural language dictionary is intentionally not included in CVS. There are also directions on the CVS page but they are at the bottom of the page. I should probably move them to the top.

Thanks for checking it out and if you have any more troubles let me know. Once you get it working look in the tests directory to see all the stuff discussed above working.

cheers,

Nile

[ Parent ]
wxPython no longer required (none / 0) (#66)
by nile on Thu Jul 12, 2001 at 12:32:08 PM EST

wxPython is no longer required as a dependency.

It should run unaided following the directions below on a standard Linux/BSD install that has Python >= 1.5.2. Previous releases did have dependencies on Apache's Xerces, wxPython, wxGTK and C++ templates, but those dependencies have been intentionally removed so that BlueBox compiles out of the box. I'll update the download page today and make it more clear that the previous dependencies are no longer needed.

Nile

[ Parent ]
One more time, give us an example. (4.00 / 2) (#63)
by dzelenka on Thu Jul 12, 2001 at 11:31:42 AM EST

I hate to repeat, but this single important request seems to be ignored. Please give us a sample of code that does "x" better than any other language. Nothing brings clarity quicker than a good example.
"Are you talkin' to me?"
Working on reply (none / 0) (#67)
by nile on Thu Jul 12, 2001 at 12:50:59 PM EST

I'm working on a detailed example to your question. It should be ready shortly.

cheers,

Nile

[ Parent ]
Formal claims with examples (4.50 / 2) (#68)
by nile on Thu Jul 12, 2001 at 01:14:12 PM EST

Good question. I'll be formal too, so that you can hold me to what I say.

CLAIM: Words allow you to create scalable, dynamic languages better than any other language. Words allow non-communicating parties to build and extend these languages on the Internet.

EXAMPLE: Here is a simple example of building a language with words:

Match "Sum"

method addNumber(n):
    self.result += n

method getValue():
    return self.result

method finishedParsing():
    for number in NumberWords:
        addNumber(number.getValue())

syntax/semantics: Number.word

And here is a simple example of the Number word:

Match "\s[0-9]"

method getValue():
    return atoi(self.match)

The Sum word is instantiated in a document when the string "Sum" is parsed. Then its rule/relationship is called to match the Number word several times, matching whatever the Number word says to match (in this case, any digit between 0 and 9). When the Sum word is finished with its rule/relationships, it calls its finishedParsing method, which iterates over all of the words that were matched, summing the numbers.

In this way, when "Sum 0 4 5 6 7 8" is entered, the program will run the document, matching all the words, and print out their sum. Solving problems in object-oriented programming means creating modular objects that programmers can instantiate and reuse. Solving problems in word-oriented programming means creating modular pieces of language (i.e., words) that programmers can use as a language.
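A rough Python simulation of that control flow may help (BlueBox's real word format differs; the class and method names here are invented to mirror the example):

```python
import re

class NumberWord:
    pattern = re.compile(r"[0-9]")   # what the Number word matches
    def __init__(self, match):
        self.match = match
    def getValue(self):
        return int(self.match)

class SumWord:
    def __init__(self, number_words):
        self.result = 0
        self.NumberWords = number_words
    def finishedParsing(self):
        # Called once matching is done: iterate the matched words.
        for number in self.NumberWords:
            self.result += number.getValue()
        return self.result

def run(document):
    head, *rest = document.split()
    assert head == "Sum"             # "Sum" instantiates the Sum word
    numbers = [NumberWord(tok) for tok in rest
               if NumberWord.pattern.fullmatch(tok)]
    return SumWord(numbers).finishedParsing()

print(run("Sum 0 4 5 6 7 8"))   # 30
```

The key structural point survives the simplification: the Sum word owns both the rule for what may follow it and the code that runs when matching finishes.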

EXAMPLES OF WORDS LINKING, INHERITING, and WORD POLYMORPHISM:

Now, words are better at language creation for three reasons: the ability to add another syntax/semantics structure that links to a new word, the ability to inherit the syntax/semantics structure, and the polymorphism of syntax/semantics structures. Let's go over these one by one:

If the Sum and Number word were posted on the Internet, another programmer could come along and write the following Draw word:

Match "Draw"

method drawPoint(n):
    GUI.drawPoint(n)

method finishedParsing():
    result = self.getFirstWord()
    self.drawPoint(result.getValue())

syntax/semantics: Sum.word

So, now we can write "Draw Sum 3 5 6" and get the point drawn. Notice how the Draw author and the Sum author did not have to talk to one another.

Now, let's try inheriting from a word. Let's create the Multiply word which inherits from the Sum word.

Match "Multiply" inherit from "Sum"

method multiplyNumber(n):
    self.result = n * self.result

method finishedParsing():
    for number in NumberWords:
        multiplyNumber(number.getValue())

Now, we can write "Multiply 6 8 9 4" and it will correctly multiply those numbers. Notice how we did not have to specify the syntax/semantics structure: "Multiply" automatically inherited it from "Sum." In complex languages like HTML, for example, words can inherit dozens of syntax/semantics structures from each other.

Finally, let's look at the polymorphism of words. Remember the Draw author who linked to the Sum word. Since the Multiply word inherits from the Sum word, through polymorphic matching it is now possible to write "Draw Multiply 3 5 6." What happens here is that when Sum is matched through Draw's syntax/semantic link, Multiply is too, since it inherits from it. In this way, links become progressively richer over time. Someone else, for instance, could inherit from Number and create a BigNum word that overloads the plus operator. Then, the following would work: "Draw Multiply 234342 1323132 32423424."
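In plain Python classes, the polymorphic matching described here looks roughly like this (illustrative only; BlueBox expresses the link through the syntax/semantics structure rather than an isinstance check):

```python
class Sum:
    def run(self, numbers):
        result = 0
        for n in numbers:
            result += n
        return result

class Multiply(Sum):       # inherits Sum's place in the language
    def run(self, numbers):
        result = 1
        for n in numbers:
            result *= n
        return result

class Draw:
    def run(self, inner, numbers):
        # Draw linked to Sum; any word inheriting from Sum also matches.
        assert isinstance(inner, Sum)
        return ("point", inner.run(numbers))

print(Draw().run(Sum(), [3, 5, 6]))       # ('point', 14)
print(Draw().run(Multiply(), [3, 5, 6]))  # ('point', 90)
```

Draw's author never mentions Multiply, yet Multiply works wherever Sum was linked; that is the sense in which links grow richer over time.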

CONCLUSION: The linking, inheritance, and polymorphism of words make them better than other languages at creating scalable languages. The fact that one can post words on the Internet and non-communicating parties can extend the language means words allow for scalable, Internet-defined languages, as discussed in the paper.

Thanks for the question. Does the above make sense? If not, what parts are unclear?

Nile

[ Parent ]
good examples. (none / 0) (#76)
by taruntius on Thu Jul 12, 2001 at 05:32:53 PM EST

And like all good examples, they raise a number of further questions. The biggest one I'm having trouble with is "fundamentally, how is this different than any other OO language?"

Let's say I create a C++ class X that's able to operate on objects of class Y. Let's say someone creates a class Z that's inherited from class Y. My class X should still be able to operate on items of class Z. This seems to be what you're describing with your polymorphism and inheritance mechanism. I'm having trouble seeing how the word model is an actual improvement.

By virtue of the fact that you keep referring to people posting their words on the net and magically making them available for others, I get the impression that you think that's the big new thing. Ok, fine, but how is that any different than what happens if I post my class X on my website and someone else posts their class Z on their website? Sure, your model seems to obviate the need for other programmers to actually track down the locations of those classes, download them, and compile them. But that's gruntwork. While I will certainly give you points for automating a large chunk of gruntwork which nobody likes to do anyway, I'm still not seeing how it's fundamentally any different than good old C++. And anyway, Java makes a good argument for having already done away with a lot of that gruntwork.

Let's see some examples of things you can do in the word model that either can't be done in C++, or can't be done without a whole lot of pain.




--Believing I had supernatural powers I slammed into a brick wall.
[ Parent ]
Words define languages (none / 0) (#83)
by nile on Thu Jul 12, 2001 at 07:46:11 PM EST

The key thing here is that words define languages, not objects.

Let's say I create a C++ class X that's able to operate on objects of class Y. Let's say someone creates a class Z that's inherited from class Y. My class X should still be able to operate on items of class Z. This seems to be what you're describing with your polymorphism and inheritence mechanism. I'm having trouble seeing how the word model is an actual improvement.

This is one of the more difficult questions to answer, because both languages are Turing-complete, which means that anything I present in words can also be done in C++. The GNOME people, for example, do object-oriented programming in C by coupling data and methods together. The question we should examine, then, is not whether C++ can do word-oriented programming, but whether it forces the programmer to do so.

Imagine a more complicated example where there were numbers, operators, parentheses, and drawers. In this example, a word document might have "Draw 3 + (5 - 4) * 6". There are several different ways that an OOP programmer could solve this problem, just as there are several different ways that a structural programmer could handle data and methods. One popular way would be to write a parser that parsed the tokens, and then iterate over those tokens and run commands as different tokens were matched. The problem with this approach is that it makes it very easy for a spaghetti-like relationship between syntax and semantics to evolve. Changing what syntax the parser reads could have unintended semantic side effects. For example, letting two pluses be next to each other might end up making the addition method be called twice -- a disaster for an accounting program.

So, what words do is eliminate bad relationships like that by coupling syntactical and semantic relationships in addition to data and methods. It's not that you can't do this in C++; it's that words force you to.
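A minimal Python sketch of the coupled alternative (hypothetical code, not BlueBox's implementation): the rule that '+' may only follow a number sits right beside the code that performs the addition, so the doubled '+' is rejected before any semantics run:

```python
import re

TOKEN = re.compile(r"\d+|\+")

def evaluate(text):
    tokens = TOKEN.findall(text)
    total, expect_number = 0, True
    for tok in tokens:
        if tok == "+":
            if expect_number:
                # Syntax rule enforced beside the semantics it guards.
                raise SyntaxError("'+' may only follow a number")
            expect_number = True
        else:
            total += int(tok)       # the one place addition happens
            expect_number = False
    return total

print(evaluate("3 + 5 + 4"))   # 12
# evaluate("3 + + 5") raises SyntaxError instead of adding twice
```

Because the legality check and the addition live together, loosening the grammar cannot silently change what the addition does.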

Now, the reason that the inheritance and polymorphism are different is that this is language-based inheritance and polymorphism. I.e., the result is a language composed of words, which can be inherited from and can be polymorphic. Coupling syntactical and semantic relationships, in fact, forces you to create languages rather than just collections of objects. It's a direct consequence of the model.

Nile

[ Parent ]
all programming is words (none / 0) (#85)
by kubalaa on Fri Jul 13, 2001 at 08:18:29 AM EST

I do think words are a good way to approach certain problems of creating new languages. They clearly allow for more domain-specific syntax in some ways.

What I still don't see is how this is conceptually different from any programming except that it's less structured. In a traditional language, when you write a function you're defining a word. The function signature is the equivalent of "syntax/semantics". You could say it's even more so in Lisp; the only difference I see in your approach is that functions are named with regular expressions.

I don't see that words get any context beyond the following tokens, so you're completely limited to prefix notation. I'm also not sure that the problem they fix well --- inflexibility of syntax --- is really a problem. The C-style limitations of functions, variables, etc. have proven themselves very useful; before I bought into the flexible syntax of words I'd have to see that it's being used to good effect. Is something written in words easier to understand? I find it hard to imagine how it could be, because the behaviour of every token is completely dependent on context, and the more flexibility you add to allow reasonable syntax structures, the harder it becomes for a human reading the code to predict how the words will interact.

Lastly, coordination between different syntaxes is not the main problem of "peer-to-peer" programming; data handling is. Any mini-language you write in words is useless to integrate unless it shares some words with those in another language.

[ Parent ]

Words are easier to use, not to understand (none / 0) (#87)
by nile on Fri Jul 13, 2001 at 02:55:53 PM EST

What I still don't see is how this is conceptually different from any programming except that it's less structured. In a traditional language, when you write a function you're defining a word. The function signature is the equivalent of "syntax/semantics". You could say it's even more so in Lisp; the only difference I see in your approach is that functions are named with regular expressions.

Structural programming is much simpler than object-oriented programming. But, after getting over the hurdle of learning object-oriented programming, most programmers find objects much easier to use in practice. Objects restrict the relationships that programmers can make between data and methods and, as a result of that restriction, prevent them from making serious mistakes.

Words work the same way, as I detail below. They restrict the relationships that objects can have with one another. In particular, they couple syntactical relationships of objects (i.e., it is legal for '+' to come after '0-9') with semantic relationships (i.e., what it means for plus to come after a number). Object-oriented programming does not enforce this coupling and, as a result, makes it easy for programmers to get these relationships wrong.

I don't see that words get any context beyond the following tokens, so you're completely limited to prefix notation.

The following token might itself match tokens, so a word could get to those by using self.getFirstWord().getFirstWord(). You have access to all of the words that you manage and - currently - all of the words that they manage.

I'm also not sure that the problem they fix well --- inflexibility of syntax --- is really a problem. The C-style limitations of functions, variables, etc. has proven itself very useful; before I bought into the flexible syntax of words I'd have to see that it's being used to good effect. Is something written in words easier to understand? I find it hard to imagine how it could be, because the behaviour of every token is completely dependent on context, and the more flexibility you add to allow reasonable syntax structures, the harder it becomes for a human reading the code to predict how the words will interact.

Methods and objects give the illusion of simplicity, but this illusion is shattered when one tries to integrate two libraries that cover different domains. In the object world, there are far too many relationships that objects can have with each other. It's possible for there to be a third object, for example, that manages the syntactical relationships between two objects and a fourth that defines what those relationships mean. A programmer looking to add more objects to the system can cause unintended semantic side effects as a result by changing the legal relationships between objects and missing what those relationships mean.

The programmer, for example, might make it legal for there to be two '+' symbols next to each other, unintentionally causing the addition method to be called twice when it should only be called once. This bug would be hard to find because the lack of coupling between syntactical and semantic relationships means that the addition method could be anywhere in the code.

To sum up, words, by coupling syntactical and semantic relationships together force programmers to write good code and, as a result, make it much easier to scale projects.

The C-style limitations of functions, variables, etc. have proven themselves very useful; before I bought into the flexible syntax of words I'd have to see that it's being used to good effect. Is something written in words easier to understand? I find it hard to imagine how it could be, because the behaviour of every token is completely dependent on context, and the more flexibility you add to allow reasonable syntax structures, the harder it becomes for a human reading the code to predict how the words will interact.

I think we have to distinguish here between difficulty of use and difficulty of learning. If a person who knew no language had to learn one, it would be easier for that individual to learn C than the languages we use in everyday life. After all, C has only a few reserved words in it.

But using C to solve math problems is much harder than using standard mathematical notation. C does not concisely express the relationships between elements in a domain. As a result, a user of the language has to do more bookkeeping in their head. Compare this:

function * f = constructFunction(X, 3)
f = applyFormula(f)
f = Integrate(f)

with:

f(X) = Integral(sin(X + 3) - cos(X+3))

In the former, the user is forced to do the bookkeeping of relationships in their head. They have to remember that the original function was X + 3. Then, they have to remember that passing it to applyFormula changes the function to sin(X + 3) - cos(X + 3). The relationships between elements are hidden behind methods.

Words allow the relationships between elements to be explicit.

cheers,

Nile

[ Parent ]
more questions (none / 0) (#89)
by kubalaa on Fri Jul 13, 2001 at 03:57:55 PM EST

Thanks for the detailed reply.
  1. In particular, they couple syntactical relationships of objects (i.e., it is legal for '+' to come after '0-9') with semantical relationships (i.e., what it means for plus to come after a number).
    As I mentioned elsewhere, this is meaningful only for people writing parsers. Most objects in applications don't even have a syntactic relationship. What relationships they do have are far more complex than one-dimensional positions in a token string.
  2. The following token might also match tokens
    So I take it that's a yes, you are limited to prefix notation. There's no way for me to say: "'x' must come between two numbers" without changing the number definition to allow 'x' to follow it.
  3. I don't see that your code example has anything to do with "words." It could be written in a variety of ways with a variety of languages, and some are just as easy to read as your second implementation. What you didn't show was the rest of the implementation it would require to define the functions you use: with words, you'd have to write your own parser to allow you to use nested parentheses like you have, while most languages provide a useful syntax ready-made. I'd like to see evidence that defining your own syntax every time you program is worth the work; what's wrong with the standard notation?


[ Parent ]
Re: more questions (none / 0) (#90)
by nile on Fri Jul 13, 2001 at 05:46:49 PM EST

1.In particular, they couple syntactical relationships of objects (i.e., it is legal for '+' to come after '0-9') with semantical relationships (i.e., what it means for plus to come after a number). As I mentioned elsewhere, this is meaningful only for people writing parsers. Most objects in applications don't even have a syntactic relationship. What relationships they do have are far more complex than one-dimensional positions in a token string.

All problems have elements that have syntactical and semantic relationships. Consider an ATM. It has relationships between its money feeder, its button interface, the customer, and the bank. Consider a calculator. It has relationships between numbers, operators, and parentheses. Consider a web browser. It has relationships between navigation, history, user interface, etc.

I think part of the confusion here is that we mean different things by syntactical. When I say syntactical relationships, I mean legal relationships. It is legal, for example, for a customer to ask for money from the bank and vice-versa. It is not legal for the button interface to ask for a loan from the money feeder. All programs restrict the ways in which elements in their domains can interact. The fact that object-oriented programming does not provide an explicit mechanism for handling these relationships does not mean that they do not exist.

What words do is restrict what relationships these elements can have with each other to prevent bad programming behavior. They accomplish this by coupling the syntactical relationships between elements (i.e., how they can be legally related) with the semantic relationships (i.e., what those relationships mean).

I agree, by the way, that there are some relationships that should be represented to the user other than as a text string (matrices, for example). This is a representation problem, though. Words can parse multi-dimensional matrices fine.

2.The following token might also match tokens So I take it that's a yes, you are limited to prefix notation. There's no way for me to say: "'x' must come between two numbers" without changing the number definition to allow 'x' to follow it.

Again, I didn't give enough detail. If you look in BlueBox, the matchers actually match a beginning and an end expression, and the syntax/semantic structures are matched in between them:

The matcher in words actually looks like the following:

<word:symbol>
   <word:begin expression=""/>
   <word:end expression=""/>
</word:symbol>

So, it is very easy to match pairs of parentheses, an x between two numbers, etc. I am actually increasing the power of the matching system, too, with something I call abstract regular expressions: regular expressions that can match words as well. Most of the work on this is done -- I'm just finishing the backtracking algorithm.
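A toy version of that begin/end matching in Python regular expressions (the real structure is the XML form shown above; this only illustrates the idea of matching inner content between a begin and an end expression):

```python
import re

def match_between(text, begin, end):
    # The span between the begin and end expressions is what the
    # word's syntax/semantics structures would be matched against.
    m = re.match(begin + r"(.*)" + end + r"$", text)
    return None if m is None else m.group(1).strip()

print(match_between("(1 2 3)", re.escape("("), re.escape(")")))  # '1 2 3'
print(match_between("4 x 5", r"\d+\s*", r"\s*\d+"))              # 'x'
```

The second call shows the non-prefix case from the question: an 'x' required to sit between two numbers, without changing the number definition.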

3. I don't see that your code example has anything to do with "words." It could be written in a variety of ways with a variety of languages, and some are just as easy to read as your second implementation. What you didn't show was the rest of the implementation it would require to define the functions you use: with words, you'd have to write your own parser to allow you to use nested parentheses like you have, while most languages provide a useful syntax ready-made. I'd like to see evidence that defining your own syntax every time you program is worth the work; what's wrong with the standard notation?

I provide this evidence farther down the page in "5 benefits of word-oriented programming." However, I can see your concern here so let me see if I can answer it.

The worry appears to be that programmers will now be burdened with the additional task of defining a syntax. I think this will rarely be the case, though: most syntaxes inherit from other syntaxes.

For example, in English, most words inherit from nouns, verbs, and modifiers. Creating a new domain in English, then, does not mean respecifying all of the grammar rules that define how nouns, verbs, and modifiers can be joined together. Instead, the programmer simply inherits from those three core words that already define the English language. No parser has to be written to create a domain-specific dialect of English, since the core language is already defined.

Something similar happens in object land, by the way; we just don't see it. When we inherit from the root object of a language, what we are really saying is, "I am creating a new 'word' that can be used wherever 'object' can be used in this language." Of course, since objects don't have the properties of words, we don't look at it this way, even though that is what is happening.

Understanding language inheritance is key to understanding words. Without inheritance, the system would require a great deal of work from ordinary programmers.
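As a rough Python analogy (illustrative only; `Number`, `Literal`, `Roman`, and `sum_words` are my names, not BlueBox's), language inheritance works like class inheritance: a new word that inherits from Number is immediately usable anywhere the existing grammar accepts a Number, with no new parser:

```python
# Sketch: language inheritance as class inheritance. Roman is a new
# "word" that inherits from Number, so existing code (and, by analogy,
# an existing grammar) that only knows about Number accepts it as-is.

class Number:
    def value(self):
        raise NotImplementedError

class Literal(Number):
    def __init__(self, n):
        self.n = n
    def value(self):
        return self.n

class Roman(Number):  # a new word inheriting from the core Number word
    NUMERALS = {"I": 1, "V": 5, "X": 10}
    def __init__(self, text):
        self.text = text
    def value(self):
        total = 0
        for ch, nxt in zip(self.text, self.text[1:] + " "):
            v = self.NUMERALS[ch]
            # subtractive notation: a smaller numeral before a larger one
            total += -v if self.NUMERALS.get(nxt, 0) > v else v
        return total

def sum_words(words):  # written before Roman existed; knows only Number
    return sum(w.value() for w in words)

print(sum_words([Literal(2), Roman("IV")]))  # 6
```

The point is that `sum_words` needed no change to accept `Roman` -- the analogue of extending a language without respecifying its grammar.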

Nile

Can't parse examples (none / 0) (#78)
by xanaguy on Thu Jul 12, 2001 at 05:49:53 PM EST

I can't tell what are keywords and what are user-defined names in your examples. And how things link up.

Specifically, why do the "addNumber" and "multiplyNumber" methods get called? Is "result" (not "self.result") a member variable of the word or a local variable of the function? Why does the add word accept multiple numbers after it instead of just one? That is, how are the syntax/semantics lines interpreted?

Understanding the examples (none / 0) (#80)
by nile on Thu Jul 12, 2001 at 06:06:39 PM EST

Keywords:

Match: structure in a word that matches a symbol like "Sum"
method: a method
syntax/semantics: a structure that matches a word multiple times (this should have been explained)

Keywords in Action:

So the following word matches "Sum" and any Number words that come after it.

Match "Sum"

method addNumber(n):
    self.result += n

method getValue():
    return self.result

method finishedParsing():
    number=self.getFirstWord()
    while (number != None):
        self.addNumber(number.getValue())
        number = number.getNextWord()

syntax/semantics: Number.word
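For readers more comfortable with a conventional language, here is a rough Python analogue of the Sum word above (the class and method names mirror the example but are otherwise my own invention; BlueBox's parser, not this constructor, would actually build the chain of matched words):

```python
# Sketch: after the parser matches "Sum" followed by any number of
# Number words, finishedParsing walks the matched children and
# accumulates their values -- the pattern in the example above.

class NumberWord:
    def __init__(self, value, next_word=None):
        self._value, self._next = value, next_word
    def getValue(self):
        return self._value
    def getNextWord(self):
        return self._next  # next sibling matched by syntax/semantics

class SumWord:
    def __init__(self, first_word):
        self.result = 0          # member variable, answering the question
        self.first = first_word  # first word matched by Number.word
    def addNumber(self, n):
        self.result += n
    def getValue(self):
        return self.result
    def finishedParsing(self):
        number = self.first
        while number is not None:
            self.addNumber(number.getValue())
            number = number.getNextWord()

# What parsing "Sum 1 2 3" would conceptually produce:
s = SumWord(NumberWord(1, NumberWord(2, NumberWord(3))))
s.finishedParsing()
print(s.getValue())  # 6
```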


The use of the variable name 'result' in Draw was confusing; I should have called it 'firstWord.' Also, in the Sum word, I should have iterated through all of the words, as I do above.

Does this make sense?

Nile

Intentional Programming? (4.00 / 2) (#79)
by xanaguy on Thu Jul 12, 2001 at 05:56:16 PM EST

Ever since I first heard about this, I've been dying to know how it relates to Intentional Programming (IP). IP is a project that Microsoft Research has been working on for a number of years, and everything I've heard about bluebox makes me think that it's attacking the same goals in the same way -- a dynamically growing programming language allowing a higher degree of code reuse by separating syntax from the problem.

Were you guys inspired by IP? Have you never heard of it? Were both inspired by some common source in the CS literature?

There's lots of good information at http://research.microsoft.com/ip/

Same area, but very different approaches (none / 0) (#81)
by nile on Thu Jul 12, 2001 at 06:38:47 PM EST

Intentional programming and words both fall under the general framework of natural language programming, but they have very different philosophies and completely different frameworks. The creator of intentional programming - Charles Simonyi - believes that syntax needs to be factored out, or better, abstracted away into a syntaxless intention. In IP, as a result, one operates on a graphical representation of an abstract syntax tree (the IP tree) after a program has been parsed, and can add additional nodes to the tree. These nodes are then translated into specific implementations.

The philosophy behind words is that the key to natural language programming is the management of syntactical and semantic relationships. The goal isn't to abstract different statements into an invariant representation but to manage the syntactical and semantic relationships between concepts in a domain. Syntax isn't separated out from semantics, as occurs in IP and XML documents. Instead, syntactical and semantic relationships are coupled together in the same way that objects couple methods and data together.

Syntactical relationships express the legal ways that concepts in a domain can be grouped together (e.g., the '+' token can follow the '2' token). Semantic relationships express what those groupings mean (e.g., if '+' follows '2', two is being added to something). In object-oriented programs, syntactical and semantic relationships form a spaghetti-like tangle. Words couple these two kinds of relationships together in the same way that objects couple data and methods. In fact, words are a new programming unit that couples data, methods, symbol recognizers, and syntax/semantic relationships together.
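Here is a small, hedged Python sketch of that coupling (the `Plus` class and `evaluate` function are my illustration, not words' actual machinery): the same unit carries both the syntactic rule and its meaning:

```python
# Sketch: one unit couples a syntactic rule (what may legally follow
# what) with its semantics (what the grouping means), the way an object
# couples data with methods.

class Plus:
    def can_follow(self, prev):
        return isinstance(prev, int)   # syntax: '+' may follow a number
    def apply(self, left, right):
        return left + right            # semantics: the grouping means addition

def evaluate(tokens):
    ops = {"+": Plus()}
    result = tokens[0]
    i = 1
    while i < len(tokens):
        op = ops[tokens[i]]
        assert op.can_follow(result)              # check the legal grouping
        result = op.apply(result, tokens[i + 1])  # apply its meaning
        i += 2
    return result

print(evaluate([2, "+", 3, "+", 4]))  # 9
```

In object-oriented code these two concerns would typically live in different places (a grammar file and an interpreter, say); keeping them in one unit is the coupling being described.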

This coupling allows for a new type of inheritance and polymorphism that can be seen in the examples (the keywords are now defined). In this way, words are very different from the xmethods of intentions. This difference carries over to the development environment. In words, one writes programs using language just as one does today. Operating on an abstract syntax tree graphically never enters the equation.

Intentional Programming, words, extensible languages like XML, and monadic combinators are all swirling around a similar set of problems, though they tackle them very differently. If you are interested in related technologies, the closest is actually the monadic combinator, from the land of functional programming. Monadic combinators are like words without state, separate methods, and inherited and polymorphic rule relationships.
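For the curious, here is a minimal stateless parser-combinator sketch in Python (illustrative only; real monadic combinator libraries such as Haskell's Parsec are far richer) showing how a combinator, like a word, pairs recognition with a result:

```python
# Sketch: a parser combinator is a function from input text to
# (value, remaining_text), or None on failure -- roughly "a word
# without state", as described above.

def token(t):
    def parse(s):
        s = s.lstrip()
        return (t, s[len(t):]) if s.startswith(t) else None
    return parse

def number(s):
    s = s.lstrip()
    i = 0
    while i < len(s) and s[i].isdigit():
        i += 1
    return (int(s[:i]), s[i:]) if i else None

def seq(*parsers):
    """Combine parsers in sequence, collecting their values."""
    def parse(s):
        values = []
        for p in parsers:
            r = p(s)
            if r is None:
                return None
            v, s = r
            values.append(v)
        return values, s
    return parse

# "'x' must come between two numbers" as a combinator:
mul = seq(number, token("x"), number)
(a, _, b), rest = mul("2 x 3")
print(a * b)  # 6
```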

good question, by the way,

Nile

Same area, but very different approaches (part II) (none / 0) (#82)
by nile on Thu Jul 12, 2001 at 07:30:41 PM EST

Thinking about it a little bit more, words and IP are actually the antithesis of each other. Right now, there are two very different approaches in natural language programming.

Approach I: Separate syntax from semantics -- i.e., keep them as far apart from each other as possible.

XML and IP programming both follow this approach.

Approach II: Couple syntactical and semantic relationships -- i.e., keep them as close together as possible.

Monadic combinators and words are both examples of this approach.

In this way, we are able to divide the field into two different and opposing camps.

cheers,

Nile
