
The Word Model: a new technology to integrate diverse problem domains

By nile in Technology
Fri Mar 30, 2001 at 12:20:21 AM EST
Tags: Software

The word model couples grammar rules, data, and methods together in the same way that the object model couples data and methods. This coupling eliminates the side effects of integrating different syntaxes with each other. Because words encapsulate grammar rules, data, and methods, it is possible to richly integrate both the syntax and the semantics of different domains. Integrating a word-oriented version of Javascript and HTML, for example, requires only one change to one grammar rule in one word.
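
A minimal sketch of this coupling, with entirely hypothetical names (the model itself prescribes no particular code), might look like this in C++:

#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch: a word couples its local grammar rules (which
// other words it may legally combine with), its data, and its methods
// in one unit, just as an object couples data and methods.
struct Word {
    std::string name;                       // e.g. "->", "{}", "td"
    std::vector<std::string> combinesWith;  // local grammar rules
    std::string data;                       // the word's own state
    std::function<void(Word&)> method;      // the word's behavior
};

// Integrating two domains is then one local change to one word, not an
// edit to a global grammar shared by every construct.
void integrate(Word& word, const std::string& other) {
    word.combinesWith.push_back(other);
}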


The word model is important because it increases the number of problems that computer scientists can solve. As detailed in the first paper in this series, "Natural Programming and Integration," most problem solving involves integrating different domains of knowledge. To build the foundations of calculus, for example, it is necessary to integrate the domains of both logic and set theory. If these two domains were expressed in traditional object-oriented libraries, integrating the elements of the domains would be difficult and would require a substantial rewrite of those libraries.

The problem with object-oriented programming is that, although it encapsulates data and methods, it does not encapsulate the semantic relationships between elements in a domain that are expressed in syntax. In an object-oriented logic library, for example, there is no set way to create relationships between elements of logic. More importantly, because relationships are not encapsulated with the data and methods of the elements they relate, every semantic relationship between the elements of the domain can have undesirable side effects on other relationships.

Consider an object-oriented logic library and an object-oriented set library. It would be entirely reasonable in the object model to represent the elements of truth claims as children of "ClaimElement" and verify compositions of those elements with a "DeductiveDatabase" object. The logic library would work by accepting a set of axioms, and then only allowing new composites of "ClaimElements" to be added if they followed from what was already in the database. Similarly, it would also be reasonable in the object model to write a set library that had objects like "Set," "Member," and "NullElement."

Imagine that we wanted to integrate the set library with the logic library. The algorithms expressing the semantics of sets and memberships in the Set library would have to be integrated with the algorithms of logic in the DeductiveDatabase object. Statements regarding sets would only be allowed in the database if they followed from existing set statements there. As detailed in the first paper, which analyzed integration from a purely syntactical perspective, this integration would be hard because the semantic relationships of the elements are not encapsulated. As more and more domains are integrated, dealing with side effects becomes intractable.

With the word model, the semantic relationships of elements are coupled with their data and methods so that they can be richly integrated with the semantic relationships of other elements without side effects. In a syntax of logic words, for example, all of the semantic relationships of "implies" would be in the "->" word. Similarly, all of the semantic relationships of membership would be in the "{}" word. By simply modifying the local grammar rules of these words, the semantics of implies and membership could be integrated with each other with no side effects on other words. In this way, integrating logic and sets would not require rewriting libraries, but just connecting words with each other in new ways.
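
As a hypothetical C++ sketch of what connecting words with each other in new ways might look like (none of these names come from BlueBox):

#include <string>
#include <vector>

// Each word carries its own operand rules: all of the semantics of
// implication live in "->", all of membership in "{}".
struct LogicWord {
    std::string symbol;
    std::vector<std::string> operandRules;  // local grammar rules
};

int main() {
    LogicWord implies{"->", {"claim"}};
    LogicWord membership{"{}", {"element", "set"}};

    // Integration is one change to one grammar rule in one word:
    // "->" now accepts membership expressions as claims, and no
    // other word in either library is touched.
    implies.operandRules.push_back(membership.symbol);
}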

The word model is not just theoretical. It is already being used in BlueBox, a word-oriented natural language browser. BlueBox is on SourceForge with screenshots of a set of words that have already been written for it. This technology and BlueBox itself are being open sourced for the community to use. Source code will be available through CVS by the end of the week. More information on the word model can be found in the paper "The Word Model." The remaining papers in the series will be posted over the next two weeks.

This story was posted by dLoo, who welcomes feedback and contributions in writing these papers.


Related Links
o "Natural Programming and Integration"
o BlueBox
o screenshots
o "The Word Model"


The Word Model: a new technology to integrate diverse problem domains | 57 comments (43 topical, 14 editorial, 0 hidden)
Great article, well written... (4.50 / 6) (#1)
by alisdair on Thu Mar 22, 2001 at 05:40:03 PM EST

...but what does it mean? What's a natural language browser? Why would this be useful, ever?

I need some examples before I'll understand this properly: your explanation is a little too abstract for my poor brain.

Don't blame yourself. (5.00 / 8) (#7)
by CdotZinger on Thu Mar 22, 2001 at 07:21:41 PM EST

(My apologies to nile for this, but I'm a writer, so I get really pissy about these things.)

Don't blame yourself. The article doesn't make sense.

By "make sense" I mean something like "signify" or "denote"--you know, say something.

What we have here, I think, is a lot of imperfectly grasped abstract jargon assembled into an inexpressive jargon-pile, like when someone who doesn't know much about computers starts attaching "algorithm"-thisses and "object-oriented"-thats to his fundamentally unsound and incoherent computer-talk in hope of seeming expert. Or, like when a philosophy student first learns terms like "aufhebung" and "monad," and he wants to stick them into his McDonald's order. Or [many more examples].

It's almost irresistible, this will-to-puffery, because it fools most people. It's still widely believed that Al Gore is more intelligent than George Bush, because Gore says "paradigm" when he means "idea" (and Bush says "idear"). And it fooled you. You think you're dumb because this article didn't mean anything to you. It doesn't mean anything substantial to the person who wrote it, either. If it did, we could make a kind of rickety, phantom-outline sense of it, even though its subject is outside most of our areas of expertise.

I don't have anything against puffed-up, seemingly unduly complex language. I'm a real snob. I read Deleuze and Guattari et al for fun. But, balloons are inflatable; air isn't. Without a solid, unabstracted set of facts and/or propositions to decorate, the jargon just hangs there like tinsel with no Xmas tree.

(Or, I'm being a dick, and the article is shrewdly satirical (though the bad grammar argues against that). Or, it's really bad Language Poetry (though the lack of Barthes epigraphs argues against that). Or, nile knows his biz but he's a very bad writer, and he should hire a TW for the project (I think it's a software project he's talking about--right?).)


Q: You could interest yourself in these interesting machines. They're hard to understand. They're time-consuming.
A: I don't like you.
[ Parent ]
A Concrete Example (4.00 / 2) (#5)
by nile on Thu Mar 22, 2001 at 06:28:03 PM EST

Several people want a more concrete example, so here goes.

First, a caveat. You can do word-oriented programming in object-oriented languages in the same way that you can do object-oriented programming in procedural languages. Words, like objects, are a way of coupling data.

Now, why should you care:

Most problems do not just exist in one domain of knowledge, but integrate multiple domains. If we want to solve a large number of problems that have traditionally been outside the domain of computer science, we have to find a way to integrate different domains without side effects.

Here's a quick concrete example from the paper of how words differ from objects.

Click on this link and you will see the source code for three different words in the XMLGUI syntax that comes as a sample with BlueBox. We have colored the grammar rules, data, and context in different colors so that it is easy to see what a word is composed of.

Now, this mini "Table" language is very different from a standard way of creating a Table parser. The most obvious difference is that rather than creating a parser that instantiates rows and columns as it comes across 'td's' and 'tr's', all of the grammar rules for recognizing td's and tr's are in the Td word and Tr word respectively. This is the key difference.
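
As a rough hypothetical sketch of the contrast (not the actual XMLGUI source):

#include <iostream>
#include <string>
#include <vector>

// Each word owns the rule for recognizing its own children, so "tr"
// knows it may contain "td", and no central table parser has to know
// the whole grammar.
struct TagWord {
    std::string tag;
    std::vector<std::string> childRules;  // local grammar rules
    bool accepts(const std::string& child) const {
        for (const std::string& c : childRules)
            if (c == child) return true;
        return false;
    }
};

int main() {
    TagWord table{"table", {"tr"}};
    TagWord tr{"tr", {"td"}};
    std::cout << std::boolalpha
              << table.accepts("tr") << '\n'   // true
              << tr.accepts("td") << '\n'      // true
              << table.accepts("td") << '\n';  // false: no such local rule
}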

What it means:

The consequences can be analyzed from two different perspectives.

The (simpler) syntax perspective - integrating different syntaxes is now possible because there are no longer unintended side effects from different grammar rules. The first paper really helps here.

The - significantly more important - semantic perspective: the grammar rules in the words are the way in which semantic relationships are expressed. Since they are encapsulated, the side effects of trying to integrate the semantics of different libraries have been eliminated. As a result, we can create languages as rich as we want.

I hope this helps. As we noted, although we've spent a long time writing this, we're still trying to find the best examples and terminology to explain why all of this is significant.

cheers,

Nile

Minor correction (none / 0) (#6)
by nile on Thu Mar 22, 2001 at 06:29:25 PM EST

It should read "coupling programming elements", not "coupling data." nile

[ Parent ]
Benefits? (4.00 / 2) (#8)
by zephiros on Thu Mar 22, 2001 at 07:57:04 PM EST

What can I do with this that I can't do with traditional inheritance and polymorphism? I mean this seems neat and everything, but it looks like, for the most part, you're duplicating/relabeling functionality already present in current OO thinking.

Also, are you suggesting some sort of universal grammar for expressing functionality across problem domains? I'm sure you appreciate what a painful task it would be to develop a completely detailed functional model of even a simple, closed system (much less anything in the real world). Surely you don't expect developers to perform this level of analysis every time they code a new class?
 
Kuro5hin is full of mostly freaks and hostile lunatics - KTB
[ Parent ]

Answer: Why are words better than objects? (4.00 / 2) (#10)
by nile on Thu Mar 22, 2001 at 08:37:35 PM EST

Great questions.

First, all modern languages can solve any problem that can be solved on a Turing machine. So when we say 'can't,' we mean impossibly hard without this coupling, just as GTK would be hard to use if it spread data and methods all over the place rather than coupling them in an OO manner.

are you suggesting some sort of universal grammar for expressing functionality across problem domains?

You just hit the problem, actually. A universal grammar would be a grammar with global grammar rules. Those have side effects on other grammar rules because they all exist in the same space. They are very hard to write as a result. Words are the exact opposite of this because they encapsulate the grammar rules that connect one word to another in the words themselves. As an analogy, the question is similar to asking if objects put all of the data of a program in one file to make it accessible to all the methods in the program. The answer is no: objects couple data and methods.
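
A tiny hypothetical sketch of the contrast (these structures are illustrative, not BlueBox code):

#include <map>
#include <string>
#include <vector>

// A universal grammar keeps every rule in one shared space, so any new
// rule can collide with an existing one:
std::multimap<std::string, std::string> universalGrammar;

// Per-word rules live inside the word they belong to, so adding a rule
// to one word cannot clash with another word's rules:
struct Word {
    std::string name;
    std::vector<std::string> localRules;
};

int main() {
    // Two domains can both define a rule named "element" globally...
    universalGrammar.insert({"element", "logic-operand"});
    universalGrammar.insert({"element", "set-member"});  // collision

    // ...but as local rules the two definitions never share a space.
    Word logicElement{"logic-element", {"logic-operand"}};
    Word setElement{"set-element", {"set-member"}};
}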

but it looks like, for the most part, you're duplicating/relabeling functionality already present in current OO thinking ... what can I do with this that I can't do with traditional inheritance and polymorphism?

You can easily integrate libraries whose authors have never talked to each other or seen each other. Consider the logic library and set library again. If you know from the start that you want to integrate the two, you can correctly map the semantic relationships with objects before you start coding. Obviously, you will not implement it the way that it was described in the paper.

What if, instead, you just come across a logic library and a set library that you want to richly integrate with each other as discussed above? And the authors have never talked to each other, so the libraries are designed as above. Then you will have to seriously rewrite both of those libraries before you can prove theorems about sets.

With words, though, all of these semantic relationships are already encapsulated with the data and methods. So even if the authors had never talked to each other, integrating the two domains only involves connecting words to each other in new ways by adding grammar rules to different words. In this way, word-oriented libraries don't have to be rewritten to be richly integrated with each other.

It is this after-the-fact integration ability that distinguishes words from objects and - for that matter - objects from procedural languages.

Nile

[ Parent ]
This Comment Is False (4.00 / 1) (#36)
by SEWilco on Fri Mar 23, 2001 at 10:11:44 PM EST

Can you code "This Sentence Is False"?

[ Parent ]
Context ambiguity (4.50 / 2) (#9)
by jabber on Thu Mar 22, 2001 at 08:37:01 PM EST

A very well-written and very interesting article. But the idea of Word Oriented programming seems very counter-intuitive to me.

I understand that what this attempts to do is put the solution logic within the context of the problem domain, and herein might lie my problem. All programming courses I've ever taken (all CS courses except AI, in fact) have flatly stated that ambiguity is a bad thing, and that context sensitivity is worse than logic that is clear regardless of how it is applied. Word Oriented programming flies directly in the face of these teachings.

But I have to step back and look at it as follows: programming is forever raising the level of abstraction at which we set up logic. We started with bits, then went on to op-codes, addresses and loops, then on to procedures and functions and externs and objects. At some point, we are bound to hit concepts that are meaningful in more than one way, given different perspectives, and that may make a different sort of sense, or non-sense, in each.

Also, special-purpose programming is becoming ever more commonplace. Where once men in lab coats toggled switches, the little old ladies of today compose special-purpose macros in some high-level scripting language (like VBA).

Given all that, maybe context-sensitive programming isn't the demon that 3GL and 4GL programming classes made it out to be. Not all wrenches are adjustable, after all. Very interesting. I'll have to dig deeper into what you mention, and see if it still makes sense afterwards. Thanks for the effort. +1.

[TINK5C] |"Is K5 my kapusta intellectual teddy bear?"| "Yes"

Good question (ambiguity is not part of the model) (3.00 / 1) (#11)
by nile on Thu Mar 22, 2001 at 09:09:43 PM EST

Ambiguity is not actually a necessary part of the model, as I would define it. It's possible that you mean something different by it. Regardless, it is a good question whether ambiguity is a good thing in a programming language. Like yourself, I won't commit myself to either side of the fence. The purpose of the model is to eliminate the side effects of global grammar rules by correctly coupling programming elements. There is a thread lower down where I discuss what words can do that objects can't. Here is a quick repost, and then I'll discuss context ambiguity.

but it looks like, for the most part, you're duplicating/relabeling functionality already present in current OO thinking ... what can I do with this that I can't do with traditional inheritance and polymorphism?

You can easily integrate libraries whose authors have never talked to each other or seen each other. Consider the logic library and set library again. If you know from the start that you want to integrate the two, you can correctly map the semantic relationships with objects before you start coding. Obviously, you will not implement it the way that it was described in the paper.

What if, instead, you just come across a logic library and a set library that you want to richly integrate with each other as discussed above? And the authors have never talked to each other, so the libraries are designed as above. Then you will have to seriously rewrite both of those libraries before you can prove theorems about sets.

With words, though, all of these semantic relationships are already encapsulated with the data and methods. So even if the authors had never talked to each other, integrating the two domains only involves connecting words to each other in new ways by adding grammar rules to different words. In this way, word-oriented libraries don't have to be rewritten to be richly integrated with each other.

It is this after-the-fact integration ability that distinguishes words from objects and - for that matter - objects from procedural languages.

Returning to context ambiguity, because of how they're coupled, it is possible to have a context structure in each word by which it can gain access to other words around it. This is not something that is necessary in the model, but it seemed to evolve out of it and was useful, so we use it.

Now, the fact that words have their context means they could do different things depending on their context. Like you, I'm not sure if this is a good thing and don't want to commit. What we mainly use the context for is to access the attributes of the words around the current word so that it can know how to build itself. In certain cases, particularly TD words in HTML, this context can be used to make a word ambiguous. For example, a TD should be larger if a Table has greater width than all the TD words together. Perhaps this is bad, but it's HTML's fault. Again, context is not a necessary part of the model, and ambiguity certainly is not. HTML browsers, which are primarily written in C++, have this same ambiguity.
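
Here is a hypothetical sketch of the TD case (illustrative only, not BlueBox's actual context structure):

#include <vector>

// Each word can reach the words around it through its context and size
// itself from their attributes. Here a TD-like word widens itself when
// the enclosing table is wider than all the TDs in its row put together.
struct Cell { int width = 0; };

void fitToContext(std::vector<Cell>& row, int tableWidth) {
    int total = 0;
    for (const Cell& c : row) total += c.width;
    if (tableWidth > total && !row.empty()) {
        int extra = (tableWidth - total) / static_cast<int>(row.size());
        for (Cell& c : row) c.width += extra;  // context-driven resize
    }
}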

Hope this helps,

Nile

[ Parent ]
Perl, etc... (none / 0) (#44)
by pos on Mon Mar 26, 2001 at 07:50:35 PM EST

nile: you're doing a great job of following up on comments and questions. It shows that you care a lot about this subject :)

Now.... I think it is interesting to note some similarities between Word Programming and other languages. Perl is often touted as a great "glue" language because it is so versatile. It accepts lots of different code syntax conventions and plays very nicely with other languages. It also shares the similarity of having a very linguistic base (being modeled on natural language and all) and often feels like a "high level scripting language," as jabber put it. I would point out to jabber that Perl often seems ambiguous but is actually just very context sensitive, and therefore more expressive than most languages. It's harder to learn the language, but once you do, you can express more complex ideas more quickly. The nice thing about Perl is there is no penalty for using "baby talk". The problem is that sometimes other people are talking above your vocabulary/skill level.

The Word paradigm also seems free to adapt into new domains fairly easily, as I see you expressing your ideas in Python, XML and C++ all with similar ease. That bodes very well for your project and suggests that it is probably best expressed as a language/framework rather than a design pattern.

My main question is: are you proposing this as an extension of all OO languages the way C++ is an OO extension of C? It seems you are implying this can be added to Python, Perl, C++, Ruby, Lisp or Squeak (for example). I would think Ruby and Squeak have a lot of potential, seeing as they have a strong OSS base and allow you to make changes to the compiler fairly easily.

So... any ideas about making up new compilers here? I suppose you have one for Python already?

-pos

The truth is more important than the facts.
-Frank Lloyd Wright
[ Parent ]
What words can do that objects can't (none / 0) (#12)
by nile on Thu Mar 22, 2001 at 09:24:49 PM EST

There was a good question farther down in the posts. I'm reposting it here to solicit feedback.

but it looks like, for the most part, you're duplicating/relabeling functionality already present in current OO thinking ... what can I do with this that I can't do with traditional inheritance and polymorphism?

You can easily integrate libraries whose authors have never talked to each other or seen each other. Consider the logic library and set library again. If you know from the start that you want to integrate the two, you can correctly map the semantic relationships with objects before you start coding. Obviously, you will not implement it the way that it was described above.

What if, instead, you just come across a logic library and a set library that you want to richly integrate with each other as discussed? And the authors have never talked to each other, so the libraries are designed as above. Then you will have to seriously rewrite both of those libraries before you can prove theorems about sets.

With words, however, all of these semantic relationships are already encapsulated with the data and methods. So even if the authors had never talked to each other, integrating the two domains only involves connecting words to each other in new ways by adding grammar rules to different words. In this way, word-oriented libraries don't have to be rewritten to be richly integrated with each other.

It is this after-the-fact integration ability that distinguishes words from objects and - for that matter - objects from procedural languages.

Nile

why is this a good idea? (none / 0) (#25)
by streetlawyer on Fri Mar 23, 2001 at 09:14:29 AM EST

Then, you will have to seriously rewrite both of those libraries to make it so that you can prove theorems about sets.

With words, however, all of these semantic relationships are already encapsulated with the data and methods.

So you're substituting a model under which someone might have to do a big coding job *if* they want to integrate libraries, for a certain big coding job carried out *in case* someone might possibly want to integrate libraries in the future? Help me out here - how is this a benefit?

And in any case, "all" the semantic relationships seems like either an unreasonable goal, or one which completely ossifies the word set. Under this model, every time you change a word (or worse, add a new word), you have a vast job specifying its relationship to everything else.

And finally, it would be nice to see some proof that the addition of new grammar rules is actually an easier job than rewriting the words.

This is not a flame -- I'm just asking for clarification on what you have to admit is a pretty abstruse topic.

--
Just because things have been nonergodic so far, doesn't mean that they'll be nonergodic forever
[ Parent ]

Looks hairy (4.00 / 2) (#13)
by wiredog on Thu Mar 22, 2001 at 09:27:49 PM EST

Worth a discussion, though, as others have noted, it seems to be more handwaving than anything else. Maybe I just can't wrap my mind around it.

The idea of a global village is wrong, it's more like a gazillion pub bars.
Phage

Is this a joke? (4.50 / 2) (#16)
by scriptkiddie on Thu Mar 22, 2001 at 10:08:00 PM EST

Maybe I'm biased after the Carmina Burana story, but this sounds like a hoax... follow that link and click "screenshots," and you get screenshots of a bunch of unrelated GTK+ applications compiled for X and Windows. There are no actual screenshots of these XML GUIs as far as I can tell. And if you wanted to implement a word-oriented grammar for the Web (which sounds like sort of what they're trying to do), you'd still have to actually implement every word... it wouldn't particularly save any work.

Anyone care to explain?

We used GTK apps to test the XMLGUI syntax. (4.00 / 1) (#17)
by nile on Thu Mar 22, 2001 at 10:23:01 PM EST

We used GTK apps as templates to test the XMLGUI syntax. That way, we had all of the icons and sample layouts to use. If you look at the screenshots on the right, you'll notice that the same layouts are also working on Windows. You can see the source of words in BlueBox in the longer version of the Word Model that is linked to above. Look for a link on the right-hand side of the paper farther down the page.

By natural language browser, we mean natural language programming browser, not natural language processing. We're not trying to implement a word-oriented grammar for the Web (natural language processing). The word model is simply a different programming model.

Hope this helps,

Nile

[ Parent ]
One clarification (none / 0) (#26)
by farmgeek on Fri Mar 23, 2001 at 09:17:33 AM EST

I read the white papers. I understood a fair bit of it and think that the programming model is interesting, but what exactly is Blue Box?

You said it is a natural language programming browser, which doesn't exactly explain its purpose to me. I may have an educational gap showing there, but if you would please explain what Blue Box (as opposed to the Word Model) is, using small words, I would really appreciate it.

[ Parent ]
Talking to myself (none / 0) (#28)
by farmgeek on Fri Mar 23, 2001 at 09:28:29 AM EST

But never mind. I found what I was looking for.

"BlueBox works by separating out the interface of open source applications from the rest of the application and serving that interface over the Internet or an intranet. Companies that use software served from a central server to all of the operating systems in their organizations will experience significant savings. Rather than going through a continual process of buying, installing, upgrading, and maintaining software, they will be able to buy an all-in-one-software box for their organization without changing the operating system of their existing desktops."


So, basically it's an Xterm, or am I missing something significant?

[ Parent ]
Okay.... (4.00 / 1) (#33)
by scriptkiddie on Fri Mar 23, 2001 at 06:22:57 PM EST

That was actually what I thought at first. But then I saw your Gimp screenshot. You didn't implement an entire Gimp canvas structure in XML, did you? Or is that a dummy Gimp that doesn't actually do anything?

Sorry if I sound overly accusatory, this is a pretty alien concept so it might take a while to get used to....

[ Parent ]

You're right. (none / 0) (#40)
by nile on Sun Mar 25, 2001 at 07:29:15 PM EST

We didn't implement an entire screen. We just connected an action in XMLGUI to an image to test out that action. Sorry for the confusion,

Nile

[ Parent ]
I don't get it. (3.00 / 1) (#19)
by Mr. Piccolo on Thu Mar 22, 2001 at 11:25:06 PM EST

I'm not sure I even understand the problem you're trying to solve, much less the solution you're proposing.

As far as I can tell, the problem is you want to combine parsers for two separate grammars into one parser that can parse some combination of the two grammars?

And your solution is to create a new programming language in which you can not only specify attributes and operations of some unit, but also how it interacts syntactically with other units?

It seems to me we already have tools to deal with creating parsers from descriptions of a grammar automatically; they're called lex and yacc. Integrating two (or any number) of grammars should be as simple as grabbing the descriptions of the grammars, changing the merged description to make it consistent, and running lex and yacc on the new grammar description.

Maybe I'm missing something important, though; as I've said I'm having difficulty understanding the precise problem from your writeup.

The BBC would like to apologise for the following comment.


Interesting Idea... (none / 0) (#30)
by nymia_g on Fri Mar 23, 2001 at 10:36:36 AM EST

The idea of combining grammar rules from a set of domains is like performing a union on them. However, what I'm not sure of is how intersections are handled. Would the intersection be grouped as a top-down list where each rule can be assigned a priority number? I think that would be the best way of handling intersections, though.

As a consequence of combining these grammar rules, an entirely new set of expressions will eventually form which was not possible before, and that would probably surface at the syntax level. Put another way, this would result in the creation of expressions distinct from the other elements.

Another interesting idea is the treatment of grammar actions. It sounds like the actions are part of the data and methods. In OO, methods are fairly static and can only operate on given data. By introducing grammar actions into the equation, it actually brings a new kind of behavior wherein a grammar action can operate on any method and/or data. If this were allowed, would it cause confusion among the OO enthusiasts? Probably it will.
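
Put as a rough sketch (purely hypothetical, nothing from the article):

#include <map>
#include <string>
#include <vector>

// Combine two domains' rule sets as a union; resolve rules that both
// domains define (the intersection) by a priority number, higher wins.
struct Rule {
    std::string name;
    std::string expansion;
    int priority = 0;
};

std::map<std::string, Rule> merge(const std::vector<Rule>& a,
                                  const std::vector<Rule>& b) {
    std::map<std::string, Rule> out;
    for (const Rule& r : a) out[r.name] = r;
    for (const Rule& r : b) {
        auto it = out.find(r.name);
        if (it == out.end() || r.priority > it->second.priority)
            out[r.name] = r;  // intersection resolved by priority
    }
    return out;
}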

What about templates + concepts/constraints? (none / 0) (#31)
by kostya on Fri Mar 23, 2001 at 12:25:17 PM EST

I'm not going to say I understood everything in the story--you use a lot of jargon with very few "concrete" examples. But it sounds like you are trying to restrict how objects might interact.

If that is the case, Templates plus Concepts might do exactly what you are trying to accomplish. Sure, it might not be at a "symbol" level (i.e. "->" means X or Y), but it comes close. Concepts came about as a way to "restrict" templates. From what I gathered in your article, you could do something like this with templates and concepts.

An example (which might be totally missing the point):



template<class Element> class Set {
    // ... stuff
public:
    void add(Element e);
};

You could then use concepts to constrain how the template will function, not allowing certain types of elements to be added, etc.

What's more, I'm sure some really clever and bright template programmer could write a library of template objects, such that you could have declarations like:



#include <words>
LogicalElement<Element, RuleMemberOf> le;
SomeLogicalSet<Element> set;

le + set;

You could have the "+" operator be some kind of associating operator, allowing the logical element to bind itself to the set, or something.

You have to remember that people hate new languages. Sure, someone always likes them. But not all the guys who have been busy building in "X". They want a way to do it without having to scrap their skills. Despite the usual C zealot flaming C++, C++ is popular because it allowed C programmers to transition easily, using what they wanted. I'd like to see your concepts done in an existing language. Perhaps C++ templates would be just fine?



----
Veritas otium parit. --Terence
Templates and concepts break the relationship (none / 0) (#32)
by nile on Fri Mar 23, 2001 at 03:12:01 PM EST

Thanks for the comment. I had to sit down and study your code carefully to understand what you are doing. The paper is being rewritten using C++ as an example language, but templates don't factor into it.

I'm going to answer your question below and I hope the answer will make sense in what's already written. If not, wait for the rewritten version of this paper and see if it makes sense then.

The following is my understanding of what you are doing. In your last set of code, you have:

#include <words>
LogicalElement<Element, RuleMemberOf> le;
SomeLogicalSet<Element> set;


le + set;

Now, a quick way to analyze this to see if it is identical is to look at where the grammar rules exist. In this example, the syntax rules are expressed in the above global file and semantics are encapsulated in the templates.

The problem that a programmer writing this code will face is that global syntax rules have unintended side effects on each other: that's the point of the paper. The template example works with just a few rules, but let's now imagine another chemistry template:

template<class T, class Y> class ComplexMolecule {
    // ... stuff
public:
};

The syntax of chemistry and material science could then be integrated with:

#include <...>
ComplexMolecule<Element, AnotherElement> molecule;
Material<Element> material;


Now, let's say that we wanted to prove logical claims about sets of molecules. This would require integrating the two sets of relationships:

#include <words>
LogicalElement<Element, RuleMemberOf> le;
SomeLogicalSet<Element> set;
ComplexMolecule<Element, AnotherElement> molecule;
Material<Element> material;


Notice that Element has two different meanings here. There is an unintended side effect in the grammar rules.

hope this helps,

Nile

[ Parent ]
elements? (none / 0) (#38)
by delmoi on Sat Mar 24, 2001 at 12:20:14 AM EST

Ok, maybe this just all went over my head, but why did you need to call everything an 'element'? Are you talking about logical elements, or is 'element' a core piece of 'word programming theory', or is an element something like Hydrogen and Bromide?

Now, again I'm not exactly sure what you're trying to do with these 'element' things, but it might be clearer if you called them things like DataElement, ChemElement, or LogicElement. And how would that not solve the problem?
--
"'argumentation' is not a word, idiot." -- thelizman
[ Parent ]
I used "Element" to show the problem (none / 0) (#39)
by nile on Sat Mar 24, 2001 at 12:37:19 AM EST

You actually understand. I wasn't trying to show the solution with templates, but how they didn't solve the problem. So, I imagined two different template writers who both used Element in a way that made sense when they were coding. However, when a third programmer tries to integrate the templates in the method described by the original poster, she would encounter the problems that you just described. There would be conflicts, and it wouldn't be clear what Element meant.

Nile

[ Parent ]
Perhaps I worded it poorly ... (none / 0) (#43)
by kostya on Mon Mar 26, 2001 at 11:59:07 AM EST

Element in my examples was just a string--you could have just as easily used T or Foo. The template takes a parameter; in my examples it was a class type. The template then uses the class type to generate a specific "version" of the generic pattern that will work with that type. So I'm still not positive that "word relationships" could not be effectively created using templates.

Think of it this way: many "research" languages implement properties or events as language-level artifacts. But Java and other OO languages handle these concepts very well with some simple programmer-side discipline. Properties are represented by naming schemes (get/set/is etc.) and events are represented by patterns (ListenerInterface with a method or methods that must be implemented, addListener/removeListener methods on the event generator).

What I am saying is that I think you have a cool concept that can easily be added to languages like C++ using a library as opposed to a language-level artifact. My example with a "+" was just icing, and probably confused the issue.

Back to the examples:

#include <rules>
#include <ruled_containers>

//general example
LogicalElement<Some_Element_Class, Rule_MemberOf> le;
LogicalSet<Some_Element_Class> set;

// specific example
LogicalElement<Periodic_Table_Element, Rule_MemberOf> material;
LogicalSet<Periodic_Table_Element> complex_molecule;

Note that material and complex_molecule are oversimplified, but you could be more specific by subclassing from the generic containers and creating containers more specific and better suited to the problem domain--which is, in its essence, the major task of OO programming.

The thing is that what the templates do internally is inconsequential (i.e. how you associate elements with one another, how rules are used and who encapsulates what, etc.). What I am saying is that templates allow a level of abstraction (i.e. no longer what types, but what you are doing with an X type, where X can be anything) that should support the concept you are going for. And then you have the benefit of harnessing all those really skilled C++ programmers out there.



----
Veritas otium parit. --Terence
[ Parent ]
Interesting, but there are questions... (5.00 / 1) (#35)
by Alhazred on Fri Mar 23, 2001 at 09:03:23 PM EST

This is very much like the way FORTH works. That is, if you are a serious FORTH programmer.

Now the FORTH language itself has a very simple syntax rule, every syntactical element (token) is bounded by whitespace. HOWEVER, all internals of the language, including the parser, are implemented as FORTH "words", thus the parser itself and, more importantly, the input stream, are accessible. Thus a FORTH word can effectively encapsulate syntactic and grammatical rules.

As an example the FORTH word IF enforces a syntactical rule that it must be followed by an ENDIF. This is a bit of a trivial example, but in fact because you can access the parser itself within the scope of such a construct, it is fairly trivial to do things like embed C in FORTH.

There are of course no real formal ways to specify the inheritance and overriding of syntactical and grammatical rules, however. So far I'm having trouble imagining a practical implementation of a parser, short of the FORTH expedient of just writing nested parsers that share an input stream for each syntactic "realm".

It will be interesting to take a look at the bluebox implementation...
That is not dead which may eternal lie And with strange aeons death itself may die.
BlueBox is to Forth what Python is to C (none / 0) (#37)
by nile on Sat Mar 24, 2001 at 12:00:41 AM EST

Very good analysis! This is not a slam on Forth, by the way, just an analogy.

There is a definite relationship between word-oriented programming and languages like Lisp and Forth that allow you to extend their syntaxes. Coupling the grammar rules in these languages with the methods and data that define the syntax of what they parse would make them word-oriented languages.

Nile

[ Parent ]
Certainly! I agree. (none / 0) (#57)
by Alhazred on Sat Apr 14, 2001 at 12:56:55 PM EST

I have worked extensively with extending FORTH (I think I wrote some of the early OO extensions for FORTH, though they were duplicated more or less by many others around the same time).

Notice that, interestingly, FORTH and LISP never made it into the mainstream of language development. The syntactical restrictions imposed in these languages in order to achieve extensibility limited their appeal severely.

I would still like, however, to gain a better understanding of how your concept would be translated into a concrete system: one which is both syntactically elegant and has the requisite performance upon which real-world applications can be built.

I guess I will HAVE to study Bluebox! ;o).
That is not dead which may eternal lie And with strange aeons death itself may die.
[ Parent ]
Nice idea, but insufficiently crunchy (5.00 / 1) (#41)
by Simon Kinahan on Mon Mar 26, 2001 at 07:19:28 AM EST

OK, so as far as I understand (and you could have written this much more simply, IMNSHO), you want a language where new constructs can be created that encapsulate the grammatical rules for their use. I sort of understand why this might be useful, by the analogy someone made with Forth, or Lisp, but in spite of the difficulty of the articles, there's no technical detail given on how the parser and language semantics need to be modified to accommodate this.

Simon

If you disagree, post, don't moderate
I agree, I'm the author, mod it down ;] (none / 0) (#42)
by nile on Mon Mar 26, 2001 at 10:51:00 AM EST

I agree. In retrospect, this tries to explain way too much for such a short piece. I've written a more detailed explanation as a new article.

Thanks,

Nile

[ Parent ]
Grammars are not ubiquitous (5.00 / 1) (#45)
by mvw on Wed Mar 28, 2001 at 11:34:25 AM EST

The achievement of object-oriented programming was a better way of organizing programs.

Every program needs to organize its data into more or less elaborate data structures, and needs to organize its operations on that data into more or less elaborate algorithms/methods. So it was a good idea to localize the appropriate data with the appropriate methods.

On the other hand, only a few programs need grammars.

Grammars are used in the context of functions that work on strings of symbols rather than numbers. In fact, all possible strings of a language should be generated by a grammar of that language. The programs that match strings against a grammar are called parsers.

Of course not every program needs to do heavy-duty work on strings, and of course, even if it needs string processing, there might be other means to achieve that processing without grammars and their parsing machinery.

I strongly doubt the relevance of these ideas, that mostly look like a way to localize parsing, for general computing.

I would not be surprised if this turns out to be one of:

  • an April 1st prank,
  • a voodoo scheme to charm venture capitalists, or
  • just some cranky computer scientists trying their hand at hacking.

Uhm, where is the code, actually?


Regards, Marc

Here's a link to the code (none / 0) (#49)
by nile on Fri Mar 30, 2001 at 12:43:59 PM EST

You can find the code at bluebox.sourceforge.net. This is a link to the current C++ version, which is being obsoleted in favor of a Python version. That version will be out at the end of April.

Look at the rules/relationships in the XMLGUI words. Notice how it is possible to inherit them and that polymorphism is also possible. This is a new type of inheritance and polymorphism.

I recommend reading the longer version of the word model posted earlier to understand what is going on here.

cheers,

Nile

[ Parent ]
It HAS to be an April 1st prank (none / 0) (#56)
by exa on Sun Apr 01, 2001 at 01:22:36 PM EST

I would rule out the possibility that nile is a computer scientist. Just to make sure, I showed that sourceforge page to a couple of other CS grad students like me, and one of them actually bothered to take a look at it, and he laughed his ass off. I mean, well, there are these nice people like nile who really want to do something but perhaps aren't quite getting it right.

nile: If you're writing a new PL, that's appreciated. If not, walk out of the door. And don't make overly exaggerated statements which come to mean "I've invented a new programming paradigm". YOU HAVE NOT. In the writing of compilers and NLU projects people DO this, and in a way that you will probably never be able to match if you keep working the way you are. You have to read tons of textbooks and lots of papers to get to the level where you can really put together some new semantics. Go read some parsing/automata methods, and you'll see that there are MOUNTAINS of MATHEMATICS that you are ignoring. Writing even an LR(1) parser is a very difficult thing, and I DON'T THINK YOU HAVE EVER WRITTEN AN LR PARSER IN YOUR LIFE. Yeah, perhaps you have used some parser generator, or written some silly recursive descent parser, but those are not the same thing. First write a parser! Then RANT about it!

Ah, and as clearly indicated, these ideas are not that relevant for a general-purpose language, because you can't use a stupid parser for solving most of the problems. My suggestion: choose more concrete and narrow goals for your personal projects.

Just because there are people here who are non-programmers and may think your good-english-bad-idea articles are worthy, and vote for them, doesn't mean these articles of yours are worthy!!
__
exa a.k.a Eray Ozkural
There is no perfect circle.

[ Parent ]
Theoretically undoable (2.50 / 2) (#46)
by wytcld on Thu Mar 29, 2001 at 07:37:25 PM EST

It's widely claimed in linguistics that you can't get semantics from syntax. Since the specifications for any programming language are syntactic, the question is whether you can actually create an artificial language where the semantics can be fully embodied in the syntax. If you can do this you'll have a powerful counter-claim not just against Chomsky, but also against most of the competing schools of linguistics. Good luck!

Something different is going on here (none / 0) (#51)
by nile on Fri Mar 30, 2001 at 12:51:18 PM EST

I agree that you cannot get semantics strictly from syntax. There is a large amount of research to back that up.

This article is pointing out something much more mundane and eminently more practical. In every program, the elements in the domain it is working in have both syntactical and semantic relationships. In a math program, the syntactical relationships would be the legal ways in which it is possible to put together '0-9', '+', and '()'. The semantic relationships would be what those combinations mean. The point of this article, which is explained more in the detailed version, is that we should couple syntactical and semantic relationships for the same reasons we couple data and methods in objects. Both couplings eliminate side effects and make possible a new type of inheritance and polymorphism.
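
As a purely hypothetical sketch of that coupling for the math example (none of these names are from BlueBox):

#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Couple a symbol's syntactic relationships (what may legally appear
// around it) with its semantics (what the combination means).
struct MathWord {
    std::string symbol;
    std::vector<std::string> operandKinds;  // syntax: legal operands
    std::function<int(int, int)> meaning;   // semantics
};

int main() {
    MathWord plus{"+", {"number", "parenthesized expression"},
                  [](int a, int b) { return a + b; }};
    std::cout << plus.meaning(2, 3) << '\n';  // prints 5
}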

Let me know if this doesn't make sense and I will try to explain it from a different angle. I also recommend reading the longer version for a more detailed explanation.

cheers,

Nile

[ Parent ]
Problems computer scientists can solve... (5.00 / 2) (#47)
by ucblockhead on Fri Mar 30, 2001 at 10:36:57 AM EST

First, I hope this isn't taken as a slam, because it isn't meant as such. I've been very interested in these articles and hope they keep coming. Having been around in ~1985, I can say that much of the confusion here echoes the reactions of people to OOP when it hit the mainstream. Whether this is a case of that, or a misfire, I'm not smart enough to say.

But anyway...this quote bugged me on two levels:

The word model is important because it increases the number of problems that computer scientists can solve.
First, it bugs me because too many new languages, or types of languages, are aimed at "computer scientists" not "programmers". That is, too many fall down when they hit real world concerns.

But that's not a a big deal. What really bugs me about this quote is that it is wrong. It needs a word like "easily" near the end there. Because as we all know, a Turing Complete language is a Turing Complete language. The "Word Model" might be the bee's knees and the greatest thing since sliced bread, but it doesn't extend what we "can" do one jot. The question is (and what I'm sure you mean) does it make a large set of problems easier to solve?

The reason this bugs me so much is that I've had the misfortune of being on projects where it was felt that certain languages or models were "needed" to solve the task, at which point all reason was thrown out the window and all the effort was thrown at shoehorning the new model in when it would have been faster just to do what everybody already knew.

It may seem to be nitpicking to harp on one little word like "can" so much, but I think that it is critical to understand that things like the "Word Model" don't increase the number of things we can do, they merely make easier the things we want to do. With that understanding, you are much less likely to get blinded by the technology and forget to solve the problem at hand.
-----------------------
This is k5. We're all tools - duxup

A scientist is not her science (none / 0) (#48)
by Sunir on Fri Mar 30, 2001 at 12:16:37 PM EST

Actually, while you are theoretically correct and I understand your point, I think there is a practical dimension you are missing. Any Turing Complete language can implement any problem computer science can solve, but it doesn't follow that any Turing Complete language can implement any problem computer scientists can solve.

That is, computer scientists are people. People have a limited ability to understand the problem domain, despite the infinitude of possible solutions. Consequently, the limiting factor in practical computer programming is not the fundamental capabilities of the language, but the incapabilities of the programmers' minds.

As you know, this is why we keep building new languages; and I think it's why people are so religiously tied to their favourite language. Of course I don't like Java, because it makes me feel stupid (can't get anything done). I don't like feeling stupid, even though programming is so stupefying!

Anyway, this is a stock response. I just wanted to refute your dismissal of "needing" a language to do a job. I really do need Perl to do CGI scripting because it's easier to get a handle on than doing something in C++, where the regular expressions are likely to be wrong and difficult to change.

I guess it comes down to money. While money isn't a force on Turing Completeness, it is quite a significant force out here in the Real World. It costs more money to develop CGI scripts in C++ vs. Perl. So, it's legitimate to need Perl and not want C++.

"Look! You're free! Go, and be free!" and everyone hated it for that. --r
[ Parent ]

No (none / 0) (#50)
by ucblockhead on Fri Mar 30, 2001 at 12:49:02 PM EST

No, that's precisely my point. If anyone working for me said that they "needed" Perl to do a CGI script, I'd fire 'em. You don't "need" Perl. You want Perl. And it is perfectly legitimate to want Perl. I'd want Perl if I were writing a CGI script. But if C++ were the only thing available, I'd write it in C++, because it is perfectly possible. And the truth is, once you stop telling yourself that you "need" Perl, and just start coding, you'll realize that it isn't actually all that hard to do it in C++.

Harder than Perl, yes, but not that hard.

You don't "need" Perl. You "want" Perl. And that is an absolutely critical difference when thinking about these things because when you lock yourself into "need" mode, you end up paralysing yourself by looking for the perfect tool instead of coding. I've been on projects where exactly that happened.


-----------------------
This is k5. We're all tools - duxup
[ Parent ]

Efficiency not efficacy (none / 0) (#53)
by Sunir on Fri Mar 30, 2001 at 02:01:43 PM EST

Well, I hope you wouldn't fire someone for that! That's really bad management. It's also bad management to ignore time and fiscal pressures. Developing a CGI in C++ takes longer than Perl, and it's harder to maintain, which means you can slip on deadlines and lose that important customer. So, yes, it's still legitimate to need Perl.

In a theoretical Turing world, you don't, but the point is we don't live in academic theses. I'm trying to introduce the other factors that exist in the Real World that need (!) to be considered.

If your suggestion that C++ is just as easy as Perl is your sticking point, that's a completely separate point. "[C++ is] Harder than Perl, yes, but not that hard," is something you find, but those people that are more proficient at Perl regular expressions might disagree.

Consider that I wrote a C++ header parser in Perl in two days, whereas that would be intractable if I coded it directly in C++. I had a hard, hard, hard deadline of three days to write a document generator. For me, Perl was faster than C++. Maybe you are just much better at writing recursive descent parsers in C++ than I. But I'm sitting in this chair and you aren't, so blphssst. ;)

"Look! You're free! Go, and be free!" and everyone hated it for that. --r
[ Parent ]

You are missing the point. (none / 0) (#54)
by ucblockhead on Fri Mar 30, 2001 at 03:46:25 PM EST

No, again, you never need a certain technology. You do take into account the costs and benefits of each available technology when applied to the problem domain.

I did not say that C++ is "just as easy as Perl". That's exactly where you are missing the point. What I am saying is that until you abandon the mistaken notion that you "need" a certain technology, you are never going to correctly weigh the costs and benefits of various technologies when applied to the project.

The attitude of "I need X" leads down disastrous paths where you spend more time getting X working on a new platform than it would have taken just to do the project using the already available, but suboptimal, technology Y.

Yes, if I were in your shoes, I'd use Perl too. That's not the point. The point is that if you say "I need Perl", then you end up paralysed when you end up on a platform where Perl is not available.

That's exactly what I am talking about here. I saw a very large project sucked down into a black hole because the guys in charge decided that they "needed" full OOP with dynamic classes. They then proceeded to burn many man-years trying to shoehorn that into C++, when if they'd just sat down and written the damn thing in straightforward C/C++, they'd have been done.

Again, we are not talking about "Perl is better than C++" here. We are talking about the attitude of "what tool in my toolbox will best solve the problem". And you are damn right that I'd have very strong words with anyone who came to me and said "I can't do the project, because I need that shiny new tool". No, you don't "need" it. Compare the cost of acquiring the tool with the cost of not using the tool.

An example:

It'd take you two days to write a C++ header parser in Perl. Suppose it'd take you two weeks to write a C++ header parser in C++. Now suppose that you are working on a platform without Perl, and that it would take you three weeks to port Perl. (We are talking hypotheticals here.)

In such a case, is it smarter to use C++ or Perl for the project? (Assuming it's a one-off.)

That's the point. You don't "need" Perl. Perl is better, in a quantifiable way, for a certain set of problem domains. Those are two very, very different statements. To mistake one for the other is a massive mistake.
-----------------------
This is k5. We're all tools - duxup
[ Parent ]

Violent agreement (none / 0) (#55)
by Sunir on Fri Mar 30, 2001 at 04:15:01 PM EST

I agree with your point in your hypothetical situation, but there are very few circumstances when you are limited in technology. Typically you have many options. Nonetheless, the impoverished situations exist. For instance, I have to use CodeWarrior to develop on a Palm (especially given my efficiency constraints). So, while I would prefer using an environment I am more comfortable with, I have to adapt.

One of the worst places to work is at a place that makes development tools. You are forced to use their development tools even if they are a bad fit to the problem (or just bad in general). In that case, you need to use the house's product--for political reasons as well as dogfooding. Having worked for a very large corporation on their dev tools, I can tell you how frustrating this was.

Anyway, I think you need to do whatever it takes to do your job to the best of your ability. If you can use Perl, do so. If you can't, I agree that you should stop kvetching about it and get on with your life. Same goes for non-programming situations as well.

I think we're violently agreeing.

"Look! You're free! Go, and be free!" and everyone hated it for that. --r
[ Parent ]

To everyone who made suggestions (4.00 / 1) (#52)
by nile on Fri Mar 30, 2001 at 01:12:56 PM EST

Several people made suggestions on how to make "The Word Model: A Detailed Explanation" better. As they've probably noticed, this article does not incorporate any of them.

This was not intentional. This is actually a version of the article that was posted over a week ago, and that several people indicated they wanted a longer version of. To that purpose, when this had received 450 votes or so (with only a +27) and seemed about to be moderated out, I wrote a new, longer version: "The Word Model: A Detailed Explanation."

Somehow this article has risen from the dead - probably because it is a good summary of the longer version. I just want to let all of the Kuro5hin readers who made great suggestions know that I am not ignoring them. Your help is greatly appreciated, and your suggestions will be incorporated into the documentation on the site in the future.

cheers,

Nile

