Kuro5hin.org: technology and culture, from the trenches

Why not parallel (agent frameworks)?

By maketo in Technology
Wed Sep 20, 2000 at 03:33:55 PM EST
Tags: Software (all tags)

Several months ago I was given the task of writing very fast genetic sequence alignment software. For people who don't know, this software takes two gene sequences of arbitrary length and produces all of their possible alignments (see chapter four of this tutorial)...

The real problem arises in lab use, where a scientist has to compare a new gene sequence to an existing database of hundreds of thousands of sequences. There are already a lot of web sites that do this, but this particular piece of software was supposed to be "ours" (sigh). At first I didn't know much about the whole subject, so I decided to conduct an experiment using a dynamic programming algorithm. To be fast, C was the programming language of choice. The sequential version of the algorithm worked smoothly, but it wasn't nearly fast enough for what we wanted to do. So I decided to try to parallelize the algorithm. Writing threaded applications that work on different platforms can be a pain, no matter how much standardization people have thrown at it. Soon my parallelized version worked on a two-CPU Alpha machine and I was pleased (it was faster, too). However, when compiled under Linux, Solaris or FreeBSD it trashed memory, and then the painful task of debugging a multi-threaded application started. Due to lack of time and proper tools (my finals were coming) I abandoned the code.
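For readers who have not seen the dynamic programming approach mentioned above, here is a minimal sketch in Python (not the C code described in the article). It assumes a Needleman-Wunsch style global alignment with made-up scores (+1 match, -1 mismatch, -1 gap) and returns only the best score, not the alignments themselves; the article does not say which algorithm or scoring was actually used.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of strings a and b.

    Needleman-Wunsch recurrence, keeping only two rows of the table.
    The scoring values are illustrative assumptions.
    """
    # Row 0: aligning an empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against an empty prefix costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + pair,   # align a[i-1] with b[j-1]
                            prev[j] + gap,        # gap in b
                            curr[j - 1] + gap))   # gap in a
        prev = curr
    return prev[-1]
```

The work is proportional to len(a) * len(b) per pair, which is why comparing one query against hundreds of thousands of database sequences gets slow on a single CPU.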

After a while I revisited the topic. This time I had an idea - why not have a bunch of little processes (agents) sitting around on a LAN? Some of them can crunch on the sequence database, some can produce alignments, and others can monitor LAN resource usage and redirect agents to less-used machines. I would parallelize the solution twice: first, each machine can run a number of these agents working together, and second, the agents can run in parallel across the network. Soon I realized that generalization was in order. Why not make these components general enough and CORBA compliant (so that they can be written in the language of choice, like Python ;)? Why not have them publish their programming interfaces in XML and register them with a DNS-like yellow-pages service, make the components mobile so that resources can be used evenly on the Net, throw in an agent tracking service (so that we know where our components are right now), and perhaps in the future even enable these agents to reason about each other's interfaces?
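The "agents crunching on the database" part can be sketched on a single machine as follows. This is only an illustration under several assumptions: the names `similarity`, `split`, `agent` and `search` are hypothetical, the scoring function is a toy stand-in for a real alignment, and a thread pool stands in for agents that would really be separate processes or remote machines.

```python
from concurrent.futures import ThreadPoolExecutor


def similarity(query, candidate):
    """Toy stand-in for an alignment score: number of equal positions."""
    return sum(1 for x, y in zip(query, candidate) if x == y)


def split(items, parts):
    """Deal items into `parts` roughly equal chunks, one per agent."""
    return [items[i::parts] for i in range(parts)]


def agent(query, chunk):
    """One agent: score every sequence in its chunk, return the best
    (score, sequence) pair, or None if the chunk is empty."""
    return max(((similarity(query, s), s) for s in chunk), default=None)


def search(query, database, agents=4):
    """Fan the database out to the agents and merge their answers."""
    chunks = split(database, agents)
    with ThreadPoolExecutor(max_workers=agents) as pool:
        results = pool.map(agent, [query] * agents, chunks)
        return max((r for r in results if r is not None), default=None)
```

The caller only sees `search`; whether the chunks are handled by threads, local processes or agents on other machines is hidden behind the pool, which is the kind of "parallelization as given" the article asks for.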

What would be the benefits of such a solution? We can focus on the problem we are solving. Instead of worrying whether my C pointer is trashing memory because it ran past the end of a string, I can worry about the algorithm. Instead of thinking about POSIX-compliant threads and obtaining locks, I can have the parallelization as a given, by default. Instead of being confined to a single machine and its Python (or Java or...) implementation, I can have my Python sitting anywhere, even pieces of it sitting on different machines. I can upgrade components on the fly. My friend can write a module in Java, register it with the yellow-pages service, use a standardized XML format to describe its interface, and I can use that new component on the fly. I can even write a small OS-like module that wakes up on my Python-running Palm and gets all the components it wants off the Net.
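A rough sketch of the yellow-pages idea, assuming a made-up XML layout (a `component` element with `method` children); the article does not define a schema, so the element names, the attribute names and the `YellowPages` class are all hypothetical.

```python
import xml.etree.ElementTree as ET


class YellowPages:
    """In-memory stand-in for a networked yellow-pages service."""

    def __init__(self):
        self._by_method = {}  # method name -> list of component names

    def register(self, interface_xml):
        """Parse a component's XML interface description and index it."""
        root = ET.fromstring(interface_xml)
        name = root.get("name")
        for method in root.findall("method"):
            self._by_method.setdefault(method.get("name"), []).append(name)
        return name

    def lookup(self, method_name):
        """Names of all registered components offering the method."""
        return list(self._by_method.get(method_name, []))


# Hypothetical description a friend's Java module might publish.
JAVA_ALIGNER = """
<component name="aligner-java" language="java">
  <method name="align">
    <param name="a" type="string"/>
    <param name="b" type="string"/>
  </method>
</component>
"""

pages = YellowPages()
pages.register(JAVA_ALIGNER)
print(pages.lookup("align"))
```

A client that wants an `align` method asks the service and does not care which language or machine the answer lives on.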

Of course, one will immediately see room in this proposed solution for a graphical tool to do many a chore: assign methods to components, monitor network usage, manipulate agents by sending them around, build control components that monitor CPU/RAM usage on the network and resend the components to where this usage is lower, draw graphs, and observe the behavior that arises. Use different agents to suit different underlying operating systems, CPUs and memory sizes....
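The control component that resends agents to where usage is lower could start out as simply as the sketch below. The function name `rebalance`, the 0.8 threshold and the dictionary layout are assumptions for illustration; a real controller would also have to measure load and avoid piling every agent onto one machine.

```python
def rebalance(placement, load, threshold=0.8):
    """Return a new agent -> host mapping.

    placement: dict mapping agent name -> host it currently runs on
    load:      dict mapping host -> CPU usage as a fraction in 0..1
    Agents on hosts above `threshold` are moved to the least loaded host;
    everything else stays where it is. Deliberately naive.
    """
    idle = min(load, key=load.get)  # host with the lowest reported usage
    return {
        agent: (idle if load.get(host, 0.0) > threshold else host)
        for agent, host in placement.items()
    }
```

For example, with agents on `h1` (95% busy) and `h2` (20% busy) and an idle `h3` at 10%, only the agent on `h1` is moved, and it goes to `h3`.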

There are already mobile agent frameworks around. Aglets have been around for years, and so have other solutions. However, none of these is of commercial value. These solutions are either too academic to be practical or too low on features to be of any use. Besides, they lack control tools. Also, Beowulf has been around for a long time. So has Plan 9. However, I need a tool and a solution that will work independent of language or OS. One that is network enabled by default, without my intervention.

Not every problem lends itself to this solution. However, it would be another way to solve things. For now, I am stuck with solving matters on a single machine and telling it exactly what it has to do, no matter how complex the problem being attacked is. Many people observe that Open Source usually produces tools and methods that are (better) copies of commercial originals. Perhaps an Open Source tool such as the one proposed above would be a good thing to build. Give people the power, for free.





Why not parallel (agent frameworks)? | 12 comments (6 topical, 6 editorial, 0 hidden)
COSM (3.50 / 2) (#5)
by dieman on Wed Sep 20, 2000 at 02:06:48 PM EST


Check this out: it allows you to develop programs that can be easily compiled on any platform COSM has a port to.
COSM does not mention CORBA (written in C/asm) (4.00 / 1) (#7)
by maketo on Wed Sep 20, 2000 at 02:37:01 PM EST

Basically I am tied to C/asm and their API. I also depend on their good will and on their porting to my platform. If the whole system is written in Python/Java or another networked language of choice, and if the components are CORBA, then we are free to play. There is also no mention of XML interfaces or of any facility, even a remote one, to reason about new components that "join" the services offered. Not bad for starters; I definitely could use their experience in my project.
agents, bugs, nanites....see the connection?
[Slightly OT] Mindless Language Propagation (3.50 / 2) (#8)
by Just Me on Wed Sep 20, 2000 at 02:42:29 PM EST

Have you looked at Ada95? It has support for tasking (multi-threading) and for building distributed systems, built-in and standardised, so there should be fewer problems with it than with C. Information on Ada95 can be found here; a Free compiler (GNAT) can be found at ftp://ftp.cs.nyu.edu/pub/gnat/

My bad... (3.00 / 1) (#10)
by AgentGray on Wed Sep 20, 2000 at 03:36:35 PM EST

I voted the piece up before the changes were in place. Now the entire thing is on the front page.

I apologize.

Re: My bad... (none / 0) (#11)
by AgentGray on Wed Sep 20, 2000 at 04:15:32 PM EST

Ah, much better. Thanks to the author or editor who fixed my mistake.

piper (none / 0) (#12)
by snowdeal on Thu Sep 21, 2000 at 12:00:44 AM EST

you may want to check out piper:
"Piper is a system for managing multi-protocol connections between Internet-distributed objects. Networks, programs, files, widgets, and so on, are all treated as objects and represented in a graphical user interface (GUI) as the nodes of a flow chart (with the Pied/Piper user interface). The user can join nodes via lines that depict links for data flow, procedural steps, relationships, and so forth.

The Internet-distributed nature of Piper lets the user work in a unique way: Only the graphical representation of an object resides on a local workstation. Compute-intensive programs and large data sets can reside remotely on high-performance, high-capacity computers.

Joining nodes across the Internet can also be used to form world-wide collaboratives (such as The Loci Project) and provide an almost limitless collection of objects for the user. "
there is also a recent article entitled Distributing computing the GNU way that you might find interesting. hope this helps. - e
-- http://snowdeal.org [mutated daily]

