Do programmers have it all backwards?

By jd in Technology
Mon Apr 30, 2001 at 04:48:04 PM EST
Tags: Software

There is a fascinating operating system called "Ants", in which tasks migrate across a network to the data, rather than having the data transmitted to the tasks. Why? Because it scales much better than pushing data around, and since tasks are generally smaller, there's much less overhead.


Could something like "Ants" spell the end of Peer-to-Peer networks as they are currently implemented? Perhaps. With migrating code, each machine only has to run a (sealed) environment in which arbitrary "ants" can execute.

Gnutella, FreeNet, etc, which run by having a fixed piece of code on each machine and some local pool of data, may turn out to be simply too inflexible to work in the long run. Each does one thing, and one thing only. Want to do something else? Then you have to bring the data over, and do the work on your own machine.

This, however, defeats one of the great benefits of peer-to-peer: the sheer volume of computing power at everyone's disposal. Most of it is simply wasted.

This approach of migrating the code is not new. It was one of the concepts toyed with when "agents" were the in thing. You didn't go to the server, the agent did. It then reported back with the results, leaving you free to get on with something useful in the meantime.
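To make the idea concrete, here is a minimal sketch (in Java) of what such an agent could look like. The names are purely illustrative; this is not the actual "Ants" API.

    import java.io.Serializable;

    // An "ant" is a small, serializable task: it is shipped to the host
    // that holds the data, runs there, and only its (small) result
    // travels back over the network.
    interface Ant<R extends Serializable> extends Serializable {
        R runOn(DataStore localData);   // executed on the remote host

        // Stands in for whatever local pool of data the host exposes.
        interface DataStore {
            byte[] read(String name);
        }
    }

The only bytes crossing the network are the small task object and its small result, not the bulk data it chews through.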

It was also something that Java brought into the realm of possibility. It's much easier to migrate code across a heterogeneous network if it can run on any machine WITHOUT recompiling.

In this day and age of viruses (virii!), malicious computer users, etc, migrating code of this kind seems like an insane delusion. However, it might not be. It's easy to seal off a section of the computer, such that nothing can cross the boundary between the two parts. SE Linux achieves this quite satisfactorily, and it's not even a tenth of the way complete!

Once you can seal off a section of the computer, then viruses are contained. They can't spread into the rest of the system. If each process runs in its own self-contained section like this, then hostile code can't even infect another migrating process. It can infect itself, but that's about it.
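As a sketch of the sealing-off idea (assuming the ants arrive as Java bytecode running in a dedicated thread group; this illustrates the sandboxing principle, not SE Linux or any real Ants implementation):

    import java.io.FilePermission;
    import java.security.Permission;

    // Deny filesystem access to code running in a thread group named "ants",
    // while leaving the host's own code alone. Everything else is allowed
    // in this deliberately simplified sketch.
    class AntSandbox extends SecurityManager {
        public void checkPermission(Permission perm) {
            boolean isAnt = "ants".equals(
                    Thread.currentThread().getThreadGroup().getName());
            if (isAnt && perm instanceof FilePermission) {
                throw new SecurityException("ants may not touch the filesystem");
            }
        }
    }

    // Installed once at startup:
    //     System.setSecurityManager(new AntSandbox());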

There's another side to this, too: content control. Freenet, Gnutella, etc, rely on sheer volume to overwhelm any attempt by organizations to subvert them, but fear and intimidation work just as well on large groups as on small ones. A system where ants migrate is a system where users cannot know where the data is. ALL that's visible to the user is the result, and what site(s) they sent their ants to at the start. The result is that external content control becomes much, MUCH harder. At this point, nobody knows where the data is. Nobody at all.

(Arguably, the user who makes the data available does, but even that's not necessarily true. Ants may well have made copies, elsewhere, transferring the data to locations that are optimal for those wanting to access it. At that point, even the person who originally posted the data can't be sure where it is.)

There is one last advantage to this system: it turns the network of nodes into a shared supercomputer. The main reason Beowulf-type clusters over WANs are slow is that they ferry data around, and data is bulky and plentiful. Processes are far fewer and usually very small. WAN-based clustering, then, would be much more efficient with an ants-style approach.

Thoughts? Opinions?


Poll
Migrating code is...
o Extinct. Buffalo Bill wiped it out by mistake. 4%
o A bad idea. Too many hazards. 10%
o So-so. Want to see it implemented as Open Source, first. 19%
o An intriguing idea, with a niche market. 20%
o A great idea! I'll code it tomorrow. 6%
o Ghoti (fish) 7%
o 42 18%
o The idea is seriously flawed (see below for details) 12%

Votes: 107

Do programmers have it all backwards? | 45 comments (35 topical, 10 editorial, 0 hidden)
No - programmers do not have it backwards (3.87 / 8) (#5)
by Dacta on Thu Apr 26, 2001 at 09:19:52 AM EST

The success of Napster, and the partial success of Gnutella, has nothing to do with the technology they use, the language they are written in, or (to some extent at least) how scalable they are.

Napster succeeded because it was an easy way to get free music. That's it - simple!

There are a number of current toolkits that allow you to write mobile code fairly easily. Unfortunately, no one has really come up with a good use for it yet.

Popular Power thought they could make money by selling the distributed computing power, but the money wasn't there. That's the best attempt anyone has come up with.

Remember the key to the uptake of any new technology: What can it do for me? How does P2P mobile code help me? Possibly I could make a few cents a day selling my excess computing power, but are there any better ideas? That is the key!

Joel Spolsky has a nice article on this very thing.



I voted "niche market"... (4.20 / 5) (#7)
by DesiredUsername on Thu Apr 26, 2001 at 10:05:40 AM EST

...although "specialized" would be a better word.

The problem with mobile agents is that they presuppose that the data is already distributed. This is potentially useful for something like a search bot, but pretty lame for something like Napster. The whole POINT of Napster is to get the data from your PC to my PC. How would mobile agents solve that problem? There is no conceivable "agent" you could send me that would cause my computer to play "Metallica: Nothing Else Matters" unless the agent was effectively a copy of that song.

So I say "specialized": you have to have a situation where the data is "naturally" distributed (like price data at multiple online stores or something) where it is more efficient to transfer the agent N times plus one more transfer to get the results back to me than it would be to transfer all the data back first and then process. Internet searching and maybe some private, specialized LAN uses are all that leap to mind.

Play 囲碁
Example & Analysis (none / 0) (#31)
by craser on Tue May 01, 2001 at 04:53:22 PM EST

This is highly simplified. Also, there's a lot I don't know, so anyone who can refute/confirm my thoughts below is invited to do so.

In any computation on remote data, there are basically two ways to do it:

  1. Move the data to the local machine, then do the computation. (AKA "Traditional Method")
  2. Move the code to the remote machine, do the computation, move the results to the local machine. (The method proposed in this article.)

If the cost of moving the code to the remote machine together with the cost of moving the results back is less than the cost of moving the data, then you're going to win. If this is always the case, or is the expected norm, then you're going to win big. In other words, if the data you're actually looking for is much smaller than the data you have to sift through, sending out an Ant is a good idea.
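As a rough sketch of that break-even test (the class name and numbers below are made up for illustration):

    // Option 2 (ship the ant, ship the results back) beats option 1
    // (ship the raw data) when it moves fewer bytes overall.
    final class AntCostModel {
        static boolean antWins(long codeBytes, long resultBytes, long dataBytes) {
            return codeBytes + resultBytes < dataBytes;
        }

        public static void main(String[] args) {
            // A 50 KB indexing ant returning a 200 KB index,
            // versus pulling a 500 MB site over the wire.
            System.out.println(antWins(50000L, 200000L, 500000000L)); // true
        }
    }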

This is really useful (as mentioned above) for something like a search engine. Google (or whoever) sends out an Ant to a server that they want to index. Once loaded into the remote server, the Ant indexes the site, and returns the results to Google. These results are vastly smaller than the data that would have to be moved under the traditional model, which means that you get your results back much faster.

However, it seems that this is something of a special case. Usually, what you have is mountains of data and no computing power, a la SETI. Or you have tremendous computing power, but nothing to do with that power, as is the case with the millions of computers all over the world that get turned off every night.

The real flaw in the 'Ants' scheme (in this simplified form), is that we're assuming that the information we're looking for resides on a machine with unlimited computing capacity. Not likely.

The central itch here is the need to bring code, data, and cycles together, with minimum drag on the network. What we need is a system that can make intelligent choices about how to accomplish this on the fly. Choosing just one method is not going to work.

-chris

[ Parent ]

Concept not useful for freenet, napster, gnutella (4.00 / 5) (#8)
by fvw on Thu Apr 26, 2001 at 10:13:50 AM EST

The three peer-to-peer utilities you mentioned are all intended for sharing information. You can send an mp3, instructions on bomb making or a porn video to someone else. If we try to keep the data stationary, I'd send xmms, xpdf and smpeg to someone else.

Unless you find a way to enjoy having someone else listen to the music (etc.) for you, I don't think this'll work for applications that are meant to move data.

distributed supercomputing (3.66 / 3) (#9)
by _Quinn on Thu Apr 26, 2001 at 10:27:39 AM EST

   DC has always depended on a high ratio of compute time to communication; you simply don't benefit without one. If you're looking for high performance, migrant code doesn't help you any; it's communications-cheaper to send a few bytes telling the node what to do next, from its pre-loaded set, than to send it the code of what to do next. And only in rare cases will the data already be distributed when the DC starts, so the start-up cost of having N full copies of your program is usually pretty negligible, compared to distributing the data.
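   (To illustrate the point about pre-loaded code, a small sketch; the names are invented rather than taken from any real DC framework. The coordinator only ships an opcode and a chunk reference, because the worker already has the program.)

    import java.io.Serializable;

    // The worker already holds the code for every operation it supports,
    // so the coordinator ships only a tiny opcode plus its arguments.
    enum Op { SUM, MAX, HISTOGRAM }

    final class WorkOrder implements Serializable {
        final Op op;          // a few bytes on the wire
        final long chunkId;   // which pre-distributed data chunk to process
        WorkOrder(Op op, long chunkId) { this.op = op; this.chunkId = chunkId; }
    }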

-_Quinn
Reality Maintenance Group, Silver City Construction Co., Ltd.
What's with the title? (3.33 / 3) (#10)
by rebelcool on Thu Apr 26, 2001 at 10:35:05 AM EST

It's a tool, and there are plenty of good reasons to do it either way.

COG. Build your own community. Free, easy, powerful. Demo site

Migrating code? (4.14 / 7) (#12)
by delmoi on Thu Apr 26, 2001 at 01:14:31 PM EST

Things like Gnutella, Freenet, and Napster are generally used for media files. That is, the data files can only really be interpreted by humans. Basically, music and movies. In order to enjoy those things, you still need to get the data. Sending my media player over the pipe to watch a video on another person's computer would be interesting, but it wouldn't do me a lot of good.

Ants is an interesting idea, but it doesn't solve any of the problems that p2p programs are trying to solve.
--
"'argumentation' is not a word, idiot." -- thelizman
1000 media players on my machine? (3.50 / 4) (#18)
by guppie on Fri Apr 27, 2001 at 08:16:36 AM EST

I was thinking "This is stupid; if 1000 processes migrated to my machine to play an mp3 file, it would _really_ hog my machine."

But then the _uncompressed_ data would have to be transported to the "clients" to hear, and that would be unfeasible, too.

-1 to this one for mixing up p2p and distributed computing.

What? The land of the free? Whoever told you that is your enemy.
-Zack de la Rocha
[ Parent ]
Some links would be nice (4.40 / 5) (#14)
by tnt on Thu Apr 26, 2001 at 01:28:51 PM EST

Could you provide some links? This is very similar to something I've been working on -- some distributed computation/caching/data/etc stuff -- and seeing their perspective on this might be helpful to me.



--
     Charles Iliya Krempeaux, B.Sc.
__________________________________________________
  Kuro5hin user #279

MIT ANTS research paper (3.00 / 1) (#25)
by dnorman on Mon Apr 30, 2001 at 08:58:26 PM EST

Charles, the MIT research paper for ANTS is available at http://tns-www.lcs.mit.edu/publications/openarch98.html

[ Parent ]
Computational power (3.33 / 3) (#15)
by weirdling on Thu Apr 26, 2001 at 03:21:33 PM EST

In a homogeneous network of equal servers, this is perhaps true; but where I work, I can't migrate the server code off the Sun, because I would risk serious reductions in speed if I did.
Databases often need horrendous amounts of bus bandwidth and storage to function efficiently, so it wouldn't work easily for them, although for many other ideas it's not so bad...

I'm not doing this again; last time no one believed it.
Compression (3.00 / 2) (#17)
by DoubleEdd on Thu Apr 26, 2001 at 06:45:22 PM EST

It seems to me that a lot of the desktop user's processing time is spent expanding compressed files of one sort or another. It might be JPEGs, MP3s or otherwise, but generally speaking the data received over the network takes up less space than the data piped to the soundcard or displayed on the VDU. Sending out an ant or agent to expand the data before it even hits your computer is daft.
Whilst it might have been the case in the pre-multimedia days that code was smaller than the data, it is far from true now.

The other half (3.66 / 3) (#19)
by hardburn on Fri Apr 27, 2001 at 12:34:30 PM EST

Gnutella, FreeNet, etc, which run by having a fixed piece of code on each machine, and then some local pool of data, may turn out to be simply too inflexible to work, in the long-run. Each does one thing, and one thing only.

You forgot the other half: . . . and can be combined with other things to create useful things. That last half, IMHO, is very important.

Fear and intimidation work just as well on large groups as small groups.

Maybe. Maybe not. I know for sure that, if I'm intimidated, I'd rather be part of a large group that can help me than be all alone.

Freenet, Gnutella, etc, rely on sheer volume to overwhelm any attempt by organizations to subvert them . . . A system where ants migrate is a system where users cannot know where the data is. ALL that's visible to the user is the result, and what site(s) they sent their ants to at the start.

While that's true for Gnutella (a relatively simple, but badly thought out network), Freenet works essentially in the same way. While sheer volume is important to Freenet, Freenet is designed to scale to large networks (whereas Gnutella is definitely not).


----
while($story = K5::Story->new()) { $story->vote(-1) if($story->section() == $POLITICS); }


Gnutella (3.00 / 1) (#30)
by ryancooley on Tue May 01, 2001 at 04:37:37 AM EST

Gnutella was made as an ACTIVE network protocol, while it should have been passive so that huge bandwidth isn't required. It's something that could be fixed by any decent programmer in 15 minutes, but then try to convince EVERYONE to use your new client. The only thing FreeNet POTENTIALLY has which Gnutella does not is file caching. I say potentially since FreeNet is no more than a Java-based FTP server right now.

[ Parent ]
Freenet (none / 0) (#33)
by hardburn on Wed May 02, 2001 at 10:11:39 AM EST

You obviously aren't familiar with how Freenet works. Though the 0.3 series is buggy, it is still much more than just a "Java-based FTP server". An FTP server has zero real anonymity (pseudo-anonymity, maybe). Even the broken 0.3 series has fairly good anonymity for both publishers and requesters.

Freenet works by making an educated guess as to where the data is on the network (this isn't the right place for an extensive discussion on how it does this; why don't you go read up on it). Gnutella works by sending the request to everyone your node knows about. Gnutella also has zero real anonymity (pseudo-anonymity, maybe).

Things also look bright for Freenet 0.4. It will (we hope) fix a number of problems that have plagued the 0.3 series. It will finally do away with the inform.php system of finding new nodes (replacing it with a far more scalable system). Simulations show the Freenet network having problems getting above a few thousand hosts (IIRC) with inform.php, but going well over 200,000 with the new system (at which point the computer running the simulation ran out of memory).


----
while($story = K5::Story->new()) { $story->vote(-1) if($story->section() == $POLITICS); }


[ Parent ]
I know what I'm talking about (none / 0) (#35)
by ryancooley on Thu May 03, 2001 at 11:24:23 PM EST

Look, if the MPAA wants the logs of every 'Matrix' that passed through, they can pinpoint whose Freenet node sent it and whose node received it. With FreeNet you can claim the software was caching that file and you didn't intentionally download it, but with Gnutella you can claim the same thing... "I searched for matrix looking for mathematical information and downloaded all the results."

You're right that Gnutella sends the list of available files to everyone, but that doesn't mean much. Without the clustering that Gnutella provides, you must know the IP address of the node running FreeNet, just as you must know the IP address of a public FTP server.

File caching is the only thing FreeNet provides right now that a public FTP server doesn't.

[ Parent ]
File caching (none / 0) (#37)
by hardburn on Fri May 04, 2001 at 11:23:39 AM EST

OK, I think I was just confused over your use of "file caching".

Yes, that "file caching" is important to Freenet's design. However, with an FTP server, you must know the IP of the FTP server that has the data you want. With Freenet, you need an IP of at least one node (though it's better that you have more) and Freenet will find the exact location of the data for you.

(Note that the current beta versions of Freenet don't do this very well. The developers have basically given up trying to improve 0.3 and are instead focusing on 0.4.)

This is an important difference. It is why the developers describe Freenet as a "routing protocol", like TCP/IP, not a "storage protocol", like FTP or Gnutella.

if the MPAA wants the logs of every 'Matrix' that passed through, they can pinpoint whose Freenet node sent it and whose node received it.

For receivers, this would only be possible if they physically compromise the node. If it was possible to figure out who received the file from a remote attack, it would be considered a bug in the system. (Note that "physically compromise" could mean breaking into the system with SSH or Telnet.)


----
while($story = K5::Story->new()) { $story->vote(-1) if($story->section() == $POLITICS); }


[ Parent ]
You don't know what the Feds can do (none / 0) (#39)
by ryancooley on Mon May 07, 2001 at 08:58:55 AM EST

with an FTP server, you must know the IP of the FTP server that has the data you want. With Freenet, you need an IP of at least one node (though it's better that you have more) and Freenet will find the exact location of the data for you.

FTP search engines do just that: find what you are looking for. Besides, Gnutella does the same thing.

This is an important difference. It is why the developers describe Freenet as a "routing protocol", like TCP/IP, not a "storage protocol", like FTP or Gnutella.

I'm a Cisco certified networking tech. I don't need explanations. Just because they describe FreeNet as the be-all and end-all doesn't mean anything. I've used it, I know how it works, and I can say that it does not protect a person's privacy any more than Gnutella does. FreeNet is certainly not going to route files through several nodes. That would use a huge amount of bandwidth, and it certainly does not do it now. It is not a routing protocol at all.

For receivers, this would only be possible if they physically compromise the node. If it was possible to figure out who received the file from a remote attack, it would be considered a bug in the system. (Note that "physically compromise" could mean breaking into the system with SSH or Telnet.)

You obviously don't understand how the government works. They don't attack either end; they go to the ISPs and subpoena the logs and/or the routers themselves. Either way, they have the source and destination of every bit of traffic.

[ Parent ]

OT: (none / 0) (#40)
by hardburn on Mon May 07, 2001 at 09:49:47 AM EST

FTP Search engines do just that. Find what you are looking for.

Uhh, but in the end you are still connecting directly to an arbitrary FTP server. If that FTP server goes down, too bad for you. Freenet has spread the data towards nodes that are likely to want that data, so taking any one down is largely ineffective.

Besides, Gnutella does the same thing.

Gnutella does not spread the data to where it's most likely wanted. Nor does it use much in the way of encryption. Also, Gnutella (in its first-generation clients/servers, anyway) uses a lot of broadcasting, which wastes quite a bit of bandwidth. Freenet developers never use broadcasting, for that reason.

FreeNet is certainly not going to route files through several nodes. That would use a huge amount of bandwidth, and it certainly does not do it now. It is not a routing protocol at all.

It routes the file through every node in the request path, with each node caching the data at each point. This is the whole point behind Freenet. How can this be described as "not routing the files through several nodes"?

The bandwidth is a minor concern, since we're not broadcasting. It's just one node to one other node, which passes it to one other node. Bandwidth will build up on a linear scale (not geometric like Gnutella).
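As a sketch of that request chain (an illustration of the idea only, not Freenet's actual code):

    import java.util.HashMap;
    import java.util.Map;

    // Each node asks the next-closest node it knows, then caches whatever
    // comes back before handing it upstream: one hop at a time, so traffic
    // grows linearly with the length of the chain.
    class ChainNode {
        private final Map<String, byte[]> cache = new HashMap<String, byte[]>();
        private final ChainNode nextClosest;   // null at the end of the chain

        ChainNode(ChainNode nextClosest) { this.nextClosest = nextClosest; }

        byte[] request(String key) {
            byte[] data = cache.get(key);
            if (data == null && nextClosest != null) {
                data = nextClosest.request(key);
                if (data != null) {
                    cache.put(key, data);   // every node on the path keeps a copy
                }
            }
            return data;
        }
    }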

I'm a Cisco certified networking tech. I don't need explanations.

Freenet is quite a bit different from other network protocols. Even if I concede that any certification is worth anything, I would still doubt your (or anyone else's) ability to suddenly understand it after a few hours of working with it.

I've used it, I know how it works, and I can say that it does not protect a person's privacy any more than Gnutella does.

What version of Freenet?

they go to the ISPs and subpoena the logs and/or the routers themselves. Either way, they have the source and destination of every bit of traffic.

The developers of Freenet have always known that traffic analysis is a problem and have several new features lined up to make it more difficult in 0.4 (like Silent Bob encryption, and a lot of other stuff I don't understand :).


----
while($story = K5::Story->new()) { $story->vote(-1) if($story->section() == $POLITICS); }


[ Parent ]
Not really (none / 0) (#41)
by ryancooley on Fri May 11, 2001 at 06:24:07 PM EST

you are still connecting directly to an arbitrary FTP server. If that FTP server goes down, too bad for you. Freenet has spread the data towards nodes that are likely to want that data, so taking any one down is largely ineffective.

If you download a file on Gnutella, you are now sharing it. That's how files move to where they are needed. If one Gnutella node goes down, others have the same files to share. It works the same with FreeNet.

It routes the file through every node in the request path, with each node caching the data at each point.

With FreeNet, you are still connecting directly to the machine with the data. If you connect in a serial chain like you claim, downloading a 1 meg file passing through 5 computers would end up using 5 megs of bandwidth. Now that may not seem significant, but with an 800 meg DVD and more servers, you'd bring nodes to their knees, and you'd end up with hundreds of DVDs being downloaded through some poor sap's computer with an ISDN connection. And then he shuts off his PC :-). This is exponential. At least with Gnutella, your broadcasts are only a couple of packets through the network, better than an entire DVD being broadcast.

a lot of other stuff I don't understand

Yes, I knew that. I've read all the papers, I've used the software. I know certification doesn't mean much. These guys aren't doing anything; they're just better at P.R. Their claims are just plain old FUD.

[ Parent ]
OT: shall I respond (none / 0) (#42)
by hardburn on Wed May 16, 2001 at 04:06:43 PM EST

I've been debating on if I should continue this debate. Well, here goes:

Perhaps I should have done this earlier, but I will list my own "certifications". I have been on the Freenet development mailing list for over a year. I chat in the Freenet IRC channel many times a week. I have a few pieces of code in the Freenet CVS tree (not much, mind you, but it's something). I think I know a little more about Freenet than you.

If one Gnutella node goes down, others have the same files to share. It works the same with FreeNet.

Yes.

With FreeNet, you are still connecting directly to the machine with the data. If you connect in a serial chain like you claim, downloading a 1 meg file passing through 5 computers would end up using 5 megs of bandwidth. Now that may not seem significant, but with an 800 meg DVD and more servers, you'd bring nodes to their knees, and you'd end up with hundreds of DVDs being downloaded through some poor sap's computer with an ISDN connection. And then he shuts off his PC :-).

You contradict yourself here. You said above that in Freenet (and Gnutella), if one computer goes down, others have the file to pick up the slack. Now you say that if one guy shuts off his PC, the file is lost.

It IS true that Freenet will always route a request for a given file to the same place (unless your node has found a node since the last request that is closer to the file). However, if the next node in the chain isn't connected, it will just route it to the next closest node.

Also, in 0.4, files can be split up upon insertion. To encourage people to do this (but primarily to make traffic analysis harder), all files will be padded to a size which is a power of 2.
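A quick illustration of the padding rule (a sketch of the idea only, not Freenet 0.4 code):

    // Round a file's size up to the next power of two, so an observer
    // can't distinguish files by their exact length.
    class Padding {
        static long padToPowerOfTwo(long size) {
            long padded = 1;
            while (padded < size) {
                padded <<= 1;
            }
            return padded;
        }
        // e.g. padToPowerOfTwo(700000) == 1048576 (a 700 KB file pads to 1 MiB)
    }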

This is exponential

This is linear. A request goes from one node to the next node to the next node, etc. This is either a linear build-up of bandwidth, or you have a funky definition of "exponential".

In Gnutella (first generation, anyway), a request will go to all the nodes you know about, to all the nodes they know about, to all the nodes they know about, etc. This is a geometric build-up of bandwidth (slightly different from exponential, but not by much).

a lot of other stuff I don't understand
Yes, I knew that. I've read all the papers, I've used the software.

See my qualifications above. You misunderstood what I meant by "stuff I don't understand". Freenet is a complex beast (I've argued that it's even chaotic (in the scientific sense) in nature, although not everyone agrees) and has ideas within ideas within ideas. However, like many forms of chaos, it is at least understandable on the surface after you've looked at it long enough.

I pretty much understand the 0.3 stuff, but 0.4 is a big mystery to me.


----
while($story = K5::Story->new()) { $story->vote(-1) if($story->section() == $POLITICS); }


[ Parent ]
I'm AT LEAST as bored of this as you... (none / 0) (#43)
by ryancooley on Thu May 17, 2001 at 05:45:24 AM EST

But I'll hit on one or two things....

You said above that in Freenet (and Gnutella) if one computer goes down, others have the file to pick up the slack. Now you say that if one guy shuts off his PC, the file is lost.

You missed the point. Let's say that all nodes are running Gnutella and FreeNet... A has a lot of movies available for download. B is a node with an ISDN line whose Freenet node connects to both A and C. C wants to download a few of the movies from A. Now, if FreeNet works as you say it does, then all the billions of bytes sent from A to C are bottlenecked at B (who is extremely unhappy that he can't surf the web because all his bandwidth is being used). I was assuming that B with the ISDN line was the only route from A to C, which is a situation that will happen very often in the real world.

This is either a linear build up of bandwidth or you have a funky definition of "exponential".

Let's assume there are 5 computers like B who act as a path between big-time bandwidth users. To get a 1 meg file, 6 megs of bandwidth are used. To get a 3 meg file, 18 megs of bandwidth are used. If you don't consider that exponential, I don't know what else to tell you. Meanwhile, with Gnutella, if A is downloading a 5 meg file, it connects directly to C and only 5 megs of bandwidth are used (direct connection from node to node).



[ Parent ]

Late to the party (none / 0) (#44)
by rusty on Thu May 24, 2001 at 03:21:19 AM EST

I'm late, and probably no one will ever see this, but I just wanted to pick this nit:

To get a 1 meg file, 6 megs of bandwidth are used. To get a 3 meg file, 18 megs of bandwidth are used. If you don't consider that exponential, I don't know what else to tell you.

Actually 1 * 6 = 6 and 3 * 6 = 18, so that's the very definition of linear. If X is the size of the file, and Y is the number of machines it passes through, then the bandwidth used is X*Y. If this were exponential it would be something like X^Y.
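A trivial check of that arithmetic, for the curious (illustrative only):

    // bandwidth = fileSize * hops: doubling the file doubles the traffic,
    // which is linear growth, not exponential.
    class BandwidthCheck {
        public static void main(String[] args) {
            int hops = 6;
            for (int megs : new int[] {1, 2, 3}) {
                System.out.println(megs + " MB file * " + hops + " hops = "
                        + (megs * hops) + " MB moved");   // 6, 12, 18
            }
        }
    }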

It's rare that I get to jump in with a math quibble, so I'm carpeing my diem here.

____
Not the real rusty
[ Parent ]

Almost, not quite (none / 0) (#45)
by ryancooley on Tue Jun 05, 2001 at 10:56:52 AM EST

I can see your point, but the problem is, 6 people aren't trying to download a 1 meg file and using 6 megs in the process. 1 computer is trying to get a 1 meg file and using 6 megs in the process.

Continuing on the example;

1 meg = 6 megs
2 megs = 12 megs
3 megs = 18 megs


Despite the exponential/linear argument, the whole point is that the figures are absolutely incredible. A system such as FreeNet claims to be would not, and could not, exist in the real world. If you think Gnutella felt growing pains at first, you've got no idea how painful it would be to use FreeNet in a decently sized community.

[ Parent ]
You can't transmit the user. (4.66 / 3) (#21)
by AndyL on Sat Apr 28, 2001 at 01:46:33 PM EST

In the examples you mentioned, the ants would be useless. The reason you're looking for the data is because a user wants it! The number of times I download an MP3 to do something other than listen to it is trivial, if not zero.

Generally, when you're not downloading songs or por^Wimages, the single piece of data you're downloading is useless by itself. For example, if I want to compare the large file on my machine to the large file on your machine, no matter where the comparison takes place there's a lot of data to be pushed back and forth.

The only useful end-user scenario I can think of, off the top of my head, would be if you were viewing a large file, but you didn't want to view all of it. Then it would be useful to be able to search and whatnot on the server...

Perhaps I'm just stuck in old paradigm thinking. Could you give us an example of a real-world use for this that I might be missing?

-Andy



Still interesting concept (1.25 / 4) (#22)
by Highlander on Sat Apr 28, 2001 at 02:01:55 PM EST

Voted +1, because soon surely someone will patent this stuff - prior art on kuro5hin...

Moderation in moderation is a good thing.
[ Parent ]
This is a very useful thing (4.50 / 2) (#26)
by MugginsM on Mon Apr 30, 2001 at 10:03:14 PM EST

I'm not particularly familiar with the ANTS model, but the basic idea of being able to send code "close" to the data it operates on is a very useful one.

People have been mentioning that this is useless for things like Napster, Gnutella and the like, since the thing people *want* is the data.

I'd like to counter this by saying that people don't just want "the data" - they want "some, specific, data".

Searching doesn't only consist of searching for filenames or song titles.

For example: a Gnutella-type system that lets you send mobile code to your peers.

You can now:
* Search for a song that *sounds like* a small sound sample you sent with the code.
* Search for a song title and discard matches that are the same size but contain different data from the majority - a way to filter out "faked" songs.
* Search for media files that you've watermarked.

and so on.

In the end it's just a "search for file", "download it" operation, but the power of the search just skyrocketed.
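As a sketch of what such mobile search code could look like (every class and method name below is hypothetical, not anyone's real API):

    import java.io.File;
    import java.io.Serializable;

    // The requester ships a small, serializable predicate; each peer runs it
    // over its own files and returns only the names that match.
    interface FileQuery extends Serializable {
        boolean matches(File candidate);   // evaluated on the remote peer
    }

    // Example: "same name fragment and exact size as the copy I already
    // trust", a crude way to discard faked songs without downloading them.
    final class SizeAndNameQuery implements FileQuery {
        private final String nameFragment;
        private final long expectedSize;

        SizeAndNameQuery(String nameFragment, long expectedSize) {
            this.nameFragment = nameFragment.toLowerCase();
            this.expectedSize = expectedSize;
        }

        public boolean matches(File f) {
            return f.getName().toLowerCase().contains(nameFragment)
                    && f.length() == expectedSize;
        }
    }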


Imagine the bandwidth savings if the web search engines sent a "summariser" program to all the web sites, which returned a condensed version of the site similar to the format the engine stores it in.

There are many uses for this.

- Muggins the Mad

[ Parent ]
One problem... (5.00 / 2) (#29)
by guinsu on Tue May 01, 2001 at 12:27:54 AM EST

Imagine the bandwidth savings if the web search engines sent a "summariser" program to all the web sites, which returned a condensed version of the site similar to the format the engine stores it in.

The only problem is that every porn spammer on the net would clog this search mechanism up to get more hits to their site. They would just return fake matches for every search. I thought of an idea like this a while ago, but I realized it would need some sort of moderation/policing to keep it from getting abused. And the abuse would be on a large scale; look at the tricks that have been pulled on search engines (repeated text, small fonts, fonts the color of the background, etc) to get good rankings.

[ Parent ]
Missing the true point, I believe... (4.60 / 10) (#24)
by afeldspar on Sun Apr 29, 2001 at 02:39:06 PM EST

If this is the same "ANTS" system I remember reading about in an MIT paper, the poster is doing it a disservice by misdescribing it. Several people have pointed out that the examples he cites aren't particularly useful, but the examples he gives kind of miss the real point of the system (assuming it's the same system I'm remembering, which I'll assume for the remainder of this comment.)

The true power of the "ANTS" idea is not simply moving a computational task from the client to the server (besides, given how easily most servers get Slashdotted, it doesn't seem to make sense to undistribute data processing...) The idea is instead to distribute a network-related task across the network, making each intermediate point in the chain between client and server an intelligent agent rather than just a passive relay.

An example where this would be useful? Multicasting. Server A is broadcasting a live event; clients C, E, F, H and I want to tune in. In the conventional model, A makes a separate connection to each of those clients, and transmits the same data five times, each time along the quickest path it finds to that client.

In the ANTS model, however, server A realizes, when it starts getting back information on what the quickest paths are, that some portion of them is shared: that node D is on the path to both clients E and F, and that node G is on the path to both H and I. So server A sends ANTS code to nodes D and G that arranges to receive a single stream of packets from A and fan it out to the clients that node is closest to.
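A sketch of such a multiplexing node (illustrative only; not the actual ANTS code):

    import java.util.List;

    // A relay node receives one copy of each packet from upstream and fans
    // it out to the clients it is closest to, so server A's uplink carries
    // the stream only once.
    final class RelayNode {
        interface Downstream { void send(byte[] packet); }

        private final List<Downstream> clients;

        RelayNode(List<Downstream> clients) { this.clients = clients; }

        void onPacket(byte[] packet) {
            for (Downstream c : clients) {
                c.send(packet);
            }
        }
    }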

And of course, this is not the only possible application for this technology; there are probably others out there who can assess better than I can what impact such an infrastructure could have on Gnutella, FreeNet, et cetera... I'd like to hear from those people.


-- For those concerned about the "virality" of the GPL, a suggestion: Write Your Own Damn Code.
Sounds like... (none / 0) (#28)
by _Quinn on Mon Apr 30, 2001 at 10:24:00 PM EST

   ... active networks. Maybe that's what ANTS stands for? (Active Network Transport System?)

   The active network idea may be equivalent to the idea described in the article, but if so, as you said, the author gave some poor examples.

   Impact on peer-to-peer? Not much, right now, because most peer-to-peer traffic isn't carried (routed) by the peers. If it were, this could be useful. See also: virtual private networks?

-_Quinn
Reality Maintenance Group, Silver City Construction Co., Ltd.
[ Parent ]
Mosix (3.00 / 1) (#27)
by Ubiq on Mon Apr 30, 2001 at 10:08:18 PM EST

Isn't this what Mosix has been doing for quite a few years now?

From what I've heard it's quite capable of running on ordinary workstations, not just in special clustering environments.



How did this get in? (none / 0) (#32)
by tchaika on Tue May 01, 2001 at 07:06:44 PM EST

This article is earnest and well meant, but it is just a combination of inferences drawn from poorly understood buzzwords. It really doesn't make any sense. (Sorry).



What about caching remote code locally? (none / 0) (#34)
by unDees on Wed May 02, 2001 at 07:00:20 PM EST

A related issue, though probably not the original intent of the post, is moving remote chunks of code to the local machine for speed purposes. In this age of component architectures, smiling salesmen tell us that our software can invoke methods and properties of an object "without knowing or caring whether or not the object 'lives' on another computer." Let's set aside for a moment the question of how software "knows" things. :) I could see value in grabbing a 300KB library from another machine, just so I don't have to marshal parameters back and forth across a possibly slow and flaky connection. Wouldn't it be nice if my component architecture could do that for me?

Hmmm..... do any of them do that?

Your account balance is $0.02; to continue receiving our quality opinions, please remit payment as soon as possible.

Yes, RMI does it. (none / 0) (#38)
by tchaika on Sun May 06, 2001 at 12:50:51 PM EST

RMI (in Java) will download necessary classes remotely if configured to do so. However, the consequences of this type of architecture, if widely applied, result in nightmarish visions of DLL Hell multiplied by Outlook Security Bug Hell. Getting the right balance between security, complexity and convenience is daunting. There's an awful lot to be said for monolithic executables, from a support and reliability point of view.

[ Parent ]
migrating code to bring the data to you (none / 0) (#36)
by jdhollis on Fri May 04, 2001 at 01:25:47 AM EST

I'm seeing a lot of comments from people wanting the data to come to them. I agree; otherwise, what good is the data if you can't use it? But that doesn't mean all the code has to stay with you. Why not send out code onto the network that brings you the data? To some extent, something along these lines has already been implemented (although my description is rather poor, as I do not understand very well how it works): the Scheme Underground Network Package. How fitting that something so flexible would be coded in a dialect of Lisp. I haven't looked at this thing in several years; reading this article caused me to remember it. Hope someone finds this interesting.

cheers,
j.d.
