Kuro5hin.org: technology and culture, from the trenches
The Distributed Folding Project enters the CASP 5 trials

By Jeff Gilchrist in Internet
Tue Jun 04, 2002 at 03:00:06 PM EST
Tags: Science (all tags)

Since early this year, The Distributed Folding Project has been hard at work predicting protein fold conformations with the help of thousands of participants. This distributed computing project should not be confused with the similarly named Folding@home distributed computing project, which aims to simulate the act of protein folding rather than the outcome of the phenomenon.

Knowledge of the shape of these folded proteins is vital for determining their behavior and function, so obtaining this knowledge is the goal of the project. This is an important area of medical research that could have implications for the treatment of diseases such as Alzheimer's and AIDS. To do this, the project uses a distributed computing system in which individual computers run a software algorithm that produces predicted shapes for the folded protein structures. These are then sorted by another piece of software and uploaded to a central server, where the data is analyzed. Additional information about the science behind the project can be found on this page.
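The generate, sort, and upload flow described above can be sketched roughly as follows. This is only a toy illustration: the random walk on a 2D grid, the `compactness` score, and every function name here are assumptions made for the example, not the project's actual algorithm or scoring method, which the article does not describe.

```python
import random

def random_chain(length, rng):
    """Build one candidate as a random walk on a 2D grid (toy stand-in for a fold)."""
    x, y = 0, 0
    chain = [(x, y)]
    for _ in range(length - 1):
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x, y = x + dx, y + dy
        chain.append((x, y))
    return chain

def compactness(chain):
    """Hypothetical score: mean squared distance from the centroid (lower is more compact)."""
    n = len(chain)
    cx = sum(p[0] for p in chain) / n
    cy = sum(p[1] for p in chain) / n
    return sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in chain) / n

def best_candidates(n_samples, length, keep, seed=0):
    """Generate candidates, sort them by score, and return the best `keep` for upload."""
    rng = random.Random(seed)
    scored = [(compactness(c), c)
              for c in (random_chain(length, rng) for _ in range(n_samples))]
    scored.sort(key=lambda pair: pair[0])
    return scored[:keep]
```

In this sketch a client would call something like `best_candidates(200, 20, 5)` and send only the five best-scoring candidates to the server, which keeps the upload small compared with the amount of work done locally.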


The reason for this work is that the sequence of amino acids that form a particular protein can now be predicted in many cases thanks to research on genetic sequences, but proteins do not stay in two-dimensional structures. The hydrophobic effect causes a protein's amino acid chain to fold into a complex three-dimensional structure that can't easily be predicted from knowledge of the amino acid sequence alone.

In the CASP 5 trials, different groups of scientists use different techniques to predict the structures of selected folded proteins during a designated period of time. This "competition", with different groups and varying methods, is an effort to determine the abilities and limitations of current protein structure prediction techniques. The Distributed Folding Project will be competing against other major scientific groups to deliver the most accurate predictions. Given the limited time period of the trials (occurring between May and August), it is very important that the project obtain as much processing power as possible. More processing power produces larger numbers of predicted conformations, which improves the chance that the results are close to the actual shapes of the proteins studied. This is an opportunity to play an important role in potentially revolutionary medical research aimed at solving one of biology's most difficult problems!
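To make "closest to the actual shapes" concrete, one common way to compare a predicted structure with a known one is the root-mean-square deviation (RMSD) between matching atom positions. The sketch below assumes the two structures are already superimposed and have the same number of points. The article does not say how CASP scores predictions, so this is an illustration of the general idea rather than the trial's method, and the function names are made up for the example.

```python
import math

def rmsd(predicted, actual):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("structures must be non-empty and of equal length")
    total = sum((px - ax) ** 2 + (py - ay) ** 2 + (pz - az) ** 2
                for (px, py, pz), (ax, ay, az) in zip(predicted, actual))
    return math.sqrt(total / len(predicted))

def closest(candidates, actual):
    """Pick the candidate structure with the lowest RMSD to the known structure."""
    return min(candidates, key=lambda c: rmsd(c, actual))
```

The more candidates a project generates, the more likely it is that at least one of them has a low deviation from the real structure, which is why raw processing power matters here.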

The software client supports a variety of operating systems including Linux, Windows, Solaris, OSX, FreeBSD, Tru64, HP-UX 11 and more. To join the project, first go here and register, making sure to record the handle that the server assigns you. You can then download the client software here and give the program that handle the first time you run it in order to start participating.

The Distributed Folding Project enters the CASP 5 trials | 35 comments (26 topical, 9 editorial, 0 hidden)
Can someone tell me (5.00 / 5) (#2)
by CaptainSuperBoy on Tue Jun 04, 2002 at 10:19:33 AM EST

Can someone tell me, what's the difference between all the different folding groups?  Does this group have the same goal as Folding@Home, and United Devices?

--
jimmysquid.com - I take pictures.
Semi-explanation (none / 0) (#25)
by thebrix on Tue Jun 04, 2002 at 04:25:07 PM EST

The following are horribly simplified explanations, but give the gist of what is going on :)

The United Devices project is different from the other two; it takes the structure of molecules known to be implicated in cancer and tries large numbers of artificial, but plausible, molecules to see which ones fit with the cancer molecules and so have a possibility of neutralising their effects (rather like a guard being put over a knife blade). Those that do are candidates for being developed into real drugs and tested further.

The other two projects tackle a well-known difficult problem in genetics; the two-dimensional structure of many proteins is known but precisely how they appear in three dimensions is not. Both projects try to find out plausible three-dimensional structures, but by entirely different methods, rather like deriving a representation of a mountain range by crumpling up the sheet of paper a map is printed on.

[ Parent ]

Difference between F@H and DF (none / 0) (#29)
by pointwood on Wed Jun 05, 2002 at 04:36:32 AM EST

The F@H (Folding@Home) and the DF (Distributed Folding) projects are complementary projects. A short explanation taken from the official DF board:
"Distributed Folding looks to be addressing the structure prediction problem, whereas Folding@Home is addressing how proteins fold."

If you are interested in the science behind the DF project, you should check out the Educational Forum right here:
http://www.free-dc.org/forum/forumdisplay.php3?s=&forumid=32

I'm not very familiar with UD, but AFAIK, they are mostly working on cancer but have also worked on Anthrax. Read more about it here:
http://members.ud.com/home.htm

They only have a Windows client, which I have never installed and probably never will. I don't like their terms of agreement. It has been discussed more than once in our forum (the Ars Technica DC Forum), and the problem I have with it is that unless I specifically configure the client after installing it, the client will run whatever project the UD people want to run. When they started, they advertised a lot about cancer research, and that is why they got a lot of participants. I bet a lot of participants expect the client to be doing cancer research, when in fact it could easily be crunching on a different project without asking them.

What I like a lot about the DF project is that the project management team takes security and privacy issues very seriously. I see that as a welcome change.

--
Pointwood - Contribute your spare computer power to science!
http://tsf.dbestern.net/


[ Parent ]
UD (none / 0) (#32)
by CaptainSuperBoy on Wed Jun 05, 2002 at 09:47:50 AM EST

I ran UD for a while.. your concerns are valid, UD is designed to eventually farm out their network to paying customers.  You can opt to only work on the cancer project though, but by default you will work on whatever project they decide on.

I'll look into DF.. sounds interesting.

--
jimmysquid.com - I take pictures.
[ Parent ]

Distributed Folding (none / 0) (#34)
by pointwood on Thu Jun 06, 2002 at 07:22:24 AM EST

It is IMHO pretty cool.

The team behind it is very responsive and has been fixing bugs very quickly.

--
Pointwood - Contribute your spare computer power to science!
http://tsf.dbestern.net/


[ Parent ]
Still waiting (5.00 / 7) (#3)
by dark on Tue Jun 04, 2002 at 10:33:22 AM EST

Such distributed calculation projects have come and gone over the years, but I'm still waiting for one which meets these criteria:

  • It comes with source code, so that I know what's going on on my computer
  • The results will be in the public domain

    I've seen many projects that meet the second criterion (sometimes by default, if the target is simply a decryption key), but I haven't yet seen one that meets the first. I would love to get a pointer to one, since my computer sometimes gets bored.



    DF Project with Source (5.00 / 3) (#4)
    by Jeff Gilchrist on Tue Jun 04, 2002 at 10:40:52 AM EST

    Hi Dark, there is one such project currently available whose results will be in the public domain and whose source is available: the ECC (Elliptic Curve Cryptography) cracking project here: http://www.nd.edu/~cmonico/eccp109/main.html

    [ Parent ]
    Thanks! And it proves my point :) (5.00 / 2) (#14)
    by dark on Tue Jun 04, 2002 at 12:04:03 PM EST

    I did a quick review of this program before installing it, and I found a security problem. I wouldn't have had an opportunity to find it in a binary-only program, other than by digging out my decompiling tools and spending a week or two.

    I mailed cmonico about it; do you know anyone else I should contact? The web site doesn't mention any addresses.



    [ Parent ]
    You're Welcome (5.00 / 2) (#15)
    by Jeff Gilchrist on Tue Jun 04, 2002 at 12:09:02 PM EST

    Interesting... Cmonico is the main person to contact for the project, he usually responds pretty quickly. I helped co-write the Win32 Service version but if Chris knows about the problem you found it will probably get to me eventually.

    [ Parent ]
    I'd rather see (5.00 / 3) (#6)
    by FlightTest on Tue Jun 04, 2002 at 10:48:38 AM EST

    It comes with source code, so that I know what's going on on my computer
    I'd rather see a project where the source was given to a respected third party. For the sake of discussion, say Alan Cox. He verifies that the code contains nothing malicious, and agrees not to help people spoof the client. An NDA on the workings and client-server communications, as it were.

    The obvious problem is, widely giving out the source code almost guarantees that someone will mess with the code and either (a) goon it up because they didn't understand what they were doing, or (b) goon it up intentionally to spoof the results.

    Having closed source on this type of project almost completely takes care of (a), which is the far more likely case, and discourages casual instances of (b). How do you distribute the code and not completely negate the benefits of distributed computing?

    Why did I flip? I got tired of coming up with last minute desperate solutions to impossible problems created by other fucking people.
    [ Parent ]

    security through obscurity? (3.00 / 1) (#12)
    by tps12 on Tue Jun 04, 2002 at 11:44:23 AM EST

    Spoofing results is already possible, it just requires some effort to reverse-engineer the communication protocol.

    Closing the source does not free the authors from the responsibility to build in verification measures to counteract spoofing. But it makes it difficult for the community to help improve upon whatever measures they do take.

    If their security model depends on the source being closed, then they already have problems.

    [ Parent ]

    The practice of spoofing (5.00 / 2) (#16)
    by dark on Tue Jun 04, 2002 at 12:23:20 PM EST

    I have thought a lot about this, while preparing an article which I never wrote. I don't think it is possible to have a security model for this that works. If a client sends in a negative result, then the only way to verify that is to duplicate the work it did, which defeats the point of distributing the work. Any attempt to create a trusted computing environment on the client machine is going to be vulnerable to reverse engineering.

    [ Parent ]
    It's actually, quite simple. (none / 0) (#28)
    by Trepalium on Wed Jun 05, 2002 at 12:42:50 AM EST

    You occasionally give the computer a task for which you already know the answer. If it returns the correct answer for such control data, you can verify that the software is operating properly. The other option is to only accept data that is corroborated by two or more computers. The downside to both of these approaches is that you effectively reduce maximum computational power in the name of accuracy.
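The control-task idea above can be sketched like this. Everything here is hypothetical: `work` is a toy stand-in for the real computation, and the batch format and function names are invented for the example. It does not describe how any of the projects in this thread actually verify results.

```python
import random

def work(task):
    """Toy stand-in for the real computation a client performs on one task."""
    return sum(i * i for i in range(task))

def make_batch(real_tasks, controls, n_controls, rng):
    """Mix a few control tasks (whose answers the server already knows) into a batch."""
    chosen = rng.sample(sorted(controls), n_controls)
    batch = list(real_tasks) + chosen
    rng.shuffle(batch)  # the client should not be able to tell controls from real tasks
    return batch

def verify_batch(results, controls):
    """Accept results only if every control task present matches its known answer."""
    return all(results[t] == answer
               for t, answer in controls.items() if t in results)

def honest_client(batch):
    return {t: work(t) for t in batch}

def lazy_client(batch):
    return {t: 0 for t in batch}  # reports results without doing the work
```

The cost is visible in `make_batch`: every control task in a batch is a task the server learns nothing new from, so the fraction of controls is a direct trade between throughput and confidence.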

    [ Parent ]
    Re: The practice of spoofing (none / 0) (#35)
    by jeffenstein on Fri Jun 07, 2002 at 12:54:04 PM EST

    One way to prevent this is to give the same work unit to multiple clients (maybe 2-3), and verify that they all come back with the same results. Yes, this will slow down the project, but this greatly lessens the chances of spoofing.

    However, I agree that it is impossible to completely eliminate the possibility of spoofing.
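The redundancy scheme described above, where the same work unit goes to two or three clients, might look like the following sketch. The `quorum` parameter and the function name are assumptions for illustration. It also assumes results can be compared for exact equality, which may not hold for floating-point output from different platforms.

```python
from collections import Counter

def accept_result(reports, quorum=2):
    """Return the value reported by at least `quorum` clients, or None if none qualifies."""
    if not reports:
        return None
    value, count = Counter(reports).most_common(1)[0]
    return value if count >= quorum else None
```

For example, `accept_result([42, 42, 7])` accepts 42 because two clients agree, while `accept_result([1, 2, 3])` returns None and the work unit would have to be reissued. Handing each unit to k clients divides the project's effective throughput by roughly k, and colluding clients could still agree on a false value, so this reduces spoofing rather than eliminating it.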

    [ Parent ]

    No (5.00 / 2) (#18)
    by FlightTest on Tue Jun 04, 2002 at 12:32:42 PM EST

    Not security through obscurity. The biggest problem will be well-intentioned people tweaking it to make it faster, gooning it up and causing unintended problems. Also, you eliminate casual spoofing of results. People who do it because it's easy.

    And opening the source makes it much easier to spoof the verification measures, because you already know how it works.

    The security model doesn't DEPEND on closed source any more than my house security depends on my neighbors watching out for each other. It's one facet of a much larger security model.

    Why did I flip? I got tired of coming up with last minute desperate solutions to impossible problems created by other fucking people.
    [ Parent ]

    D.net (5.00 / 2) (#11)
    by autonomous on Tue Jun 04, 2002 at 11:27:21 AM EST

    I've been playing with d.net for a long time. They give you 99.5% of the source code for the clients and proxies; the only code they leave out is the encryption keys used for block management and the buffer format. I think this is quite fair, as a full source code release may open the project up to large-scale cheating (I know that many eyes make bugs shallow, but one cheater can destroy the work of thousands of people, so that section of the code stays private until the contests are done). http://www.distributed.net/source/
    -- Always remember you are nothing more than a collection of complementary chemicals worth not more than $5.00
    [ Parent ]
    Most of the DF client is open source (none / 0) (#30)
    by pointwood on Wed Jun 05, 2002 at 05:31:09 AM EST

    However, there are some essential parts of it that aren't.

    If you have any questions in that regard, please ask in the official project forum:
    http://www.free-dc.org/forum/forumdisplay.php3?s=&forumid=27

    I'm sure Howard (the primary person working on this project) will answer your questions. I believe he already has answered this question before but I couldn't find it right now :(

    --
    Pointwood - Contribute your spare computer power to science!
    http://tsf.dbestern.net/


    [ Parent ]
    Issue (4.50 / 8) (#5)
    by MotorMachineMercenary on Tue Jun 04, 2002 at 10:44:57 AM EST

    Since this kind of research has an absolutely enormous profit potential if/when an application is found, I'm extremely interested whether these people are willing to donate the results to the public domain, just like I am doing with my processor cycles. I briefly went through the FAQs and found the following:

    32. Who will own the patent for this discovery and which commercial enterprise will profit?

    The results from distributed computing by the Distributed Folding software will be compiled and reviewed at the SLRI for additional study. As Distributed Folding results test the ability of the software to recreate known protein folds, the initial results are unlikely to make any new discovery about the structure of these proteins. It will, however, validate the utility of the software. The authors of the software and their employer, the Samuel Lunenfeld Research Institute (SLRI) own the Distributed Folding software under the terms of the Intellectual Property Policy of Mt. Sinai Hospital and its agreements with the University of Toronto. In any event Mt. Sinai Hospital will own any new intellectual property associated with the resulting data, and will make any such discoveries available to the public, at no cost, sometime after their initial research is completed. [emphasis mine]

    I'm not exactly happy with making results "available" to the public (whatever that means), especially with such a vague timeframe. I've found this to be more or less the case with scientific distributed programs (please let me know if there is an exception out there).

    Unless someone can give me more tangible information on patent/intellectual property issues, I'm going to stick with SETI@home. 538 work units and still going!

    My bodyweight is muscle and cock MMM
    Tenured K5 uberdouchebag Herr mirleid
    Meatgazer Frau gr3y


    Wow (2.28 / 7) (#7)
    by MotorMachineMercenary on Tue Jun 04, 2002 at 10:53:34 AM EST

    So you're saying that just because someone might profit from your otherwise unused processor cycles you're going to forgo potentially cancer/alzheimer/AIDS -curing research in favor of looking for small green men?

    This is exactly what's wrong with humanity nowadays!

    My bodyweight is muscle and cock MMM
    Tenured K5 uberdouchebag Herr mirleid
    Meatgazer Frau gr3y


    [ Parent ]
    Nothing wrong here, move along.... (4.85 / 7) (#10)
    by Elkor on Tue Jun 04, 2002 at 11:25:03 AM EST

    you're going to forgo potentially cancer/alzheimer/AIDS -curing research in favor of looking for small green men?

    Uhhh, yeah. Sounds about right.

    Reasons:
    1) The timeframe for release is nebulous. They could withhold the data for several years before releasing it, thus prohibiting any sort of feedback on your contribution. Or hold onto it for so long that it becomes outdated and useless.
    2) He is not receiving compensation for his participation in the research, but the hospital most definitely will profit from the research (if in no other fashion than prestige for the accomplishment).
    3) The little green men will already have the cure to all of the world's ailments. Thus they should be our first priority.

    That the clock cycles are unused doesn't make them unimportant.

    That is like saying "Since you don't work on Sundays you should donate your time to Company X so they can make more money."

    That's well and good for you, but he (and you) get to make your own decisions regarding how you spend your time.

    Similarly, your incredulity at their choice seems to imply you would have all space research programs stopped and refocused to medical research. Unfortunately, this wouldn't guarantee success in medical research. Just failure in space exploration.

    He chooses to chase little green men. You are free to choose what you want.

    Regards,
    Elkor


    "I won't tell you how to love God if you don't tell me how to love myself."
    -Margo Eve
    [ Parent ]
    Patents (5.00 / 5) (#19)
    by dark on Tue Jun 04, 2002 at 12:43:14 PM EST

    Simply finding the cure is not enough, it has to be distributed and implemented worldwide. If a corporation gets patents that are essential to this cure, then they can delay the worldwide cure indefinitely (well, for 20 years) while they milk the disease for profit. They might even ignore some areas (e.g. parts of Africa) that are too poor to pony up the license fees. And because of the way patents work, this would block any independent effort to find the same cure.

    I will not help any corporation achieve this sort of power. I already find it galling when publicly funded research is sold out to corporations, I'm not going to be part of a public project that does the same thing.



    [ Parent ]
    Weird (5.00 / 5) (#21)
    by dark on Tue Jun 04, 2002 at 12:53:04 PM EST

    I just noticed that you replied aggressively to your own comment. Are you some kind of troll, or is this a subtle attempt at humour? (I say "attempt" because anything I don't get is obviously not funny.)

    [ Parent ]
    Multiple accounts (none / 0) (#27)
    by silsor on Wed Jun 05, 2002 at 12:00:09 AM EST

    Looks like he forgot to log out.


    ✠  Patron saint of unmoderated (none / 0) top-level comments.
    [ Parent ]
    You might prefer Folding@Home's terms (5.00 / 1) (#9)
    by Freaky on Tue Jun 04, 2002 at 11:09:59 AM EST

    From the Folding@Home FAQ

    Who "owns" the results? What will happen to them?

    …We will not sell the data or make any money off of it.

    Moreover, we will make the data available for others to use. In particular, the results from Folding@home will be made available on several levels. Most importantly, analysis of the simulations will be submitted to scientific journals for publication, and these journal articles will be posted on the web page after publication. Next, after publication of these scientific articles which analyze the data, the raw data of the folding runs will be available for everyone, including other researchers, here on this web site.



    [ Parent ]
    Info on Distributed Folding (none / 0) (#26)
    by mad-ness on Tue Jun 04, 2002 at 10:22:45 PM EST

    Greetings. Been a while since I posted here.
    The scientists at the Samuel Lunenfeld Research Institute (SLRI) are working on various projects, several of which are more 'complete' and which are either open source or part of the public domain (GNU GPL in the case of code, 'free' in the case of information).

    BIND - The Biomolecular Interaction Network DataBase

    BINDdb

    MoBiDiCK - A Tool for Distributed Computing on the Internet

    MoBiDicK pdf

    I do not know if MoBiDiCK is open source, though that link is to a .pdf with a LOT of details on the project. BIND is a free/open database for results of protein interactions. Here is a link to a page with a list of recent submissions BIND News.

    For more information about the project, please drop by the official forum, where we tend to have big (not to mention mind-bending) discussions of the science involved, which ranges from bio-informatics to statistics to biology to high level math to hard core computer science.

    If you have a specific question about the project, the timeline in which results will be published/shared/opened or anything else please feel free to ask it there, replies are usually very swift, the project management is very attentive and active.

    Distributed Folding Forum

    Insert witty signature here.
    [ Parent ]
    Contrary thoughts (none / 0) (#31)
    by thebrix on Wed Jun 05, 2002 at 05:49:34 AM EST

    These issues always come up whenever distributed projects are mentioned and, for me, are irrelevant. As someone who's suffered three times from cancer (and survived twice thanks to sheer luck and the third time thanks to research right at the cutting edge) I want a cure!

    I couldn't care less whether the software was jointly developed by Saddam and Osama, the project underwritten by Enron, the servers located in a nuclear bunker in North Korea, and the results bought by Bill Gates ... as long as there's (a little) more understanding of the problem at the end.

    [ Parent ]
