Kuro5hin.org: technology and culture, from the trenches

A step toward solving comment spam?

By ubernostrum in Op-Ed
Wed Jan 19, 2005 at 04:35:20 PM EST
Tags: Internet

Rumors have been flying on various weblogs all week, but today it's official: Google, along with Yahoo and MSN, is going to start supporting the attribute rel="nofollow" on links in an attempt to fight spam.

Except it won't work.


So... what's it do?

Basically, this attribute prevents search engines from considering the link in their rankings. Nearly all modern search engines use the number of links to a page as a measure of its relevance, and Google's PageRank algorithm is the most famous example of the technique; in effect, more links to your page mean a higher ranking in the search engines.

In itself this is not such a bad thing, but with the recent rise of weblogs and their often improbably-high PageRank values (since weblogs tend to link to lots of other weblogs, they all pull each other up) this has become a prime tool for spammers. The idea is simple: if you write a script that posts a lot of comments to weblogs and links to your site in each comment, you'll get a hefty PageRank boost.
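
To make the incentive concrete, here is a toy sketch in Python. It is not Google's actual algorithm: it is a simplified PageRank by power iteration in which each link carries a hypothetical nofollow flag, and flagged links are simply dropped before ranking. The page names, damping factor, and iteration count are all illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank. `links` maps a page to a list of (target, nofollow) pairs.

    Links flagged nofollow pass no credit at all (a simplifying assumption).
    """
    pages = set(links)
    for outgoing in links.values():
        pages.update(target for target, _ in outgoing)
    pages = sorted(pages)

    # Keep only the links that are allowed to pass credit.
    followed = {
        p: [t for t, nofollow in links.get(p, []) if not nofollow]
        for p in pages
    }

    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page in pages:
            targets = followed[page]
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no followed links spreads its rank evenly.
                share = damping * rank[page] / len(pages)
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


def two_blogs(nofollow):
    """Two weblogs that link to each other, each carrying one spam comment link."""
    return {
        "blogA": [("blogB", False), ("spam", nofollow)],
        "blogB": [("blogA", False), ("spam", nofollow)],
        "spam": [],
    }


open_rank = pagerank(two_blogs(nofollow=False))
closed_rank = pagerank(two_blogs(nofollow=True))
print("spam site rank, links counted:", round(open_rank["spam"], 3))
print("spam site rank, links ignored:", round(closed_rank["spam"], 3))
```

In this toy model the spam site ranks far higher when the comment links are counted than when they are ignored, which is the whole of the Group A business case.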

The use of rel="nofollow", in effect, closes the hole. As the Google announcement puts it:

From now on, when Google sees the attribute (rel="nofollow") on hyperlinks, those links won't get any credit when we rank websites in our search results. This isn't a negative vote for the site where the comment was posted; it's just a way to make sure that spammers get no benefit from abusing public areas like blog comments, trackbacks, and referrer lists.
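
On the weblog side, supporting the attribute amounts to rewriting every link in user-submitted markup. The following is a minimal sketch in Python, not the code any particular weblogging tool uses; the function name add_nofollow is hypothetical. It assumes quoted attribute values and uses regular expressions for brevity, where a production tool would more likely use a real HTML parser or sanitizer.

```python
import re

# Opening <a ...> tags; "<abbr" and similar do not match because of \b.
A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
# A quoted rel attribute that is not part of a longer name such as data-rel.
REL_ATTR = re.compile(r"""(?<![\w-])rel\s*=\s*(["'])(.*?)\1""",
                      re.IGNORECASE | re.DOTALL)


def add_nofollow(comment_html):
    """Return comment_html with rel="nofollow" present on every <a> tag."""

    def fix_tag(match):
        attrs = match.group(1)
        rel = REL_ATTR.search(attrs)
        if rel is None:
            # No rel attribute yet: append one.
            return "<a" + attrs + ' rel="nofollow">'
        # Existing rel attribute: keep its values and add nofollow if missing.
        values = rel.group(2).split()
        if "nofollow" not in (v.lower() for v in values):
            values.append("nofollow")
        new_rel = 'rel="' + " ".join(values) + '"'
        return "<a" + attrs[:rel.start()] + new_rel + attrs[rel.end():] + ">"

    return A_TAG.sub(fix_tag, comment_html)


print(add_nofollow('Nice post! <a href="http://example.com/">cheap pills</a>'))
```

Running the function on already processed markup leaves it unchanged, so it can safely be applied each time a comment is rendered.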

So now all a site needs to do is ensure that all links posted in comments have the attribute rel="nofollow" and the problem will go away; no PageRank bonus, no reason to spam, right?

Wrong.

Spam 101

I'm going to take a bit of a risk here and go on record as saying that this will not solve comment spam at all; in fact, it'll probably make the problem worse. The reason for this is that there are currently two kinds of spammers:

  • One kind, let's call them "Group A", spam weblogs and discussion forums for the PageRank bonuses.
  • The other kind, "Group B", spam because they want their address or brand name to be seen by as many people as possible. They don't care about PageRank, they just care about the (fairly constant) percentage of people who will hit their site after seeing the address or name. This is the same motive and method we're all familiar with from USENET and email spam.

Obviously this can stop Group A in its tracks if widely implemented (and the Google announcement sports an impressive list of weblogging tools which are already on board, with more likely to follow), but it does nothing about Group B. What's more, the Group A spammers are unlikely to say "Aw, shucks" and give up; they're probably just going to become Group B spammers, because the marginal gain of visibility is better than nothing.

So I have a feeling that what we're going to see after the widespread implementation will not be a decline in comment spam (and, on weblogs which support TrackBack, TrackBack spam), but rather a marked increase. Group A spammers could afford to be somewhat conservative in the amount of their posting, because a few links from a high-profile weblog could bring a huge PageRank increase. But with the end of the PageRank incentive they'll have to fall back on the spammer's old standby: volume. In the coming months I expect to see the firehoses turned on like never before.

The only upside is that now the two groups of spammers have been collapsed into one, and that one group shares the motives and methods of spammers in other fields. So the problem has been consolidated somewhat, which in theory makes a comprehensive solution more feasible. Of course, comprehensive solutions for spam are a dime a dozen and so far not one of them has comprehensively solved the problem.

The other effect

There's another consequence of search engines respecting rel="nofollow", and I'm surprised at how quickly it was picked up: this technique makes it much easier to criticize sites you don't like.

Robert Scoble, a relatively high-profile weblogger who works for Microsoft, is already fantasizing about the ability to criticize without giving PageRank, and what it will mean to his writing style:

What do I mean? Well, last year a carpet store in Redmond ripped off a lot of people. The store is now out of business, but back when it was happening I wanted to link to the store but couldn't.

Why not?

Because one link from my blog would have automatically put the store at the top of the search page on Google for "Redmond carpet store." Why is that? Because of my Page Rank. Several thousand sites link to me so Google's algorithm considers anything I link to as "highly relevant." I've seen this many many times.

Now, of course, Robert is free to link as he pleases, and a simple rel="nofollow" will prevent the targets of his sniping from gaining any PageRank benefits. The same applies to everyone else. Hat tip to Shelley Powers, who noticed this early on but didn't publicize it, apparently hoping it would stay undiscovered for a while.

So not only do we get more spam than before, people are probably going to be more openly nasty to each other as well. Lovely.

Or, perhaps, some prominent webloggers will use "nofollow" as a money-making technique: for a fee, they'll take the "nofollow" off your link and let you have their PageRank. This would be somewhat trickier to implement, but it goes hand-in-hand with the ability to freely link to people you don't like; PageRank can now be used tactically, to give or to take away.

Too sudden, too thoughtless

I think the eventual verdict on this will be that it was simply not thought out properly before everyone ran and implemented it. As far as I know, the first proposal for something like this came from Ian Hickson of Opera last August:

I'm thinking that HTML should have an element that basically says "content within this section may contain links from external sources; just because they are here does not mean we are endorsing them" which Google could then use to block Google rank whoring. I know a bunch of people being affected by Web log spam would jump at that chance to use this element if it was put into a spec.

At the time it made a bit of a splash, and it was discussed somewhat and even debunked somewhat, but generally everyone who heard about the idea fell in love with it (myself included). Now, less than six months later, an unprecedented joint effort of search engines and software companies has formed to implement it seemingly overnight.

Admittedly this has got to look like a godsend to people who are getting deluged with spam, and I freely admit it even had me fooled, but after a moment's reflection the problems start to emerge. Unfortunately, there doesn't seem to have been a moment's reflection before this was implemented just about everywhere.

Of course, I could be entirely wrong: it's possible that spammers will stop and consider that, since their efforts managed to unite every major search engine and weblogging tool on the planet into sudden and unprecedented cooperation, maybe their efforts are unwanted and they should stop. And it's possible that people will still snipe at each other discreetly, by avoiding links to the targets of their ire.

But I wouldn't count on it.

Poll
Is this a good idea?
o Of course it is, it'll get rid of comment spam! 23%
o Of course it is, it lets me whine about people without giving them PageRank! 17%
o It's got some good and bad points. 48%
o Of course it's not, spammers will just up their volume to compensate! 7%
o Of course it's not, mean people will be able to link with impunity! 3%

Votes: 52

A step toward solving comment spam? | 128 comments (117 topical, 11 editorial, 0 hidden)
Not for spam (2.66 / 3) (#8)
by curien on Wed Jan 19, 2005 at 05:50:58 AM EST

The problem of weblogs "spamming" the search engines with bogus hits is also a huge issue, and this could help remedy that situation.

--
This sig is umop apisdn.
I meant, Not *just* for spam nt (none / 0) (#9)
by curien on Wed Jan 19, 2005 at 05:51:35 AM EST



--
This sig is umop apisdn.
[ Parent ]
Maybe yes, maybe no. (none / 0) (#10)
by ubernostrum on Wed Jan 19, 2005 at 05:56:57 AM EST

Most of the "problem" with weblogs and search engines is that they link to each other so prolifically. But they tend to do it deliberately rather than through links in comments, so this likely won't do anything for it.




--
You cooin' with my bird?
[ Parent ]
The way I see it (none / 0) (#16)
by curien on Wed Jan 19, 2005 at 08:36:28 AM EST

This should be the default link type for all links generated by CMS-type software, not just links in the comments. My understanding is that the problem with blogs clogging up search engines isn't so much the blogs explicitly linking to each other in the content but the massive amount of cross-linking caused by TrackBack or whatever.

--
This sig is umop apisdn.
[ Parent ]
Well... (none / 0) (#22)
by ubernostrum on Wed Jan 19, 2005 at 10:00:30 AM EST

It really depends on who you talk to. Andrew "blogs are of the devil" Orlowski over at the Register would have you believe that every search is now hopelessly polluted with trackbacked and cross-linked blog URLs.

Thing is, even when you get a bunch of blog stuff it's usually blog entries that either A) talk about what you were looking for or B) link to it.

So to some folks there's really not even a "problem".




--
You cooin' with my bird?
[ Parent ]
Better real links than spam links (none / 1) (#28)
by cburke on Wed Jan 19, 2005 at 11:51:26 AM EST

So bloggers give each other sloppy Page Rank blowjobs.  Big deal.  At least the links aren't being placed by spam scripts that scour every blog on the web.


[ Parent ]
It is already possible (1.75 / 4) (#11)
by Viliam Bur on Wed Jan 19, 2005 at 06:03:18 AM EST

If you do not want spammers to make links to their pages, do not allow them to write links in the first place. Or, use something like this:

<p>Please take a look at this page:
<a href="#" onclick="location.href='http://www.kuro5hin.org'; return false;">
http://www.kuro5hin.org</a>, it is very interesting!!!</p>

People with 0 web experience will have JavaScript turned on, so it will work like any other link. People with 0.1 web experience can copy-paste the URL to browser.

Problem: (3.00 / 4) (#12)
by ubernostrum on Wed Jan 19, 2005 at 06:23:31 AM EST

Inaccessible. There are people out there who aren't using JavaScript-capable browsers and also can't view the JavaScript to copy/paste. They're called "disabled users."




--
You cooin' with my bird?
[ Parent ]
Partially wrong. (none / 1) (#17)
by diskis on Wed Jan 19, 2005 at 09:08:03 AM EST

Let's remove the JavaScript; the result is the same as if the user had disabled JavaScript, and we get:

http://www.kuro5hin.org

A non-working, or actually broken link, that you are free to copy-paste. Well, still sort of inaccessible. I'm against use of javascript, but some people just have to use it.

Disabling javascript only disables it, it doesn't remove whole tags from the source.

[ Parent ]
OK. (none / 0) (#21)
by ubernostrum on Wed Jan 19, 2005 at 09:57:52 AM EST

You show me a screen reader that gives the user the full text of a javascript URL and I'll buy it. It's not like they can hover over the link and read the status bar...




--
You cooin' with my bird?
[ Parent ]
sheesh... (none / 0) (#25)
by diskis on Wed Jan 19, 2005 at 11:01:22 AM EST

It's not necessary to dabble with javascript in this case. Notice that every link has two parts. One, the target, where the link is pointing. Two, the onscreen part, the text the browser shows.

Viliam Bur's snippet uses javascript for the first part. The part that is shown onscreen is not written with javascript, plain old text. Please try any screen reader, they will all work.

[ Parent ]

Oh, I see. (none / 0) (#26)
by ubernostrum on Wed Jan 19, 2005 at 11:30:27 AM EST

So we're still allowing naked URLs. That's spam waiting to happen, sorry...




--
You cooin' with my bird?
[ Parent ]
but... (none / 0) (#84)
by Viliam Bur on Thu Jan 20, 2005 at 09:26:52 AM EST

...that kind of spam will not increase PageRank - which is the problem that Google is trying to solve by adding a new attribute value.

[ Parent ]
No, really. (3.00 / 2) (#24)
by i on Wed Jan 19, 2005 at 10:51:55 AM EST

People with 0.1 web experience can copy-paste the URL to browser.

Maybe they can. But will they?

and we have a contradicton according to our assumptions and the factor theorem

[ Parent ]

think harder (3.00 / 2) (#34)
by Goerzon on Wed Jan 19, 2005 at 03:16:51 PM EST

Problem: spammers make links to their pages. Solution: spammers still make links to their pages, but now "people with 0 web experience" won't see the URL in their status bar before they follow the link.

Yes, that's a huge improvement.

[ Parent ]

No problem (none / 1) (#83)
by Viliam Bur on Thu Jan 20, 2005 at 09:24:42 AM EST

...you can also add a "title" attribute to display the URL before clicking, or write the URL to the window status bar when the mouse is over the text.

The "clicking of the URL" is one part of link functionality, "giving higher PageRank" is another. Replacing direct <a href="..."> with something else removes the PageRank part. Now, there are many possible ways to display the URL to the user. For example, write it on the page in small letters. Or use a small image (which would mean "You are going to click a URL that will take you out of this site") with an "alt" attribute containing the target URL. Etc...

[ Parent ]
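
The idea in this sub-thread, replacing a direct anchor with something that still shows the URL but is no longer a link, can be sketched as follows. This is only an illustration in Python under stated assumptions: the function name links_to_text is hypothetical, attribute values are assumed to be quoted, and regular expressions stand in for a proper HTML parser.

```python
import re

# An <a> element with a quoted href; group 2 is the URL, group 3 the link text.
LINK = re.compile(
    r"""<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1[^>]*>(.*?)</a>""",
    re.IGNORECASE | re.DOTALL,
)


def links_to_text(comment_html):
    """Replace each link with plain text that still shows the target URL."""

    def replace(match):
        url, text = match.group(2), match.group(3)
        if text.strip() == url:
            # The visible text already is the URL: keep a single copy.
            return url
        return f"{text} [{url}]"

    return LINK.sub(replace, comment_html)


print(links_to_text('See <a href="http://www.kuro5hin.org">this site</a>!'))
```

As noted above in the thread, this removes the PageRank benefit but still leaves a naked URL on the page for human readers.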

-1 against blog spam (2.25 / 8) (#15)
by Dr Funkenstein on Wed Jan 19, 2005 at 07:42:14 AM EST

blog spam is a good thing. keeps the crap from the useful parts of the inturnet.

Sweet. (3.00 / 8) (#18)
by Dr Gonzo on Wed Jan 19, 2005 at 09:38:14 AM EST

Does that mean Rusty will be putting a "rel=nofollow" attribute in all of MichaelCrawford's posts?

"I felt the warmth spread across my lap as her bladder let loose." - MichaelCrawford

Damn you! (none / 0) (#42)
by skyknight on Wed Jan 19, 2005 at 06:47:26 PM EST

Too fast, thou art.

It's not much fun at the top. I envy the common people, their hearty meals and Bruce Springsteen and voting. --SIGNOR SPAGHETTI
[ Parent ]
Yes. (none / 1) (#44)
by Dr Gonzo on Wed Jan 19, 2005 at 07:16:53 PM EST

It is one of the side-effects of being a supergenius like myself.

"I felt the warmth spread across my lap as her bladder let loose." - MichaelCrawford
[ Parent ]

or a life lacking refresh whore ;-) /nt (none / 0) (#47)
by skyknight on Wed Jan 19, 2005 at 07:40:15 PM EST



It's not much fun at the top. I envy the common people, their hearty meals and Bruce Springsteen and voting. --SIGNOR SPAGHETTI
[ Parent ]
Nice (1.33 / 3) (#23)
by hackle577 on Wed Jan 19, 2005 at 10:40:54 AM EST

spam sucks, this is not spam. ergo, +1 FP. i cant follow my own logic...

--
Yeah, that's right. "Turd Ferguson." It's a funny name.

Sir, your glass is severely half-empty (3.00 / 8) (#27)
by nkyad on Wed Jan 19, 2005 at 11:34:53 AM EST

I know it is natural. Given any solution to a problem, people will always try to point to what it doesn't do: "Hey, look, this wheel thing is great, but how does it help with the lack of light at night?" or "Hey, I really like this fire of yours, but can't it do anything beyond heat, light, protection and food preparation? Can we use it to float in the great water there?"

Your article has the same problem. Search engines are not trying to protect blogs from spam, they are trying to protect themselves. And then you go on and decry a clearly good side-effect because "people will be nastier to each other". Did PageRank ever get in the way when people wanted to be nasty?

Don't believe in anything you can't see, smell, touch or at the very least infer from a good particle accelerator run


Just so you know... (none / 1) (#30)
by ubernostrum on Wed Jan 19, 2005 at 12:05:06 PM EST

Google and Microsoft both own and operate weblogging services. The spam problem hits them in multiple ways.




--
You cooin' with my bird?
[ Parent ]
that attitude is very short sighted (none / 1) (#65)
by martingale on Thu Jan 20, 2005 at 12:44:37 AM EST

Doing something stupid for the sake of doing *something* is idiotic. ubernostrum is right to point out the flaws.

The idea is stupid precisely because Google is giving away the keys to their index to the masses of people out there, in the hope that those people will do the right thing. They've got PhDs who can tackle the problem in-house, and they have huge resources for testing ideas, so why risk it all on some half assed prediction about how the masses of people out there are going to use the new attribute, and *link the Google index* to those millions of people?

Whoever pushed that idea at Google should be ritually pelted with rotten eggs.

[ Parent ]

This is terrible (2.84 / 13) (#29)
by nebbish on Wed Jan 19, 2005 at 11:59:40 AM EST

It means my blog's pagerank will go down, and I really care about both my blog and its pagerank.

---------
Kicking someone in the head is like punching them in the foot - Bruce Lee

Write in: missing poll options (none / 1) (#31)
by fyngyrz on Wed Jan 19, 2005 at 01:51:18 PM EST

  • This will reduce search engine rank abuse, and reduce blog abuse as well

  • This will reduce search engine rank abuse, but it will not reduce blog abuse

  • This will not reduce search engine rank abuse, but it will reduce blog abuse

  • This will not reduce search engine rank abuse, and it will not reduce blog abuse, either


Blog, Photos.

Missing poll option (2.66 / 12) (#32)
by godix on Wed Jan 19, 2005 at 03:08:22 PM EST

It'll help get the totally worthless blog pages off my Google searches and who gives a shit what happens to the blogs?

"Yeah, we rocked the vote all right. Those little bastards betrayed us again."
- Hunter S. Thompson on the 2004 election.
Word UP!! (2.33 / 3) (#36)
by Polyester Jones on Wed Jan 19, 2005 at 04:35:11 PM EST

Blogs are the background noise of the Internet.

--
When ideas fail, words come in very handy.
-Anonymous
[ Parent ]
WIPO (1.16 / 6) (#33)
by Goerzon on Wed Jan 19, 2005 at 03:12:47 PM EST

Of course it is! It will improve PageRank and if it hurts blogs, I'd consider that a blessing, seeing as blogs are only for losers and child molestors anyway!

I doubt they have the spare capacity you think (2.75 / 8) (#35)
by R Mutt on Wed Jan 19, 2005 at 03:46:50 PM EST

You say "Group A spammers could afford to be somewhat conservative in the amount of their posting." I really don't think they're sitting there saying "Hey, I could send out 10,000 spams and make fifty bucks, but since I'm feeling conservative I'll send 1,000 and make five"

More spam = more money for them: they will already be spamming as much as they possibly can, and planning to up their capacity as soon as they can.

Nofollow will reduce their income. That means they have less money to invest in new hardware. It makes it harder for them to increase the volume.

Say they've got enough cash in hand to increase the volume somehow. There's now a lower rate of return on their investment: why does that make it more likely for them to invest in spam than they would anyway?

OK, they can switch to group B and make some money. But as you say, there's probably less money there or they'd be doing that already, so at least some will give up. I'm not sure how transferable it is anyway: I suspect mail spam is much higher volume.

Analogy: imagine someone invented a totally impregnable car lock. Your reaction would seem to be: "this is terrible: they'll all start breaking into our homes instead."
----
Coward... Asshole... from the start you kept up the appearance of objectively posting interesting links.

spam hysteria. (1.57 / 7) (#37)
by the ghost of rmg on Wed Jan 19, 2005 at 04:54:43 PM EST

the perennial whines about spam never cease to amaze me. i actually like most of the spam i receive, particularly those wonderful 419 scammers. i think spam has enriched my life more than it has proved a nuisance. as far as comments on weblogs go, that sort of spam is easy to deal with via scripts and user moderation.

this initiative, if anything, will only erode the power blogs have over search engines by invalidating user feedback. in that connection, i can't quite get my brain wrapped around what this microsoft schmuck was bitching about. why is he worried about increasing the relevance of that redmond carpet shop? if they're doing something unusual, they're absolutely relevant! why else would someone submit that search if they were not looking for information on that shop?

i suspect that was just a lame justification for what, in the end, is only an attack by large content brokers like google and yahoo on the blogosphere. this sort of thing has become all too familiar this past week.

the only kind of comment spam that should be dealt with severely is the kind one sees on slashdot with the threshold set to -1. with honest spammers, there is at least an attempt to inform, even if it is self-interested. the kind of spam on slashdot, and unfortunately this site as well, is purely obnoxious and actively seeks to destroy discussion, as opposed to innocuously promoting various potentially useful goods and services.

in the end, it comes down to a class war between the blogging and advertising proletarian and the corporate forces behind yahoo, google, and the "trolls" they employ. in reality, as small independent entities, the spammer and the blogger are ultimately on the same side of the fight, constantly attempting to find room in the marketplace for their (admittedly questionable) content amidst a world dominated by megaconglomerates who want to put them out of business by any means necessary, whether it's by harassing the bloggers' users or by flooding the legitimate spammers' advertisements out of sight.




rmg: comments better than yours.
Well done /nt (none / 1) (#38)
by i on Wed Jan 19, 2005 at 05:12:34 PM EST



and we have a contradicton according to our assumptions and the factor theorem

[ Parent ]
Nice [nt] (none / 0) (#41)
by CodeWright on Wed Jan 19, 2005 at 06:27:48 PM EST



--
A: Because it destroys the flow of conversation.
Q: Why is top posting dumb? --clover_kicker

[ Parent ]
I'd be content with (none / 1) (#49)
by gdanjo on Wed Jan 19, 2005 at 08:32:56 PM EST

a "leave out all blogs from search results" checkbox. Most of the time I don't mind if a blog returns a result, but sometimes, for that hard-to-find thing you're looking for, the noise of blog hits makes the useful info almost completely invisible.

Dan ...
"Death - oh! fair and `guiling copesmate Death!
Be not a malais'd beggar; claim this bloody jester!"
-ToT
[ Parent ]

Type B Spammers (none / 0) (#39)
by Nick W on Wed Jan 19, 2005 at 06:22:03 PM EST

Hi everyone

I've never met, nor seen a type B spammer - I'm not certain they exist at all. We have discussed comment spam a lot at my site, and many of the participants are blog spammers.

The overriding opinion on the nofollow foolishness is that it will only *increase* spam - spammers will simply double their efforts in order to reach the millions of blogs that will never be updated and will not install plugins. It's a brainless, knee-jerk, lip-service reaction from the search engines with the only purpose being to be seen to be doing something.

From a search marketer's point of view it's really rather ludicrous. The only way to move on comment spam IMO is to stop people being able to write bots (they're very simple to make) and comment *easily* on blogs. There are some very good ways to do this such as:

  • Captchas
  • Member registration
  • Pre-moderation
  • Changing the way that the comments actually work to deny anchor text to the spammers

It really isn't that tough but of course Google et al will still have to deal with folks gaming their algos by commenting on abandoned blogs. This should fade over the years as hosting accounts run out and servers upgrade, rendering old software incompatible etc., but it would certainly take time. The best solution would be for Google to recognize abandoned blogs and for MT and other vendors to put in some real proactive, as opposed to reactive (read MT Blacklist), measures to stop the automated commenting *before* rather than after the attempt is made.

Google and Yahoo and MSN are playing games and stalling for time; I don't believe for one minute that they don't know just how utterly useless this is. One SE rep told me the most interesting thing about this would be the resulting conversation...
http://www.Threadwatch.org
heh. welcome to k5... (none / 0) (#63)
by kpaul on Thu Jan 20, 2005 at 12:17:59 AM EST

weird seeing you here. ;)

i bloglines your site a couple weeks ago. keep up the good work.

are you new to k5?


2014 Halloween Costumes
[ Parent ]

Improving the CommentAPI spec (none / 0) (#67)
by skim123 on Thu Jan 20, 2005 at 12:56:53 AM EST

Let me add to your fine comment that "they" really ought to update the CommentAPI spec to support Captchas.
I've never met, nor seen a type B spammer - I'm not certain they exist at all. We have discussed comment spam a lot at my site, and many of the participants are blog spammers.
I noted this too, in this comment. However, once the major blog engines implement rel="nofollow" doesn't it logically follow that those Group A spammers must become Group B spammers? True, few, if any, exist now, but I think Google et al may have just created this group single-handedly!

A couple more comments: Member registration won't work, IMO, if it's done on a blog-by-blog or blog engine-by-blog engine basis. No one wants to have a gazillion usernames/passwords to remember. I think the only way this would work would be to have some global user store, as I discussed in this comment. Member registration would only be viable for those who read blogs only hosted by a particular engine (like on LiveJournal, for example).

Pre-moderation is the best solution today, IMO. Munging the comment URLs is really no different than Google's proposal.

Money is in some respects like fire; it is a very excellent servant but a terrible master.
PT Barnum


[ Parent ]
TypeKey (none / 0) (#112)
by dn on Fri Jan 21, 2005 at 09:17:06 PM EST

I think the only way this would work would be to have some global user store,...
Such as TypeKey.

    I ♥
TOXIC
WASTE

[ Parent ]

OK... (none / 0) (#70)
by ubernostrum on Thu Jan 20, 2005 at 02:24:23 AM EST

I've never met, nor seen a type B spammer - I'm not certain they exist at all.

So you've never used email?




--
You cooin' with my bird?
[ Parent ]
Email (none / 0) (#74)
by Nick W on Thu Jan 20, 2005 at 05:29:51 AM EST

We're not talking about email though - most blog spammers look down their noses at the UCE boys.

Nice to see you too kpaul! Yeah, I read here a little, but these are my first posts :)

I'm still trying to work out how to read the threads - the discussion is so disjointed on this system (at least to me) that I'm finding it very tough to follow the conversation heh..
http://www.Threadwatch.org
[ Parent ]

Thing is... (none / 0) (#75)
by ubernostrum on Thu Jan 20, 2005 at 05:53:47 AM EST

Take away the PageRank benefits, and the blog spammers aren't going to just go away. They've already invested in spamming as a business model, so they'll fall back to a method which still works: lots of volume, and who cares if you get PageRank?




--
You cooin' with my bird?
[ Parent ]
No they won't... (none / 1) (#80)
by njyoderx on Thu Jan 20, 2005 at 07:41:27 AM EST

They'll just resort to more e-mail spamming if anything.  If blogs frequently implement nofollow, then spammers won't bother because 99.999999% of blog readers won't do business with those people.  E-mail, otoh, has plenty of willing dimwits.  Your entire article is based on speculation of a non-existent trend.

[ Parent ]
Yes they will. (none / 0) (#82)
by ubernostrum on Thu Jan 20, 2005 at 08:37:30 AM EST

If you think bloggers are any more intelligent on average than users of email, I have a bridge in Brooklyn I'd like to sell you.




--
You cooin' with my bird?
[ Parent ]
nt (none / 0) (#86)
by njyoder on Thu Jan 20, 2005 at 09:59:05 AM EST

Most of the people responding to spam are really lonely morons, druggies and old people falling for scams.  Bloggers don't exactly fit those profiles.  Yes, a lot are lonely, but it's not the same kind of  pathetic lonely.

[ Parent ]
Strange spam (none / 0) (#40)
by Homburg on Wed Jan 19, 2005 at 06:23:13 PM EST

What this proposal doesn't deal with is the amount of comment spam which doesn't appear to be promoting a URL at all. I got hit with a lot of spam which consisted entirely of random character strings, one of which was linked to an invalid URL, made up of random characters. More recently, I've started getting spam which is of the same form as spam which promotes a link in the 'homepage' field, except that it doesn't include a link anywhere.

I've no idea what the spammers are getting at here. Do they just have so many zombies they can send some out spamming for the hell of it?

from what i've heard... (3.00 / 3) (#52)
by kpaul on Wed Jan 19, 2005 at 08:46:28 PM EST

it's a test. if they tag your site with "iarhoaiuhtruiehtihae" or something similar they can come back in a week and see if it's still there. if it is, the script puts you on a 'good' list so the real spam can begin.

at least that's what i've been told...


2014 Halloween Costumes
[ Parent ]

this article is too negative (2.66 / 3) (#43)
by jbuck on Wed Jan 19, 2005 at 06:56:23 PM EST

I think that this approach will, in fact, succeed in getting rid of one category of comment spam, and the proposed feature is useful in that it lets people distinguish between a link that recommends a site, and a link that is not intended to recommend a site. It'll be interesting to see what happens to some political bloggers' karma when this goes into effect, as many of the links from the other side of the political divide are of the form "look what this moron said today".

It can be useful for community-moderated sites (e.g. K5, Slashdot, DailyKos) as well: sites that moderate their comments might attach "nofollow" by default to links when a comment is posted, then take it off for comments that are moderated up, so that links in highly rated comments get more juice.

Yes, spammers can still post comments that attempt to convince a human being to follow the link, but they can do that today.
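
The moderation idea above (attach "nofollow" by default, then drop it once a comment is rated up) could look roughly like this in Python. This is a sketch under stated assumptions, not how K5, Slashdot, or DailyKos work: the function names and the 2.5 threshold are hypothetical, and None stands for a comment that has not been rated yet.

```python
from html import escape


def rel_for_comment_link(rating, threshold=2.5):
    """Return "nofollow" for unrated or low-rated comments, else None."""
    if rating is None or rating < threshold:
        return "nofollow"
    return None


def render_link(url, text, rating, threshold=2.5):
    """Render a comment link, adding rel="nofollow" unless it was rated up."""
    rel = rel_for_comment_link(rating, threshold)
    rel_attr = f' rel="{rel}"' if rel else ""
    return f'<a href="{escape(url, quote=True)}"{rel_attr}>{escape(text)}</a>'


# A fresh comment gets nofollow; a highly rated one passes credit normally.
print(render_link("http://example.com/", "example", rating=None))
print(render_link("http://example.com/", "example", rating=3.0))
```

The design choice here is that the default is the safe one: a link only starts passing credit after human moderators have vouched for the comment.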

This is really insightful (2.11 / 9) (#45)
by Anonymous Howards End on Wed Jan 19, 2005 at 07:28:08 PM EST

Tell you what, send it to Google, and once they've admitted that they are complete idiots with no understanding of the web or how people use it, and that you are right, and they give you a job that pays a million dollars a week, and a hot Swedish porn star wife, and a huge ranch, and a pony, then get back to us and we'll come round to your ranch for a huge party and take turns on her.  The pony, I mean.
--
CodeWright, you are one cowardly hypocritical motherfucker.
harumph (none / 1) (#53)
by Frequanaut on Wed Jan 19, 2005 at 08:55:10 PM EST

Someone is a bitter, bitter man.


[ Parent ]
I got link spammed the other day... (3.00 / 3) (#46)
by jsnow on Wed Jan 19, 2005 at 07:33:41 PM EST

My apologies if this is a boring story, but I have a wiki I use mostly for notes to myself. Some other people use it, and I disallow editing by anyone not logged in (though I allow anyone to set up an account if they wish).

The other day, a full screenful of links showed up. Here's an example, the first and last few lines (not as links):

WIKI-Sponsored links: aeron aerosole airsoft guns armani attorney bakery baseballs bikini birkenstocks blockbuster video bowling campground canon car dealer circuit city costco days inn dentist digital camcorder dr martens electric scooter embassy suites fairfield inn flat panel monitor fleet gas scooter gps hair salon hampton inn home depot insurance broker ipod kenneth cole lingerie little black dress .... buy cheap Vioxx buy cheap Wellbutrin buy cheap Xanax buy cheap Xenical buy cheap Yasmin buy cheap Zanaflex buy cheap Zithromax buy cheap Zoloft buy cheap Zovirax buy cheap Zyban buy cheap Zyrtec

The links themselves were mostly to .cc (cocos islands), .to (tonga), .nu (niue), and .com sites.

There were some interesting entries in the http logs:

212.164.71.254 - - [08/Jan/2005:13:16:20 -0800] "POST /wiki/index.php?title=Main_Page&action=submit HTTP/1.1" 302 26 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Fuck you (+http://www.go-to-the-nahuy.com); .NET CLR 1.1.4322)"

Not very polite. The IP address is from Russia. Googling the IP, I find that quite a few wikis have been defaced by this one guy.

What to do? I blocked his IP. Not very satisfying, but legal action is out of the question (the spammer is in another country, and it's not even clear if he broke any law - he certainly didn't bypass any security mechanism, he just created an account (named GoogleBot) and edited a page). I'd like to add his IP to a global list of known wiki spammers, but I don't know of any good ones. Any suggestions?

the question you have to ask yourself (none / 1) (#48)
by the ghost of rmg on Wed Jan 19, 2005 at 08:05:25 PM EST

is why the hell you use something so public for notes to yourself. are you daft?

forget about ip blacklists. the first thing you should do is buy a pad of paper. if that seems too "low tech," try a PDA.

it is a source of constant astonishment that people use the web for their own private purposes, while not only leaving their content open to the inspection of others, but even allowing others to comment or even edit that content. what is the matter with them?


rmg: comments better than yours.
[ Parent ]

shared knowledge is good (none / 0) (#50)
by jsnow on Wed Jan 19, 2005 at 08:42:53 PM EST

My notes are mostly for myself, but also for others I work with (both locally and remotely) to know what I'm doing. I could use a tighter security policy, but so far that's more work than deleting link spam (which has only happened once). I'm not worried about content deletion, since MediaWiki (the wiki software I'm using) keeps a complete history of every document.

I also like to leave some of these pages relatively open so that others can correct my mistakes and/or learn from what I know (in the event that it's useful to someone else).

[ Parent ]

me too (none / 0) (#54)
by hswerdfe on Wed Jan 19, 2005 at 08:58:23 PM EST

same thing happened to me with my wiki: same links, same username. but whatever, ban his ip and move on
--- meh ---
[ Parent ]
MediaWiki spam blacklist extension (none / 1) (#56)
by brion on Wed Jan 19, 2005 at 10:14:43 PM EST

If you're using MediaWiki, there's an experimental centralized spam blacklist extension. It's not really documented at present, but if you want to give a heads-up on a spammer you've noticed drop a note on the blacklist discussion page.

The IP addresses are generally open proxies or such, so we've concentrated on blocking the spam URLs themselves to cut down on repeat spam attempts.

The extension itself can be pulled from our CVS.



Chu vi parolas Vikipedion?
[ Parent ]
that's interesting (none / 0) (#60)
by jsnow on Thu Jan 20, 2005 at 12:03:10 AM EST

If I get spammed again I might give it a try. What does it do if a blacklisted URL is in an edit: does it remove the URL from the text, or does it refuse to apply the whole edit? Or does it block the source IP from then on?

[ Parent ]
Blacklist function (none / 0) (#69)
by brion on Thu Jan 20, 2005 at 02:05:48 AM EST

The whole edit is rejected. (It's still annoying to get spam edits even if the URLs are stripped -- you have to clean up the pages anyway to remove the junk text.)

Occasionally this can be an annoyance if a bogus entry makes it into the list. (Blocking all links to China is maybe not the best idea. ;)

The blacklist functionality is itself built-in; you can set $wgSpamBlacklist to a regular expression, which if the page matches will reject edits. The extension is to fetch the blacklist from a centralized wiki page.
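The reject-the-whole-edit behaviour can be sketched roughly like this. This is a Python analogy, not MediaWiki's actual PHP code; `SPAM_BLACKLIST`, `check_edit` and the pattern itself are made-up stand-ins for whatever regular expression a site would configure:

```python
import re

# Hypothetical pattern, standing in for a site's blacklist regular expression.
SPAM_BLACKLIST = re.compile(
    r'https?://[a-z0-9.-]*(cheap-pills|casino-bonus)\.',
    re.IGNORECASE,
)


def check_edit(new_text):
    """Return (accepted, matched_text).

    If any part of the submitted page text matches the blacklist, the whole
    edit is refused rather than having the offending URL stripped out.
    """
    hit = SPAM_BLACKLIST.search(new_text)
    if hit:
        return False, hit.group(0)
    return True, None
```

Returning the matched text makes it easier to tell a legitimate editor which link tripped the filter, which matters when a bogus entry has made it into the list.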



Chu vi parolas Vikipedion?
[ Parent ]
I get a lot of those (none / 1) (#68)
by mike3k on Thu Jan 20, 2005 at 01:24:37 AM EST

I get lots of spam consisting of a large string of links. Lately a lot of it is entirely in Chinese and all of it comes from Chinanet. I finally added a deny for their entire IP address block of 219.128.0.0-219.140.255.255 to my .htaccess. Last night I also added a patch to my Drupal-based site that automatically adds rel="nofollow" to all user-posted links.
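For anyone wanting to do the same: deny rules are usually written as CIDR blocks rather than start-end ranges, so the range has to be converted first. A small sketch using Python's standard `ipaddress` module (the helper name `is_blocked` is just for illustration):

```python
import ipaddress

# The range quoted above, converted to the smallest set of CIDR blocks.
start = ipaddress.ip_address("219.128.0.0")
end = ipaddress.ip_address("219.140.255.255")
blocks = list(ipaddress.summarize_address_range(start, end))


def is_blocked(addr):
    """True if addr falls inside any of the denied blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in blocks)


for net in blocks:
    print(net)  # one CIDR block per line, ready to paste into a deny rule
```

The range collapses to three blocks (a /13, a /14 and a /16), which is a lot shorter than listing thirteen separate /16 networks.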

[ Parent ]
+1FP (none / 1) (#51)
by kpaul on Wed Jan 19, 2005 at 08:44:18 PM EST

nice write-up. k5 is alive, sir. ;)


2014 Halloween Costumes

That's not what the SEO specialist told me. (none / 0) (#55)
by Lisa Dawn on Wed Jan 19, 2005 at 09:24:50 PM EST

Nice writeup, by the way, even if I find it a bit too biased.

There are two points I'd like to make. First: not everyone uses nofollow. That means that using it gets you a bit less spam in your blog or bbs, while these "group A spammers" will stick to the suckers who don't.

Second: don't forget that group A spammers are actually SEO kids with uncooperative ethics. Who says they'll necessarily collapse into group B spammers, instead of just giving up this aspect of SEO play? Besides... if they're group A spammers willing to become group B, they're probably in group B already.

personally... (none / 0) (#62)
by kpaul on Thu Jan 20, 2005 at 12:13:59 AM EST

i don't see it changing the comment spam problem too much. it will change the SERPs a little, although i've heard Goog was already noticing this stuff and taking it into account...

i imagine there's a lot of SEO script kiddies out there who have a bot running on their grandparent's computer and being lazy, they'll just keep it running. a lot of this crap is automated. this isn't gonna make it go away.

the word on the 'net is that this will actually cause more spammers to register blogs at the free services and use them to spam, rather than just spamming the comment forms of other blogs...

what a tangled web we weave, eh? ;)


2014 Halloween Costumes
[ Parent ]

I like it. (none / 0) (#73)
by Lisa Dawn on Thu Jan 20, 2005 at 04:35:06 AM EST

The big-ness of it all.

It's like you have to look at it in detail, but then you have to step back and realize that it's also connected to everything else.

[ Parent ]

How to use this to make money: (3.00 / 6) (#57)
by sllort on Wed Jan 19, 2005 at 11:06:19 PM EST

Next thing that needs to be done is to implement an Apache module with the following logic:

if (requesting ip is member of google's netblock)
 {
  insert nofollow in all my links;
 }
  else
 {
  all my links are normal;
 }

Then just tell Google not to cache your pages. The result? You can sell links to your competitors, all the while secretly disabling them for the purpose of search engine rankings, thereby having your pagerank and selling it too. When people visit your site, they see normal links, and when Google visits your site, it sees nofollow links. And properly implemented this should be completely untraceable. So with regards to "being nasty to people", you can now do this secretly.

Google certainly has handed a shiny new sword to webmasters who lack morals. Rusty, please get right on this.
 
--
Warning: On Lawn is a documented liar.

Or perhaps (none / 0) (#87)
by tekue on Thu Jan 20, 2005 at 10:27:16 AM EST

...Google took away the old, dull sword from the hands of advertisers who lack morals. One should not buy links from blogs, but create a product that yields praise from bloggers (if links from blogs are what he wants).

Actually, I would love people to start cheating advertisers in this manner -- that would render this "blog product placement" ineffective.

Hell, using your idea I might start selling links from my blog too :)
--
Humanity has advanced, when it has advanced, not because it has been sober, responsible, and cautious, but because it has been playful, rebellious, and immature. --Tom Robbins
[ Parent ]

Finally (3.00 / 3) (#58)
by Carnage4Life on Wed Jan 19, 2005 at 11:08:47 PM EST

I'm not sure why you are so upset about rel='nofollow' being used to deny people PageRank. I think this fixes a bug in Google which has irritated me for a very long time. Scoble should be able to link to a website that ripped him off without Google suddenly thinking that it is an authoritative site on the subject when Scoble meant the exact opposite.

I'm just amused that the entire world has to change their HTML to fix design flaws in Google's search algorithms.

PS: The first mention of an idea like rel='nofollow' was item 75 in the Wired article How To Save The Internet

Wait, no (3.00 / 7) (#59)
by sllort on Wed Jan 19, 2005 at 11:40:15 PM EST

We should be able to give the target a score, see, like this "rel=link5" or "rel=link1" ... and maybe labels. Like "insightful".
--
Warning: On Lawn is a documented liar.
[ Parent ]
+5 (funny) (none / 0) (#61)
by kpaul on Thu Jan 20, 2005 at 12:09:43 AM EST

thanks for the smile. it was much needed. :)


2014 Halloween Costumes
[ Parent ]

it's a terrible principle (none / 1) (#64)
by martingale on Thu Jan 20, 2005 at 12:30:56 AM EST

As a piece of markup, rel='nofollow' is neither good nor bad, it's just another attribute. What's truly bad about this is that it gives people yet another way of tweaking the Google search engine. Google should be figuring out ways of deducing information that's stable and can't be easily modified by abusers, not give people more ways to abuse their index.

As a general rule: if you give people new things to play with, you can't control what they'll do with it. They are kidding themselves if they think that the handful of blogger software developers will always act in ways which ultimately are in Google's best interest.

[ Parent ]

I don't get it (none / 0) (#96)
by JahToasted on Thu Jan 20, 2005 at 01:36:08 PM EST

How can this be abused? Give me an example please? It seems to me that using this attribute is like not linking to anything at all as far as google is concerned. How can this be bad? Are you saying that if I take off all the links on my webpage, it's somehow going to hurt google? You do know that you can set an option in your robots.txt to have google completely ignore your website. As far as I know that has never been abused. Maybe that's because it can't be. This new attribute is just a way to tell google to "ignore this link". It's just a finer grained version of the option you already have in your robots.txt.

Please give an example of how it can be abused.
______
"I wanna have my kicks before the whole shithouse goes up in flames" -- Jim Morrison
[ Parent ]

lowering rank (none / 1) (#100)
by martingale on Thu Jan 20, 2005 at 08:06:31 PM EST

All you need to abuse it is to make it so a site has a *lower* pagerank than its natural rank. Others have discussed the idea of bloggers linking to sites they don't like but using no-follow. What's going to happen when all those bloggers decide some site is gay-friendly or a terrorist site? Their actions can lower the natural rank of such a site without proof, rhyme or reason. This in turn biases Google's index for the worse. But now Google doesn't have as much control over this anymore, because it's the bloggers who decide who to victimize. In the past, Google has sometimes acted internally to prevent abuse. Now they're giving the ability to abuse to anybody.

Any system such as this one which takes away the control from Google's engineers and gives this control to a nebulous mass of people without accountability is not in the interest of Google's mission: to make all the world's knowledge accessible (presumably in a fair way).

[ Parent ]

But (none / 0) (#110)
by JahToasted on Fri Jan 21, 2005 at 03:25:47 PM EST

there is a difference between lowering the rank and not raising the rank. This will not lower a page rank. it will simply not raise the page rank.

Think of it like the K5 voting queue. You have the option of voting +1, 0 or -1, right? Well by linking a site it's like you are voting +1. By linking with rel="nofollow" it's like you are voting 0. You still have no option of voting -1 in Page Rank, you can only choose between +1 and 0. Voting zero is not going to lower a pagerank, it will be just as if you didn't vote (or in this case link to the site).
______
"I wanna have my kicks before the whole shithouse goes up in flames" -- Jim Morrison
[ Parent ]

of course it will lower the pagerank (none / 0) (#113)
by martingale on Sat Jan 22, 2005 at 12:34:08 AM EST

there is a difference between lowering the rank and not raising the rank. This will not lower a page rank. it will simply not raise the page rank.
If you think it cannot lower the pagerank of a site, you don't understand PageRank. A site's PR depends on the links to it, but also on the links which link to the sites which link to it, and on the links which link to the sites which link to the sites which link to it, etc all the way up to the largest cluster of sites from which you can reach that site. The PageRank measure on that cluster is in an equilibrium. If you raise a site's rank somehow, you also lower everyone else's rank, and conversely. But you don't lower everyone else *equally*, instead you reach a new equilibrium.
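To make the equilibrium point concrete, here is a toy calculation. It is a sketch under stated assumptions: it uses the textbook damped power-iteration form of PageRank (not whatever Google actually runs), a made-up three-page web, and it assumes a nofollow link is treated exactly as if the link were absent:

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict of ranks that sum to 1.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links[p]
            if targets:
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for t in pages:
                    new[t] += damping * rank[p] / n
        rank = new
    return rank


# Hypothetical web: one blog linking to two competing sites.
web = {"blog": ["shop", "rival"], "shop": ["blog"], "rival": ["blog"]}
before = pagerank(web)

# Same web, but the blog's link to "shop" is now nofollow (assumed: ignored).
after = pagerank(dict(web, blog=["rival"]))

for page in sorted(web):
    print(page, round(before[page], 3), round(after[page], 3))
```

In this toy web the two sites start out level. Once the blog's link to "shop" stops counting, "shop" drops to the baseline every page gets, and "rival" picks up the share it lost, even though nobody changed a single link pointing at "rival". The blog's own rank is unchanged.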

[ Parent ]
Using that logic (none / 0) (#119)
by JahToasted on Mon Jan 24, 2005 at 12:38:11 PM EST

If I don't link to a site at all I'm lowering its page rank. But really that's just ridiculous. All this will mean is that sites that people don't like will be ranked the same as the sites that people don't have any interest in at all. I don't see why that's a bad thing really.
______
"I wanna have my kicks before the whole shithouse goes up in flames" -- Jim Morrison
[ Parent ]
don't make silly points (none / 0) (#126)
by martingale on Thu Jan 27, 2005 at 03:22:14 AM EST

If I don't link to a site at all I'm lowering its page rank. But really that's just ridiculous.
Whatever gave you that idea? If you don't link to a site you would have linked to, then you're clearly lowering PageRank. If you simply decide not to be a player, then you don't have any effect at all anyway. You must calculate the bias with respect to the same situation if the no-follow tag didn't exist.

All this will mean is that sites that people don't like will be ranked the same as the sites that people don't have any interest in at all. I don't see why that's a bad thing really.
No, it biases the data. I've already explained to you that not linking can lower somebody else's site compared to having that link. Rankings reflect relative orderings.

Google is interested in objective information representing the state of the web as a proxy for semantic importance. They are quite clear about this: if you obviously tamper with links to manipulate PageRank, they'll penalize you.

Now, not linking to sites which you would have otherwise linked to because such linking is natural in the first place constitutes a clear form of tampering. It biases the "natural" state of the web links, ie the flow of information. Doing so on a small scale can obviously be dismissed as noise, but doing so on a large scale does affect PageRank quite appreciably.

Whoever at Google thought up this idea of giving the masses of bloggers and others this power is either very high up, or deserves to be fired. Either way, it allows precisely the kind of large scale manipulations of PageRank which have been frowned upon by Google before, when either p0rn site operators or bloggers have performed them to change term rankings in their favour.

This will bias Google's rankings if it gets used, and if it doesn't get used, or is simply ignored later on, it still adds useless cruft to an already much too complex world wide web. The guy who pushed this idea through is an idiot.

[ Parent ]

It still bothers me. (none / 0) (#71)
by ubernostrum on Thu Jan 20, 2005 at 02:33:46 AM EST

Although it is already possible to link without giving PageRank (just run the link through Google), I guess I feel it'd be more appropriate to openly link to them and use some negative phrase as the link text. If you've got good PageRank you're more likely to associate the target with that phrase.

Also, with the threat of giving PageRank to the target of your rant, you have to stop and think for a minute about whether it's really worth sniping at them. Now we can all post our knee-jerk reactions without any reflection, and I'm sure that'll make the intarweb a better, calmer place ;)

I didn't know about the Wired article, thanks for the link.




--
You cooin' with my bird?
[ Parent ]
nt (none / 0) (#79)
by njyoderx on Thu Jan 20, 2005 at 07:37:05 AM EST

1. No one knows about that option.
2. It requires a redirect instead of a simple attribute tag.
3. It will just result in increased ranking to the "redirection link" instead.
4. You're advocating google bombing.
5. Your "existing solution" is at best, just as bad as the new one.

[ Parent ]
Couple of things. (none / 0) (#81)
by ubernostrum on Thu Jan 20, 2005 at 08:36:38 AM EST

1. No one knows about that option.

That's funny, because two of the largest weblogging companies in the world (Google and Six Apart) build it into their tools (Blogger and Movable Type).

3. It will just result in increased ranking to the "redirection link" instead.

In most cases the redirect is done through Google, which provides a service for this. I don't think Google's PageRank can get much higher than it already is...




--
You cooin' with my bird?
[ Parent ]
why does kuro5hin require a subject? (none / 0) (#85)
by njyoder on Thu Jan 20, 2005 at 09:55:22 AM EST

That's funny, because two of the largest weblogging companies in the world (Google and Six Apart) build it into their tools (Blogger and Movable Type).

Doesn't change the fact that most people don't know about it.  And the other big blogs DON'T have it yet.

In most cases the redirect is done through Google, which provides a service for this. I don't think Google's PageRank can get much higher than it already is...

The link wouldn't be to google's main site, it'd be to the redirect to the other site.  That getting highly ranked is bad.  And your method is just a hack anyway.

[ Parent ]

Because it helps other people (none / 0) (#88)
by curien on Thu Jan 20, 2005 at 11:03:55 AM EST

I use dynamic threaded, so I only actually read the comments that look interesting. Your comment is less likely to be read if it has a useless or inane subject than if you give it a decent one.

Is it that hard to come up with a one-line summary of your post?

--
This sig is umop apisdn.
[ Parent ]

Seems like I figured out one that works... (none / 0) (#90)
by njyoder on Thu Jan 20, 2005 at 11:51:51 AM EST

Just put " why does kuro5hin require a subject?" in each subject line and someone will respond ;-P

[ Parent ]
It's better than nt :-) (none / 1) (#93)
by curien on Thu Jan 20, 2005 at 12:08:29 PM EST

"nt" in the subject usually means that your comment body has *n*o *t*ext.

--
This sig is umop apisdn.
[ Parent ]
Ahh, there *was* text on that one (none / 0) (#103)
by joecool12321 on Fri Jan 21, 2005 at 03:38:05 AM EST

I actually skipped the "nt" (dynamic threaded here, as well. Who isn't?) post without even opening it.

Even something as simple as "Two key flaws" or "Re: <title of previous post>" is useful.

[ Parent ]

I should have put the 'nt' in quotes nt (none / 0) (#104)
by curien on Fri Jan 21, 2005 at 04:14:12 AM EST



--
This sig is umop apisdn.
[ Parent ]
Here's how it works: (none / 0) (#89)
by ubernostrum on Thu Jan 20, 2005 at 11:15:01 AM EST

The link wouldn't be to google's main site, it'd be to the redirect to the other site. That getting highly ranked is bad. And your method is just a hack anyway.

This is a link to k5, using the Google redirect. k5 doesn't get PageRank from it. Google's PageRank is 10 and always will be, so they're not getting any PageRank from it either. It's a link which does not affect PageRank anywhere, in any way.




--
You cooin' with my bird?
[ Parent ]
I'm talking about _page_ rankings not _site_ ranks (none / 0) (#92)
by njyoder on Thu Jan 20, 2005 at 11:56:41 AM EST

I'm not saying kuro5hin is getting a rank from it.  You're confusing a rank for the entire site with a rank for a specific link/page.  The point is that the google redirect url (which will be the same for everyone who uses google redirects) will become higher ranked, which is effectively the same as having the site itself ranked.  Even if google ignores redirect links, that doesn't say anything about how other search engines deal with them or how google deals with non-google redirects.  So if you search for "bad rug seller", it will just come up with www.google.com/url?sa=U&q=http://www.badrugsellers.com instead of going directly to www.badrugsellers.com.  What's the difference?

[ Parent ]
Er. (none / 0) (#94)
by ubernostrum on Thu Jan 20, 2005 at 12:50:41 PM EST

Google doesn't track over its own redirects. That's kind of the whole point.




--
You cooin' with my bird?
[ Parent ]
Google isn't the only search engine (none / 0) (#99)
by njyoder on Thu Jan 20, 2005 at 05:51:54 PM EST

You need to read my entire reply.  What about other search engines?  What about redirects through other urls?  If you're going to have some standard, doesn't it make sense for it to be part of the A html tag rather than some redirection hack?

[ Parent ]
A couple comments... (3.00 / 3) (#66)
by skim123 on Thu Jan 20, 2005 at 12:46:35 AM EST

Robert Scoble, a relatively high-profile weblogger who works for Microsoft, is already fantasizing about the ability to criticize without giving PageRank
Um, he already has this power. It's called linking using a referrer. That is, rather than linking http://www.suckyCarpets.com, you'd link to http://www.google.com/url?sa=D&q=http://www.suckyCarpets.com. Personally, I think it's a Good Thing that Google provides this option. Users should be able to (optionally) specify the "worth" of their outgoing links. Granted, one should not be able to say, "This link is more important than a link on site XYZ.com," but they should be able to rank the worth of their links, if nothing else at least on a binary system (which is what this rel="nofollow" allows).
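Building such a link by hand is error-prone once the target has its own query string, so here is a small sketch that does the wrapping and unwrapping. It only assumes the URL shape quoted above (`sa` and `q` parameters); the function names are made up, and nothing here checks how the redirect endpoint actually behaves:

```python
from urllib.parse import urlencode, urlparse, parse_qs

# URL shape as quoted in the comment above.
REDIRECT_BASE = "http://www.google.com/url"


def via_redirect(target, sa="D"):
    """Wrap target in a redirect link, percent-encoding it as a query value."""
    return REDIRECT_BASE + "?" + urlencode({"sa": sa, "q": target})


def redirect_target(url):
    """Recover the original target from a wrapped link."""
    return parse_qs(urlparse(url).query)["q"][0]


print(via_redirect("http://www.suckyCarpets.com"))
```

Encoding the target matters: without it, any `&` in the target URL would be read as the start of a new parameter of the redirect itself.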

The other kind, "Group B", spam because they want their address or brand name to be seen by as many people as possible.
Perhaps it's just that "Group A" comment spammers visit my blog, but I find that the vast, vast majority of the time comment spam is targeted to old blog entries. My thinking is that the spammer assumes that I won't be checking those older entries for spam, but in the same vein the likelihood of folks reading my older entries is much less than folks reading my newer ones. In fact, if you check out weblogs.asp.net they essentially combat this problem by turning off the ability to comment on entries past a certain age.

Again, it might be that no "Group B" guys come my way, but if the population of comment spammers that visit my blog is random, then it would lend evidence to the fact that the overwhelming majority of comment spammers are in "Group A." Of course, as you pointed out, this may shift violently now that Google et al have introduced this measure.

I don't think anyone honestly thinks that Google's new initiative will actually end comment spam. Anyone who makes that claim is ludicrous and doesn't get the volume of spam in their Inbox that I or countless others do. Spammers aren't going to just say, "Oh cripes, they have us here, let's not bother spamming anymore." It's preposterous to assume otherwise. The only way comment spam will ever truly be defeated without requiring loads of work from the blog author (such as moderation of posts) is by having a global user store (like Passport or the Liberty Alliance) and then having blogs require that users must authenticate to post. This global user store would require some hassle in setting up an account, and if a person posted comment spam on any blog in the network, this user could be banned from the global user store, requiring the spammer to take the time/effort to create yet another account. Amortized over enough blogs, bloggers would find their comment spam trickling down to a tiny amount per year, as opposed to the masses that are often received now.

Money is in some respects like fire; it is a very excellent servant but a terrible master.
PT Barnum


isn't it the other way around? (none / 0) (#72)
by martingale on Thu Jan 20, 2005 at 02:52:16 AM EST

I think you've mixed up Group A and Group B. Group A are interested in increasing their PageRank. To do that, they have to put links pointing back towards them in as many different web pages as possible.

So they naturally want to also fill up your comment archives, since those old comments will probably appear on a different url than the main page, but still have high potency as they are on the same site as your main page.

The Group B spammers don't care about Google in particular. They are probably much more interested in trackbacks and RSS feeds, neither of which has anything to do with Google or PageRank. It's the Group B who prefer to spam only recent comment areas, because that's what the RSS feeds and casual readers will see, whereas your archives don't get pulled all over the net.

[ Parent ]

Ah yes, mixed 'em up, my bad. [n/t] (none / 0) (#91)
by skim123 on Thu Jan 20, 2005 at 11:54:32 AM EST


Money is in some respects like fire; it is a very excellent servant but a terrible master.
PT Barnum


[ Parent ]
If I'd seen this one in the queue (1.33 / 6) (#76)
by ksandstr on Thu Jan 20, 2005 at 05:59:25 AM EST

I'd have -1'd it.

Apparently, the solution to the web drowning in blog wank is more blog wank. Especially the touchy-feely limp-wristed "but can't you see that people will be MEAN to one another?!" kind, as evidently it is the kind of blog wank that heals instead of hurting and cures AIDS instead of raping your cat.

--
Gegen kommunismus und bolschewismus und terrorismus, jawohl!

If I rated comments (none / 1) (#78)
by ubernostrum on Thu Jan 20, 2005 at 07:01:49 AM EST

I'd zero that one.

Apparently the solution to blog wank is anti-blog wank. Why don't you go jerk off James A.C. Joyce somewhere else?




--
You cooin' with my bird?
[ Parent ]
Hello Kevor S. Andtrivellech (none / 1) (#111)
by Stylusepix on Fri Jan 21, 2005 at 03:37:09 PM EST

Thank you for informing us that pessimistic cat-raping wankers may hurt our blog. We at Kuro Five Hind welcome your contribution and will translate it into Scoop code at once.

if ($postormsgbody{category} eq "touchyfeelylimpwriting") {
    $heal++; $aids--; }
else {
    $heal--; rape($cat); }


OMG THAT MEANS YOU'RE A CAT RAPIST ! GET AWAY FROM HERE YOU SICK BASTARD !
Go; you're an it-getter, but No; it's all in good fun (and games). Laugh, in stock?
[ Parent ]
Google don't care about comment spam (2.33 / 3) (#77)
by frabcus on Thu Jan 20, 2005 at 06:14:12 AM EST

It's irrelevant to Google whether this stops comment spam.  What it will stop is comment spam interfering with Google's page rankings, so it will improve Google's page results.

Now, I'm sure some employees of Google have the motive to try and reduce comment spam. However, rel=nofollow will be a success for Google the corporation whether or not that succeeds.

Is this spam? (none / 0) (#95)
by VisualPolitics on Thu Jan 20, 2005 at 01:35:24 PM EST

It's interesting because I've been part of Group B of blog-spammers, but I never do it with a script, and I try to say something relevant to the discussion, and as an aside, I might post a link to my blog if I think I might get away with it. But I never use scripts, and I'm not sure my "blogwhoring" even can be construed as spam. Here's a test case - a couple links to my blog:
Rush Limbaugh humps CNN's Daryn Kagan
Bush and Condi's secret relationship

- - - - - - - - - -
Anne Coulter has a Giant Hyena Clitoris
and Other Lies of the Liberal Elite.


[ Parent ]
You can't stop the spammers. (none / 1) (#97)
by JahToasted on Thu Jan 20, 2005 at 01:52:24 PM EST

The spammers always find a way. I predict that once all the blogs start using this attribute, the spammers will start setting up their own blogs and set up bots to plagiarise content from popular blogs to their own. Then since it's their own blog they will be able to create links without this tag.

Of course this will lead to mass confusion, since there will be a huge number of blogs with the same content with no way to know which was the original. You will be able to figure it out by finding the one that doesn't have the spam links, but who has time for that?

Ultimately, the only permanent solution will be for the search engines to ignore blogs completely.
______
"I wanna have my kicks before the whole shithouse goes up in flames" -- Jim Morrison

Hmmmm (none / 0) (#115)
by toulouse on Sat Jan 22, 2005 at 06:10:17 AM EST

"... since there will be a huge number of blogs with the same content ..."

Yes. That would be awful.


--
'My god...it's full of blogs.' - ktakki
--


[ Parent ]
It will work at what it is designed for (none / 0) (#98)
by cburke on Thu Jan 20, 2005 at 02:41:21 PM EST

Seems like a good idea to me.  No it won't stop spam, but it will make Google results more relevant and make certain kinds of spam less effective (by reducing the effect of spam on Page Rank).  Both are good; reducing the ROI of spam is good.

On the other hand I wonder how much of Google's Page Rank goodness depends on comments in blogs/forums saying "Oh, you have a question about X?  I found a good link on the subject here..."

Nothing to see here, move along... (2.25 / 4) (#101)
by Surial on Thu Jan 20, 2005 at 11:06:43 PM EST

rel=nofollow is merely a combination of hype generation and implementing something to simplify matters for those who can't program very well.

In practice rel=nofollow has always been there. Just DON'T surround your link with the A HREF tag if the requestor is google. If it's anything but google, send the link as normal.

That's nofollow for you right there. Always has been there.

Hence, this really is nothing new under the sun, except maybe the psychological effects of on one hand showing solidarity against spammers but on the other hand letting the secret out, as it were.

--
"is a signature" is a signature.

what an incredibly efficient method! -nt (3.00 / 2) (#102)
by forgotten on Fri Jan 21, 2005 at 01:13:05 AM EST


--

[ Parent ]

I hope IHBT - cause: you are so very wrong. (none / 0) (#109)
by Stylusepix on Fri Jan 21, 2005 at 03:02:11 PM EST

There are more search engines than google.

Having multiple render modes is much harder and more resource-intensive than adding a tag to links and keeping a single render mode: consider caching, essential to most applications. It often deals with completely rendered pages, so as to entirely avoid calling the web application. Your system breaks that, requiring multiple versions. As for Google's and other search engines' caches, they are broken too.

The code required to implement functionality such as that which you describe requires a far greater amount of work - given that there are thousands of web applications that must be modified, this more complex modification would never become nearly as widespread as the better solution that is 'nofollow'.
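For comparison, the single-render-mode change can be sketched in a few lines. This assumes the comment HTML has already been sanitized and that attribute values never contain a literal `>`; the function name `add_nofollow` is made up for illustration, and a real application would more likely do this in its HTML sanitizer than with regular expressions.

```python
import re

# Matches an opening <a ...> tag and captures its attribute text.
A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
# Matches an existing rel attribute (with its leading whitespace).
REL_ATTR = re.compile(
    r"\s+rel\s*=\s*(?:\"[^\"]*\"|'[^']*'|[^\s>]+)", re.IGNORECASE
)


def add_nofollow(comment_html: str) -> str:
    """Force rel="nofollow" onto every anchor in user-submitted HTML."""

    def fix(match: re.Match) -> str:
        # Drop any rel the commenter supplied, then append our own.
        attrs = REL_ATTR.sub("", match.group(1)).rstrip()
        return f'<a{attrs} rel="nofollow">'

    return A_TAG.sub(fix, comment_html)


print(add_nofollow('See <a href="http://example.com/">this page</a>.'))
```

The output is the same for every visitor, so fully rendered pages can still be cached as a single copy.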

Your method is fatally flawed.

HTH. HAND. IBIYGD.
Go; you're an it-getter, but No; it's all in good fun (and games). Laugh, in stock?
[ Parent ]

Huh? (none / 0) (#123)
by Surial on Wed Jan 26, 2005 at 05:34:43 AM EST

I'm not 'wrong' - I was replying specifically to the various tricks described in the article, whereby you let on that everything's just hunky dory but secretly send nofollow ONLY to google. I was mentioning this 'nofollow' is nothing new in that regard, you've always been able to do that.

So I'm in favour of this tag. Okay, rereading my comment that wasn't very clear. Just mentioning that 99% of the (correct) complaints about the nofollow thing actually apply to web linking at large, and hence can't really be used as reasons not to implement nofollow.
--
"is a signature" is a signature.

[ Parent ]

Carpet store example (none / 0) (#105)
by jmj on Fri Jan 21, 2005 at 07:53:43 AM EST

I don't understand the problem with the anti-carpet store page getting a lot of PageRank.

Yes, the association between the carpet store's name and the phrase "Redmont carpet store" would get a boost. But the high-PageRank page containing the criticism of the store will be high in the search results for that phrase, so people will at least be warned that there may be something wrong with the store (unless they've been using the "I'm feeling lucky" button). Isn't that exactly what you would want?

When I'm looking for product information, I want to see the bad reviews along with the good ones. If suddenly these bad reviews would be excluded, my impression would change completely.



Yes, but... (none / 0) (#106)
by elgardo on Fri Jan 21, 2005 at 09:40:56 AM EST

If I search for a specific company or product, I see the company or product page as the most relevant page. Reviews about the company or product should be secondary.

[ Parent ]
PageRank is fleeting. (none / 0) (#114)
by ubernostrum on Sat Jan 22, 2005 at 03:44:27 AM EST

If you notice, the boost from a link like that wears off with time.




--
You cooin' with my bird?
[ Parent ]
Blog Spam (none / 0) (#107)
by Rezvid on Fri Jan 21, 2005 at 10:13:08 AM EST

As someone who only rarely reads blogs, I was shocked by the spam on that Dave Barry blog example page Google provided. However, I should not really have been surprised by the amount of spam. As for blogs having unreasonably high PageRanks, I personally do not have a problem with it, as long as the search engine hit is relevant. The link could be to the depths of Hades, as long as it shows me what I was looking for.

It's not meant to do anything with group B (none / 0) (#108)
by redrum on Fri Jan 21, 2005 at 01:44:49 PM EST

This isn't meant to end weblog spam, it's meant to put an end to spammers abusing PageRank. That's all.
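As a rough illustration of "no credit for nofollow links", here is a sketch using the simplified textbook form of PageRank (power iteration with a damping factor). It is not Google's actual ranking code; the `pagerank` function, the tiny three-page graph, and the choice to drop nofollow links before ranking are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank over (source, target, nofollow) link tuples."""
    pages = sorted({p for s, t, _ in links for p in (s, t)})
    outgoing = {p: [] for p in pages}
    for source, target, nofollow in links:
        if not nofollow:  # nofollow links pass no credit at all
            outgoing[source].append(target)

    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            # A page with no counted links spreads its rank evenly.
            targets = outgoing[page] or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def example_graph(spam_link_nofollow):
    """A blog and a news site link to each other; a comment links to spam."""
    return [
        ("blog", "news", False),
        ("news", "blog", False),
        ("blog", "spam", spam_link_nofollow),
    ]


followed = pagerank(example_graph(False))
ignored = pagerank(example_graph(True))
print(followed["spam"], ignored["spam"])
```

In this toy graph the spam page ranks lower when the comment link is marked nofollow, which is the whole of the intended effect; nothing here stops the comment from being posted.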

Censorship, Subjectivity and PagePop (none / 0) (#116)
by femto on Sat Jan 22, 2005 at 08:51:34 PM EST

Won't 'nofollow' allow censorship and bias to creep into search engines?

For example, if a significant portion of the Internet links to a controversial page, search engines will note this popularity and 'report' on it.

With 'nofollow' a significant number of people may feel pressured to label links to controversial pages as 'nofollow'. The pressure may come from their own beliefs or a campaign by a third party.

The result is that PageRank stops being an objective measure and becomes a subjective measure.

Essentially 'nofollow' redefines pagerank. Without nofollow, PageRank in some way measures the importance of a page. With no follow PageRank becomes a measure of popularity.

I hereby propose a name change from PageRank to PagePop.

Now you're getting it (none / 0) (#118)
by curien on Mon Jan 24, 2005 at 09:34:31 AM EST

With no follow PageRank becomes a measure of popularity.

That is exactly what PageRank is supposed to do.

--
This sig is umop apisdn.
[ Parent ]

It should be an option (none / 0) (#117)
by auraslip on Mon Jan 24, 2005 at 04:35:51 AM EST

duh
124
Move towards the Semantic Web (none / 1) (#120)
by DragonDave on Mon Jan 24, 2005 at 06:09:06 PM EST

What we need to do is move further towards a both machine- and human-readable website, where Google and other search tools can understand the meanings behind the text; nofollow is not the right way of dealing with this.

Nofollow means that bots are supposed to NOT FOLLOW the link, not just not index the relationship between them. What we need is a set of meaningful attachments to websites, potentially through XML markup.

Ideally, we want to be able to identify ourselves properly over the internet (whether through single identifiers or a multiplicity of those identifiers) and to attach trust behaviours to people. I trust some sites because I trust the author [see newscientist.com]; others I choose not to trust because the people who trust that website are a bunch of morons [see any pseudoscience page]. And then you can say "don't trust these spammers..."

What about "human" tests? (none / 0) (#121)
by phliar on Tue Jan 25, 2005 at 01:09:17 AM EST

I'm a little behind the times where blogs are concerned, so I don't know what I'm talking about. On some sites' form submissions, I often see an image of random letters and numbers with distortions to make it hard to OCR -- having the same characters in the form field proves that a human filled it out. Wouldn't this kill all automated form submitters, and hence blog spam?
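The server side of such a test can be sketched roughly as follows, leaving out the part that draws the distorted image. The names `SECRET_KEY`, `new_challenge`, and `verify` are hypothetical, and carrying a keyed hash of the answer in a hidden form field is just one assumed design; as written it does not prevent a solved challenge from being replayed.

```python
import hashlib
import hmac
import secrets
import string

# Hypothetical per-server secret; a real site would persist this.
SECRET_KEY = secrets.token_bytes(32)
ALPHABET = string.ascii_uppercase + string.digits


def _sign(text: str) -> str:
    """Keyed hash of the expected answer, safe to send to the client."""
    return hmac.new(SECRET_KEY, text.encode(), hashlib.sha256).hexdigest()


def new_challenge(length: int = 6) -> tuple[str, str]:
    """Return (text to draw as a distorted image, token for the form)."""
    text = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return text, _sign(text)


def verify(answer: str, token: str) -> bool:
    """Check the typed characters against the token from the form."""
    expected = _sign(answer.strip().upper())
    return hmac.compare_digest(expected, token)


text, token = new_challenge()
print(verify(text, token))
```

The check only proves that someone read the image correctly; it says nothing about who that someone was.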

Faster, faster, until the thrill of...

You're right (none / 1) (#122)
by toulouse on Tue Jan 25, 2005 at 04:27:01 PM EST

The "technology" is called CAPTCHA (see also here).

However, the current consensus amongst some web developers is that CAPTCHA is a double-edged sword in that, by asking a person to validate themselves every time they want to post something, you're as likely to dissuade and annoy as maintain a high signal to noise ratio. It makes more sense at somewhere like yahoo mail, where you only have to do it once, but somewhere like here, for example, it could well end up being an irritant.


--
'My god...it's full of blogs.' - ktakki
--


[ Parent ]
But wait... (none / 0) (#125)
by benjamini on Wed Jan 26, 2005 at 05:59:48 PM EST

Spammers are unlikely to spend the time registering an account. With blog sites like blogger you can force people to log in, therefore forcing them through a Turing test. However, they should only have to do it once - during registration.

It wouldn't be perfect, but at least better.

[ Parent ]

That's true, (none / 1) (#127)
by toulouse on Thu Jan 27, 2005 at 01:38:43 PM EST

but what I failed to mention in the initial response is that there are at least two other 'issues' surrounding the use of CAPTCHA.

The first is that visual CAPTCHA (such as the images you get on yahoo) prejudices heavily against the sight-impaired. Relying, as it does, upon the human ability to comprehend the intended lexeme masked behind often-quite-severe mangling, it's no surprise that some people have a real problem with it. There have been numerous requests by various bodies not to use visual CAPTCHA for this reason. This is also compounded by recent EU laws governing accessibility. Replacing visual CAPTCHA with audio CAPTCHA won't work, as you've then shifted the prejudice towards the hearing-impaired (and getting audio to work cross-browser can be nightmarish anyway - far more so than displaying an image, at least).

Secondly, and this has more direct bearing on your point, spammers have discovered ways to circumvent CAPTCHA by using humans as their parsing-bots. The entire premise of CAPTCHA is based on the idea that the test is nigh-on impossible for the machine to parse - it requires a high level of skill in agent / AI design to break the best CAPTCHA algorithms. This prevents spammers / abusers from using scripts, but they've figured out ways around this. One common example I've come across uses the following model:

You're a spammer / abuser / whatever. You need to circumvent a CAPTCHA routine. Instead of sweating day and night trying to solve a hard AI problem to beat the system, you simply set up a website with highly "desirable" content (free porn, serial numbers, mp3s, mpegs, whatever). In order to let people access your content, you make them pass a CAPTCHA-based test, except that the test they have to beat isn't yours; it's one you're being prompted for by your prospective target: This way, you get your porn-surfers to act as your validation mechanism - and they then have no idea that they're helping the spammer - they think it's just a pre-requisite for access to some silicon-inflated T&A.

In short: CAPTCHA's a nice idea, but it has thorny issues beyond the simpler 'use cases'.


--
'My god...it's full of blogs.' - ktakki
--


[ Parent ]
this kind of stuff will put an end to the internet (none / 0) (#124)
by soundproofing on Wed Jan 26, 2005 at 12:02:26 PM EST

Searching should be about relevance not about links. Search engines should be developing better algorithms for relevance and quality of content.

Such as AI for good grammar, spelling, and logical concise layout of information.


soundproofing, noise control, vibration damping, and acoustics consultant and engineer. http://soundproof.mine.nu/
Yes I think so (1.00 / 3) (#128)
by jojo0901 on Thu Jun 30, 2005 at 12:41:05 PM EST

mobile ringtone downloads, ringtone downloads, mobile ringtones, free ringtones, free ringtone downloads, MP3 ringtones, MP3 ringtone downloads, free movies, free movie downloads, movie sites, movie downloads, short films, short film downloads, free short films, adult movies, free adult movies, Ozawa Madoka movies, video chat, video chat rooms, song request sites, free song requests, mobile song requests, mobile MMS, MMS downloads, MMS pictures, MMS animations, MMS albums, anime downloads, cartoons, free cartoons, cartoon downloads, anime sites, free anime, hentai anime, Japanese hentai anime, hentai anime downloads, Initial D, Initial D downloads, Crayon Shin-chan, Crayon Shin-chan downloads, Dragon Ball, Dragon Ball Z, boys' love anime, Naruto, Naruto downloads, Slam Dunk, Slam Dunk downloads, game downloads, single-player game downloads, online game downloads, mobile game downloads, mini game downloads, game cheat downloads, emulator game downloads, adult game downloads, Legend of Mir cheats, World of Warcraft cheats, O2Jam cheats, Fantasy Westward Journey cheats, Crazy Arcade cheats, game animation downloads, real-time strategy games, role-playing games, sports games, business strategy games, action-adventure games, casual simulation games, flight shooter games, QQ cheats, QQ game cheats, MU cheats, CS cheats, Lineage cheats, arcade emulator games, game demos, anime music downloads, game screensavers, free mobile games, Nokia mobile games, Samsung mobile games, Sony Ericsson mobile games, Motorola mobile games, mini games, online mini games, Flash mini games, mini game sites, mini game downloads

A step toward solving comment spam? | 128 comments (117 topical, 11 editorial, 0 hidden)
All trademarks and copyrights on this page are owned by their respective companies. The Rest © 2000 - Present Kuro5hin.org Inc.
See our legalese page for copyright policies. Please also read our Privacy Policy.
Kuro5hin.org is powered by Free Software, including Apache, Perl, and Linux. The Scoop Engine that runs this site is freely available, under the terms of the GPL.
Need some help? Email help@kuro5hin.org.