Kuro5hin.org: technology and culture, from the trenches

Referrer log spam

By enterfornone in Internet
Thu May 31, 2001 at 03:21:59 AM EST
Tags: Internet

I generally go through my referrer logs each day to see where people visiting my site are coming from. For those who don't know, most web servers will log the page you clicked on to get to the page you are currently viewing, using the HTTP referrer header. So for example I can tell that most of the traffic to my web site comes from links in my sig at sites like Slashdot and Kuro5hin. I can also see what search terms people are using to find my site. Disturbing Search Requests is a site dedicated to discussing information found in referrer logs. In fact, you can see their referrer logs here.
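As a rough illustration of how search terms can be read out of a referrer URL, here is a minimal Java sketch. It assumes the search engine passes the query in a URL parameter (the parameter name "q" in the example is an assumption and differs between engines), and the class and method names are made up for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;

// Illustrative sketch; the class and method names are hypothetical.
class ReferrerSearchTerms {

    // Returns the decoded value of the named query parameter in the
    // referrer URL, or null if the URL is malformed or has no such parameter.
    static String searchTerms(String referrer, String param) {
        String query;
        try {
            query = URI.create(referrer).getRawQuery();
        } catch (IllegalArgumentException e) {
            return null; // malformed referrer: treat as "no search terms"
        }
        if (query == null) {
            return null;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals(param)) {
                try {
                    // Turns "+" into spaces and resolves %XX escapes.
                    return URLDecoder.decode(pair.substring(eq + 1), "UTF-8");
                } catch (UnsupportedEncodingException | IllegalArgumentException e) {
                    return null;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // "q" is assumed here; other engines may use a different parameter.
        System.out.println(
            searchTerms("http://www.google.com/search?q=captain+and+tenille", "q"));
    }
}
```

Since the referrer is supplied by the client, whatever comes out of this is only as trustworthy as the header itself.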

Lately however, it seems that I am getting a lot of pages in my logs that don't link to my site at all. Given the content of many of these pages, I'm beginning to think that referrer logs have become the latest target for spammers.


Lately I have often been seeing pages in my referrer logs that do not link to my page at all. For example, today I saw http://www.lovinmood.com/headcandy.htm. "Head Candy is the brand name of the first and only devices that increases your enjoyment of giving and receiving oral sex."

Now call me a conspiracy theorist, but is it possible that people are writing scripts to "spam" random web pages with fake referrers in order to get people to visit their site?

Such a script would be quite easy to create. The referrer header is sent by the client, so in theory any web client such as Mozilla could easily be hacked to change the referrer header to whatever you like. In fact the ad blocking proxy Junkbuster allows you to do exactly this.

With this functionality available, how easy would it be to crawl the web and spam your URL into the logs of others?

This has implications beyond simply spamming logs. Devious marketers could make it appear that their portal is more effective than it really is. "Look at the traffic our web site sends you now - buy bigger ads off us and you can get even more."

If these practices are occurring at all, they do not seem to be very widespread at present. But as internet marketers look for new ways to promote their sites, it's something that they are sure to consider. What can we do to stop it?
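One partial answer, sketched here under assumptions: before trusting a referrer, fetch the referring page and check whether it contains a link to your site. The Java below covers only the checking step, on HTML that has already been downloaded. It uses a crude regular expression rather than a real HTML parser, the host names are placeholders, and the class and method names are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch; the class and method names are hypothetical.
class BacklinkCheck {

    // Matches href="..." or href='...' values. This is a rough heuristic,
    // not an HTML parser, and it will miss unquoted attribute values.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*[\"']([^\"']*)[\"']", Pattern.CASE_INSENSITIVE);

    // True if any link in the page text appears to point at the given host.
    static boolean linksTo(String html, String myHost) {
        String needle = "//" + myHost.toLowerCase();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            if (m.group(1).toLowerCase().contains(needle)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String page = "<a href=\"http://www.example.org/\">my friend</a>";
        System.out.println(linksTo(page, "www.example.org"));
        System.out.println(linksTo(page, "www.example.com"));
    }
}
```

A referrer that fails this check is not necessarily spam. A page may have changed since the visit, and a client can send a referrer for a page that never linked to you for reasons other than spamming.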

Referrer log spam | 37 comments (37 topical, 0 editorial, 0 hidden)
Possible explanation? (4.33 / 9) (#1)
by Jin Wicked on Wed May 30, 2001 at 02:49:39 AM EST

While I don't think your theory is really likely, I have noticed this curious occurrence in my site's statistics as well. However, while I did notice a couple of pr0n referrers, they were generally only 1-2 hits each and I could find no reference to me when viewing the page.

However, the types of "one hit" referrers I get aren't limited to pr0n or marketing links. Many of them are personal homepages, video game forums, and basically a myriad other things that I could understand someone visiting my site to be interested in.

I asked one of my other friends about this once, who runs his own server, and knows more about that kind of thing than I did. He explained it to me as a bug in some browsers that sometimes records the last page you visited as a "referrer" even though it really wasn't. So if someone was looking at that pr0n page, then clicked their bookmark to visit you, then it might accidentally show up as a referrer stat. I don't know how true that is, but it makes enough sense to me to at least seem like a plausible explanation. He mentioned Internet Explorer as being the guilty browser, but I'm sure there are others with the same problem (if this is indeed happening at all.)

If there is a better explanation, I'd like to know about it, as this is one of the things that has baffled me for quite a while.


This post was probably not written by the real Jin Wicked. Please see user "butter pie" for Jin's actual posts.


Consider this... (none / 0) (#6)
by ti dave on Wed May 30, 2001 at 05:04:14 AM EST

You've got a webcam prominently displayed on your index page.
I'm sure bots (and horned out guys) have submitted your site to the various pr0n webcam directories (although you seem respectable).
These directories are fairly dynamic too, so you may not have caught any references to your page (rankings change, etc.).

ti_dave
"If you dial," Iran said, eyes open and watching, "for greater venom, then I'll dial the same."

[ Parent ]
referrer bug (4.66 / 6) (#7)
by johnathan on Wed May 30, 2001 at 09:17:45 AM EST

He explained it to me as a bug in some browsers that sometimes records the last page you visited as a "referrer" even though it really wasn't. So if someone was looking at that pr0n page, then clicked their bookmark to visit you, then it might accidentally show up as a referrer stat. I don't know how true that is, but it makes enough sense to me to at least seem like a plausible explanation.
This was indeed a well-known bug in Netscape 4 for MacOS. It was troubling mostly as a security issue (e.g. passwords or private URLs being sent), rather than as an inconvenience to webmasters. Here's an article that I dug up on it.

The moral of the story in our current context is that referrer information, like any provided by the client, should be taken with a grain of salt.

--
Her profession's her religion; her sin: her lifelessness.
[ Parent ]

Confirmed (4.80 / 5) (#12)
by scorbett on Wed May 30, 2001 at 12:39:15 PM EST

I've seen this in my own logs, and in fact I can reproduce this at will (Netscape 4.7/WinNT). Load a page (any page, local or remote), let's say "www.somedomain.com", then go to the "location" text field and type in the address of your site. The next time you check your referrer logs you'll see www.somedomain.com as a referrer, even though there is no link from somedomain.com to your site. This is inaccurate and annoying, but not really a big deal, in my opinion.



[ Parent ]

They [spammers] will destroy themselves. (3.33 / 3) (#2)
by hany on Wed May 30, 2001 at 02:57:53 AM EST

I can confirm such "referrer spamming" - it happens to my site too.

What's "good" about such spamming is that in a fairly short time spammers will destroy the value of this, for now, quite valuable resource, and they will then either cease to exist or move elsewhere.

What's bad about spamming (and misuse and abuse in general) is that nice things are destroyed for the short-term profit of a few ignorant people.

And what to do about it? Ignore spam! Filter it out! Throw it in the trash!
As long as they find it effective, they will keep doing it.


hany


Possible Explanation (2.00 / 1) (#3)
by Tachys on Wed May 30, 2001 at 03:00:18 AM EST

Is it possible they happen to be at that site when they decided to go to your site?

No. (3.50 / 4) (#8)
by WWWWolf on Wed May 30, 2001 at 09:40:15 AM EST

Is it possible they happen to be at that site when they decided to go to your site?
No. If the browser sends the URL in that case as referrer, it's violating the standards.
The Referer field MUST NOT be sent if the Request-URI was obtained from a source that does not have its own URI, such as input from the user keyboard.

- HTTP/1.1 specification, RFC 2616


-- Weyfour WWWWolf, a lupine technomancer from the cold north...


[ Parent ]
Yes, it is. (4.50 / 4) (#11)
by J'raxis on Wed May 30, 2001 at 11:11:35 AM EST

A browser violating standards!? You aren't serious!!

-- The RFC2616-Compliant Raxis

[ J’raxis·Com | Liberty in your lifetime ]
[ Parent ]

Scenario: (3.50 / 2) (#4)
by Estanislao Martínez on Wed May 30, 2001 at 03:24:27 AM EST

What if you browse the web with 2 or more windows open, and drag links from one to the other in order to visit them? Does the browser report as referrer the document you dragged the link from, or just whatever document that window was displaying previously? If the second option is what happens, people who browse this way may generate senseless referral chains.

I don't know what browsers do under this scenario, since I don't worry about such things-- I run the Internet Junkbuster...

--em

I think you are reading too much into this. (3.66 / 9) (#5)
by Tachys on Wed May 30, 2001 at 04:17:43 AM EST

I don't think anyone will bother spamming referrer logs, because very few people read referrer logs.

I think one explanation is that people don't like being tracked. So they get Junkbuster and put some obscene web site, like that headcandy link, in the referrer to tell you what they think of being tracked.

being tracked. (none / 0) (#15)
by www.sorehands.com on Wed May 30, 2001 at 03:00:57 PM EST

I don't use the referrer logs to track people, but to see who is linking to me. I have found some news articles about my site in Mexico and Brazil only because of the referrer logs.



------------------------------------------------------------------------------
http://www.barbieslapp.com
Mattel, SLAPP terrorists intent on destroying free speech.
-----------------------------------------------------------
[ Parent ]

Should have put "some" people (4.00 / 1) (#18)
by Tachys on Wed May 30, 2001 at 11:42:25 PM EST

I think I should have written

I think one explanation is some people don't like being tracked.

I don't mind people using referrers, but some people are probably very paranoid about them.



[ Parent ]
Spambots? Spamspiders? (none / 0) (#37)
by sumppi on Tue Mar 26, 2002 at 04:55:32 AM EST

This isn't just a few people; it's happening a lot. Most of the requests the spambots make end up as 404s, though. A few examples:

211.94.202.72 - - [25/Mar/2002:23:46:31 +0200] "GET http://www.searchfeed.com/rd/Clk.jsp?id=22417&r=3&p=1386 HTTP/1.1" 404 "http://all_4_free.home.chinaren.com/money.html"

pd951318a.dip.t-dialin.net - - [25/Mar/2002:22:51:42 +0200] "GET http://www.adexit.de/cgi-bin/javascript.pl?account=19880 HTTP/1.0" 404 "http://217.110.254.219/fickregie/index.htm"

212.169.172.69 - - [25/Mar/2002:21:55:35 +0200] "GET http://www.adexit.de/cgi-bin/javascript.pl?account=19602 HTTP/1.1" 404 "http://www.erospur.de/weiber/geil1.htm"

pd900249b.dip.t-dialin.net - - [25/Mar/2002:19:35:51 +0200] "GET http://www.gamechartz.de/logger.php?uid=10104 HTTP/1.0" 404 "http://www.gamechartz.de/logger.php?uid=10104"

etc... Those are the latest few entries. They started showing up about a year ago or so.
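For anyone who wants to pick entries like these apart, here is a rough Java sketch. The regular expression assumes exactly the layout shown in this comment (host, two placeholder fields, timestamp, request line, status, referrer, with no byte count or user agent), and the class, method, and constant names are made up for illustration. It also flags one detail of these entries: each request line carries a full URL instead of a path, which is the form a client uses when it addresses a proxy. Whether these particular clients were trying to use the server as a proxy is only a guess.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch; class, method, and constant names are hypothetical.
class ProxyStyleRequests {

    // One of the entries quoted above, used as sample input.
    static final String SAMPLE =
        "211.94.202.72 - - [25/Mar/2002:23:46:31 +0200] "
        + "\"GET http://www.searchfeed.com/rd/Clk.jsp?id=22417&r=3&p=1386 HTTP/1.1\" "
        + "404 \"http://all_4_free.home.chinaren.com/money.html\"";

    // Assumed layout: host ident user [time] "METHOD target PROTOCOL" status "referer"
    private static final Pattern LINE = Pattern.compile(
        "^(\\S+) \\S+ \\S+ \\[([^\\]]+)\\] \"(\\S+) (\\S+) (\\S+)\" (\\d{3}) \"([^\"]*)\"$");

    // Returns {host, target, status, referer}, or null if the line does not fit.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(4), m.group(6), m.group(7) };
    }

    // True when the request target is a full URL rather than a path.
    static boolean isProxyStyle(String target) {
        return target.startsWith("http://") || target.startsWith("https://");
    }

    public static void main(String[] args) {
        String[] f = parse(SAMPLE);
        System.out.println("host=" + f[0] + " status=" + f[2]);
        System.out.println("referer=" + f[3]);
        System.out.println("proxy-style=" + isProxyStyle(f[1]));
    }
}
```

Lines in a different log format will simply fail to match and come back as null, so the pattern would need adjusting for a server that logs extra fields.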

Sumppi.

[ Parent ]
Or maybe something else (1.00 / 5) (#9)
by Nitesurfer on Wed May 30, 2001 at 10:24:50 AM EST

Yes, someone may have come in to see your site. As a product marketer I have looked into email blasting. But where do you get the addresses? You use a product like the one listed here: http://www.vome.com/lightning.htm

So if you have your email address listed on your site, this type of program will harvest it for use. This might be how you got BOGUS referring pages... maybe not...

It sounds as if you use your own routine to keep track of log information. I use a free service called HITLOGGER. Below you will see an example of me visiting my own site.

Time:                 07:33:41
Remote Host:          h00045a25eeb2.ne.mediaone.net
Referer URL:          bookmark/direct
Remote IP:            xxx.xxx.xxx.xxx
Monitored Page:       http://home.att.net/~btechinc/menu.html
User Agent, Platform: Mozilla/5.0 (Windows; U; Win98; en-US; m18) Gecko/20010131 Netscape6/6.01 1024x768 (8 bits)

Time:                 08:41:40
Remote Host:          h00045a25eeb2.ne.mediaone.net
Referer URL:          http://www.chip.cz/texty/2001_2/0524/keyb.shtml
Remote IP:            xxx.xxx.xxx.xxx
Monitored Page:       http://home.att.net/~btechinc/menu.html
User Agent, Platform: Mozilla/5.0 (Windows; U; Win98; en-US; m18) Gecko/20010131 Netscape6/6.01 1024x768 (8 bits)


This is pretty cool because it gives the time, the ISP domain name of the visitor, and the referring page. I think these are very important for showing you where people are coming from and what brought them to you. For example, the second listing shows that I was referred from an article in a Czech e-magazine. Pretty cool.




David Byrd

CEO --- Twenty First Century Technologies, Inc.
Home of the Nite-Surfer Illuminated Keyboard

Broken browsers (3.83 / 6) (#10)
by J'raxis on Wed May 30, 2001 at 11:00:09 AM EST

This could just be a broken browser. According to the RFCs, and common logic, a browser should only send a referring URL if the user visits your page from that URL, but I've seen some browsers send the URL of the current page as the referrer when I'm at that page and type in a new URL.

For example, if I typed in my site URL right now, IE might send http://www.kuro5hin.org/?op=comments&tool=post&sid=2001/5/30/22341/3757#here along as a referrer. I've seen some obviously erroneous referrers pop up in my log that also obviously couldn't be attributed to spamming. Like "http://www.microsoft.com/", where I obviously don't have a link on their / page.

-- The RFC2616-Compliant Raxis

[ J’raxis·Com | Liberty in your lifetime ]

Neat idea, but (3.00 / 2) (#13)
by error 404 on Wed May 30, 2001 at 12:54:15 PM EST

I don't think the basic spammer is that smart. And besides, that's a real small audience.

Still, I guess now I'll have to play with my browser a bit, heheheheh...
..................................
Electrical banana is bound to be the very next phase
- Donovan

Haven't seen any referrer spamming... (3.00 / 1) (#14)
by Captain_Tenille on Wed May 30, 2001 at 01:25:46 PM EST

But on two occasions, I've had people visit my site doing searches on "captain+and+tenille". Heh heh. Probably not what they had in mind, I would think.
----
/* You are not expected to understand this. */

Man Vs. Nature: The Road to Victory!

Funky stuff (3.50 / 2) (#22)
by fluffy grue on Thu May 31, 2001 at 03:12:37 PM EST

I've gotten a lot of people finding my site through searches on:
  • Obituaries and eulogies (linked to the memorial site for Captain Webster, a member of a mailing list I used to be active on)
  • Erogenous zones (usually linked to my currently-unavailable gender rant, which I bet confuses or disturbs the hell out of the people who read it)
  • Robot erotica stories (usually linked to my non-robot non-erotica transformation stories I wrote many years ago)
  • Random computer hardware (since I have a page on the fun spam I've gotten from Micro Warehouse)
  • Japanese recipes (linked to my pseudo-japanese recipe page which I really should work more on)
  • Nanites (because of Hemos Industrial Nanites)
  • Information on porcupines
  • Searching for scat porn (ugh, ugh, ugh... at least it links to a poem I wrote)
Then some specific search terms which are kinda funny:
  • "magenta stories" (I have no idea what that's supposed to mean)
  • "she is genitals picture" (??)
  • "shaved stories" (???)
  • "lamar cx snowboard bindings" (I really don't want to know how that links to me...)
  • "Why would anyone drink antifreeze?"
  • "wife watching sex stories"
  • "pinion tree beat"
  • "dexatrim"
Ones which I feel sorry for the people not finding what they wanted:
  • "stain repellent" (going to one of my old plushophile fantasy stories)
  • "lead glass bathroom window"
  • "ergometric chair"

--
"Is not a quine" is not a quine.
I have a master's degree in science!

[ Hug Your Trikuare ]
[ Parent ]

Odd Search Requests (none / 0) (#35)
by sventhatcher on Thu Jun 07, 2001 at 06:52:22 AM EST

I once had a poll on a website I was running that had a response about midget sex or some such.

Once google finally indexed me, guess what was the most common search term that led people to me?


--Sven (Now with bonus vanity weblog! (MLP Sold Seperately))
[ Parent ]
What I'd like to know (3.00 / 2) (#16)
by spacejack on Wed May 30, 2001 at 04:49:22 PM EST

is who comes by using that Nutscrape 1.0/CPM browser, and do my pages look ok in it.

Meta-data problems (4.00 / 1) (#17)
by Delirium on Wed May 30, 2001 at 04:52:30 PM EST

This (if it's happening) is an example of one of the problems of meta-data and usage statistics: it's nearly impossible to collect and display such data without influencing the data itself. For example, someone I know had a script that would parse IRC logs and display the 15 most commonly used words in the logs. He ran this fairly often; usually multiple times per day. After a while it began to form a weak feedback loop - the commonly used words would appear yet again in the listings, thus becoming even more commonly used.

This is definitely happening... (4.00 / 1) (#19)
by costas on Thu May 31, 2001 at 04:03:48 AM EST

I don't have the URLs handy (have been cleaning them off my logs) but for months, I would be getting 2-3 sites as referrer URL in my logs.

When I checked the site(s), I always got redirected to a company advertising services for webmasters --you know the kind, we help place you in search engines, etc. That was at least 6 months ago, BTW, I don't consider it new.

Although I was entertained when I saw it was spam, I wondered whether these guys understand that webmasters are usually savvier than most Web users and wouldn't be a good target audience for spam...


memigo is a news weblog run by a robot. It ranks and recommends stories.
Bait (none / 0) (#24)
by camadas on Thu May 31, 2001 at 05:08:26 PM EST

I wondered whether these guys understand that webmasters are usually savvier than most Web users and wouldn't be a good target audience for spam...
Well, you checked the sites. I think the spam worked, don't you agree?

[ Parent ]
I don't agree (none / 0) (#26)
by roystgnr on Thu May 31, 2001 at 09:23:39 PM EST

The purpose of most spam (I'm ignoring the "You Have Three Months to Live" guy; where did he go after Y2K, anyway?) is to make money, not to get people to hate you. Did he give the spammers any money?

[ Parent ]
I must disagree... (none / 0) (#34)
by Minister on Wed Jun 06, 2001 at 11:38:57 AM EST

with this:

...webmasters are usually savvier than most Web users and wouldn't be a good target audience for spam...

Not in my experience. I'm the sysadmin at a local ISP and some of the so-called webmasters that we get on the phone asking for help aren't qualified to successfully pick their noses, much less know what to do with their raw logs.

I realize that the people that I talk to are filtered through our tech support and through the fact that the people who I do talk to are the ones having problems, but I'd say about 10% of people claiming to be webmasters (and professional web designers) haven't the faintest inkling about what they're doing.

I guess I shouldn't rant too much. These are the people that make us the most money; they're the ones I get to bill for fixing their mangled Perl scripts, rebuilding their site after they accidentally delete it, and other general stupidity recovery measures.



[ Parent ]
Porn Sites (none / 0) (#20)
by guinsu on Thu May 31, 2001 at 10:32:17 AM EST

I've found a bunch of porn sites in the referer field in my logs in the past, there is definitely spamming going on. At first I didn't believe it was the case, only b/c it seems like such an odd way to market your site, and the only people who would see it would be computer savvy users, not the usual suckers who fall for spam (then again, maybe system admins and porn go together, who knows)

Yeah, yeah. (3.00 / 2) (#21)
by Nurgled on Thu May 31, 2001 at 02:47:45 PM EST

This has been happening for a long time now, and I've always told people that 'referer' (sic) statistics are pointless and almost as meaningless as browser statistics.

Anyone basing anything on any information derived from freeform arbitrary client-supplied strings will be sorely disappointed. The sooner people realise this, the better.



quite clever marketing stunt (4.00 / 1) (#23)
by yet another kris on Thu May 31, 2001 at 03:38:06 PM EST

as the webmaster of the mentioned disturbing searchrequests site i have developed an obsession with my site stats.

on one hand i believe most of the obscure referrers are the result of the already mentioned browser bug. on the other hand i think referer log spamming is a side effect of the profusion of weblogs. some of the webloggers check their log very often to see how much impact they make. moreover, weblogs are a great way to spread links.

as long as it is new, referrer log spamming looks like a clever marketing stunt. just spam a couple of hundred webloggers. some of them may even post it. their readers may pick it up and later someone will post the link on sites like metafilter or here and finally the traditional media will report.

i suppose this won't work to promote commercial sites, but it might for all these fake sites that try so hard to be the next aybabtu. but since this is a novelty trick targeted at an internet savvy audience, it will wear off quickly. it may have worked for "Head Candy" and it will do so for a couple of other sites, but sooner or later no one will fall for it.

dsr (none / 0) (#27)
by enterfornone on Fri Jun 01, 2001 at 05:22:03 AM EST

Today I saw little girls sex photos in my logs. Now I have 3 out of four of the words on my page (don't seem to have little there) but I can't find my page when doing the search. Even typing "little girls sex photos enterfornone" (sans quotes) shows nothing. No idea what's going on there? Did someone look for kiddie porn on altavista, find nothing of interest, give up and click a bookmark to my page instead?

--
efn 26/m/syd
Will sponsor new accounts for porn.
[ Parent ]
someone is making fun with you, i suppose. (none / 0) (#28)
by yet another kris on Fri Jun 01, 2001 at 06:14:09 AM EST

the other explanation is: the search engine (dunno about av, but google does) works with a cluster of small decentralized computers. if some of them are temporarily not available, nobody will notice (unless you check your referrers), because they still come up with results.

on the other hand, i believe this thread will inspire some "creative" webmaster to think about referer spamming.

[ Parent ]

Yet (none / 0) (#32)
by enterfornone on Mon Jun 04, 2001 at 05:37:42 PM EST

I do appear in little girls photos. I guess when you add sex to the search the paid links get shuffled to the top. Tho I'm not sure why Jason's Majordomo Page is there. I wonder what sort of a listserv Jason is running...

--
efn 26/m/syd
Will sponsor new accounts for porn.
[ Parent ]
Another explanation: fun with Proxomitron (3.50 / 4) (#25)
by artemb on Thu May 31, 2001 at 05:18:27 PM EST

If I were to visit a page (let's say /foo/bar.html) and they had referer and user-agent logging on, what they would see is something like this (no matter what page I'm coming from):

User-Agent: Godzilla/666.666 (doors; U; NT98.0; en-ru) Halabaloo/87687687
Referer: /foo/bar.html

NOTE: referer = requested page.

In my case the reason for the weird user-agent and referer is Proxomitron, which works as an HTTP proxy and lets you filter or change headers and document contents. Quite useful for killing pop-up windows and other annoying stuff, like unwanted cookies or ads.
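Seen from the log reader's side, requests like this could be spotted with a check along the following lines. This is only a Java sketch under assumptions: the referrer may arrive either as a bare path or as a full URL, the site's own host name is passed in by the caller, and the class and method names are invented for the example.

```java
import java.net.URI;

// Illustrative sketch; the class and method names are hypothetical.
class SelfReferral {

    // True if the referrer names the very page that was requested, either as
    // a bare path or as a full URL on the site's own host.
    static boolean isSelfReferral(String requestPath, String referer, String siteHost) {
        if (referer == null || referer.isEmpty()) {
            return false;
        }
        if (referer.equals(requestPath)) {
            return true; // bare path, as in the example above
        }
        try {
            URI u = URI.create(referer);
            if (u.getHost() == null || !u.getHost().equalsIgnoreCase(siteHost)) {
                return false; // relative, malformed, or from another site
            }
            String path = u.getRawPath();
            if (u.getRawQuery() != null) {
                path += "?" + u.getRawQuery();
            }
            return path.equals(requestPath);
        } catch (IllegalArgumentException e) {
            return false; // not parseable as a URI
        }
    }

    public static void main(String[] args) {
        System.out.println(
            isSelfReferral("/foo/bar.html", "/foo/bar.html", "www.example.org"));
        System.out.println(
            isSelfReferral("/foo/bar.html", "http://www.kuro5hin.org/", "www.example.org"));
    }
}
```

A match does not prove header rewriting. A page that links to itself, or a reload in some clients, might produce the same pattern.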

Great (3.00 / 1) (#29)
by Nurgled on Fri Jun 01, 2001 at 08:37:18 AM EST

Great minds think alike! I also do that, which beats the majority of the clueless scripts which output pages saying that you suck if you type a URL directly into your browser, or have the cheek to follow a link from a document outside their domain.

However, you should really be using \u rather than \p (Proxomitron replacement symbols for 'full URL' and 'path portion of URL' respectively) as the Referer header should contain a full URL.

It also has the added bonus of confusing people who don't understand where referrer logs come from, and who try to trace a visitor's 'click path' through their site! :)



[ Parent ]
Even better... (3.50 / 2) (#30)
by Kasreyn on Sat Jun 02, 2001 at 11:05:18 AM EST

I use the Proxomitron. In its text matching language, "\u" (without quotes) is replaced with the URL of the current page.

So make your user agent:

"I don't HAVE a web browser!"

and make your referrer:

\u

If you want to be cool like me. =P


-Kasreyn


"Extenuating circumstance to be mentioned on Judgement Day:
We never asked to be born in the first place."

R.I.P. Kurt. You will be missed.
[ Parent ]
Straightforward to do in Java (none / 0) (#31)
by jck2000 on Sun Jun 03, 2001 at 11:01:33 PM EST

Faking referer, user-agent and similar strings is pretty straightforward in Java and most scripting languages (e.g., Python). For instance, in Java, one could do the following:

import java.net.URL;
import java.net.URLConnection;

public class HeaderTweak {
    public static void main(String[] args) {
        if (args.length != 3) {
            System.err.println("Usage: java HeaderTweak <url> <referer> <user-agent>");
            return;
        }
        try {
            URL u = new URL(args[0]);
            URLConnection c = u.openConnection();
            // Both headers are supplied by the client, so any value can be sent.
            c.setRequestProperty("Referer", args[1]);
            c.setRequestProperty("User-Agent", args[2]);
            c.connect();
            // Fetching the content sends the request and reads the response.
            c.getContent();
        } catch (Exception e) {
            System.err.println(e);
        }
    }
}

With this, one could do something like:

[xyz@xyz misc]$ java HeaderTweak http://targetsite http://referersite goofyuseragentname

A while back I was playing with something like this to lay down some nonsense in the logs of a friend's site -- I don't think he noticed until I specifically told him about it.

Curl (none / 0) (#33)
by error 404 on Tue Jun 05, 2001 at 05:49:06 PM EST

A little DOS program I downloaded last night but haven't had a chance to play with yet. Anyway, the point of the program is to be able to upload and download files off a server (ftp, http, shttp, ...) via command line. The options include setting the referrer and browser values to whatever you want.
..................................
Electrical banana is bound to be the very next phase
- Donovan

There are ways... (none / 0) (#36)
by prostoalex on Fri Jun 08, 2001 at 01:05:09 AM EST

I have a links panel enabled in my IE and whenever I click on the button that goes to my homepage, in the logs I see the site that I was on as a referrer, even though it doesn't contain a link.
