Newsgroup: comp.risks


From: RISKS List Owner <risko@csl.sri.com>
Date: Thu, 27 May 2004 12:48:57 PDT
precedence: bulk
Subject: Risks Digest 23.38
To: risks@csl.sri.com
Message-ID: <CMM.0.90.4.1085687337.risko@chiron.csl.sri.com>
Sender: risks-owner@csl.sri.com
RISKS-LIST: Risks-Forum Digest  Thursday 27 May 2004  Volume 23 : Issue 38

   FORUM ON RISKS TO THE PUBLIC IN COMPUTERS AND RELATED SYSTEMS (comp.risks)
   ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

***** See last item for further information, disclaimers, caveats, etc. *****
This issue is archived at <http://www.risks.org> as
  <http://catless.ncl.ac.uk/Risks/23.38.html>
The current issue can be found at
  <http://www.csl.sri.com/users/risko/risks.txt>

  Contents:
Paris Airport collapse: Analogy collapses (Marshall D Abrams)
FBI fingerprint screwup: Brandon Mayfield no longer a suspect (PGN)
GAO looked at DoD and off-shored software (James Paul)
So what's new with Pittsburgh Verizon DSL (David Farber)
The lighter side of electronic voting (Jason T. Miller)
Florida law bans deceptive subject lines in e-mail (NewsScan)
Spam being rapidly outpaced by 'spim' (Nico Chart)
Another method of password theft (James Renken)
Window smashed, data lost (David Lazarus via Monty Solomon)
Spamming the referrer logs (Diomidis Spinellis)
And a Mac Sniffer in a Pear Tree ... (Paul Kedrosky via Dave Farber)
Speed cameras: fines refunded, licenses restored (Stuart Lamble)
Re: Radar Gun Follies (Chris Meadows)
Re: New UK driving licence puts identity at risk (Chris Malme)
Re: Challenge-response is a bad idea (Jonathan de Boyne Pollard)
REVIEW: "Beyond Fear", Bruce Schneier (Rob Slade)
Abridged info on RISKS (comp.risks)

----------------------------------------------------------------------
[...]

Date: Tue, 25 May 2004 11:49:54 +0300
From: Diomidis Spinellis <dds@aueb.gr>
Subject: Spamming the referrer logs

A new form of spamming pollutes Web server referrer logs, tricking Web sites
into publishing pages with links to unrelated commercial content.

Every day I receive an e-mail report summarizing the activity at my personal
Web site.  This allows me to see how the day's activities, such as the
release of a new software update, or a new blog entry, contribute to the
popularity of various areas.  It is also a security monitoring tool: an
unexpected surge in traffic could mean that something is amiss in the
site's content.

Over the last year the contents of this report became less reliable, as a
proliferation of distributed crawling engines began taking up a noticeable
percentage of the site's traffic.  A bit of filtering corrected that issue:
any "user agent" that tries to read the robots.txt file can safely be
excluded from the site's statistics.
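
The filtering described above might look like the following Python sketch.
It assumes the server writes logs in the Apache "combined" format (the
format of the log entry quoted later in this message) and identifies a
client by its host address plus user-agent string.  Both are assumptions,
and the names LOG_RE, parse, and without_crawlers are made up for
illustration; this is not the filter actually used on the site.

```python
import re

# Apache "combined" log format (assumed):
# host ident user [time] "request" status size "referrer" "user agent"
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None


def without_crawlers(lines):
    """Drop every entry made by a client that also requested /robots.txt."""
    entries = [e for e in map(parse, lines) if e is not None]
    # A client is identified here by (host, user agent); this is an assumption.
    crawlers = {
        (e["host"], e["agent"])
        for e in entries
        if e["path"].split("?")[0] == "/robots.txt"
    }
    return [e for e in entries if (e["host"], e["agent"]) not in crawlers]
```

Keying on the host as well as the user-agent string avoids discarding all
visitors who happen to share a browser string with a crawler, at the cost of
missing a distributed crawler that fetches robots.txt from only one of its
hosts.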

Over the last few days a more sinister form of noise has made its appearance.
Part of the report I receive contains a listing of the top-10 referrer
sites: the external URLs that visitors followed to land on my site.  This is
a useful feature, because it allows me to see which external links contribute
to the traffic.  Here is an example, from the day I announced a new release
of UMLGraph, an open-source declarative UML diagramming tool, on
freshmeat.net:

Top 10 Referrals:

         77 http://freshmeat.net/projects/umlgraph/?branch_id=48663&release_id=160174
         59 http://javanews.jp/
         63 http://www.cafeaulait.org/
         50 http://www.ibiblio.org/javafaq/
         33 http://freshmeat.net/daily/2004/05/09/
         23 http://freshmeat.net/projects/umlgraph/
         15 http://www.javanews.org/
         14 http://www.freebsd.org/ports/sysutils.html
         [...]
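
A listing like the one above can be derived from the raw log with a few
lines of Python.  The sketch below is only an illustration: it assumes
Apache "combined" format entries, and the names TAIL_RE and top_referrers,
as well as the own_host parameter used to skip links from within the site
itself, are hypothetical rather than taken from the actual reporting script.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Tail of a "combined" format entry (assumed): status, size, "referrer", "agent".
TAIL_RE = re.compile(r'" (\d{3}) (\S+) "([^"]*)" "([^"]*)"\s*$')


def top_referrers(lines, own_host, n=10):
    """Return the n most frequent external referrer URLs as (url, count) pairs."""
    counts = Counter()
    for line in lines:
        m = TAIL_RE.search(line)
        if not m:
            continue  # not a combined-format entry
        referrer = m.group(3)
        if referrer in ("", "-"):
            continue  # the client sent no referrer
        if urlsplit(referrer).hostname == own_host:
            continue  # navigation within the site itself
        counts[referrer] += 1
    return counts.most_common(n)
```

Note that nothing in this computation checks whether the referring page
really exists or links to the site: the referrer is whatever string the
client chose to send.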

Yesterday, following one of the links in the day's referral list landed me
on a typical porn Web site infested with popup windows.  It was the first
time I had to enable Mozilla's popup-blocking feature to escape from the
deluge of popups.  The same happened with another site appearing in the
referral list.  Scanning the content of both referring pages confirmed my
suspicion: neither of the two in fact contained a link to my Web site.  The
referrals were generated by Web server log entries like the following:

66.230.218.66 - - [15/May/2004:23:10:54 +0300] "GET / HTTP/1.1" 200 3132
"http://www.mixtaperadio.com/" "Mozilla/4.0 (compatible; MSIE 6.0;
Windows NT 5.1; .WONKZ)"

A Google search for the .WONKZ user-agent string predictably returned more
than a hundred results, typically pages of Web site usage statistics.  With
many sites automatically generating lists of referring sites and posting them
on-line, spamming a site's referrer log is apparently an easy way to
increase the number of links pointing to a Web site, and thereby improve
the site's ranking in search engines that base their results on this number
(e.g. PageRank).  An entry in an article on "spamdexing" at
http://www.tutorgig.com/encyclopedia/getdefn.jsp?keywords=Spamdexing refers
to this practice as "Referrer log spamming" and gives a similar rationale.
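
The manual check described earlier, scanning a referring page to see whether
it really links to the site, could be automated along the following lines.
This is a sketch under stated assumptions: the page's HTML has already been
fetched by some other means, only <a href> links are considered, and the
names _LinkCollector and links_back are invented for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class _LinkCollector(HTMLParser):
    """Collect the host names targeted by the <a href="..."> links of a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                try:
                    # Resolve relative links against the referring page's URL.
                    host = urlsplit(urljoin(self.base_url, value)).hostname
                except ValueError:
                    continue  # malformed URL; ignore it
                if host:
                    self.hosts.add(host.lower())


def links_back(referrer_url, referrer_html, my_host):
    """Return True if the referring page contains a link to my_host."""
    collector = _LinkCollector(referrer_url)
    collector.feed(referrer_html)
    return my_host.lower() in collector.hosts
```

A referrer whose page fails this test is not necessarily spam (the link may
have been removed, or may be generated by a script), so the result is best
treated as a hint for excluding entries from a published list rather than
as proof.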

The risk: the ability to crawl the Web and generate millions of spammed
referrer entries will further diminish the utility of two hitherto useful
data sources: referrer logs, and incoming link counts as a measure of a
site's importance.

Diomidis Spinellis -     http://www.spinellis.gr

------------------------------
[...]

End of RISKS-FORUM Digest 23.38
************************



Creative Commons License Unless otherwise expressly stated, all original material on this page created by Diomidis Spinellis is licensed under a Creative Commons Attribution-Share Alike 3.0 Greece License.