

A pox on your house, Spammer
November 11, 2003 7:00 PM

Spammers strike back? Well then call this return of the Webmaster Jedi. As a blogger and domain owner, I am sick of waking up to fifty new comments, all of which are spam for something of dubious legality. The fine folks at Kalsey are angry too. And they declared war. Lots of people stood up and took notice. What can you do to help stop this infestation? Blacklists and Bayesian filtering come to mind... (Via Smart Mobs)
posted by swerdloff (22 comments total)

 
Movable Type's blacklist feature has caught all the comment spam left on my sites. I'm happy with it. Luckily I haven't been deluged as some other sites have.
posted by neuroshred at 7:39 PM on November 11, 2003


We're using the MT-Blacklist tool as well with great success.
posted by MediaMan at 8:16 PM on November 11, 2003
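
For anyone wondering what blacklist filtering of comment spam looks like in practice, here is a minimal Python sketch. It is not MT-Blacklist's actual code: the patterns, the `find_blacklisted_urls` name, and the idea of matching only against URLs found in the comment are all assumptions made for illustration.

```python
import re

# Hypothetical patterns for illustration only; a real list would be
# maintained over time and shared between sites.
BLACKLIST_PATTERNS = [r"cheap-pills\.example", r"casino-[a-z]+\.example"]

_BLACKLIST = [re.compile(p, re.IGNORECASE) for p in BLACKLIST_PATTERNS]
_URL = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def find_blacklisted_urls(comment_text):
    """Return the URLs in a comment that match any blacklist pattern."""
    urls = _URL.findall(comment_text)
    return [u for u in urls if any(p.search(u) for p in _BLACKLIST)]


if __name__ == "__main__":
    spam = "Nice post! http://cheap-pills.example/buy"
    print(find_blacklisted_urls(spam))
```

A comment would be held or rejected whenever the returned list is non-empty. Matching on the advertised URL rather than on the wording targets the one thing the spammer cannot change without also changing what is being promoted.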


Seeing that Movable Type is the most common victim platform for the problem, and that the spam is being generated by big dumb scripts, wouldn't it make sense to, oh, I dunno, rename the comment processing CGI script out of harm's way? And perhaps check the Referer: header on POST requests? I can't imagine that these spam scripts are smart enough to get around even these trivial modifications, otherwise pretty much every form on the web with a textarea would be seeing this.
posted by majick at 8:24 PM on November 11, 2003
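
The Referer check suggested above could look roughly like this in Python. This is a sketch under stated assumptions: `ALLOWED_HOSTS` and `should_accept_comment` are made-up names, the host names are placeholders, and `headers` is assumed to be a plain dict keyed by the exact string "Referer". The header is easy to forge and some browsers and proxies strip it, so this only stops the dumbest scripts.

```python
from urllib.parse import urlsplit

# Hypothetical host names for illustration; substitute your own blog's hosts.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def should_accept_comment(method, headers):
    """Return True only for POSTs whose Referer points at our own site."""
    if method.upper() != "POST":
        return False
    referer = headers.get("Referer", "")
    try:
        # hostname is None when the header is missing or has no host part
        host = urlsplit(referer).hostname
    except ValueError:
        # urlsplit rejects some malformed values, e.g. a broken IPv6 literal
        return False
    return host is not None and host in ALLOWED_HOSTS


if __name__ == "__main__":
    ok = {"Referer": "http://www.example.com/archives/000123.html"}
    print(should_accept_comment("POST", ok))
    print(should_accept_comment("POST", {}))
```

Renaming the comment script needs no code at all, only a matching change to the form's action URL in the templates.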


I can attest that blosxom is also being targeted.
posted by i_am_joe's_spleen at 9:07 PM on November 11, 2003


Blacklists: they'll find a way around them.
Bayesian: won't stop them either; a few cleverly placed apostrophes and they're in.

So what shall we do?
Wholesale slaughter, without prejudice.
Who's with me?!
posted by pemulis at 12:16 AM on November 12, 2003


Count me in pemulis. But we also need to include the jackasses who actually respond to these things and spend their money on the products. That's who's really fucking it up for everyone else. Line 'em all up and shoo...
posted by Witty at 1:23 AM on November 12, 2003


Simple enough solution: do what we did back in the days of the BBS. Authorize new user accounts for anyone that wishes to post. If you don't want to register, you can read but can't comment.
posted by Civil_Disobedient at 1:41 AM on November 12, 2003


It seems that insulating oneself from bots via robots.txt also insulates from comment spam. I find it ironic -- this is clearly the work of machines rather than humans, and nothing is quite so evil as using someone else's space for ridiculous round-about self-promotion schemes, yet these bots of evil follow the rules while bots supposedly of good (certain search engines) do not.
posted by Dreama at 2:04 AM on November 12, 2003


How much did viruses cost us in clean-up and lost productivity?
Nimda: $635 million
Code Red: $2.62 billion
SirCam: $1.15 billion
Love Bug: $8.75 billion
(Source: Wired News Jan. 14, 2002)

How much does SPAM cost us EVERY YEAR?
$8.9 billion for U.S. corporations
$2.5 billion for European businesses
$500 million for U.S. and European service providers
(Source: CNN, Ferris Research)

Spam is the real cybercrime. It costs us more per year than viruses ever will. Why do we send script kiddies to federal prison but not spammers?

We need to wake up to the fact that these spammers are robbing us of our lives 3.8 seconds at a time, robbing companies of productivity thousands of hours at a time and robbing the world economy of billions each and every year.

So what shall we do?
Wholesale slaughter, without prejudice.
Who's with me?!


Wholesale slaughter? You bet!
Without prejudice? No, with extreme prejudice.

.';(+_+) - - - - - =__( -_- )__= - - - - - (+_+):'..

(spammer)...........(gunman)......(idiots who respond)

Kill them all.
posted by cup at 2:08 AM on November 12, 2003


In my blind fury towards spammers I wrote "SPAM" in upper case but that should have been "spam" in lower case letters.

Apologies to the fine people at Hormel Foods for abusing their trademark.
posted by cup at 2:23 AM on November 12, 2003


Blacklists: they'll find a way around them.
Bayesian: won't stop them either; a few cleverly placed apostrophes and they're in.


Because the majority of these spam comments are filed by 'bot scripts, filtering isn't necessary: The best way to deal with them is to remove the automated aspect of commentary. The best way to do that is to use a simple non-OCR-able graphic dynamically generated for each potential post, and have the user type in the number or word on the graphic.

Yahoo and Hotmail have been doing it for a while now with their new account sign up (to prohibit the large-scale throwaway account signup), and PHPNuke and its brethren have been doing it for logins and comments for a while as well.
posted by thanotopsis at 3:54 AM on November 12, 2003


Unfortunately doing that harms accessibility, and doesn't entirely remove the problem as some comments are posted manually by idiots falling for "Make Big Money Writing Comments" scams.

I'd like to see ISPs and PC manufacturers installing basic guides to let new users know to not click any spam links in email/comments, never buy anything from them, etc. A lot of people just don't understand how it all works.
More naming, shaming and boycotting of companies that provide net connections & hosting for spammers might help a little.
posted by malevolent at 4:15 AM on November 12, 2003


Dreama--you assume that spambots will respect the robots.txt file. Not a safe assumption.

Changing the name of the comments cgi does seem to be a good measure. I've done that, as well as install mt-blacklist.

It's worth pointing out here that comment spam is not intended to get people to click through--many spam comments appear on entries that have long scrolled off the front page of the target blog. It's a way of gaming Google's page ranking system (or so the consensus goes), an involuntary, distributed "link farm" a la Search King.
posted by adamrice at 7:07 AM on November 12, 2003


Even my lame little blog has been hit. What really starched my shorts was the presumptuousness of the disclaimer at the bottom, which went something along the line of "If you do not want this in your database, then feel free to remove it." How about asking me in the first place?

(I try to keep my site completely advertising free. Not much money in it, but I can then do whatever I want.)
posted by Samizdata at 7:15 AM on November 12, 2003


Any simple scheme should work, as long as it is only implemented on your blog, and not rolled into the master templates. Cause there's no incentive for the spammers to write a script that only targets one blog.
posted by smackfu at 9:30 AM on November 12, 2003


Here's a guy that tried to send a comment spammer a bill.
posted by nyxxxx at 9:39 AM on November 12, 2003


I'm in. F^ckers killed my favorite yahoo group. The free pron crap got so bad that everybody un-subscribed.

Eventually, some psycho is going to target spammers and start popping them off one-by-one.
posted by Yossarian at 10:03 AM on November 12, 2003


Back when Yahoo Groups was eGroups, a list I hosted there (medical discussions about an illness) was force-subscribed to a list about masturbation (pics included, yuk!): all my members, more than ten times over in one single day. No matter how much we called and complained about this breach of the TOS, eGroups/Yahoo did diddlysquat about it. I run my own lists now, and have never looked back.

(Am I the only one who uses the latest version of trackback, btw?)
posted by dabitch at 10:11 AM on November 12, 2003


The best way to do that is to use a simple non-OCR-able graphic dynamically generated for each potential post, and have the user type in the number or word on the graphic.

CAPTCHAs are too annoying to be required for each post, and any scheme that allows them to be bypassed under some circumstances (for example, by a cookie that "proves" you have responded correctly in the past) is open to exploitation by spammers. Furthermore, a visual CAPTCHA is not very accessible to the blind (or to those with text-only terminals). No, in the end, the only way to prevent this kind of spamming is to require users to register for an account, or at least to validate their e-mail address (the latter can be done entirely using cryptographic tokens without requiring a registration database). Perhaps allow unvalidated visitors to continue to post, but subject their messages to much more strict filtering (i.e. no URLs allowed in messages).
posted by kindall at 12:52 PM on November 12, 2003
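
The database-free e-mail validation mentioned above can be sketched with an HMAC-signed token, shown here in Python. The names `SECRET_KEY`, `TOKEN_LIFETIME`, `make_token` and `verify_token` are assumptions for illustration, not part of any blog package. The idea is that the server mails the token to the address, and a later comment that presents the same address and token proves the poster could read that mailbox.

```python
import hashlib
import hmac
import time

# Assumption: a server-side secret that never appears in any page or e-mail.
# This value is a placeholder.
SECRET_KEY = b"replace-with-a-long-random-secret"
TOKEN_LIFETIME = 7 * 24 * 3600  # seconds; an arbitrary choice for this sketch


def _sign(email, expires):
    """HMAC over the normalized address and the expiry time."""
    message = f"{email.lower()}|{expires}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_token(email, now=None):
    """Build the token to mail to the address; nothing is stored server-side."""
    now = int(time.time()) if now is None else int(now)
    expires = now + TOKEN_LIFETIME
    return f"{expires}.{_sign(email, expires)}"


def verify_token(email, token, now=None):
    """Check that the token was issued for this address and has not expired."""
    now = int(time.time()) if now is None else int(now)
    try:
        expires_text, signature = token.split(".", 1)
        expires = int(expires_text)
    except ValueError:
        return False  # not in "<expiry>.<signature>" form
    if expires < now:
        return False
    expected = _sign(email, expires)
    # Constant-time comparison on bytes, so odd characters cannot raise
    return hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))


if __name__ == "__main__":
    token = make_token("reader@example.com")
    print(verify_token("reader@example.com", token))
```

Because the expiry time is part of the signed message, a token cannot be extended by editing it, and rotating the secret invalidates every outstanding token at once. A spammer who controls a real mailbox can still validate it, so this raises the cost per address and does not make spam impossible.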


FWIW, I'm using PopFile Bayesian filtering on my email.
Messages Classified
Bucket        Count            False +   False -
damnspam      112  (9.88%)           0        22
friends        56  (4.94%)           5         4
maillist      187 (16.50%)           0         2
publishing    142 (12.53%)           3         2
spam          446 (39.36%)          12         1
work          190 (16.76%)           4         0
Or, in short, 1 133 messages at a 97.88% success rate (and most of the failures were due to training spam into the more-refined damnspam.)
posted by five fresh fish at 10:51 PM on November 12, 2003
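
The 97.88% figure can be reproduced from the table above. The short Python check below is one reading that matches it, not a description of how PopFile itself computes the number: it assumes the success rate counts only the "False +" column as errors. The "False -" column sums to 31, which does not give the quoted rate.

```python
# Counts copied from the table: bucket -> (messages, false positives)
buckets = {
    "damnspam": (112, 0),
    "friends": (56, 5),
    "maillist": (187, 0),
    "publishing": (142, 3),
    "spam": (446, 12),
    "work": (190, 4),
}

total = sum(count for count, _ in buckets.values())
errors = sum(false_pos for _, false_pos in buckets.values())
success_rate = 100 * (total - errors) / total

print(total, errors, round(success_rate, 2))  # prints: 1133 24 97.88
```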


five fresh fish,

Thank you for mentioning PopFile.

I had tried Bayesian filters before but none that supported Japanese characters!

[This is good!]

You made my day! :)
posted by cup at 2:48 AM on November 13, 2003


Pemulis wrote:
Blacklists: they'll find a way around them.

And veen dey doo, vee vill crush dem like little bugs! :-)

Seriously though, MT-Blacklist ain't like the run-of-the-mill blacklists you've seen. It's the first spam solution I've ever seen (or conceived of :-) that puts more workload onto the spammer than the recipient.

Add to that the P2P connectivity (coming soon...although right now, we're running manual central updates) and you've got one powerful weapon that will always be a step ahead of the vermin.
posted by fooljay at 5:24 AM on November 13, 2003

