Comment spam outpacing e-mail spam

This week my comment spam has outpaced my e-mail spam. The comment spammers use varied IP addresses, so an IP block does no good. MT-Blacklist is not working for me at the moment, but I hope to have it installed when the final bits ship with Movable Type 3.1 next week. MT-Blacklist blocks comments based on keywords and link usage, not just IP address.

How bad is the problem? Only 7.7% of the comments submitted to my site are legitimate. I could restrict comments to TypeKey users only, but even a Six Apart employee does not use TypeKey when leaving a comment on my weblog. I do not want anyone to have to jump through too many hoops to participate in a conversation. My solution is to wait for the new MT-Blacklist. I approve all comments before they appear live on my weblog, so for now the spam is just a pain: I have to remove the junk comments and delete the notification e-mails I receive for them.

I am sure Jay will make good money from licensing MT-Blacklist. Still, the comment blocking feature is too important for Six Apart to leave to an outside developer. Ultimately Six Apart needs to acquire Jay or roll its own solution. Six Apart’s immediate response will be “use TypeKey,” which could work well for a restricted audience, but not for a broader one.

The ability to comment on a published item is an essential element of a weblog platform. Many publishers are now shutting down their comments because they just cannot handle the signal-to-noise ratio. User experience is key, and comment spam takes as much away from the user experience as an e-mail box full of spam.

[Update: I went through my log files to look for the path taken by the spammer. The worst spammer sent a POST directly to mt-comments.cgi from agent “MSIE 6.0; Windows NT 4.0; PCUser,” or Internet Explorer 6 on an NT box. Each comment spam follows a pattern yet has a unique identifier. The e-mail address is “bob@y” + identifier + “”, and the identifier also begins the comment text. The linked site contains no links, so the spammer must be interested only in building PageRank for a site to be launched later.

The best course of action is to change the name of mt-comments.cgi to something different and update mt.cfg. This change will make upgrading a little more difficult, but it is worth the pain to keep the bulk of comment spam away (for now).]
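For anyone who wants to repeat the log check described in the update, here is a minimal sketch in Python. It is not the script I used. It assumes the access log is in Apache's combined log format, and the function name `comment_posts_by_agent` and the sample log lines are invented for illustration. It counts POST requests to the comment script, grouped by user agent, so a single noisy agent stands out.

```python
import re
from collections import Counter

# Apache "combined" log format (an assumption about the server setup):
# host ident user [time] "METHOD path protocol" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def comment_posts_by_agent(lines, script="mt-comments.cgi"):
    """Count POST requests to the comment script, keyed by user agent."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        path = match.group("path").split("?", 1)[0]  # drop any query string
        if match.group("method") == "POST" and path.endswith("/" + script):
            counts[match.group("agent")] += 1
    return counts


if __name__ == "__main__":
    # Invented sample lines, for illustration only.
    sample = [
        '10.0.0.1 - - [01/Jan/2000:00:00:00 +0000] '
        '"POST /cgi-bin/mt-comments.cgi HTTP/1.0" 200 512 "-" '
        '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0; PCUser)"',
        '10.0.0.2 - - [01/Jan/2000:00:00:05 +0000] '
        '"GET /index.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    ]
    for agent, hits in comment_posts_by_agent(sample).most_common():
        print(hits, agent)
```

If the comment script is renamed as suggested above, pass the new file name as the `script` argument to see whether the spammer has followed it.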


Commentary on "Comment spam outpacing e-mail spam":

  1. dan wrote:

    You may want to take a look at Elise Bauer’s comments on the subject. I found them quite enlightening. For instance, I hadn’t realized you could change the name of mt-comments.cgi, thus avoiding a lot of the automated spammers.

  2. Anil Dash wrote:

    C’mon, I do usually use TypeKey to sign in, I had actually just scrolled past the link to see where you had it enabled on that post. :)

    More to the point, you’re running a beta version of MT, and there is a Blacklist for MT3.0d as well as a version coming for the final release of 3.1. I do agree that commenting is a core feature for weblogs (and the reason it’s become one, I think, is because of the success of MT) and we’ll definitely be putting work into making it smoother and more accountable, as you know.

    It’s an arms race, and when you have millions of pages published, with hundreds of thousands of them generated by sites that are running old versions of an application, there are a lot of variables to control. I think that, though the number of people spamming sites is clearly growing, the experience of people running current versions of MT, especially once 3.1 is released next week, will be getting better.

    The core problem in comment spamming actually seems to be similar to a key problem in email spam: zombie machines. There are so many Windows machines that have been taken over by malware that the problem is far more complex than it seems. I think our experience with TypePad and the various attacks people try on that service is going to help inform our path for MT going forward.

    Anyway, Jay Allen’s gonna be dropping by our office later today, I’ll make sure he knows you appreciate his work. :) Thanks for all the feedback!

  3. Niall Kennedy wrote:

    I agree that TypePad, as the largest installation of the MT codebase, will give Six Apart a lot of good experience as a hosting provider. A lot of good information should make its way into the codebase, along with best practices for other hosting providers.

    Microsoft is seen as at least part of the reason Windows machines can be turned into zombie machines. Spammers are taking advantage of the largest install base, but Microsoft is seen as the problem. Some users move to OS X and Linux to be free of the holes in Windows. Microsoft takes a stance of “well if the world would just upgrade to XP SP2 things would be a lot better.”

    The point I am trying to make is that users will see comment spam as a Six Apart design flaw in the same way Microsoft is under fire for its Windows code.

  4. kristine wrote:

    I got hit a LOT by bob the spammer in the last week +. It’s silly because he/she/it keeps leaving comments, but I’m using the option that TypeKey comments come through first and every other comment must be approved. So he’s spamming my mailbox, but nobody on my site is actually seeing the literally thousands of comments.

    I’m looking forward to this coming week with the releases Anil mentioned because then I’ll go through and delete all this crap that bob dumped on my site! :)

  5. Dan wrote:

    I’ve had good luck with ridding my web log of bob by adding the following regex to my mt-blacklist:


    I found it in Jay Allen’s blog and it has kept bob the comment-spam-bot at bay for 5 days and counting…