boakes.org

nice of you to drop by. tea?

Tags: Spam, WordPress

Akismet htaccess extension

June 8th, 2006, by Rich.

My spam counter in Akismet has been steadily rising of late, and it’s been approaching 10,000 caught spams very quickly. Yesterday it went through 9,950 and with my average of over 100 spams per day it should have gone through the 10,000 barrier by now. But instead I’ve had about 3 spams today. Did I just find an off button for spam?

Worst Offenders

I wrote the other day about a small Akismet extension that I’d been playing with that helps remove the worst offenders from the list of caught spam - this in turn makes false positives easier to recognize. One of the things that the extension notices is which IP addresses are particularly prolific; it’s this information that helped me to a 95% (and rising) reduction in the spam that I send to Akismet to be checked.
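To make the "particularly prolific" idea concrete, here is a minimal sketch of how spam could be tallied per IP address. It is written in Python purely for illustration (the plugin itself is a WordPress plugin, so this is not its code); the function name `worst_offenders` and the sample addresses (taken from the reserved documentation range) are my own assumptions.

```python
from collections import Counter


def worst_offenders(spam_ips, limit=10):
    """Return (ip, count) pairs for the most prolific IPs, highest count first.

    spam_ips: an iterable with one IP address string per caught spam comment.
    limit: how many offenders to report (hypothetical default).
    """
    return Counter(spam_ips).most_common(limit)


# Placeholder data: one entry per caught spam comment.
caught = ["192.0.2.10", "192.0.2.11", "192.0.2.10", "192.0.2.10", "192.0.2.12"]

for ip, count in worst_offenders(caught):
    print(ip, count)
```

The addresses at the top of that list are the candidates for banning; everything else stays in the spambox for a manual check.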

htaccess

When writing about the extension on the Akismet mailing list, I suggested that hooking the worst offenders IP list into the Apache htaccess file should provide a simple dynamic means of rejecting spams before the HTTP request has been processed by the server (and before the Akismet plugin has had to check it against its server).

So I tried it, and it works, apparently flawlessly so far. In my access log I can see that over 100 requests have been rejected so far today. That’s 100 requests that are not sat in my “spambox” waiting for me to check for false positives.

Today’s Akismet spam count is in the single figures, instead of the triple figures.

Fantastic.

Ongoing Thoughts

So what does this mean for this site?

  1. If a spammer does manage to leave a message here, it’s caught by Akismet and marked as spam, so it never gets through.
  2. When I remove those messages, I can optionally ban the IP address from whence they came.
  3. Bandwidth usage is reduced because the server does not accept connections from spammers, and fewer two-way chats with the Akismet server are necessary.
  4. The comment database does not get needlessly filled with temporary comments that are removed once their spamminess has been identified, so the DB and DB indexer have less work to do.

So what does this mean for the Akismet project?

    1. There will be fewer hits from this site, so more time to concentrate on others.
    2. If the changes can be used by others, the net effect will be a more scalable and responsive service.
    3. If large scale uptake was achieved, the spam zeitgeist might start to look different because the number of spams being checked daily should significantly reduce.

    There are several obstacles to global spam nirvana, including:

    1. The installation process will require a tiny bit of “hand cranking” to ensure the htaccess file is in a suitable state for automatic updating.
    2. The system should probably exist as a separate plugin that hooks into the Akismet plugin at appropriate points, but those points haven’t been defined yet.
    3. Not everyone uses Apache, so not everyone has an htaccess file.
    4. Not everyone uses WordPress, so Akismet service users on other platforms will have to re-implement the idea.

    Download

    Note: this is an experimental extension to Akismet. It appears to work for me, and I will be delighted if others try it and can give useful feedback for the experiment.

    i.e. Please don’t ask linux/htaccess questions - I’d love to help, but I don’t currently have time to hand-hold on a non-production experiment.

    Still keen? Great.

    If you’re really brave and want to try it out, you need to carefully follow these steps.

    1. Precondition: Follow the installation instructions for Akismet and ensure it’s working correctly.
    2. Download the Extended Akismet Plugin.
    3. Replace the akismet plugin with the one you just downloaded - it should pick up all the configuration from the “official” version of the plugin.
    4. In the WP admin interface, open the “Manage | Files” tab and check that your htaccess file is writable.
    5. The automatic IP banning is written between two markers that are “# BEGIN worst-offenders” and “# END worst-offenders”. If you already have a deny list in your htaccess file, just add the markers to the list. Mine looks like this:
      Order Allow,Deny
      # BEGIN worst-offenders
      Deny from 202.75.49.130
      Deny from 202.75.49.131
      Deny from 202.75.49.133
      Deny from 202.75.49.134
      # END worst-offenders
      Allow from all
    6. From now, when you look at your “Worst Offenders” list in Akismet, you should see the option to ban the spamming IP addresses when you delete the messages.
    7. Give feedback and ask questions below.
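    For anyone wondering what "written between two markers" in step 5 amounts to, here is a rough sketch of the idea in Python. It is not the plugin's code; the function name `update_ban_block` and its error handling are assumptions of mine, and it only works on text, so nothing here touches a real htaccess file.

```python
BEGIN = "# BEGIN worst-offenders"
END = "# END worst-offenders"


def update_ban_block(htaccess_text, banned_ips):
    """Replace everything between the two markers with one Deny line per IP.

    Lines outside the markers are left exactly as they were. If the markers
    are missing, nothing is guessed: the caller has to add them by hand.
    """
    lines = htaccess_text.splitlines()
    stripped = [line.strip() for line in lines]
    if BEGIN not in stripped or END not in stripped:
        raise ValueError("markers not found; add them by hand first")
    start = stripped.index(BEGIN)
    end = stripped.index(END)
    if end < start:
        raise ValueError("END marker appears before BEGIN marker")
    deny_lines = [f"Deny from {ip}" for ip in banned_ips]
    # Keep everything up to and including BEGIN, then the fresh Deny lines,
    # then END and whatever follows it.
    new_lines = lines[: start + 1] + deny_lines + lines[end:]
    return "\n".join(new_lines) + "\n"


example = """Order Allow,Deny
# BEGIN worst-offenders
Deny from 202.75.49.130
# END worst-offenders
Allow from all
"""

print(update_ban_block(example, ["202.75.49.130", "202.75.49.131"]))
```

    Refusing to run when the markers are absent is the "hand cranking" step in miniature: the file has to be put into a known state once, by a human, before anything is allowed to rewrite it automatically.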

    FAQ

    1. Is this a polished and buffed plugin that’s ready for the prime time? No! Absolutely not! For many reasons. This is an experiment to see if the idea works and to generate some discussion around what’s needed for dynamic spam blocking systems to work. It’s public so that those with the right skills can try it, or examine it, and perhaps learn from or contribute to it. If you’re an armchair amateur blogger, this plugin is not for you; yet.
    2. Why are some items in the list of offenders not ticked? Items with fewer than 4 spam messages are not ticked - this is flexible within the plugin, but not yet configurable.
    3. Why do I only see 10 “Worst Offenders” at once? Items with fewer than 4 spam messages are not ticked - this is flexible within the plugin, but not yet configurable.
    4. I have an idea for making this better, what can I do? Discuss it, implement it, share it.
    5. Does all this IP address checking add more load onto my poor server? The benefits far outweigh the costs. It’s a small increase at the front end, but a massive decrease overall. Comparing an IP address is a mathematically simple task, so it is very fast. Storing the comment, sending it to the Akismet service and then removing it from the database is much more work.
    6. Could this be an end to spam? No. The number of zombie machines out there is too large to block them all. This method reduces spam from zombies that know about your website, so it directly saves your resources whilst reducing the number of calls your server makes to the Akismet service.
    7. How many machines can this method block? Currently IP addresses drop off the end of the list after 400 are added, so the least recent disappear. This is adjustable in the software but not user-configurable yet.
    8. If I’m blocking spam at source, will the capability of Akismet be diminished because it might not see new variants of spam as they emerge? I don’t know. I doubt it. Maybe Matt can add detail without giving too much away.
    9. Can it block legitimate comments from non-spammers? It’s possible, but improbable. In cases where the spammer comes through a proxy, the IP address of the proxy might get banned, so anyone attempting to connect to the site through that proxy would get a “403”. Similarly, in shared IP pools for dial-up users, it’s feasible (though highly improbable) that a spammer might dial up, spam, become banned and then hang up, relinquishing the IP address to the next user. If that user happens to visit your site then they’ll get a 403. The lower threshold of ‘n’ spams from an IP address or a domain is there to decrease these possibilities, but it cannot negate the issue.
    10. How long does the ban last? Currently there is a rather arbitrary limit of 400 IP addresses in a FIFO queue. When an address gets to the end it’s dropped off the list and is thus allowed to connect again.
    11. I think I’m OK with .htaccess files, but what if I’m not? Take a backup before you start: cp .htaccess .htaccess.bak in your WordPress root. Then, if you want to revert, rm .htaccess and then cp .htaccess.bak .htaccess.
    12. Can I revert to the vanilla Akismet plugin? Yes. No changes are made that affect the standard Akismet functions, so you can swap back and forth by replacing the akismet.php file as many times as you like.
    13. I want it, but I’m not an uber-geek, is there any hope? Yes. If you think it’s a useful idea, the most helpful thing you can do is blog about it. If people read your blog and like the idea then it will help generate interest. Interest generates ideas, which increase the likelihood that this could turn into something really useful.
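
    The two numbers that keep coming up in the FAQ (the threshold of 4 spams before an item is ticked, and the 400-address FIFO limit) can be sketched in a few lines of Python. This is an illustration, not the plugin's code: the names `MAX_BANNED`, `TICK_THRESHOLD`, `preticked` and `ban` are hypothetical, and skipping an address that is already in the queue is my assumption about how duplicates might be handled.

```python
from collections import deque

MAX_BANNED = 400    # FIFO limit mentioned in the FAQ
TICK_THRESHOLD = 4  # items with fewer spams than this are not ticked


def preticked(spam_counts):
    """Return the IPs whose spam count reaches the ticking threshold.

    spam_counts: dict mapping IP address string -> number of caught spams.
    """
    return [ip for ip, count in spam_counts.items() if count >= TICK_THRESHOLD]


def ban(queue, ip):
    """Append ip to the ban queue unless it is already listed (assumption).

    Because the deque has a maxlen, appending to a full queue silently
    drops the oldest entry, which is then allowed to connect again.
    """
    if ip not in queue:
        queue.append(ip)
    return queue


banned = deque(maxlen=MAX_BANNED)
for ip in preticked({"192.0.2.10": 7, "192.0.2.11": 2, "192.0.2.12": 4}):
    ban(banned, ip)

print(list(banned))
```

    A fixed-length queue means the ban has no set duration in days: an address stays blocked only until 400 newer offenders have pushed it off the end.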

35 Responses to “Akismet htaccess extension”

  1. 11
    Trackback from: 冰古Blog » Akismet Worst Offenders Extension
  2. 12
    Eric Meyer Says:

    Nice work, sir! One small glitch: the text reading something like “In total, 10 IP addresses are now banned (1 added).” contains a link, and that link points to boakes.org/wp-admin/templates.php?file=.htaccess — as opposed to the corresponding page on my server. It’s something I’m sure I can fix in my own copy of the plugin, but it’s also something worth fixing for future downloaders.

  3. 13
    Rich Says:

    Thanks to Eric Meyer and Chris Samuel for pointing out the deliberate snafu with the “banned” link. I’ll knock out a new version now. I should have done it when Chris pointed it out initially but it’s been sunny, and that doesn’t happen in Britain all that often :)

    OK, that seems to be fixed. I’ve switched the link to the unbroken version; watch out for a minor (but useful) improvement to follow…

  4. 14
    Trackback from: Reduce Comment Spam With Akismet Worst Offenders and htaccess Extension « Sabahan.com
  5. 15
    Trackback from: New Blog Anti-Spam Tools at The Musings of Chris Samuel
  6. 16
    Trackback from: Frühstücksfleisch wird immer agressiver — Software Guide
  7. 17
    An NTL user from Nottingham Says:

    wtf you dick, you leach of wordpress, and don’t even give them credit for their software.
    P.s. Akismet doesn’t work for me, it’s shit theres a good one thats in the plugin db though.

  8. 18
    Rich Says:

    Dear NTL user from Nottingham, perhaps you missed the list of categories just under the article title. This article is published in two categories, Spam and WordPress. This is the 35th article in the WordPress category.

    I’m delighted to note that at the time of writing, this little extension is on the front page of planet.wordpress.org thanks to Donncha’s favourable review.

    The article relates to one of several plugins that I have written for WordPress and given freely to the WP community. When time permits (which is less often than I’d like) I try to visit #wordpress on IRC and answer the odd question. I also try to support the plugins that I write, and answer as many questions as I can.

    I’m sorry Akismet doesn’t work for you, so I’m pleased that the other plugin that you failed to adequately reference provides you with spam-free blogging.

    Since you appear keen to help the community, please consider joining the spam-stopper mailing list where discussion surrounding the Akismet services and clients is ongoing. I’m sure your silver tongue will help cut through any bureaucracy that might be stopping Akismet from being a better tool than it already is.

    So, to recap; your claims of “not crediting wordpress” have been firmly rebuffed, suggestions of “leaching” have been shown to be wholly without grounds and poorly researched and your irrelevant insults have been politely ignored.

    BTW, you sound remarkably like an angry spammer who’s worried that this experiment might affect your profits; after all, you say you’re a WordPress user, but you’ve not given your name or the URL of your blog. People are bound to wonder if there’s an ulterior motive behind your anonymous barbed words.

  9. 19
    Chris Samuel Says:

    Hi Rich,

    Sounds like they are the same loser who tried to spam my blog post about using this plugin with abusive comments (same source IP and the referrer to my blog post was from this article, so it seems they’re following the trackbacks). Looks like he wants to have a go at anyone who’s fighting spam..

    I think we’re at stage 2 of the Gandhi scale. :-)

    cheers!
    Chris

  10. 20
    Rich Says:

    After a further comparison of access logs, Chris has also been kind enough to forward me copies of the unpublished comments from the same user. Needless to say, “An NTL user from Nottingham” doesn’t seem to have anything positive or constructive to say on any subject, so any future ramblings will be reserved for publication in a future thread; or maybe I’ll just mark them as spam and let Akismet take over.

    Incidentally he/she/it found this site through a rather specific Google query for “.htaccess”, nothing to do with WordPress per se.