Akismet htaccess extension

My spam counter in Akismet has been steadily rising of late, and it’s been approaching 10,000 caught spams very quickly. Yesterday it went through 9,950 and with my average of over 100 spams per day it should have gone through the 10,000 barrier by now. But instead I’ve had about 3 spams today. Did I just find an off button for spam?

Worst Offenders

I wrote the other day about a small Akismet extension that I’d been playing with that helps remove the worst offenders from the list of caught spam – this in turn makes false positives easier to recognize. One of the things that the extension notices is which IP addresses are particularly prolific; it’s this information that helped me achieve a 95% (and rising) reduction in the spam that I send to Akismet to be checked.
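As a rough illustration of the "prolific IP" idea, here is a minimal Python sketch (the plugin itself is a WordPress plugin, so this is not its code). It assumes you already have one IP address per caught spam comment; the function name `worst_offenders` and the sample data are made up for the example.

```python
from collections import Counter

def worst_offenders(spam_ips, top_n=10):
    """Return up to top_n (ip, count) pairs, most prolific first."""
    return Counter(spam_ips).most_common(top_n)

# Hypothetical sample data: one entry per caught spam comment.
caught = [
    "202.75.49.130",
    "202.75.49.131",
    "202.75.49.130",
    "202.75.49.130",
    "198.51.100.7",
]

for ip, count in worst_offenders(caught):
    print(f"{ip}: {count} spam comments")
```

The addresses at the top of that ranking are the candidates for banning.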

htaccess

When writing about the extension on the Akismet mailing list, I suggested that hooking the worst offenders IP list into the Apache htaccess file should provide a simple, dynamic means of rejecting spam before the HTTP request has been processed by the server (and before the Akismet plugin has had to check it against its server).

So I tried it, and it works, apparently flawlessly so far. In my access log I can see that over 100 requests have been rejected so far today. That’s 100 requests that are not sat in my “spambox” waiting for me to check for false positives.

Today’s Akismet spam count is in single figures instead of triple figures.

Fantastic.
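For anyone who wants to repeat the access-log check, the sketch below counts 403 responses per client IP in Python. It assumes Apache's common/combined log layout (client IP first, status code right after the quoted request line); the sample lines, dates and the name `count_denied` are invented for illustration.

```python
import re
from collections import Counter

# Assumed layout: IP, two identity fields, [timestamp], "request", status.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) ')

def count_denied(lines):
    """Count requests answered with 403, grouped by client IP."""
    denied = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(2) == "403":
            denied[match.group(1)] += 1
    return denied

# Made-up log lines in the assumed format.
sample = [
    '202.75.49.130 - - [10/Jul/2006:10:00:00 +0100] "POST /wp-comments-post.php HTTP/1.1" 403 199',
    '192.0.2.10 - - [10/Jul/2006:10:00:05 +0100] "GET / HTTP/1.1" 200 5120',
    '202.75.49.130 - - [10/Jul/2006:10:06:00 +0100] "POST /wp-comments-post.php HTTP/1.1" 403 199',
]

denied = count_denied(sample)
print(sum(denied.values()), "requests rejected")
```

To run it against a real log, pass an open file object instead of `sample`; lines that do not match the assumed layout are simply skipped.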

Ongoing Thoughts

So what does this mean for this site?

  1. If a spammer does manage to leave a message here, it’s caught by Akismet and marked as such; it never gets through.
  2. When I remove those messages, I can optionally ban the IP address from whence they came.
  3. Bandwidth usage is reduced because the server turns away requests from banned spammers with a 403, and fewer two-way chats with the Akismet server are necessary.
  4. The comment database does not get needlessly filled with temporary comments that are removed once their spamminess has been identified, so the DB and DB indexer have less work to do.

So what does this mean for the Akismet project?

    1. There will be fewer hits from this site, so more time to concentrate on others.
    2. If the changes can be used by others, the net effect will be a more scalable and responsive service.
    3. If large scale uptake was achieved, the spam zeitgeist might start to look different because the number of spams being checked daily should significantly reduce.

    There are several obstacles to global spam nirvana, including:

    1. The installation process will require a tiny bit of “hand cranking” to ensure the htaccess file is in a suitable state for automatic updating.
    2. The system should probably exist as a separate plugin that hooks into the Akismet plugin at appropriate points, but those points haven’t been defined yet.
    3. Not everyone uses Apache, so not everyone has an htaccess file.
    4. Not everyone uses WordPress, so Akismet service users on other platforms will have to re-implement the idea.

    Download

    Note: this is an experimental extension to Akismet. It appears to work for me, and I will be delighted if others try it and can give useful feedback for the experiment.

    i.e. Please don’t ask linux/htaccess questions – I’d love to help, but I don’t currently have time to hand-hold on a non-production experiment.

    Still keen? Great.

    If you’re really brave and want to try it out, you need to carefully follow these steps.

    1. Precondition: Follow the installation instructions for Akismet and ensure it’s working correctly.
    2. Download the Extended Akismet Plugin.
    3. Replace the akismet plugin with the one you just downloaded – it should pick up all the configuration from the “official” version of the plugin.
    4. In the WP admin interface, open the “Manage | Files” tab and check that your htaccess file is writable.
    5. The automatic IP banning is written between two markers that are “# BEGIN worst-offenders” and “# END worst-offenders”. If you already have a deny list in your htaccess file, just add the markers to the list. Mine looks like this:
      Order Allow,Deny
      # BEGIN worst-offenders
      Deny from 202.75.49.130
      Deny from 202.75.49.131
      Deny from 202.75.49.133
      Deny from 202.75.49.134
      # END worst-offenders
      Allow from all
    6. From now, when you look at your “Worst Offenders” list in Akismet, you should see the option to ban the spamming IP addresses when you delete the messages.
    7. Leave feedback and ask questions below.
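The marker convention in step 5 can be sketched as follows. This is a Python illustration of the idea, not the plugin's own code: it replaces everything between the two markers with a fresh list of Deny lines and leaves the rest of the file alone. The function name `update_ban_list` is hypothetical, and the sketch assumes the markers are already present, as the steps above require.

```python
BEGIN = "# BEGIN worst-offenders"
END = "# END worst-offenders"

def update_ban_list(htaccess_text, banned_ips):
    """Return htaccess_text with the marked block replaced by Deny lines."""
    lines = htaccess_text.splitlines()
    stripped = [line.strip() for line in lines]
    try:
        start = stripped.index(BEGIN)
        end = stripped.index(END, start + 1)
    except ValueError:
        # The markers must be added by hand first.
        raise ValueError("worst-offenders markers not found") from None
    deny = [f"Deny from {ip}" for ip in banned_ips]
    # Keep everything up to BEGIN and from END onwards untouched.
    return "\n".join(lines[:start + 1] + deny + lines[end:]) + "\n"

example = """Order Allow,Deny
# BEGIN worst-offenders
Deny from 202.75.49.130
# END worst-offenders
Allow from all
"""

print(update_ban_list(example, ["202.75.49.130", "202.75.49.131"]))
```

Because only the text between the markers is touched, any rewrite rules or other directives elsewhere in the file survive each update.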

    FAQ

    1. Is this a polished and buffed plugin that’s ready for the prime time? No! Absolutely not! For many reasons. This is an experiment to see if the idea works and to generate some discussion around what’s needed for dynamic spam blocking systems to work. It’s public so that those with the right skills can try it, or examine it, and perhaps learn from or contribute to it. If you’re an armchair amateur blogger, this plugin is not for you; yet.
    2. Why are some items in the list of offenders not ticked? Items with fewer than 4 spam messages are not ticked – this is flexible within the plugin, but not yet configurable.
    3. Why do I only see 10 “Worst Offenders” at once? Items with fewer than 4 spam messages are not ticked – this is flexible within the plugin, but not yet configurable.
    4. I have an idea for making this better, what can I do? Discuss it, implement it, share it.
    5. Does all this IP address checking add more load onto my poor server? The benefits far outweigh the costs. It’s a small increase at the front end, but a massive decrease overall. Comparing an IP address is a mathematically simple task, so it is very fast. Storing the comment, sending it to the Akismet service and then removing it from the database is much more work.
    6. Could this be an end to spam? No. The number of zombie machines out there is too large to block them all. This method reduces spam from zombies that know about your website, so it directly saves your resources whilst reducing the number of calls your server makes to the Akismet service.
    7. How many machines can this method block? Currently IP addresses drop off the end of the list after 400 are added, so the least recent disappear. This is adjustable in the software but not user-configurable yet.
    8. If I’m blocking spam at source, will the capability of Akismet be diminished, because it might not see new variants of spam as they emerge? I don’t know. I doubt it. Maybe Matt can add detail without giving too much away.
    9. Can it block legitimate comments from non-spammers? It’s possible, but improbable. In cases where the spammer comes through a proxy, the IP address of the proxy might get banned, so anyone attempting to connect to the site through that proxy would get a “403”. Similarly, in shared IP pools for dial-up users, it’s feasible (though highly improbable) that a spammer might dial up, spam, become banned and then hang up, relinquishing the IP address to the next user. If that user happens to visit your site then they’ll get a 403. The lower threshold of ‘n’ spams from an IP address or a domain is there to decrease these possibilities, but it cannot eliminate the issue.
    10. How long does the ban last? Currently there is a rather arbitrary limit of 400 IP addresses in a FIFO queue. When an address gets to the end it’s dropped off the list and is thus allowed to connect again.
    11. I think I’m OK with .htaccess files, but what if I’m not? Take a backup before you start: cp .htaccess .htaccess.bak in your WordPress root. Then, if you want to revert, rm .htaccess then cp .htaccess.bak .htaccess.
    12. Can I revert to the vanilla Akismet plugin? Yes. No changes are made that affect the standard Akismet functions, so you can swap back and forth by replacing the akismet.php file as many times as you like.
    13. I want it, but I’m not an uber-geek, is there any hope? Yes. If you think it’s a useful idea, the most helpful thing you can do is blog about it. If people read your blog and like the idea then it will help generate interest. Interest generates ideas, which increase the likelihood that this could turn into something really useful.
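Two of the numbers in the FAQ, the 400-address FIFO limit and the "fewer than 4 spams is not ticked" threshold, can be sketched in a few lines of Python. This illustrates the behaviour described above rather than the plugin's code; the names `MAX_BANNED`, `TICK_THRESHOLD`, `preselected` and `ban` are invented here, and skipping addresses that are already banned is an assumption.

```python
from collections import deque

MAX_BANNED = 400     # FIFO limit quoted in the FAQ
TICK_THRESHOLD = 4   # fewer than 4 spam messages: not pre-ticked

def preselected(spam_counts, threshold=TICK_THRESHOLD):
    """Return the IPs whose spam count reaches the threshold."""
    return [ip for ip, count in spam_counts.items() if count >= threshold]

def ban(queue, ip):
    """Append ip to the ban queue unless it is already there (assumed)."""
    if ip not in queue:
        queue.append(ip)
    return queue

# A deque with maxlen silently drops the oldest entry when full,
# which is the "least recent disappear" behaviour.
banned = deque(maxlen=MAX_BANNED)
for ip in preselected({"202.75.49.130": 7, "202.75.49.131": 2}):
    ban(banned, ip)
print(list(banned))
```

With `maxlen` set, no explicit eviction code is needed: once the queue is full, each new ban pushes the oldest address out and that address can connect again.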

36 Responses to Akismet htaccess extension

  1. Chris Samuel says:

    Minor bug – when you ban an IP address it puts a message up about X IP addresses being banned, but does that as a hardcoded link to http://boakes.org/wp-admin/templates.php?file=.htaccess – which is probably not what you want.. ;-)

    But, a great tool!

    Folks have to remember to let spam build up though and not delete it on reflex so it gets a better chance of repeat spammers.

    cheers!
    Chris

    [Update: Fixed!]

  2. Chris Samuel says:

    Rich, the lower limit stuff doesn’t appear to be working here (either on my site or Donna’s blog), I seem to see it classify all spams, even those with only 1 appearance.

    Could this be a PHP5 issue ?

    cheers!
    Chris

  3. Rich says:

    I think it’s down to the fact that this version shows all spammers (even down to just 1 hit) but only pre-selects (i.e. ticks) the box of those that have spammed multiple times – let me know if what you’re seeing is different to that. I’m basically playing with the UI to see what works, what’s useful etc.

  4. Rich says:

    Update.
    Since the trial began 6 days ago, Akismet has handled about 300 spam comments. The IPs of those machines have been banned by the plugin and, as a result, over 1000 connections have been denied access with a 403 message.

    In past months the number of 403s has averaged 100 per month, so the current level of 150 per day suggests that the mechanism works.

    It’s certainly a lot easier to separate ham from spam, because my spam box is almost empty.

    The do-or-die card for this approach is the number of IP addresses that htaccess can ban before it starts to negatively affect performance.

    The key here is that using akismet, the server has to do work every time a comment spam is received, but by banning IP addresses, the server has to do (far simpler comparison) work for every connection.

    It’s a trade-off. Currently the simplicity of an empty spam box is wonderful, and worth its weight in gold, but I’d like to know just how far the idea can be stretched.

  5. Rich says:

    Update.
    Another 5 days on, over 2000 spam connections have been denied and about 400 handled by Akismet. So, early indicators are that using dynamic htaccess re-writing can cut comment spam by two thirds. Which is good, because this is cutting the spam before having to use resources in asking a third party for help.

  6. Pingback: Holy Shmoly! :: The Akismet Worst Offenders

  7. Elliott Back says:

    What you need is to add an iptables DENY rule instead of an .htaccess rule. Apache still has to do a lot of work to process those rules on incoming requests. If you’re going to drop connections based on an IP, why not go down a few layers?

  8. Rich says:

    Update.
    The 403 count is now up to 3400 (i.e. 1400 denied connections since my last update). That means the average number of banned connection attempts is currently circa 240 per day (one every six minutes).

  9. Rich says:

    Hi Elliott,
    That’s certainly an extension of the idea which could increase the benefit of blocking – every layer counts.

    Currently we don’t use ipchain on our debian server (and there are several other sites which run on it so I can’t go too mad).

    I went for .htaccess because it’s low-hanging fruit. It proves that the concept can work, and it also highlights that such plugins could be made easier to develop with a few more hooks in the akismet/wordpress framework.

    Hopefully, we can do something to make active defence part of the standard wordpress experience, with a suite of plugins that can cater for different hosting and firewall scenarios.

  10. Pingback: harc a spam kommentek ellen - kobak pont org

  11. Pingback: 冰古Blog » Akismet Worst Offenders Extension

  12. Eric Meyer says:

    Nice work, sir! One small glitch: the text reading something like “In total, 10 IP addresses are now banned (1 added).” contains a link, and that link points to boakes.org/wp-admin/templates.php?file=.htaccess — as opposed to the corresponding page on my server. It’s something I’m sure I can fix in my own copy of the plugin, but it’s also something worth fixing for future downloaders.

  13. Rich says:

    Thanks to Eric Meyer and Chris Samuel for pointing out the deliberate snafu with the “banned” link. I’ll knock out a new version now. I should have done it when Chris pointed it out initially but it’s been sunny, and that doesn’t happen in Britain all that often :)

    Ok, that seems to be fixed. I’ve switched the link to the unbroken version; watch out for a minor (but useful) improvement to follow…

  14. Pingback: Reduce Comment Spam With Akismet Worst Offenders and htaccess Extension « Sabahan.com

  15. Pingback: New Blog Anti-Spam Tools at The Musings of Chris Samuel

  16. Pingback: Frühstücksfleisch wird immer agressiver — Software Guide

  17. wtf you dick, you leach of wordpress, and don’t even give them credit for their software.
    P.s. Akismet doesn’t work for me, it’s shit theres a good one thats in the plugin db though.

  18. Rich says:

    Dear NTL user from Nottingham, perhaps you missed the list of categories just under the article title. This article is published in two categories, Spam and WordPress. This is the 35th article in the WordPress category.

    I’m delighted to note that at the time of writing, this little extension is on the front page of planet.wordpress.org thanks to Donncha’s favourable review.

    The article relates to one of several plugins that I have written for WordPress, that I have given freely to the WP community. When time permits (which is less often than I’d like) I try and visit #wordpress on IRC and answer the odd question. I also try to support the plugins that I write, and answer as many questions as I can.

    I’m sorry Akismet doesn’t work for you, so I’m pleased that the other plugin that you failed to adequately reference provides you with spam-free blogging.

    Since you appear keen to help the community, please consider joining the spam-stopper mailing list where discussion surrounding the Akismet services and clients is ongoing. I’m sure your silver tongue will help cut through any bureaucracy that might be stopping Akismet from being a better tool than it already is.

    So, to recap; your claims of “not crediting wordpress” have been firmly rebuffed, suggestions of “leaching” have been shown to be wholly without grounds and poorly researched and your irrelevant insults have been politely ignored.

    BTW, you sound remarkably like an angry spammer who’s worried that this experiment might affect your profits; after all, you say you’re a WordPress user, but you’ve not given your name or the URL of your blog. People are bound to wonder if there’s an ulterior motive behind your anonymous barbed words.

  19. Chris Samuel says:

    Hi Rich,

    Sounds like they are the same loser who tried to spam my blog post about using this plugin with abusive comments (same source IP and the referrer to my blog post was from this article, so it seems they’re following the trackbacks). Looks like he wants to have a go at anyone who’s fighting spam..

    I think we’re at stage 2 of the Gandhi scale. :-)

    cheers!
    Chris

  20. Rich says:

    After a further comparison of access logs, Chris has also been kind enough to forward me copies of the unpublished comments from the same user. Needless to say, “An NTL user from Nottingham” doesn’t seem to have anything positive or constructive to say on any subject, so any future ramblings will be reserved for publication in a future thread; or maybe I’ll just mark them as spam and let Akismet take over.

    Incidentally he/she/it found this site through a rather specific Google query for “.htaccess“, nothing to do with WordPress per se.

  21. Pingback: Basic Thinking Blog » Akismet getunt

  22. Pingback: my weblog » Akismet in Neuauflage

  23. oculos says:
    This comment has been moved from the Worst Offenders article so that it’s in context for others searching for similar information.

    Hi there!

    Is it possible that this extension increases the number of received spam?

    See, the first 2, 3 days after using it, it worked beautifully. Then, 3 days later, I started getting like 300 spam/hour, and the number of banned ip’s just got to like 200 so far! How come they keep getting through, if they are banned? Maybe my .htaccess is misconfigured or something? It is indeed adding the ip’s there, but still lots of spam are getting through, which increases the traffic I get.

    Now I am kinda afraid to get back to the original akismet plugin because i’ll have to quit my job to filter spam… ;)

    I’m sure it’s something to do with my setup, so could you guys give me a hint of what to do?

    Yours,

    oculos

  24. Rich says:

    Hi Oculos,

    This comment has been moved from the Worst Offenders article so that it’s in context for others searching for similar information.

    I notice you’ve mentioned Banned IP’s so you’re not using this plugin, you’re using the htaccess extension which adds to the concepts described here… it’s also highly experimental, however it should not result in an increase in spam.

    When you say you’re “receiving” 300 spam per hour can you be more specific? Are these spam messages coming from “banned” IP addresses?

  25. oculos says:

    Dear Rich,

    Thanks for your reply!

    I believe they are coming from the banned IP addresses, as sometimes I try to ban those listed, but they don’t get added (likely because they are already on the list of the .htaccess).

    the spam are the usual, those poker stuff… could it be that either they spoof their address, or because the worst offenders try to ban the ip associated with the domain name, and they change it? just a guess…

  26. Rich says:

    If you’re “banning” the IP addresses, and if you see them listed in the .htaccess file, then perhaps that file setup is not right.

    Do you have the appropriate allow-deny stuff around the list of banned IP’s? i.e. the “Order Allow,Deny” and “Allow from all” as shown in the download instructions?

  27. Pingback: Useless Knowledge Blog » Blog Archive » Von Spam Karma auf Askimet

  28. Pingback: DeSiOrSite.com » Aggiornamenti sito

  29. Pingback: újra itthon - kobak pont org

  30. Pingback: …time is what you make of it… » Archivio del blog » Akismet htaccess extension

  31. Pingback: Akismet .htaccess Extension | Col’s Tech Stuff

  32. Pingback: 2006 » July » 31 « :: plasticdreams ::

  33. jca says:

    I’ve been trying out your akismet-htaccess-extension — appreciate you adding the ban IPs function.

    A few questions/issues:

    The plug-in “sees” domain names in addition to IPs — most of my “worst offenders” list are the domain names instead of the IPs. Since (I assume this isn’t possible) you can’t deny via a domain name (the plug-in won’t add domains to htaccess, only IPs), is there any way to set the extension to ignore the domain names? This way only bannable IPs will be listed, instead of falling off the top ten list because of domain names.

    The plug-in doesn’t add the “Order Allow,Deny” and “Allow from all” to htaccess if they aren’t there — should it also add these instructions automagically around its IP list?

    Appreciate any help/info.

  34. Pingback: Snail Blog (beta) » 最強の対スパムPlugin『Akismet』を試してみた

  35. Pingback: Have FUN! » A few temporary solutions to the spamming problem

  36. Pingback: ITエンジニアのお仕事 » Blog Archive » スパム対策プラグイン導入(Akismet)