Worst Offenders for WordPress 2.5 – Pre-Alpha

I’m in the process of rewriting the Worst Offenders plugin for the soon-to-be-released WordPress 2.5. Before I make a tested and polished version of the code globally available, I’d be interested to hear from anyone who’d like to alpha test it.

As before, Worst Offenders works cooperatively with other anti-spam plugins: its primary purpose is identifying and deleting the comments that are 100% definitely spam (sent by the very worst offenders) so that any “false positives” (sent by genuine humans) can be rescued from the spam bin!

I’ve got it working on this site already, where it’s proved faster than the previous versions – it also has a nicer user interface. There are a few minor operational features that need to be finalised, but it’s basically capable of doing what it’s supposed to.

This version has a pluggable interface, so different “litmus tests” can be applied to spam at the same time, and third parties can easily write tests without having to write a whole interface.

The Worst Offenders v3.0.0.0alpha User Interface

I’m keen to hear from people who:

  • Know their way around WordPress/PHP already.
  • Can take a look at the litmus test API and comment on ways to improve it.
  • Suffer from very high spam loads (hundreds or thousands per day) and can give the existing litmus tests a bit of a workout to check whether their SQL is as efficient as I hope.

Development SVN is being kindly hosted by Automattic and releases will be available here.


21 Responses to Worst Offenders for WordPress 2.5 – Pre-Alpha

  1. Shahab says:

    Great news. In fact, I was considering picking this up for development, but being lazy (and busy) kept me from doing so. I have a few comments:

    Remove it from the Manage screen. It’s a hassle to click three times to get to the Worst Offenders screen (where you pass the Manage posts page in between) on a mobile client. It would be good if a userhook-style link were posted in the dashboard itself, just beneath the “Akismet has blocked 3 million spams” link. [update: done!]

    The blacklist feature needs a lot of tweaking. Currently it adds a lot of duplicates. I’d recommend using tables in the database for temporary sorting and filtering of the IP addresses that are to be blocked. [update: done!]

    You can’t see the comment which is currently marked as spam in the WO page. The link which is generated for a spam comment is dead, so there’s no way you can find out which comment is which. Showing the content in a tooltip would be much better.

    Thanks and good luck for one of the best plugins I’ve ever used.

  2. Rich says:

    great input there Shahab, thanks.

    As I see it there’s a bunch of stuff to be done before the first release candidate, the list is…

    1. Sort out the spam counting system. Currently each litmus test returns a count of the number of messages it can detect. The downside of this is that if a comment is detected by more than one litmus test, it gets counted twice. The solutions are either to avoid counting altogether and just return and merge the display results, or, as you’ve suggested, to use a temporary table to store the count.
    2. Simplify access. Dashboard inclusion with a quick delete button is definitely a good idea. I’ve been looking into that and also into getting a simplified clone of the “All” tab within Akismet itself (its tabs are extensible).
    3. Comment Viewing. The dead link is a surprise, but it’s good that it’s highlighted the issue. I’m torn between a popup and a static iframe that would contain the detail. What I definitely want to avoid is returning and using all the detail unless it’s necessary, so whether popup or iframe, it would need to be a dynamic comment lookup.
    4. Litmus test detection. Currently the litmus tests are manually included. I’d like this to be automated so new tests can be added without any change to the core. This will either be through sticking tests in a single folder, or making tests standard plugins that register with WO upon initialisation.
    5. Once discovered (exactly like the plugin system) each litmus test should initially be un-activated – it should be up to the user to decide what gets used.
    6. Clean up Index creation and deletion in each litmus test.
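
One way to picture the double-counting problem from item 1: if each litmus test returned the IDs of the comments it matched rather than a bare count, the total could be taken over the merged, de-duplicated set. The sketch below is only an illustration under that assumption; the function name and the array layout are hypothetical and are not taken from the Worst Offenders code.

```php
<?php
// Hypothetical sketch: none of these names come from the Worst Offenders source.
// Each litmus test is assumed to return the IDs of the spam comments it matches.

/**
 * Count distinct comment IDs across all litmus tests.
 *
 * @param array $matchesByTest Map of test name => list of matching comment IDs.
 * @return int Number of unique comments matched by at least one test.
 */
function wo_count_unique_matches(array $matchesByTest)
{
    $seen = array();
    foreach ($matchesByTest as $ids) {
        foreach ($ids as $id) {
            // Using the ID as an array key de-duplicates as we go.
            $seen[(int) $id] = true;
        }
    }
    return count($seen);
}

$matches = array(
    'MultiLink'   => array(101, 102, 103),
    'ObviousName' => array(103, 104), // 103 is also caught by MultiLink
);

echo wo_count_unique_matches($matches), "\n"; // 4, whereas summing per-test counts gives 5
```

The per-test counts can still be shown individually; only the overall total needs the merged set.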
  3. Rich says:

    I’ve just checked in a basic dashboard link which cuts the number of clicks down to 2 (one click to select the Worst Offenders page and one click to press the delete button). I’ll have a look at adding a button to the dashboard later.

  4. Chris Samuel says:

    Bore da Rich, dw’in hoffi clep fawr! ;-)

    OK – just upgraded a test version of my blog to 2.5, so I gave your WO 3.0 a quick go, and find I get a divide by zero error:

    Warning: Division by zero in [..]/worst-offenders/classes/all_litmus.php on line 67

    Line 67 in the version I have is:

    $average = $counter / $denominator;

    so it’s not handling the case where $denominator is 0.

    Pob lwc!
    Chris
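
A guard of the kind suggested in this report might look like the sketch below. Only $counter and $denominator appear in the reported line; the helper name and the choice of 0 as the fallback value are assumptions made for illustration, not the fix that was eventually checked in.

```php
<?php
// Hypothetical helper; only $counter and $denominator appear in the reported line.

/**
 * Divide $counter by $denominator, returning $default when the
 * denominator is zero (for example, when no spam has been recorded yet).
 */
function wo_safe_average($counter, $denominator, $default = 0.0)
{
    if ($denominator == 0) {
        return $default;
    }
    return $counter / $denominator;
}

echo wo_safe_average(10, 4), "\n"; // 2.5
echo wo_safe_average(10, 0), "\n"; // 0
```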

  5. Chris Samuel says:

    Interestingly, now I’ve migrated my main blog to 2.5-RC1 I’m not seeing the same problem, possibly because I’ve had a couple of spam comments arrive since the migration.

    Still, well worth checking before a possible divide that something bad isn’t about to happen..

    Diolch yn fawr!

  6. Chris Samuel says:

    Just noticed that it claims “Worst Offenders has has removed 10,264% of spam automatically” – whiter than white ? :-)

  7. Rich says:

    I’ve checked in an update that:

    1. Uses wp_cache wherever possible to reduce DB overhead.
    2. Modifies all getCount functions so they refer to the cached results, reducing the overall number of queries.
    3. Gives an accurate total count which allows for overlap of different litmus tests.
    4. Hopefully fixes the percentage calculation issue. (Thanks for the pointer Chris)
    1. Chris Samuel says:

      Do you need to bump the version number for it to notice the change ?

      Or do they only update daily ?

    2. Rich says:

      No idea… perhaps I should start tagging the check-ins.

    3. Chris Samuel says:

      Bore da Rich,

      The current version gives me this error on the WO options screen:

      Warning: call_user_func_array() [function.call-user-func-array]: First argument is expected to be a valid callback, 'AllLitmus::rollCall' was given in $BLOG/wp-includes/plugin.php on line 311

      Two other points:

      1. You need to tell people they must use MyISAM for their comments table in MySQL as you rely on FULLTEXT queries. I was using InnoDB and was getting errors about FULLTEXT not being supported on that table.

      2. You need to tell people to update their indexes when they install the plugin (or do it for them), otherwise you get errors like:

      [Sat Mar 22 23:53:49 2008] [warn] mod_fcgid: stderr: WordPress database error FUNCTION wp2.wordcount2 does not exist for query SELECT wordcount2(comment_content, 'http://') as num, group_concat(comment_id separator ',') as comment_id_list FROM wp_comments where comment_approved='spam' group by num having num >= 10 order by num desc; made by runCachedMatchesQuery

      Actually, I’m still getting that. Is that because the first error stops the FUNCTION being created?

      cheers,
      Chris

    4. Chris Samuel says:

      Forgot to say – that’s the current development version (you need to look under the “Other Versions” link on the download page).

    5. Chris Samuel says:

      More information – I’ve noticed that there are SQL errors being generated by the attempts to create indexes; they’re not being picked up by your code but are getting reported in the Apache error log.

      Here you go, these are all from just clicking on the “Update Options” button on the WorstOffenders Config page:

      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: WordPress database error FUNCTION wp2.wordcount2 does not exist for query SELECT wordcount2(comment_content, 'http://') as num, group_concat(comment_id separator ',') as comment_id_list FROM wp_comments where comment_approved='spam' group by num having num >= 10 order by num desc; made by runCachedMatchesQuery
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: WordPress database error You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''wp_comments' ADD INDEX 'ip_spotter'('comment_author_IP')' at line 1 for query ALTER TABLE 'wp_comments' ADD INDEX 'ip_spotter'('comment_author_IP'); made by addIndex
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: WordPress database error You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''wp_comments' ENGINE = MyISAM ROW_FORMAT = DYNAMIC;
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: \t\t\t\tALTER TABLE 'wp_comment' at line 1 for query
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: \t\t\t\tALTER TABLE 'wp_comments' ENGINE = MyISAM ROW_FORMAT = DYNAMIC;
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: \t\t\t\tALTER TABLE 'wp_comments' ADD FULLTEXT INDEX 'content_fulltext'('comment_content');
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: \t\t\t\tCREATE FUNCTION wordcount2 ( a text, b VARCHAR(255) )
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: \t\t\t\tRETURNS INTEGER
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: \t\t\t\tCONTAINS SQL DETERMINISTIC
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: \t\t\t\tRETURN (CHAR_LENGTH(a)-CHAR_LENGTH(REPLACE(a, b, '')))/CHAR_LENGTH(b);
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: \t\t\t made by addIndex
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: WordPress database error You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'if exists url_spotter' at line 1 for query drop index if exists url_spotter; made by addIndex
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: WordPress database error You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''wp_comments' ADD INDEX 'url_spotter'('comment_author_url')' at line 1 for query ALTER TABLE 'wp_comments' ADD INDEX 'url_spotter'('comment_author_url'); made by addIndex
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: WordPress database error You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''wp_comments' ADD INDEX email_spotter'('comment_author_email')' at line 1 for query ALTER TABLE 'wp_comments' ADD INDEX email_spotter'('comment_author_email'); made by addIndex
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: WordPress database error You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''wp_comments' ENGINE = MyISAM ROW_FORMAT = DYNAMIC;
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: \t\t\t\tALTER TABLE 'wp_comment' at line 1 for query
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: \t\t\t\tALTER TABLE 'wp_comments' ENGINE = MyISAM ROW_FORMAT = DYNAMIC;
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: \t\t\t\tALTER TABLE 'wp_comments' ADD FULLTEXT INDEX 'content_fulltext'('comment_content');
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: \t\t\t\tCREATE FUNCTION wordcount2 ( a text, b VARCHAR(255) )
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: \t\t\t\tRETURNS INTEGER
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: \t\t\t\tCONTAINS SQL DETERMINISTIC
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: \t\t\t\tRETURN (CHAR_LENGTH(a)-CHAR_LENGTH(REPLACE(a, b, '')))/CHAR_LENGTH(b);
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: \t\t\t made by addIndex
      [Sun Mar 23 00:14:23 2008] [warn] mod_fcgid: stderr: WordPress database error You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''author_fulltext' (comment_author)' at line 1 for query ALTER TABLE wp_comments ADD FULLTEXT INDEX 'author_fulltext' (comment_author); made by addIndex
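
Most of the syntax errors in this log are consistent with table, index and column names being wrapped in single quotes. MySQL treats single-quoted text as a string literal and expects identifiers to be quoted with backticks (or left unquoted). The sketch below shows one way the statements could be built so that quoting is handled in one place; the helper names are hypothetical and this is not the patch mentioned later in the thread.

```php
<?php
// Hypothetical helpers, not taken from the plugin or from any patch in this thread.

/**
 * Quote a MySQL identifier with backticks, doubling any embedded backtick.
 */
function wo_quote_identifier($name)
{
    return '`' . str_replace('`', '``', $name) . '`';
}

/**
 * Build a single ALTER TABLE ... ADD INDEX statement.
 * One statement per call, so each can be run (and checked) on its own.
 */
function wo_add_index_sql($table, $index, $column, $fulltext = false)
{
    return sprintf(
        'ALTER TABLE %s ADD %sINDEX %s (%s)',
        wo_quote_identifier($table),
        $fulltext ? 'FULLTEXT ' : '',
        wo_quote_identifier($index),
        wo_quote_identifier($column)
    );
}

echo wo_add_index_sql('wp_comments', 'ip_spotter', 'comment_author_IP'), "\n";
// ALTER TABLE `wp_comments` ADD INDEX `ip_spotter` (`comment_author_IP`)
echo wo_add_index_sql('wp_comments', 'content_fulltext', 'comment_content', true), "\n";
// ALTER TABLE `wp_comments` ADD FULLTEXT INDEX `content_fulltext` (`comment_content`)
```

Two other entries in the log look like separate issues. The `drop index if exists url_spotter` statement is rejected because MySQL's form is `DROP INDEX index_name ON tbl_name`, with no `IF EXISTS` clause. The ALTER TABLE and CREATE FUNCTION statements also appear to have been sent together as one query, so they would probably need to be issued one at a time.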

    6. Rich says:

      I’ve removed references to roll_call (I need to find a better way of setting configs, probably within the “content” method).

    7. Chris Samuel says:

      Hi Rich,

      I’ve emailed you a patch for the current devel version that fixes all those MySQL bugs and results in all the indexes getting created correctly.

      I’ve gone from it finding 2 messages to finding 49 that it can deal with as it’s now picking up 46 in the MultiLink category it wasn’t seeing before.

      All 49
      IP 0
      MultiLink 46
      Domain 0
      Email 0
      MD5 0
      Name Length 2
      ObviousName 1
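
The MultiLink figure presumably comes from the wordcount2 query quoted in the error log earlier in the thread, which selects spam comments where wordcount2(comment_content, 'http://') is at least 10. As defined in that log, wordcount2 removes every occurrence of b from a and divides the number of characters removed by the length of b, which gives the number of non-overlapping occurrences. A PHP mirror of the same arithmetic, with a hypothetical function name, might look like this:

```php
<?php
// Hypothetical PHP mirror of the SQL wordcount2 function quoted in the log.

function wo_word_count($text, $needle)
{
    if ($needle === '') {
        return 0; // avoid dividing by zero for an empty needle
    }
    // Removing every occurrence shortens the text by strlen($needle) each time.
    $removed = strlen($text) - strlen(str_replace($needle, '', $text));
    return (int) ($removed / strlen($needle));
}

$comment = 'Buy now http://a.example http://b.example and http://c.example';
echo wo_word_count($comment, 'http://'), "\n"; // 3
```

In plain PHP the built-in substr_count() returns the same number; the longer form is shown only because it follows the SQL definition step by step.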

    8. ovizii says:

      Plugin could not be activated because it triggered a fatal error.
      Parse error: syntax error, unexpected T_CLASS in /var/www/web6/web/wordpress/wp-content/plugins/worst-offenders/classes/litmus.php on line 3

      using wp 2.5

    9. Rich says:

      Thanks for the report ovizii. I’ve just checked in a modified version which should help.

    10. AJ says:

      Hey Rich,

      I’ve been using Worst Offenders since forever and love it.
      I have recently switched to Defensio and was wondering if you could make this work with Defensio too (defensio.com)

      I would love to start using this plugin again. Thanks a ton for the plugin :)

    11. Rich says:

      Sounds interesting… It should be fairly easy to write a defensio litmus test for Worst Offenders. If nobody else has a crack at it, I’ll take a look.

    12. AJ says:

      Awesome.. I’ll be waiting for it :)

    13. AJ says:

      Hey Rich,

      Any updates on the defensio integration with worst offenders?

    14. Alan says:

      Am I right in thinking that, as you are using abstract class code, this is now PHP5 only?

      ;_;
