Archive for the 'Project Honey Pot' Category

Bad Behavior 2.2 Status and Roadmap

November 2nd, 2009 by Michael Hampton

Since the first release of Bad Behavior four years ago, tens of thousands of WordPress users have used it to protect their sites from the scourge of link spam. Bad Behavior’s second major release, just a year after the first, was a major redesign that has stood the test of time. It made Bad Behavior easier to port to other web site platforms, and made it easier to add new features and block new spam.

Now the design needs a few tweaks. This work will eventually become Bad Behavior 2.2. Today I want to update you on some of the changes Bad Behavior needs and what I’m planning for the 2.2 version.

As I noted with today’s 2.0.32 release, development of the 2.0 branch has been limited to bug fixes and security issues so that I can concentrate development on this new version. The development will take place in versions numbered from 2.1. As a development branch, it won’t be appropriate for everyone, but many of you will be interested in following its progress.

Before I get into the details of the roadmap, there’s something I haven’t talked about in a while and should probably do again. Bad Behavior has been a personal project of mine for almost five years now. It was born out of an incident, a couple of months after I started blogging, where I got my first comment spam. Unfortunately, my first comment spam was followed by 700 more over the space of a few hours. As you can imagine, I was thoroughly pissed. I spent some time looking at anti-spam solutions, but at the time there wasn’t much, and what there was didn’t work all that well. I felt I had to roll my own. A couple of months later, Bad Behavior was born.

I still clearly remember cleaning up after that first incident, and killing link spam has become something of a personal crusade for me. But I’ve learned that I can’t possibly do it all alone. Fortunately this field has grown significantly and there are now a whole lot of smart people working on various aspects of the link spam problem. What Bad Behavior brings to the table is to take that 700-spam attack and allow fewer than one percent of it to reach your blog. Having to clean seven spams out of the moderation queue is much easier than cleaning out 700. (This is one reason why I advise using more than one anti-spam solution.)

The main technique Bad Behavior uses to accomplish this is to block bots which scrape your site to get access to your comment forms, login forms and other such forms on your site. Once a bot has the form, it can pass it around a botnet and send dozens of spams to that page from all over the world. Preventing malicious bots from accessing the forms in the first place stops the majority of spam. The remainder of Bad Behavior’s checks are a variety of techniques used to identify poorly coded bots which imperfectly masquerade as legitimate web traffic.

As new spammers start up and new botnets come online, some find themselves already blocked, while others need to be analyzed before updates can be made to block them. Bad Behavior will therefore always require continuous development. Often this development is delayed because I have to pay bills. As you may be aware if you’ve been a very long-time user, I lost my job in 2005 and since then I have lived on revenue from blogging and paid web consulting work. Therefore I can only work on Bad Behavior when my finances permit.

Today my finances do not permit me to do any further work on Bad Behavior, mainly due to the economic recession. If you want this work to continue, as I’ll outline in the roadmap below, skip your morning latte tomorrow and send me a financial contribution. The amount is blank, so fill in whatever you feel is appropriate.

And if you see any problems with the roadmap, or feel it could be improved, feel free to comment below.

Core Changes

The most important change won’t be visible right away. A design change to the core is needed to enable Bad Behavior to be tested using more rigorous test methods. The earliest 2.1 releases will contain this change and I will write tests for each of Bad Behavior’s existing checks. Before the 2.2 stable release, and going forward, a test will be written for each feature introduced into Bad Behavior, to help prevent obvious and silly bugs which require almost immediate updates to fix, as happened with 2.0.30 through 2.0.32. The test suite which emerges from this work will ship as a downloadable package, so that you can test Bad Behavior yourself. (Thanks to Tony Bibbs for suggesting this change.)

Bad Behavior’s various whitelists will be moved out of the core and into a separate file template, downloaded separately from Bad Behavior. This will allow you to update Bad Behavior without disturbing your personal whitelists. This is currently an issue for all platforms. On platforms which support an integrated administrative page for changing Bad Behavior’s settings, and can store settings in the host platform’s database, the whitelists will be manageable from within the administrative page.
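To illustrate the idea of a whitelist kept in its own file, here is a minimal sketch in Python (Bad Behavior itself is not written in Python). The file format, the function names `parse_whitelist` and `is_whitelisted`, and the sample entries are all assumptions made for this example, not Bad Behavior's actual format.

```python
import ipaddress

def parse_whitelist(text):
    """Parse a hypothetical whitelist file: one IP or CIDR range per line.

    Anything after '#' is a comment; blank lines are skipped.
    """
    networks = []
    for raw in text.splitlines():
        entry = raw.split("#", 1)[0].strip()
        if entry:
            # strict=False also accepts a host address written with a prefix length
            networks.append(ipaddress.ip_network(entry, strict=False))
    return networks

def is_whitelisted(ip, networks):
    """Return True if the address falls inside any whitelisted network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)

# Sample contents of a personal whitelist file (hypothetical entries).
SAMPLE = """
# personal whitelist, kept outside the core
10.0.0.0/8
192.0.2.17   # a single address
"""
NETWORKS = parse_whitelist(SAMPLE)
```

Because the entries live outside the core, replacing the core files during an update would leave them untouched.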

Platform Connector Changes

On platforms which do not support an integrated administrative page for changing Bad Behavior’s settings, and require settings to be placed in the platform connector’s file, these settings will be placed in a separate file, downloadable separately from the platform connector. This will allow for the incorporation of settings for new features without updating the platform connector, or conversely, updating the platform connector without disturbing your settings. This is currently an issue for the Drupal module, MediaWiki extension, and possibly other platforms.
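The separation described above could work roughly like this sketch, written in Python for illustration only. The setting names and the `load_settings` helper are hypothetical: defaults ship with the connector, the user's file overrides them, and settings the user has never heard of simply keep their defaults.

```python
# Defaults shipped with the connector (setting names are illustrative assumptions).
DEFAULTS = {"strict": False, "verbose": False, "httpbl_key": ""}

def load_settings(user_settings, defaults=DEFAULTS):
    """Merge a user's settings over the defaults.

    Returns (merged, unknown): 'merged' holds a value for every known setting,
    and 'unknown' lists keys in the user's file that are not recognized.
    """
    merged = dict(defaults)
    unknown = sorted(k for k in user_settings if k not in defaults)
    merged.update({k: v for k, v in user_settings.items() if k in defaults})
    return merged, unknown
```

With this shape, a new release can add a key to the defaults without requiring any edit to the user's file.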

The integrated administrative page will be introduced for more platforms. I had originally intended to write this myself for MediaWiki, whose platform connector I maintain, but the lack of adequate developer documentation made it virtually impossible. (The documentation seems to have improved greatly since then, so I’m going to make another attempt at it.) I expect that these pages are going to be highly specific to each platform and that little code can be shared between them. If you maintain a platform connector and need assistance with implementing this, please contact me.

The integrated administrative page will be enhanced to allow more complex searching through the database records. Currently it is not possible to search the records except by manually crafting a URL. In the future the entire database will be searchable and you will be able to mark records and forward them to me for analysis. Due to privacy concerns, records sent to me are kept on encrypted media at all times, used solely for analysis of how to permit or block similar traffic (as appropriate) and destroyed within 90 days. Personally identifying information, if present, is not used. I have done this since the beginning.

The current list of platform connectors needs to be updated; it’s come to my attention that some are out of date or their maintainers have stopped maintaining them. If you are, or want to be, a maintainer for a platform connector, please contact me.

The code which creates the database in a new Bad Behavior installation is currently in the core; however, it properly belongs in the platform connector, since it can vary by platform. For instance, the Drupal module already uses its own code for this, but the WordPress and MediaWiki connectors share the same code. This code will be moved out of the core and split into separate files to facilitate reuse where possible, give a slight performance gain, and enable other platforms to do their own initialization where needed.

I’ve identified several new situations in which it would be useful for Bad Behavior to call back to the platform connector to have the host platform perform some action or another. As a result, the platform connector API, such as it is, will expand. It will remain backward compatible, however, in case some platform does not or cannot implement the complete API.
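One common way to grow a callback API while staying backward compatible is to treat every new callback as optional and fall back to a default when a connector does not provide it. The sketch below shows that pattern in Python; the class names and the `log_request` and `notify_admin` callbacks are hypothetical and are not part of Bad Behavior's actual connector API.

```python
class Core:
    """Stand-in for the core, which calls back into a platform connector."""

    def __init__(self, connector):
        self.connector = connector

    def call_platform(self, name, *args, default=None):
        # Look the callback up by name; older connectors may not define it.
        hook = getattr(self.connector, name, None)
        if callable(hook):
            return hook(*args)
        return default

class OldConnector:
    """Implements only the original callbacks."""

    def log_request(self, ip):
        return f"logged {ip}"

class NewConnector(OldConnector):
    """Adds a newer, optional callback."""

    def notify_admin(self, message):
        return f"notified: {message}"
```

An older connector keeps working unchanged, and the core simply receives the default value for any callback it lacks.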

The porting documentation needs to be greatly reworked and expanded. It doesn’t say much except to look at the existing code and base your work off of it, which is perhaps fine for some experienced programmers, but not for everyone.

Bad Behavior needs to be localized, that is, translated into languages other than English. This is still an open design issue, since each platform handles localization in a completely different manner and requires files containing localized translations to be installed in different places. The most likely solution at this point will involve “language packs” which you will be able to download separately from the core. In addition, people will be needed to help translate Bad Behavior. I will make a separate post when I’m ready to accept translations.

Spam Prevention

The core design change mentioned above, which will allow for improved testing, will also enable some new features which haven’t been implementable before, such as improved whitelisting of search engines. As you may know, Bad Behavior has been using the http:BL service from Project Honey Pot to detect spammers for some time now (if you enabled the feature). The http:BL service also identifies many different search engines and can be used to whitelist them, preventing such issues as the recent blocking of msnbot when it began using a suspicious user-agent string. This feature will be available for testing early in the 2.1 release cycle. The original methods of identifying major search engines will remain in place and be maintained for those who cannot use http:BL.
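As a rough illustration of how an http:BL answer can identify a search engine, the sketch below decodes the four-octet response. The layout used here follows the published http:BL API as best understood at the time of writing (first octet 127, last octet a visitor type where 0 means search engine, and for spammers the middle octets carry record age in days and a threat score); check it against Project Honey Pot's current documentation. The function name and the returned dictionary are assumptions for this example, written in Python rather than Bad Behavior's own language.

```python
def decode_httpbl(response):
    """Decode an http:BL answer such as '127.3.40.4' into a small dict."""
    octets = [int(part) for part in response.split(".")]
    if len(octets) != 4 or octets[0] != 127:
        raise ValueError(f"not an http:BL answer: {response!r}")
    _, second, third, visitor_type = octets
    if visitor_type == 0:
        # For search engines the third octet identifies which engine it is.
        return {"search_engine": True, "engine_id": third}
    return {
        "search_engine": False,
        "days": second,                              # days since last activity
        "threat": third,                             # threat score
        "suspicious": bool(visitor_type & 1),
        "harvester": bool(visitor_type & 2),
        "comment_spammer": bool(visitor_type & 4),
    }
```

A whitelisting step would then let through any request whose decoded answer has `search_engine` set, regardless of its user-agent string.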

Speaking of Project Honey Pot, Bad Behavior will allow you to serve spammers honey pots or QuickLinks provided by the service, so that it can catch even more spammers.

A screener which uses JavaScript and cookies to identify legitimate users has been in Bad Behavior since the initial 2.0 release, but proved difficult to implement, as it required calls into the host platform which weren’t always available or didn’t work as expected. This feature has been disabled for years. I will finally revisit this technique, as I think there’s still some value in this approach.

And of course I will continue to kill spammers as they come across my radar screen.

Other

Bad Behavior’s documentation has always been less thorough than I would like. It will have to be revamped. In addition I will have to keep on top of it by writing documentation for new features as the new features are written, rather than afterward. Documentation will also need to be translated, and I will need your help for that. I will make a separate posting when I am ready to accept translations.

On many platforms, users currently have to download the Bad Behavior core, then the platform connector, and then upload them together on their web site. If not done perfectly, this can result in errors, or a completely broken site. Where possible, I plan to have a build system which, upon each release of the core, combines it with the platform connector for each platform, an optional language pack, as well as files such as the whitelist and settings templates mentioned above, creating a single download. This should make installing and updating the software more convenient and less error-prone for users of affected platforms.
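The combining step could be as simple as the following sketch, which packs the core, one platform connector, and any optional extras into a single zip archive. It is written in Python for illustration; the function name and all file paths are hypothetical, and a real build system would read the files from a source tree rather than from dictionaries.

```python
import io
import zipfile

def build_bundle(core_files, connector_files, extras=None):
    """Combine path -> content mappings into one zip archive, returned as bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Write the groups in a fixed order: core, connector, then extras
        # such as a language pack or the whitelist and settings templates.
        for group in (core_files, connector_files, extras or {}):
            for path, content in group.items():
                archive.writestr(path, content)
    return buffer.getvalue()
```

Running this once per platform on each core release would yield one ready-to-upload download per platform.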

Finally, I made a proposal long ago for Bad Behavior to automatically update itself. This is not appropriate for everyone, of course, but it may be useful for people on platforms which don’t provide update facilities for their plugins/extensions. This is still a post-2.2 change, though I want to do some preliminary work to see if it can be done reliably and what might be necessary to accomplish it.

I’ve also probably forgotten a few things. They’ll be announced when I remember them.

Status

Bad Behavior must continue to keep up with spammers as they attempt to adapt and find new ways to post their automated garbage. Historically, keeping up with the spammers has not been that difficult, as there is only so much the spammers can do while maintaining their high rates of spamming. Today, 100,000 or more spams in a single run is not unusual, and one spammer I’ve blocked can send 1,000,000 in a day. Bad Behavior attempts to drive up the cost of link spamming by blocking as many automated spammy requests as possible, forcing the spammers to resort to MUCH slower manual methods, or ideally, give up and find more honest work.

I believe the proposed changes outlined above will make Bad Behavior a much stronger tool for preventing link spam while at the same time making it more accessible to a wider variety of users and web site platforms.

Only one thing remains, and that is to do the work. As I noted before, Bad Behavior is a user-supported project. If you think this roadmap looks good, and want to accelerate Bad Behavior development, your financial contribution will help ensure that I can devote more time to its development and bring it to fruition much faster. Otherwise, I have to spend my time first on consulting and other work which brings in revenue, and that means it will be much longer before you see these features.

I would estimate that all of the above would take me about six months to complete if it isn’t funded. At the same time I think contributions totaling $500 or more would allow me time to complete the majority of the above within a month. I know that a lot of you are having financial trouble due to the economy; so am I. Even if you are unable to send a contribution, please leave your comments so that I know you support Bad Behavior and wish it to continue.

This is also the time to send in feature requests. If Bad Behavior doesn’t do something you would like it to do, please leave a comment. (And remember that feature requests accompanied by a contribution are more likely to be implemented sooner.) Due to a hard drive crash I’ve lost all email that was sent to me before August of this year, and possibly some more recent email as well. If you have emailed me with a feature request recently, and don’t see it included above, please also leave a comment.

Thank you again for your support, and here’s to a future without spam.

P.S. If anyone knows how to deliver electric shocks over the Internet, please contact me. This could be the ultimate spam-prevention feature. :)

Bad Behavior 2.0.21

August 5th, 2008 by Michael Hampton

Bad Behavior 2.0.21 has been released. It is a maintenance release and is recommended for all users.

MediaWiki and WordPress users should take note of special upgrade instructions below.

Who should upgrade?

Users who receive significant traffic from Ukraine should upgrade to fix an issue which may cause users in Ukraine to be blocked.

All users should upgrade to take advantage of protection from newly identified spambots and malicious bots as well as a new method of spambot detection.

What’s new?

New in this release (since 2.0.20):

  • Users who specified the Ukrainian language in their browser settings were mistakenly blocked. This issue has been fixed.
  • Bad Behavior now incorporates data on harvesters and comment spammers compiled by Project Honey Pot and published through its http:BL service. In order to enable this feature, you must obtain an http:BL access key and provide this key to Bad Behavior in its settings. While the http:BL settings can be fine-tuned to block or allow requests based on the threat level and age of a harvester or comment spammer record, the default settings have been extensively tested and found to block virtually all spammers known to http:BL while allowing all legitimate users, even those that http:BL may have classified as suspicious. This feature obsoletes any other http:BL plugins you may have, and they can be removed.
  • The Majestic-12 search engine crawler was mistakenly blocked. This block has been removed and a block placed for a malicious bot which pretends to be the Majestic-12 crawler.
  • The bot used by Attributor, a service which looks for copyright infringement and sends takedown notices, has been identified and blocked.
  • Several additional spambots have been identified and blocked by user agent.
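The threat-level and age tuning mentioned in the list above amounts to a small decision rule. Here is a minimal sketch in Python; the function name and the threshold values are illustrative assumptions, not Bad Behavior's shipped defaults. It takes the record age, threat score and visitor type already extracted from an http:BL answer, and it lets through visitors flagged only as suspicious, as the release notes describe.

```python
def should_block(days, threat, visitor_type, max_age=30, min_threat=25):
    """Decide whether to block a request based on its http:BL record.

    visitor_type is treated as a bitmask (1 = suspicious, 2 = harvester,
    4 = comment spammer; 0 = search engine), an assumption based on the
    published http:BL API. The thresholds are example values only.
    """
    if visitor_type == 0:
        return False              # search engine: never block here
    if not visitor_type & 0b110:
        return False              # merely suspicious: allow
    # Block only records that are both recent enough and threatening enough.
    return days <= max_age and threat >= min_threat
```

Raising `min_threat` or lowering `max_age` makes the rule more permissive; the opposite makes it stricter.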

Support

If Bad Behavior has helped you, please make a financial contribution toward further development. Your contribution ensures that I can prioritize Bad Behavior development. Otherwise I must spend most of my time on other projects which pay the bills. That is a shame, because I really enjoy making spammers miserable and drying up their revenue streams until it’s more profitable for them to work at McDonald’s than to send spam.

Download

Download Bad Behavior now!

Special Upgrade Instructions

For MediaWiki: Before installing this version of Bad Behavior, manually remove (e.g. using FTP or ssh) any old versions you may have, including the lines added to LocalSettings.php. Then install the new version fresh, following the installation instructions for MediaWiki.

For WordPress: If updating to this version through the automatic updater fails, manually remove (e.g. using FTP or ssh) any old versions you may have installed. Then upload and install the new version fresh, following the installation instructions for WordPress. After doing so, future automatic updates should proceed normally.

For other platforms: No changes to your upgrade procedures should be necessary.

Project Honey Pot and http:BL

April 27th, 2007 by Michael Hampton

Project Honey Pot made several announcements this week, the largest of them Thursday when it announced it had filed a $1 billion lawsuit against spammers on behalf of the members of Project Honey Pot. I’m proud to say I’ve been such a member for some time now, and will lend whatever assistance I can to efforts to stop spam.

Project Honey Pot has been targeting email spam for years. But now it has also quietly launched an initiative to target blog comment spam. I’m proud to say I’m also participating in that effort.

On Wednesday, the project announced http:BL, a DNS-based blacklist of IP addresses which have been seen harvesting email addresses and sending email and comment spam. This is just about exactly what I had in mind when I announced the Bad Behavior Blackhole almost two years ago; Project Honey Pot has actually built something better.
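For readers unfamiliar with DNS-based blacklists, a lookup is an ordinary DNS query for a specially built hostname. The sketch below builds such a hostname for http:BL in Python. The format shown (a per-user access key, then the IPv4 octets in reverse order, then the zone `dnsbl.httpbl.org`) follows the published http:BL API as best understood here and should be verified against Project Honey Pot's documentation; the function name and the sample key are made up.

```python
import ipaddress

def httpbl_query_name(access_key, ip, zone="dnsbl.httpbl.org"):
    """Build the hostname to resolve for an http:BL lookup of an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)   # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    # Unlike a conventional DNSBL query, the name is prefixed with an access key.
    return f"{access_key}.{reversed_octets}.{zone}"
```

For example, `httpbl_query_name("abcdefghijkl", "127.9.1.2")` returns `abcdefghijkl.2.1.9.127.dnsbl.httpbl.org`; resolving that name is left to whatever DNS facility the platform provides.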

I’ve spent the last day or so evaluating http:BL and found that its design is unfortunately not amenable to being added directly into Bad Behavior, as it has significant technical differences from other DNS-based blacklists.

Therefore, I’m writing a separate http:BL plugin for WordPress. I’m currently testing it here and I hope to make the first release in the next few days.

Project Honey Pot relies on webmasters who want to actively participate in stopping spam. But the project has only a few bloggers running honey pots, so it’s not yet catching a lot of comment spam bots.

You can help by signing up for Project Honey Pot and installing a honey pot on your blog, forum or wiki.

Your honey pot, along with millions of others, will trap spambots of all types and feed its data into http:BL, which will improve the service for everyone.