Browser CAPTCHA

Posted about 1 month ago in About This Blog.

I'm sure everyone has noticed that my blog posting has dramatically fallen off from the rate I was getting articles out. Unfortunately, I've been spending my blog time fighting the endless war against spam. I've made some progress there and thought I would share some details that others might find useful.

As I've covered previously, this blog now requires me to approve all comments. I'm super happy with this decision. I approve posts promptly, so there's pretty much no downside for users, and it means you have not seen a single spam message on this site since I made the change. This was literally the perfect solution… on the viewer's side of the fence.

What it didn't fix was the hassle on my side. I don't mind approving messages at all, as long as I have a reasonable pile to go through. However, the spammers have really ramped up their efforts against me lately, and this blog received 11,134 comment posts in the month of November alone. Six of those were legitimate comments. That's well past my definition of reasonable.

To fight back, I've added a new plugin to this blog I call Browser CAPTCHA.

If you've read this blog closely enough to know how much I hate CAPTCHAs, that name probably surprises you. It's true that I believe CAPTCHAs are pure evil. If you feel the need to control what makes it past the server and you think, "I'll screw up my interface to make a human prove they are a human," then I think you may have a problem with your brain being missing. I swear I always need three shots just to get past a Google CAPTCHA and that's the "Don't Be Evil" company. Whatever you do, don't get desperate and hit the audio CAPTCHA button meant for the vision impaired, because that has to be the only thing worse than a normal CAPTCHA. I'm sure the suicide rates for people with vision impairments must be on the rise in this era of site security.

Browser CAPTCHA doesn't do any of that. Instead, I took a page out of Sun Tzu's The Art of War and got to know my enemy a bit better. Spam bots are not browsers and they do some things differently. If you can detect those differences, you know you are not dealing with a human. So instead of screwing up the interface for you, my plugin screws it up for your browser. If the browser can pass the test, I trust the post.

What are some differences between browsers and spam bots? Here's a list shared with me by Allan Odgaard:

  • Spam bots don't have a JavaScript engine. This is the big deal. It seems universally true, so it's definitely a key to detecting them. Force them into needing JavaScript to pass some test and you've got them.
  • Spam bots don't typically pay attention to cookies. This turns out to be a handy performance detail, since you can use mod_rewrite to redirect incoming requests for certain URLs when they are missing a magic cookie, before they even reach your application.
  • Spam bots don't correctly handle redirects for POST requests. You can use this to add another layer of protection.
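
To make the first point a little more concrete, here is a minimal Python sketch of one way a server could issue a test that a script in the page has to answer. This is not the plugin's actual code. The names `SECRET`, `issue_challenge`, and `verify_answer` are hypothetical, and the addition "puzzle" is only a stand-in for whatever work the page's JavaScript would do before submitting the comment.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real deployment would load it from config.
SECRET = secrets.token_bytes(32)


def _sign(text: str) -> str:
    """HMAC-SHA256 signature of the puzzle, as a hex string."""
    return hmac.new(SECRET, text.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_challenge() -> dict:
    """Create a small puzzle to embed in the page for the script to solve."""
    a = secrets.randbelow(1000)
    b = secrets.randbelow(1000)
    # Signing the puzzle lets the server check it later without storing it.
    return {"a": a, "b": b, "sig": _sign(f"{a}:{b}")}


def verify_answer(a: int, b: int, sig: str, answer: str) -> bool:
    """Accept only an untampered puzzle with the answer the script should send."""
    expected_sig = _sign(f"{a}:{b}")
    if not hmac.compare_digest(sig.encode("utf-8"), expected_sig.encode("utf-8")):
        return False  # the puzzle was forged or altered
    return answer == str(a + b)
```

A bot that never runs the page's script never submits the answer field, so `verify_answer` rejects it. A puzzle this simple could be solved by a bot written specifically for the site, so treat it as an illustration of the idea and not as a hardened defense.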

The current version of Browser CAPTCHA uses these combined factors to test browsers when they try to post a comment. There are other differences my friends have made me aware of, but I haven't employed them yet.
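
As a rough illustration of how those factors could be layered, the Python sketch below decides what to do with an incoming comment request. It is an assumption about one possible design, not a description of the plugin: the cookie name, both URLs, and the `gate_comment` function are all hypothetical. The one fixed point is HTTP status 307, which tells a client to repeat the same POST, body included, at the new location.

```python
from typing import Mapping, Tuple

COOKIE_NAME = "browser_check"        # hypothetical cookie set by the page's script
CHALLENGE_URL = "/challenge"         # hypothetical page that runs the script
CONFIRM_URL = "/comments/confirm"    # hypothetical second hop for the POST


def gate_comment(
    method: str,
    path: str,
    cookies: Mapping[str, str],
    expected_cookie: str,
) -> Tuple[int, str]:
    """Return (HTTP status, redirect target or verdict) for a comment request."""
    if method != "POST":
        return (405, "")
    # Layer 1: no magic cookie means the client never ran the script
    # or ignores cookies, so send it back to the challenge page.
    if cookies.get(COOKIE_NAME) != expected_cookie:
        return (302, CHALLENGE_URL)
    # Layer 2: bounce the POST once. A real browser resubmits it to the
    # new URL; a bot that mishandles redirects never arrives there.
    if path != CONFIRM_URL:
        return (307, CONFIRM_URL)
    return (200, "accept")
```

For example, `gate_comment("POST", "/comments", {}, "tok")` returns `(302, "/challenge")`, while the same request carrying the right cookie is bounced once to `/comments/confirm` and accepted when it shows up there.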

How's this working out? I've had seven spam posts since I made the change a little over two full days ago. They all came in together and I could tell it was a human investigating the changes I had made. If that's the worst thing I have to worry about now, it's a huge improvement. We will see how things go, but for now I definitely recommend similar strategies to others fighting in the war…

James Edward Gray II added about 12 hours later:

Allan Odgaard raised another great point to me today: it can be worth it to check against DNS blacklists as well. There is a Ruby script for doing that.

You can get some false positives this way, if an IP address changes hands shortly after it was used for spamming. It's a pretty uncommon occurrence though.
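
For readers who have not seen a DNS blacklist lookup before, here is a minimal Python sketch of the usual convention. It is not the Ruby script mentioned above. The function names are hypothetical and the zone is whichever blacklist you choose to query. The convention itself is simple: reverse the IPv4 octets, append the blacklist's zone, and do an ordinary DNS lookup. A name that resolves means the address is listed.

```python
import ipaddress
import socket
from typing import Callable


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the lookup name: IPv4 octets reversed, then the blacklist zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError for anything but IPv4
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(
    ip: str,
    zone: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> bool:
    """True if the blacklist answers for this IP, False if the name does not resolve."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No such name (or the lookup failed): treat the address as not listed.
        return False
    return True
```

The `resolve` parameter exists only so the lookup can be swapped out in tests. With the default, `is_listed` performs a real DNS query against the zone you pass in.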

Antares Trader added 1 day later:

I'm actually writing this comment as much to see if I make it through the spam filter as for any legitimate reason. However, I do have a question.

I have been told that when writing a web site I should always have a solution available for clients who lack JavaScript capabilities. This makes sense on a lot of levels. If you want to get your site noticed, your most important visitor is the GoogleBot, and it doesn't have JavaScript. Screen readers are another place where JavaScript is spotty at best. Some strange people still block JavaScript on security grounds.

But here we are talking about posting to a blog, an activity which doesn't exactly merit a huge amount of availability considerations. I would be interested to hear how you balanced the JavaScript-enabled requirement in particular against these factors.

I'm glad you are back posting again.

AT

James Edward Gray II added 1 day later:

Does the GoogleBot need to be submitting forms to reach the public portions of your site? I hope not, because it doesn't do that either, never mind the lack of a JavaScript engine.

I seriously hope the current solution isn't hindering screen reading devices. I can see how some anti-spam techniques would, but I'm betting the ones I am using are not. I will take complaints filed against this very seriously of course.

My site's JavaScript is viewable to any user who cares to read it, so my opinion is that I'm not risking your security. If you disagree with that analysis or just don't care to take the time to double-check the security, I completely understand your choice. And that choice is going to cost you the right to post comments here. I don't really feel that's a radical limitation in this, The Age of Ajax.

I'm not trying to belittle your concerns. Obviously, I want everyone to be able to read this site. I'm pretty sure they can. I asked Google and he says he can. That's important to me.

I also like allowing comments and I don't want to make you login just to post them. I can do that for most folks with a little browser interrogation and save myself from checking over 11,000 junk messages a month. That's good enough for me.

If that bothers a lot of readers, I could shut off comments altogether. That treats everyone fairly and still allows me to enjoy posting content here. That's just not my first choice.

So yeah, I agree that you need to make your site accessible. But like everything else it's a balancing act. I also needed my blog to be more maintainable. I had to balance those two factors.

Joseph Pecoraro added 3 days later:

You could throw in a noscript tag to warn users who don't have JavaScript enabled. I personally like the approach you've taken.

James Edward Gray II added 3 days later:

I do provide a warning for those who have JavaScript disabled, yes.
