Fight spam and read books

Well, I fixed my spam problems, it seems. I am now using CAPTCHAs on blog comments. A CAPTCHA is a way of checking whether the person accessing a web page is a "real" person by asking them to do something that computers find hard to do. Traditionally, this has involved asking them to type out a word shown in a picture, because computers have always had trouble with image processing. However, software has become better at reading images, and this approach has started to fail. Some other ways of determining whether the user is a real person have been suggested:

In order to prove your authenticity, please provide the answer to the following formula: [image: formula]

And then there's:

[image: a new captcha approach]

I am using neither of these methods, unfortunately. Brad pointed out ReCAPTCHA to me, which is now the recommended implementation of the CAPTCHA system. As described on their page, people perform word recognition all the time when they answer CAPTCHAs, and ReCAPTCHA uses this effort to help scan the world's library archives into digital format. When pages of books are scanned in, the software can't always work out what the words are supposed to be, so these words get used in CAPTCHAs, and we let the people of the world work out what they are. If you're wondering how unknown words can be used in a CAPTCHA, go and read the link above.
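For the impatient, here is a rough Python sketch of the idea as I understand it from their description: each challenge pairs a "control" word whose answer is already known with a word the scanning software couldn't read. Only the control word decides whether you pass, and if you get it right, your reading of the other word is counted as a vote. The class name, method names and the `votes_needed` threshold are all made up for illustration; this is not ReCAPTCHA's actual code or API.

```python
from collections import Counter, defaultdict


class TwoWordCaptcha:
    """Toy model: a known control word paired with an unknown scanned word."""

    def __init__(self, votes_needed=3):
        # Hypothetical threshold: how many matching votes settle a word.
        self.votes_needed = votes_needed
        # Maps an unknown word's id to a tally of the readings people typed.
        self.votes = defaultdict(Counter)

    @staticmethod
    def _normalise(text):
        return text.strip().lower()

    def check(self, control_answer, control_guess, unknown_id, unknown_guess):
        """Return True if the user passes; record their vote if they do."""
        # Only the control word decides whether the user is let through.
        passed = self._normalise(control_guess) == self._normalise(control_answer)
        if passed:
            # Someone who read the control word correctly is assumed to have
            # made an honest attempt at the unknown word as well.
            self.votes[unknown_id][self._normalise(unknown_guess)] += 1
        return passed

    def resolved(self, unknown_id):
        """Return the agreed reading, or None if there is not enough agreement."""
        tally = self.votes[unknown_id]
        if not tally:
            return None
        word, count = tally.most_common(1)[0]
        return word if count >= self.votes_needed else None
```

The user never knows which of the two words is the control, so a spam bot can't get through by guessing at only one of them.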

Anyway, the point is, we're helping to digitize humanity's knowledge, and fighting spam at the same time. It's like killing two birds with one stone. I notice that Facebook also uses ReCAPTCHA in its sign-up form. I think it's awesome.

Please let me know if there are any issues using the new CAPTCHAs when submitting comments.

Update: More captcha amusements and yet more.

Drupal anti-spam

Lazyweb, O, lazyweb, I call out to thee in my hour of need. I installed the spam and trackback modules for Drupal, and to the outside observer, my blog is nicely spam-free. However, I get about fifty spam comments and spam trackbacks a day, which get trapped in the approval queue, and I have to manually wade through Cialis and porn adverts/links to see if there are any real comments/trackbacks for any of my posts.

Depressingly, there generally aren't.

What's the best way to keep one's comments and trackbacks spam-free, without having to manually delete every single dodgy one, and without getting any false positives?

A side note is that the trackback module isn't great - if I want to send a trackback, I have to manually find the trackback URL and put it in the little textbox. Isn't there a nice Drupal module that checks all outgoing URLs, autodiscovers the trackbacks, and pings them? The trackback module that I have installed seems to think that this is what it does, but it has delusions of grandeur, in my opinion.
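To be clear about what I mean by autodiscovery: blogs that accept trackbacks usually embed a little chunk of RDF in each post's HTML, with the ping address in a `trackback:ping` attribute. Something along these lines would pull those addresses out of a page that has already been fetched. It's a Python sketch rather than Drupal code, the function name is my own invention, and it cuts a corner by returning every ping URL on the page instead of matching each one to the post's permalink.

```python
import re
from html import unescape

# The ping address lives in an attribute like trackback:ping="http://..."
# inside the embedded RDF block.
PING_RE = re.compile(r'trackback:ping\s*=\s*"([^"]+)"')


def discover_trackback_urls(page_html):
    """Return the trackback ping URLs advertised in a page, in page order."""
    urls = []
    for match in PING_RE.finditer(page_html):
        # Attribute values may contain entities such as &amp;
        url = unescape(match.group(1))
        if url not in urls:
            urls.append(url)
    return urls
```

A module doing this properly would run it over every outgoing link in a new post and then send a ping to each address it finds.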
