Using Image Spam Techniques For Captchas.
Published on:
1179624185
Spammers get better every day at finding new ways to bypass OCR. I read a lot about image spammers last month and discovered how they do it: they have a good understanding of how OCR systems work and how to defeat them. Anti-spam vendors responded with "fuzzy signature" technologies, which filter out images whose signatures nearly match those of images already classified as spam. The spammers then developed other techniques, with the result that signature-based detection is bypassed for now.
But then I got an idea: can we use these techniques to defeat blog and forum spammers by building CAPTCHAs with them?
New image spam comes in various flavors. To sum up a few:
1. Word splitting: insert random white lines in the text
2. Geometric variance: line differences and RGB contrast
3. Speckling: confetti in and around the text
4. Word salad: mix up words in a clever way
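To make two of these flavors concrete, here is a minimal sketch of word splitting and speckling. It is only an illustration, not how any spam tool actually does it: it assumes the text has already been rendered into a small black-and-white bitmap (a list of rows, 1 for ink and 0 for white), and the names GLYPH_T, split_with_white_lines, speckle and show are made up for this example.

```python
import random

# Assumption: a bitmap is a list of rows; 1 = ink (text), 0 = white background.
# GLYPH_T is a hypothetical 5x5 letter "T" standing in for rendered text.
GLYPH_T = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]


def split_with_white_lines(bitmap, n_lines, rng):
    """Word splitting: insert n_lines all-white rows at random heights."""
    width = len(bitmap[0])
    out = [row[:] for row in bitmap]  # copy so the input stays untouched
    for _ in range(n_lines):
        # Positions 1..len-1 keep every white line strictly inside the text.
        pos = rng.randint(1, len(out) - 1)
        out.insert(pos, [0] * width)
    return out


def speckle(bitmap, density, rng):
    """Speckling: turn each white pixel into ink with probability `density`."""
    return [
        [1 if px or rng.random() < density else 0 for px in row]
        for row in bitmap
    ]


def show(bitmap):
    """Render the bitmap as text: '#' for ink, '.' for white."""
    return "\n".join("".join("#" if px else "." for px in row) for row in bitmap)


rng = random.Random()  # unseeded, so every run gives a different image
distorted = speckle(split_with_white_lines(GLYPH_T, 2, rng), 0.15, rng)
print(show(distorted))
```

Running it a few times prints a different broken-up, confetti-covered "T" each time, while the original ink pixels are all still there for a human to read.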
Now, remember that most blog and forum spam also relies on OCR. So can we turn the spammers' image techniques against blog and forum spammers and generate CAPTCHAs from them? What I found out, and what is key to successful image spam, is randomization. Every CAPTCHA should be different each time it is presented to the user. This is easy to accomplish: we can randomize the techniques themselves, such as word salad, speckling, and word splitting, and generate the images on the fly.
In other words:
If we build a CAPTCHA that mixes all these variants and draws a random set of letters and numbers, we can use the spammers' techniques against them. The key is randomization.
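As a rough sketch of what "key is randomization" could look like in practice, the snippet below draws a fresh answer and a fresh mix of techniques for every presentation. Everything here is an assumption for illustration: the names ALPHABET, TECHNIQUES and new_challenge are hypothetical, the "strength" values are arbitrary numbers a renderer might interpret, and the actual image renderer is left out.

```python
import random
import secrets
import string

# Assumption: the challenge is drawn from uppercase letters and digits.
ALPHABET = string.ascii_uppercase + string.digits

# The four flavors listed earlier, as labels a (hypothetical) renderer would act on.
TECHNIQUES = ("word_splitting", "geometric_variance", "speckling", "word_salad")


def new_challenge(length=6):
    """Return (answer, recipe) for a single CAPTCHA presentation.

    answer: the random characters the user has to type.
    recipe: maps each chosen technique to a random strength in [0, 1).
    """
    # secrets draws from the operating system's CSPRNG, so answers are hard to predict.
    answer = "".join(secrets.choice(ALPHABET) for _ in range(length))

    rng = random.SystemRandom()  # also backed by the OS random source
    # Pick at least two techniques so that no single filter covers every image.
    chosen = rng.sample(TECHNIQUES, k=rng.randint(2, len(TECHNIQUES)))
    recipe = {name: rng.random() for name in chosen}
    return answer, recipe


# Each call gives a new answer and a new mix of distortions.
for _ in range(3):
    print(new_challenge())
```

The answer would stay on the server (for example in the session) and only the rendered image would go to the visitor. Because both the characters and the distortion mix change on every request, there is nothing stable for a signature to match.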
I'm interested in what you think about it.