Yahoo! CAPTCHA Cracked.

Published on: 1201586673 (Unix timestamp)
It has been suggested before that it was only a matter of time, but now it seems official: the Yahoo! CAPTCHA is no more. A team of Russian hackers has found a way to read the CAPTCHA with 35% accuracy. Let there be no mistake: the CAPTCHA that Yahoo! deploys is believed to be one of the most difficult CAPTCHAs to crack. It uses warped alphanumeric characters and other features you might expect from a strong CAPTCHA, yet it is still easy for humans to solve. I think this is a great leap in character recognition and a death blow to the Completely Automated Public Turing test to tell Computers and Humans Apart. I have little faith in CAPTCHAs these days, since there will always be a way to compute something that is supposed to require human interaction, whether it be image CAPTCHAs, audio ones, or simple JavaScript-based CAPTCHAs.

The Russian hackers had this to say about the Yahoo! CAPTCHA:

"The CAPTCHA has a vulnerability we'll discuss later. It's not necessary to achieve high degree of accuracy when designing automated recognition software. The accuracy of 15% is enough when attacker is able to run 100.000 tries per day, taking into the consideration the price of not automated recognition – one cent per one CAPTCHA." That seems a plausible conclusion. The researchers can be contacted at this address: NetworkSecurityResearch[at]gmail[dot]com. The released software package reveals some of the techniques used; the implementation of the Yahoo! CAPTCHA recognition engine can be found here:

http://rapidshare.com/files/84243632/YahooCAPTCHARecognition.rar.html

If it's gone for some reason, let me know and I'll send you my copy.
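The researchers' claim quoted above, that 15% accuracy is enough, comes down to simple arithmetic. Here is a rough sketch in Python, using only the figures from the quote (100,000 tries per day, one cent per manually solved CAPTCHA) plus the 35% accuracy reported for the released engine. The function names are made up for illustration, and reading the one-cent figure as the price of paying a human solver is my interpretation of the quote.

```python
def solved_per_day(accuracy_percent: int, tries_per_day: int) -> int:
    """Expected number of correctly recognized CAPTCHAs per day."""
    return accuracy_percent * tries_per_day // 100


def human_cost_cents(solved: int, cents_per_captcha: int = 1) -> int:
    """What the same number of solutions would cost if humans were paid per CAPTCHA."""
    return solved * cents_per_captcha


TRIES_PER_DAY = 100_000  # figure taken from the researchers' quote

for accuracy in (15, 35):
    solved = solved_per_day(accuracy, TRIES_PER_DAY)
    dollars = human_cost_cents(solved) / 100
    print(f"{accuracy}% accuracy: {solved} solved per day, "
          f"equivalent to ${dollars:.2f} of manual solving")
```

At 15% accuracy that is 15,000 solved CAPTCHAs per day, which would cost about $150 if bought from human solvers. A failed attempt costs the attacker next to nothing, so a low hit rate is not much of an obstacle.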

The first project (the server) needs the MATLAB 2007a Compiler Runtime (MCR) installed. It waits for a connection, receives a CAPTCHA image, and sends the recognized text string back to the client. The client reads the JPG files in the test1 directory and sends them one by one to the server, which runs on the same machine.
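To make the client/server split a bit more concrete, here is a minimal Python sketch of what such a client could look like. This is not the code from the package: the wire format (a 4-byte length prefix before each message), the port number 5000, and all function names are assumptions of mine, since the actual protocol is not described here. Only the test1 directory and the "send an image, get a text string back" flow come from the description above.

```python
import socket
import struct
from pathlib import Path


def recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf += chunk
    return buf


def send_frame(conn: socket.socket, payload: bytes) -> None:
    """Send one message with an assumed 4-byte big-endian length prefix."""
    conn.sendall(struct.pack("!I", len(payload)) + payload)


def recv_frame(conn: socket.socket) -> bytes:
    """Receive one length-prefixed message (same assumed framing)."""
    (length,) = struct.unpack("!I", recv_exact(conn, 4))
    return recv_exact(conn, length)


def recognize(image_bytes: bytes, host: str = "127.0.0.1", port: int = 5000) -> str:
    """Send one CAPTCHA image to the server and return the recognized text."""
    with socket.create_connection((host, port)) as conn:
        send_frame(conn, image_bytes)
        return recv_frame(conn).decode("ascii")


def recognize_directory(directory: str = "test1",
                        host: str = "127.0.0.1",
                        port: int = 5000) -> dict:
    """Send every .jpg in the directory, one by one, to a server on this machine."""
    return {
        path.name: recognize(path.read_bytes(), host, port)
        for path in sorted(Path(directory).glob("*.jpg"))
    }
```

With a compatible server listening locally, `recognize_directory("test1")` would return a mapping from file name to recognized text. The real client may well frame its messages differently.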

About developing a good CAPTCHA

I get a lot of questions about solving the CAPTCHA problem, along with ideas for new methods. But be warned: writing a good CAPTCHA that is easy for humans to understand and hard to compute automatically is harder than it seems.
Jeremiah Grossman wrote a CAPTCHA Effectiveness Test, which shows the pitfalls of writing a good CAPTCHA:

1. Test should be administered where the human and the server are remote over the network.
2. Test should be simple for humans to pass. Humans should fail less than 0.1% on the first attempt.
3. Test should be solvable by humans in a few seconds.
4. Test should only be solvable by the human to which it was presented.
5. Test should be hard for computer