Locked in a cat-and-mouse game with spammers who use bots to defeat anti-fraud mechanisms and create fake accounts, Google today announced a deal to acquire reCAPTCHA, a company that provides those squiggly words at login screens.
The ReCAPTCHA deal isn’t exactly a security transaction. Strategically, it gives Google an excellent crowd-sourcing tool to beef up its already impressive machine-vision algorithms (think book-scanning and maps) but, in the long run, the ability to use CAPTCHAs that are near-impossible for bots to decipher allows Google to raise the bar significantly in the fight against bots and spam.
According to Adam O’Donnell, director of emerging technologies at anti-spam firm Cloudmark, believes this is a very smart purchase by Google.
“Google already has the best computer-vision techniques. The way ReCAPTCHA works, this means that Google will only be presenting CAPTCHA words that are very difficult for a bot to defeat,” O’Donnell explained.
The words presented by the ReCAPTCHA service come from scanned printed material (archival newspapers and old books). As Google explains here, computers find it hard to recognize these words because the ink and paper have degraded over time, but by typing them in as a CAPTCHA, crowds teach computers to read the scanned text.
In this way, reCAPTCHA’s unique technology improves the process that converts scanned images into plain text, known as Optical Character Recognition (OCR). This technology also powers large scale text scanning projects like Google Books and Google News Archive Search. Having the text version of documents is important because plain text can be searched, easily rendered on mobile devices and displayed to visually impaired users. So we’ll be applying the technology within Google not only to increase fraud and spam protection for Google products but also to improve our books and newspaper scanning process.
Earlier this year, Researchers at Google recently released a paper detailing a new CAPTCHA system consisting of correct image rotation (Socially Adjusted CAPTCHAs) whose main purpose is to make it easier for humans, and much harder for bots to recognize them.