Google, Facebook CAPTCHAs Beat By Bot - InformationWeek

10:06 AM

Google, Facebook CAPTCHAs Beat By Bot

Improvements in computer vision and machine learning are making it harder for companies to defend against automated attacks.

The term CAPTCHA, proposed by computer science researchers in 2000, stands for "Completely Automated Public Turing test to tell Computers and Humans Apart."

The Turing test, formulated by Alan Turing in 1950, attempts to evaluate whether a computer's conversational responses to questions can be distinguished from a person's answers.

A CAPTCHA represents a reverse Turing test because it asks a computer rather than a person to identify whether the respondent is human or machine. The roles are reversed.

Such tests are themselves being tested by recent advances in computer vision and machine learning. As Google recently demonstrated with AlphaGo, its Go-playing deep learning system, computer thinking rivals human thinking in a growing number of areas. And as the line between human and machine intelligence blurs, CAPTCHAs fail.

In a presentation at Black Hat Asia last month, Columbia University researchers Iasonas (Jason) Polakis and Suphannee Sivakorn described work they'd done with associate professor Angelos Keromytis to create an automated system to bypass CAPTCHAs used by Google and Facebook. The two companies, like others, rely on CAPTCHAs as a means of limiting fake account creation, spam posts, and other abuse of online services.

"Our system is extremely effective, automatically solving 70.78% of the image reCAPTCHA challenges, while requiring only 19 seconds per challenge," the researchers explain in a paper documenting their work. "We also apply our attack to the Facebook image CAPTCHA and achieve an accuracy of 83.5%."

Attacks on CAPTCHA systems have been an issue for years. When Google bought reCAPTCHA in 2009, the company hoped the technology would give it an edge in combating automated fraud. And for a time, it did.

But machines keep getting smarter. Advances in optical character recognition have made text-based CAPTCHAs all but unusable. CAPTCHA letters now have to be so distorted to fool the machines that people can't read them either. Google's latest iteration of reCAPTCHA alludes to the illegibility of recent CAPTCHA challenges in its marketing tagline: "Tough on bots, Easy on humans."

The researchers disclosed their work to Google, which has already implemented some counter-measures, and to Facebook. 

"We're regularly in touch with the security research community and we appreciate their contributions to the safety of reCAPTCHA and other Google products," a Google spokesperson said in an emailed statement. "The Columbia University researchers notified us about this issue in May 2015 and we've since strengthened reCAPTCHA's protections based on their findings and our own studies."

In an email, Polakis described CAPTCHA research as an example of a security arms race in which defenses get compromised and then hardened, only to be overcome again.

But in the past few years, Polakis said, advances in generic solvers against text CAPTCHAs have made distorted text challenges obsolete. Researchers have turned to more advanced tasks like extracting semantic information from images, as Google has done with its most recent reCAPTCHA system.

Google also looks at browser settings and user-agent data to determine the kind of CAPTCHA challenge it should present. These observable characteristics may help companies keep one step ahead of the bots, for a while at least.

Polakis said that the novel attacks he and his colleagues have developed underscore the difficulty facing those trying to design functional CAPTCHAs.

"We believe that the capabilities of computer vision and machine learning have finally reached the point where expectations of automatically distinguishing between humans and bots with existing CAPTCHA schemes, without excluding a considerable number of legitimate users in the process, seem unrealistic," Polakis wrote. "As these capabilities can only improve, it will become even more difficult to devise CAPTCHAs that can withstand automated attacks."

Polakis is careful not to declare "game over," noting that CAPTCHAs remain an open problem and that alternative directions in research may allow CAPTCHAs to survive.

Google has revised its approach to CAPTCHAs numerous times over the past seven years. But there may come a point when machines become so perceptive and analytically capable that bots will be able to pass any automated test we devise.

(Editor's Note: Black Hat Asia is produced by UBM, InformationWeek's parent company.)

Thomas Claburn has been writing about business and technology since 1996, for publications such as New Architect, PC Computing, InformationWeek, Salon, Wired, and Ziff Davis Smart Business. Before that, he worked in film and television, having earned a not particularly useful ...

4/9/2016 | 4:31:37 PM
I'm guessing that in the future, that will be the only available option, but it is only a matter of time before they're cracked as well. And if you have ever seen the old "Wonder Woman" television series starring Lynda Carter, or Disney's "Peter Pan," then you have seen the future of voice recognition technology. Real people will probably never imitate the voices and speech patterns of people they are not identical twins with that well, but I can easily imagine programming a computer to do it.
4/9/2016 | 4:13:29 PM
Re: Good article
Apparently, I failed to notice that option. Thanks.
4/9/2016 | 4:10:00 PM
Re: Good article
People with limited vision can select the SOUND choice and the characters are spoken out loud.


I don't like CAPTCHA either, but it served its purpose
4/9/2016 | 1:55:09 PM
Good article
I've never liked CAPTCHAs (how do blind people deal with them?) but now I understand why they're getting harder and harder (I should have guessed).  At this point, they're almost more trouble than they're worth.  I fully expect them to be obsolete in no more than five years (perhaps as few as one).

Maybe next, we get to improve voice recognition/imitation technology in the same fashion.
4/9/2016 | 10:22:45 AM
Security Arms race very much it is...
I thoroughly enjoyed reading the article & research paper as well because it was so relatable.

Going ahead, getting MFA in place accurately and efficiently is going to be the key as the IoT Wave kicks into high gear.

Especially when Enterprises wish to analyse and accordingly calibrate products focussed on Consumers.

None of this is going to be easy to implement or even monitor and I daresay we might end up outsourcing all this work to the Bots all over again!!!


Good Blog!
