Search Engines Find Stolen Identities - InformationWeek



The prevalence of identity theft, and the fact that search engines provide links to pages containing personal data, are fueling a debate over whether search-engine companies should be working harder to address privacy issues and whether they should be held accountable for privacy violations.

During the first six months of 2005, more than 50 million identities were lost or stolen in a series of high-profile data breaches across the United States. Thanks to search engines, many can be easily found.

For example, fed a few abbreviations associated with personal and financial information, a Google search returns links to a wide variety of Web sites. Most are harmless. A few, however, lead to page upon page of sensitive personal information, including Social Security numbers, credit-card numbers, dates of birth, driver's license numbers, bank-account numbers, logon names, and passwords.

One such link points to a Russian carding site. Carding is another term for phishing, the practice of soliciting personal information online under false pretenses, usually with the intent to defraud. The site advertises itself as "forums for E-commerce."

The summary text of the site, as listed on Google's search-results page, reveals the presence of Social Security numbers. But clicking on the corresponding link leads to a logon page without such information. This may be the result of action by Google or by the site owner.

The cached link, however, still contains the sensitive information hinted at by the summary text on the search-results page. It opens a snapshot of the site's forums from Feb. 21, 2005. Posted there in a forum thread, amid Cyrillic characters, are a handful of names, addresses, Social Security numbers, and credit card numbers, in the Roman alphabet, of individuals from California, Georgia, Florida, Massachusetts, Ohio, and Oregon.

A similar query leads to a Vietnamese Web forum with the personal information of three Americans. Another link leads to a cached page with the personal information of several dozen other individuals from various countries.

Google, of course, is not the only search engine that can find stolen identities. Using the same search string and English-language restriction that returned 127 results when entered in Google, Ask Jeeves delivered only 11 results. Nonetheless, several links led to personal information, including a text file with 47 pages of credit-card numbers, Social Security numbers, and other personal data. Yahoo returned 42 results, including an Arabic site with personal information posted in Roman characters and a cached Unix-oriented site with personal information. MSN Search returned 401 results, but most of the summary-result information was unreadable -- a majority of the links pointed to non-HTML file formats that present garbled text when accessed with a Web browser.

Neither Ask Jeeves, MSN Search, nor Yahoo returned a link to the Russian carding site. Searching for a specific Social Security number found on the Russian site produced a single Google result and no results using the other three search engines. This could be the result of less effective indexing, more effective index policing, or other factors.

Florida resident Marjorie Roberson is among those whose personal information can be found on the Russian site. She believes her information was stolen last year when she responded to a phishing E-mail that she took to be an official message from America Online. The information on the carding site supports that view: Many of the field identifiers preceding the posted data begin with the letters "Aol."

Roberson is surprised that her information is still available online. "I contacted Google and told them this information was obtained illegally," she says, adding that she has also complained to the Federal Trade Commission and notified the major credit bureaus to place fraud alerts on her credit files.

"The problem I keep having is people keep trying to open these accounts," Roberson explains. Just last week, she says, someone tried to write a check on her bank account using a generic corporate check from Office Depot. While she hasn't suffered any financial loss, being a victim of identity theft has cost her in terms of time and aggravation.

"There ought to be some type of protection," Roberson says.

There are laws designed to prevent, deter, and punish identity theft, but mostly they provide recourse after the fact. They haven't reduced the incidence of identity theft, which has been rising steadily for the past three years. According to a February report from the FTC, identity theft and related fraud complaints numbered 403,688 in 2002, 542,378 in 2003, and 635,173 in 2004.
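
The FTC figures imply that complaints kept climbing but that the pace of growth slowed from 2003 to 2004. A minimal sketch of that arithmetic follows; the function name `yearly_growth` is illustrative and does not come from the FTC report.

```python
# FTC identity theft and related fraud complaint counts quoted in the article.
complaints = {2002: 403_688, 2003: 542_378, 2004: 635_173}


def yearly_growth(counts):
    """Return {year: percent change from the previous year}."""
    years = sorted(counts)
    return {
        year: (counts[year] - counts[prev]) / counts[prev] * 100
        for prev, year in zip(years, years[1:])
    }


growth = yearly_growth(complaints)
for year, pct in growth.items():
    print(f"{year}: {pct:+.1f}% vs. prior year")
```

By this calculation, complaints rose roughly 34% in 2003 and roughly 17% in 2004.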

While a number of new federal laws dealing with identity theft are in the works, attorney and author Mari J. Frank says these laws will likely pre-empt stronger state laws and prohibit any private right of action. That means individuals wouldn't be allowed to sue companies for negligence that might have led to identity theft. Instead, they would have to rely on the FTC or some other government agency to initiate legal action.

"That means there's no enforcement because these government agencies don't have the money or resources to really go after [identity thieves]," Frank says. "There are so many of these cases that law enforcement only investigates about 10%. And of those 10%, they only really prosecute about 10% of those."

California has a law that prohibits individuals and nongovernmental entities from posting or publicly displaying Social Security numbers. It also prohibits transmitting Social Security numbers over the Internet unless the number is encrypted or the connection is secure.

But legal experts say this doesn't apply to search engines. "Search engines point to information that other people are making available," says Deirdre K. Mulligan, director of the Samuelson Law, Technology & Public Policy Clinic and professor at UC Berkeley School of Law. "I think it would be problematic to hold them accountable."

Indeed, there are compelling arguments for differentiating between those who publish information and those who distribute it, like Internet service providers and search engines, in terms of liability. In cases of copyrighted material, the Digital Millennium Copyright Act recognizes that distinction "in circumstances where the [service] provider merely acts as a data conduit, transmitting digital information from one point on a network to another at someone else's request."

But search engines, by their own admission, are more than mere data conduits.

On its Web site, Google says, "We're committed to providing thorough and unbiased search results for our users; therefore, we cannot participate in the practice of censorship. We stop indexing pages of a site only at the request of the webmaster who's responsible for those pages, when it's spamming our index, or as required by law. This policy is necessary to ensure that pages aren't inappropriately removed from our index."

In other words, Google does practice censorship, though only in a very narrow range of cases: when a site spams its index or when the law compels it.

Other search engines acknowledge being even more hands-on. Ask Jeeves, for example, blocks taboo sexual queries and analyzes Web pages to determine if they are objectionable. "Ask Jeeves is dedicated to providing its users easy access to the vast amount of information on the Web," the company said in a statement. "With that said, the very nature of the Web -- allowing the free exchange of ideas and information -- does provide opportunities for individuals to post inappropriate, sensitive and, in some cases, illegal information. Ask Jeeves employs sophisticated algorithms, filters, and query-side and document-side schemas to identify objectionable content and to determine how content, if any, is presented to users."
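
Ask Jeeves' statement does not say how its document-side filters work. Purely as an illustration, and not as a description of any search engine's actual method, a very simple filter might flag pages containing text shaped like a Social Security number or a credit-card number that passes the standard Luhn checksum. The names `SSN_PATTERN`, `CARD_PATTERN`, `luhn_valid`, and `looks_sensitive` below are hypothetical.

```python
import re

# Hypothetical patterns: text shaped like NNN-NN-NNNN, and runs of
# 13 to 16 digits optionally separated by spaces or hyphens.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def looks_sensitive(text: str) -> bool:
    """Flag text with an SSN-shaped string or a Luhn-valid card-like number."""
    if SSN_PATTERN.search(text):
        return True
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            return True
    return False


# 4111 1111 1111 1111 is a widely used test number, not a real account.
print(looks_sensitive("card: 4111 1111 1111 1111"))
print(looks_sensitive("Call 555-1234 for details"))
```

A pattern match of this kind would only be a first pass. It cannot tell a stolen number from a sample or test value, so any real system would presumably need further review before blocking a page.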

Those whose personal information can be easily found on the Internet would no doubt appreciate greater efforts by search companies to prevent sensitive personal information from being indexed. In the meantime, Frank suggests that search companies offer a clear way to have Web pages with personal information removed from their indexes.

Search companies have policies and procedures for blocking sexual content, and reporting spam and copyright violations, but dealing with privacy violations doesn't appear to be a priority. True, they have privacy policies, but those rules don't cover privacy violations perpetrated by unaffiliated third parties.

In fact, Google specifically acknowledges the issue. At Google's Help Center, one of the question categories about which users can inquire is "Removing information from Google's search results."

Google's position on this is quite clear: "If you're inquiring about removing information that appears in Google's search results, please note that the information is actually located on third-party publicly available web pages," Google says on its Web contact form. "There's nothing Google can do to remove content from our index without the cooperation of a site's webmaster."

Google is, however, willing to protect the privacy of its senior management. When CNET reporter Elinor Mills chose to illustrate the privacy implications of search engines by running a Google search of Google CEO Eric Schmidt and publishing the results in a July 14 story, the company reportedly took action. According to one account, "Google representatives have instituted a policy of not talking with CNET reporters until July 2006."

Ask Jeeves appears to be more sympathetic to privacy concerns. "Consumers who believe their private information is being used inappropriately on the Web can contact our customer service team and provide details for review," Ask Jeeves said in a statement. "Sites or pages identified as using or providing private information inappropriately will be blocked from our index."

As editor Danny Sullivan sees it, search engines are still coming to terms with privacy concerns. "It's an issue that's not going away, but neither is it one they seem to be addressing," he says.

Google and Yahoo did not respond to requests for comment. MSN declined to comment.
