The Open Content Alliance differs from Google Print in that only works in the public domain, or those contributed by copyright holders, would be stored.

Antone Gonsalves, Contributor

October 3, 2005

Yahoo Inc., the University of California and more than a half dozen other organizations launched on Monday a consortium dedicated to digitizing books and other materials from libraries, archives and publishers, an effort that would compete with a similar, but controversial, initiative by Google Inc.

The Open Content Alliance differs from Google Print in that it would store only works in the public domain or those contributed by copyright holders. The content would reside in a network of databases maintained by the Internet Archive, a non-profit group in San Francisco. Yahoo will run the search engine on the OCA Web site.

Besides skirting the copyright issues that have led to one lawsuit against Google by the Authors Guild, the OCA plans to make its index searchable by any organization interested in tapping its resources. Google controls access to its library index, in order to prevent others from getting access to anything but snippets from copyrighted material, unless authorized by the copyright holder.

Brewster Kahle, founder and digital librarian for the Internet Archive, said the OCA hopes to build a "virtual library that would be brought to you by different search systems."

"If we pull this off, then we'll have done something very worthwhile," Kahle said.

Other organizations that have joined the initial launch of the OCA and will contribute either content or technology include Adobe Systems Inc., the European Archive, Hewlett-Packard Co., the National Archives in the United Kingdom, publisher O'Reilly Media Inc., Prelinger Archives and the University of Toronto.

All OCA content would be searchable and downloadable at no charge, the group said. The available content is expected to range from historical works of fiction to children's books to highly specialized engineering whitepapers.

Content under copyright would be made available through the OCA under a Creative Commons license. Creative Commons is a nonprofit group that developed a licensing program to make copyrighted material available for personal use, reuse and repurposing.

OCA-stored content would be available in PDF, a popular document format developed by Adobe, and in other widely adopted formats.

Daniel Greenstein, UC librarian for system-wide planning, said the UC chose to join the OCA because it was "very taken by the openness to the approach."

Under OCA, the UC will be able to work with Yahoo in grouping works into collections that will be particularly helpful to academics and professionals.

"They don't just want stuff," Greenstein said of many library users. "This gives us a degree of influence (in the indexing process). We will build this together."

Greenstein said about 15 percent of the UC's 33 million books are in the public domain.

Google announced its library project in December, starting with collections at Harvard, Stanford, the University of Michigan, the University of Oxford and The New York Public Library. The company scanned books without first asking permission of copyright holders, leading to intense opposition from groups such as the Text and Academic Authors Association, the Association of Learned and Professional Society Publishers, the Association of American University Presses and the Association of American Publishers.

Last month, the Authors Guild sued Google, claiming the search-engine giant is violating copyright laws by copying books without permission, even though the company would only show snippets in search results.

Google, on the other hand, claims it's protected by laws governing fair use of copyrighted works. The company has stopped scanning such books until November to give publishers time to notify the search engine if they don't want to be a part of the project.

Bruce Sunstein, partner in the Boston-based intellectual property law firm Bromberg & Sunstein, said Google is stretching the definition of fair use like never before.

"It's really a new paradigm," Sunstein said of Google's ability to store information. "There's no legal precedent that's exact."

Because it's unlikely a court decision would satisfy both sides, Sunstein expects Congress to eventually get involved.

"I would not be surprised to see a legislative solution way downstream -- maybe five years," Sunstein said.
