Wikia Search Gets Distributed Web Crawler - InformationWeek

7/27/2007
05:52 PM

Wikia Search Gets Distributed Web Crawler

Jimmy Wales buys LookSmart's Grub search engine and secures it under an open source license for a public release later this year.

Wikia, Inc., a provider of community Web sites that users can edit, said on Friday that it had acquired distributed search software called Grub to enhance the company's forthcoming wiki-inspired search engine.

At the O'Reilly Open Source Convention (OSCON), Wikia co-founder Jimmy Wales announced the acquisition of Grub from search engine LookSmart and the release of the software under an open source license. Financial terms of the deal were not disclosed.

In much the same way that Wikipedia relies on the distributed brain power of the Internet community, Wikia Search aims to make use of distributed processing power of Internet-connected computers.

"That's a very loose analogy but the idea is that you have a lot of spare bandwidth that you're not using a lot of the time, and if you want to use it to do something, this would be something you could do with it," said Wales. "This tool, it's not really a tool where people will be making editorial judgments, so it's different."

As a distributed program, Grub benefits incrementally from each user who installs and runs the software. The Grub client will make local bandwidth, processor time, and storage space available so that Wikia Search, once it launches, can crawl and index Web pages.
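
The general idea of volunteer-powered crawling can be sketched in a few lines of Python. This is an illustration only, not Grub's actual protocol or code: the `Coordinator` class, the `run_client` function, the work-unit size, and the `fake_fetch` stand-in (used in place of real HTTP requests so the sketch runs offline) are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Coordinator:
    """Hypothetical central server: hands out URLs and collects results."""
    frontier: deque = field(default_factory=deque)  # URLs not yet crawled
    index: dict = field(default_factory=dict)       # url -> fetched page text

    def next_work_unit(self, size):
        # Give the next `size` uncrawled URLs to whichever client asks.
        unit = []
        while self.frontier and len(unit) < size:
            unit.append(self.frontier.popleft())
        return unit

    def submit(self, results):
        # Merge pages fetched by a volunteer into the shared store.
        self.index.update(results)


def run_client(coordinator, fetch, unit_size=2):
    """One volunteer: request a work unit, fetch each URL, report back."""
    unit = coordinator.next_work_unit(unit_size)
    coordinator.submit({url: fetch(url) for url in unit})
    return len(unit)


def fake_fetch(url):
    # Stand-in for an HTTP request (assumption, keeps the sketch offline).
    return f"contents of {url}"


coordinator = Coordinator(frontier=deque([
    "http://example.com/a",
    "http://example.com/b",
    "http://example.com/c",
]))

# Three volunteers take turns; the third finds nothing left to crawl.
crawled = [run_client(coordinator, fake_fetch) for _ in range(3)]
print(crawled)                 # [2, 1, 0]
print(len(coordinator.index))  # 3
```

Each additional client that runs drains more of the shared queue, which is the sense in which a system like this gains capacity with every installation.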

"Of the various pieces of the puzzle that we need to create the full search engine, this is one of them," said Wales. "We're planning to have first public Web site available by the end of this year."

Wikia Search will rely on Lucene, a Java-based open source indexing and search library that powers search services at sites like Digg and Joost, and will probably use Nutch, an open source search engine built atop Lucene.

Though the components of Wikia Search are still being decided on, people will play a major role. "We're definitely intending to have human input into the search results, through the social Web site that we're designing right now," said Wales.

Despite the potential problems of involving people in the search process, Wales believes that search engine spammers can be kept in check by the community. "If people are abusing the system, then they should be kicked out," he added.
