Data Warehouse Alternatives Make a Hit in India and Europe
Telco Reliance scales with DW appliances. Web marketer TradeDoubler loads and queries faster, cuts cost with a column-store database.
The Need for Speed
Scalability was decidedly not the problem facing TradeDoubler. In fact, the Web marketing firm's warehouse was less than one terabyte, but complex analytic queries against as many as 3 billion rows of data demanded extensive aggregation. What's more, since the firm studies constantly changing clickstream data, the database had to be continually rebuilt, reindexed and tuned.
"You have to structure the database to be able to ask the questions, and that takes a lot of work," says CTO Ola Uden. "We had one person working with the data full time, but depending on the complexity of the queries, it took anywhere from half a day to two days to get the data out."
Early this year TradeDoubler implemented the Brighthouse column-store database from InfoBright. Column-oriented databases are faster than conventional (row-oriented) databases in many analytic applications because they can query selected attributes without wading through all the non-relevant data in each row. Leading column-store databases are also designed to take advantage of commodity hardware supporting massively parallel processing. TradeDoubler is running Brighthouse on an inexpensive ($12,500) Dell server with two quad-core processors.
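To make the row-versus-column distinction concrete, here is a minimal Python sketch. It is a toy illustration, not how Brighthouse is implemented: the sample data, field names and function names are all invented for the example. It totals one attribute from a tiny clickstream table stored both ways and counts how many stored values each layout has to read.

```python
# Toy comparison of row-oriented and column-oriented layouts.
# All data and names here are hypothetical, for illustration only.

rows = [
    {"site": "a.example", "clicks": 120, "impressions": 4000, "orders": 3},
    {"site": "b.example", "clicks": 75, "impressions": 2500, "orders": 1},
    {"site": "c.example", "clicks": 210, "impressions": 9000, "orders": 7},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per attribute."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_clicks_row_store(rows):
    """Row layout: each full row is read to get at a single attribute."""
    total = 0
    values_touched = 0
    for row in rows:
        values_touched += len(row)  # every attribute in the row is read
        total += row["clicks"]
    return total, values_touched


def total_clicks_column_store(columns):
    """Column layout: only the 'clicks' column is read."""
    clicks = columns["clicks"]
    return sum(clicks), len(clicks)


columns = to_columns(rows)
row_total, row_touched = total_clicks_row_store(rows)
col_total, col_touched = total_clicks_column_store(columns)

print(f"row store:    total={row_total}, values read={row_touched}")
print(f"column store: total={col_total}, values read={col_touched}")
```

Both layouts return the same total, but the row layout reads every attribute of every row while the column layout reads only the one column the query needs. The gap widens as tables gain more attributes and more rows.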
TradeDoubler optimizes Web marketing campaigns across Europe and Asia for more than 1,600 advertisers by analyzing Web clicks, impressions and purchases. Customers include online retailers such as Apple and Dell. Brighthouse and Pentaho BI software are serving as the engine behind TradeDoubler's TD Integral Cross-Media Marketing Platform, which is designed to "understand the complete customer journey" across search engines, affiliate sites in TradeDoubler's network and online advertising.
"If Apple is running a campaign for the iPhone, they want to look at how people ended up buying one at their site," explains Mats Johansson, a senior consultant at Lincube Group AB, which helped TradeDoubler with the Brighthouse implementation. "What did they do before they made that purchase? Did they read a review or were they responding to an ad? Which sites were they visiting and how did they arrive at the Apple store?"
TradeDoubler has more than 125,000 Web sites in its network, and it tracks some 20 billion impressions, 265 million unique visitors and 12 million leads per month. The Brighthouse implementation went into production in May, and the firm now loads and rebuilds the database every day, retaining three days of network-wide clickstream data and 60 days' worth of online order information.
TradeDoubler continues to rely on Oracle for many of its transactional processing needs, but constant rebuilding and, in particular, high-volume loading necessitated an alternative approach, says Johansson. "Loading 2 billion rows a day while still maintaining performance on analytic queries would have been quite expensive," he says.
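For a sense of scale, the sustained ingest rate implied by 2 billion rows a day can be worked out in a few lines of Python. This is a back-of-the-envelope sketch: the article does not say how long TradeDoubler's daily load window is, so the four-hour figure below is purely a hypothetical assumption, as is the function name.

```python
# Back-of-the-envelope load rate for the 2 billion rows a day cited above.
# The load window lengths are assumptions; the article does not give one.

ROWS_PER_DAY = 2_000_000_000


def sustained_rows_per_second(rows, window_hours):
    """Average rows per second needed to load `rows` within `window_hours`."""
    return rows / (window_hours * 3600)


full_day_rate = sustained_rows_per_second(ROWS_PER_DAY, 24)
four_hour_rate = sustained_rows_per_second(ROWS_PER_DAY, 4)  # hypothetical window

print(f"spread over 24 hours: {full_day_rate:,.0f} rows/second")
print(f"squeezed into 4 hours: {four_hour_rate:,.0f} rows/second")
```

Even spread evenly across a full day, the load works out to roughly 23,000 rows every second, and any shorter window raises the required rate proportionally.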
Between faster loading, automated indexing, 30X data compression and shorter query times, TradeDoubler is getting answers sooner and at a lower cost than would have been possible with conventional technology. With appliances, column-store databases and related software-hardware configurations growing in number and diversity (from small-scale to ultra-high-capacity), it looks like the days of building data warehouses from scratch are winding down all over the globe.