Your Next Data Warehouse? Size Up Column-Store Databases and Appliances - InformationWeek

InformationWeek is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them.Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

Software // Information Management
03:07 PM
Connect Directly

Your Next Data Warehouse? Size Up Column-Store Databases and Appliances

These approaches may beat out conventional databases in price and performance. Here's how to find products that fit with your company's data analysis needs.


Unlike a conventional database that stores data in rows, a column-store database looks at data vertically. In the case of a customer database, for example, the row-store database queries across each customer record, while the column-store database can query selected columns. The row approach is great for OLTP, with lots of writes for each new customer transaction. Column-store is ideal for OLAP, with lots of reads against particular attributes. If you want to explore sales by ZIP code and product, your query will interrogate only the ZIP code and product SKU columns rather than wading through names, addresses, and other irrelevant attributes.

Another advantage of the column-store database is that it can apply optimized compression--upwards of 10-to-1 compression--because the data in each column is consistent (all names, for example). As long as your query isn't record-intensive-- that is, it doesn't involve a lot of attributes--the amount of data coming from a column-store is always less than from a conventional database, so it delivers faster performance.

Column-store databases are ideal for many applications. Telecommunications companies have billions of call data records that need to be in an enterprise data warehouse for billing and historical purposes, but columnar extracts can be placed in a data mart for marketing analysis. Big retailers, banks, and insurance companies take much the same approach to look for up-sell and cross-sell opportunities, fraud, and mortality trends, respectively.

The Internal Revenue Service launched one of the first column-store deployments in 1996, using Sybase IQ, in what began as a 2-TB warehouse. Back then, 2 TB was a very large database, says Jeff Butler, the IRS's director of research databases. The columnar database brought query times down from days to a matter of hours, "which was just a phenomenal boost in productivity," Butler says. "Today, we're at 158 TB, and the queries that once took hours now take seconds."

The IRS's analytic database stores population-level data (meaning no names or Social Security numbers) from every tax return over the last 10 years. The warehouse is used to explore filing and compliance behavior, as well as the potential impact of tax code changes.

The IRS went with a column-store database because it didn't know what data elements it would need to query from one question to the next, says Butler. With a static reporting application, you can use just about any database because you can do a lot of optimization, he says, "but that's not our situation."

Impact Assessment: Column Store Databases and Warehouse Appliances

(click image for larger view)

We welcome your comments on this topic on our social media channels, or [contact us directly] with questions about the site.
2 of 4
Comment  | 
Print  | 
More Insights
InformationWeek Is Getting an Upgrade!

Find out more about our plans to improve the look, functionality, and performance of the InformationWeek site in the coming months.

Becoming a Self-Taught Cybersecurity Pro
Jessica Davis, Senior Editor, Enterprise Apps,  6/9/2021
Ancestry's DevOps Strategy to Control Its CI/CD Pipeline
Joao-Pierre S. Ruth, Senior Writer,  6/4/2021
IT Leadership: 10 Ways to Unleash Enterprise Innovation
Lisa Morgan, Freelance Writer,  6/8/2021
White Papers
Register for InformationWeek Newsletters
Current Issue
Planning Your Digital Transformation Roadmap
Download this report to learn about the latest technologies and best practices or ensuring a successful transition from outdated business transformation tactics.
Flash Poll