2 Ways To Tackle Really Big Data - InformationWeek



2 Ways To Tackle Really Big Data

Marketers, telcos, and financial services firms are often swamped by machine-generated data. New products from IBM Netezza and Infobright offer radically different approaches to the challenge.

If you're trying to analyze Web clickstreams, call data records, financial trading data, log files, or other forms of machine-generated information, chances are you're playing in the "big data" league. But just how big, and how quickly you need answers, will determine your interest in the latest products from IBM Netezza and Infobright.

IBM Netezza on Wednesday announced a High Capacity Appliance aimed at really, really big data. We're talking petabytes, typical for long-term archives maintained for regulatory or compliance reasons. Infobright, meanwhile, has upgraded a column-store database that promises superfast querying of machine-generated data at more routine volumes of less than 40 terabytes. Beyond these specific products, both vendors have answers for the extremes of capacity and speed.

The IBM Netezza High Capacity Appliance is an alternative to the vendor's standard TwinFin product. It boasts four times the data density of the TwinFin thanks to higher-capacity hard drives. It also has about 35% less processing power per rack (to keep costs down and create room for more storage). The appliance stores 500 terabytes per rack, and you can put together as many as 20 racks to handle as much as 10 petabytes of user-addressable data.
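The capacity figures above are simple multiplication. As a minimal sketch in Python, assuming decimal units (1 petabyte = 1,000 terabytes) and using illustrative names that are not vendor terminology:

```python
# Figures quoted in the article; the constant and function names are
# illustrative only and do not come from IBM Netezza.
TB_PER_RACK = 500    # user-addressable terabytes per High Capacity rack
MAX_RACKS = 20       # maximum number of racks that can be combined
TB_PER_PB = 1_000    # assumption: decimal units (1 PB = 1,000 TB)


def total_capacity_pb(racks: int) -> float:
    """Return user-addressable capacity in petabytes for a rack count."""
    if not 1 <= racks <= MAX_RACKS:
        raise ValueError(f"racks must be between 1 and {MAX_RACKS}")
    return racks * TB_PER_RACK / TB_PER_PB


print(total_capacity_pb(MAX_RACKS))  # 10.0
```

A single rack works out to 0.5 petabytes, and the full 20-rack configuration reaches the 10 petabytes cited.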

Who needs to query that much data? Telcos operating in many countries (India being one example) are required to keep call data records (CDRs) for as long as 10 years so law enforcement agencies can request relevant information. Government intelligence agencies and financial services firms subject to retention requirements often keep that much data around as well.

This niche was previously addressed by both Teradata, which introduced its Extreme Data Appliance in 2009, and EMC, which introduced its High Capacity EMC Data Computing Appliance in April.

Fast querying is generally not important when you're retrieving records to meet regulatory requirements. Thus, the EMC, IBM Netezza, and Teradata high-capacity appliances all favor storage over speed. For example, an identical query will run about 2.5 times faster on the Netezza TwinFin than it will on that vendor's high-capacity appliance. The TwinFin, however, can't match the low cost per terabyte of the IBM Netezza High Capacity Appliance, which works out to less than $2,500 per terabyte, according to Netezza (less than a quarter the cost of the TwinFin).
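To put the trade-off in rough numbers, here is a back-of-the-envelope sketch in Python. It treats the quoted $2,500 per terabyte as an upper bound and the 2.5x figure as an approximate multiplier; the function names and the 60-second sample query time are hypothetical, chosen only for illustration.

```python
# Figures quoted in the article; names and sample inputs are illustrative.
HCA_COST_PER_TB_MAX = 2_500  # USD; "less than $2,500 per terabyte" (upper bound)
TWINFIN_SPEEDUP = 2.5        # the TwinFin runs the same query about 2.5x faster


def max_storage_cost_usd(terabytes: float) -> float:
    """Upper bound on High Capacity Appliance cost for a data volume."""
    return terabytes * HCA_COST_PER_TB_MAX


def high_capacity_query_seconds(twinfin_seconds: float) -> float:
    """Estimate query time on the high-capacity appliance from a TwinFin timing."""
    return twinfin_seconds * TWINFIN_SPEEDUP


print(max_storage_cost_usd(10_000))     # 25000000 (under $25 million for 10 PB)
print(high_capacity_query_seconds(60))  # 150.0
```

The "less than a quarter" comparison implies the TwinFin costs more than four times as much per terabyte, but the article gives no exact TwinFin price, so that figure is left out of the sketch.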

Plenty of companies need both high capacity and super-fast querying. The likes of EMC, IBM Netezza, and Teradata would likely suggest the combination of their high-capacity appliances and one of their high-performance appliances. Yes, at the opposite end of the speed-versus-scale spectrum, Teradata and EMC both have pure-solid-state appliances (Teradata's being the Extreme Performance Appliance and EMC's being the High Performance EMC Data Computing Appliance). These products have less capacity but about 10 times the speed of each vendor's standard appliance.

IBM Netezza announced Wednesday that it will get in on this act sometime next year with an Ultra Performance appliance employing a combination of flash memory and RAM (a contrast with the solid-state disk drives used by Teradata and EMC). Having a high-performance appliance and a high-capacity appliance gives you the best of both worlds, but it's also no small investment.
