EMC Tries To Unify Big Data Analytics - InformationWeek


12:19 PM

EMC Tries To Unify Big Data Analytics

EMC Greenplum Modular Data Computing Appliance puts SQL and Hadoop in the same box, but is it a truly cohesive platform?

Big data analytics has split into two separate worlds, and on Wednesday EMC announced a Greenplum appliance that aims to bring them together.

On the one hand there's structured data that fits neatly into the columns and rows of relational databases. Those databases have mastered such data, and even when it gets big (meaning north of about 10 terabytes), there are options such as massively parallel processing, supported by products such as EMC's Greenplum database.

On the other hand there's the array of semi-structured, unstructured, and inconsistent data types like server log files, sensor data, social-network comments, and other forms of text-centric information. For that world the Hadoop open-source project has emerged as the leading platform for making such information computable. (Hadoop also handles highly structured data, but mostly as a high-capacity, low-cost data store.)


With Wednesday's release of the EMC Greenplum Modular Data Computing Appliance (DCA), EMC says it has unified these heretofore separate domains. It's a follow-up to the company's announcement last May of Greenplum HD Community and Enterprise distributions of Hadoop software, along with a promise to deliver a Hadoop appliance.

Greenplum's Community edition includes Hadoop MapReduce, the HDFS distributed file system, the Apache Hive query tool, the HBase column-oriented data store, and the ZooKeeper tool for configuring clusters. The Enterprise edition adds proprietary features for snapshotting and replication of Hadoop clusters as well as system management capabilities.

The Modular DCA is a single box that supports multiple quarter-rack deployments, which can be mixed, matched, and scaled. You can start with a standard Greenplum Database Module for scalable SQL analysis and add a quarter-rack Greenplum HD module for running EMC's Hadoop release.

Other quarter-rack options include the Greenplum Database High Capacity Module, which pairs more storage with less compute capacity than a standard module, for high-scale, long-term archival storage at a lower cost per terabyte. There's also a Greenplum Data Integration Accelerator (DIA) module designed to host partner applications, such as predictive analytics capabilities from SAS, data-integration software from Informatica, and other options said to be in review.

EMC's modular approach lets you scale standard SQL, Hadoop, archival, or analytic application capacity in quarter-rack increments up to a total of six full racks. EMC says its approach will not only save money by eliminating the need for separate hardware platforms, but will also speed insight and minimize storage demands by streaming Hadoop analyses directly into the Greenplum database. In this approach, data doesn't have to be created and stored in one environment and then copied and moved into another.
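To make the quarter-rack arithmetic concrete, here is a minimal sketch of how a mixed configuration could be checked against the six-rack ceiling. It is not an EMC tool. The module labels, the function names, and the assumption that every module type occupies exactly one quarter rack are illustrative only.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical labels for the four module types named in the article.
MODULE_TYPES = {"database", "high_capacity", "hd", "dia"}

# Assumption: every module occupies exactly one quarter rack.
QUARTER_RACK = Fraction(1, 4)

# The article cites a ceiling of six full racks.
MAX_RACKS = 6


def racks_used(modules):
    """Return the rack space consumed by a list of module labels."""
    unknown = set(modules) - MODULE_TYPES
    if unknown:
        raise ValueError(f"unknown module types: {sorted(unknown)}")
    return len(modules) * QUARTER_RACK


def fits(modules):
    """True if the configuration stays within the six-rack ceiling."""
    return racks_used(modules) <= MAX_RACKS


def summarize(modules):
    """Map each module type present to the rack space it consumes."""
    return {kind: count * QUARTER_RACK for kind, count in Counter(modules).items()}


if __name__ == "__main__":
    # Start with one SQL module and one Hadoop module, then add archival capacity.
    config = ["database", "hd", "high_capacity", "high_capacity"]
    print(racks_used(config), fits(config), summarize(config))
```

Under these assumptions, six full racks hold at most 24 quarter-rack modules in any mix, so a 25th module of any type would exceed the limit.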

EMC used the words "coprocessing" and "marriage" to describe the blend of SQL and Hadoop within the modular appliance, but it's not quite that harmonious just yet.
