Big Data Platforms Evolve for Analytics, Machine Learning - InformationWeek



Machine learning and advanced analytics are now the centerpiece for companies that got their start as Hadoop distributors.

The term "big data" reached its peak on Google Trends back in 2015. Organizations were harvesting more data than ever before and needed to store it in a cost-effective way, which may be why searches for "Hadoop" also reached a peak on Google Trends that same year.

But by 2018, things have shifted. The three big Hadoop vendors -- Cloudera, Hortonworks, and MapR -- no longer promote themselves as Hadoop providers. At the Gartner Data Analytics Summit this spring, Research VP Merv Adrian pointed out that none of those vendors even had the word Hadoop in their booth displays.

(Image: Jessica Davis/InformationWeek)

Now the focus is on the analytics and machine learning aspects of data. Cloudera has changed its positioning to this: "A modern platform for machine learning and analytics, optimized for the cloud" -- a change the company made about a year and a half ago, according to Wim Stoop, senior product marketing manager at Cloudera. He spoke with InformationWeek in a recent interview.

Stoop said that until about two years ago, the big focus for the market had been on how to keep more data, and more types of data, for longer periods of time. How do you store it all? But as organizations mastered that task, another challenge emerged -- now that we can store it, what do we do with it?

"Hence the focus on machine learning and analytics," he said.

These days Cloudera, Hortonworks, and MapR promote themselves as platforms for analytics, data science, and machine learning, incorporating many of the open source technologies in one place to make them easier for enterprises to consume. (MapR describes itself as a converged data platform integrating Hadoop, Spark, and Apache Drill along with other data technologies. Hortonworks describes itself as a connected data platform and solution.)

All these companies have repositioned themselves as providing much more than just open source storage technologies for big data needs. That move mirrors a shift in what enterprise organizations want to do with their data programs.

These data platform companies incorporate multiple open source technologies for storing, managing, and performing advanced analytics on big data. They are also working to make these technologies easier for organizations to consume by offering their services as elastic cloud options.

For instance, at Strata Data London last week, Cloudera announced plans to expand its Altus data science platform-as-a-service offering to the Azure cloud. The service has been available on AWS for the past year. Altus Data Engineering for Azure simplifies and speeds ETL, data processing, and batch machine learning by reducing complexity, Cloudera said. Azure customers can also use the shared data catalog capabilities in Cloudera Altus SDX, currently in beta. Cloudera said that SDX is designed to preserve business metadata and security and governance policies so they can be applied consistently across data processing and analytics workloads in the cloud.

Cloudera Altus Analytic DB, a data warehouse cloud service, will also now be available in Azure.

Cloudera has also updated its Data Science Workbench and the Cloudera Enterprise platform. The workbench update lets data scientists run and track versioned experiments and also more easily deploy models as REST APIs, according to Cloudera.

While the technology for achieving better results with data is arriving, many organizations still have work to do in terms of their own data organizations and processes, Stoop told me. Perhaps that is the next step.

"Many organizations are not yet seeing data as a strategic asset," he said. "They are treating much of it on a departmental and siloed basis. They need to change how they are working with that data… This is not something that happens overnight."

Jessica Davis has spent a career covering the intersection of business and technology at titles including IDG's Infoworld, Ziff Davis Enterprise's eWeek and Channel Insider, and Penton Technology's MSPmentor. She's passionate about the practical use of business intelligence, ...
