Hadoop's Second Generation Offers More To Enterprises
The first Hadoop tools weren't easy to deploy or manage. But the second-wave tools deliver great advances in usability.
Hadoop is one of the most disruptive recent innovations in enterprise IT. Its promise is to turn the ever-growing tide of data into profit. Even just in my own industry, telecommunications and media, Hadoop allows a range of analytic uses in areas as diverse as network planning, customer support, security operations, fraud detection and targeted advertising.
Yet realizing this potential has been challenging for many mainstream enterprises. Many started experimenting with some of the 13 functional modules that make up Apache Hadoop, a set of technologies that required large teams and several years for the early wave of Hadoop adopters such as eBay, Facebook and Yahoo to master.
The first wave of Hadoop technology, the 1.x generation, was easy neither to deploy nor to manage. The many moving parts that make up a Hadoop cluster were difficult for new users to configure. Seemingly minor details -- patch versioning, for instance -- mattered a lot. As a result, services failed more often than expected, and many problems showed up only under severe load. Skills were and still are in short supply, although there is no shortage of good training available from leading vendors such as Hortonworks and Cloudera.
Fortunately, the second generation of Hadoop, which Hortonworks calls HDP 2.0 and which was announced at Hadoop Summit 2013, fills in many of the gaps. Manageability is a key expectation, particularly for the more business-critical use cases that service providers experience. Hadoop has made great advances here with Ambari, an intuitive Web user interface that makes it much easier to provision, manage and monitor Hadoop clusters. Ambari allows the automation of initial installation, rolling upgrades without service disruption, high availability and disaster recovery, all critical to efficient IT operations.
Moreover, the independent software vendor ecosystem that supports Hadoop distributions is broadening and deepening. This is important for two reasons. First, in our experience, much of a buying decision boils down to how Hadoop fits with existing technology assets; in most cases, that means traditional business intelligence and data warehouse vendors. Second, it alleviates concerns over the skills shortage. Deutsche Telekom, for instance, has about 600 business intelligence staff with SQL skills. Although many of these people are now moving up the learning curve with Hadoop, it certainly helps that product-level integration provided by the likes of Microsoft and Teradata means that you don't have to be a Hadoop expert to run your queries on Hadoop.
Improved security and data lifecycle management also matter greatly when you try to establish a general-purpose enterprise big-data platform that serves many different departments, use cases and data policies. Security is delivered via Knox, a system that provides a single point of secure access for Apache Hadoop clusters. Falcon provides the data lifecycle management framework, a declarative language (think of XML) to orchestrate data movement, coordinate data pipelines, and set lifecycle policies and processing rules for data sets.
Perhaps most importantly, as enterprise adoption of Hadoop accelerated, it became clear that multiple processing models -- moving beyond batch -- were critical for Hadoop to broaden its applicability for mainstream enterprise use. The common pattern is that enterprises want to store data in the Hadoop Distributed File System (HDFS) and then access it in a variety of ways, simultaneously, and with a consistent level of service. To that end, Hadoop 2.0 includes YARN, a resource manager that isolates different applications and supports many use cases beyond batch processing, such as interactive, online, streaming and graph processing. It's fair to say that Hadoop has evolved from an inexpensive parking lot for your data to a framework that can help make timely decisions.
A great example is Gigaset, a former unit of the German tech conglomerate Siemens well-known for its mobile phones. With its new smart home system for security and assisted living called "Elements," the company has jumped on the new possibilities now available. What's even more interesting is how Hadoop helped the company unlock an entirely new market, with additional business models on the horizon.
Elements is a cluster of small sensors that can be quickly installed in any home, slapped on doors or cabinets. Designed to be robust and foolproof, Elements observes and pipes data into a Hadoop cloud via a base station. That sounds easy enough, but the alerts, events and diagnostic pings flow to the tune of three terabytes, or 10 billion messages, per day in 2014. The sheer traffic volume of ingesting millions of doors being opened and closed is similar to that of a denial of service (DoS) attack.
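To put those figures in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: it assumes decimal terabytes and traffic spread evenly across the day, neither of which the figures above specify, and real sensor traffic would certainly be burstier.

```python
# Back-of-the-envelope check of the quoted daily volume.
# Assumptions (not stated in the article): decimal terabytes
# (1 TB = 10**12 bytes) and traffic spread evenly over 24 hours.
BYTES_PER_DAY = 3 * 10**12        # three terabytes per day
MESSAGES_PER_DAY = 10 * 10**9     # 10 billion messages per day
SECONDS_PER_DAY = 24 * 60 * 60

avg_message_bytes = BYTES_PER_DAY / MESSAGES_PER_DAY
messages_per_second = MESSAGES_PER_DAY / SECONDS_PER_DAY
megabytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / 10**6

print(f"average message size:  {avg_message_bytes:.0f} bytes")
print(f"sustained ingest rate: {messages_per_second:,.0f} messages/s")
print(f"sustained throughput:  {megabytes_per_second:.1f} MB/s")
```

Under those assumptions, each message averages about 300 bytes and the platform has to absorb well over 100,000 messages every second, around the clock. The challenge is the message rate rather than the raw byte count.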
This ocean of raw data is sorted by statistical relevance only, leaving the interpretation and decision-making to individual customers who can see data visualizations on their smartphone or computer. Customers can decide to relay the data stream to third-party service providers such as ambulances or security services. This new real-time information system for consumers, anchored in the emerging Internet of Things, is worlds apart from the old handset business, admits Gigaset's Nicholas Ord, in this video.
That's the story of one company taking the plunge with Hadoop, but when will others follow? I predict that by 2015, more than half of the top 2,000 global enterprises will have a Hadoop deployment in production. I also expect that in five years, we'll see meaningful differences in profitability in many industries. Enterprises that have fully embraced Hadoop will come out ahead.