No One Buys Hadoop

In launching a big data initiative, the work extends far beyond the acquisition of Hadoop or other technologies.

Christian Prokopp, Data Scientist, Rangespan

April 16, 2015


The first-quarter results for Hortonworks after its IPO showed professional services growing faster than support. That's unsurprising news to someone like me who works in big data professional services.

The truth is that no one buys Hadoop. You have a problem, and you want a solution. Big data is not about Hadoop or NoSQL. It is about a class of problems that requires novel technologies and architecture patterns to solve.

We can observe this in the evolution of the distributions. Initially, it was about large-scale computation and distributed storage with MapReduce and HDFS. Early adopters had problems suited to these technologies, which have their roots in crawling and processing the web. The distributions made Hadoop accessible and gave support and peace of mind to business operations. Falling storage costs and Hadoop's growing accessibility pushed it into more and more use cases.

Today, even as big data becomes ever more watered down as a term, Hadoop remains its core technology. However, the requirements are now for a multi-purpose, enterprise-grade, large-scale data storage and processing architecture that does much more than MapReduce and HDFS. As such, the distributions are growing upward in the value chain and outward in their feature sets. Search, governance, security, compliance, business intelligence, and data warehouse integration with Hadoop are only some prominent examples of this movement. Taken together, these result in the data lake idea.

The ugly truth, though, is that there is no one-click install for an enterprise architecture, and while patterns are emerging that hold true across industries, every organization has its own hurdles to overcome. The usual suspects are legacy systems, data silos, data warehouse dependencies, and change management issues.

How can an organization break up its data silos, reimagine its infrastructure including the data warehouse, and untangle legacy systems? The people in the organization working with entrenched systems and technologies have neither the spare time to learn new patterns and technologies nor visibility into most of them. Hiring the experience and talent is nearly impossible: it is expensive and scarce, and because the field is so new, senior stakeholders often don't know exactly what to look for in new hires.

The consequence is a thriving professional services industry. Yes, Hadoop may be the answer to parts of your problem, but choosing technology is not the first problem; it is somewhere in the middle. So you know you need Hadoop and, out of desperation, ask the vendor for advice, while also knowing they will favor their own solution, which may or may not be part of yours. Most stakeholders are desperate to find trustworthy and experienced partners who can help them translate their domain challenges and organizational needs into future-proof, coherent, and executable plans that leverage technology. Ideally, such a partner demonstrates immediate success at small cost to generate buy-in and help change thinking organization-wide.

The product and service strategy will be an interesting topic to watch evolve, since the distribution vendors are expanding both. There is demand for professional services, and it increases with the growing capabilities of the platform. What the successful business model around Hadoop will look like in a few years remains an open question.
