Oracle's new big data tool won't cover all the analysis bases, but it will enable SQL-savvy professionals to query Hadoop and NoSQL sources.

Doug Henschen, Executive Editor, Enterprise Apps

July 21, 2014

Oracle Big Data SQL is not a SQL-on-Hadoop option; it's a way to run SQL queries against Oracle Database, Hadoop, and NoSQL sources simultaneously.

3. DBAs get Oracle Database-style security controls. Oracle Big Data SQL opens up table-based access to the data in Hadoop, but with access comes risk. Oracle therefore built in a way to apply the same kinds of grants, permissions, and policies that DBAs apply when they set up Oracle Database. You might, for example, have an "analyst" role defined in Oracle Database that is allowed to see and query some columns but not others, while certain fields of data are redacted.

"If I want to expose that group of analysts to a set of data that's in Hadoop, I can create an external table in Oracle Database over that data in Hadoop and grant whatever permissions and policies you deem appropriate," McClary explained.
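To make the idea of column-level grants and redaction concrete, here is a minimal Python sketch of what such a policy does to a row of data. It is a toy model only, not Oracle's API: the `Policy` class, the `apply_policy` function, and the sample clickstream columns are all hypothetical names invented for illustration.

```python
# Toy model of column-level access control with redaction.
# Policy, apply_policy, and the sample columns are hypothetical;
# none of this is Oracle's API.
from dataclasses import dataclass, field


@dataclass
class Policy:
    visible: set                                 # columns the role may read as-is
    redacted: set = field(default_factory=set)   # columns returned masked


def apply_policy(rows, policy, mask="****"):
    """Return copies of the rows containing only what the policy allows."""
    result = []
    for row in rows:
        filtered = {}
        for column, value in row.items():
            if column in policy.redacted:
                filtered[column] = mask          # present, but masked
            elif column in policy.visible:
                filtered[column] = value         # present, unchanged
            # any other column is dropped entirely
        result.append(filtered)
    return result


# A hypothetical "analyst" role: two readable columns, one redacted column.
analyst = Policy(visible={"customer_id", "page"}, redacted={"email"})
clickstream = [
    {"customer_id": 1, "page": "/home", "email": "a@example.com", "ip": "10.0.0.1"},
]
analyst_view = apply_policy(clickstream, analyst)
print(analyst_view)
```

In this sketch the analyst sees `customer_id` and `page` as stored, gets a masked `email`, and never sees `ip` at all, which mirrors the "some columns but not others" behavior described above.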


4. Oracle Big Data SQL is not a SQL-on-Hadoop tool. This is an important distinction. Oracle Big Data SQL is not just a way to use Oracle SQL against Hadoop. It's a way to query Oracle Database, Hadoop, and NoSQL sources simultaneously.

"SQL on Hadoop is a great idea and we'll continue to ship solutions that provide that, including Impala, Hive, and future efforts to bring Hive on top of Spark," McClary said. "What we're trying to do here is solve a different and perhaps bigger problem, which is integrating big data with the rest of the enterprise architecture."
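The difference between querying one store and querying several at once can be sketched with Python's built-in `sqlite3` module. This is only an analogy under stated assumptions: two separate in-memory SQLite databases stand in for the relational database and the big data store, and the `customers` and `clicks` tables and the `lake` schema name are hypothetical. It does not use or imitate Oracle Big Data SQL itself.

```python
import sqlite3

# Main connection: stand-in for the relational side (SQLite, not Oracle Database).
conn = sqlite3.connect(":memory:")

# Second, separate in-memory database: stand-in for the big data side.
conn.execute("ATTACH DATABASE ':memory:' AS lake")

conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE lake.clicks (customer_id INTEGER, page TEXT)")

conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])
conn.executemany(
    "INSERT INTO lake.clicks VALUES (?, ?)",
    [(1, "/home"), (1, "/pricing"), (2, "/home")],
)

# One SQL statement joins tables that live in two different stores.
rows = conn.execute(
    """
    SELECT c.name, COUNT(*) AS clicks
    FROM customers AS c
    JOIN lake.clicks AS k ON k.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
print(rows)
conn.close()
```

The point of the sketch is the single `SELECT`: the person writing it needs only SQL and does not have to know that `clicks` lives somewhere other than `customers`.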


Describing Oracle Big Data SQL as "democratizing big data" and "making it consumable by people outside of Silicon Valley," McClary said the point is bringing the value found in big data sources "home into the business." Home, in this case, means into Oracle Database, where it can be analyzed by the many SQL-savvy professionals instead of just a priesthood of PhD-level data scientists.

5. Oracle Big Data SQL will not do everything. It was refreshing to hear Oracle grant that not everything can be expressed or discerned through SQL. Options like Apache Spark and the R language, for example, support machine learning and advanced analytical data manipulations and workflows that are "all valid," said McClary. "There's a place for SQL in reasoning and operating on large sets of data and there's a place for other languages in doing what they're best suited to handle," he said.

The point of Oracle Big Data SQL is accessing and analyzing data in Hadoop and NoSQL sources without requiring a new set of people with a new set of skills. "It's not enough to have big data experiments, you have to be able to operationalize it," said Mendelson. "That means that the people who are used to running your systems need to be able to provide secure access not just to the privileged few, but potentially to everyone."


About the Author(s)

Doug Henschen

Executive Editor, Enterprise Apps

Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of Transform Magazine, and Executive Editor at DM News. He has covered IT and data-driven marketing for more than 15 years.
