Can Data Algebra Make Big Data Faster And Cheaper? - InformationWeek

Lisa Morgan

Can Data Algebra Make Big Data Faster And Cheaper?

Data algebra is a new approach for managing, integrating, and searching data faster and more efficiently. Here's why developers and IT departments may want to consider adding it to their toolsets.

Today's organizations want to manage, process, analyze, and search all kinds of data more efficiently and cost-effectively. To accomplish those goals, they need to reduce unnecessary overhead and find ways to optimize data-related tasks. Data algebra is an option that can help.

Data analytics platform provider Algebraix Data says that data algebra applies mathematical set theory to data analytics tasks. The result is an approach you can use to perform a range of data tasks, whether you are optimizing the performance of Hadoop systems or making database queries.

Two main benefits of data algebra are reuse and optimization, both of which can save time and resources. Here's how it works: To speed database queries, the Algebraix platform resolves a request for data and then stores the request in an algebraic catalog, along with the algebraic expressions, the algebraic transformation, the intermediate results it used to arrive at a result, and the result itself. That way, all of these can be reused.
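The reuse idea described above can be sketched in a few lines of Python. This is only an illustration, not the Algebraix platform's implementation: the `Catalog` class, the tuple-shaped expression keys, and the `(region, amount)` row layout are all assumptions made for the example. It shows a catalog that stores both an intermediate result (a selection) and a final result (a sum), so that a repeated or related request is answered from storage instead of being recomputed.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Row = Tuple[str, int]        # hypothetical row shape: (region, amount)
Table = FrozenSet[Row]       # a table modeled as a set of rows


class Catalog:
    """Toy stand-in for an algebraic catalog: expression key -> stored result."""

    def __init__(self) -> None:
        self._results: Dict[tuple, object] = {}
        self.hits = 0
        self.misses = 0

    def evaluate(self, key: tuple, compute: Callable[[], object]) -> object:
        # Reuse the stored result when this exact expression was seen before.
        if key in self._results:
            self.hits += 1
            return self._results[key]
        self.misses += 1
        result = compute()
        self._results[key] = result
        return result


def select_region(catalog: Catalog, table: Table, region: str) -> Table:
    # Intermediate result: the subset of rows for one region.
    key = ("select", "region", region, table)
    return catalog.evaluate(
        key, lambda: frozenset(r for r in table if r[0] == region)
    )


def total_amount(catalog: Catalog, table: Table, region: str) -> int:
    # Final result, built on top of the (possibly cached) selection.
    subset = select_region(catalog, table, region)
    key = ("sum", "amount", subset)
    return catalog.evaluate(key, lambda: sum(amount for _, amount in subset))


sales: Table = frozenset({("east", 10), ("west", 5), ("east", 7)})
catalog = Catalog()

first = total_amount(catalog, sales, "east")    # computes selection and sum
second = total_amount(catalog, sales, "east")   # both steps served from the catalog
east_rows = select_region(catalog, sales, "east")  # a different request reuses the selection
```

Because rows are held in a set, identical rows collapse into one; a real system would need to account for duplicates. The point of the sketch is only that results become reusable once each request is identified by a well-defined expression.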

"Databases calculate various query results, deliver [the results] to a user and then throw it away. Some of that stuff can be reused," Robin Bloor said in an interview. But "it can only be reused if you define it in an algebraic manner." Bloor is chief analyst and cofounder of The Bloor Group and a co-author of the book The Algebra of Data. The book was recently published by Algebraix Data and is available for a free download from the company's website.

Over time, the reuse capabilities can dramatically accelerate query results.

"You can do far more sophisticated optimization when you're using algebraic techniques than you can when you're just using high-level procedural techniques," said Bill Rogers, a senior engineer at IBM and former VP of engineering at Algebraix.

The point is not to re-compute what has already been computed. That wastes time and resources. For example, if a person ran a query on terabytes of data and later added 100 new rows of data, it would not be necessary to execute the entire query again to get a correct final result. It would only be necessary to run the second query on the new 100 rows of data because all of the information about the original query has been stored.

The results of the two queries would then be combined to yield a final result. Without reuse, the original query might take, say, five hours and a full rerun over the updated data another five, for about ten hours in total. With reuse, the original query would still take five hours, but the query on the 100 new rows could be executed in microseconds, so the final result would arrive in about half the total time.
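A minimal Python sketch of this incremental pattern follows, assuming the query is a simple sum. The function names and the sample data are hypothetical and chosen only for illustration. The stored result of the original query is combined with the result of running the same query on just the new rows, and the combined answer matches a full recomputation.

```python
from typing import Iterable


def full_total(rows: Iterable[int]) -> int:
    # The expensive pass over every row (the "five hour" query in the example).
    return sum(rows)


def incremental_total(stored_total: int, new_rows: Iterable[int]) -> int:
    # Only the new rows are scanned; the stored result covers the rest.
    return stored_total + sum(new_rows)


original_rows = list(range(1, 1001))      # stand-in for the large original data set
stored_total = full_total(original_rows)  # computed once and kept

new_rows = list(range(1001, 1101))        # the 100 rows added later
updated_total = incremental_total(stored_total, new_rows)

# For comparison only: the full rerun that the incremental approach avoids.
recomputed_total = full_total(original_rows + new_rows)
```

This works because a sum over the combined data equals the sum of the sums over its disjoint parts. Not every query decomposes this cleanly (a median, for instance, cannot be derived from two partial medians), so how much can be reused depends on the operation.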

Practical Uses of Data Algebra

Here's why this approach can be so powerful. All data can be described in algebraic terms. Data algebra can unify data management across different data structures. It can also improve computing performance and capacity. What else can it do? Some of the possibilities described in the book include spreadsheets that can pull in atypical types of data, better-performing Hadoop systems, faster data analytics-related processes, and more efficient search capabilities.
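One way to picture unifying different data structures is to express each record as a set of (attribute, value) pairs, so the same set operations apply regardless of where the record came from. The Python sketch below is an assumption made for illustration, not the formal construction from the book; the helper names and sample records are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Pair = Tuple[str, object]      # (attribute, value); values must be hashable
Record = FrozenSet[Pair]


def from_table_row(columns: List[str], values: List[object]) -> Record:
    # A relational-style row: column names zipped with cell values.
    return frozenset(zip(columns, values))


def from_document(doc: Dict[str, object]) -> Record:
    # A key-value or document-style record.
    return frozenset(doc.items())


row = from_table_row(["id", "name", "city"], [1, "Ada", "London"])
doc = from_document({"id": 1, "name": "Ada", "email": "ada@example.com"})

shared = row & doc         # pairs both sources agree on
merged = row | doc         # union of everything known about the record
only_in_doc = doc - row    # pairs the table row lacks
```

Once both sources share one representation, intersection, union, and difference behave the same way on either, which is the kind of uniformity the paragraph above describes.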

(Image: Geralt via Pixabay)

"We've been talking about gaming. All software applications -- data management applications, the Internet of Things, defense, security, every aspect of IT -- we could potentially play a role in, but that's too broad, which is why we have an IP strategy. We want to keep the math open source," Algebraix CEO Charlie Silver said in an interview.

The company plans to license its IP. Algebraix holds nine patents. The Algebraix platform is both a proof-of-concept and a commercial product. Algebraix is also planning to build a universal optimizer for Hadoop.

"Applying [data algebra] has changed the way I look at software development and design. Now I think about what's going on mathematically, I understand that, and I understand how I'm going to do that physically. It's made me look at what I'm doing in a more rigorous and precise way," said Rogers.

Working with data algebra has also shown Rogers that things that appear to be different are more similar than they seem. Although the details of the algebra described in the book are more complicated than what's presented here, fundamentally data algebra describes data using hierarchical sets in which the smaller set is included in the larger set: Specifically, a couplet represents a fundamental

Lisa Morgan is a freelance writer who covers big data and BI for InformationWeek. She has contributed articles, reports, and other types of content to various publications and sites ranging from SD Times to the Economist Intelligence Unit. Frequent areas of coverage include ...