New Storage Trends Promise to Help Enterprises Handle a Data Avalanche - InformationWeek


4/1/2021
07:00 AM
John Edwards

Data, data everywhere, but where to put it all? Here's a rundown of five current and potential fast and high-capacity storage approaches.

As enterprises continue to stockpile massive amounts of information generated by people, businesses, vehicles, and a virtually endless list of other sources, many are wondering where they can store all of that data accessibly, safely, securely, and cost effectively.

Image: zhu difeng - stock.adobe.com

The data storage business has changed significantly over the last five years, and that transformation is continuing and broadening. The big difference today is that while storage used to be about hardware-related issues, such as solid-state drives, faster read/write speeds, and capacity expansion, the cloud and other storage breakthroughs have shifted the market's focus to software.

"For most organizations, storage is more about software, including software-defined storage, software managing virtualization, and integrating AI and ML to improve storage optimization," said Scott Golden, a managing director in the enterprise data and analytics practice at global business and technology consulting firm Protiviti.

Here's a quick rundown of five promising storage technologies that can now, or at some point in the foreseeable future, help enterprises cope with growing data storage needs.

1. Data lakes

When it comes to handling and getting value from large data sets, most customers still start with data lakes, but they leverage cloud services and software solutions to get more value from their lakes, Golden said. "Data lakes, like Azure ADL and Amazon’s S3, provide the ability to gather large volumes of structured, semi-structured, and unstructured data and store them in Blobs (Binary Large OBjects) or Parquet files for easy retrieval."

Scott Golden, Protiviti

2. Data virtualization

Data virtualization allows users to query data across many systems without being forced to copy and replicate data. It can also simplify analytics and make them timelier and more accurate, since users are always querying the latest data at its source. "This means that the data only needs to be stored once, and different views of the data for transactions, analytics, etcetera, ... versus copying and restructuring the data for each use," explained David Linthicum, chief cloud strategy officer at business and technology advisor Deloitte Consulting.

Data virtualization has been around for some time, but with increasing data usage, complexity, and redundancy, the approach is gaining increasing traction. On the downside, data virtualization can be a performance drag if the abstractions, or data mappings, are too complex, requiring extra processing, Linthicum noted. There's also a longer learning curve for developers, often requiring more training.
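The idea of querying data where it lives, instead of copying it, can be illustrated with a small sketch. This is only a toy model, not how any commercial data virtualization product works: it uses Python's built-in sqlite3 module, and the database names (sales, crm), the tables, and the revenue_by_customer view are all hypothetical. Two separate in-memory databases stand in for two source systems, and a temporary view joins them without duplicating any rows.

```python
import sqlite3

# Two attached in-memory databases stand in for two separate source systems.
# All names here (sales, crm, orders, customers) are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS sales")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("CREATE TABLE sales.orders (customer_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE crm.customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO sales.orders VALUES (?, ?)",
    [(1, 120.0), (1, 80.0), (2, 45.5)],
)
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)

# The "virtual" layer: a TEMP view that spans both sources.
# It stores no rows of its own; each query reads the sources directly.
conn.execute(
    """
    CREATE TEMP VIEW revenue_by_customer AS
    SELECT c.name AS name, SUM(o.amount) AS revenue
    FROM crm.customers AS c
    JOIN sales.orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    """
)


def revenue():
    """Return {customer name: total revenue} as seen through the view."""
    return dict(conn.execute("SELECT name, revenue FROM revenue_by_customer"))


before = revenue()

# A new order written to the source is visible through the view immediately,
# because nothing was copied or restructured ahead of time.
conn.execute("INSERT INTO sales.orders VALUES (2, 50.0)")
after = revenue()
```

The join runs at query time on every call, which also hints at the performance cost Linthicum describes when the mappings become complex.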

David Linthicum, Deloitte Consulting

3. Hyper-converged storage

While not exactly a cutting-edge technology, hyper-converged storage is also being adopted by a growing number of organizations. The technology typically arrives as a component within a hyper-converged infrastructure in which storage is combined with computing and networking in a single system, explained Yan Huang, an assistant professor of business technologies at Carnegie Mellon University's Tepper School of Business.

Huang noted that hyper-converged storage streamlines and simplifies data storage, as well as the processing of the stored data. "It also allows independently scaling computing and storage capacity in a disaggregated way," she said. Another big plus is that enterprises can create a hyper-converged storage solution using the increasingly popular NVMe over Fabrics (NVMe-oF) network protocol. "Due to the pandemic, remote working became the new normal," Huang said. "As some organizations make part of their workforce remote permanently, hyper-converged storage is attractive because it is well-suited for remote work."

Yan Huang, Carnegie Mellon University

4. Computational storage

An early-stage technology, computational storage combines storage and processing, allowing applications to run directly on the storage media. "Computational storage embeds low-power CPUs and ASICs onto the SSD, lowering data access latency by removing the need to move data," said Nick Heudecker, senior director of strategy for technology services provider Cribl.

Computational storage can benefit virtually any data-intensive use case. Observability data sources, such as logs, metrics, traces, and events, dwarf other data sources in most companies, Heudecker noted. Currently, searching for and processing such data becomes a challenge, even at small volume levels. "It's easy to see applications for computational storage in observability, where complex searches are pushed directly to the SSD, lowering latency while also improving performance and carbon efficiency," he observed.
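A rough way to picture the benefit of pushing a search to the drive is to count how many bytes have to travel to the host in each case. The sketch below is a pure simulation under stated assumptions: SimulatedDrive, read_all, and search are hypothetical names invented for illustration and do not correspond to any real SSD or vendor API.

```python
class SimulatedDrive:
    """Toy stand-in for an SSD that counts the bytes it sends to the host.

    Hypothetical class for illustration only; not a real device interface.
    """

    def __init__(self, records):
        self._records = list(records)
        self.bytes_transferred = 0

    def read_all(self):
        # Conventional path: every record crosses the bus to the host.
        self.bytes_transferred += sum(len(r) for r in self._records)
        return list(self._records)

    def search(self, needle):
        # Computational-storage path: the filter runs "on the drive",
        # so only matching records are sent to the host.
        matches = [r for r in self._records if needle in r]
        self.bytes_transferred += sum(len(r) for r in matches)
        return matches


logs = [
    b"INFO start",
    b"ERROR disk full",
    b"INFO heartbeat",
    b"ERROR timeout",
]

# Host-side filtering: read everything, then search on the host CPU.
host_drive = SimulatedDrive(logs)
host_matches = [r for r in host_drive.read_all() if b"ERROR" in r]

# Device-side filtering: push the search down to the drive.
smart_drive = SimulatedDrive(logs)
device_matches = smart_drive.search(b"ERROR")
```

Both paths return the same records, but in this toy data set the conventional path moves 52 bytes while the pushed-down search moves 28. Real gains would depend on the selectivity of the query and on the hardware.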

The technology's main drawback is that applications must be rewritten to take advantage of the new model. "It will take time and, before that happens, the space has to mature," Heudecker said. Additionally, the technology is currently dominated by small startups, and standards haven’t emerged, making it difficult to move past early proofs of concept. "If organizations want to get involved, they can follow the work of the Storage Networking Industry Association’s Computational Storage Technical Working Group to monitor the development of standards," he suggested.

Nick Heudecker, Cribl

5. DNA data storage

Farthest out on the time horizon, yet a potentially game-changing technology, is DNA-based data storage. Synthetic DNA promises unprecedented data storage density. A single gram of DNA can store well over 200PB of data. And that data is durable. "When stored in appropriate conditions, DNA can easily last for 500 years," Heudecker stated.

In DNA data storage, digital bits (0s and 1s) are translated into nucleobase codes, then converted into synthetic DNA (no actual organic bits are used). The DNA is then stored. "If you need to replicate it, you can do this cheaply and easily with PCR (polymerase chain reaction), making millions of copies of data," Heudecker said. When it's time to read it back, existing sequencing technology converts the nucleobases back into 0s and 1s.
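The translation from bits to nucleobases and back can be sketched in a few lines. The mapping below, two bits per base (00 to A, 01 to C, 10 to G, 11 to T), is an assumption chosen for simplicity and is not taken from the article or from any vendor's scheme; practical encodings are expected to add error correction and other constraints.

```python
# Hypothetical 2-bits-per-base mapping, chosen only for illustration.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Translate bytes into a string of nucleobase letters."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Translate a nucleobase string back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")      # "CAGACGGC"
restored = decode(strand)   # b"Hi"
```

Each byte becomes four bases under this mapping, so the two-byte input yields an eight-base strand, and decoding reverses the process exactly.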

In the next step, enzymes are used to process the data in its DNA representation. "Just as computational storage takes the processing to the data, you can introduce enzymes into the DNA data, giving you massive processing parallelization over massive amounts of data," he noted. "The enzymes write new DNA strands as the result, which are then sequenced and converted back into digital data."

DNA data storage also offers the benefit of carbon efficiency. "Because these are all-natural biological processes, there is minimal carbon impact," Heudecker said. The technology's drawbacks, however, are significant. Creating enough synthetic DNA for a meaningful DNA drive is currently prohibitively expensive, but companies such as CATALOG are working on the problem, he noted.

Meanwhile, multiple firms looking to advance DNA storage technology, such as Microsoft, Illumina, and Twist Bioscience, are working hard to make it practical enough for routine use. "I forecast the earliest DNA drives will be available in a cloud delivery model within four years," Heudecker said.


John Edwards is a veteran business technology journalist. His work has appeared in The New York Times, The Washington Post, and numerous business and technology publications, including Computerworld, CFO Magazine, IBM Data Management Magazine, RFID Journal, and Electronic ...