Snowflake Launches Virtual Data Warehouses On AWS - InformationWeek


6/24/2015
11:09 AM

Snowflake Launches Virtual Data Warehouses On AWS

Snowflake, led by former Microsoft exec Bob Muglia, offers a virtual data warehouse system on Amazon's cloud. Pay for only what you use.


Snowflake Computing, led by Bob Muglia, former senior VP of Microsoft's Server & Tools Division, announced Tuesday that it has detached the data warehouse from its typical on-premises location and set it into the cloud.

When built to operate in the cloud, a data warehouse can take on big data-handling characteristics for both structured and unstructured data. It can take advantage of the cloud's elasticity for big analysis jobs, store data on the inexpensive cloud volumes, then shut itself down at the end of the day.

That, in a nutshell, describes some of the characteristics of the Snowflake Elastic Data Warehouse, designed to run on Amazon Web Services and potentially other cloud architectures. There are already data warehouses available in the cloud, but most of them, with the exception of Amazon's Redshift, IBM's BLU Acceleration for Cloud, and Microsoft Azure's Data Factory, were not designed to take advantage of all the cloud's characteristics.

(Image: merrymoonmary/iStockphoto)


Snowflake has written its own system to handle unstructured data on a massively parallel processing cluster that can be spun up in the cloud on demand. But don't call it another NoSQL system. Snowflake's engineering team has watched the NoSQL systems try to layer in SQL query capabilities, and concluded those systems haven't gotten as far in employing SQL as some of their early adopters hoped.

"Why not take a SQL database system and extend it to support NoSQL data? That's the contrarian element of what we've done," Muglia said in an interview.


"We built this on a very different architecture than a relational system or any of the Hadoop systems," such as Hortonworks or Cloudera, Muglia continued. It allows multiple data warehouse tasks to be processed at the same time, provided they involve mainly data reads, with few data writes, as most data warehouse tasks do. Scaling the system to do multiple simultaneous tasks is part of its design, he said.

The design doesn't let writes block reads, meaning an analytical process that's underway will complete using the data that it started with, even if some records in the underlying data change before the process finishes. In that respect, it resembles the NoSQL systems that practice eventual consistency instead of a relational system's strict ACID consistency. But if the data warehouse experiences no writes on the data in use, it functions as a typical relational system.
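The article does not say how Snowflake implements this behavior. As a rough illustration of the general idea only, the toy Python sketch below keeps every written version of a value and lets a reader pin the version that was current when its job started. The class and method names (VersionedTable, Snapshot, read_as_of) are hypothetical, and the sketch is single-threaded, with no locking or durability.

```python
class VersionedTable:
    """Toy multi-version store: writers append versions, readers pin one.

    Illustrative only; this is not Snowflake's implementation.
    """

    def __init__(self):
        self._version = 0
        # key -> list of (version, value), appended in increasing version order
        self._history = {}

    def write(self, key, value):
        # A write creates a new version and never overwrites an old one.
        self._version += 1
        self._history.setdefault(key, []).append((self._version, value))

    def snapshot(self):
        # A reader remembers the version that is current when it starts.
        return Snapshot(self, self._version)

    def read_as_of(self, key, version):
        # Return the latest value written at or before the given version.
        result = None
        for written_at, value in self._history.get(key, []):
            if written_at > version:
                break
            result = value
        return result


class Snapshot:
    """A read-only view of the table as of one pinned version."""

    def __init__(self, table, version):
        self._table = table
        self._version = version

    def read(self, key):
        return self._table.read_as_of(key, self._version)


table = VersionedTable()
table.write("orders", 100)
report = table.snapshot()      # the analytical job starts here
table.write("orders", 250)     # a later write does not disturb the job
print(report.read("orders"))            # 100
print(table.snapshot().read("orders"))  # 250
```

The running job keeps seeing the value it started with, while a job started after the write sees the new value.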

Snowflake isn't a data warehouse fixed at big data dimensions or at routine enterprise data dimensions. Rather, it's a virtual data warehouse that will be sized to match the job sent to it. When the analytical tasks are finished, the warehouse shuts itself off to save overhead. "In other cloud data warehouses, you would have to unload the data to turn it off and then reload it [to use it again]," Muglia said. Snowflake avoids that data movement task.

Although Snowflake runs on AWS at its US West facility in Oregon, customers may use Snowflake without an AWS account. They also don't need to understand the ins and outs of Amazon virtual machine selection. Customers deal with a service layer provided by Snowflake and create a virtual data warehouse when they wish to load their data. "They don't see AWS," Muglia noted.

Customers with their own AWS accounts may use them to load their data directly into S3, and Snowflake will copy it into a virtual data warehouse for them. But most customers who turn to Snowflake will do so to avoid the data-handling and data-management tasks that accompany data warehouse use in the cloud. "We went to great lengths to remove the need for customer care and management of the data."

Hadoop can ingest massive amounts of machine data, and then sort and analyze it to produce data in a more structured form. But Hadoop clusters are expensive to set up and operate, claimed Jon Bock, Snowflake's VP of product, in an interview. Snowflake can recognize and assemble metadata on machine data, saving it in a "schema-less way," he said. "We manage the metadata updates and tuning," he added.

The customer is then able to examine the data that he's most interested in by submitting a query, for example, against "a few hundred gigabytes of data in a 100-TB table. This scenario is a killer scenario," he claimed, made possible by Snowflake's cloud-based architecture.

Snowflake offers a virtual data warehouse at $2 per Snowflake credit, which amounts to one virtual CPU running for an hour. A 32-CPU double-extra-large virtual data warehouse running for an hour would cost $64.
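The pricing arithmetic above can be checked with a short Python sketch. It assumes only the figures quoted in the article ($2 per credit, one credit per virtual CPU per hour); the function name and parameters are illustrative and are not part of any Snowflake API.

```python
def warehouse_cost(virtual_cpus: int, hours: float,
                   price_per_credit: float = 2.0) -> float:
    """Dollar cost, assuming one credit = one virtual CPU running one hour."""
    credits = virtual_cpus * hours
    return credits * price_per_credit


# The article's example: a 32-CPU warehouse running for one hour.
print(warehouse_cost(32, 1))   # 64.0

# Hypothetical comparison: running only during an 8-hour workday
# versus leaving the same warehouse on for a full 24 hours.
print(warehouse_cost(32, 8))   # 512.0
print(warehouse_cost(32, 24))  # 1536.0
```

Under this model the cost scales linearly with running time, which is why a warehouse that shuts itself off when its tasks finish costs less than one left running.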

Snowflake is trying to set a new category, a cloud-native SQL system extended into unstructured data use. Data warehouse and NoSQL system choices already abound in the cloud, and the competition will be keen. Snowflake came out of stealth last October and now has perhaps 12 months to get more than just a foot in the door before the choices offered by the NoSQL, Hadoop, and traditional data warehouse systems operating in the cloud prove overwhelming.

Muglia brings impressive marketing and management credentials to the challenge. But over the next year, Snowflake's staff of 75 people in San Mateo, Calif., will have their work cut out for them. They will have to persuade enterprise skeptics that the category exists, has the legs to endure, and can save customers pain and money as they pursue their analytics goals.

Charles Babcock is an editor-at-large for InformationWeek and author of Management Strategies for the Cloud Revolution, a McGraw-Hill book. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive ...

asksqn,
User Rank: Ninja
6/29/2015 | 7:30:07 PM
IP in the Sticky Cloud
I'm curious what's in the ToS regarding Intellectual Property that's used in Snowflake's Cloud - once the client switches off the cloud, e.g. overnight, is it secured or can Snowflake terminate the account for nonpayment (if that is the case) and keep the data for itself?

 

 
jonbock,
User Rank: Apprentice
7/16/2015 | 8:53:25 PM
Re: IP in the Sticky Cloud
Important question. Our terms are straightforward on this--customer data is the customers' IP and customer retains all rights to it.  We do make it easy for customers to export data whenever needed, and we always encrypt the data a customer stores in Snowflake.