In a column a few days ago, I questioned the value of infrastructure-as-a-service offerings based on their lack of adherence to Moore's Law. My thesis: While CPU performance and drive storage capacity continue to climb at exponential rates, IaaS vendors aren't passing the implied cost savings back to their customers. I received two sorts of responses to that column: some readers were thankful for the oversimplified example I provided, and others wanted more concrete numbers applied to real systems.
I took some time to do a back-of-the-napkin calculation for storage, and I'll share my results here. Before jumping into the numbers, however, it's important to know that it's pretty much impossible to do an apples-to-apples comparison between 2006 IaaS prices (the year Amazon first offered EC2 and S3) and 2012 prices. Sure, for storage systems you can compare drive capacity, but that's not the full story. An iSCSI drive array in 2006 would typically come with two to four Gigabit Ethernet adapters, while today you'll get a few 10-Gbps Ethernet adapters. You'll also get six years of advances in firmware and software. So let me say right from the start: This not only isn't an apples-to-apples comparison, but you probably don't want one.
What we want to understand are the relative improvements in cost, performance, and reliability that you got from IaaS vendors over six years compared to the improvements you'd get from buying systems the old-fashioned way and running them yourself. For no other reason than convenience, I chose to compare storage prices. I was able to find some good historical data that I think makes for a compelling comparison. I decided to compare Amazon's S3 prices from 2006 until now with the prices of a hard drive and an actual storage array over the same period.
I threw in the storage array because while the price of a hard drive is obviously going to change radically over six years, the price of other storage system components won't change that much. Power supplies and other hardware don't adhere to Moore's Law, and certainly there are significant costs in developing firmware and software for drive arrays that also don't drop exponentially. So it would be fair to expect that drive prices would change the most, followed by array prices, followed by the price of the Amazon offering, which must take into account other overhead required to run the storage system. The relative magnitudes of the differences are what's important and telling, and that's what we want to understand.
Since quantity makes a difference, we'll assume that we're looking at storing 50 terabytes of data, and that we'll look at the total cost over four years. This is back-of-the-napkin; we know there are lots of costs I'm not including in that four-year number, including amortization, failed drives, additional hardware requirements, maintenance contracts, and the time value of money. A more detailed analysis is critical for a buying decision, but I think we can illustrate some fundamentals without hauling out a spreadsheet (if anyone wants to do that, please do, and I'll post it and give you credit for the work).
Once you find the historical data, both the Amazon and raw disk calculations are pretty easy to do. For Amazon, the S3 2006 price was $0.15 per gigabyte per month, so the total cost for 50 TB for four years--assuming a contract with no clause for reduced price over its term--is $360,000. This year, the Amazon price is down to $0.108 per gigabyte per month, so a similar four-year contract for 50 TB would now be $259,200. So it cost 39% more in 2006 to store the 50 TB than it does now. Note that we haven't calculated any fees for using data or retrieving it--just for storing it. We'll get to other fees later. Nonetheless, Amazon is lowering prices, which seems like a good thing.
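For readers who want to check the storage-only arithmetic, here is a minimal Python sketch. It assumes decimal terabytes (1 TB = 1,000 GB), which is what the totals above imply, and a flat price over the full 48 months; the function and variable names are mine, for illustration only.

```python
# Back-of-the-napkin S3 storage cost, using the per-gigabyte prices quoted above.
GB_PER_TB = 1000  # assumption: decimal terabytes, which matches the column's totals


def storage_cost(price_per_gb_month, terabytes=50, months=48):
    """Storage-only cost at a flat monthly price for the whole term."""
    return price_per_gb_month * terabytes * GB_PER_TB * months


cost_2006 = storage_cost(0.15)
cost_2012 = storage_cost(0.108)
# How much more the 2006 contract cost, as a percentage of the 2012 contract.
premium_pct = (cost_2006 / cost_2012 - 1) * 100

print(f"2006 contract: ${cost_2006:,.0f}")
print(f"2012 contract: ${cost_2012:,.0f}")
print(f"2006 cost over 2012: {premium_pct:.0f}% more")
```

Changing the `terabytes` or `months` arguments rescales both contracts equally, so the percentage gap depends only on the two per-gigabyte prices.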
Raw hard drive costs certainly have dropped radically. In 2006, a Seagate Barracuda 7200 RPM 500-GB drive would run you about $300. For 50 TB, you'd need 100 of them, so $30,000. Today, a 2-TB Seagate Barracuda costs $120. You'll need 25 of them to get you to 50 TB, so that's $3,000.
Note that you're just buying the raw capacity here. If you opt for RAID 10, you'll need double the number of drives, while for RAID 6 you'll need 25% more. The price change over the six years remains the same though. As these numbers show, the 2006 price is 10 times the current price, or 900% more. Surprising, right? Is it reasonable that Amazon is passing along only 39 points of that 900-point cost reduction?
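The drive math, including the RAID overheads mentioned above, can be sketched the same way. This is an illustration under stated assumptions: RAID overhead is modeled as a simple multiplier on raw capacity (25% extra for RAID 6, double for RAID 10, as in the text), drives are bought in whole units, and the helper name `drive_cost` is hypothetical.

```python
import math


def drive_cost(target_tb, drive_tb, price_per_drive, overhead=0.0):
    """Return (drive count, total price) for a target capacity.

    overhead is the extra raw capacity needed for RAID, as a fraction
    (0.25 for the column's RAID 6 figure, 1.0 for RAID 10).
    """
    drives = math.ceil(target_tb * (1 + overhead) / drive_tb)
    return drives, drives * price_per_drive


scenarios = [("raw", 0.0), ("RAID 6", 0.25), ("RAID 10", 1.0)]
for label, overhead in scenarios:
    n_2006, cost_2006 = drive_cost(50, 0.5, 300, overhead)  # 500-GB drives at $300
    n_2012, cost_2012 = drive_cost(50, 2.0, 120, overhead)  # 2-TB drives at $120
    ratio = cost_2006 / cost_2012
    print(f"{label:8s} 2006: {n_2006:3d} drives ${cost_2006:,}  "
          f"2012: {n_2012:3d} drives ${cost_2012:,}  ratio {ratio:.1f}x")
```

Because drives come in whole units, the RAID 6 case rounds up to 32 of the 2-TB drives, which nudges that ratio slightly below 10; the raw and RAID 10 cases come out at exactly 10.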
Let's see how the drive array pricing works out.
First, this exercise is far more subjective and data for it is harder to find. Here's what I found: In 2006, EqualLogic released its PS3000X line. Among other things, it was the first EqualLogic product to use serial-attached SCSI drives. Loaded with 16 300-GB drives spinning at 10,000 RPM, the system had a maximum configuration of 4.8 TB. However, this is raw capacity. After applying RAID 6, the usable capacity is 3.5 TB. The only reference I could find to its price was from the very reliable Register, so after a quick conversion from 2006 British pounds to U.S. dollars, that configuration works out to $76,900. You need about 14.4 of these to get you to 50 TB, so the cost is $1,110,000. So at least in 2006, Amazon's service was looking like a pretty good deal for four years. After all, Amazon Web Services is doing backups and guaranteeing 99.9% availability.
There was another configuration of this system, which used 750-GB SATA drives and cost considerably less than the version with 300-GB SAS drives. It runs a bit over $740,000 for 50 TB. That system too looks pretty expensive compared to the Amazon offering (but remember, we've just stored data there--we haven't done anything with it).
Looking at EqualLogic's current systems, we could choose either the 6510 series or the 6500 series. We'll opt for performance and choose the PS6510X, which uses 600-GB SAS drives. The RAID 50 capacity of a fully configured system is 21.7 TB, with a list price of $123,000. We'll need about 2.3 of these systems, for a total of $283,400. If we opt for the 2-TB SATA drives instead, the total cost drops to about $103,200. So the 2006 high-performance system cost about 3.9 times as much as a present-day system, or roughly 292% more. The 2006 lower-performance option cost about 7.2 times as much as its present-day counterpart, or roughly 617% more.
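A short sketch makes the array comparison easier to line up against the Amazon number. It takes the 50-TB totals quoted above as inputs rather than recomputing them, and it reports both the ratio and the percent-more figure, since the two are easy to conflate. The dictionary layout and names are mine.

```python
# 50-TB totals quoted in the text, in dollars: (2006 price, 2012 price).
totals = {
    "SAS array": (1_110_000, 283_400),
    "SATA array": (740_000, 103_200),
    "S3 storage only": (360_000, 259_200),
}


def compare(old, new):
    """Return (old/new ratio, percent by which old exceeds new)."""
    ratio = old / new
    return ratio, (ratio - 1) * 100


results = {name: compare(old, new) for name, (old, new) in totals.items()}
for name, (ratio, pct_more) in results.items():
    print(f"{name:16s} 2006 was {ratio:.1f}x the 2012 price ({pct_more:.0f}% more)")
```

Keeping all three rows in the same units ("percent more") is what allows the array price drops to be compared directly with Amazon's 39%.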
Measured that way, Amazon's price drop trails the drop in storage array prices by roughly an order of magnitude. But wait, you say--that EqualLogic system looked very expensive in 2006 and still doesn't look all that cheap compared to the current Amazon price. Who cares if Amazon isn't matching an exponential drop in price if the cost is still comparable and IT loses the management headache?
Here's the catch: Amazon is pretty reasonable about the cost of writing and storing data, but when you start using its service, there are additional costs. If after processing (which also rings the cash register) you need to read a fraction of that data back, that's a new significant cost. Reading 10 TB a month will cost you another $1,200 per month in today's prices. That's another $57,600 over four years. During processing, you'll read from and write back to the 50-TB store millions of times each month. Let's estimate a million reads and writes a month--that adds another $10,000 a month to the bill, or $480,000 over four years. Over the four years of using your 50-TB data store in Amazon's cloud, your S3 bill alone could easily be in the million-dollar neighborhood.
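Adding the pieces up shows where the "million-dollar neighborhood" comes from. The sketch below simply sums the figures given above; the monthly request charge is taken as-is from my estimate in this column, and your actual request and transfer charges will depend on the provider's price list and your workload. Variable names are illustrative.

```python
MONTHS = 48  # four-year term

storage_total = 259_200     # storage-only cost for 50 TB at today's price
transfer_monthly = 1_200    # reading 10 TB back each month
requests_monthly = 10_000   # the column's estimate for read/write requests

transfer_total = transfer_monthly * MONTHS
requests_total = requests_monthly * MONTHS
s3_total = storage_total + transfer_total + requests_total

print(f"Storage:  ${storage_total:,}")
print(f"Transfer: ${transfer_total:,}")
print(f"Requests: ${requests_total:,}")
print(f"Total:    ${s3_total:,}")
```

The request line dominates the total, so it is the first number to replace with your own measurements before drawing conclusions.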
In 2006, an analysis of using AWS S3 storage versus buying and using your own iSCSI system over four years would have shown that either would cost about $1 million. Given that you lose the management headache, the AWS proposition might look pretty good. But if you do the same calculation today, AWS costs a bit less than $1 million, while the iSCSI system cost is down in the $100,000 range. AWS becomes a very hard cost to justify. And clearly, if Amazon doesn’t change its ways, its model will be dead in another six years.
Normally, the value proposition for cloud services must involve some sort of reduction in your own internal costs beyond simply not having to buy hardware. The idea is that you can reduce staff or repurpose staff to work on more strategic projects, or bring to your company a new app or capability that you wouldn't otherwise be able to provide.
With IaaS, that's almost certainly not the case. You still need to understand your application and its needs, from storage to networking to processing. You also must learn the new art of pay as you go. My calculations here leave off many important elements, partly because I didn't want to weigh down this column with minutiae, but mostly because the minutiae are unique to every company. Only you know how many people it'll take to run an app in the cloud versus in your data center, or what your labor costs are, or what your capital costs are, or what the risk factor should be for running your app on someone else's hardware instead of your own.
Here's my bottom line: The value proposition of IaaS is fundamentally flawed, unless you need it for cloud bursting or a similar seldom-used application. In any case, you need a very sharp pencil to figure this stuff out, so making a buddy of someone in the business office wouldn't be the worst thing you could do in the evaluation.
The value proposition is vastly different for platform and software as a service, but no matter what you do, ask hard questions--like where the heck did my Moore's Law advantage go? Which is to say, why are IaaS prices dropping linearly, when hardware prices are dropping exponentially? It's the difference between percentages and orders of magnitude.
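One way to see the gap between percentages and orders of magnitude is to convert each six-year price change into an implied compound annual rate. This is a rough sketch using only the storage-only S3 prices and the raw drive prices from this column; it assumes a smooth compound decline between 2006 and 2012, which real price histories won't follow exactly, and the function name is hypothetical.

```python
def annual_change_pct(old_price, new_price, years=6):
    """Implied compound annual price change, in percent (negative = decline)."""
    return ((new_price / old_price) ** (1 / years) - 1) * 100


# S3 storage, dollars per GB per month: $0.15 in 2006, $0.108 in 2012.
s3_rate = annual_change_pct(0.15, 0.108)

# Raw drives, dollars per TB: $300 per 500 GB in 2006, $120 per 2 TB in 2012.
drive_rate = annual_change_pct(300 / 0.5, 120 / 2.0)

print(f"S3 storage price: {s3_rate:.1f}% per year")
print(f"Raw drive price per TB: {drive_rate:.1f}% per year")
```

On these inputs the S3 price falls by about 5% a year while the drive price per terabyte falls by about 32% a year.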
And above all, work the numbers for yourself. I'm willing to bet that for IaaS, it'll rarely make sense as a strategic IT resource.