The Department of Energy has disclosed plans to spend $32 million in American Recovery and Reinvestment Act money testing the feasibility of cloud computing as a "cost-effective and energy-efficient" approach to scientific computing. At that price, the experiment is already off to a bad start.
The project is called Magellan, and the DOE plans to deploy internal computing clouds at its Argonne and Lawrence Berkeley National Labs. Some of the applications it has in mind include protein structure analysis, power grid simulations, and image processing for materials structure analysis. Temporary, high-demand workloads like these are well suited to the cloud; the practice is known as "cloud bursting."
The DOE is absolutely right to explore cloud computing for data processing workloads that require hundreds or thousands of servers for a few hours or days at a time. The problem isn't with the cloud computing model or the applications; it's with the multimillion-dollar commitment that Energy is making to what, by its own description, is a "test bed."
One of the beauties of cloud computing--one of its selling points--is that you don't need to make a big upfront investment to get started. Cloud vendors talk of the advantages of OpEx (variable operating expense) over CapEx (fixed capital expense). Federal CIO Vivek Kundra has been trying to steer government agencies toward cloud computing for this very reason. "We need to get away from this model of investing heavily in infrastructure," Kundra said recently at the InformationWeek 500 conference.
Yet a big chunk of the DOE's Recovery Act windfall will be going toward fixed-asset computer hardware, including thousands of servers with Intel Nehalem microprocessors. In talking to the DOE, I learned that it plans to eventually have more than 50 teraflops of cloud computing capacity and in excess of 1 petabyte of disk storage at Berkeley's National Energy Research Scientific Computing Center. Software will include the open source Eucalyptus cloud platform and NERSC's Integrated Performance Monitoring tool.
Energy says it will also explore commercial offerings from Amazon, Google, and Microsoft. That's a better place to start. Amazon in particular has a number of case studies showing how its cloud services can be applied for high-performance computing. Eli Lilly, Harvard Medical School, and Pathwork Diagnostics are all using Amazon Web Services for R&D at, you can be sure, a fraction of what the Department of Energy plans to spend.
Energy's Magellan clouds will be operating early next year. The department's long-term strategy--developing high-powered internal clouds to be used as shared resources, and augmenting those with commercial cloud services--may ultimately prove viable. Indeed, NERSC plans to make Magellan's cloud storage available to science communities via so-called science gateways. This is one of the cloud's high-potential use cases: IT resources and hosted applications that can be shared by researchers, professionals, and companies with common interests.
What might the DOE do differently as it heads into the clouds? Start its experiment with smaller internal clouds, devote more early effort to commercial cloud services, run ROI calculations on both, and compare its experiences with those of other government agencies. Once all that's done, DOE can determine how many millions it wants to pour into building its own cloud infrastructure.
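To make the ROI comparison concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes a deliberately simple cost model (upfront purchase plus yearly running costs for an owned cluster, a flat per-server-hour rate for a rented one), and every figure in it is a hypothetical placeholder, not a DOE budget number or any vendor's price. The function names are invented for illustration.

```python
# Back-of-the-envelope CapEx vs. OpEx comparison.
# All figures below are hypothetical placeholders, not DOE or vendor numbers.

def owned_cost(capex, annual_opex, years):
    """Total cost of an internal cluster: upfront purchase plus yearly running costs."""
    return capex + annual_opex * years


def rented_cost(hourly_rate, servers, hours_per_year, years):
    """Total cost of renting the same number of servers by the hour."""
    return hourly_rate * servers * hours_per_year * years


def break_even_hours(capex, annual_opex, hourly_rate, servers, years):
    """Hours of use per year at which renting costs exactly as much as owning."""
    return owned_cost(capex, annual_opex, years) / (hourly_rate * servers * years)


if __name__ == "__main__":
    # Hypothetical scenario: 500 servers over a 4-year hardware lifetime.
    capex, annual_opex = 1_000_000, 200_000
    hourly_rate, servers, years = 0.50, 500, 4

    hours = break_even_hours(capex, annual_opex, hourly_rate, servers, years)
    print(f"Owning costs ${owned_cost(capex, annual_opex, years):,.0f} over {years} years")
    print(f"Renting breaks even at {hours:,.0f} hours of use per year")
```

Under these made-up inputs, owning only pays off if the servers are busy for more than the break-even number of hours each year; bursty workloads that run for a few hours or days at a time would fall well below it. A real analysis would also need to account for staffing, power, data transfer, and utilization, none of which this sketch models.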
John Foley is editor of InformationWeek Government. Follow me on Twitter at @jfoley09 and let me know what you think about DOE's cloud plans in the comments field below or by e-mail.