VMware Cloud Foundry Suffers Service Outage - InformationWeek




As the beta development platform was recovering from a minor power supply problem, human error worsened the setback.

Assembling a distributed computing architecture in the cloud isn't easy to do. The more resources you try to bring together, the more that can go wrong. No, you're not hearing about Amazon's recent EC2 outage again. This time it's VMware.

VMware recently launched a development platform as a set of services at CloudFoundry.org, a new hosting service for developers. On April 25, Cloud Foundry experienced a service disruption. In trying to recover later that day, it suffered an outage that continued into April 26.

Not only is VMware finding it more difficult than anticipated to keep a cloud up and running, it's also sharing another experience with its much bigger and better-established fellow cloud supplier: It's refusing to talk about the mishap beyond what appears in an official, carefully worded blog post.

"The Cloud Foundry blog is the best resource for you, which details the account of the outage. VMware is continuing to keep that updated regularly to maintain transparency with the community," was the response from VMware to a request for more information Tuesday.

The blog post cited was written by Dekel Tankel, one of the primary builders and managers of CloudFoundry.org. In the post, published April 29, he said the trouble started at 6:11 a.m. on April 25, when a power supply in a storage cabinet failed. That deprived users of access to a single logical unit number (LUN), an identifier for a disk or set of disks, in Cloud Foundry. A power supply malfunction isn't an unexpected event. Clouds are designed to detect and survive lost power supplies, either by invoking a redundant source or by routing around the failure using a backup copy.

"While not a 'normal event,' it is something that can and will happen from time to time," he wrote, and VMware thought it was prepared for it.

But, Tankel continued, "In this case, our software, our monitoring systems, and our operational practices were not in synch." The loss of a LUN was an event "that we did not properly handle and the net result is that the Cloud Controller declared a loss of connectivity to a piece of storage that it needs in order to process many control operations."
