Amazon IDs Cause Of Data Center Outage - InformationWeek

12/14/2009
08:07 PM

Amazon IDs Cause Of Data Center Outage

The failure of two power components at a Virginia data center affected some EC2 operations on December 9th, Amazon Web Services says.

Amazon Web Services has attributed a 44-minute outage in part of its Northern Virginia data center last week to the failure of a power supply component in one "availability zone" in the data center, which was soon followed by the failure of a second component in the redundant system.

Users of the Amazon EC2 cloud with workloads in Amazon's Northern Virginia data center experienced problems early in the morning of December 9, with some operations in a part of the data center interrupted during a five-hour period.

Amazon started notifying customers of a problem at 4:08 a.m. Eastern. By 9:41 a.m., its Amazon Service Health Dashboard reported that "we have completed recovery of most instances affected by this event."

The postings first mentioned a connectivity issue, then acknowledged a power issue. In following up on the postings, InformationWeek asked Amazon whether the power issue was inside the data center or an issue with an external supplier.

Amazon spokesmen responded that "a single component of the redundant power distribution system failed in this zone. Prior to completing the repair of this unit, a second component, used to assure redundant power paths, failed as well, resulting in a portion of the servers in that availability zone losing power."

Forty-three minutes after the first notice, at 4:51 a.m. Eastern, Amazon Web Services posted messages such as: "The underlying power issue has been addressed, instances have begun to recover." A further message followed at 5:11 a.m.: "most affected instances and are operating normally."

The next day it added an explanation: "A single component of the redundant power distribution system failed in this zone. Prior to completing the repair of this unit, a second component, used to assure redundant power paths, failed as well, resulting in a portion of the servers in that availability zone losing power."

The actual outage, according to a monitoring service that gathers information by pinging traffic over the Internet and from the accounts it maintains inside Amazon facilities, ran from 3:34 a.m. to 4:19 a.m. Eastern.
