Amazon S3 Outage Casts Cloud on Cloud Computing
A massive outage raises questions about the reliability of storage-as-a-service
February 16, 2008
Amazon's S3 storage-as-a-service offering reportedly failed for about three hours this morning, leaving business customers worldwide without access to their stored data -- and with plenty of questions about the service.
On an S3 customer forum, users emailed frantically from about 5:00 a.m. Pacific time, asking for updates and information. One customer reported that it was the second major S3 outage they'd experienced in about a month. Several wrote of their concern over meeting requirements for their customers while the service was out. Many expressed fear of relying on Amazon S3 going forward.
At 9:09 a.m. Pacific time, an Amazon employee reported that the issue had been resolved, that the system was recovering, and that some customers might see elevated error rates for a while. She also stated that Amazon would notify customers about what had caused the problem as soon as possible.
This afternoon, an Amazon spokesperson forwarded the following email to Byte and Switch:
One of our three geographic locations was unreachable for approximately two hours and was back to operating at over 99% of normal performance before 7 a.m. pst. We've been operating this service for two years and we're proud of our uptime track record. Any amount of downtime is unacceptable and we won't be satisfied until it's perfect. We've been communicating with our customers all morning via our support forums and will be providing additional information as soon as we have it.
The event raises questions about the effectiveness of "storage as a service," particularly since Amazon S3 publishes the following claim on its Website: "Reliable: Store data durably, with 99.99% availability. There can be no single points of failure. All failures must be tolerated or repaired by the system without any downtime."
One high-profile user of the service says it's unrealistic to expect Amazon S3 to have 100 percent uptime. "We expect Amazon to have outages. No website I'm aware of doesn't, whether it's Google, Amazon, your bank, or SmugMug," wrote Don MacAskill, CEO of photo site SmugMug, in a blog post today. MacAskill says he had no problems with S3 today, even though he receives no out-of-the-ordinary or preferential protection. But he concedes his luck may be due to how he has designed his system to work with S3 services.
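For a sense of scale on the 99.99% availability figure quoted above, the short Python sketch below works out how much downtime that percentage would permit. It assumes the figure is measured over a full calendar year, which Amazon's published claim does not specify, and it uses the "approximately two hours" from Amazon's own email as the outage length. The function and variable names are illustrative only.

```python
# Assumption: availability is measured over one non-leap calendar year.
MINUTES_PER_YEAR = 365 * 24 * 60

def downtime_budget_minutes(availability_pct, period_minutes=MINUTES_PER_YEAR):
    """Return the minutes of downtime permitted at the given availability percentage."""
    return period_minutes * (1 - availability_pct / 100)

budget = downtime_budget_minutes(99.99)
outage_minutes = 2 * 60  # Amazon's own estimate: "approximately two hours"

print(f"Annual downtime budget at 99.99%: {budget:.1f} minutes")
print(f"This outage alone is {outage_minutes / budget:.1f} times that budget")
```

Under that yearly assumption, 99.99% availability leaves room for roughly 52.6 minutes of downtime, so even Amazon's lower two-hour estimate is more than double a full year's allowance.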
One thing: MacAskill stressed in his blog that "Amazon's communication about this has been terrible." He says he's asked the company repeatedly for an "Amazon Web Services Health page" that would allow customers to view network status.
Maybe he'll get one now.