Amazon S3 Explains February Outage

Amazon S3 comes clean on last month's widespread network failure

March 7, 2008


Just when it seemed cloud computing was catching a bit of mindshare, last month's Amazon S3 outage stirred up concerns about service reliability.

But Amazon S3 has an explanation for what happened that could help alleviate some of those worries.

File it under "Live and Learn": According to an Amazon S3 spokesperson, the troubles began when one of the service provider's locations started showing elevated levels of authenticated requests from multiple users. Up to then, Amazon S3 hadn't been monitoring those kinds of requests.

"Importantly, these cryptographic requests consume more resources per call than other request types," writes Amazon S3 spokeswoman Kay Kinton in an email to Byte and Switch. "Within a short amount of time, we began to see several other users significantly increase their volume of authenticated calls. The last of these pushed the authentication service over its maximum capacity before we could complete putting new capacity in place."

Because Amazon S3 handles account validation along with authentication requests, the load was just too much for the particular location, which ceased to function.

Amazon S3 is taking the following actions, Kinton says:

  • Improving monitoring of the proportion of authenticated requests

  • Increasing the capacity of its authentication service

  • Adding "additional defensive measures" for authenticated calls

  • Implementing a service health dashboard, which should be released shortly.
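
For readers wondering what monitoring "the proportion of authenticated requests" might look like in practice, here is a minimal sketch. It is not Amazon S3's implementation, which Kinton did not describe; the class name, window size, and alert threshold are all hypothetical and chosen only for illustration.

```python
from collections import deque


class AuthShareMonitor:
    """Tracks the share of authenticated requests among the most recent ones.

    Hypothetical illustration only: the window size and alert threshold
    are assumptions, not values reported by Amazon S3.
    """

    def __init__(self, window=1000, alert_share=0.5):
        # Each entry is True for an authenticated request, False otherwise.
        # deque(maxlen=...) silently drops the oldest entry once full.
        self.recent = deque(maxlen=window)
        self.alert_share = alert_share

    def record(self, authenticated):
        """Record one incoming request."""
        self.recent.append(bool(authenticated))

    def share(self):
        """Return the fraction of recent requests that were authenticated."""
        if not self.recent:
            return 0.0
        return sum(self.recent) / len(self.recent)

    def should_alert(self):
        """Return True once the authenticated share reaches the threshold."""
        return self.share() >= self.alert_share
```

An operator could call record() for every request and poll should_alert() to get early warning that the more expensive authenticated calls are crowding out other traffic, ideally before the authentication service reaches capacity.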

When Amazon S3 refers to capacity, it isn't talking about storage. "We added additional request processing capacity for the authentication services," Kinton clarifies. She declines to talk further about Amazon S3's storage setup, except to say that it's growing.

At least the service provider has come clean on its role in the outage. Further, the addition of a dashboard should please existing customers, including Don MacAskill, CEO of high-profile Amazon S3 customer SmugMug, who complained about the lack of such a feature in a blog post following the outage.

We'll be tracking the anticipated release.
