Amazon Web Services' Simple Storage Service (S3) suffered a service failure for about eight hours on Sunday, causing outages at online companies that depend on S3 for file storage.
S3 suffered a similar outage lasting about two hours in February, an incident that led many to question the dependability of the increasingly fashionable cloud computing model.
Amazon said in a statement that it is proud of S3's operational performance over the past two-plus years and that customers generally have been pleased. "But any downtime is unacceptable and we won’t be satisfied until it is perfect," Amazon's statement said.
"As a distributed system, the different components of S3 need to be aware of the state of each other," Amazon said in its statement. "For example, this awareness makes it possible for the system to decide which redundant physical storage server to route a request to. We experienced a problem with those internal system communications, leaving the components unable to interact properly, and customers unable to successfully process requests. After exploring several alternatives, the team determined it had to take the service offline to restore proper communication and then bring service online again. These are sophisticated systems and it generally takes a while to get to root cause in such a situation -- we will be providing our customers with more information when we've fully investigated the incident."
Since S3 was launched in March 2006, a variety of companies have outsourced at least some of their storage infrastructure to AWS, including 37signals, YouOS, SmugMug, ElephantDrive, and Jungle Disk.
Don MacAskill, CEO of SmugMug, which uses S3 to store its customers' photos, was quick to defend AWS. "Amazon's S3 service, SmugMug's primary storage provider, is currently experiencing problems," he wrote in a blog post on Sunday. "As a result, a large portion of the photos and videos stored on SmugMug are currently offline. Historically, Amazon has been very stable. We've seen three of these in our entire [two-plus year] history with Amazon, including this one. I expect, like the last two, that service will be restored shortly."
MacAskill stressed that his faith in AWS hasn't been shaken, and that such outages are "few and far between, short, and handled properly."
MacAskill has been a strong supporter of AWS since its inception, which may explain why Amazon has pointed to SmugMug as a customer reference over the past two years and continues to feature the company on one of its Customer Case Studies pages.
According to an Amazon official, there's no marketing relationship or quid pro quo between AWS and SmugMug.