Of course, even the best-laid plans can be waylaid by unforeseen consequences, which is likely what happened to Amazon. I don't believe the company would have designed in such a catastrophic failure point on purpose.
What are the options? Stay out of cloud computing? Maybe. Cloud computing, with its automated management plane, is still young. A lot of smart thinking has gone into the operation and management of big, automated computing services, but there is more to come. Cloud computing is still cutting edge and needs time to mature. Of course, I get a chuckle from the idea that cloud availability issues can be solved by using multiple cloud providers via an automated method.
While I haven't given that idea a great deal of thought, I'd need to see some serious evidence before I start to believe it. Adding more clouds doesn't necessarily add availability. When you build a computing system from products that rely on each other, each offering five nines of reliability, the statistical uptime of the whole is lower than that of any single product, because reliability is multiplicative, not additive. Two such products chained together give you roughly 99.998 percent, not 99.999. What you need to do, if you are planning on using cloud services, is to examine the applications you want to put in the cloud and consider how they can be redesigned for resilience. Your application--as a system that includes hardware, software, services, etc.--has to be designed to recover from failure. As George Reese, CTO of Enstratus, said on Twitter, "When you put the responsibility for availability on software, your hardware options increase and your costs go down. And, ultimately, you get greater availability."
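To put numbers on the multiplicative point about dependent components, here is a minimal Python sketch. It assumes each component fails independently and that the system is up only when every component is up. The function names and the component counts are illustrative choices of mine, not figures from any provider.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def serial_availability(availabilities):
    """Availability of a chain in which every component must be up.

    Assumes failures are independent, so the individual
    availabilities multiply.
    """
    return prod(availabilities)


def downtime_minutes_per_year(availability):
    """Expected minutes of downtime per year at a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


five_nines = 0.99999  # "five nines" expressed as a fraction

# Hypothetical chains of 1, 2, and 5 mutually dependent components.
for n in (1, 2, 5):
    a = serial_availability([five_nines] * n)
    print(f"{n} dependent component(s): {a:.7f} available, "
          f"about {downtime_minutes_per_year(a):.1f} min/year of downtime")
```

Under those assumptions, a single five-nines component allows a little over five minutes of downtime a year, and chaining two of them roughly doubles that. Real outages are rarely independent, so treat the output as a rough bound, not a forecast.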