Though the mechanics behind many cloud outages are eventually revealed, some of the issues might recur because of tradeoffs made by providers for the sake of cost and profitability.
The majority of cloud outages boil down to software updates or configuration changes gone wrong, says Kurt Seifried, chief blockchain officer and director of special projects with the Cloud Security Alliance. He and other experts see the cloud growing increasingly complex with new features rolled out to meet demand and expectations for innovation, yet the drive to release updates can lead to some corners being cut. “Ultimately, that’s a human failure in that they should have tested it out more,” Seifried says, though he acknowledges that when changes are made to a major system, at some point testing must stop and the updates must be deployed.
Knowing the Problem Does Not Always Fix the Problem
Seifried says that although the major problems that lead to outages are relatively well known, the ubiquity and necessity of the cloud for modern commerce mean customers have little choice but to go along with the practices of current providers. “Most businesses make the tradeoff because what are customers going to do? Leave? That’s part of the problem,” Seifried says. “The cost of these outages is largely externalized.”
In early July, Rogers Communications suffered an outage that lasted some 19 hours and affected commerce, including banking and other vital services. Rogers, which has some 2.25 million retail internet customers and more than 10 million wireless customers, initially offered its customers an automatic credit equivalent to five days of service fees. More recently, the company announced it would spend US$7.74 billion over the next three years to bolster testing and leverage AI to avoid future outages.
Read the rest of this article on InformationWeek.