7 Leading Causes of Cloud Outages
Human error, Mother Nature, power cuts, geopolitics, and more are among the leading causes of cloud outages that represent a growing network resilience challenge.
December 25, 2023
![7 Leading Causes of Cloud Outages](https://eu-images.contentstack.com/v3/assets/bltde8121fc52c5c8f3/blt68af37ab405d6223/66052be3f2ac89586189903d/1-cloud-outage-2RTR5TK.jpg?width=700&auto=webp&quality=80&disable=upscale)
Cloud outages can have severe financial and reputational impacts on a business, with one hour of downtime costing an average of $365,000. As businesses shift more operations and sensitive data onto the cloud, the repercussions of even a few hours of downtime become more substantial. The rate at which cloud outages happen has increased every year, even as cloud service providers spend billions to ensure over 99 percent uptime.
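To put those two figures side by side, the arithmetic can be sketched in a few lines of Python. This is a rough illustration only: it assumes the $365,000 hourly average applies uniformly to every hour of downtime and that a year has 8,760 hours, and the function names are invented for the example.

```python
HOURS_PER_YEAR = 24 * 365      # 8,760 hours, ignoring leap years
COST_PER_HOUR_USD = 365_000    # average hourly downtime cost quoted in this article


def annual_downtime_hours(uptime_percent: float) -> float:
    """Hours per year a service can be down while still meeting the uptime target."""
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


def annual_downtime_cost(uptime_percent: float) -> float:
    """Estimated yearly cost of that downtime at the assumed flat hourly rate."""
    return annual_downtime_hours(uptime_percent) * COST_PER_HOUR_USD


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        hours = annual_downtime_hours(target)
        cost = annual_downtime_cost(target)
        print(f"{target}% uptime: {hours:.2f} hours/year down, about ${cost:,.0f}")
```

Under these assumptions, a 99 percent uptime target still leaves room for about 87.6 hours of downtime a year, or roughly $32 million at the quoted average, which is why providers chase each additional "nine."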
Cloud service providers are facing threats from every angle. While natural disasters and cybercrime receive the most news attention, internal issues account for the lion's share of actual cloud outages, with human error responsible for up to half of all system failures. That said, the leading cloud service providers are aiming to reduce the chance of any external threat penetrating the data center with new state-of-the-art facilities that have closed-off electricity supplies and advanced cooling systems.
Learn more: What’s Causing Cloud Outages? A Network Managers’ Guide
Configuration Errors
Configuration errors are the leading cause of cloud outages, according to the Uptime Institute. The problem is magnified because many foundational cloud tasks, such as deploying a new server or router, still have to be entered manually through command-line interfaces. Without the safety net that text editors and other tools offer, engineers inevitably make mistakes.
Configuration errors happen to businesses of all shapes and sizes. Even with the most airtight fail-safe setup, unforeseen errors will occur during the deployment of new infrastructure. Facebook suffered a major outage in October 2021, lasting about six hours, which was caused by a BGP configuration change.
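One common safety net is to check a configuration change automatically before it is applied. The Python sketch below illustrates the idea only; the configuration schema, key names, and `validate_router_config` function are hypothetical, and it does not represent how Facebook or any provider validates changes.

```python
import ipaddress

# Hypothetical schema: keys a router change request must contain.
REQUIRED_KEYS = {"hostname", "asn", "advertised_prefixes"}


def validate_router_config(config: dict) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []

    for key in sorted(REQUIRED_KEYS - config.keys()):
        problems.append(f"missing required key: {key}")

    # Autonomous system numbers are 32-bit values; 0 is reserved.
    asn = config.get("asn")
    if asn is not None and not (isinstance(asn, int) and 1 <= asn <= 4294967295):
        problems.append(f"invalid ASN: {asn!r}")

    prefixes = config.get("advertised_prefixes", [])
    if "advertised_prefixes" in config and not prefixes:
        problems.append("no prefixes advertised; applying this would announce no routes")
    for prefix in prefixes:
        try:
            # Strict parsing rejects typos such as host bits set in a network prefix.
            ipaddress.ip_network(prefix)
        except ValueError as exc:
            problems.append(f"invalid prefix {prefix!r}: {exc}")

    return problems


if __name__ == "__main__":
    change = {"hostname": "edge-1", "asn": 64512, "advertised_prefixes": ["192.0.2.1/24"]}
    for problem in validate_router_config(change):
        print("REJECTED:", problem)
```

A check like this catches only mechanical slips, such as a mistyped prefix or a missing field. It cannot tell whether a syntactically valid change is the right one, so it complements staged rollouts and peer review rather than replacing them.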
Cyber Attacks (DDoS)
From the unforeseen to the deliberate: cyber attacks have plagued the internet for decades. At the cloud level, they mainly take the form of distributed denial-of-service (DDoS) attacks, which use multiple systems to flood a network with traffic until it becomes unavailable and potentially penetrable.
In the age of cloud computing, DDoS attacks have increased in size and sophistication, with Netscout Systems reporting a 33 percent rise in attacks in the first half of 2023 due to global events. Google Cloud recently mitigated the largest DDoS attack on record, which peaked at 398 million requests per second, more than seven times larger than the previous record.
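Defenses against request floods typically build on some form of rate limiting. As a toy illustration, and not a description of how Google Cloud or any other provider mitigates attacks, the Python sketch below implements a token bucket: each client may burst up to a fixed capacity, after which requests are refused until tokens refill at a steady rate. The class name and parameters are assumptions made for the example, and timestamps are passed in explicitly so that the behavior is easy to follow.

```python
class TokenBucket:
    """Allow short bursts up to `capacity`, then a steady `rate_per_sec` of requests."""

    def __init__(self, rate_per_sec: float, capacity: float):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity   # start full so normal traffic is not penalized
        self.last = 0.0          # timestamp of the previous request, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` should be served."""
        elapsed = max(0.0, now - self.last)
        # Refill tokens for the time that has passed, without exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate_per_sec=5, capacity=10)
    burst = [bucket.allow(now=0.0) for _ in range(15)]
    print(f"served {sum(burst)} of {len(burst)} requests in the initial burst")
```

A single limiter like this protects one service from one noisy client. Attacks at the scale described above are spread across many sources, which is why they are usually absorbed at the provider's network edge rather than in application code.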
Power Outages
Cloud service providers are building state-of-the-art data centers that have multiple ways of generating and storing electricity, innovative solutions for cooling down servers, and a whole host of other strategies to prevent server downtime. Preventing downtime is a growing priority as more businesses move sensitive data and operations to the cloud.
However, almost all of the data centers currently in operation are still powered by external sources, which can be taken out by natural disasters or suffer from grid overload. High temperatures or cooling faults can lead to overheating. Most data centers have fail-safes to ensure that minimal data is lost during a power outage.
Human Error
In many instances, what is perceived to be malicious turns out to be simple human error, which happens at businesses of every size. According to a survey by the Cloud Security Alliance, up to 50 percent of all system failures are due to human error and mismanagement.
Alongside configuration errors, cloud outages can happen during deployment or maintenance, when there is an elevated amount of activity on the system. To counter this, businesses are using automation and machine learning to reduce the risk of human error.
Mother Nature
Cloud service providers are more conscious of natural disasters when planning new facilities, building them in areas outside the reach of volcanoes, hurricanes, and earthquakes. That said, while a data center can be placed in a safe zone, the infrastructure connecting data centers to the internet is more precarious, with 1.5 million kilometers of undersea cables powering the entire internet.
The fragility of this network was highlighted by the Tonga volcanic eruption in January 2022. In the aftermath, the undersea internet cable connecting Tonga to Fiji was damaged, leaving the island largely cut off from the rest of the world for several weeks until the cable was repaired. While the introduction of satellite internet through Starlink and others could prevent total blackouts in the future, it is not suitable for larger operations.
Geopolitics
Beyond Mother Nature, another threat to undersea internet cables comes from bad actors, whether foreign governments or terrorists. The EU published an in-depth analysis of the security of its undersea cables, arguing that the bloc needed to invest further in the security and stability of the network. This was pushed forward following naval activity by Russia in 2014 and 2022; in the first instance, France accused Russia of sabotage.
Undersea cables are not the only way governments can restrict internet access. The necessity for cloud data centers to be as large as possible makes it easier for governments to censor traffic and minimize the likelihood of workarounds being used to circumvent bans on specific websites or apps.
Miscellaneous Causes
In unstable areas such as the Middle East and North Africa, the lack of clear information on why an outage has happened, who is at fault, and when it will be fixed can lead to long delays. On top of this, a country that has no mitigation measures in place when an undersea internet cable is damaged can be left without proper internet access for weeks.