The ability to restore data in minutes is moot if your server room's air conditioning system goes down for days. Is your cooling system up to the task? Do you have an emergency plan if it isn't? Have you tested that plan lately? Here are some cool ways to protect servers from heatstroke.
Most comedy is tragedy that happens to someone else. Indeed, we take comfort from sharing our woes with others for their amusement and instruction. This anonymous tale of serial tragedy is a good example:
An IT manager carried a phone everywhere, secure in the knowledge that "intelligent" monitors would alert him to any server problems via email. But following an unusually hot weekend, he was greeted by server room temperatures of 113°F (45°C). No bits were moving, naturally. Things went from bad to absurd during the next several days.
The "redundant" air conditioning unit shut down when its twin's power supply overheated and died. The UPS on the email alert system lacked such overload protection, so it fried. Water vapor condensed all over the server room floor. The functional AC unit was restarted and a repairman was called for the deceased unit, but the AC engineer could not arrive in less than three days. The one working unit got temperatures down to tolerable levels, but only until nightfall.
When the office building's AC system went off for the night, the lone server room cooler overloaded and shut down again. A junior AC engineer arrived the next day, only to learn that access to the rooftop unit required an hour of safety training, a "method statement" from the tenant ("Exactly what are you proposing to do on our roof?"), and 24 hours' notice. Portable coolers were rented, their thick vent hoses poking through the open server room door. Those, along with the one working unit, kept the servers running intermittently.