Recently I participated in a webinar where we discussed ways to improve your ability to recover data and your confidence in the recovery process. One survey question asked what amount of downtime was acceptable for critical servers in the environment. All of the respondents (100%) indicated that anything over a day was unacceptable. More surprisingly, one third indicated that zero downtime was the only acceptable standard.
For many legacy backup processes, recovering the data for large mission-critical applications in less than a day is challenge enough. Zero downtime, taken literally, rules out recovery altogether, because any recovery takes more than zero time and thereby breaks the standard; even a target of a few minutes leaves almost no room for one. How do you create a zero downtime environment without breaking the bank on sophisticated clustering technologies?
First, my guess is that most of the respondents, even those who answered zero, would be willing to live with a few minutes of downtime; when most people answer zero, that's what they mean. Even a few minutes of downtime, though, is going to be difficult to achieve with legacy backup technologies. There are, however, new approaches that allow even small to midsize businesses to return applications to their full and ready state within a few minutes.
The first key ingredient for recovering in a few minutes or less is that data simply can't be moved from the backup device back to the source device. No matter how fast the network connection, the time required to make that copy will in almost every case break the goal of only a few minutes of downtime.
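To put rough numbers on that claim, here is a back-of-the-envelope sketch. The 2 TB data set, the link speeds, and the 70% link efficiency are illustrative assumptions, not figures from the webinar or from any particular product.

```python
def restore_minutes(data_tb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Estimate the minutes needed to copy data_tb terabytes over a link_gbps link.

    efficiency is an assumed fraction of the nominal link speed that a
    restore actually sustains (protocol overhead, disk limits, contention).
    """
    data_bits = data_tb * 1e12 * 8               # decimal terabytes to bits
    effective_bps = link_gbps * 1e9 * efficiency  # usable bits per second
    return data_bits / effective_bps / 60


# Hypothetical 2 TB application data set over common link speeds.
for link in (1, 10, 40):
    print(f"{link:>2} Gb/s: {restore_minutes(2, link):.0f} minutes to copy 2 TB")
```

Under these assumptions the copy alone takes more than six hours at 1 Gb/s and still about 38 minutes at 10 Gb/s, before the application is even restarted. That is why a target of a few minutes depends on not moving the data at all.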
This means that the backup application must be able to present the data to the source server in a native state so that the application that it hosts can directly access the data. This also more than likely means that the backup device that stores the backup data needs to be a disk-based system.
Even presenting the data directly to the application may still not be fast enough to meet the objective of zero downtime or a few minutes of downtime. This is especially true if the physical server that was hosting the application has failed. A physical server failure means that a standby system needs to be put in place or, more than likely, ordered, so that there is something to actually access the data where it sits on the backup device.
Some vendors have overcome this problem by building into their backup applications the ability to spawn virtual machines and recover the failed host plus its data into that virtual machine, all without moving data. This capability brings the concept of zero downtime or minutes of downtime to the masses. No longer do you need to go out and buy a sophisticated cluster to achieve those goals.
The other advantage of this technique is that testing the recovery of a failed server can now be as easy as the click of a button. The ability to start an application with its data in a test mode almost instantly allows an organization to become very confident in its ability to recover.