Let's be clear: I am neither pro-tape nor pro-disk. I am for making sure that the backup process can be completed in a cost-effective, reliable manner. I am also for making sure that restores happen in a fast, reliable manner. Neither goal requires excluding tape, or excluding disk for that matter. As we will cover in our upcoming webinar, "Overcoming the Top Five Challenges to Tape Backup," successful, affordable backups and recoveries often require a combination of disk and tape.
First, tape is not responsible for increasing backup windows; data growth is. In fact, compared to disk, tape is the only technology that gets measurably faster with each new generation. Disk has been stuck at 15,000 RPM rotational speeds for over a decade, and in that time the performance of almost every tape format has increased 5X or more. Also, most data centers that use disk as part of their backup strategy do not use 15K RPM drives; they use 10K or slower SATA drives. Performance gains in disk come from using more disk spindles, which of course only adds to the cost.
The issue with tape is that you have to keep it well fed with data to take advantage of its performance characteristics. Tape is not random access, so it does not respond well to gaps in data flow. Disk provides random access and handles gaps in data flow well. This makes disk an ideal first target for backup jobs that can't keep the tape drive fed, such as backups over an IP network.
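To make the "keep it fed" point concrete, here is a minimal sketch of how you might decide where a job lands first. The function name, the job names, and every throughput figure are hypothetical and chosen only for illustration; real numbers depend on your drives, network, and backup software.

```python
def first_backup_target(feed_mb_s: float, tape_min_stream_mb_s: float) -> str:
    """Pick the first target for a backup job.

    feed_mb_s: sustained rate the source can deliver (assumed measurement).
    tape_min_stream_mb_s: rate below which the drive can no longer stream
    continuously (assumed figure, not a vendor specification).
    """
    if feed_mb_s <= 0 or tape_min_stream_mb_s <= 0:
        raise ValueError("rates must be positive")
    # A source that can keep the drive streaming goes straight to tape;
    # anything slower is buffered on disk first.
    return "tape" if feed_mb_s >= tape_min_stream_mb_s else "disk"


# Hypothetical jobs and rates, for illustration only.
TAPE_MIN_STREAM_MB_S = 80.0
jobs = {
    "san-attached-db": 200.0,   # fast source
    "lan-file-server": 35.0,    # backup over an IP network
}

plan = {name: first_backup_target(rate, TAPE_MIN_STREAM_MB_S)
        for name, rate in jobs.items()}
print(plan)
```

The point is not the threshold itself but the shape of the decision: the question is whether the source can sustain the drive, not whether tape or disk is "better."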
Random access also seems to make disk ideal as an early, though maybe not the first, restore target. It is true that disk can get to the data faster, and for small restores it is an ideal option. That is, as long as you are okay with yesterday's data, since most backup jobs run once per night. If you are okay with last night's backup, potentially losing a day's worth of work, then you are probably also okay with the extra two to three minutes it takes to recover that information from tape rather than disk.
For mission-critical information, where losing a day's worth of work is an issue, you are probably going to use, at a minimum, a snapshot technology in either the operating system or the storage system. You may go so far as a continuous data protection (CDP) option that copies the data in real time to a separate storage system. You are probably not going to count on a once-per-night process, no matter what the target is. Ironically, the CIOs I have spoken to who have this type of CDP solution send backups of mission-critical data to tape first, without even using disk as an intermediary backup area.
The role of disk is to be the target for the middle-tier servers in the environment. Those servers can't justify being attached to a CDP system because of the expense, and in many cases the loss of 23 hours of data, while not popular, is not business threatening. The good news for backup disk vendors is that this is the largest group of servers within the data center server population. These systems are also the ones that often do not have the horsepower to drive tape drives to their maximum performance, so buffering to disk first makes sense.
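The tiering argument above boils down to one question: how much data loss can a server tolerate? A rough sketch of that rule follows. The 24-hour cutoff simply reflects a once-per-night backup, and the function name and tier labels are my own shorthand, not anything a product defines.

```python
NIGHTLY_INTERVAL_HOURS = 24.0  # assumed: one backup job per night


def protection_tier(tolerable_loss_hours: float) -> str:
    """Map a server's tolerable data loss to a protection approach."""
    if tolerable_loss_hours < 0:
        raise ValueError("tolerable loss cannot be negative")
    if tolerable_loss_hours < NIGHTLY_INTERVAL_HOURS:
        # Mission critical: a nightly job is not enough on its own.
        return "snapshot-or-cdp"
    # Middle tier: nightly backup, buffered on disk, then spooled to tape.
    return "nightly-disk-then-tape"


print(protection_tier(1.0))
print(protection_tier(24.0))
```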
To shrink the data loss window without the expense of CDP, there are applications that can leverage changed block tracking to back up only the small portions of a server's data that have changed. This allows you to run more frequent backups without impacting server performance. Most of these systems do need disk to operate, but many can spool to tape for longer-term retention.
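For readers who have not seen the idea, the sketch below shows why backing up only changed blocks is so much cheaper than a full pass. Note the substitution: real changed block tracking usually records writes as they happen, at the hypervisor or driver level, whereas this toy version finds changed blocks by comparing SHA-256 digests between two copies. The 4 KB block size and the function names are assumptions for illustration.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4096  # assumed block size in bytes


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Return one SHA-256 digest per fixed-size block of data."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def changed_blocks(previous: List[str], current: List[str]) -> List[int]:
    """Indexes of blocks that are new or differ from the previous backup."""
    return [i for i, digest in enumerate(current)
            if i >= len(previous) or previous[i] != digest]


# Simulate a volume of four blocks, then change one byte and append data.
old = bytes(BLOCK_SIZE * 4)
new = bytearray(old)
new[BLOCK_SIZE * 2] = 1      # modify a single byte inside block 2
new += b"x" * 100            # append a partial fifth block (index 4)

to_backup = changed_blocks(block_digests(old), block_digests(bytes(new)))
print(to_backup)  # [2, 4]
```

Only two of the five blocks need to move, which is what makes several backups per day practical on servers that could never sustain a full nightly-style pass that often.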
Tape and disk both have a role to play, especially in the larger enterprise. Be wary of the vendor that says you only need one or the other. In a future entry we will cover the alleged data vulnerability issues with tape.