Tape Summit Predicts Tape Renaissance

Summing up the recent Tape Summit in Nevada, the basic message was that tape is about to see not merely a resurgence but a renaissance. This may seem counterintuitive to many, but the claim has merit. Let's explore why, since IT will be the major beneficiary if the claim proves true.

David Hill

April 25, 2011

Network Computing

Tape Summit 2011, an inaugural meeting of vendors that are involved with some aspect of tape, as well as analysts and press who are very familiar with the tape market, was held recently in Las Vegas, Nevada. It occurred late in the week of the National Association of Broadcasters (NAB) meeting, as many of the same vendors also attended that show. There was a natural bias toward tape, but also note that vendors, such as HP, IBM, and Quantum, sell both disk and tape. So although the merits of tape were stressed, participants had a realistic view of the role of disk in the greater storage market.

My memory may be wrong, but I think that it was Ed Childers of IBM (since the example related to IBM) who documented that the first claim that "Tape is dead" occurred in 1961! So we are now celebrating the 50th anniversary of the reputed death of tape. While those claims are as spurious today as they were then, the markets for tape have certainly narrowed during the past five decades. The two primary uses of tape today are backup and archiving, and both are being targeted by disk vendors.

In the former market, disk-to-disk backup solutions are becoming more popular, especially those using data deduplication. There are a number of good reasons why disk is taking a stronger role as the first line of defense, and these have led many to predict that tape will no longer have any role in backup. This simply isn't true.

Jonathan Marianu, an AT&T Storage Planning and Design Architect, took both disk and tape to task in his enormous backup world. He uses disk with data deduplication, but he also uses an enormous amount of tape. One reason is that there is not enough bandwidth to copy data over a network to a disaster recovery facility. Given the business that AT&T is in, that says a lot.

Google's recent Gmail restoration from offline tape was also brought up. This was a logical data protection problem, and offline tape, which was not exposed to it, became the solution. This illustrates an important point in data protection: it is better to be able to restore the data, even if it takes a while, than not to be able to restore it at all.

But how the disk-tape balance will work out for backup is not what has today's tape vendors excited. No, that excitement focuses mainly on archiving.

Archiving is the process of moving fixed content data off of active production systems. Note that the archived copy is a valid working production copy of the data. It is not a data protection copy.

Now, archiving tends to bifurcate into active archiving and deep archiving. An active archive is where the data is easily accessible online (in the sense that an end user can read it and use it for business purposes). A deep archive is usually offline where a system administrator has to retrieve the information.

The first question to ask is how much data could or should be in an archive. The answer is probably almost all of it. A rough estimate used to be that 80% of an archive is fixed content data, but that number is probably too low. Much data is fixed immediately upon creation, such as sent or received e-mails or a digital medical image. Consider also that most unstructured data (which constitutes the largest growth area for most companies) is also fixed. An unfinished transaction (structured data) or a word processing document that has not been finished (semi-structured data) is not fixed content, but much of the data in a database or the files on a server is unlikely to change. So the vast majority of most organizations' data could be put in an archive.

The second question is: why hasn't the data been archived? There are a number of answers. For one thing, an archive implies the need for good data retention management policies, something that is difficult for many enterprises to achieve. Suffice it to say that data retention has been scarce, except perhaps for some deep archiving that was typically performed on tape.

But does tape have to be limited to deep archiving? Quantum and Qualstar have products that use tape in conjunction with disk caching for an active archive. Nevertheless, active archives typically reside on disk.

The gauntlet was laid down by Molly Rector, Vice President of Product Management and Worldwide Marketing of Spectra Logic, who said that 5% of data is active production data, 15 to 20% could be active archive data that, for performance reasons, needs to be put on SATA disks, and the rest is active archive data that could be put on tape. (Some of the data may be deep archive data, but it is probably a small percentage of the whole.) This was a shocking comment, implying that most of the data in a data center should be on tape! That will, of course, generate a lot of controversy, but, if true, it would lead to a renaissance for tape technologies and vendors.
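
As a rough illustration of what that split would mean in practice, the sketch below applies the quoted percentages to a hypothetical 1,000 TB data center. The total capacity, the function name, and the dictionary keys are assumptions made for illustration; only the percentages come from the remarks above.

```python
def tier_split(total_tb, production_pct=5.0, disk_archive_pct=20.0):
    """Split total capacity across tiers using the percentages quoted above.

    total_tb is a hypothetical figure. Whatever is neither production data
    nor disk-resident active archive is assumed to be a candidate for tape.
    """
    tape_pct = 100.0 - production_pct - disk_archive_pct
    return {
        "production_disk_tb": total_tb * production_pct / 100.0,
        "active_archive_sata_tb": total_tb * disk_archive_pct / 100.0,
        "active_archive_tape_tb": total_tb * tape_pct / 100.0,
    }


# Low and high ends of the 15 to 20% range for disk-resident archive data.
for disk_pct in (15.0, 20.0):
    print(disk_pct, tier_split(1000, disk_archive_pct=disk_pct))
```

Under these assumptions, 750 to 800 TB of every 1,000 TB would be a candidate for tape.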

But can it be true? What are the economic benefits and can tape do the job?

The Clipper Group has published a white paper entitled "In Search of the Long-Term Archiving Solution -- Tape Delivers Significant TCO Advantage over Disk." This paper discusses the long-term preservation of digital data and comes to the astonishing conclusion that a disk-based solution costs 15 times that of a tape solution! This is a well-reasoned paper with the assumptions clearly laid out. If the results of this paper stand (and they very well may), then tape has a very strong TCO case for archiving data.

However, other variables play a role in this argument, including reliability, scalability, longevity, compliance, security, and usability. Scalability is not a problem for tape. Compliance and security do not seem to be barriers to its adoption, and neither is longevity. Modern tape cartridges are expected to have a shelf life of between 15 and 30 years. This is longer than the time required for the vast majority of migrations -- the point at which data has to be moved from one piece of media to newer technology. With tape, this time is between 7 and 10 years, based on the fact that LTO (the most popular form of open system tape) averages a new generation every two to three years, and the latest LTO generation can read tape two generations back.

In a typical scenario, the oldest tape would have to be migrated before yet another new generation of tape drive was put in place. Contrast this with disk, where the migration would most likely occur on a 3 to 4 year cycle. It is not that disks do not technically have a longer life than 4 years, but economically it typically makes sense to buy new disks (more capacity for the same dollar and lower maintenance costs) every 3 to 4 years. Note that you may very well see a 10 year old tape library running with 7 year old tape drives, but you are unlikely to see a disk array that is 7 -- let alone 10 -- years old. (By the way, operating system and application obsolescence that would affect migration times are ignored, as they affect both disk and tape equally.)
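
To put the two migration cycles side by side, the sketch below counts how many forced migrations each medium would need over a retention period. The 30-year retention figure and the function name are assumptions for illustration, and the model is deliberately simple: data is written at year 0 and moved at every full cycle that ends before retention does.

```python
import math


def migrations_needed(retention_years, cycle_years):
    """Count forced media migrations during a retention period.

    Assumes data is written at year 0 and must be moved to new media at
    every multiple of cycle_years that falls strictly before the end of
    the retention period. This is a simplification for illustration.
    """
    return math.ceil(retention_years / cycle_years) - 1


RETENTION_YEARS = 30  # hypothetical retention requirement

# Cycle lengths are the ranges quoted above: 3 to 4 years for disk,
# 7 to 10 years for tape.
for label, cycles in (("disk", (3, 4)), ("tape", (7, 10))):
    counts = [migrations_needed(RETENTION_YEARS, c) for c in cycles]
    print(label, dict(zip(cycles, counts)))
```

With these assumptions, disk needs 7 to 9 migrations over 30 years against 2 to 4 for tape, and each avoided migration is labor and risk that does not have to be paid for.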

One key issue has always been tape reliability. Fred Moore of Horison Information Strategies discussed this subject in his address entitled "Future Predictions on the Role of Tape and Disk Media in the Data Center." Now Fred might well be nicknamed Mr. Tape, but even more importantly he might well be nicknamed Mr. Storage for his logical, solid analysis. For those of you who want more detail, please read "Tape Technology Leaps Forward to the 3rd Era."

And Fred did not get into the additional approaches that several vendors described to show how they are improving the integrity of tape. Note that the comparison has to be between the tape technology of today and the disk technology of today, not the tape technology of the previous generation or generations.

The other key issue to consider is usability and the Linear Tape File System (LTFS) that is available with LTO-5. An LTO-5 tape can be divided into two partitions: Partition 0 can hold directory structure information and Partition 1 holds content information. LTFS can take advantage of the directory structure information to manage tape more effectively. That is a huge help for managing tape in an active archive environment. LTFS also provides other advantages, such as self-describing tape. Because the directory information is expressed in XML, the goal is to be able to read files in the future, if necessary, even if the native application that created them is no longer available.

All in all, Tape Summit speakers talked at length about LTFS and the benefits it would provide. Notably, many of the vendors had been at NAB and spoke about the heightened interest of the broadcasting industry in using tape with LTFS capabilities. One particularly cogent example regarded workflow archiving! I thought that the words "workflow" and "archiving" would be an oxymoron, but I was wrong. Through every step of the production and post-production process in modern film production, tape can and does play an essential role.
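
To make the two-partition idea behind LTFS more concrete, here is a toy model in Python. It is emphatically not the real LTFS on-tape format: the class name, the XML element and attribute names, and the sample file names are all invented for illustration. It only shows why keeping an XML directory apart from the content lets software list and locate files without scanning the whole tape.

```python
import xml.etree.ElementTree as ET


class ToyTape:
    """Toy two-partition tape. NOT the real LTFS on-tape format."""

    def __init__(self):
        self.data_partition = bytearray()         # stands in for Partition 1
        self.index_partition = "<index></index>"  # stands in for Partition 0

    def write_file(self, name, content):
        """Append content to the data partition and record it in the index."""
        offset = len(self.data_partition)
        self.data_partition.extend(content)
        root = ET.fromstring(self.index_partition)
        ET.SubElement(root, "file", name=name,
                      offset=str(offset), length=str(len(content)))
        self.index_partition = ET.tostring(root, encoding="unicode")

    def list_files(self):
        """Read the directory from the index alone, without scanning data."""
        root = ET.fromstring(self.index_partition)
        return [f.get("name") for f in root.findall("file")]

    def read_file(self, name):
        """Locate a file through the index, then read only its bytes."""
        root = ET.fromstring(self.index_partition)
        for f in root.findall("file"):
            if f.get("name") == name:
                start = int(f.get("offset"))
                end = start + int(f.get("length"))
                return bytes(self.data_partition[start:end])
        raise FileNotFoundError(name)


tape = ToyTape()
tape.write_file("dailies_reel1.mov", b"frame data")   # hypothetical files
tape.write_file("notes.txt", b"color pass pending")
print(tape.list_files())
print(tape.read_file("notes.txt"))
```

Because the index is plain XML that describes the tape's own contents, any program that can parse it can find the files, which is the sense in which such a tape is self-describing.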

Tape innovation would seem to be another oxymoron, but as we have seen, it is not. I have tried to keep vendors out of this discussion (so as not to favor one over another), but I will at least mention some. First, though, two organizations that promote tape -- the Active Archive Alliance and the LTO Consortium -- were well represented. Among the companies that I talked with were:

  • Crossroads -- in addition to its monitoring and analysis capabilities of tape, it plans to exploit LTFS to provide network-attached tape file storage

  • Gresham -- its strength is in the management of tape in a TSM environment

  • HP -- tape continues to be a focal point, and it is strongly promoting the use of LTFS

  • IBM -- a strong user of LTFS and a prime mover in expanding tape management automation

  • Qualstar -- it has long supported active archiving with its archive management software in tape environments

  • Quantum -- tape management automation and using tape for an active archive (StorNext) have long been capabilities of Quantum

  • Spectra Logic -- it has long been an innovator in tape and has just introduced new data verification capabilities for tape

  • Tributary -- a strong supporter of an integrated VTL (which it calls a backup virtualization layer) that uses tape at the back end.

The real story I took away from the Tape Summit was not disk versus tape. Random access devices, whether HDD, SSD, or some other random-access technology, will always have a vital role in enterprises and the growth of random-access storage continues to have a bright future. But once again the reports of tape's death have been greatly exaggerated.

No, the real story is that there is now a far more economically feasible answer to two problems at once: the explosive growth of information and the need to retain much of it for longer periods of time. That should relieve the pressure and stress of storing humongous amounts of still-useful information for the long term.

And that should tend to spur the information revolution on, as budget dollars can be allocated to tasks other than just paying an ever-increasing storage bill. So long as tape lives, IT is the real winner.
