Delving Into Spectra Logic Deep Storage
With its RESTful interface and BlackPearl storage appliance, Spectra Logic promises to breathe new life into the tape market.
October 25, 2013
Spectra Logic is promoting the concept of "deep storage" as a new way of selling tape libraries, but don’t let your thoughts on the viability of tape distract you from learning about deep storage.
Try to step back mentally and emotionally and see where deep storage fits in the recent panoply of IT innovations and what it portends for managing “heavy data” (that is, bulk data), a byproduct of the ongoing data explosion.
Spectra Logic defines deep storage as “extremely low cost, power efficient and dense storage that requires some latency when retrieving data.” This is a storage tier for the long-term mass storage of data--typically, from about 200 Tbytes up to multiple petabytes and beyond.
In essence, deep storage is a new application use for a tape library. The two principal application uses of a tape library are data protection (such as for backup and recovery) and active archiving. Yet heavy data does not fit easily into either application category. Since heavy data is working data that maintains a long-term value, it is definitely not data protection, where backup/restore and disaster recovery are the primary functions.
In fact, much of heavy data is fixed content (that is, the data does not change after it is created), so, technically, it might be a target for an active archive. There might be cases where some deep storage data would fit into an active archive. But an active archive also requires overarching software that provides functionality such as compliance and e-discovery for emails. This may be too much or the wrong kind of software management overhead for many applications, such as big data, seismic data or video surveillance data.
With deep storage, specific use cases are probably best managed individually rather than under an active archiving software management umbrella. But that poses a problem: Moving data for deep storage and then accessing that data for business purposes is not easy. In fact, although you could write custom software to do the job, it would still be difficult and complex. Active archiving and data protection have solved this problem, but at the expense of focusing on targeted solutions rather than on providing a framework that can be adapted to serve as a general purpose solution.
In contrast, Spectra Logic has introduced a general-purpose interface that enables the use of deep storage and will eventually be open (at least to some extent). Spectra Logic also launched its BlackPearl appliance to bring deep storage to its own products, most likely its tape libraries at first, although the appliance will eventually work with a Spectra disk product as well.
Deep Simple Storage Service (DS3)
Spectra Logic’s DS3 is a communications interface that allows clients (as in a client/server architecture) to manage and direct bulk storage read (GET) and write (PUT) operations to deep storage, such as tape. DS3 is actually an extension of Amazon's S3 (Simple Storage Service). The extensions enable the use of sequential storage media as well as removable storage media.
Amazon S3 is a broadly accepted, de facto standard Web services interface designed to store and retrieve large amounts of data at any time and from any place on the Web. Storage is in the form of objects in buckets, where each object has a unique, developer-assigned key. Spectra Logic employs this form of object storage for its deep storage architecture and solutions.
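To make the bucket-and-key model concrete, here is a minimal in-memory sketch. The Bucket class and the sample bucket and key names are hypothetical illustrations, not part of S3 or DS3; the sketch only shows the idea that an object is addressed by a developer-assigned key that is unique within its bucket.

```python
from typing import Dict, List


class Bucket:
    """Hypothetical in-memory stand-in for a bucket of keyed objects."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        # A key is unique within its bucket, so a second PUT to the
        # same key replaces the stored object.
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        # Raises KeyError if no object was stored under this key.
        return self._objects[key]

    def keys(self) -> List[str]:
        return sorted(self._objects)


# Hypothetical bucket and key names, chosen only for illustration.
bucket = Bucket("seismic-surveys")
bucket.put("2013/north-sea/line-042.segy", b"raw trace data")
bucket.put("2013/north-sea/line-043.segy", b"more trace data")
```

Note that the slashes in the keys are just characters in a flat name; the bucket has no directories.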
As an extension of S3, the DS3 interface follows the REST (Representational State Transfer) client/server architectural style, moving objects to and from deep storage using the high-level GET and PUT commands. DS3 is the first native RESTful interface that can work with robotic tape libraries.
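The PUT and GET operations can be sketched as plain HTTP requests. The example below builds S3-style, path-addressed requests with Python's standard library but does not send them. The host name, bucket and key are hypothetical, and authentication headers and any DS3-specific bulk commands are omitted, because the article does not describe DS3's wire format.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical endpoint; not a real BlackPearl address.
ENDPOINT = "http://blackpearl.example.com"


def object_url(bucket: str, key: str) -> str:
    # S3-style path addressing: /<bucket>/<key>. quote() leaves "/" intact
    # and percent-encodes characters such as spaces.
    return f"{ENDPOINT}/{bucket}/{quote(key)}"


def put_request(bucket: str, key: str, data: bytes) -> Request:
    """Build (but do not send) a PUT that would write one object."""
    return Request(object_url(bucket, key), data=data, method="PUT")


def get_request(bucket: str, key: str) -> Request:
    """Build (but do not send) a GET that would read one object."""
    return Request(object_url(bucket, key), method="GET")


put = put_request("video-archive", "cam01/2013-10-25.mp4", b"frame data")
get = get_request("video-archive", "cam01/2013-10-25.mp4")
```

A real client would sign each request and send it with urllib.request.urlopen or a similar HTTP library; those steps are left out so the sketch stays runnable without a server.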
Using an extension to S3 as a cornerstone seems to be a particularly good move on the part of Spectra Logic. First, deep storage has to be able to play in the modern IT world, and integration with the Web services world is an essential component for doing so. Second, deep storage has to be feasible in the sense that the application development time has to be palatable for organizations. It's about being able to do what would previously have been too costly in development time, within a time frame and at a cost that an organization deems acceptable. Previously, that was not possible except for custom-built applications that could justify the investment.
The BlackPearl Deep Storage Appliance
BlackPearl is Spectra Logic’s data management appliance based on DS3 that actually implements the use of deep storage. BlackPearl does a number of things:
• Acts as a DS3 server to DS3 clients; data is migrated from a DS3 client to the BlackPearl appliance over the DS3 interface;
• Stores data as object-based deep storage by grouping collections of data into buckets, writing the data in the open, self-describing Linear Tape File System (LTFS) format, and maintaining an object catalog of physical storage locations and metadata;
• Manages the deep storage system itself, including inventory, retries and error handling;
• Provides tight integration with Spectra’s BlueScale tape management system for actual management functions of the tape library, such as tape encryption, data integrity verification and system error detection.
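To illustrate why a catalog of physical locations matters for sequential media, here is a small sketch that groups a bulk read by tape and orders each group by position on the tape. The Location fields, the plan_bulk_get function and the sample barcodes are hypothetical; the article does not say how BlackPearl actually schedules reads, so treat this as an illustration of the general idea rather than the appliance's method.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Location:
    tape: str    # hypothetical cartridge barcode
    offset: int  # hypothetical position of the object along the tape


def plan_bulk_get(catalog: Dict[str, Location],
                  keys: List[str]) -> List[Tuple[str, List[str]]]:
    """Group requested keys by tape, ordered by offset within each tape.

    Each tape then needs to be mounted once and read in a single
    forward pass instead of seeking back and forth.
    """
    ordered = sorted(keys, key=lambda k: (catalog[k].tape, catalog[k].offset))
    return [(tape, list(group))
            for tape, group in groupby(ordered, key=lambda k: catalog[k].tape)]


# Hypothetical catalog entries: object key -> physical location.
catalog = {
    "a": Location("T1", 300),
    "b": Location("T2", 10),
    "c": Location("T1", 100),
}
plan = plan_bulk_get(catalog, ["a", "b", "c"])
```

In this example the plan reads objects c and then a from tape T1 before moving on to object b on tape T2.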
Spectra has a developer program that provides all the tools necessary to write a custom DS3 client, including the necessary API, a software development kit (SDK), and a simulator download. The vendor is also working on pre-written clients; one is for use with the Hadoop Distributed File System (HDFS), so data can be migrated out of an active HDFS-managed cluster for long-term storage and future use.
Spectra Logic claims that its tape-based deep storage platform, front-ended by a BlackPearl appliance, can cost as little as 9 cents per gigabyte in multipetabyte environments and 14 cents per gigabyte in smaller environments. Note that this is the full purchase price, not a recurring fee. The payback would come in less than a year compared with paying a monthly charge for a cloud service that provides longer response times (most notably Amazon’s Glacier, which is a deep archiving service).
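The payback comparison comes down to simple arithmetic: divide the one-time purchase price per gigabyte by the recurring monthly charge per gigabyte. The sketch below uses the article's 9-cent and 14-cent purchase prices together with an assumed cloud rate of 1 cent per gigabyte per month. That rate is a placeholder for illustration, not a figure from the article, and the calculation ignores power, media, staffing and cloud retrieval fees.

```python
import math


def breakeven_months(purchase_cents_per_gb: float,
                     cloud_cents_per_gb_month: float) -> int:
    """Whole months of cloud fees needed to match a one-time purchase price."""
    return math.ceil(purchase_cents_per_gb / cloud_cents_per_gb_month)


# Assumption for illustration only; not a figure from the article.
ASSUMED_CLOUD_CENTS_PER_GB_MONTH = 1.0

large = breakeven_months(9, ASSUMED_CLOUD_CENTS_PER_GB_MONTH)   # multipetabyte price
small = breakeven_months(14, ASSUMED_CLOUD_CENTS_PER_GB_MONTH)  # smaller-environment price
```

Under this assumed rate, the 9-cent configuration breaks even in 9 months and the 14-cent configuration in 14 months, so the result is sensitive to both the size of the deployment and the cloud price used for comparison.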
Deep Storage Use Cases
Long-term retention of large quantities of data, where latencies in the range of minutes are acceptable, is among the target use cases for deep storage. Web 2.0, big data, media and entertainment, and genomic research are among the usual suspects. At a recent press and analyst event, Spectra Logic customers described how they're using the technology.
For Yahoo, deep storage provides an alternative to traditional backup processes. The company uses tape libraries to store backups. The process works, but it can really be used only for disaster recovery, when a lot of data needs to be recovered (as Yahoo's process for restores is not that granular). “Archiving is what we want, backup is what we are stuck with,” according to a Yahoo representative.
Now, Yahoo deals with a lot of fixed content data that is really not a good target for backup, as continuing to back up data that never changes is very inefficient. An archive is a copy of data that does not change, but it's not a backup copy, and Yahoo likes the fact that, with Spectra Logic's technology, it is no longer a slave to a backup application. Although the data is not available to external users, it can now be made available to internal users to unlock its potential, such as data scientists being able to run queries.
Another customer, media management software company Axle Video, uses deep storage as a way to manage workflow processes. Although tape is an efficient medium for storing large amounts of data for the long term, it is not a good medium for editing; that's where random access media is more effective. However, having to move or migrate large quantities of random access media when only a portion of the data is really needed at any one time has been a headache.
Sam Bogoch, CEO of Axle Video, says his company has found a way to get the benefits of both sequential media (that is, tape) and random access media (that is, flash or hard disk). Axle Video stores the bulk of data using deep storage, but democratization of media (making it available for use by all) makes things doable that weren’t before, such as using streaming proxies to provide just enough data for editing on random access media, Bogoch says.
The process allows customers to search existing media, browse from a range of standard devices (including tablets, rather than just old-style edit stations), and enable the workflow processes of collaboration, logging, review and approval.
Mesabi Musings
Just when you think that some technology is locked away, thinking outside the box can change things dramatically. That is the case with magnetic tape, where the sum of traditional thinking has been that the backup/restore market for tape would continue to decline, while the use of tape for active archiving would continue to increase.
The introduction of deep storage by Spectra Logic has changed that perception, and promises to open up new markets for tape. While this should not have a significant impact on existing products, it could open up a number of new and unusual opportunities that represent a boatload of storage. Since DS3 will eventually be open, it will be interesting to see if any other IT vendors will want to play in this space or if Spectra Logic will have it all to itself.
Spectra Logic is not a client of David Hill and the Mesabi Group.