Going Green With D2D Backup

There are ways to deal with the power realities of disk-to-disk backup

March 26, 2008


Among the many priorities storage administrators face these days, two seem to be in direct conflict: disk-to-disk (D2D) backup and power efficiency.

Storage pros in 2008 are implementing or expanding their investment in disk-to-disk backup for improved data protection and reliable data recovery. Compared to traditional tape-based protection, D2D has proven to shrink backup windows, improve restore speed, and bring reliability to a very problematic backup process.

At the same time, storage pros face a crisis in terms of power usage. It's not just about being green, either. It's about growing demand. As one CIO recently told me, "I don't care so much about being green. I simply can't get enough power for my data center."

Unfortunately, despite its multitude of weaknesses, tape has one advantage over disk solutions: it is very power efficient. Let’s face it: A cardboard box full of tapes does not use much power. And even though tape has its own environmental impact in terms of storage consumables, fuel for trucks to transport it, and resources associated with temperature-controlled facilities, D2D is still the power hog.

The question is: Can one find a way to add or expand disk-to-disk backup without dropping in another power grid?

Fortunately, the answer appears to be "yes." Let's take a closer look at the problem by systematically reviewing the techniques used for D2D and the ways these approaches can be modified to improve the power profile of each.

Basic Considerations
Before we get to the heart of the matter, it's worth reviewing a couple of tenets: First, more efficient space utilization (storing significantly more data in less space) is the most practical way to begin to address power efficiency; and second, going green requires more power-efficient technology.

In practical terms, in 2008 server and storage virtualization exemplify ways of using less physical space, and therefore less power, to store data. In storage, data de-duplication is another example of a technique that enables better power efficiency. The fewer physical servers or storage systems required, the less overall power required.

But techniques like virtualization and de-dupe don't exist in a vacuum. To really see how D2D power consumption can improve, let's check out each product category in terms of how much power it consumes now and what can be done about it.

The comparative numbers used in the following descriptions are based on examination of various suppliers' published specifications on power consumption. Each data center implementation will be unique. But as a percentage, the delta between the various technologies should be similar.

The VTL Green Effect


D2D Product Type: Virtual tape libraries (VTLs)

Power Profile: About 80 watts per usable Tbyte

Explanation: VTLs are not part of the green equation. If correctly implemented, they are possibly an enabler of one of the other power-saving strategies like MAID or de-dupe, but today the traditional VTL simply front-ends racks of unoptimized SATA disk shelves. Those shelves are just one big power-consuming and heat-creating monster in your data center.

The situation becomes compounded when you decide to replicate your disk backups across a WAN connection to a disaster recovery site. Since there are no space efficiencies with VTL solutions, you have to duplicate your investment in capacity in your DR site. The result, of course, is twice the investment in power, cooling, and dollars. On average, we see about 80 watts per usable Tbyte consumed by most standard VTL solutions. In addition, because there is no capacity optimization, the entire backup job needs to be replicated, so the DR site has to be connected on a very high-bandwidth segment, and it has to be relatively close in distance.
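To put the doubling in concrete terms, here is a minimal arithmetic sketch. The 80 watts per usable Tbyte figure comes from the text above; the 100-Tbyte capacity and the function name `total_watts` are hypothetical, chosen only for illustration.

```python
def total_watts(usable_tb, watts_per_tb, sites=1):
    """Steady-state draw when each site holds a full, unoptimized copy."""
    return usable_tb * watts_per_tb * sites


# Hypothetical 100-Tbyte backup store at the article's 80 W per usable Tbyte.
primary_only = total_watts(100, 80)
with_dr_site = total_watts(100, 80, sites=2)

print(f"Primary site only: {primary_only} W")
print(f"Primary plus DR:   {with_dr_site} W")
```

Cooling load is not modeled here, so the real-world gap would likely be larger than this sketch suggests.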

The first step toward optimization of the traditional VTL is compression of the data on disk, similar to compression on tape. While this lowers the power profile to an average of about 50 watts per usable Tbyte, it also introduces problems.

First, most VTL solutions suffer a startling performance hit of over 60 percent when using compression, and the ability to receive inbound data from your backup process is greatly hindered. This is in conflict with one of the primary motivations for adding disk to the backup process in the first place – reducing backup windows.
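A quick way to see what a 60 percent performance hit does to a backup window is to work it through. The sketch below assumes a hypothetical 10-Tbyte nightly backup and a hypothetical ingest rate of 2 Tbytes per hour; only the 60 percent figure comes from the text.

```python
def backup_window_hours(data_tb, throughput_tb_per_hour, performance_hit=0.0):
    """Hours needed to ingest data_tb when throughput is cut by performance_hit."""
    effective_throughput = throughput_tb_per_hour * (1.0 - performance_hit)
    return data_tb / effective_throughput


# Hypothetical workload: 10 Tbytes per night at 2 Tbytes per hour.
uncompressed = backup_window_hours(10, 2)
compressed = backup_window_hours(10, 2, performance_hit=0.60)

print(f"Without compression: {uncompressed:.1f} hours")
print(f"With compression:    {compressed:.1f} hours")
```

Under these assumptions the window grows by a factor of about 2.5, which is the inverse of the remaining 40 percent of throughput.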

Compression on disk also complicates the move to tape. If you are going to tape, and for most VTL solutions you will, then you typically cannot send data that is already compressed to tape and have the tape drive compress it. You will have to turn off tape compression at the tape drive.

Another possible solution is to keep the disk portion of the VTL small and move your backups from disk to tape sooner. This strategy then makes you just as reliant on tape for recovery as you were when you started your D2D initiative.

For most customers the goal is to keep backup available on disk long enough to cover most restore requests. And a growing number of these users want to keep backup on disk for the entire data retention window, eliminating tape in its entirety.

Conclusion: If green IT or power consumption is a concern, VTL vendors that do not offer some form of power efficiency, whether through capacity optimization or through technology, simply cannot participate in the D2D discussion.

The MAID Green Effect

D2D Product Type: Massive array of idle disks (MAID)

Power Profile: About 7 watts to 28 watts per Tbyte

Explanation: MAID is an alternate step. MAID alone is simply a disk target that can spin down and power off disks that have not been used or accessed for a period of time. An obvious market for MAID is D2D backup, especially in power-sensitive data centers.
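The spin-down behavior described above can be sketched as a simple idle-timeout policy. This is an illustrative model only, not any vendor's implementation; the `Disk` class, the `apply_idle_policy` and `access` functions, and the timeout values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    name: str
    last_access: float  # seconds on any monotonic clock
    spinning: bool = True


def apply_idle_policy(disks, now, idle_timeout):
    """Spin down every spinning disk idle for at least idle_timeout seconds."""
    spun_down = []
    for disk in disks:
        if disk.spinning and now - disk.last_access >= idle_timeout:
            disk.spinning = False
            spun_down.append(disk.name)
    return spun_down


def access(disk, now):
    """A read or write spins the disk back up and resets its idle clock."""
    disk.spinning = True
    disk.last_access = now


# Hypothetical shelf: d0 was last touched at t=0, d1 at t=900.
shelf = [Disk("d0", 0.0), Disk("d1", 900.0)]
print(apply_idle_policy(shelf, now=1000.0, idle_timeout=600.0))
```

A longer `idle_timeout` means fewer power cycles but more time spent spinning, which is the trade-off MAID buyers end up weighing.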

To complete a D2D solution, most MAID vendors partner with a VTL manufacturer. The first step in examining a MAID solution is to understand the strengths and weaknesses of the provider of the VTL component. Specifically, MAID is addressing power concerns through technology by powering down disks, not through capacity optimization. So taking advantage of this technology requires a tight integration with the VTL solution to ensure that new data is not written on the same drives that contain old data.

In short, you will not be able to mix and match your MAID and VTL solution. You will have to use the VTL that your MAID vendor selects. If the VTL-MAID integration is acceptable and the MAID technology works correctly, you can expect a reduction in power consumed down to the range of roughly 7 to 28 watts per Tbyte. This represents a significant reduction in power consumption compared to standard VTL solutions.
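To gauge how significant that reduction is, the figures can be compared directly. The sketch below uses the article's numbers (80 watts for a standard VTL, 7 to 28 watts for MAID); the function name is hypothetical, and note that the VTL figure is quoted per usable Tbyte while the MAID figure is quoted per Tbyte, so the comparison is approximate.

```python
def reduction_percent(baseline_watts_per_tb, new_watts_per_tb):
    """Percentage drop in power draw relative to the baseline."""
    saved = baseline_watts_per_tb - new_watts_per_tb
    return 100.0 * saved / baseline_watts_per_tb


VTL_WATTS = 80          # standard VTL, from the article
MAID_WATTS_HIGH = 28    # upper end of the MAID range
MAID_WATTS_LOW = 7      # lower end of the MAID range

worst_case = reduction_percent(VTL_WATTS, MAID_WATTS_HIGH)
best_case = reduction_percent(VTL_WATTS, MAID_WATTS_LOW)

print(f"MAID saves between {worst_case:.1f}% and {best_case:.1f}% versus a standard VTL")
```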

A caveat: Powering disk drives off and on generally creates a level of uneasiness with IT professionals. Most of us have experienced that uneasy moment when you power a piece of technology on, wondering if it will really power up. To address this concern, MAID vendors offer a couple of alternatives. It is possible to run a routine to confirm that the disks will power up when you need them. You can also lengthen the time between power downs to minimize the risk of not being able to power the disks back up. Both of these alternatives obviously have an impact and push power consumption into the upper end of the 7 to 28 watts per Tbyte range.

Some MAID vendors have data de-duplication on the roadmap or are currently releasing the technology. This will likely have to be done as a post process, and they will inherit all the challenges associated with post-processing solutions. In addition to those challenges, it is unclear how a MAID vendor will be able to implement data de-duplication and not have to increase disk activity to the point that they lose all power efficiencies.

Conclusion: When considering MAID, a choice will likely have to be made between optimal power utilization and optimal space utilization.

The Data De-Duplication Green Effect

D2D Product Type: Data de-duplication

Power Profile: About 1.3 watts to 2.8 watts per usable Tbyte

Explanation: Data de-duplication is a data reduction technique that compares segments of data being written to disk storage with data segments that were previously stored. If duplicate data is found, an additional pointer is established to the original data, as opposed to actually storing the duplicate segments. This removes or "de-duplicates" the redundant segments from the storage system.
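The pointer mechanism described above can be sketched in a few lines. This is a toy model under stated assumptions, not how any particular product works: it assumes fixed-size segments and SHA-256 fingerprints, neither of which the article specifies, and the `DedupeStore` class is hypothetical.

```python
import hashlib


class DedupeStore:
    """Toy segment store: each unique segment is kept once, keyed by its hash."""

    def __init__(self, segment_size=4096):
        self.segment_size = segment_size
        self.segments = {}  # fingerprint -> segment bytes, stored only once

    def write(self, data):
        """Store data and return the list of pointers needed to rebuild it."""
        pointers = []
        for offset in range(0, len(data), self.segment_size):
            segment = data[offset:offset + self.segment_size]
            fingerprint = hashlib.sha256(segment).hexdigest()
            if fingerprint not in self.segments:
                self.segments[fingerprint] = segment  # new data: store it
            pointers.append(fingerprint)  # duplicate: only a pointer is added
        return pointers

    def read(self, pointers):
        """Rebuild the original data by following the pointers."""
        return b"".join(self.segments[fp] for fp in pointers)

    def stored_bytes(self):
        return sum(len(segment) for segment in self.segments.values())


store = DedupeStore(segment_size=4)
monday = store.write(b"AAAABBBBAAAA")   # the third segment repeats the first
tuesday = store.write(b"AAAABBBB")      # nothing new, so nothing new is stored
print(store.stored_bytes(), "bytes stored for 20 bytes written")
```

Repeated backups of largely unchanged data are where this pays off, since each new backup mostly adds pointers rather than segments.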

Data de-duplication is simple to use and immediately effective, as well as affordable. Assuming a data de-duplication efficiency of 10X to 20X, then the watts per usable Tbyte on a single data de-duplication device are typically in a range of 1.3 watts to 2.8 watts per usable Tbyte.
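The relationship between de-duplication efficiency and watts per usable Tbyte is simple division, sketched below. The article does not state the power draw per physical Tbyte; the 26-watt and 28-watt inputs are hypothetical values back-calculated so that the results land on the article's quoted range, and the function name is likewise an assumption.

```python
def watts_per_usable_tb(watts_per_physical_tb, dedupe_ratio):
    """Spread the physical power draw across the logical data it holds."""
    return watts_per_physical_tb / dedupe_ratio


# Hypothetical physical draw, chosen only to match the article's quoted range.
low_efficiency = watts_per_usable_tb(28, 10)    # 10X reduction
high_efficiency = watts_per_usable_tb(26, 20)   # 20X reduction

print(f"At 10X: about {low_efficiency:.1f} W per usable Tbyte")
print(f"At 20X: about {high_efficiency:.1f} W per usable Tbyte")
```

The point of the sketch is the shape of the formula: every doubling of the reduction ratio halves the power attributable to each Tbyte of protected data.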

Data de-duplication systems aren't just more power-efficient for D2D backup. They also perform consistency checks on data and, most importantly, replicate that data, which minimizes bandwidth utilization as well as storage and power demands at the remote site.

Notably, while data de-duplication and VTL can be mixed, de-dupe and MAID cannot, at least not optimally. MAID needs disks to stop; data de-duplication’s wide cross-referencing of data segments across a volume means all disks need to be accessible.

While many VTL suppliers are adding de-duplication capabilities to their solutions, this typically comes as the result of an add-on, sometimes even an OEM relationship. Thus far, many of these integrations have not proven to be as seamless and problem-free as solutions from suppliers that started with de-duplication from the ground up.

Conclusion: Of all the efforts toward streamlined D2D, inline data de-duplication systems are the go-to technology when trying to improve backup processes and power utilization. Data de-duplication addresses power, cooling, and space consumption challenges by optimizing disk capacity and storing more data in less space.
