A Data Reduction Dossier

Want to put a lid on the data explosion? Then think about losing some Tbytes

August 5, 2006


If your data center were a waistline, it would probably pinch mightily (and we bet you'd have to leave that top button undone too).

Compression technologies have been around for a while, but what else can you do to reduce data overload? Byte and Switch has scoured the industry to come up with the top tips for data reduction:

Size Up Storage

Any reduction of unnecessary data begins with working out exactly what you have in the way of storage. This may be easier said than done, particularly if data storage is spread across a number of national, or even global, sites.

Sometimes, taking stock of storage turns up areas where cuts can be made. "For every 1 Tbyte of primary storage, there's [potentially] 10 Tbytes of secondary storage," claims Arun Taneja, founder of the Taneja Group consultancy.

Another analyst encourages IT pros to consider their future needs in making the storage assessment. "I think there are a couple of different variables," explains Tony Asaro, senior analyst at the Enterprise Strategy Group. "[For example], are you going to be upgrading your systems, or can you make the most out of what you have got?"

Take Inventory

Key to reducing the amount of data stored is the ability to quickly resolve what to keep and what to toss. Storage resource management (SRM) software, which often features inventory capabilities, could be the answer.

In the U.K., for example, the Wakefield Health Informatics Service, a public sector body in the north of England, slashed around 4 Tbytes of data by removing old and duplicated files through SRM. To achieve this, the service deployed CA's BrightStor Storage Resource Manager software, which reports on the type and age of files. (See CA Unveils BrightStor.)
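
To make the idea concrete, here is a minimal Python sketch of the kind of report an SRM tool produces: files older than a cutoff, plus groups of files with identical contents. It is an illustration only, not how BrightStor or any other product works; the function names, the one-year threshold, and the use of SHA-256 to detect duplicates are all assumptions made for the example.

```python
import hashlib
import os
import time
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def inventory(root, max_age_days=365, now=None):
    """Walk `root` and return (old_files, duplicate_groups).

    `max_age_days` is a hypothetical policy threshold, not a recommendation.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    old_files = []
    by_digest = defaultdict(list)
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            if not os.path.isfile(path):
                continue
            # Age is judged by last-modified time.
            if os.path.getmtime(path) < cutoff:
                old_files.append(path)
            # Files with the same digest are treated as duplicates.
            by_digest[file_digest(path)].append(path)
    duplicates = [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
    return sorted(old_files), duplicates
```

Calling `inventory("/some/share")` returns the two lists for review; nothing is deleted, since deciding what to toss remains a policy call.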

One thing to keep in mind: Typically, SRM offerings are either SAN-centric, such as Symantec's CommandCentral Storage and EMC's ControlCenter, or host-based, such as Tek-Tools' Profiler product. (See Review: SRM Suites.) What you use depends on where data is stored.

Use Thin Provisioning

While this disk technology is not geared specifically to data reduction, at least one source thinks it could be invaluable for users looking to get the most out of their back-end storage. "It doesn't reduce your data per se, but it does optimize capacity because you're not allocating empty blocks" of storage, explains ESG's Asaro. The idea is that, rather than over-allocating capacity to a specific storage application to support future needs, users can allocate only as much capacity as is needed.

3PAR, which pioneered thin provisioning technology, claims to have tackled this problem through a technique called Dedicate-on-Write (DOW). In a nutshell, this means that physical disk capacity is only used when applications actually write data to the storage array. (See 3PAR Debuts 'Thin Provisioning'.) Like 3PAR, LeftHand Networks' SAN/iQ software also draws from a common storage pool on an as-needed basis. Elsewhere, Compellent also plays in this space, and NetApp has added thin provisioning features to its Data OnTap operating system. (See NetApp Freshens What's OnTap and NetApp Makes Virtual Upgrade.)
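
The allocate-on-write idea described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not 3PAR's Dedicate-on-Write or any vendor's implementation: the `ThinVolume` class, its 4-Kbyte block size, and its method names are hypothetical.

```python
class ThinVolume:
    """A toy volume that promises `logical_blocks` but allocates only on write."""

    def __init__(self, logical_blocks, block_size=4096):
        self.logical_blocks = logical_blocks
        self.block_size = block_size
        self._blocks = {}  # logical block number -> bytes actually written

    def _check(self, block_no):
        if not 0 <= block_no < self.logical_blocks:
            raise IndexError("block outside the provisioned range")

    def write(self, block_no, data):
        self._check(block_no)
        if len(data) > self.block_size:
            raise ValueError("data larger than one block")
        # Physical space is consumed here, at write time, not at provisioning.
        self._blocks[block_no] = data.ljust(self.block_size, b"\0")

    def read(self, block_no):
        self._check(block_no)
        # Unwritten blocks read back as zeros and occupy no physical space.
        return self._blocks.get(block_no, b"\0" * self.block_size)

    @property
    def provisioned_bytes(self):
        return self.logical_blocks * self.block_size

    @property
    def allocated_bytes(self):
        return len(self._blocks) * self.block_size
```

A volume created with `ThinVolume(1000)` reports roughly 4 Mbytes provisioned but zero bytes allocated until the first `write` call, which is the gap between promised and consumed capacity that thin provisioning exploits.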

Investigate Tiered Storage

Tiered storage could help shrink data overhead, particularly on the bottom rungs of the storage tier. "You could archive some data off to lower-cost storage that will have single instances," Asaro explains. But be aware that in order to do this, you may need to overcome some of the major challenges associated with Information Lifecycle Management (ILM). (See Users Cite ILM Shortfalls and Intel Faces ILM Challenge.)

Take Snapshots

"With writeable snapshots you are creating a logical copy so the capacity requirements would be zero or minimal," says Asaro. The idea here is that, instead of taking a full copy of primary data for disaster recovery or testing purposes, users can instead take a writeable snapshot, a view of the data at a given point in time. StoreAge offers this type of technology, as does Compellent, and NetApp has enhanced its Data OnTap operating system with writeable snapshots. (See NetApp Makes Virtual Upgrade.)
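
Why a writeable snapshot starts out needing almost no capacity is easier to see in code. The Python sketch below uses a simple copy-on-write scheme; it is an assumption-laden illustration rather than the method used by StoreAge, Compellent, or NetApp, and the `Volume` and `WritableSnapshot` classes are hypothetical.

```python
class Volume:
    """A toy primary volume that supports copy-on-write snapshots."""

    def __init__(self):
        self._blocks = {}
        self._snapshots = []

    def read(self, block_no):
        return self._blocks.get(block_no)

    def write(self, block_no, data):
        # Before overwriting, hand the old contents to each snapshot so its
        # point-in-time view stays intact.
        for snap in self._snapshots:
            snap._preserve(block_no, self._blocks.get(block_no))
        self._blocks[block_no] = data

    def snapshot(self):
        snap = WritableSnapshot(self)
        self._snapshots.append(snap)
        return snap


class WritableSnapshot:
    """A logical copy: shares the origin's blocks, stores only differences."""

    def __init__(self, origin):
        self._origin = origin
        self._own = {}  # blocks preserved from the origin or written here

    def _preserve(self, block_no, old_data):
        if block_no not in self._own:
            self._own[block_no] = old_data

    def read(self, block_no):
        if block_no in self._own:
            return self._own[block_no]
        return self._origin.read(block_no)

    def write(self, block_no, data):
        # Writes land in the snapshot only; the origin is never touched.
        self._own[block_no] = data

    @property
    def private_blocks(self):
        return len(self._own)
```

A freshly taken snapshot holds no private blocks at all; its footprint grows only as the snapshot or the primary volume diverges, which matches Asaro's "zero or minimal" description.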

Shrink Email

Email needs to be stored and managed somewhere. (See Email Archiving to Hit $7.8B and Email Archiving to Grow.) Specialized email archiving products are nothing new, although some vendors are also touting data reduction capabilities. (See Iron Mountain Upgrades Connected, Into the Email Backup Maze, CA Resells Arkivio, and Archiving's Active Alliances.)

Iron Mountain, for example, offers a feature called EmailOptimizer that works with local email archive files to analyze text and attachments. By saving only changed data, and eliminating saves of duplicate data, the company claims that it can cut the amount of storage space used for email backups by around a third.

Wise Up to WAFS

Wide Area File Services (WAFS) technologies, which are gaining momentum amongst users, also fit into the data reduction equation. (See Users Rally Round Remote Solutions.) "The whole thesis of WAFS is to eliminate the proliferation of data in all kinds of remote sites and to consolidate that data into a central site," says Arun Taneja, adding that Riverbed offers a technology called SDR that's designed primarily for data reduction.

Over the last couple of years, a slew of vendors have launched products designed to tackle various elements of the branch office problem -- from application performance to WAN optimization and WAFS. Many are combining technologies, believing users can consolidate resources -- such as file servers -- and share their data more effectively. (See Vendors Plan a Week to Watch, Brocade, Packeteer Team Up, WAN Market Tops $236M, and Packeteer Picks Tacit.)

Adopt De-duplication

Data de-duplication technologies transmit only data that has changed since the last backup, as opposed to the traditional model of backing up all data onsite every day or week. (See De-Dupe Streamlines Backup.) A slew of vendors currently offer products, including ADIC, Asigra, Avamar, Data Domain, Diligent, and Symantec, typically split between agent-based and agentless approaches.

Data Domain was among the first storage vendors to use data de-duplication to compress data, although other vendors are also getting into this space. (See Users Ponder Tape Independence, FalconStor Extends VTL, and FalconStor Plots De-Dupe Debut.) Sepaton, for example, is currently beta testing its DeltaStor data de-duplication software, with general availability expected by the end of the year. (See Sepaton Readies De-Dupe and De-Dupers Lining Up.)

Another startup, Exagrid, claims to be taking a different tack from the established de-duplication posse. Fred Pinkett, the firm's vice president of market development, told Byte and Switch that Exagrid searches for byte-level changes in backups, as opposed to repeated data in the likes of Word documents, which he claims enables users to reduce their data even further. (See Exagrid Intros 2.1 Backup.)
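
For readers who want to see the basic mechanism, the Python sketch below shows one common way to de-duplicate: split each backup into chunks, fingerprint every chunk, and store a chunk only the first time it is seen. It does not represent any of the products named above. The fixed-size chunking, the SHA-256 fingerprints, and the `DedupStore` class are assumptions chosen to keep the example short, and commercial products may chunk and compare data quite differently.

```python
import hashlib


class DedupStore:
    """A toy backup store that keeps each unique chunk only once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self._chunks = {}   # fingerprint -> chunk bytes
        self._backups = {}  # backup name -> ordered list of fingerprints

    def backup(self, name, data):
        """Store `data` under `name`; return the number of new bytes kept."""
        new_bytes = 0
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            if fingerprint not in self._chunks:
                # Only chunks never seen before consume storage.
                self._chunks[fingerprint] = chunk
                new_bytes += len(chunk)
            recipe.append(fingerprint)
        self._backups[name] = recipe
        return new_bytes

    def restore(self, name):
        """Rebuild a backup from its list of chunk fingerprints."""
        return b"".join(self._chunks[fp] for fp in self._backups[name])
```

If Tuesday's backup differs from Monday's in a single chunk, the second `backup` call stores just that one chunk, yet `restore` still returns each day's data in full.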

James Rogers, Senior Editor, Byte and Switch

  • Advanced Digital Information Corp. (Nasdaq: ADIC)

  • Asigra Inc.

  • Avamar Technologies Inc.

  • CA Inc. (NYSE: CA)

  • Compellent Technologies Inc.

  • Data Domain Inc. (Nasdaq: DDUP)

  • Diligent Technologies Corp.

  • EMC Corp. (NYSE: EMC)

  • Enterprise Strategy Group (ESG)

  • ExaGrid Systems Inc.

  • FalconStor Software Inc. (Nasdaq: FALC)

  • Intel Corp. (Nasdaq: INTC)

  • Iron Mountain Inc. (NYSE: IRM)

  • LeftHand Networks Inc.

  • Network Appliance Inc. (Nasdaq: NTAP)

  • Riverbed Technology Inc. (Nasdaq: RVBD)

  • Sepaton Inc.

  • StoreAge Networking Technologies Ltd.

  • Symantec Corp. (Nasdaq: SYMC)

  • Taneja Group

  • Tek-Tools Inc.

  • 3PAR Inc.
