Data Domain Speeds Up In-Line De-Duplication

Vendor aims for higher enterprise scalability and performance with new model

May 12, 2008


Data Domain has unveiled a new top-end system it claims substantially raises the bar for in-line data de-duplication.

The DD690, announced today, will de-duplicate data at up to 170 Mbyte/s per single-stream backup job, says the vendor. This compares with up to 160 Mbyte/s for a single-stream job using Data Domain's DD580 system, the previous high end of the vendor's roster.

The DD690 features a useable disk capacity (raw capacity minus what will be allocated to RAID, sparing, and so forth) of up to 35.3 Tbytes, compared with up to 21 Tbytes of useable capacity for the DD580.

These performance increases may seem modest to the cynical eye, but Data Domain is intent on one-upping competitors' claims while taking aim at the enterprise customers that haven't so far bellied up to Data Domain.
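To put the vendor's figures in proportion, the short sketch below works out the relative gains from the numbers quoted above. The function name `percent_gain` is an illustrative choice, and the inputs are the vendor's claimed maximums, not measured results.

```python
def percent_gain(old, new):
    """Relative increase from old to new, expressed in percent."""
    return (new - old) / old * 100.0

# Vendor-quoted maximums from the article (claims, not lab measurements).
throughput_gain = percent_gain(160, 170)   # Mbyte/s, single-stream job
capacity_gain = percent_gain(21, 35.3)     # Tbytes of useable capacity

# Entry price per Tbyte for the DD690, using the quoted starting configuration.
dd690_price_per_tbyte = 210_000 / 16       # dollars per Tbyte

print(f"Single-stream throughput gain: {throughput_gain:.2f}%")
print(f"Useable capacity gain: {capacity_gain:.1f}%")
print(f"DD690 entry price per Tbyte: ${dd690_price_per_tbyte:,.0f}")
```

On these figures the single-stream throughput gain is about 6 percent, while useable capacity grows by roughly two-thirds, which suggests the capacity increase is the larger part of the upgrade.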

Up to now, for instance, Diligent (now part of IBM) has claimed a greater attraction for enterprise customers while boasting data ingestion rates of 200 Mbyte/s. Other competitors, such as ExaGrid, claim it's faster to de-duplicate data after it is backed up (post processing) instead of de-duping on the fly, as Data Domain and Diligent do.

Only lab testing will be able to validate any of the vendors' claims for performance, and most analysts say the in-line versus post-processing argument is a red herring. Meanwhile, Data Domain has definitely upped the ante, at least a bit, in terms of the performance and capacity offered on its machinery.
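For readers unfamiliar with the distinction, the sketch below contrasts the two approaches in simplified form. It is a toy model only: it assumes fixed-size chunks and SHA-256 fingerprints, and every name in it (`ChunkStore`, `inline_backup`, `post_process_backup`, `CHUNK_SIZE`) is hypothetical. It does not represent how Data Domain, Diligent, or ExaGrid actually implement de-duplication.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen only for illustration


class ChunkStore:
    """Keeps one copy of each unique chunk, keyed by its fingerprint."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes

    def put(self, chunk):
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if fingerprint not in self.chunks:
            self.chunks[fingerprint] = chunk  # only unseen chunks are stored
        return fingerprint


def split(data, size=CHUNK_SIZE):
    """Cut a byte stream into fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def inline_backup(data, store):
    """In-line: de-duplicate during ingest, so duplicates never land on disk."""
    return [store.put(chunk) for chunk in split(data)]


def post_process_backup(data, store, staging):
    """Post-process: land the full stream first, then de-duplicate it."""
    staging.append(data)                      # full copy occupies staging space
    recipe = [store.put(chunk) for chunk in split(data)]
    staging.remove(data)                      # staging space reclaimed afterward
    return recipe


def restore(recipe, store):
    """Rebuild the original stream from its list of fingerprints."""
    return b"".join(store.chunks[fp] for fp in recipe)


# Two backups of the same data: the second adds no new chunks to the store.
data = b"A" * 8192 + b"B" * 4096
store, staging = ChunkStore(), []
first = inline_backup(data, store)
second = post_process_backup(data, store, staging)
print(len(store.chunks), restore(first, store) == data)
```

Both paths end with the same unique chunks stored. In this simplified model the difference is when the work happens and whether a full staging copy is needed in the meantime, which is roughly what the vendors' competing claims turn on.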

Ironically, though, other features of the new gear are apt to get as much attention as the performance and capacity upgrade. The DD690 is capable of "fanning in," or handling simultaneous replication, for up to 60 remote sites at once, versus up to 20 sites on the DD580. Users can opt for a VTL or NAS version via the vendor's own DDX hardware, or use the new systems as gateways to their own NAS or VTL hardware. Data Domain also claims its new system furthers a trend of using fewer drives to achieve de-duplication, compared with other suppliers' wares.

"We are CPU-centric, so we get 50 percent-plus faster when Intel controllers are upgraded," says Brian Biles, VP of product management at Data Domain. "We've increased throughput by 10x in four years and improved capacity by about 30x."

Data Domain has matched the performance increase on the DD690 with new pricing: The system starts at about $210,000 for 16 Tbytes. The DD580 costs about $120,000. Notably, Data Domain is throwing in a 10-Gbit/s Ethernet NIC with the new product and is adding the NIC to other high-end systems as well.

At least one customer is pleased with the new wares. According to Eddy Navarro, storage team leader at J. Craig Venter Institute (JCVI), a genomic research group, purchase of the new system has freed the scientific organization from reliance on tape. "Before, we backed up on LTO2 and LTO3 tapes, and the cost of procurement of media, plus tape drive failures and read failures were getting to be too much," Navarro says.

The freedom from backup hassles is important in Navarro's network, which contains both NetApp NAS and an EMC SAN, both of which are used to back up genomic data accessed by scientists at JCVI's Rockville, Md., headquarters and remote San Diego office. Now, rather than using NDMP to back up physical drives one by one to tape, Navarro can back up everything in one pass, since the Data Domain appliance presents multiple virtual drives, and it can do so more efficiently than before.

"The ability to create multiple virtual tape drives should get us to a near 100 percent backup success rate, which was not possible before due to the backup jobs fighting for physical tape resources and jobs thus timing out when the backup window ended," Navarro states.

When Navarro's group started evaluating de-duplication back in January 2008, these kinds of issues, not "in-line versus post-processing," were major concerns. Indeed, after evaluating "all the market players" in de-duplication, JCVI decided on Data Domain mainly because the vendor could provide a gateway version to JCVI's NAS and SAN.

JCVI backs up its NetApp filers via Symantec's NetBackup 6.5.1 software. NetBackup sees the Data Domain gateway both as a normal storage unit via its VTL interface and as a disk storage unit via an NFS share. The NetApp filers, which serve the genomic data accessed by JCVI scientists, back up the data via NDMP (Network Data Management Protocol) to the Data Domain gateway over the JCVI SAN. The Data Domain gateway in turn uses SAN storage presented from an EMC CLARiiON storage array.

A key point for Navarro is that the DD690 can handle lots of virtual drives, rather than queuing up multiple physical drives for backup via NDMP, as was the case before.

At the very least, today's announcement gives enterprise customers fresh options when it comes to Data Domain's already strong suite of products. "It's another performance/capacity proof point for Data Domain," states analyst Heidi Biggar of the Enterprise Strategy Group. "They've always made their intentions clear to be an enterprise play -- and this announcement gives them the capacity and performance enterprise customers are looking for."

  • Data Domain Inc. (Nasdaq: DDUP)

  • Diligent Technologies Corp.

  • Enterprise Strategy Group (ESG)

  • ExaGrid Systems Inc.

  • IBM Corp.
