Timecruiser aims for big savings by using de-duplication with RAID 6

August 18, 2007

Educational service provider Timecruiser is looking to slash its ongoing storage costs by overhauling its backup processes with de-duplication.

The vendor has deployed the technology as part of an effort to cut the cost of expensive RAID 6 hardware. By significantly reducing the amount of data it stores, Timecruiser is looking to shave almost a quarter of a million dollars off its hardware costs between now and 2010.

Timecruiser, which stores emails, course materials, and student coursework for around 80 U.S. colleges, currently backs up over 3 Tbytes of customer data on a regular basis. The Fairfield, N.J.-based company is required to store emails for between one and three years, and academic data has to be kept for up to seven years, according to CTO James Wang.

The sheer volume of backups, along with expectations that data volumes are set to explode, had Wang keen to save on the costs of RAID 6 hardware. "The amount of [RAID 6] storage required to store all this historical data would be too great for us."

RAID 6 extends RAID 5's support for redundant arrays with additional parity to protect against the potential for multiple drive failures. (See What's the Buzz I'm Hearing About RAID 6?) But even though RAID 6 is being deployed more widely than ever, the technology's cost per Mbyte comes at a premium, thanks to its requirement for an extra controller and at least four hard disks per instance. (See Capacity Considerations and Adaptec, Intel Team on RAID.)

Wang knew he'd have to reduce the amount of data heading for RAID 6 drives in order to save on capital costs. His group therefore deployed a FalconStor VTL and RAID 6 Single Instance Repository (SIR) device for de-duplicating data earlier this year. (See FalconStor Launches SIR.)

De-duplication, which aims to reduce the bulk of backed-up storage by ensuring that the same information is not stored in two places, is growing in popularity. (See Insider: De-Dupe Demystified, Data Domain Goes Public, Storage Bubble Wrap, Quantum Gains on Rival's IPO, and De-Dupe Vendors Shake Hands.)
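
The idea of storing each piece of information only once can be sketched in a few lines of Python. This is a minimal illustration, not FalconStor's SIR algorithm: the fixed 4-Kbyte chunk size, the use of SHA-256 fingerprints, and the function names `dedupe_store` and `restore` are all assumptions made for the example.

```python
import hashlib

def dedupe_store(data: bytes, store: dict, chunk_size: int = 4096) -> list:
    """Split data into fixed-size chunks and keep each unique chunk once.

    Returns the ordered list of fingerprints needed to rebuild `data`.
    The chunk size and hash choice are illustrative assumptions.
    """
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # written only if not already present
        recipe.append(digest)
    return recipe

def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from its list of fingerprints."""
    return b"".join(store[digest] for digest in recipe)

# Two hypothetical nightly backups that share most of their content.
store = {}
monday = b"A" * 8192 + b"B" * 4096
tuesday = b"A" * 8192 + b"C" * 4096
monday_recipe = dedupe_store(monday, store)
tuesday_recipe = dedupe_store(tuesday, store)

logical = len(monday) + len(tuesday)             # bytes the backups represent
physical = sum(len(c) for c in store.values())   # bytes actually kept
print(f"de-dupe ratio: {logical / physical:.0f} to 1")  # de-dupe ratio: 2 to 1
```

Each backup can still be restored in full from its list of fingerprints, even though the shared chunks exist only once in the store.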

For Timecruiser, the savings offered by de-duplication could easily reach $80,000 a year, says Wang. "It could save us having to buy three to four arrays a year [and] it's $15,000 to $20,000 an array," he explains. "That's a lot of money."
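
Wang's figures can be checked with some quick arithmetic. The sketch below multiplies his array counts by his per-array prices; the three-year horizon (roughly mid-2007 to 2010) is an assumption, used to compare the result with the "almost a quarter of a million dollars" the company hopes to save.

```python
# Figures quoted by Wang: three to four arrays a year, $15,000 to $20,000 each.
arrays_per_year = (3, 4)
cost_per_array = (15_000, 20_000)

low_per_year = arrays_per_year[0] * cost_per_array[0]
high_per_year = arrays_per_year[1] * cost_per_array[1]

years = 3  # assumption: roughly mid-2007 through 2010

print(low_per_year)           # 45000
print(high_per_year)          # 80000
print(high_per_year * years)  # 240000
```

On those assumptions, the upper end of Wang's estimate works out to $240,000 over three years, in line with the savings target.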

Prior to this year, Timecruiser backed up its customer data from a 7-Tbyte FalconStor IPStor appliance to a 7-Tbyte RAID 6 array, also from FalconStor, although neither supported de-duplication. (See RKB Installs FalconStor and FalconStor Helps Power DR Facility.)

Now data is backed up from the IPStor appliance to the VTL. From there it is sent, via a Fibre Channel link, to the SIR device, where it is de-duplicated.

Both the VTL and the SIR have 7-Tbyte capacities, and Wang sees de-dupe as a way to significantly slow down his data growth. "We have used it in production for roughly three months, and we see a 4-to-1 de-dupe ratio," he says, adding that this should improve over the coming months. "I hope to see a ratio of 10 to 1, because, as more data accumulates, more of it can be de-duplicated."
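
To put those ratios in perspective, the rough calculation below shows how much backup data a 7-Tbyte repository could represent at 4 to 1 and at 10 to 1. It is a simplified sketch: it ignores index and metadata overhead, and the function names are hypothetical.

```python
def logical_capacity_tb(physical_tb: int, dedupe_ratio: int) -> int:
    """Backup data (in Tbytes) that fits in a repository at a given ratio."""
    return physical_tb * dedupe_ratio

def space_saved(dedupe_ratio: int) -> float:
    """Fraction of raw backup data that never has to be stored."""
    return 1 - 1 / dedupe_ratio

SIR_TB = 7  # capacity of the SIR device, as reported in the article

for ratio in (4, 10):
    capacity = logical_capacity_tb(SIR_TB, ratio)
    print(f"{ratio} to 1: {capacity} Tbytes of backups, {space_saved(ratio):.0%} saved")
# 4 to 1: 28 Tbytes of backups, 75% saved
# 10 to 1: 70 Tbytes of backups, 90% saved
```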

Although he would not reveal specific pricing, the exec confirmed that he spent "less than $100,000" on the VTL and SIR device, including a discount, and expects to get a return on his investment within a year.

Timecruiser, which predominantly relies on FalconStor storage, did not evaluate de-dupe offerings from other vendors in this space, although Wang told Byte and Switch that he is content with his choice.

FalconStor, like rival Sepaton, uses a form of de-duplication called post-processing, which takes place after the data has been received from the back-up servers. (See Sepaton Adds De-Dupe to VTL and Sepaton, Hifn Partner.) Other vendors, such as Data Domain, offer a form of de-dupe called inline processing, which takes place as data is being received from the backup servers. (See Data Domain Goes Public and Data Domain Unveils DD580.)

The big benefit of post-processing is that it is less likely to slow down backups, although it is seen as a better fit for enterprises with extra disk capacity to store the data until it is de-duped. (See Quantum to Offer De-Dupe Duo.)

This is not something that worries Wang. "I am comfortable with it because I know that our VTL technology is very reliable."
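
The capacity trade-off between the two approaches can be illustrated with a toy model. This is a simplified sketch rather than a description of any vendor's product: it assumes the backup arrives as a list of chunks, that post-processing lands the whole backup in a staging area before de-duplicating it, and that inline processing fingerprints each chunk on arrival. The function names are hypothetical.

```python
import hashlib

def unique_bytes(chunks: list) -> int:
    """Bytes left once every distinct chunk is stored exactly once."""
    seen, total = set(), 0
    for chunk in chunks:
        digest = hashlib.sha256(chunk).digest()
        if digest not in seen:
            seen.add(digest)
            total += len(chunk)
    return total

def inline(chunks: list) -> dict:
    """Inline: duplicates are dropped on arrival, so no staging space is used."""
    return {"staging": 0, "repository": unique_bytes(chunks)}

def post_process(chunks: list) -> dict:
    """Post-processing: the full backup is staged first, then de-duplicated."""
    staged = sum(len(chunk) for chunk in chunks)
    return {"staging": staged, "repository": unique_bytes(chunks)}

# A hypothetical backup of eight 1-Kbyte chunks, only two of them distinct.
backup = [b"A" * 1024] * 6 + [b"B" * 1024] * 2

print(inline(backup))        # {'staging': 0, 'repository': 2048}
print(post_process(backup))  # {'staging': 8192, 'repository': 2048}
```

Both approaches end up with the same repository; the difference in this model is the staging space that post-processing needs until the de-dupe pass has run.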

The exec nonetheless admits that de-duplication presents some unique challenges, particularly for a firm handling sensitive academic data. "With the SIR de-dupe, we only have one single copy, so the security of that copy becomes critical," he says, explaining that, previously, full copies of all the data were made daily.

In an attempt to build additional layers of security into this process, Wang and his team are looking to add remote replication to their de-dupe infrastructure, possibly replicating SIR data to an offsite PrimeVault device, also from FalconStor. (See FalconStor Focuses on China.) "We're testing the compression ratio to make sure that it is feasible with that much data," he explains.

The exec is also tweaking other parts of his data center infrastructure. The array replaced by the SIR device, for example, has been re-configured as a RAID 10 appliance, and is now being used to support one of Timecruiser's databases. "RAID 10 offers higher levels of reliability [than RAID 6] because data is mirrored within the array."
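
For readers comparing the two RAID levels, the sketch below applies the standard usable-capacity formulas: RAID 6 gives up two disks' worth of space to parity, while RAID 10 gives up half its disks to mirroring. The eight-disk, 1-Tbyte-per-disk array is a hypothetical example, not Timecruiser's actual configuration.

```python
def raid6_usable_tb(disks: int, disk_tb: int) -> int:
    """RAID 6 keeps two disks' worth of parity; it needs at least four disks."""
    if disks < 4:
        raise ValueError("RAID 6 needs at least four disks")
    return (disks - 2) * disk_tb

def raid10_usable_tb(disks: int, disk_tb: int) -> int:
    """RAID 10 mirrors every disk; it needs an even count of at least four."""
    if disks < 4 or disks % 2:
        raise ValueError("RAID 10 needs an even number of disks, four or more")
    return (disks // 2) * disk_tb

# Hypothetical array: eight disks of 1 Tbyte each.
print(raid6_usable_tb(8, 1))   # 6
print(raid10_usable_tb(8, 1))  # 4
```

RAID 6 survives the loss of any two drives. RAID 10 is guaranteed to survive one, and survives more only when the failed drives sit in different mirrored pairs.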

— James Rogers, Senior Editor Byte and Switch

  • Data Domain Inc. (Nasdaq: DDUP)

  • FalconStor Software Inc. (Nasdaq: FALC)

  • Quantum Corp. (NYSE: QTM)

  • Sepaton Inc.
