11:30 AM -- I chided Dell and CommVault last year for pitching Dell's DL200 backup server (driven by CommVault's Simpana backup, formerly known as Galaxy) as bringing data de-duplication to the masses when it really did file-level, single-instance storage. Now, CommVault's new Simpana 8 suite leapfrogs the major players by integrating block-level data de-duplication into the core data management agent behind the backup and archiving functions of the Simpana suite. As if that weren't enough, they've also extended the de-dupe functionality to tape.
Before sending data to a media server, the Simpana 8 backup agent performs the blocking and hash calculations, then sends the hash values along with the data. The media server then identifies duplicate blocks and stores the data on any disk resource available to Simpana -- DAS, SAN, or NAS.
This approach should use somewhat fewer host CPU cycles than Avamar or PureDisk, which conduct a more complex conversation with the data store server(s) to identify unique blocks before sending them. The trade-off is that Simpana will send more data over the net. Simpana will de-dupe globally to minimize network traffic between media servers, so remote offices with local backup pools will use less network bandwidth than those without.
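To make the trade-off concrete, here is a minimal Python sketch of the two styles of conversation. It is not CommVault's, Avamar's, or PureDisk's actual protocol: the fixed 4-byte block size, the use of SHA-256, and the names `split_and_hash`, `MediaServer`, `ingest_with_data`, and `ingest_after_query` are all assumptions made up for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical; tiny so the example is easy to follow


def split_and_hash(data, block_size=BLOCK_SIZE):
    """Agent side: cut data into fixed-size blocks and hash each one."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    return [(hashlib.sha256(b).hexdigest(), b) for b in blocks]


class MediaServer:
    """Toy stand-in for a media server with a hash-indexed block store."""

    def __init__(self):
        self.store = {}         # hash -> unique block
        self.payload_bytes = 0  # block data received over the network

    def ingest_with_data(self, hashed_blocks):
        """Every block arrives with its hash; duplicates are dropped here."""
        recipe = []
        for digest, block in hashed_blocks:
            self.payload_bytes += len(block)
            self.store.setdefault(digest, block)
            recipe.append(digest)
        return recipe

    def ingest_after_query(self, hashed_blocks):
        """Hashes are checked first; only unknown blocks are transferred."""
        recipe = []
        for digest, block in hashed_blocks:
            if digest not in self.store:  # stands in for a network round trip
                self.payload_bytes += len(block)
                self.store[digest] = block
            recipe.append(digest)
        return recipe

    def restore(self, recipe):
        """Rebuild the original byte stream from its list of hashes."""
        return b"".join(self.store[d] for d in recipe)


payload = b"AAAABBBBAAAACCCCAAAA"
hashed = split_and_hash(payload)

simple = MediaServer()
recipe = simple.ingest_with_data(hashed)

chatty = MediaServer()
chatty.ingest_after_query(hashed)

print(len(simple.store), simple.payload_bytes, chatty.payload_bytes)
```

Running it prints `3 20 12`: both servers end up storing three unique blocks, but the simpler conversation moves all 20 bytes of block data while the query-first one moves 12, at the cost of a lookup per block before anything is sent.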
When a job spools datasets off to tape, it copies the blocks that contain data from any of the files or other objects in the dataset and creates a new hash catalogue that it writes to the tape. The tape can then be read by any Simpana 8 media server, but restores will require some cache disk space.
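A rough Python sketch of that spooling step follows. The real on-tape format is not described here, so everything in it is an assumption for illustration: the `spool_to_tape` and `restore_from_tape` functions, the dictionary layout of the "tape", and the in-memory `cache` that stands in for cache disk during a restore.

```python
import hashlib


def digest_of(block):
    """Hash a block; SHA-256 is an assumption, not the product's algorithm."""
    return hashlib.sha256(block).hexdigest()


def spool_to_tape(pool, dataset):
    """Copy only the blocks the dataset references and build a new catalogue.

    pool    -- hash -> block, the de-duplicated disk store
    dataset -- object name -> ordered list of block hashes
    """
    tape_blocks = []  # blocks in the order they are written to tape
    catalogue = {}    # hash -> position on this tape
    for hashes in dataset.values():
        for h in hashes:
            if h not in catalogue:  # write each needed block only once
                catalogue[h] = len(tape_blocks)
                tape_blocks.append(pool[h])
    return {"catalogue": catalogue, "blocks": tape_blocks, "dataset": dict(dataset)}


def restore_from_tape(tape, name):
    """Stage the tape's blocks into a cache, then reassemble one object."""
    cache = {h: tape["blocks"][pos] for h, pos in tape["catalogue"].items()}
    return b"".join(cache[h] for h in tape["dataset"][name])


# A disk pool holding four unique blocks; the dataset uses only three of them.
blocks = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
pool = {digest_of(b): b for b in blocks}
a, b, c, d = (digest_of(x) for x in blocks)

dataset = {"file1": [a, b, a], "file2": [b, c]}
tape = spool_to_tape(pool, dataset)
restored = restore_from_tape(tape, "file1")
```

Because the tape carries its own catalogue, nothing in `restore_from_tape` refers back to the original pool, which is one way to read the claim that any media server can read the tape.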
Simpana is content aware, seeing all the file system metadata as backups or archive jobs run and taking that into account when dividing the data up into blocks to de-dupe.
Howard Marks is founder and chief scientist at Deepstorage LLC, a storage consultancy and independent test lab based in Santa Fe, N.M. and concentrating on storage and data center networking. In more than 25 years of consulting, Marks has designed and implemented storage ...