George Crump



Deduplicating Replication - Atempo

Enterprise backup software maker Atempo has entered the deduplication market, joining a growing list of other software vendors. Hopefully, we can cover all or most of them in future entries. Atempo has integrated deduplication into its Time Navigator agent software, which is installed on each server to be backed up. Atempo's deduplication is source-side, meaning it reduces not only the amount of data stored on disk but also the amount of data transferred across the network. The deduplicating agent has been available for Linux, Macintosh, Windows, AIX and HP-UX operating systems, and Atempo recently added support for Solaris, Windows 2008 and Windows 7.

A concern that I have had with source-side deduplication is the additional load it places on the servers being backed up. The process of identifying duplicate information can burden the local system. Atempo says it has not seen any reports of significant impact. Time allowing, I'd like to get these applications into the lab and measure the impact myself; this is worth testing.

Atempo has taken the traditional three-tier architecture of backup applications (client, server, device) and added a fourth tier that it calls HyperStream Server. HyperStream is software only; you supply the server hardware and storage for it to use. An agent that is using deduplication sends its data to the HyperStream server, which stores and manages the deduplicated data repository. Before sending a new block, the agent checks with the HyperStream server to see whether that data already exists. If the HyperStream server has a prior copy, it updates its reference and tells the agent not to send the block. If it does not, the agent sends the block, optionally compressing it first, and HyperStream stores the data and updates its reference database.
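To make the check-before-send exchange concrete, here is a minimal sketch in Python. It is not Atempo's implementation: the class and function names are hypothetical, and the fixed 4 KB block size, the SHA-256 fingerprint and the zlib compression are assumptions chosen for illustration, since the details of how HyperStream identifies blocks are not described here.

```python
import hashlib
import zlib


class DedupServer:
    """Hypothetical stand-in for a deduplicating repository server."""

    def __init__(self):
        self.blocks = {}    # fingerprint -> compressed block bytes
        self.refcount = {}  # fingerprint -> number of references to the block

    def has_block(self, fingerprint):
        """Return True and add a reference if the block is already stored."""
        if fingerprint in self.blocks:
            self.refcount[fingerprint] += 1
            return True
        return False

    def store_block(self, fingerprint, compressed):
        """Store a new block and record its first reference."""
        self.blocks[fingerprint] = compressed
        self.refcount[fingerprint] = 1


def backup(data, server, block_size=4096):
    """Agent side: ask the server before sending each block.

    Returns the list of fingerprints that describes the data and the
    number of compressed bytes that actually had to be sent.
    """
    recipe = []
    bytes_sent = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        recipe.append(fingerprint)
        if not server.has_block(fingerprint):
            # Only unknown blocks are compressed and sent.
            payload = zlib.compress(block)
            server.store_block(fingerprint, payload)
            bytes_sent += len(payload)
    return recipe, bytes_sent
```

Running `backup` twice over the same data against one `DedupServer` sends nothing the second time, which is the network saving that source-side deduplication is after. The hashing in the loop is also where the client CPU cost comes from.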

HyperStream Server is controlled and managed from Time Navigator, Atempo's server software. Deduplication is optional, so if you choose to send data directly to tape or standard disk, that data goes straight to the Time Navigator server. Time Navigator and HyperStream can coexist on the same physical server, but each data type has its own separate storage area. When it comes to replication, one HyperStream server can communicate with another. Like the client software, it can compress data before sending it to the remote HyperStream server, and it sends only unique information. At this point, replication is one-to-one; Atempo does not yet have the ability to fan in multiple sites to a single site.
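The one-to-one replication described above can be pictured as copying only those blocks the remote repository lacks. The sketch below is an illustration under assumptions, not Atempo's protocol: each repository is modeled as a plain dictionary keyed by a block fingerprint, and the function name `replicate` and the use of zlib are hypothetical.

```python
import zlib


def replicate(source_blocks, target_blocks):
    """Copy blocks missing from the target; return compressed bytes sent.

    Both arguments are dicts mapping a block fingerprint to block bytes.
    """
    bytes_sent = 0
    for fingerprint, block in source_blocks.items():
        if fingerprint in target_blocks:
            continue  # the remote side already holds this block
        payload = zlib.compress(block)  # compress before crossing the link
        bytes_sent += len(payload)
        target_blocks[fingerprint] = zlib.decompress(payload)
    return bytes_sent
```

A second call with the same two repositories returns 0, because every block is already present on the remote side.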

Assuming there are enough client CPU cycles available for source-side deduplication, there is a clear advantage in not having to send all the data before finding out that it already exists and will be discarded. The addition of a fourth tier, in this case HyperStream, is really not much different from adding a deduplication appliance. If you are re-evaluating your backup software, then looking at one with deduplication makes sense.
