HP Dedupe Comes Of Age


November 29, 2011

Network Computing

According to HP, there is a massive information gap--with more than 50% of information within an organization unconnected, undiscovered and unused--but it says it can help harness the power of 100% of this data to drive insight, foresight and action. The world's largest IT vendor is making a number of storage-related announcements under its Information Optimization banner, including the debut of what it calls "deduplication 2.0." HP has had minimal participation in the deduplication market but now expects to pose a significant challenge to the dominant vendor, EMC.

Just over a week ago, HP reported fiscal 2011 results of non-GAAP net revenue of $127.4 billion, with the Enterprise Servers, Storage and Networking group up almost 10% year over year to $22.241 billion. The storage portfolio showed revenue growth of 4% year over year.

As part of its storage news, the company is announcing the HP B6200 StoreOnce Backup System, which it bills as the "industry’s fastest large-scale deduplication appliance at 28 Tbytes per hour," more than three times the rate of competing systems. HP says the system eliminates multiple backed-up versions of the same information, lowering storage capacity requirements by up to 95%, and delivers rapid data recovery without impacting daily workflow. The new Data Protector software integrates the HP Labs StoreOnce deduplication engine to deliver data backup for smaller IT environments.
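To put those headline figures in perspective, the short sketch below converts a capacity reduction into the N:1 ratio vendors often quote and estimates an ingest time at the claimed rate. The function names and the 100-Tbyte data set are assumptions for illustration, not HP figures. A 95% reduction works out to roughly a 20:1 ratio, and 100 Tbytes at 28 Tbytes per hour would take about 3.6 hours.

```python
def dedupe_ratio(reduction: float) -> float:
    """Convert a capacity reduction (0.95 means 95%) into an N:1 ratio."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1 / (1 - reduction)


def physical_tbytes(logical_tbytes: float, reduction: float) -> float:
    """Capacity left on disk after deduplication removes `reduction` of the data."""
    return logical_tbytes * (1 - reduction)


def hours_to_ingest(logical_tbytes: float, tbytes_per_hour: float = 28.0) -> float:
    """Time to back up a data set at a sustained ingest rate (28 is HP's claim)."""
    return logical_tbytes / tbytes_per_hour


if __name__ == "__main__":
    data_set = 100.0  # hypothetical 100-Tbyte backup, not an HP number
    print(f"ratio: {dedupe_ratio(0.95):.1f}:1")
    print(f"on disk: {physical_tbytes(data_set, 0.95):.1f} Tbytes")
    print(f"ingest time: {hours_to_ingest(data_set):.2f} hours")
```

Both numbers are best cases ("up to" 95%, peak ingest rate), so real results would depend on how much of the backup stream is actually redundant.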

Targeted at midsize clients, the X5000 G2 Network Storage System, designed by HP and Microsoft, can support over 10,000 users on a single system and can be expanded to more than 100 Tbytes of capacity. The B6200 StoreOnce 48-Tbyte Backup System is available with a list price starting at $250,000. The X5000 G2 starts at $30,229.

HP hasn't been doing much in data deduplication, says Arun Taneja, founder, president and consulting analyst at Taneja Group. He says HP announced StoreOnce for the low end of the market but didn't really push it, and seemed to be waiting for the major event. This is it.

"HP has arrived, at least on the strategy part, and they have arrived big time." EMC was first to market and has shipped a ton of product, but it has a variety of products and has been inconsistent, says Taneja. HP is late to the market, but for the first time it has product that can compete with EMC across the board with better performance and better scalability. "I think it is a major event for HP. Large enterprises will for the first time have an option. We'll see how aggressively they will compete with EMC."

Lauren Whitehouse, senior analyst, Enterprise Strategy Group (ESG), likes the HP approach. "First, they have a modular approach to deduplication … one algorithm applied in backup hardware and software [and, in the future, it can be applied to primary storage systems]. This is not unlike Symantec’s current strategy. The beauty of that is that it allows users to decide how best to apply deduplication to meet performance requirements per workload, and it’s interchangeable. Down the road, it could mean the difference between a cumbersome transfer process and a streamlined one. Data won’t have to be rehydrated and re-deduplicated as it moves from one storage system to the next."

Another thing that differentiates HP is its target deduplication architecture--that is, scale-out. Whitehouse says it can grow from a single 48-Tbyte two-node couplet to up to 768 Tbytes of storage capacity based on a massive single namespace. "This allows for seamless scalability, enabling nodes to be added to the configuration without downtime. This is in marked contrast to first-phase scale-up deduplication appliances [what HP refers to as Deduplication 1.0] that, after hitting the maximum capacity threshold, require a disruptive forklift upgrade or additional stand-alone appliances to expand--introducing inefficiency, downtime and management overhead issues. This architecture also lends itself to high availability, which HP is delivering. Node failure is eliminated by pairing nodes within a couplet, so the surviving node can take over if its companion node fails."

Lastly, HP did some things differently in its deduplication method, says Whitehouse--specifically, smaller variable-sized chunks used for comparisons and a sparse index to deliver higher redundancy matching using a small, memory-resident index. This approach allows for a high reduction ratio without performance trade-offs. "More importantly, HP promises faster restore speeds. StoreOnce avoids a high degree of fragmentation by not replacing small amounts of duplicate data with pointers to faraway places with no other related data. Data is also defragmented after deduplication. The result is that restoring data takes less time because reconstituting it does not require many slow random seeks."

Whitehouse says the data dedupe market is in flux. The first wave of solutions was mainly hardware-based, but now deduplication is available in backup hardware and software solutions, so there is more choice. "HP can satisfy requirements for both hardware and software in a single solution. Additionally, they’ve leap-frogged competitors with backup performance [up to 28 Tbytes per hour], restore performance [28 Tbytes per hour] and price [HP claims its solution costs 20% less than rivals']. HP has a good footprint in the market. Their offering should get people’s attention and be disruptive."

Wikibon founder and consulting analyst David Vellante says HP's Information Optimization will have a heavy dose of Autonomy (the infrastructure software developer acquired a month ago for just over $10 billion). "Deduplication 2.0 is a big part of this for several reasons. Including deduplication naturally saves on space and fits into an optimization play. StoreOnce is IP developed by HP Labs, and HP Labs needs more commercial wins. The HP logo was 'Invent,' and HP needs to get back its invention mojo. StoreOnce is a start; HP was later to the market with its own data deduplication, so this is a big deal for HP customers in that they now have an HP home-grown solution, versus an OEM option or a non-HP option. This allows HP customers to wrap more efficient backup into a larger HP converged infrastructure deal."

The other thing is the vision of StoreOnce--that is, that you can deduplicate everywhere (primary, secondary and backup) without having to re-hydrate data, says Vellante. "This is a futures statement, but on paper HP could deliver on that vision and if it does it could have a big impact."

From a competitive perspective, Vellante says, this is about the emerging storage wars amongst the big whales (HP, EMC, IBM and Dell). "EMC is the dominant player in the deduplication market, with Avamar and Data Domain. It has about two-thirds of the market, and HP is intent on getting its piece of the pie. Just by having a solution under the HP brand, HP will gain share."

See more on this topic by subscribing to Network Computing Pro Reports Research: Deduplication Grows Up (free, registration required).
