Backup & Archive: Not Synonymous
Research says many organizations don't know the difference
October 7, 2006
Do you know your backup from your archives? A slew of recent vendor-sponsored surveys cast doubt on whether you do -- or, if you do, whether you're actually doing either task correctly.
A survey by the Association for Information and Image Management (AIIM), the Enterprise Content Management Association, indicates storage consumers are basically ignorant when it comes to archiving.
"Most organizations confuse email backup -- being able to reconstruct a system from a specific point in time in the event of a failure -- with email archiving," the report says. "Email archiving points to the need to identify what needs to be saved, why it needs to be saved, and putting in place the technology resources necessary to archive email and be able to reproduce it in the event of an inquiry or litigation."
The survey report states that 45.9 percent of the 1,000 organizations surveyed consider email archiving the responsibility of individual employees. Another 25.5 percent consider it part of an overall information management strategy, compared to 8.4 percent who see it as a standalone application. AIIM's report on the survey says most organizations consider archiving a collection of massive .pst backup files.
In another survey of 533 securities companies conducted by compliance management firm Orchestria and data protection vendor Iron Mountain, 62 percent of respondents said they have no way to efficiently identify and search for emails in their archives.
Yet another survey, this one by archiving software vendor BridgeHead Software, found most companies at least paying attention to archiving. (See Compliance Ranks Third.) Fifty-two percent of the companies surveyed by BridgeHead said they would look at email archiving and 67 percent would look at file archiving over the next year. But BridgeHead CEO Tony Cotterill suspects many of the archiving planners are merely considering tape backup without search and indexing features.
"One area of potential concern is that the high response for archiving with tape might actually come from IT departments continuing to use tape-based backups as their strategy for long-term data retention," Cotterill stated in a release regarding the survey. "Backup simply doesn't have the granular data management functionality to address compliant data-level retention, destruction, access control, or authentication needs over long periods."
The notion that organizations confuse archiving and backup isn't new. The July 2006 Byte and Switch Insider report on email archiving also raised the issue. (See Insider: Email Archiving Hits Bottom.) According to that report, backup involves making point-in-time copies of data to protect against hardware failures or catastrophic data loss and includes operating systems as well as applications.
Archiving, in contrast, requires fast file-level access to data and should involve search and retrieval software and indexed repositories. Where backup volumes are usually kept for days before being replaced by new volumes, archived files can be kept for years or even decades.
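The distinction can be made concrete with a small sketch. The Python below is purely illustrative and not drawn from the Insider report or any vendor's product: the class names `BackupRotation`, `ArchivedItem`, and `Archive`, the two-volume rotation, and the sample mailbox data are all assumptions. It shows a backup as a short rotation of whole-system point-in-time copies, and an archive as individually stored items with a retention period and a word index for search.

```python
from collections import deque
from dataclasses import dataclass
from datetime import date


class BackupRotation:
    """Hypothetical backup: keeps only the most recent point-in-time volumes."""

    def __init__(self, keep: int) -> None:
        # A bounded deque drops the oldest volume when a new one is added.
        self.volumes = deque(maxlen=keep)

    def take_snapshot(self, taken_on: date, system_image: dict) -> None:
        # A backup copies the whole system state, not individual records.
        self.volumes.append((taken_on, dict(system_image)))

    def restore_latest(self) -> dict:
        return self.volumes[-1][1]


@dataclass
class ArchivedItem:
    name: str
    text: str
    archived_on: date
    keep_years: int


class Archive:
    """Hypothetical archive: keeps items long term and indexes them for search."""

    def __init__(self) -> None:
        self.items = []
        self.index = {}  # lowercase word -> set of positions in self.items

    def store(self, item: ArchivedItem) -> None:
        position = len(self.items)
        self.items.append(item)
        for word in item.text.lower().split():
            self.index.setdefault(word, set()).add(position)

    def search(self, word: str) -> list:
        positions = sorted(self.index.get(word.lower(), set()))
        return [self.items[p].name for p in positions]


# Three nightly backups with a two-volume rotation: the first is overwritten.
rotation = BackupRotation(keep=2)
for day in (1, 2, 3):
    rotation.take_snapshot(date(2006, 10, day), {"mailbox": f"state on day {day}"})

# One archived message, kept for seven years and findable by keyword.
archive = Archive()
archive.store(ArchivedItem("msg-001", "quarterly audit results", date(2006, 10, 1), 7))
```

Restoring from `rotation` returns an entire system image from a given night, while `archive.search("audit")` picks out a single record, which is the kind of request an inquiry or lawsuit tends to generate.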
Okay, the recent surveys were done by storage or compliance vendors, so you can take them with a grain of salt. But is it really the case that savvy professionals don't know the difference between backup and archiving?
Maybe. Sometimes. Tom Pacek, VP of technology at hospital network Virtua Health, doesn't buy the notion that organizations don't know the difference.
"I don't believe that system administrators or data centers confuse backups with archiving," Pacek says. "I do believe that with disk becoming so cheap, many people do not believe they need an archive. They keep everything on disk. I believe that backup solutions vendors and disk storage vendors try their best to create confusion or perception that archiving is not needed."
Pacek says tape still has a role in archiving, but disk is required for certain applications. Virtua Health uses Mimosa NearPoint for email archiving and StorageTek tape libraries for archiving hospital data. It backs up its large PACS images to disk offsite.
"Tape is definitely a viable solution for applications that do not require large files to be retrieved instantaneously," he says. "Applications that can pre-fetch data ahead of the time that is really needed works fine via tape. Even smaller data files requiring seconds to retrieve and display works well on tape. PACS images, however, are so large and require so much bandwidth for transport that it really requires fast disk for the application to be useful."
But even when IT administrators know the difference, they may have to take time to explain it to their users.
"They're two separate processes, but people sometimes get confused with the terminology," says Pethuraj Perumal, IT manager for software developer Synopsys. "Users in my organization get confused, too."
Perumal says he uses Data Domain appliances for disk-based backup, and his group archives data to tape with Symantec's Veritas NetBackup. "I tell my users to archive special data that they have an agreement to keep for some time," he says. "They probably don't need to access the data any more now, but because of agreements they have to keep it. We typically offer one-year, three-year, and five-year archives, but we also have seven- and even 11-year archives for financial and tax-related data."
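A tiered retention schedule like the one Perumal describes comes down to simple date arithmetic. The following Python is a minimal sketch under stated assumptions, not a description of how Synopsys or NetBackup implements it: the function names `expiry_date` and `is_expired` are hypothetical, and rolling a February 29 archive date back to February 28 in non-leap years is an arbitrary choice made for the example.

```python
from datetime import date

# Retention tiers mentioned in the article, in years.
RETENTION_TIERS = (1, 3, 5, 7, 11)


def expiry_date(archived_on: date, keep_years: int) -> date:
    """Return the first date on which the archived data may be destroyed."""
    if keep_years not in RETENTION_TIERS:
        raise ValueError(f"unsupported retention tier: {keep_years} years")
    target_year = archived_on.year + keep_years
    try:
        return archived_on.replace(year=target_year)
    except ValueError:
        # Archived on Feb 29 and the target year is not a leap year.
        # Assumption for this sketch: fall back to Feb 28.
        return archived_on.replace(year=target_year, day=28)


def is_expired(archived_on: date, keep_years: int, today: date) -> bool:
    """True once the retention period has fully elapsed."""
    return today >= expiry_date(archived_on, keep_years)


# Tax-related data archived on the article's publication date, seven-year tier.
tax_expiry = expiry_date(date(2006, 10, 7), 7)
```

Which tier applies to which kind of data is a business or legal decision. The code only enforces whatever period was agreed.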
Paul Massiglia, a former CTO at Veritas and currently chief technology strategist at NAS startup Agami, traces the confusion to the terms themselves. "I don't like the archive tag. Let's call it 'data that's not highly active transactional data,'" Massiglia says.
"Backup and archiving are probably obsolete terms. Enterprise data is evolving to several types. There's data that I need to conduct my business every day; data that I need to keep online that when I get queries against it those queries need to be satisfied in minutes as opposed to days; and data people need to keep online but is not frequently accessed."
As with backup, what matters most in archiving is being able to retrieve data. The difference is that organizations have to pick specific records out of their archives, which makes search a big piece of the picture.
"It's one thing to store it all, but they expect us to find it all, too," Statistics Canada assistant director of IT Guy Charron said last week at the Storage Decisions trade show.
Charron, whose government agency is required to keep Canadian census data for 92 years, says one objective of archiving data is to weigh the risk of under-retention against the risk and cost of over-retention. "The risk is either you're going to keep something too long, or delete it too soon."
— Dave Raffo, Senior Editor, Byte and Switch
AIIM - The Enterprise Content Management Association
Agami Systems Inc.
BridgeHead Software Ltd.
Data Domain Inc. (Nasdaq: DDUP)
Iron Mountain Inc. (NYSE: IRM)
Orchestria Corp.
Symantec Corp. (Nasdaq: SYMC)
Synopsys Inc.