One of my many interests is data governance, risk management and compliance (GRC), a subject that doesn't get as much attention as it deserves. So I was pleased to be invited to speak on data retention at the recent Excellence in Governance, Risk Management, and Compliance Conference (EGRC 2013) in Portland, Maine. This cozy conference provided a number of insights on GRC topics as well as the opportunity to meet attendees from both private and public organizations.
Among the issues I addressed in my presentation was data retention management, including both data disposal and data preservation. But I'd like to focus here on the need for data disposal: why it's important to take your data mountain and reduce it to a manageable and useful data molehill.
A recent vendor briefing included the information in Figure 1. Although I had long suspected it was true, until then I had no credible source that would let me make a quantitative rather than a merely suggestive argument for it:
Figure 1: Data Retention Requirements
Source: 2012 Compliance, Governance, and Oversight Council Summit
The chart shows that 1% of data in an enterprise has to be preserved for litigation hold, and 5% has to be managed to cover compliance requirements. Another 25% is reasonably determined to have current business value. That means a whopping 69% of all data--more than two-thirds--has no value whatsoever!
One might quibble with the figure (what is the underlying research, etc.), but let's apply a "reasonableness" litmus test: For the most part, businesses and their IT organizations mainly focus on what is happening now (current transactions, emails and analyses) and not on the process by which data accumulates.
IT acts as the custodian of data (and usually bears the burden of the cost of storing and managing it), but is not the "owner" of that information. Although a business unit may be the official owner, individual employees act as "stewards" for particular data sets. But what if an employee leaves, with his or her email, Word documents, and the like left as no-longer-used data debris? Who knows and who manages it? The answer is: probably no one.
Reasons For Tackling Data Disposal
Now I will submit a challenge: How important is it to get rid of that useless data?
Assume that 20% of an IT budget is spent on storage and that 70% of your data is of no value to your business. That means 14% (plus or minus, depending on individual enterprise differences) of the average IT budget, or 70% of what is spent on storage, is simply wasted. Calling all CIOs: Does that attract your attention? Now, realistically, even if by some magic all the useless data could be safely disposed of, there wouldn't necessarily be instant savings. Although a lot of disk space would be saved, could an array be sold? Hard to say, and the money you'd get would likely be a lot less than you paid for it (the used car depreciation problem).
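For readers who want to plug in their own numbers, here is a minimal sketch of the arithmetic in Python. The percentages come from the chart and from the assumptions above; the function names are made up for illustration, and the calculation assumes that storage cost scales in proportion to data volume, which will not hold exactly in any real shop.

```python
# Chart figures: percent of enterprise data in each category that has value.
LITIGATION_HOLD = 1
COMPLIANCE = 5
BUSINESS_VALUE = 25


def valueless_share(*valued_percentages: int) -> int:
    """Percent of data left over once every valued category is subtracted."""
    return 100 - sum(valued_percentages)


def wasted_budget_share(storage_share: float, valueless_fraction: float) -> float:
    """Fraction of the overall IT budget spent on storing valueless data.

    Assumes storage cost is proportional to the volume of data stored.
    """
    return storage_share * valueless_fraction


no_value = valueless_share(LITIGATION_HOLD, COMPLIANCE, BUSINESS_VALUE)
print(no_value)  # 69

# Rounded assumptions from the text: 20% of budget on storage, 70% of data valueless.
wasted = wasted_budget_share(0.20, 0.70)
print(f"{wasted:.0%}")  # 14%
```

Swap in your own storage share and your own estimate of valueless data to see how far your enterprise sits from the 14% figure.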
Freeing up disk space means future storage purchases could be deferred, but that does not translate to immediate savings. But seeking savings that can eventually be redirected to more productive purposes, such as currently underfunded yet desperately needed IT innovation, is a good reason to tackle the problem.
This issue is a "life goes on" type of problem. That means that while you may be able to live with it for the time being, the continued exponential influx of new data will exacerbate the situation over time, making it increasingly difficult to address.
Moving business to the cloud doesn't fix the issue, but it may force businesses to pay more attention. One of the objectives of cloud computing is to provide IT-as-a-service, where users can select the services they want from a self-service catalog. However, this nirvana comes at a price. Because resources are allocated to and consumed by specific users, chargebacks (or at least showbacks) have to be used. And guess what? Does a business "owner" of data want to pay about $10 for every $3 of data that has some useful value?
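The "$10 for every $3" ratio follows from the same chart. As a rough sketch, assuming a chargeback that bills uniformly per unit of stored data (a simplification; real chargeback models vary) and using a hypothetical helper name:

```python
def bill_per_useful_dollar(useful_fraction: float) -> float:
    """Total storage bill for each $1 of storage that holds useful data.

    Assumes every unit of stored data is charged at the same rate.
    """
    return 1 / useful_fraction


# 1% litigation hold + 5% compliance + 25% business value, per the chart.
useful_fraction = 0.31

bill_for_three = 3 * bill_per_useful_dollar(useful_fraction)
print(f"${bill_for_three:.2f}")  # $9.68
```

With the chart's 31% the bill comes to a little under $10 for every $3 of useful data; rounding the useful share down to 30% gives the $10 figure exactly.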