04/15/2009
Frozen Data

Do you know how much of your data changes and how often? Discovering that information could influence how you plan your storage infrastructure.
We've all seen the stats that say the vast bulk of enterprise data doesn't change after 90 days. The most commonly cited statistic is that around 75 percent to 80 percent of enterprise data never changes after about three months. Everybody seems to accept those numbers as accurate, and those statistics have been used to fuel the boom in low-cost secondary storage, multi-tier storage architectures, data migration technologies, and a host of other products and services. I've been hearing another number that is even more interesting -- that the majority of enterprise data never changes once it has been created.

One of the fun parts of interviewing industry people is asking them what they've been hearing as they talk with customers or other vendors or analysts or resellers or integrators. Once you get them off of their programmed sales pitch, they like to swap gossip, rumor, and information as much as teenage girls. Some of what I hear I can't use since it can't be confirmed, isn't important, is mainly badmouthing a competitor, or is passed along as background information.

But it can still spark ideas for columns. Here is one that I've heard from several vendor executives, who said they heard it from others who did the actual research: More than half of enterprise data never changes -- ever. Recently I've heard several versions of this -- the majority of enterprise data doesn't change after a day, or 65 percent doesn't change after a week. But the most interesting version says more than 50 percent of enterprise data doesn't change at all after it is created and stored. While the data may be accessed and viewed all the time, it doesn't change -- which has some interesting implications and raises some important questions.

I found that stat somewhat surprising until I thought about it a little, and the more I thought about it the more it made sense. Think about all of the files and emails and other types of documents that you create that many others look at but that are never changed. That probably applies to most of the emails you send and receive. And then I wondered whether most companies had any kind of a handle on how much of the data they create and collect never changes. If they did know, would that -- does that -- affect the kind of storage they buy and how they manage it? If that number is true, and storage administrators and IT managers knew it to be true, would that mean we will see a continuing boom in low-cost commodity storage and data migration software and archiving platforms and all of the other technologies used to stash away data that needs to be accessed but isn't going to be changed? Would they implement a different kind of storage infrastructure?

I don't know the answers to those questions, nor do I know if it is true that most enterprise data doesn't change. So I ask you, do you have accurate information on how much of the data you manage is changed after it is created? Do you know when it was changed? Can you tell at what point the employees in your company stop changing a file or document or other form of data? Some of you have software that tells you when a piece of data hasn't been changed or accessed after a set period of time and is a candidate for migrating to a lower tier of storage or into an archive. But do you really have a good handle on data change in your enterprise?
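
If you want a rough first answer without buying anything, a few lines of script can get you started. The Python sketch below is only an illustration, not a substitute for a real storage resource management tool: it walks a directory tree and reports what share of regular files, by count and by bytes, has not been modified in a given number of days. The function name unchanged_share and its parameters are made up for this example. It also assumes the file system's modification times can be trusted, which may not hold if backup, restore, or migration software has touched the files.

```python
import os
import stat
import time


def unchanged_share(root, days, now=None):
    """Estimate how much data under `root` has not changed in `days` days.

    Returns (file_share, byte_share), each between 0.0 and 1.0.
    Assumption: st_mtime reflects the last real change to the file.
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # seconds in a day

    total_files = total_bytes = 0
    old_files = old_bytes = 0

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(info.st_mode):
                continue  # count regular files only, not links or devices
            total_files += 1
            total_bytes += info.st_size
            if info.st_mtime < cutoff:
                old_files += 1
                old_bytes += info.st_size

    if total_files == 0:
        return 0.0, 0.0
    file_share = old_files / total_files
    byte_share = old_bytes / total_bytes if total_bytes else 0.0
    return file_share, byte_share
```

Calling unchanged_share on a file share with days set to 90, 7, and 1 would give you your own versions of the numbers quoted above. Note that this measures "not modified recently" rather than "never modified since creation," since creation time is not recorded consistently across file systems.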

