How Old Should I Let a File Get Before Migrating It off Primary Storage?

To determine an age limit, I suggest considering three factors: frequency of access, required speed of access, and the return on investment of supporting the first two.

It is estimated that over half of the data on disk is not accessed after 90 days, and that over half of that data will never be accessed again. In other words, more than a quarter of all data on disk is never touched again after its first 90 days. The management and environmental costs associated with this rarely accessed data are estimated at 10 times the cost of the disk itself. Thus, there can be a strong financial case for moving such data to the cheapest storage possible.

Archiving and hierarchical storage management (HSM) are two different approaches to a similar goal. An archive is typically thought of as a point-in-time backup that is retained for years. It is usually created by completing a successful backup and then deleting the selected data from the hard disk. The data can later be recovered manually from the archive media when needed.
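
The archive sequence described above (copy, confirm the copy succeeded, then delete the original) can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular backup product works: the function name archive_file, the use of a plain directory as the "archive media", and the SHA-256 comparison as the test for a "successful backup" are all assumptions made for the example.

```python
import hashlib
import os
import shutil


def _sha256(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_file(src, archive_dir):
    """Copy src into archive_dir and delete src only if the copy verifies.

    archive_dir stands in for the archive media (an assumption for this
    sketch). Returns the path of the archived copy.
    """
    os.makedirs(archive_dir, exist_ok=True)
    dst = os.path.join(archive_dir, os.path.basename(src))
    shutil.copy2(src, dst)  # copies contents and file metadata

    # Treat a matching checksum as the "successful backup" condition.
    if _sha256(src) != _sha256(dst):
        os.remove(dst)
        raise OSError("archive copy failed verification; original kept")

    os.remove(src)  # free the primary disk only after verification
    return dst
```

The important design point is the ordering: the original is removed only after the copy has been verified, so a failed backup never costs you the data.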

HSM is used to move data automatically between high-cost and low-cost storage. The movement of data is based on one of two configurations. The first configuration is based on two retention parameters: the amount of time data is allowed to reside on expensive disk before it is moved to cheaper disk, and the amount of time it can reside on cheaper disk before being deleted. The second configuration is based on thresholds: once disk occupancy exceeds a high threshold, the oldest data is migrated first, and migration stops once occupancy drops to a low threshold.
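
To make the two configurations concrete, here is a minimal Python sketch of each decision rule. It is an illustration under stated assumptions rather than the behavior of any specific HSM product: the FileRecord type, the function names, the tier labels "primary" and "secondary", and the use of days since last access as the measure of "oldest" are all hypothetical choices for the example.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    name: str
    size: int        # size in bytes
    age_days: float  # days since last access (assumed age measure)


def age_based_action(tier, days_on_tier, primary_days, secondary_days):
    """First configuration: two retention parameters.

    primary_days   = time allowed on expensive disk before migration
    secondary_days = time allowed on cheaper disk before deletion
    """
    if tier == "primary" and days_on_tier >= primary_days:
        return "migrate"
    if tier == "secondary" and days_on_tier >= secondary_days:
        return "delete"
    return "keep"


def select_for_migration(files, capacity, high, low):
    """Second configuration: high and low occupancy thresholds.

    high and low are fractions of capacity (for example 0.8 and 0.5).
    Nothing is selected unless occupancy exceeds high; after that, the
    oldest files are selected until occupancy is at or below low.
    """
    used = sum(f.size for f in files)
    if used / capacity <= high:
        return []

    selected = []
    for f in sorted(files, key=lambda rec: rec.age_days, reverse=True):
        if used / capacity <= low:
            break
        selected.append(f)
        used -= f.size
    return selected
```

For example, with a 100-byte disk holding files of 40, 30, and 20 bytes (oldest first), a high threshold of 0.8, and a low threshold of 0.5, occupancy starts at 90 percent, so migration is triggered; moving the oldest 40-byte file brings occupancy to 50 percent and migration stops there. The gap between the two thresholds is what keeps the system from migrating a little data every time the disk crosses a single limit.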