The Types Of Automated Tiering
March 09, 2010
Automated tiering, the transparent movement of data between tiers of storage, has several methods of delivery. There is some disagreement as to which is the "real" automated tiering. I'm not sure if from a storage manager's perspective it matters, but understanding the types of automated tiering will help you select the best method for your data center.
In reality, almost anything that can move data can claim to be an automated tiering solution. Even Hierarchical Storage Management (HSM) software has some form of automation in it, though if you let it move data, it would in most cases leave stub files behind. In general, the current automated tiering solutions tend to be more active than scan-based HSM software, and most do not require the use of stub files.
You could also claim that a storage cache, to some extent, is automated tiering. Data moves from disk to another tier: RAM inside the array. In fact, several automated tiering solutions are just that. They are cache-based systems that migrate active data as needed, but they are external to the array. The automated tiering versions of these solutions typically leverage SSD or even 15K SAS drives to deliver significantly larger cache sizes. Many can be used for both read and write activity, trickling data back to the traditional storage as needed. These cache-based solutions are now available for both NAS and block-based needs. In almost every case, these solutions are being offered as extensions to existing storage solutions. Think of them as a way to improve the performance of your SAN or NAS without replacing it.
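To make the cache-based approach concrete, here is a minimal sketch in Python of a fast tier sitting in front of existing storage. It is an illustration under stated assumptions, not any vendor's implementation: the class name CacheTier, the least-recently-used eviction rule, and the use of a plain dictionary to stand in for the existing SAN or NAS are all hypothetical choices made for the example.

```python
from collections import OrderedDict


class CacheTier:
    """Hypothetical fast tier (e.g. SSD) in front of slower backing storage."""

    def __init__(self, backing: dict, capacity: int):
        self.backing = backing        # stands in for the existing SAN or NAS
        self.capacity = capacity      # number of blocks the fast tier can hold
        self.cache = OrderedDict()    # block_id -> data, oldest entry first
        self.dirty = set()            # blocks written here but not yet on backing

    def read(self, block_id):
        if block_id in self.cache:
            # Cache hit: mark the block as recently used.
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        # Cache miss: fetch from backing storage and keep a copy in the cache.
        data = self.backing[block_id]
        self._insert(block_id, data)
        return data

    def write(self, block_id, data):
        # Writes land in the fast tier first and are marked dirty.
        self._insert(block_id, data)
        self.dirty.add(block_id)

    def _insert(self, block_id, data):
        self.cache[block_id] = data
        self.cache.move_to_end(block_id)
        # Evict least recently used blocks once the tier is over capacity.
        while len(self.cache) > self.capacity:
            victim, victim_data = self.cache.popitem(last=False)
            if victim in self.dirty:
                # A dirty block must reach backing storage before it is dropped.
                self.backing[victim] = victim_data
                self.dirty.discard(victim)

    def flush(self):
        # "Trickle" all pending writes back to the traditional storage.
        for block_id in list(self.dirty):
            self.backing[block_id] = self.cache[block_id]
        self.dirty.clear()
```

In this sketch a write is acknowledged once it reaches the fast tier, and flush() plays the role of trickling data back to the traditional storage. A real product would also have to protect dirty data against a failure of the cache device, which the example ignores.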
Beyond the cache-based systems, there are systems from suppliers that are more persistent in nature. The automated tiering system manages the various tiers of storage and places data on the tier best suited to its access pattern. Very active data, for example, goes on SSD, slightly active data on SAS or Fibre Channel hard disks, older data on SATA drives, and in some cases very inactive data on power-managed SATA drives. As is the case with cache-based systems, there are solutions available for NAS and SAN configurations.
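The placement rule described above can be sketched in a few lines of Python. This is only an illustration: the threshold values, the DataStats fields, and the tier names are assumptions invented for the example, since real products use their own metrics and cutoffs.

```python
from dataclasses import dataclass

# Hypothetical thresholds; these numbers are assumptions for illustration only.
HOT_ACCESSES_PER_DAY = 100    # at or above this, data counts as very active
WARM_ACCESSES_PER_DAY = 1     # at or above this, data counts as slightly active
DORMANT_IDLE_DAYS = 365       # idle this long, data counts as very inactive


@dataclass
class DataStats:
    accesses_per_day: float
    days_since_last_access: int


def choose_tier(stats: DataStats) -> str:
    """Return the tier a persistent tiering policy might pick for this data."""
    if stats.accesses_per_day >= HOT_ACCESSES_PER_DAY:
        return "ssd"
    if stats.accesses_per_day >= WARM_ACCESSES_PER_DAY:
        return "sas_or_fc"
    if stats.days_since_last_access >= DORMANT_IDLE_DAYS:
        return "power_managed_sata"
    return "sata"


if __name__ == "__main__":
    samples = {
        "database index": DataStats(5000, 0),
        "project share": DataStats(3, 1),
        "last quarter's reports": DataStats(0.01, 90),
        "old archive": DataStats(0, 900),
    }
    for name, stats in samples.items():
        print(f"{name}: {choose_tier(stats)}")
```

Unlike a cache, a policy like this decides where the data lives rather than where a temporary copy sits, which is why the same mechanism that speeds up hot data can also push cold data down to cheaper drives.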
As we discuss in our article "What is File Virtualization?", the NAS implementations of these persistent systems may be known more commonly as file virtualization or global file systems. In the SAN world, many vendors believe this persistent approach should be the sole definition of automated tiering. Cache-based solutions are typically focused on improving performance, which means that the rest of the infrastructure has to be tuned to take advantage of the additional performance. Persistent solutions can deliver greater performance, but they can also deliver cost savings by moving data to less expensive storage tiers as warranted. In our next entry we will explore some of the downsides of automated tiering, and how to avoid them.