Shouldn't Auto-Tiering Also Auto-Optimize?

Automated tiering, the ability to move data between different types of
storage within an array, is quickly becoming a popular feature of
advanced storage systems. While there is a focus on how these systems
will lower costs and increase performance, there is one feature that is
missing: the ability to auto-optimize.

Consider that systems that can auto-tier now have an understanding of
the data types at a sub-LUN level, and in most cases, a sub-file level.
With that knowledge, these systems should be able to compress segments
that have not been used for a given period of time. They can clearly
already move data based on that same knowledge, so why not compress it
during the move? In addition, they should be able to compare these
sub-file segments to the segments already stored on disk and deduplicate
that data. Again, with the metadata that these arrays are capturing
about these segments, they should be able to determine which segments
should be deduplicated and which should not. All that is needed is a
deduplication engine.
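
To make the idea concrete, here is a minimal sketch in Python of what
"compress and deduplicate while you move" could look like. Everything in
it is an assumption made for illustration, not any vendor's design: the
Segment and ColdTier classes, the tier_and_optimize function, the idle
threshold, and the choice of SHA-256 fingerprints with zlib compression
are all hypothetical.

```python
import hashlib
import time
import zlib
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical sub-file segment tracked by the array."""
    segment_id: str
    data: bytes
    last_access: float  # seconds since the epoch


class ColdTier:
    """Hypothetical capacity tier: each unique segment is stored once, compressed."""

    def __init__(self):
        self.blocks = {}  # content fingerprint -> compressed bytes
        self.index = {}   # segment_id -> content fingerprint

    def demote(self, segment):
        # Fingerprint the content; identical segments share one stored block.
        digest = hashlib.sha256(segment.data).hexdigest()
        if digest not in self.blocks:
            self.blocks[digest] = zlib.compress(segment.data)
        self.index[segment.segment_id] = digest

    def read(self, segment_id):
        # Re-inflate the segment on the way back out.
        return zlib.decompress(self.blocks[self.index[segment_id]])


def tier_and_optimize(segments, cold_tier, idle_seconds, now=None):
    """Demote segments idle for at least idle_seconds; return those kept hot."""
    now = time.time() if now is None else now
    hot = []
    for seg in segments:
        if now - seg.last_access >= idle_seconds:
            cold_tier.demote(seg)  # compress and deduplicate during the move
        else:
            hot.append(seg)
    return hot
```

The point of the sketch is that the tiering decision and the
optimization step use the same piece of knowledge, the last access time
of each segment, so the second comes almost for free once the first
exists.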

Users can approximate this today, especially on NAS platforms.
There are data optimization products that can deduplicate and compress
primary storage in the background, and there are auto-tiering front ends
that can cache and accelerate NAS access. To make this work, both
products would need policy engines that can be told what to accelerate
and what to optimize. You would want the data optimization product to
optimize only data that has not been accessed for a while, and you would
want the auto-tiering device either to ignore optimized data or to wait
for it to be re-inflated before moving it into the cache.
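
The two policy decisions described above could be sketched roughly as
follows. This is an illustration only: the FileRecord and Policy types,
the 30-day default, and the function names are hypothetical and do not
correspond to any real product's policy engine. The sketch also picks
"wait" over "ignore" for optimized data, which is just one of the two
options.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    RAW = "raw"
    OPTIMIZED = "optimized"  # deduplicated and/or compressed


@dataclass
class FileRecord:
    """Hypothetical per-file metadata shared by both products."""
    path: str
    idle_days: int
    state: State = State.RAW


@dataclass
class Policy:
    """Hypothetical policy: optimize only data idle at least this long."""
    optimize_after_days: int = 30


def should_optimize(record, policy):
    # The optimization product only touches raw data that has gone cold.
    return (record.state is State.RAW
            and record.idle_days >= policy.optimize_after_days)


def cache_action(record):
    # The tiering front end must not cache data in its optimized form.
    if record.state is State.OPTIMIZED:
        return "wait_for_reinflate"
    return "promote"
```

For example, a file idle for 45 days would be handed to the optimizer,
and a later request to cache it would return "wait_for_reinflate" until
its state is set back to RAW.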

As storage systems, or the appliances we surround them with,
reach a finer-grained understanding of the data they house, we should
expect more from them. The capabilities to auto-tier and auto-optimize
are just the beginning.