Automated storage tiering and caching are methods that storage system vendors have used to apply the performance of solid-state storage to active data while keeping the economics of hard disk drives for inactive data. These systems automatically move data segments to the appropriate tier of storage based on how often they are accessed. That greatly improves performance, but it does not deliver perfect performance: solid state exposes bottlenecks elsewhere in the infrastructure.
As with any performance improvement project, there is always another bottleneck to address; in this case it is the storage network. Although 16-Gbps Fibre Channel and 10-Gigabit Ethernet (10GbE) remove much of the network latency, even they add overhead as data crosses the infrastructure. Most data centers are also not fully upgraded to these higher-speed networks, so their latency will be worse. Finally, networks make performance less predictable, because latency varies with their level of congestion.
There are a couple of solutions to the networking latency problem. One is to build a private server-to-server network that integrates storage at an extremely high speed. We'll explore this option in an upcoming column. The other solution is to establish another tier of storage, a server-based tier that leverages PCIe solid-state storage. With this method the storage system extends the tier or cache directly to the server so that data is accessed locally, saving a trip down the network.
As I discussed in "What is Server Based Solid State Caching?", to a large extent this fix can be accomplished with one of the caching software and hardware solutions that are available from several vendors. These solutions integrate with a server-based solid-state device to provide high-speed local caching of data. Most provide read-only caching, so that functions such as server migration in a virtualized server infrastructure still work correctly. Even a read-only configuration can greatly improve performance, because reads of cached data are served locally, which leaves the storage network largely to write traffic. This capability alone might eliminate an immediate need to upgrade the network infrastructure.
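To make the read-only approach concrete, here is a minimal sketch in Python. It is not any vendor's implementation: the class name ReadOnlyCache, the least-recently-used eviction policy, and the dict standing in for shared storage are all assumptions made for illustration. Writes always go to shared storage, so a migrated virtual machine never depends on data held only on the original server.

```python
from collections import OrderedDict


class ReadOnlyCache:
    """Hypothetical server-side read cache in front of shared storage."""

    def __init__(self, shared, capacity):
        self.shared = shared          # dict standing in for the shared array
        self.capacity = capacity      # max blocks kept on local flash
        self.cache = OrderedDict()    # block id -> data, oldest first
        self.network_reads = 0        # reads that crossed the storage network

    def read(self, block):
        if block in self.cache:
            self.cache.move_to_end(block)    # mark as most recently used
            return self.cache[block]
        self.network_reads += 1              # miss: fetch over the network
        data = self.shared[block]
        self.cache[block] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict least recently used block
        return data

    def write(self, block, data):
        self.shared[block] = data            # shared storage stays authoritative
        self.cache.pop(block, None)          # drop any stale local copy


shared = {1: b"a", 2: b"b"}
cache = ReadOnlyCache(shared, capacity=2)
cache.read(1)
cache.read(1)                 # second read is served from local flash
cache.write(1, b"z")          # goes straight to shared storage
```

In this sketch only the first read of a block crosses the network; repeat reads stay local until the block is evicted or overwritten.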
Server-based tiering takes caching a step further by placing data on the server instead of caching a copy of it there. Writes could therefore be stored initially on the local server's PCIe flash card and then de-staged to shared storage based on policy. The intent is to get the most active data as close to the server as possible, on the fastest storage possible.
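The write path described above can be sketched as follows. This is an illustration under stated assumptions, not a description of a shipping product: the class name ServerTier and the policy used here (de-stage once a fixed number of blocks are dirty) are hypothetical, and a real policy could just as well be based on time or on access frequency.

```python
class ServerTier:
    """Hypothetical server-side tier that absorbs writes on local flash."""

    def __init__(self, shared, max_dirty):
        self.shared = shared        # dict standing in for shared storage
        self.max_dirty = max_dirty  # assumed policy: de-stage at this many dirty blocks
        self.local = {}             # blocks held on the server's PCIe flash
        self.dirty = set()          # blocks not yet copied to shared storage

    def write(self, block, data):
        self.local[block] = data    # acknowledged locally, no network trip
        self.dirty.add(block)
        if len(self.dirty) >= self.max_dirty:
            self.destage()

    def read(self, block):
        if block in self.local:     # most active data is served locally
            return self.local[block]
        return self.shared[block]

    def destage(self):
        # Copy dirty blocks to shared storage in block order.
        for block in sorted(self.dirty):
            self.shared[block] = self.local[block]
        self.dirty.clear()


shared = {}
tier = ServerTier(shared, max_dirty=3)
tier.write(1, "a")
tier.write(2, "b")            # still only on the server
tier.write(3, "c")            # third dirty block triggers de-staging
```

Until de-staging runs, the newest copy of a block exists only on the server, which is why the de-stage policy matters for data protection as well as for performance.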
Potentially the most important aspect of server-based caching is that it gives the storage system intimate knowledge of what type of data the server is storing and what to do with that data. For example, the storage system could hold writes for a longer period of time so that larger segments could be written and some of the randomness of I/O could be smoothed out. Server-based tiering software could also group data segments from particular virtual machines, or even physical servers, so that writes and reads are more efficient and cause less head thrashing on the hard disk drives.
Almost every major vendor has either announced or at least hinted that it is developing technology that will move active data closer to the server. In 2013, expect an onslaught of these solutions, and with it the job of deciding which approach is best for you and puts your data at the least possible risk.
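As a rough illustration of how held writes could be grouped, the sketch below merges pending block writes into contiguous runs per virtual machine. The function name coalesce and the (vm, block) tuple format are assumptions made for this example; actual products may group data quite differently.

```python
from collections import defaultdict


def coalesce(writes):
    """Group pending (vm, block) writes into contiguous runs per VM.

    Returns {vm: [(start_block, length), ...]} with runs in block order.
    """
    by_vm = defaultdict(set)
    for vm, block in writes:
        by_vm[vm].add(block)          # repeated writes to a block collapse to one

    extents = {}
    for vm, blocks in by_vm.items():
        runs = []
        for block in sorted(blocks):
            if runs and block == runs[-1][0] + runs[-1][1]:
                start, length = runs[-1]
                runs[-1] = (start, length + 1)   # extend the current run
            else:
                runs.append((block, 1))          # start a new run
        extents[vm] = runs
    return extents


pending = [("vm1", 7), ("vm2", 3), ("vm1", 5), ("vm1", 6), ("vm2", 9), ("vm1", 6)]
extents = coalesce(pending)
```

Here six small, out-of-order writes become one three-block sequential write for vm1 and two single-block writes for vm2, which is the kind of smoothing that reduces seek activity on the disk tier.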