Storage caching systems leverage memory-based storage (flash or DRAM), installed in the server or on the network, to offload I/O from the traditional shared storage system and storage network. Caching systems automatically place recently accessed data into the memory-based storage area. These systems have the potential to prolong the life of the storage network and overall storage system, and they are a popular first step on the path to the solid state data center. As a result, competing systems have flooded the market. Here is what to look for.
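The core behavior described above, automatically keeping the most recently accessed data in the fast memory-based tier, can be sketched as a simple least-recently-used (LRU) read cache. This is an illustrative model only, not any vendor's implementation; the `ReadCache` class and the `backend_read` callback are hypothetical names:

```python
from collections import OrderedDict

class ReadCache:
    """Toy read cache: keeps the most recently accessed blocks in a
    fast tier and evicts the least recently used block when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block_id -> data, ordered by recency

    def read(self, block_id, backend_read):
        if block_id in self.blocks:
            # Cache hit: served from fast storage, no trip to the array.
            self.blocks.move_to_end(block_id)
            return self.blocks[block_id]
        # Cache miss: fetch from the shared storage system, then cache it.
        data = backend_read(block_id)
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used
        return data
```

Every hit served out of the cache is one less I/O the storage network and array have to absorb, which is where the "prolonging the life" benefit comes from.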
Many of the caching systems on the market tend to focus on a particular platform. Some, for example, only accelerate VMware environments; others, a specific operating system. The key is to find one that will fix your particular I/O problem. For example, if your problem area is VMware, then look for caching systems that specifically support VMware. If you have a Microsoft SQL performance problem, then you may be better served by a file-aware or even SQL-aware system.
In addition to supporting the specific platform you need to accelerate, the other consideration is the storage protocol that platform uses. Caching is a relatively low-level I/O activity, so these systems need to understand what is happening at the protocol level. As a result, each caching system will often support a specific protocol. For example, if you host your VMware images on NFS, look for a caching system that supports NFS.
In-Server or In-Network
One area of confusion with server caching systems is where the caching should occur. Some systems leverage memory-based storage in the server, while others are network based. The network-based systems were originally caching appliances installed between the servers and the storage, acting as a shock absorber for read traffic. Increasingly, though, we are seeing systems that network the flash storage already installed in the servers, essentially aggregating it into a common pool of storage.
The obvious differentiator between the two implementation types (server vs. network) is that in-server systems are less dependent on the speed and quality of the network to maintain performance. But the memory-based storage capacity of in-server systems may be used less efficiently, since that capacity is captive to a single host. In-server systems can also have performance issues in VMware environments when a VM is migrated. Network-based systems may introduce latency, but they are more resilient to flash or server failure and typically have fewer issues when VMs are migrated to other servers.
Block vs. File
A final consideration is whether the cache is file based or block based. Block-based systems work independently of the files being accessed and move the most active blocks of data into the cache. This makes implementation easier in a virtualized environment because the cache can work across VMs.
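The block-based approach above amounts to tracking which blocks are "hot" and promoting only those into the cache, with no knowledge of the files they belong to. A minimal sketch, assuming a hypothetical `BlockHeatMap` tracker and an arbitrary promotion threshold:

```python
from collections import Counter

class BlockHeatMap:
    """Toy block-level heat tracker: counts accesses per block and
    reports which blocks are active enough to promote to flash.
    It sees only block IDs, never file names, so it works across
    VMs and workloads without any file-system awareness."""

    def __init__(self, promote_threshold=3):
        self.hits = Counter()
        self.promote_threshold = promote_threshold

    def record_access(self, block_id):
        self.hits[block_id] += 1

    def hot_blocks(self):
        # Blocks accessed at least `promote_threshold` times are
        # candidates for promotion to the cache tier.
        return {b for b, n in self.hits.items() if n >= self.promote_threshold}
```

Real products use far more sophisticated heuristics (recency windows, sequential-I/O filtering), but the principle is the same: promotion decisions are made per block, not per file.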
File-based systems are more aware, and allow specific files to be monitored and accelerated, in some cases even pinned to the cache storage area. Using a file-based cache may mean either installing the caching software inside the guest OS or leveraging a separate NFS share. But file-based systems can be more efficient because they focus only on the files that need accelerating; as a result, they may require a smaller investment in SSD capacity and therefore be less expensive. Some manual interaction is typically required to achieve this efficiency, so IT staff must be available to fine-tune the system.
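The pinning and monitoring behavior described above can be sketched as a per-file policy: pinned files are always cached and never evicted, monitored files are cached opportunistically, and everything else bypasses the cache. The `FileCachePolicy` class and the example paths are hypothetical, chosen only to illustrate the idea:

```python
class FileCachePolicy:
    """Toy file-aware cache policy. Pinned files always occupy the
    cache and are never evicted; monitored files are cached but can
    be evicted under pressure; all other files are left uncached."""

    def __init__(self, pinned=(), monitored=()):
        self.pinned = set(pinned)
        self.monitored = set(monitored)

    def should_cache(self, path):
        return path in self.pinned or path in self.monitored

    def may_evict(self, path):
        # Pinned files stay in the cache no matter what.
        return path not in self.pinned
```

This selectivity is the source of the cost advantage: the SSD tier only needs to be sized for the files you chose, but someone has to maintain those lists, which is the manual tuning mentioned above.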
Off-storage caching is a crowded market; we encounter a new vendor at least once a week. There are plenty of good off-storage caching systems but no perfect solutions. The key to product selection is to look for a system that covers your specific performance need. Until the market matures, you may also find it better to run two or three systems in your data center, based on what needs acceleration.