The state of the art in using server-mounted SSDs as a cache has advanced significantly in the past year. The Flash Virtualization Platform from startup PernixData takes things up another notch, delivering a distributed, replicated write-back cache at least a year earlier than I was expecting.
The vast majority of server-side caching products today implement some sort of write-through or read cache. They use flash to accelerate requests to read data that's been read or written recently. However, they don't actually cache writes; instead, they acknowledge writes to the writing application only once the data has been written to the shared storage at the back end.
Write-through caching improves write performance only indirectly: because reads are served from flash, the back-end disks are spared the head movements those reads would otherwise have required. As a result, the back-end array has more IOPS available to satisfy write requests.
A write-back cache like that from PernixData acknowledges writes when they've been written to the cache, as long as there's cache space available. The speed of the SSD, rather than the back-end disk, defines the write latency the application sees.
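To make the difference in acknowledgment timing concrete, here is a minimal Python sketch of the two policies. It is not PernixData's code: the class names, the latency figures, and the choice to fall back to the array when the cache is full are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Illustrative latencies in microseconds. These are assumed values,
# not measurements of any product.
SSD_WRITE_US = 100
ARRAY_WRITE_US = 5000


@dataclass
class WriteThroughCache:
    """Hypothetical write-through cache: ack only after the array has the data."""
    flash: dict = field(default_factory=dict)
    array: dict = field(default_factory=dict)

    def write(self, block: int, data: bytes) -> int:
        self.flash[block] = data  # keep a copy in flash to speed up later reads
        self.array[block] = data  # the write must reach shared storage before the ack
        # Assuming both writes are issued in parallel, the slower one sets the latency.
        return max(SSD_WRITE_US, ARRAY_WRITE_US)


@dataclass
class WriteBackCache:
    """Hypothetical write-back cache: ack as soon as the data is in flash."""
    capacity: int
    flash: dict = field(default_factory=dict)
    dirty: set = field(default_factory=set)
    array: dict = field(default_factory=dict)

    def write(self, block: int, data: bytes) -> int:
        if block not in self.flash and len(self.flash) >= self.capacity:
            # Assumption: with no cache space left, write straight to the array.
            self.array[block] = data
            return ARRAY_WRITE_US
        self.flash[block] = data
        self.dirty.add(block)  # not yet on shared storage
        return SSD_WRITE_US

    def flush(self) -> int:
        """Copy dirty blocks to the array; return how many were flushed."""
        flushed = len(self.dirty)
        for block in sorted(self.dirty):
            self.array[block] = self.flash[block]
        self.dirty.clear()
        return flushed
```

In this sketch the application sees SSD_WRITE_US on a write-back write and ARRAY_WRITE_US on a write-through write. The `dirty` set is exactly the data that would be at risk if the server died before `flush()` ran.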
The problem with write-back caching is that servers are inherently unreliable. A standard server has numerous failure points, from main memory to the disk controller or PCIe slot into which the SSD is plugged. When a server with a write-back cache fails (and it will fail) you won't be able to simply restart the workload from that server on another because some of the data is trapped in the cache and hasn't yet been written to the shared storage.
PernixData solves this problem by mirroring the cache data to at least one additional server. When a server fails, the other server holding the cache data notices the failure and flushes the dirty cache blocks to the shared storage device. By the time the virtual servers restart, the data on the shared storage system will be up to date, so they can start servicing users immediately.
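The mirroring and failover flush described above can be sketched roughly as follows. This is a toy model under stated assumptions, not PernixData's implementation: `Host`, `write`, and `recover` are hypothetical names, and failure detection is reduced to an explicit function call.

```python
class Host:
    """A hypothetical server with a flash cache."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.dirty = {}     # block -> data written by this host's own VMs
        self.replicas = {}  # peer name -> {block: data} mirrored from that peer


def write(primary: Host, mirror: Host, block: int, data: bytes) -> str:
    """Acknowledge a write only once it sits in flash on two servers."""
    primary.dirty[block] = data
    mirror.replicas.setdefault(primary.name, {})[block] = data
    return "ack"


def recover(failed: Host, mirror: Host, shared_storage: dict) -> int:
    """Flush the failed host's dirty blocks from the mirror to shared storage.

    Returns the number of blocks flushed.
    """
    failed.dirty.clear()  # simulate the data trapped in the dead server
    pending = mirror.replicas.pop(failed.name, {})
    shared_storage.update(pending)
    return len(pending)
```

For example, after `write(a, b, 10, b"x")` the shared storage dictionary is still empty. Calling `recover(a, b, shared)` then brings it up to date from the mirror's copy, which is the state the restarted VMs need to find.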
The other advantage of this approach is that it improves the performance of VMs that are migrated from one host to another. Many server-side caching solutions install in the VM's guest OS and break the vMotion paradigm to some extent. Even products that allow transparent migrations have a drawback: software from SanDisk and Proximal Data, for instance, manages a separate cache for each host server, so when a VM arrives at a new host running this software, the cache on the new host is cold. Thus, the application on that VM runs more slowly for a period while the cache is populated.
PernixData addresses this problem. Just as it mirrors data to multiple SSDs on multiple servers, it also pools the SSDs to create a single distributed cache. When a VM is migrated to a new host, it can still access the cached data on its old host over the network until the local cache populates. Given the high bandwidth and low latency of today's 10-Gbps Ethernet and InfiniBand networks, remote cache access can still be several times faster than the back-end disk system.
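A read path for such a pooled cache might look something like the sketch below: try local flash, then a peer's flash over the network, and only then the back-end array. The function name, the dictionaries standing in for caches, and the latency numbers are all assumptions for illustration, not details of the product.

```python
# Illustrative latencies in microseconds. These are assumed values,
# not measurements.
LOCAL_FLASH_US = 100
REMOTE_FLASH_US = 300
ARRAY_US = 5000


def read_block(block, local_cache, peer_caches, array):
    """Return (data, source, latency_us) for a read on the VM's current host.

    local_cache: dict for the flash on the host the VM now runs on
    peer_caches: list of dicts for flash on other hosts in the pool
    array:       dict standing in for the shared back-end storage
    """
    if block in local_cache:
        return local_cache[block], "local", LOCAL_FLASH_US
    for peer in peer_caches:
        if block in peer:
            local_cache[block] = peer[block]  # warm the local cache
            return local_cache[block], "remote", REMOTE_FLASH_US
    local_cache[block] = array[block]  # cache miss everywhere
    return local_cache[block], "array", ARRAY_US
```

Right after a migration the new host's cache is empty, so the first read of a hot block is served from the old host's flash at the remote latency. The second read of the same block is local, which is how the cache warms up without going back to disk.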
Architecturally, the Flash Virtualization Platform looks a lot like the Fluid Cache project Dell talked about last summer at Dell Storage Forum. While we haven't heard much since from Dell, the notion of distributed, protected caching is the new state of the art. With more than 20 vendors peddling server-side acceleration, I expect more news in this area soon.
Howard Marks, chief scientist at DeepStorage.net, is a featured speaker at Interop Las Vegas this May, including the conference session "Deploying SSDs in the Data Center."