Don't Trade High Availability for Flash Performance
March 05, 2013
As flash-based SSDs revolutionize the storage industry, I thought it might be worth taking a look at how some basic storage system architectures compare when the storage media changes from spinning disks to SSDs.
The most basic storage architecture is essentially a RAID controller with a SAN or NAS target. The controller, whether custom hardware or a standard server, is a single point of failure. As a result, unicontroller systems have been relegated to the very low end of the disk array market, where they're used by SMBs or to hold additional copies of data. The vast middle of the storage market is dominated by dual-controller, modular arrays that will fail over transparently if one controller goes down.
Amazingly enough, the move to SSD has resurrected the unicontroller design in the form of rackmount SSDs from the likes of IBM's Texas Memory Systems and Astute Networks' VISX, as well as more feature-rich systems like Nimbus Data's S-Class. The risk of data loss inherent in a unicontroller design might be tolerable for some applications, such as analytics or VDI with non-persistent desktops, but for the vast majority of cases I would find it hard to pay $50,000 or more for a product that doesn't offer high availability.
When asked about high availability, proponents of unicontroller systems will generally recommend a pair of appliances with synchronous replication. If the vendor has done its homework and written a fail-over mechanism into its arrays, a cluster of unicontrollers is available enough for most applications.
In clustering terms, a typical dual-controller, active/passive modular storage system is what systems people would call a shared-disk cluster, much like a typical Windows Server cluster. A pair of unicontroller systems that replicate data to each other is a shared-nothing cluster.
In the disk era, unicontrollers were built on industry-standard servers, which offset the additional cost of a second set of disk drives. As a result, unicontroller designs such as LeftHand's iSCSI array and NexentaStor, some of which also provided a degree of scale-out, have sold thousands of units.
The problem with unicontroller systems in the SSD era is that the flash makes up a much higher fraction of the cost of a storage system than disk drives. In fact, some all-flash unicontroller systems cost as much as competing systems that include HA.
I've even heard vendors suggest that customers buy one flash device and manage HA by using host volume managers or storage virtualization appliances.
But if you mirror in your host computer's volume manager or synchronously replicate from an all-SSD array to a disk-based system to avoid the cost of two all-SSD systems, you give up the performance advantage the all-flash system has on writes. That's because writes will only be acknowledged to your applications after they've been written to both the flash and disk-based systems. This limits application performance to the write speed of the slower disk system.
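The effect is easy to see in a toy model. The sketch below assumes both copies are written in parallel and that the write is acknowledged only when the slowest copy completes; the latency figures are illustrative placeholders, not measurements of any product.

```python
def sync_mirror_write_latency(replica_latencies_ms):
    """Latency the application sees for one synchronously mirrored write.

    Assumes all replicas are written in parallel and the write is
    acknowledged only after every replica has completed, so the
    slowest replica sets the pace.
    """
    return max(replica_latencies_ms)


# Illustrative, assumed per-write latencies in milliseconds.
FLASH_WRITE_MS = 0.2
DISK_WRITE_MS = 5.0

flash_pair = sync_mirror_write_latency([FLASH_WRITE_MS, FLASH_WRITE_MS])
mixed_pair = sync_mirror_write_latency([FLASH_WRITE_MS, DISK_WRITE_MS])

print(f"flash + flash mirror: {flash_pair} ms per write")
print(f"flash + disk mirror:  {mixed_pair} ms per write")
```

With these assumed numbers the mixed pair acknowledges writes no faster than the disk system alone, which is the point: reads may still come from flash, but every write waits for the disk.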
If instead you asynchronously replicate data across the mixed storage systems through host- or application-level software, you've turned a simple device failure into a full-blown disaster with associated RPOs and RTOs. By contrast, device failure on a true high-availability system would cause no data loss, and at worst a few seconds of failover delay.
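To make the data-loss exposure concrete, here is a minimal sketch of asynchronous replication with a fixed lag. The function name, the fixed-lag model, and the timestamps are all assumptions made for illustration; real replication lag varies with load and link speed.

```python
def writes_lost_on_failure(write_times, replication_lag, failure_time):
    """Return the acknowledged writes that never reached the replica.

    Hypothetical model: a write acknowledged at time t arrives at the
    replica at t + replication_lag. If the primary fails at
    failure_time, any write acknowledged by then whose copy was still
    in flight is lost.
    """
    return [
        t for t in write_times
        if t <= failure_time and t + replication_lag > failure_time
    ]


# Assumed example: one write per second, primary fails at t = 4 s.
write_times = [0, 1, 2, 3, 4, 5]

async_lost = writes_lost_on_failure(write_times, replication_lag=2, failure_time=4)
sync_lost = writes_lost_on_failure(write_times, replication_lag=0, failure_time=4)

print("lost with 2 s async lag:", async_lost)
print("lost with zero lag:", sync_lost)
```

In this model the writes lost are exactly those acknowledged within one replication lag of the failure, so the lag is effectively the recovery point objective. With zero lag, which is what synchronous mirroring amounts to here, nothing acknowledged is lost.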
Users and senior management can accept some downtime, and even some performance loss, in the face of a disaster caused by an external event like a tornado or hurricane. They're a lot less understanding when they are inconvenienced by a problem within the IT department, even if it was the failure of a key piece of equipment.
The only place I can think of where a replicating pair of unicontroller systems might be an advantage would be on a college campus. At the college where I worked, we had two data centers at opposite ends of the campus connected by a loop of 128 strands of single-mode fiber. In an environment like that, a user could put one system in each data center, and get both high availability and disaster recovery with one replicating array pair taking advantage of the lower cost of unicontroller systems.
A year or two ago, speed was the only performance factor that people cared about with all-flash systems; we were so happy with the performance we didn't care about other functionality. But as the all-flash market matures, I'm less willing to sacrifice things like high availability for speed.