Howard Marks

Network Computing Blogger


Which Scale-Out Storage Architecture Is Best?

In a previous blog post, I got all prophetic and said that I had seen the future of solid-state storage and that it was scale out. The sheer performance that even a small number of solid-state disks can deliver is just too much for a traditional scale-up storage controller to handle. The question then becomes: If the future of storage is scale out, which scale-out storage architecture is best?

Of course, scale-out storage architectures aren't just for solid-state storage. Scale-out NAS and object-storage systems have been solving big data problems of one sort or another for a decade. Over time, we've seen vendors introduce several different scale-out architectures, each one attractive in its own way.

The simplest approach is to use a shared-disk clustered file system, like IBM's GPFS or Quantum's StorNext, to build a scale-out NAS system like IBM's SONAS or Symantec's FileStore. These systems use a central SAN array to hold data managed by multiple NAS heads. The scale of the systems and their performance are limited by the shared storage array. While clustered file systems are a good solution for organizations needing large, fast file repositories, they're not the answer for solid-state storage, where the bottleneck is the array controller as much as, if not more than, the file system.

What we need for scale-out flash is a "shared nothing" cluster that allows us to add nodes to the system without any single choke point, like the shared array in a clustered file system. As we look at shared nothing storage systems, both all-solid-state and those based on spinning disks, we see two quite different architectures.

Some vendors, like SolidFire in the all-solid-state market and HP with LeftHand, along with most object storage systems, build their storage clusters from independent storage nodes. To allow the system to survive the loss of a storage node, they mirror data across two or more nodes in the cluster. This approach keeps the node hardware simple, usually off-the-shelf servers, but since all the data is mirrored across multiple nodes, these systems have to store at least two copies of everything. As a result, they're not terribly space efficient. While disk space is cheap, SSD capacity is less so, which may push solid-state vendors toward the twinned model.

While storage systems for archival data can use cross-node RAID or, even better, erasure coding to distribute data across multiple nodes without the overhead of mirroring, these approaches aren't well suited to the low-latency, high-IOPS applications that solid-state storage systems address. EMC's Isilon uses a combination of mirroring for files or folders that are accessed randomly and erasure coding for older files and those that will be accessed sequentially, like media files.
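To make the space-efficiency trade-off concrete, here is a minimal Python sketch of the arithmetic. The function names and the example stripe widths are illustrative assumptions, not figures from any vendor mentioned here. It compares the usable fraction of raw capacity under N-way mirroring with a stripe of data units plus parity (or erasure-coding) units.

```python
from fractions import Fraction


def mirror_efficiency(copies: int) -> Fraction:
    """Usable fraction of raw capacity when every block is stored `copies` times."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return Fraction(1, copies)


def parity_efficiency(data_units: int, parity_units: int) -> Fraction:
    """Usable fraction for a stripe of data units plus parity/coding units.

    Covers both RAID-style parity and k+m erasure coding; the stripe
    widths passed in are hypothetical examples.
    """
    if data_units < 1 or parity_units < 0:
        raise ValueError("need at least one data unit and non-negative parity")
    return Fraction(data_units, data_units + parity_units)


def raw_needed(usable_tb: float, efficiency: Fraction) -> float:
    """Raw capacity (TB) required to deliver `usable_tb` at a given efficiency."""
    return usable_tb * efficiency.denominator / efficiency.numerator


if __name__ == "__main__":
    for label, eff in [
        ("2-way mirror", mirror_efficiency(2)),
        ("3-way mirror", mirror_efficiency(3)),
        ("4+1 parity stripe", parity_efficiency(4, 1)),
        ("10+2 erasure code", parity_efficiency(10, 2)),
    ]:
        print(f"{label}: {float(eff):.0%} usable, "
              f"{raw_needed(100, eff):.0f} TB raw per 100 TB usable")
```

Under these assumptions a two-way mirror needs twice the raw capacity for the same usable space, which is the overhead that matters most when the raw capacity is flash.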

Rather than using simple storage nodes such as rack-mount servers, which have their own single points of failure, as building blocks, twinned systems like Dell/EqualLogic and NetApp build a cluster of dual-controller systems. Since each storage node has two controllers and a block of storage, it can use RAID for data protection, keeping overhead down. System designers can also build hybrid scale-up/scale-out systems by adding drive shelves to the controller pair.

The downside to paired systems is their behavior in the event of a controller failure. When a controller fails in a paired system, its twin has to take on its workload, which may cause a significant performance loss. Most systems built from independent peer nodes distribute the second copy of data from a single node across all the other nodes in the cluster, so a node failure will have a smaller impact on performance.
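The difference in failure behavior can be sketched with a simple load model. This is a rough illustration under stated assumptions, not a description of any particular product: both controllers in a pair are active and equally loaded, a failed peer node's work is spread evenly across the survivors, and rebuild traffic is ignored. The function names are hypothetical.

```python
def paired_failover_load(load_per_controller: float) -> float:
    """Load on the surviving controller after its twin fails.

    Assumes an active/active pair with equal load on both controllers,
    so the survivor inherits its twin's entire workload.
    """
    return 2 * load_per_controller


def peer_failover_load(load_per_node: float, nodes: int) -> float:
    """Load on each surviving node after one peer node fails.

    Assumes the failed node's work is redistributed evenly across the
    remaining nodes - 1 nodes.
    """
    if nodes < 2:
        raise ValueError("need at least two nodes to survive a failure")
    return load_per_node * nodes / (nodes - 1)


if __name__ == "__main__":
    load = 0.5  # each controller or node running at 50% of capacity
    print(f"paired system: survivor at {paired_failover_load(load):.0%}")
    for n in (4, 10, 20):
        print(f"{n}-node peer cluster: survivors at "
              f"{peer_failover_load(load, n):.1%}")
```

In this model the surviving controller of a pair always sees its load double, while the extra load per survivor in a peer cluster shrinks as the cluster grows.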

The unanswered question for all-solid-state systems is how vendors will balance the cost of the additional SSD capacity that mirrored, independent-node systems require against the cost of the additional controllers in a twinned system. This will become apparent as we see scale-out twin systems from vendors like Pure Storage.

Disclaimer: Dell, HP, NetApp and SolidFire are or have been clients of DeepStorage LLC.

