Scale-out storage systems promise to grow as an organization's capacity and performance demands increase. The problem is that datacenters do not scale out. When you run out of room in a datacenter, you have two options: build a new one, or set a policy that "for everything that comes into the datacenter, something must come out." How can you meet scaling demands without building new datacenters?
Scale-out storage can bring a lot of value to the datacenter. It can meet capacity demands while also meeting performance demands. More importantly, it can meet those demands without requiring expensive storage system upgrades and migrations. But you need to scale (out or up) intelligently.
Do you really need a scale-out storage system?
The key first step is to make sure you really need a scale-out storage system. If your needs (both capacity and performance) for the next five years or so can be met with a scale-up storage system, you may find that to be an easier and potentially less expensive (upfront) approach. However, if your business, in terms of capacity and performance, is growing rapidly, then scale-out may be for you.
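One rough way to frame that five-year question is to project compound growth against the ceiling of a scale-up system. The sketch below is only an illustration: the function name, the growth rate, and the ceiling figure are hypothetical inputs, not numbers from any vendor, and the same check could be repeated for performance instead of capacity.

```python
def first_year_over_ceiling(current_tb, annual_growth, ceiling_tb, horizon_years=5):
    """Return the first year (1-based) in which projected capacity exceeds
    ceiling_tb, or None if it stays under the ceiling for the whole horizon.

    annual_growth is a fraction, e.g. 0.5 for 50% growth per year.
    All inputs are assumptions supplied by the reader.
    """
    projected = current_tb
    for year in range(1, horizon_years + 1):
        projected *= 1 + annual_growth  # compound growth, year over year
        if projected > ceiling_tb:
            return year
    return None


# Hypothetical example: 100 TB today, scale-up system tops out at 300 TB.
print(first_year_over_ceiling(100, 0.20, 300))  # None: fits for five years
print(first_year_over_ceiling(100, 0.50, 300))  # 3: outgrown in year three
```

If the result is `None`, a scale-up system may cover the planning window; a small number suggests that growth is fast enough to make scale-out worth evaluating.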
[Learn more about scale-out vs. scale-up storage. See Is Scale-Out Storage A Must Have?]
Understand per-node performance
The next step is to make sure that your scale-out storage vendor can provide efficient per-node performance and capacity. Some scale-out storage systems limit one or the other and force you to add nodes prematurely, because they focus on aggregate performance and capacity rather than on what each node delivers. Remember that each node takes up space. A 1U node may seem harmless at first, but in a rapidly growing environment nodes are added so quickly that their consumption of space, power, and cooling becomes a major challenge.
I've seen scale-out storage vendors claim 1 million I/O operations per second (IOPS), but when you review the configuration you realize it took them 100+ nodes to get to that level. By comparison, recent scale-out all-flash arrays can get you to 1 million IOPS in about 4-6 nodes. Not only does this save you space, power, and cooling, it also saves you dollars. The sheet metal alone for that 100+ node configuration would cost more than the entire 4-6U configuration.
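The arithmetic behind that comparison is simple enough to sketch. The per-node IOPS values below are back-of-the-envelope figures derived from the node counts quoted above (1 million IOPS across 100 nodes, or across roughly 5 nodes); the function name, the 1U-per-node default, and the watts-per-node figure are assumptions for illustration only.

```python
import math


def footprint(target_iops, iops_per_node, rack_units_per_node=1, watts_per_node=500):
    """Estimate node count, rack space, and power for a target IOPS level.

    rack_units_per_node and watts_per_node are placeholder assumptions;
    substitute the figures from the vendor's actual configuration.
    """
    nodes = math.ceil(target_iops / iops_per_node)
    return {
        "nodes": nodes,
        "rack_units": nodes * rack_units_per_node,
        "watts": nodes * watts_per_node,
    }


# 1M IOPS at roughly 10,000 IOPS per node (the 100-node case)
print(footprint(1_000_000, 10_000))
# 1M IOPS at roughly 200,000 IOPS per node (the 4-6 node case, using 5)
print(footprint(1_000_000, 200_000))
```

Asking a vendor for the per-node number, rather than the aggregate, lets you run this kind of estimate before the racks fill up.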
Grow primary storage slowly
Most modern scale-out data storage systems can seamlessly scale performance and capacity. But just because they can does not mean that you should. Primary storage, even if it is scale-out, is still the most expensive form of storage in the datacenter. Modern disk backup and archive solutions, including integrated disk/tape archives, allow easy, transparent, and native movement of data from primary storage to a less expensive tier.
Most users can easily handle the concept of data being in one of two places, so tell them that if they can't find their data on "/Primary" they will find it on "/Archive." If you think about it, they are already doing this today with Dropbox. If they can move files to Dropbox (which you also need to control), then they can move and find files on "/Archive."
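To make the two-location idea concrete, here is a minimal sketch of an age-based sweep from a primary tree to an archive tree that preserves relative paths, so a file missing from "/Primary" is found at the same path under "/Archive." This is an illustration only: archive products typically handle this natively, and the function name, the modification-time policy, and the age threshold are all assumptions.

```python
import os
import shutil
import time


def archive_stale_files(primary_root, archive_root, max_age_days, dry_run=True):
    """Move files not modified within max_age_days from primary_root to the
    same relative path under archive_root. Returns the relative paths selected.

    Assumes archive_root is not located inside primary_root.
    With dry_run=True (the default) nothing is moved; the list is a preview.
    """
    cutoff = time.time() - max_age_days * 86400  # seconds per day
    selected = []
    for dirpath, _dirnames, filenames in os.walk(primary_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            if os.path.getmtime(src) >= cutoff:
                continue  # modified recently enough to stay on primary
            rel = os.path.relpath(src, primary_root)
            if not dry_run:
                dst = os.path.join(archive_root, rel)
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.move(src, dst)
            selected.append(rel)
    return selected
```

Running it first with `dry_run=True` shows what would leave primary storage before anything is actually moved.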
The next question is "should my archive tier be scale-out?" and the obvious assumption is yes. But, as we will discuss in an upcoming column, a strong case can be made for a scale-up disk archive tier backed by tape.
George Crump is president and founder of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. With 25 years of experience designing storage solutions for datacenters across the US, he has seen the birth of such technologies as RAID, NAS, ...