Storage

George Crump
Commentary

Scale-Out Storage Has Limits

Disk may scale as your storage needs increase, but what do you do when your datacenter is maxed out?

Scale-out storage systems promise to scale as an organization's capacity and performance demands increase. The problem is that datacenters do not scale out. When you run out of room in a datacenter, you have two options: build a new one, or set a policy that "for everything that comes into the datacenter, something must come out." How can you meet scaling demands without having to build new datacenters?

Scale-out smart
Scale-out storage can bring a lot of value to the datacenter. It can meet capacity demands while also meeting performance demands. More importantly, it can meet those demands without requiring expensive storage system upgrades and migrations. But you need to scale (out or up) intelligently.

Do you really need a scale-out storage system?
The key first step is to make sure you really need a scale-out storage system. If your needs (both capacity and performance) for the next five years or so can be met with a scale-up storage system, you may find that to be an easier and potentially less expensive (upfront) approach. However, if your business, in terms of capacity and performance, is growing rapidly, then scale-out may be for you.
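The five-year check described above amounts to a compounding calculation. As a rough sketch (the growth rates, array ceilings, and function names below are illustrative assumptions, not figures from the article):

```python
# Hypothetical sketch: project five-year capacity and IOPS demand and check
# whether a single scale-up array's ceiling covers it. All numbers are
# illustrative assumptions, not vendor figures.

def projected_demand(current, annual_growth_rate, years=5):
    """Compound a current demand figure over a planning horizon."""
    return current * (1 + annual_growth_rate) ** years

def scale_up_suffices(cap_tb, iops, cap_growth, iops_growth,
                      array_max_tb, array_max_iops, years=5):
    """True if one scale-up array can absorb the projected demand."""
    return (projected_demand(cap_tb, cap_growth, years) <= array_max_tb and
            projected_demand(iops, iops_growth, years) <= array_max_iops)

# Example: 200 TB growing 30%/yr and 50k IOPS growing 20%/yr, checked
# against an array rated for 1 PB and 300k IOPS.
print(projected_demand(200, 0.30))  # ~742.6 TB in five years
print(scale_up_suffices(200, 50_000, 0.30, 0.20,
                        array_max_tb=1_000, array_max_iops=300_000))  # True
```

If either projection clears the array's ceiling inside the horizon, scale-out starts to look like the safer bet.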

[Learn more about scale-out vs. scale-up storage. See Is Scale-Out Storage A Must Have?]

Understand per-node performance
The next step is to make sure that your scale-out storage vendor can provide efficient per-node performance and capacity. Some scale-out storage systems limit one or the other and force you to add nodes prematurely. They focus on aggregate performance and capacity. Remember that each node takes up space; while a 1U node may seem harmless at first, in a rapidly growing environment those nodes are added so rapidly that their consumption of space, power, and cooling becomes a major challenge.

I've seen scale-out storage vendors claim 1 million IOPS, but when you review the configuration you realize it took them 100+ nodes to get to that level. By comparison, recent scale-out, all-flash array vendors can get you to 1 million IOPS in about 4-6 nodes. Not only does this save you space, power, and cooling, it also saves you dollars. The sheet metal alone for that 100+ node configuration would cost more than the entire 4-6 node configuration.
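The space and power side of that comparison is easy to put numbers on. A back-of-the-envelope sketch, where the per-node rack-unit and wattage figures are illustrative assumptions:

```python
# Compare two hypothetical clusters that each deliver 1,000,000 IOPS:
# a 100-node legacy scale-out system vs. a 5-node all-flash system.
# Per-node rack-unit and power figures are illustrative assumptions.

def footprint(nodes, ru_per_node=1, watts_per_node=450):
    """Rack units and power draw for a cluster of identical 1U nodes."""
    return {"nodes": nodes,
            "rack_units": nodes * ru_per_node,
            "kilowatts": nodes * watts_per_node / 1000}

dense = footprint(100)  # legacy scale-out: 100 nodes for 1M IOPS
flash = footprint(5)    # all-flash scale-out: 5 nodes for 1M IOPS
print(dense)  # {'nodes': 100, 'rack_units': 100, 'kilowatts': 45.0}
print(flash)  # {'nodes': 5, 'rack_units': 5, 'kilowatts': 2.25}
```

Two full racks and tens of kilowatts versus a fraction of one rack, before the electronics are even priced in.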

Grow primary storage slowly
Most modern scale-out data storage systems can seamlessly scale performance and capacity. But just because they can does not mean that you should. Primary storage, even if it is scale-out, is still the most expensive form of storage in the datacenter. Modern disk backup and archive solutions allow for easy, transparent, and native movement of data from primary storage to a less expensive tier. Integrated disk/tape archive solutions allow data to be easily moved from primary storage to archive storage.

Most users can easily handle the concept of data being in one of two places, so tell them that if they can't find their data on "/Primary" they will find it on "/Archive." If you think about it, they are already doing this today with Dropbox. If they can move files to Dropbox (which you also need to control), then they can move and find files on "/Archive."
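The movement policy behind that two-location model can be sketched as a simple scan for idle files. This is a dry-run illustration only; the "/Primary" path, the idle threshold, and the reliance on access times are all assumptions, not a description of any particular product:

```python
# Hedged sketch: list files under a primary tree that have not been read
# in a given number of days, as candidates to move to the archive tier.
# Dry-run only; path and threshold are illustrative assumptions, and this
# presumes the filesystem records access times (atime) at all.
import os
import time

def archive_candidates(primary_root, idle_days=365):
    """Yield paths whose last access time is older than idle_days."""
    cutoff = time.time() - idle_days * 86400
    for dirpath, _dirnames, filenames in os.walk(primary_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.stat(path).st_atime < cutoff:
                    yield path
            except OSError:
                pass  # file vanished or unreadable; skip it

# for path in archive_candidates("/Primary"):
#     print("candidate:", path)  # a real tool would move it to /Archive
```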

The next question is "should my archive tier be scale-out?" and the obvious assumption is yes. But, as we will discuss in an upcoming column, a strong case can be made for a scale-up disk archive tier backed by tape.

Solid state alone can't solve your volume and performance problem. Think scale-out, virtualization, and cloud. Find out more about the 2014 State of Enterprise Storage Survey results in the new issue of InformationWeek Tech Digest.

George Crump is president and founder of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. With 25 years of experience designing storage solutions for datacenters across the US, he has seen the birth of such technologies as RAID, NAS, ...
Comments
EnterpriseArch
4/22/2014 | 12:25:06 PM
Strategy of Technology Required
Solving the storage problem is just one issue within a larger system of constraints that involves hardware, people, and vendors. I see some very large shops facing a crisis, and they still do not know it, because they do not define all of their system's constraints, plan comprehensively against them, and then measure against those constraints.

Larger organizations with growth paths should develop a rigorous planning process that includes a well-defined strategy for how to use Moore's Law throughout the hardware stack. The planning process should have defined plan-update checkpoints. The plan should have a five-year horizon and include all physical systems, not just storage, but datacenter and organizational elements as well. The checkpoints should measure actuals vs. plan, then redo forecasts, check key assumptions, and look at any new technology incorporations. The enterprise architecture should be augmented to allow easy migration to newer systems to take advantage of new technologies, whether increased density or new approaches. The plan should look at tradeoffs and costs. The plan should be owned by a director-level position reporting to the CIO, someone with extensive technical, analytical, and leadership experience, similar to an S3/G3 shop in the US Army.
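The checkpoint loop this comment describes, measuring actuals against plan and then re-deriving the forecast, could be sketched as follows; all figures and names are illustrative assumptions, not from the article or the comment:

```python
# Illustrative checkpoint: compare actual capacity consumption against the
# plan, infer the realized growth rate, and project it to the end of the
# five-year horizon. Numbers are assumptions for illustration only.

def reforecast(actual_tb, prior_tb, years_elapsed, horizon_years=5):
    """Infer realized annual growth and project to the planning horizon."""
    growth = (actual_tb / prior_tb) ** (1 / years_elapsed) - 1
    projection = actual_tb * (1 + growth) ** (horizon_years - years_elapsed)
    return growth, projection

# Checkpoint after year 1: the plan assumed 30%/yr from a 200 TB base,
# but 300 TB was actually consumed.
growth, projection = reforecast(actual_tb=300, prior_tb=200, years_elapsed=1)
print(f"realized growth {growth:.0%}, year-5 projection {projection:.0f} TB")
```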
