Scale-out storage is a storage system that is assembled one server, often called a storage node, at a time, with the nodes clustered together into a single logical unit. As each node is added to the cluster, performance and capacity scale in unison. In theory, the more nodes you add, the better the overall performance. The storage system is managed, especially in terms of provisioning, as if all the nodes were a single system, not a dozen individual servers. The idea is that a scale-out storage system should be able to scale almost infinitely without requiring additional IT staff.
In scale-out storage the individual node is often a relatively low-end or mid-range server. It is only when that server is clustered with the other nodes that the combined processing power makes the storage performance acceptable. This is important because the individual nodes have a lot of work to do. As a result, when scale-out storage first appeared as an option for the data center to consider, design decisions were made to artificially require more nodes per cluster so that enough processing power was available to deliver acceptable performance.
These designs included systems with low drive counts, which meant that as capacity grew you very quickly had to add nodes to the cluster. The initial node count was typically three or four, and for many systems the ideal node count for processing and bandwidth was eight to 12. In the days when a few TBs of data was considered a lot, nodes came with only two to four drives.
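To make the effect of drive count concrete, here is a minimal sketch of the arithmetic. The function name, the 100 TB target, the 1 TB drive size, and the 12-drive dense node are all illustrative assumptions, not figures from any particular product, and the sketch ignores replication or RAID overhead.

```python
import math

def nodes_for_capacity(target_tb, drives_per_node, tb_per_drive, min_nodes=3):
    """Nodes needed to hold target_tb of raw capacity (hypothetical helper).

    min_nodes reflects the three-node starting point mentioned above.
    """
    per_node_tb = drives_per_node * tb_per_drive
    return max(min_nodes, math.ceil(target_tb / per_node_tb))

# Illustrative figures only: 100 TB target, 1 TB drives.
sparse = nodes_for_capacity(100, drives_per_node=4, tb_per_drive=1)   # low drive-count node
dense = nodes_for_capacity(100, drives_per_node=12, tb_per_drive=1)   # denser node
print(sparse, dense)
```

Under these assumptions the four-drive design needs 25 nodes where the twelve-drive design needs 9, which is the gap the rest of this column is concerned with.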
The situation is different now. Especially in cloud or online application markets, there will be plenty of demand for capacity, and a high node count will happen naturally, so stuffing the nodes full of hard drive capacity is less of an issue. Also, as components have shrunk, there is more physical space left for additional drives. Lastly, the CPU performance of even a low-end server today is significantly better than that of a high-end server a few years ago.
The impact is that some scale-out storage systems end up with too high a node count too quickly. As we discussed in our webinar "Be The Cloud--Storage Choices for Online Application Providers", this is especially true in the online provider market, where capacity growth is almost assured. This kind of organization needs to look for densely packed storage systems that can still deliver the performance levels that users need.
With today's 2.5" hard drives and current capacities per drive, plus the additional space available thanks to shrinking components, a 1U or 2U node with double-digit TB capacity should be deliverable at very aggressive price points. Assuming a 4-node system, that could be more than 40 TBs of usable capacity in the initial configuration. While this may seem large for a mid-sized enterprise, it is not for online application providers. We commonly run into companies that are adding 50 TBs or more month after month. They will be able to generate the node count needed to see acceptable I/O performance from the cluster.
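As a rough check on that claim, the sketch below projects node count under steady growth. It assumes 10 TBs of usable capacity per node (the low end of "double-digit"), a full 4-node starting cluster, and 50 TBs of new data per month; the function name and the simplification that all growth lands on new nodes are assumptions for illustration.

```python
import math

def nodes_after(months, initial_nodes=4, tb_per_node=10, monthly_growth_tb=50):
    """Projected node count after a number of months of steady growth.

    Assumes the initial cluster is already full, so every new TB
    requires new node capacity (a deliberate simplification).
    """
    total_tb = initial_nodes * tb_per_node + months * monthly_growth_tb
    return math.ceil(total_tb / tb_per_node)

for month in range(3):
    print(month, nodes_after(month))
```

With these assumed figures the cluster grows from 4 nodes to 9 after one month and 14 after two, so a provider growing at this rate passes the eight-to-12 node range almost immediately, even with high-capacity nodes.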
The net result of high-capacity nodes will be significant cost savings, not only in acquisition costs but also in powering and cooling those systems. There are solutions available today that address these issues and deliver very high capacity per node. A high-capacity node is not for everyone, though. If you don’t have enough capacity demand to push a reasonable node count, don’t go down that path.