The Classic Scale-Up Problem
The classic scale-up storage problem, and the one that scale-out storage claims to fix, is that at some point you will hit a "wall." Either the system can't store any more data because it has run out of capacity, or it can't meet the performance demands of the environment. Hitting the wall typically leads you to buy additional scale-up storage systems, and each new silo of storage increases both acquisition costs and storage management costs.
This wall is typically hit long before you run out of physical slots for hard drives. It might appear, for example, when the system doesn't have enough network connections to keep up with inbound storage requests, but most of the time the system simply runs out of CPU resources. The storage controllers consume CPU to manage how data is read, written and protected, and to run data services such as thin provisioning and snapshots. The scale-out advantage is that each node you add brings capacity, network and CPU in lockstep, while the cluster still presents a single point of management.
The Scale-Up Reality
In the modern era of 8- to 16-Gb Fibre Channel or 10-GbE connections and hybrid or all-flash arrays, I/O speed should be less of an issue, and the low cost of compute power should address CPU consumption. A scale-up system with appropriate network connections and processing power should provide plenty of performance and capacity.
Scale-out vendors will claim that this is part of the scale-up problem: you have to buy all this horsepower up front. CPUs are relatively cheap but storage I/O can be expensive. Once you have made that investment, however, scale-up typically becomes the more efficient option, because drive enclosures are all that is needed from that point forward. One powerful CPU eventually becomes cheaper than the dozens of individually cheaper CPUs spread across storage nodes.
The scale-out storage advantage becomes a disadvantage as the individual components (network, CPU, drives) become faster. Because you are likely to add capacity before you can fully use the other resources, you end up with a large cluster of storage nodes in which CPU and network resources are massively underutilized.
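The up-front versus incremental trade-off can be made concrete with a rough cost model. The sketch below is illustrative only: the function names and every price and capacity figure are hypothetical placeholders, not vendor data. It assumes a scale-up system pays once for its controllers and then only for drive enclosures, while a scale-out system pays for a full node (drives plus CPU plus network) with every capacity increment.

```python
def scale_up_cost(capacity_tb, controller_cost, shelf_cost, shelf_tb):
    """Controllers are bought once up front; growth adds drive shelves only."""
    shelves = -(-capacity_tb // shelf_tb)  # ceiling division on integers
    return controller_cost + shelves * shelf_cost


def scale_out_cost(capacity_tb, node_cost, node_tb):
    """Every capacity increment buys a whole node: drives, CPU and network."""
    nodes = -(-capacity_tb // node_tb)
    return nodes * node_cost


def crossover_tb(max_tb, step_tb, **prices):
    """Return the first capacity at which scale-up is the cheaper option."""
    for tb in range(step_tb, max_tb + 1, step_tb):
        up = scale_up_cost(tb, prices["controller_cost"],
                           prices["shelf_cost"], prices["shelf_tb"])
        out = scale_out_cost(tb, prices["node_cost"], prices["node_tb"])
        if up < out:
            return tb
    return None  # scale-up never wins within the range examined


# Hypothetical prices chosen only to illustrate the shape of the curves.
PRICES = dict(controller_cost=100_000, shelf_cost=20_000, shelf_tb=100,
              node_cost=35_000, node_tb=100)

if __name__ == "__main__":
    print(crossover_tb(2_000, 100, **PRICES))
```

With these made-up numbers, scale-out is cheaper for small deployments and scale-up overtakes it once enough shelves have been added to amortize the controllers. Where that point falls in practice depends entirely on real pricing.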
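A back-of-the-envelope calculation shows how this underutilization arises. The sketch below is a simplified model under stated assumptions: the function name, the per-node capacity and the IOPS figures are all hypothetical, and it treats performance as a single IOPS number that scales linearly with node count, which real clusters only approximate.

```python
import math


def cluster_utilization(capacity_tb, iops_needed, node_tb, node_iops):
    """Size a scale-out cluster and report how busy its nodes are.

    The node count is set by whichever requirement is larger, capacity
    or performance. Utilization is the fraction of the cluster's
    aggregate IOPS that the workload actually needs.
    """
    nodes_for_capacity = math.ceil(capacity_tb / node_tb)
    nodes_for_performance = math.ceil(iops_needed / node_iops)
    nodes = max(nodes_for_capacity, nodes_for_performance)
    utilization = iops_needed / (nodes * node_iops)
    return nodes, utilization


if __name__ == "__main__":
    # Hypothetical workload: 2 PB of data that needs 200,000 IOPS.
    for node_iops in (50_000, 200_000):
        nodes, util = cluster_utilization(2_000, 200_000,
                                          node_tb=50, node_iops=node_iops)
        print(f"{node_iops} IOPS per node: {nodes} nodes, {util:.1%} busy")
```

In this example capacity dictates 40 nodes either way, so making each node four times faster does not shrink the cluster. It only drops utilization of the performance resources from 10% to 2.5%.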
The 'Scale Right' Answer
Scalability of a storage system is always a concern, but you also don't want to end up with a 114-node storage system driven by a need for capacity where the processors and network I/O are sitting idle. The answer for both camps is to provide more of a "scale right" approach.
For scale-up vendors, this means providing simpler ways to add processing power and network bandwidth to their storage systems so that it does not all have to be purchased up front. This should eliminate much of the storage system sprawl problem that plagues many data centers.
For scale-out vendors, it means providing denser storage nodes with more capacity, network I/O and CPU processing power per box. This would result in fewer and better-used nodes, which should drive down long-term costs significantly.