If you listen to the hype, software-defined networking and other software-defined technology will solve all data center problems and allow your computing gear to be an undifferentiated mass of identical nodes.
But will a cluster of identical machines handling networks, servers and storage solve everything? Or do we need a set of task-specific systems? I think the latter, and I expect the move to software-defined technologies to be evolutionary rather than abrupt. For example, I see network switching chips sticking around for the foreseeable future. Here's why.
Moving the smarts of switching into virtual machines sounds like an answer to many issues. For example, compute power could be added during peak loads to keep latencies down, which would help the spread of in-transit encryption, firewalling, filtering and access control.
But virtualization adds overhead and latency, which isn't good for any operations that require fast response from the system. And unfortunately, networking is getting faster, and doing so at an increasing rate.
For instance, we are about 60% into the deployment of 10 GbE, with 25 GbE and 50 GbE connections coming soon, and we already have 40 GbE backbones and beyond. Moreover, the Internet of Things is looming on the near horizon, and that will expand traffic by a huge factor. The combination of faster links and virtualization overhead places limits on what SDN can do.
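To put rough numbers on those speeds, here is a minimal Python sketch of the time a software data path has per packet when a link is saturated with minimum-size Ethernet frames. The function names and the 3 GHz core clock are my own illustrative assumptions. The 64-byte minimum frame and the 20 bytes of preamble, start delimiter and interframe gap are standard Ethernet figures.

```python
# Per-frame overhead on the wire: 7-byte preamble + 1-byte start delimiter
# + 12-byte minimum interframe gap.
WIRE_OVERHEAD_BYTES = 20
MIN_FRAME_BYTES = 64  # smallest legal Ethernet frame, including the FCS


def per_packet_budget_ns(link_gbps: float, frame_bytes: int = MIN_FRAME_BYTES) -> float:
    """Time between back-to-back frames at line rate, in nanoseconds."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    # bits divided by Gbit/s gives nanoseconds directly.
    return bits_on_wire / link_gbps


def cycles_per_packet(link_gbps: float, core_ghz: float) -> float:
    """CPU cycles one core can spend per frame (core_ghz is an assumed clock)."""
    return per_packet_budget_ns(link_gbps) * core_ghz


if __name__ == "__main__":
    ASSUMED_CORE_GHZ = 3.0  # illustrative assumption, not a measured figure
    for gbps in (10, 25, 40, 50):
        budget = per_packet_budget_ns(gbps)
        cycles = cycles_per_packet(gbps, ASSUMED_CORE_GHZ)
        print(f"{gbps:>2} GbE: {budget:6.2f} ns per frame, about {cycles:5.0f} cycles")
```

This is the worst case, since real traffic mixes in larger frames. Even so, it shows how quickly the per-packet budget shrinks as link speed rises, before any hypervisor overhead is counted.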
Inside the data center, storage is placing huge stress on the network. With Fibre Channel losing ground to Ethernet connections and RDMA on the rise, the software-defined question rolls over into storage, both in appliance configuration and task distribution.
In both networking and storage, tasks can be labelled latency sensitive (data plane) or side band (control plane). With both, real-time operations challenge the concept of simple virtualization that "software defined" implies.
With real-time operations, standard instances that are given periodic slices of a core could introduce huge latencies on traffic. Suppose routing were running in an instance that gets 10% of a core. Depending on how the hypervisor slices and dices the core's usage time, this could add milliseconds to the latency, which would render the switch configuration too slow for general work, and totally unacceptable for low-latency operations such as in financial systems. Likewise, the DRAM allocated to an instance may be too small to hold all of the routing tables, especially with IPv6 and IoT.
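The 10%-of-a-core case can be sketched with a deliberately simple model. It assumes the hypervisor gives the instance one contiguous slice per scheduling period, and the 10 ms period used below is a hypothetical value, not a figure from any particular hypervisor. Real schedulers are more sophisticated, so treat this as an order-of-magnitude illustration only.

```python
def worst_case_wait_ms(core_share: float, period_ms: float) -> float:
    """Longest wait for the next slice, assuming one contiguous slice per period.

    A packet that arrives just after the instance's slice ends must wait for
    the rest of the period, which is (1 - core_share) * period_ms.
    """
    if not 0.0 < core_share <= 1.0:
        raise ValueError("core_share must be in (0, 1]")
    return (1.0 - core_share) * period_ms


def frames_arriving(wait_ms: float, link_gbps: float, frame_bytes: int = 64) -> int:
    """Frames a saturated link delivers while the instance is descheduled."""
    bits_on_wire = (frame_bytes + 20) * 8  # 20 bytes: preamble, delimiter, interframe gap
    frames_per_ms = link_gbps * 1e6 / bits_on_wire  # 1 Gbit/s is 1e6 bits per ms
    return int(wait_ms * frames_per_ms)


if __name__ == "__main__":
    ASSUMED_PERIOD_MS = 10.0  # hypothetical scheduling period
    wait = worst_case_wait_ms(0.10, ASSUMED_PERIOD_MS)
    backlog = frames_arriving(wait, link_gbps=10)
    print(f"worst-case wait: {wait:.1f} ms, frames queued at 10 GbE: {backlog}")
```

Under these assumptions the wait is several milliseconds, and a saturated 10 GbE link would queue on the order of a hundred thousand minimum-size frames in that time, which is why the buffering and DRAM question matters as much as the latency itself.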
Current switches don't have this problem. They have control processors right-sized and dedicated to the task they do. The task apps also run on operating systems designed for real-time operation, with protection against interrupt loss caused by task pre-emption. These OSs don't have the overhead associated with general purpose operating systems, and they don't have the issues of hypervisor overhead, time slicing and IO sharing that general solutions encounter.
This all suggests the move to software-defined networking will be more evolutionary, with a first phase in which virtual CPU instances are used for side-band control plane operations such as configuration control, and for non-critical data plane operations such as compression and encryption for backup and disaster recovery.
The focus in this phase on the control plane will allow more comprehensive management of the whole network, and bring a bunch of new features to the table. These changes should simplify data center operation and speed up networking.
A second phase where virtual instances solve more time-critical data-plane operations is likely to follow, but will be dependent on specialized instances, perhaps with acceleration using system-on-a-chip (SoC) or GPU assist.