Why Network Performance Management Needs To Become Software-Defined

You may or may not be excited about software-defined networking. Gartner recently declared SDN to have hit the “trough of disillusionment” in its famous hype cycle. But make no mistake, networks will change to become more programmatically accessible. So, it’s time to start thinking about bringing network operations, specifically network performance management, into the software-defined era.

Configuration management

Most SDN use cases are in the realm of configuration management. For example, NTT Communications deployed OpenFlow in its global cloud data centers to reduce manual operations. AT&T used model-based configuration management to enable self-service offerings. SDN isn’t the sole answer in this domain, but it helps configuration management by simplifying interaction with arbitrarily complex network topologies. By having a centralized control plane tracking all these overlaid networks, it’s far easier to manage state and push configuration changes as needed.

Of course, automated configuration management doesn’t equate to SDN. Even without a centralized control plane, using automation for managing network configs is best practice. Networking teams are moving towards DevOps practices by using tools like Chef, Puppet, and Ansible. These tools align the timeframes for rolling out networking components with the speed achieved by the automation of virtualization and containers.
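To make the automation idea concrete, here is a minimal Python sketch of the desired-state pattern behind this style of configuration management: declare what the network should look like, compare it with what is running, and derive the changes. It does not represent how Chef, Puppet, or Ansible work internally. The `InterfaceConfig` model and the `plan_changes` function are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceConfig:
    """Hypothetical, simplified model of one switch interface."""
    name: str
    description: str
    vlan: int
    enabled: bool = True


def plan_changes(desired, running):
    """Return the ordered actions that move the running state to the desired state."""
    desired_by_name = {iface.name: iface for iface in desired}
    running_by_name = {iface.name: iface for iface in running}
    actions = []
    # Create or update everything that is declared in the desired state.
    for name in sorted(desired_by_name):
        want = desired_by_name[name]
        have = running_by_name.get(name)
        if have is None:
            actions.append(("create", want))
        elif have != want:
            actions.append(("update", want))
    # Remove anything that is running but no longer declared.
    for name in sorted(running_by_name):
        if name not in desired_by_name:
            actions.append(("remove", running_by_name[name]))
    return actions
```

Because the plan is computed from a comparison, running it against a network that already matches the declaration yields no actions, which is what makes repeated, automated rollouts safe.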

Whether assisted by SDN or not, automated network configuration management isn’t exactly spawning disruptive innovation. However, it is closing the yawning agility gap that has existed between application/infrastructure teams and networking teams, especially where carrier services are concerned. For service providers and IT networking groups alike, faster configuration restores relevance and shows that they can operate at the speed of cloud.

Software-defined NPMD emerging

If configuration management has taken the lead in SDN use cases, network performance management and diagnostics (NPMD) has lagged. So far, most software-defined NPMD initiatives amount to more agile and cost-effective forms of network visibility tapping and probe insertion. For example, Big Switch Networks has for some years now offered what is essentially an SDN-enabled white-box alternative to Gigamon/Ixia/NetScout network visibility boxes.

There are also interesting R&D prototypes such as Ericsson Labs’ Diamond framework, which utilizes an OpenDaylight controller and multi-vendor OpenFlow switches to deploy NFV network probes based on a monitoring intent policy. These efforts are useful starting points, but they represent only the tip of the iceberg when it comes to the full spectrum of NPMD needs.

The current state of mainstream NPMD is far from software-defined. The typical network operations team deploys a suite of anywhere from five to 25 network monitoring tools and technologies. These include: SNMP monitoring; network performance products performing packet capture or real/synthetic transaction measurement; route analytics; and network traffic analysis tools for NetFlow, sFlow, or IPFIX flow records exported by switches, firewalls, and routers.

The vast majority of today’s network tools are quite traditional when compared to the cloud/virtual/container form factors and agile, programmatic interfaces of cloud and SDN. Form factor-wise, most NPMD tools are monolithic enterprise software suites, hardware appliances, or downloadable, single-server software. Most are based on historical scale-up rather than scale-out computing and storage assumptions. Most are still centered on their GUIs rather than on programmatic interfaces. While many NPMD tools claim to provide APIs, those APIs are used for the most part not to empower customer automation, but to enable inter-vendor integration or custom development services.

Breakthrough needed

Why does NPMD need to become software-defined? The short answer is to break down silos for the sake of agility. Programmatic accessibility isn’t just about doing things faster in isolation; it’s about breaking down walls between teams and technologies so that processes support high-quality service outcomes. Silos impede agility.

For example, there are huge, untapped veins of valuable network operations data that can better inform everything from monitoring and troubleshooting to application and service development. Today, that data is mostly thrown away due to scale-up constraints, and what is left is accessible only to the few network experts who operate the product GUIs.

An updated approach to NPMD based on cloud-scale principles brings practice in line with the ambitions of SDN. Microsoft Azure’s paper on PingMesh provides an excellent example of cloud-scale, software-defined network operations. Of course, Microsoft has software development resources that most network organizations, including most carriers, can only dream of. But that shouldn’t stop network operations teams from moving forward into DevOps and automation-oriented ways of doing business, and seeking cloud-era tools and technologies to support that orientation.
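As a small illustration of what software-defined measurement can look like, the following Python sketch has a host time TCP connections to a list of peers and summarize the results. It is loosely inspired by the idea of servers probing one another for latency and is not Microsoft's implementation. The function names `tcp_connect_latency_ms` and `probe_peers`, and the shape of the result dictionary, are assumptions made for this example.

```python
import socket
import statistics
import time


def tcp_connect_latency_ms(host, port, timeout=1.0):
    """Return the TCP connect time in milliseconds, or None if the connect fails."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return None


def probe_peers(peers, samples=3, timeout=1.0):
    """Probe each (host, port) peer several times and summarize the outcome."""
    results = {}
    for host, port in peers:
        latencies = [
            tcp_connect_latency_ms(host, port, timeout) for _ in range(samples)
        ]
        succeeded = [value for value in latencies if value is not None]
        results[(host, port)] = {
            "sent": samples,
            "failed": samples - len(succeeded),
            "median_ms": statistics.median(succeeded) if succeeded else None,
        }
    return results
```

Run on a schedule from many hosts, with the summaries shipped to a scale-out data store, a probe like this yields latency and reachability data that other teams can query through code rather than through a product GUI.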