In Defense Of VMware NSX And The Overlay Approach

By using a network overlay architecture, NSX cleanly segments SDN approaches into two realms: the physical and virtual. The overlay strategy has been criticized, but it makes the most sense.

Kurt Marko

September 11, 2013


At the heart of the debate about how to best build software-defined networks is the issue of decoupling the physical and logical realms. While many network equipment vendors try to conflate the two, ultimate success is likely to hinge on how well SDN and network virtualization architectures separate physical resources like switch ports, network links and L2 traffic flows from logical abstractions like virtual interfaces and networks.

The arguments for and against decoupling took center stage last month when VMware unveiled its NSX network virtualization software and Cisco CTO Padmasree Warrior quickly responded with a blog entitled "Limitations of a Software-Only Approach to Data Center Networking."

In making the case for decoupling, Martin Casado, VMware's chief network architect and a key developer of the Nicira virtualization technology underpinning NSX, points out that physical and logical networks address very different sets of problems with, if not divergent, at least non-overlapping requirements. In an interview at VMworld, Casado noted that it's simply good engineering practice to segment logically independent problems so that solutions can be optimized for each domain.

Clarifying his thoughts via email, Casado wrote, "Good engineering practice is based around modularization, where you decouple functionally independent problems so each subcomponent can evolve independently." He added, "If we clearly identify the interface and leave the rest decoupled, each can evolve at its own pace (the overlay providing a better service model to end systems and the underlay more efficient transport) while still maintaining correctness."

Casado believes the network overlay/underlay dichotomy is a good example of system modularization, adding that it's conceptually similar to the way routers and switches are designed, with a modular backplane handling traffic and line cards delivering various network services and protocols. "The overlay provides applications with network services and a virtual operations and management interface. The physical network is responsible for providing efficient transport," Casado wrote.
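
For illustration only, here is a minimal Python sketch of that modular contract (the names are hypothetical, not NSX's API): the overlay asks exactly one thing of the underlay--opaque delivery between tunnel endpoints--so either side can be reworked without touching the other.

    from abc import ABC, abstractmethod

    class Underlay(ABC):
        """The entire contract the overlay asks of the physical
        network: deliver an opaque payload between two tunnel
        endpoints. Any fabric honoring this can be swapped in."""
        @abstractmethod
        def send(self, src_vtep: str, dst_vtep: str, payload: bytes) -> None:
            ...

    class Overlay:
        """Network services (here, virtual L2 segments keyed by a
        virtual network identifier) live entirely above the contract;
        the underlay only ever sees send() calls."""
        def __init__(self, transport: Underlay, local_vtep: str):
            self.transport = transport
            self.local_vtep = local_vtep

        def forward(self, vni: int, dst_vtep: str, frame: bytes) -> None:
            # Tag the frame with its virtual network, then hand it to
            # whatever transport the physical side provides.
            self.transport.send(self.local_vtep, dst_vtep,
                                vni.to_bytes(3, "big") + frame)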

For example, physical networks manage traffic flows between switch ports. Traditionally, there has been a one-to-one mapping between network ports and IP addresses, which has been upset by server virtualization. This might imply a need to bridge the physical and virtual worlds; however, they can operate independently with tunneling protocols like VXLAN and NVGRE that facilitate building overlay networks. Indeed, NSX is just such a network overlay.
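
To make that concrete, here is a minimal sketch of the VXLAN framing (per RFC 7348) in Python--illustrative rather than production code: the virtual network's entire Ethernet frame becomes the payload of an ordinary UDP packet, which is all the physical network ever sees.

    import struct

    VXLAN_UDP_PORT = 4789  # IANA-assigned destination port for VXLAN

    def vxlan_encapsulate(inner_frame: bytes, vni: int) -> bytes:
        """Prepend the 8-byte VXLAN header to an inner Ethernet frame.
        The result rides as a UDP payload, so the underlay forwards it
        like any other IP traffic, oblivious to the virtual network."""
        if not 0 <= vni < 2 ** 24:
            raise ValueError("VNI is a 24-bit value")
        # Byte 0: flags, with bit 0x08 set to mark a valid VNI;
        # bytes 1-3 reserved. Bytes 4-6: the VNI; byte 7: reserved.
        return struct.pack("!B3xI", 0x08, vni << 8) + inner_frame

A tenant frame tagged with, say, VNI 5000 thus crosses the data center inside a UDP packet to port 4789; the underlay needs no knowledge of the overlay's topology to deliver it.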

Visibility, Scalability Concerns

Common objections to network overlays are that they don't scale, that they don't provide management visibility into both the physical and virtual networks, and that they can't optimize performance for changing traffic patterns because of poor (or immature) interfaces between the two. The scalability issue has been the subject of much, sometimes impassioned, discussion in the blogosphere. Ivan Pepelnjak, chief technology advisor at NIL Data Communications, argued in a blog post that tunneling, as used by schemes like NSX, does scale: "Will they scale? Short summary: Yes. The real scalability bottleneck is the controller and the number of hypervisor hosts it can manage."

But what about a virtual overlay's dependence on the performance of the physical network and its inability to control L2 packet flows? A lengthy blog post by Dimitri Stiliadis, co-founder and chief architect at Nuage Networks, makes the point that in a properly designed network--that is, a flat Clos or fat-tree topology--with ample, low-latency bandwidth throughout the data center, there's no need to micromanage local flow switching, since any potential improvement from centralized intelligence is negligible.

As Stiliadis aptly puts it: "If we consider a standard leaf/spine architecture, the question that one has to answer is whether any form of traffic engineering, or QoS, at the spine or leaf layers can make any difference in the quality of service or availability of flows. In other words, could we do something intelligent in this fabric to improve network utilization by routing traffic on non-shortest paths? Well, the answer ends up being an astounding NO, provided that the fabric is well engineered."
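
A small sketch may help show why (illustrative Python, assuming a plain two-tier leaf/spine fabric): every leaf-to-leaf path through any spine has the same hop count, so the only job left is spreading flows evenly--which a stateless per-flow hash already accomplishes.

    import zlib

    def ecmp_uplink(flow_5tuple: tuple, num_spines: int) -> int:
        """Pick a spine uplink for a flow the way a leaf switch's ECMP
        does: hash the flow's 5-tuple and take it modulo the number of
        spines. All spine paths between two leaves are equal cost, so
        the hash just balances load while pinning each flow to one
        path, avoiding packet reordering."""
        digest = zlib.crc32(repr(flow_5tuple).encode())
        return digest % num_spines

    # Example: a flow between two VMs lands deterministically on one
    # of four spines; different flows spread across all four.
    print(ecmp_uplink(("10.0.1.5", "10.0.2.7", 6, 49321, 443), 4))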

[Get up to speed on the unfamiliar protocols and technologies NSX introduces to the data center network in "Inside VMware NSX."]

Casado largely agrees with this conclusion, at least for enterprise data centers; cloud service providers are another story. However, he recognizes there must be a control interface between the physical and virtual worlds to handle cases where the fast, wide, flat network fabric still runs into problems. He wrote in his email: "Of course there is an interface between these two [the physical and virtual], for example, QoS bits to enforce SLAs and drop policies, multicast bits to offload packet replication, flow-level entropy to aid multipathing, etc."
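
One concrete instance of that interface, sketched here under the assumption of a VXLAN encapsulation: the overlay derives the outer UDP source port from a hash of the inner flow (as RFC 7348 recommends), so the underlay's ordinary ECMP hashing spreads tunneled flows across physical paths without ever parsing the tunnel.

    import zlib

    def entropy_source_port(inner_5tuple: tuple) -> int:
        """Fold the inner flow's identity into the outer UDP source
        port. Physical switches hash only the outer headers, so this
        single field is the 'flow-level entropy' handoff between
        overlay and underlay: distinct tenant flows get distinct
        outer ports, and thus distinct ECMP paths."""
        h = zlib.crc32(repr(inner_5tuple).encode())
        return 49152 + (h % 16384)  # keep within the ephemeral range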

In a press Q&A at VMworld last month, Casado pointed out that one reason cloud providers like Google were so enamored with OpenFlow, which he originally developed as part of his doctoral research at Stanford, was its ability to handle extreme conditions such as elephant flows creating bottlenecks on specific network segments, or multiple real-time streams with time-sensitive delivery requirements. He argues that such conditions are uncommon in enterprise data centers, or at least easily handled by well-designed switch fabrics. Where bridging the physical-virtual divide is necessary, OpenFlow is the answer; indeed, it's the protocol the NSX controller uses to manage virtual switch instances.
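
As a hypothetical sketch of what that bridging might look like: an OpenFlow controller can poll per-flow byte counters (flow statistics are part of the protocol) and single out the few sustained heavy hitters, leaving everything else to the fabric's default ECMP behavior. The threshold and polling scheme below are assumptions for illustration, not NSX's actual mechanism.

    # Assumed threshold for illustration: 100 MB/s sustained.
    ELEPHANT_BYTES_PER_SEC = 100 * 1024 * 1024

    def find_elephants(prev: dict, curr: dict, interval_sec: float) -> list:
        """prev and curr map a flow key (e.g. a 5-tuple) to cumulative
        byte counts from two successive counter polls; any flow whose
        rate over the interval crosses the threshold is flagged for
        special treatment (rerouting, rate limiting, a dedicated
        queue), while mice flows stay on the default paths."""
        elephants = []
        for flow, count in curr.items():
            rate = (count - prev.get(flow, 0)) / interval_sec
            if rate >= ELEPHANT_BYTES_PER_SEC:
                elephants.append(flow)
        return elephants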

Cisco's View

This brings us to another objection to partitioning networks into physical and virtual realms--namely, as Cisco's Warrior contended, "It fails to provide full real-time visibility of both physical and virtual infrastructure."

Warrior argued that segregating networks into physical and virtual realms forces users to cobble together multiple third-party components into a consolidated management platform, thus complicating IT operations and creating silos of different security policies, log data and orchestration processes. The rest of her post recounts the benefits of Cisco's ACI (Application Centric Infrastructure, an outgrowth of the Cisco ONE SDN strategy), which bridges network control with application services.

But abstract talk, like that in an earlier blog post by Warrior, of an "object-oriented design" with "dynamic policy management across physical and virtual resource pools" via a "deeply programmable" system for rapid application provisioning and placement sounds good on paper but does little to solve existing network challenges as the majority of workloads become virtualized and increasingly nomadic, migrating between physical systems.

However, the added management complexity that Warrior described appears to be the biggest downside of decoupling physical and virtual networks. As Greg Ferro wrote in "VMware NSX Caution Signs": "The level of internal change at organizations that would adopt VMware vCloud 5.5 (the management platform for VMware NSX) is not to be underestimated. For example, networking teams must have access to vCenter, security policies must be overhauled and reapproved, and server teams need to understand networking as part of their build practices. If IT infrastructure groups were unionized, there would be demarcation disputes, walkouts and management action plans." Of course, adopting Cisco's ACI vision would mean the same type of changes for the server team as Cisco tools replace vCloud or Microsoft System Center Virtual Machine Manager.

At this point in SDN's evolution, the modular approach of virtual network overlays on a programmable physical network fabric, taken by NSX and by vendors such as Embrane, Midokura and Nuage, offers the best balance of features, flexibility and ease of deployment on existing hardware, while allowing physical and virtual networks to evolve on independent technology cycles. Traditional network equipment vendors intent on owning the entire cloud hardware/software stack will resist, but vertical integration hasn't been a winning strategy since the mainframe era--a fact the age of SDN and virtual networks is unlikely to change.

