Editor's note: VMware's Martin Casado and Bruce Davie team up to comment on the benefits of one aspect of software-defined networking: overlaying a physical network with a virtual net. The authors didn't necessarily intend it that way, but this piece in large measure answers the questions raised by Padmasree Warrior, CTO of Cisco, in her Aug. 28 blog, Limitations of a Software-Only Approach to Data Center Networking.
With the recent launch of the VMware NSX network virtualization platform, there has been a surge of interest in network virtualization technologies. A common technique across many network virtualization solutions is the use of some sort of "overlay tunnel" approach, such as Virtual Extensible LAN (VXLAN, backed by VMware, Cisco and others), Network Virtualization using Generic Routing Encapsulation (NVGRE, backed by Microsoft), or Stateless Transport Tunneling (STT, proposed as an IETF Internet-Draft). Overlays provide a means to encapsulate traffic traversing virtual networks so that the physical network is only responsible for forwarding packets from edge to edge, using the outer header. (An earlier VMware post on overlay tunnels is here.)
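To make the encapsulation idea concrete, here is a minimal sketch of the 8-byte VXLAN header as laid out in the VXLAN specification (RFC 7348): a flags byte, 24 reserved bits, a 24-bit virtual network identifier (VNI) and 8 more reserved bits. The function names are our own illustration, not part of any product API, and the sketch deliberately omits the outer Ethernet, IP and UDP headers that the sending hypervisor adds and that the physical network actually forwards on.

```python
import struct
from typing import Tuple

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field carries a valid value


def vxlan_encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prefix an inner Ethernet frame with a VXLAN header (illustrative helper)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, 24 reserved bits.
    # Word 2: VNI in the top 24 bits, 8 reserved bits.
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decapsulate(payload: bytes) -> Tuple[int, bytes]:
    """Return (vni, inner_frame) from a VXLAN payload (illustrative helper)."""
    if len(payload) < 8:
        raise ValueError("payload shorter than a VXLAN header")
    flags_word, vni_word = struct.unpack("!II", payload[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8, payload[8:]
```

The point of the sketch is that the virtual network's identity travels inside the payload; the underlay never has to parse it in order to forward the packet between hypervisors.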
One question that came up following the NSX launch was around the impact of overlay technologies on network visibility, so we'll address that question here.
In our experience, a well-designed network virtualization solution can actually solve visibility issues by providing an unprecedented ability to monitor and troubleshoot virtual networks. We discuss below some of the monitoring and troubleshooting tools that can be (and have been) provided in an overlay-based network virtualization platform. These tools enable an operator to determine which problems are in the overlay versus the underlay, and to diagnose and rectify problems in either layer by viewing the underlay and the overlay in a unified manner.
As soon as you start virtualizing servers, you face the problem of how to troubleshoot connectivity and performance problems between VMs. This problem doesn't arise due to network virtualization -- it's there as soon as you start communicating between virtual machines. If there is a connectivity problem, you'll need to figure out if it's in the vSwitches, the physical network, or some intermediate device that might be intercepting traffic such as a firewall. What a network virtualization overlay provides is a mechanism to decompose the problem into constituent parts, which can then be diagnosed separately.
Consider the VMware NSX platform. It provides the ability to create, manage and monitor virtual networks from a central API. If you want to see what's up with the connectivity between a pair of VMs, for example, you (or a software tool) can issue an API request to query the status of the virtual network that is supposed to be connecting those VMs.
This request returns byte and packet counters on each of the virtual ports. Another API allows you to inject synthetic traffic as if it came from a given VM and see what happens to it. Did it get dropped due to an access control list (ACL) rule in the virtual network that was too restrictive? Did it fail to traverse the physical network linking the hypervisors? Either you pinpoint the problem in the virtual network, or you identify the physical path that's causing the problem.
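The decision logic behind that kind of synthetic-traffic test can be sketched in a few lines. The `ProbeResult` record and its fields below are hypothetical, chosen only to illustrate the reasoning; they are not the NSX API or its response format.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    DELIVERED = "delivered"
    OVERLAY = "dropped in the virtual network"
    UNDERLAY = "lost on the physical path between hypervisors"


@dataclass
class ProbeResult:
    """Hypothetical summary of one injected probe packet."""
    entered_tunnel: bool  # source hypervisor sent it into the overlay tunnel
    exited_tunnel: bool   # destination hypervisor received it from the tunnel
    delivered: bool       # it reached the destination virtual port


def localize(result: ProbeResult) -> Verdict:
    """Decide whether a probe failure belongs to the overlay or the underlay."""
    if result.delivered:
        return Verdict.DELIVERED
    if result.entered_tunnel and not result.exited_tunnel:
        # The packet left one hypervisor and never arrived at the other.
        return Verdict.UNDERLAY
    # Dropped before entering or after leaving the tunnel,
    # for example by an ACL rule in the virtual network.
    return Verdict.OVERLAY
```

For example, a probe that is stopped by an ACL at the source never enters the tunnel, so `localize(ProbeResult(False, False, False))` returns `Verdict.OVERLAY`, while a probe that enters the tunnel but never exits it points at the physical path.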
Since we continuously check the health of overlay tunnels by sending health check packets between the hypervisors, we can often detect problems in the physical network before they affect VM-to-VM connectivity. Because the virtualization controller knows which virtual networks depend on a given tunnel, it can identify which virtual networks are affected by a physical network fault.
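The tunnel-to-virtual-network mapping described above amounts to a simple dependency index. The sketch below is a simplified illustration, not the controller's actual implementation: it assumes each virtual network uses a full mesh of tunnels between the hypervisors that host its VMs, and all names and data shapes are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

Tunnel = Tuple[str, str]  # pair of hypervisor names, stored in sorted order


def build_dependencies(placements: Dict[str, Set[str]]) -> Dict[Tunnel, Set[str]]:
    """Map each tunnel to the virtual networks that rely on it.

    `placements` maps a virtual network name to the hypervisors hosting its VMs.
    """
    deps: Dict[Tunnel, Set[str]] = defaultdict(set)
    for vnet, hosts in placements.items():
        # combinations() over a sorted list yields each pair once, in sorted order.
        for pair in combinations(sorted(hosts), 2):
            deps[pair].add(vnet)
    return deps


def affected_networks(deps: Dict[Tunnel, Set[str]],
                      failed_tunnels: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return every virtual network that depends on at least one failed tunnel."""
    affected: Set[str] = set()
    for a, b in failed_tunnels:
        key = (a, b) if a <= b else (b, a)  # normalize endpoint order
        affected |= deps.get(key, set())
    return affected
```

With placements such as `{"web": {"hv1", "hv2"}, "db": {"hv2", "hv3"}}`, a failed health check on the tunnel between `hv1` and `hv2` would flag only the `web` network, which is the kind of scoping that lets an operator tell which tenants a physical fault touches.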