Special Coverage Series

Network Computing

Commentary

Greg Ferro

VMware's SDN Dilemma: VXLAN or Nicira?

VMware has invested in two overlay network approaches: VXLAN, a draft standard originally conceived by Cisco, and STT, drafted by SDN startup Nicira, which VMware acquired for more than a billion dollars. Which will VMware choose? Here’s my take.

VMware has a technology problem: It's backing two competing standards for overlay networks, Nicira's STT and the IETF draft standard VXLAN. An overlay network enables network virtualization, which is a core component of VMware's software-defined data center initiative. Both STT and VXLAN have upsides and downsides. I'll look at each protocol and speculate on which direction VMware may go.

First, a little background. Before being acquired by VMware, Nicira developed the Stateless Transport Tunneling (STT) protocol for tunneling between open source software switches in the Open vSwitch project.

VXLAN, which is now an IETF draft standard, was originally proposed by Cisco. Cisco sources say that the company then got VMware involved (although the IETF draft has a lot of names on it). The end result is that VMware is telling everyone it has this great VXLAN overlay network technology that removes any hypervisor dependency from physical network devices. Even better, it's configured and managed from vCenter.

The question is, which protocol will win?

Nicira and STT

Prior to acquisition, Nicira had a software controller for managing tunnels between virtual switches, and used OpenFlow-like commands to configure the vSwitch. STT is a tunneling protocol that connects the virtual switches, thus forming a virtual network.

STT performs this task well enough. It encapsulates frames behind a TCP-like header, without TCP's connection state (hence "stateless"). Supposedly, this lets operating systems use the TCP segmentation offload function of modern network adapters for better performance.

However, STT also has several limitations. First, the limited entropy in the STT header means it doesn't balance loads evenly over Ethernet port bundles in network backbones. Depending on your network design, this could be a significant limitation.
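
To make the load-balancing point concrete, here is a minimal Python sketch of how a port bundle might pick a member link from a header field. The modulo "hash", the names `pick_link` and `spread`, and the sample values are all assumptions for illustration. Real switches hash several fields with vendor-specific functions, and this does not model STT's actual header layout.

```python
from collections import Counter


def pick_link(entropy_value: int, n_links: int) -> int:
    """Toy stand-in for a port-bundle hash: header value modulo link count."""
    return entropy_value % n_links


def spread(entropy_values, n_links):
    """Count how many flows land on each member link of the bundle."""
    counts = Counter(pick_link(v, n_links) for v in entropy_values)
    return [counts.get(link, 0) for link in range(n_links)]


flows = 1000
low_entropy = [7] * flows          # every flow presents the same header value
high_entropy = list(range(flows))  # every flow presents a different value

print(spread(low_entropy, 4))   # prints [0, 0, 0, 1000]
print(spread(high_entropy, 4))  # prints [250, 250, 250, 250]
```

With little variation in the hashed field, every flow lands on one link of a four-link bundle while the other three sit idle.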

Second, STT currently works only with the Open vSwitch software switch on Linux hypervisors such as Xen or KVM. That's not necessarily a problem for cloud providers and very large organizations; for instance, eBay is using Nicira in its OpenStack deployment. However, VMware is more common in enterprise data centers. It's possible VMware could add STT to the ESXi vSwitch, and thus deliver a multicloud network overlay strategy, but the VXLAN protocol already has a lot of momentum.

VXLAN's Multicast Issues

VXLAN depends heavily on a multicast-enabled underlay network to handle broadcast, unknown unicast and multicast Ethernet traffic. (I use the term "underlay network" to describe the physical devices that pass Ethernet frames and IP packets.) What's not well understood is that IP multicast is complex and risky to operate.

Each VXLAN-enabled device is known as a VXLAN Tunnel End Point (VTEP). When a VTEP is configured with VXLANs, it is also configured to join an IP multicast group. Joining the multicast tree lets VTEPs discover the MAC address of each host in the VXLAN autonomously, without manual configuration. Direct server-to-server data flows are transported through the VXLAN overlay in unicast packets.

IP multicast also provides an efficient way to broadcast Ethernet frames to all servers as is required--for example, for unknown MAC address flooding and IP ARP Requests.
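
The flood-and-learn behavior described above can be sketched roughly as follows. This is a toy model, not VMware's or Cisco's implementation: the `Vtep` class, its method names, and all addresses are hypothetical, and real VTEPs also handle timers, segment IDs and the encapsulation itself.

```python
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


class Vtep:
    """Toy VTEP that learns which remote VTEP sits in front of each MAC."""

    def __init__(self, ip, multicast_group):
        self.ip = ip
        self.multicast_group = multicast_group
        self.mac_table = {}  # inner MAC address -> remote VTEP IP address

    def learn(self, inner_src_mac, outer_src_ip):
        """Record the sender's VTEP when an encapsulated frame arrives."""
        self.mac_table[inner_src_mac] = outer_src_ip

    def outer_destination(self, inner_dst_mac):
        """Return the IP address an encapsulated frame would be sent to."""
        if inner_dst_mac == BROADCAST_MAC:
            return self.multicast_group  # broadcasts (e.g. ARP) always flood
        # Unknown MACs fall back to the multicast group; known ones go unicast.
        return self.mac_table.get(inner_dst_mac, self.multicast_group)


vtep_a = Vtep(ip="10.0.0.1", multicast_group="239.1.1.1")
host_b = "00:50:56:00:00:0b"

first = vtep_a.outer_destination(host_b)   # unknown MAC: flood to the group
vtep_a.learn(host_b, "10.0.0.2")           # a frame from host_b arrives
second = vtep_a.outer_destination(host_b)  # known MAC: unicast to its VTEP
print(first, second)  # prints 239.1.1.1 10.0.0.2
```

Only the first frame toward an unknown host needs the multicast tree; once the MAC is learned, traffic between the two servers travels as unicast.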

VMware recommends a separate multicast group for each VXLAN; thus, 50 VXLANs would require 50 separate multicast trees in an attempt to control L2 Ethernet flooding problems. L2 loops remain a problem in VXLAN networks, but the failure domain is reduced to an individual VXLAN itself. The problem is that each of those multicast trees requires state to be held in the network layer, which consumes CPU, memory and TCAM space. TCAM size is a serious limitation on network diameter, and an overloaded TCAM is a serious network threat.
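
As a rough illustration of the one-group-per-VXLAN recommendation, the Python sketch below assigns each of 50 VXLANs its own multicast group and counts the trees the underlay must track. The base address `239.1.0.0`, the segment IDs starting at 5000, and the function name `group_for_vxlan` are assumptions chosen for the example, not values from VMware documentation.

```python
import ipaddress

# Assumed base address inside the administratively scoped multicast range.
BASE_GROUP = ipaddress.IPv4Address("239.1.0.0")


def group_for_vxlan(index: int) -> ipaddress.IPv4Address:
    """Give the index-th VXLAN its own multicast group address."""
    return BASE_GROUP + index


vxlan_ids = range(5000, 5050)  # 50 hypothetical VXLAN segment IDs
mapping = {vni: group_for_vxlan(i) for i, vni in enumerate(vxlan_ids)}

# Each distinct group is a separate tree whose state the network must hold.
trees = len(set(mapping.values()))
print(trees)          # prints 50
print(mapping[5049])  # prints 239.1.0.49
```

The tree count grows linearly with the number of VXLANs, which is where the CPU, memory and TCAM pressure comes from.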

A lesser performance problem is the frame replication silicon in the switches. At its core, multicast is a method for duplicating Ethernet frames inside the hardware of your network. One multicast frame must be sent out of every Ethernet port that needs to receive it. On a data center core switch, this could mean replicating one received frame to 300 ports (thus, 1 Gbps of inbound multicast packets results in 300 Gbps output). Network switches require dedicated silicon to handle the duplication process. For example, this is an approximation of silicon pathways inside a single M1-series line card from a Nexus 7000 showing the replication engines on the blade:

Internal Architecture of Single Line Card Nexus 7000. Source: Cisco Systems
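
The replication arithmetic mentioned above (one inbound frame copied to every port that needs it) works out as in this small Python sketch. The function name is hypothetical, and the model assumes every receiving port gets a full copy of the inbound stream.

```python
def replicated_output_gbps(inbound_gbps: float, receiving_ports: int) -> float:
    """Total egress bandwidth when each receiving port gets a full copy."""
    return inbound_gbps * receiving_ports


# The core-switch case from the text: 1 Gbps in, 300 receiving ports.
print(replicated_output_gbps(1, 300))  # prints 300
```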

There are a number of IP multicast routing protocols that maintain the multicast trees, including PIM-SM, PIM-DM, BiDir and ASM multicast. In general terms, PIM-SM will be the default choice because it's got the widest vendor support, but that isn't saying much. Most data center switches do not support multicast protocols today. This can make VXLAN hard to deploy in existing networks and usually requires new network hardware.
