
Why I Like Juniper's QFabric (And A Mea Culpa)


Mike Fratto

February 1, 2012


While I was visiting Juniper in early December, I got a chance to sit down with the QFabric folks to discuss some of the issues with QFabric and what I saw as a proprietary product set (with all the badness that word implies) in search of a reason. While QFabric is proprietary because of how the components are interconnected, I came away with the impression that the overall design and capacity look extremely powerful, and I think the upsides of the QFabric product set far outweigh the downsides. Given a month's time between visiting Juniper and now, I'd say that all my ballyhoo about being proprietary was a non-issue. My bad.

Juniper's QFabric, in a nutshell, distributes the traditional chassis switch into discrete components. The top-of-rack (ToR) switches, called QFNodes, are the line cards. The QFInterconnect, which the QFNodes are connected to via OM-4 or OM-5 fiber, is the backplane, and the QFDirectors are the supervisors (in Cisco parlance), or managers. Each QFNode is connected to between two and four QFInterconnects via 40-Gbit links, and there are two QFDirectors that are connected to the QFNodes and QFInterconnects via an out-of-band 1-Gbit link.

Greg Ferro, who does network design and consultation for large organizations and also contributes to Network Computing, has written a nice explanation of QFabric and explains some benefits.

Here's why I like it. It's operationally simple. The distributed chassis metaphor is apt and means that multi-switch management is greatly simplified. You can manage up to 128 switches as if they were a single switch, which for all intents and purposes, they are. Think about that for a moment. You don't have to maintain credentials across 128 switches or authentication configuration if you are using RADIUS or some other authentication server.

You don't have to integrate 128 devices into your network management system (NMS), hypervisor management system or other IT systems. Even with scripting or an NMS, making sweeping changes to 128 individual switches in a network is dicey. Granted, you can aggregate multidevice management to simulate a single pane of glass, but that means introducing more servers and management protocols that can get in the way or break down. As the number of things you need to manage grows, the simpler your management framework needs to be.

Traffic-wise, you don't have to worry about multiple paths, spanning tree, building N-tiers, or deciding where to set up routing, since QFabric also routes (although Juniper is quick to point out that you likely wouldn't replace your edge or core router with a QFabric, just like you wouldn't replace them with a 1U ToR L2/L3 switch). Any two points in the QFabric are a mere 5 microseconds apart. Unless your company requires ultra-low latency, anything below 1 millisecond (typically, the granularity at which latency is measured and reported in enterprise switches) is probably fine. But, hey, less is better in any case. If you need more capacity at the edge, you can add additional switches fairly cost effectively, as Ferro points out.

Bear in mind that, currently, each QFNode 3500 can be oversubscribed at 3-to-1, based on 48 10-Gbit ports facing the access devices and four 40-Gbit uplink ports facing the QFInterconnects: 480 Gbits inbound going into 160 Gbits of uplink makes 3-to-1. However, engineers at Juniper said the limitation today is the interface speed of the uplink ports. There is no limitation in the QFInterconnect, so speeds can increase in the future provided Juniper ships QFInterconnect cards and QFNodes that support higher capacities.

What gets interesting with QFabric is the migration path to and from QFabric, and how QFabric can fit into the data center. In a fit of whiteboard craziness, we mapped out some scenarios. A couple of things become clear:

To the rest of the network, QFabric is just an L2/L3 switch. It's one bridge in a spanning tree, and outside QFabric, it's just Ethernet. That means you can plug a QFabric into the rest of your network and it will be loop-free.
All the rest of your L2/L3 network will behave just fine, and you can run any other network equipment, like a Cisco Nexus, side by side with it.
Any requirements such as reaching hosts defined by routes on an external router or passing traffic through a load balancer mean traffic may have to pass out of QFabric and back in.

If you have already invested in Juniper's QF 3500s (the EX line is not supported) and you want to migrate to QFabric, you need a QFInterconnect and a QFDirector, although Juniper recommends pairs for redundancy. You can cable to your existing QF 3500s and they become part of the QFabric. Take them out of the QFabric, and they become L2/L3 switches again. Pretty nice investment protection.

I like it. QFabric is a fairly simple design, and simple is good. No need to worry about multipath Ethernet protocols like TRILL, SPB, LAG or MLAG. It only scales to 6,144 10-Gbit ports with oversubscription, or 2,048 if you want non-blocking (that's 16 10-Gbit ports per QFNode). If you dual-home your servers, that's only 3,072 servers. I say "only" tongue in cheek. That's a lot of servers for most organizations, and I will go out on a limb and assume that if you're looking at that kind of scale, it's either a special-purpose computing center or a hosting or cloud provider.
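For anyone who wants to check the math, here is a rough Python sketch of the port and oversubscription arithmetic quoted in this column. The constants come straight from the figures above (128 QFNodes, 48 10-Gbit access ports and four 40-Gbit uplinks per node); the function and variable names are my own illustrative choices, not Juniper terminology, and this is not a sizing tool.

```python
from fractions import Fraction

# Figures quoted in the article; the names are illustrative assumptions.
MAX_NODES = 128              # QFNodes managed as a single QFabric
ACCESS_PORTS_PER_NODE = 48   # 10-Gbit ports facing the access devices
ACCESS_PORT_GBIT = 10
UPLINKS_PER_NODE = 4         # uplinks facing the QFInterconnects
UPLINK_GBIT = 40


def uplink_capacity_gbit() -> int:
    """Total uplink bandwidth of one QFNode, in Gbit/s."""
    return UPLINKS_PER_NODE * UPLINK_GBIT


def oversubscription(access_ports: int) -> Fraction:
    """Access bandwidth divided by uplink bandwidth for one QFNode."""
    return Fraction(access_ports * ACCESS_PORT_GBIT, uplink_capacity_gbit())


def non_blocking_ports_per_node() -> int:
    """Most access ports a node can use without exceeding its uplinks."""
    return uplink_capacity_gbit() // ACCESS_PORT_GBIT


def fabric_ports(ports_per_node: int, nodes: int = MAX_NODES) -> int:
    """Access ports across the whole fabric."""
    return ports_per_node * nodes


if __name__ == "__main__":
    ratio = oversubscription(ACCESS_PORTS_PER_NODE)
    print(f"Oversubscription: {ratio.numerator}:{ratio.denominator}")
    print(f"Oversubscribed ports: {fabric_ports(ACCESS_PORTS_PER_NODE)}")
    nb = non_blocking_ports_per_node()
    print(f"Non-blocking ports per node: {nb}")
    print(f"Non-blocking ports in fabric: {fabric_ports(nb)}")
    # Dual-homed servers consume two access ports each.
    print(f"Dual-homed servers: {fabric_ports(ACCESS_PORTS_PER_NODE) // 2}")
```

Changing `UPLINK_GBIT` or `UPLINKS_PER_NODE` shows how the non-blocking port count would move if Juniper ships faster uplinks, which is the limitation its engineers pointed to.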

The other elephant in the room is cost. That's a topic I will take up later, along with digging a little deeper into the design scaling issues. Of course, there are a number of other things to consider, like the distance limitations of the OM-4 cable, cable layout and designing the L2/L3 network within QFabric. But if you are looking at upgrading from a 1-Gbit to a 10-Gbit network and you want to take advantage of the new features that network fabrics such as Brocade's VCS, Cisco's FabricPath and Juniper's QFabric offer, it's worth a long hard look. And I bet the proprietary features will be less important the deeper you look.

Disclosure. I traveled to Sunnyvale on my company's dime. Juniper fed me a hamburger, chips and a soda, and gave me a pen.