Will SDN Kill TRILL?
Posted by Ethan Banks, February 26, 2013
In recent years, a great deal of time has gone into developing standards, including TRILL, SPB and DCB, that enable layer 2 multipathing and improve Ethernet's capabilities in the virtualized data center. These standards promise to enable all links to forward traffic, eliminate loops, and create meshes to let traffic take the shortest path between switches.
Much digital ink has been spilled over use cases and the pros and cons of these approaches. Vendors have rolled out products, working groups have exchanged ideas, and engineers have tried to sort out just what is going on.
Now software defined networking (SDN) has leapt into the pool and caused lots of waves with its promise to upend traditional network design and operation. To paraphrase a question put to me on Twitter, is there a future for data center Ethernet standards such as TRILL in an SDN world? If so, how do the standards fit? If not, how will SDN recreate the promised functionality?
There's an assumption built into the question that must be explored. The assumption is that the networks of the world are going to be software defined. And by "software defined," I mean, at the least, a network whose forwarding behavior is programmed by a central controller with a holistic view of the network topology.
I believe it's possible that this is where networks are going, but even if I'm right, that transition is going to take years. SDN is in its early stages. Startups are coming out of stealth with their notion of what SDN is and what problems it can solve. Most SDN products target specific use cases and are definitely not one-size-fits-all. The OpenFlow specification, which is a core component of the centralized controller model of SDN, is running ahead of currently available silicon, which cannot yet perform all of the specification's potential matching operations in hardware.
Meanwhile, big vendors are feeling out the market, determining what customers actually want out of SDN, and developing products based around those requirements, while at the same time protecting their vested interests.
If SDN ubiquity is years away, is there a fit for the emergent data center Ethernet standards? Yes, clearly.
While SDN can offer layer 2 multipathing in the data center, is that why you should be shopping for SDN products? Not really. Customers can deploy Cisco Nexus switches with FabricPath, Brocade VDX switches with VCS, Juniper QFX switches with QFabric, or Avaya switches with VENA (to name a few approaches) and build a layer 2 data center topology that works like a layer 3 routing topology. Those fabric technologies are comparatively mature and scale effectively without too much design effort, though they do require a homogeneous data center network.
So why should you be shopping for SDN? Consider the following points.
--Centralized controllers learn about multiple paths through the network between a given source and destination, and can create multiple forwarding entries the network switches can use to deliver traffic across multiple paths.
--ECMP and MLAG creation is a relatively trivial task for a central controller with a holistic view of the network, as there is no great distinction to be made between physical devices.
--A centralized network view makes it possible to engineer end-to-end paths for data to follow based on policies defined by a network engineer, such as latency and hop count, and not merely on source and destination addresses.
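To make the first point concrete, here is a minimal sketch of how a controller with a full view of the topology might derive multipath forwarding entries. It is not any vendor's or any OpenFlow controller's actual implementation. The four-switch leaf-spine topology and the names `TOPOLOGY`, `hop_distances`, and `multipath_entries` are all hypothetical, chosen only for illustration.

```python
from collections import deque

# Hypothetical leaf-spine topology as the controller might see it:
# switch name -> list of directly connected switches.
TOPOLOGY = {
    "leaf1": ["spine1", "spine2"],
    "leaf2": ["spine1", "spine2"],
    "spine1": ["leaf1", "leaf2"],
    "spine2": ["leaf1", "leaf2"],
}


def hop_distances(topology, dst):
    """Breadth-first search from dst: hop count from each switch to dst."""
    dist = {dst: 0}
    queue = deque([dst])
    while queue:
        node = queue.popleft()
        for neighbor in topology[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def multipath_entries(topology, dst):
    """For each switch, list every neighbor that is one hop closer to dst.

    Traffic always moves strictly closer to the destination, so the
    resulting entries are loop-free without a distributed protocol.
    """
    dist = hop_distances(topology, dst)
    entries = {}
    for switch, neighbors in topology.items():
        if switch == dst or switch not in dist:
            continue
        entries[switch] = sorted(
            n for n in neighbors if dist.get(n) == dist[switch] - 1
        )
    return entries


entries = multipath_entries(TOPOLOGY, "leaf2")
for switch, next_hops in sorted(entries.items()):
    print(switch, "->", next_hops)
```

In this toy topology, leaf1 ends up with two equal-cost next hops toward leaf2, one through each spine. A real controller would then have to translate such entries into whatever forwarding constructs the switch hardware supports.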
In other words, because a central controller sees the network as a whole, there's no need for distributed protocols to determine a loop-free, best-path topology. An individual switch no longer has to figure out for itself how to get to a remote destination; the switch is told how to forward by the controller. "Best path" can mean whatever a network designer wants it to mean, and not what a group of protocol designers decided it meant in an RFC.
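The idea that "best path" can mean whatever the designer wants can be illustrated with a short sketch in which the path computation stays the same and only the cost policy changes. The link latencies, switch names, and the `POLICIES` table below are invented for the example, and plain Dijkstra stands in for whatever path computation a real controller would use.

```python
import heapq

# Hypothetical links: (switch_a, switch_b) -> latency in microseconds.
LINKS = {
    ("a", "b"): 50,
    ("b", "d"): 50,
    ("a", "c"): 10,
    ("c", "e"): 10,
    ("e", "d"): 10,
}


def build_graph(links):
    """Turn the link table into a bidirectional adjacency list."""
    graph = {}
    for (u, v), latency in links.items():
        graph.setdefault(u, []).append((v, latency))
        graph.setdefault(v, []).append((u, latency))
    return graph


def best_path(graph, src, dst, cost):
    """Dijkstra's algorithm; cost(latency) defines the per-link cost."""
    heap = [(0, src, [src])]
    visited = set()
    while heap:
        total, node, path = heapq.heappop(heap)
        if node == dst:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, latency in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(
                    heap, (total + cost(latency), neighbor, path + [neighbor])
                )
    return None  # dst is unreachable from src


# Each policy is just a different definition of link cost.
POLICIES = {
    "hop_count": lambda latency: 1,
    "latency": lambda latency: latency,
}

graph = build_graph(LINKS)
fewest_hops = best_path(graph, "a", "d", POLICIES["hop_count"])
lowest_latency = best_path(graph, "a", "d", POLICIES["latency"])
print("hop_count policy:", fewest_hops)
print("latency policy:  ", lowest_latency)
```

With these made-up numbers, the hop-count policy picks the two-hop path through b, while the latency policy picks the three-hop path through c and e because its total latency is lower. Swapping the policy changes the answer without touching any switch-local protocol logic.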
Does that mean data centers can get rid of TRILL and SPB? Think in terms of domain to answer that question. How many switches are under the central controller? What sorts of forwarding controls are required? How much of that functionality is available in silicon, and what impact does that have on switching performance? What mechanism is used to connect to other switching domains? Data center SDN doesn't have answers to all of these questions yet, and so it's hard to predict the long-term role for TRILL and SPB in the data center.
That said, in my opinion TRILL and SPB have an immediate role to play in the data center. These are well-documented technologies with reference architectures you can buy today with support in silicon from many vendors. Assuming a service life of 5 to 7 years for network gear, TRILL and SPB are positioned to serve you well for the duration. However, the long view of L2 multipathing is that as OpenFlow performance improves in lockstep with OpenFlow-capable silicon, forwarding techniques using OpenFlow could conceivably displace TRILL and SPB as L2 multipathing technologies.
Ethan Banks, CCIE #20655, is a hands-on networking practitioner who has designed, built and maintained networks for higher education, state government, financial institutions, and technology corporations.
Comments:
2013-02-27T15:36:45
Very true. My only worry with startups is whether they can credibly support a product long term, with the quality and backing that a production-scale cloud network requires. I would love to go with startups; the issue is at layer 8 and above, tied up with large vendor relationships =)
2013-02-27T15:05:20
Great article, and thanks for sharing your thoughts, Ethan. I agree overall, particularly on the value today of robust fabric elements and capabilities. OpenFlow will probably evolve into numerous specialized variants, some called OpenFlow and some called other things, but in many cases software control will rely on being able to manipulate robust capabilities that exist in hardware.
That being said, I personally would tread cautiously with things like TRILL and SPB. The way things were headed, many organizations were in line to adopt them not because of a specific need but because they were the next thing on the vendor roadmaps. A few years ago, it seemed everyone in networking thought they were on the path to replacing their FC SANs outright. Today, some organizations have a specific, compelling need for TRILL and business models that can support the training and overhead, since not many engineers in the field have experience with TRILL at all, let alone at any scale.
I would think that for a majority of average-size companies, multichassis LAG technologies can accommodate L2 domains and multipathing, and, especially when paired with a Clos architecture, can well exceed their needs unless they have to do a big multi-hop FCoE deployment (which seems to grow less attractive by the minute). For many if not most of these average deployments, I would think TRILL would not make sense, especially considering that for most it means new hardware and new training to support something new, just as the market will start to see more robust SDN deployments from major vendors within the year. Sure, they won't be fully mature, but frankly neither is TRILL. TRILL is somewhat more mature, but nowhere near the point where there is strong expertise among the network engineering community at large. That alone takes years, not to mention the many vendor interoperability issues that still exist with TRILL, the many untested use cases, and so on. If SDN had never happened and TRILL were in full swing, the TRILL roadmap would still be full of things it needs in order to deliver the ultimate vision people hoped to attain with the technology. As the major vendor offerings are released and gain momentum, that doesn't make TRILL irrelevant, but it does shrink the relevant market and could reduce the momentum to the point that the still-looming issues never get resolved.
While I don't hold high expectations for the future of TRILL, SDN is still immature and new variants will emerge. Who knows, at the end of the day SDN controllers may end up relying on TRILL as an underlying capability in the fabric. I guess time will tell.
Thanks for sharing and great article!
2013-02-27T00:37:42
As far as executing first, I think I have that answer: the SDN startups. Certainly with multitenancy, because there are entrants there. Startups have little to lose. Identify a need and a few keenly interested potential customers, then fix the problem as quickly as possible. Find more use cases. Grow the product capabilities. Build intellectual property value and a customer base. Raise to a satisfactory capital level. Exit.
The big vendors are slower because they are trying to meet the needs of an existing customer base while at the same time not disrupting their current business model or jeopardizing existing products. Riskier. Harder to innovate with entrenched processes and accountants analyzing R&D budgets and making guesses at market direction. So they end up with a message that's meant to meet market expectations, assure the faithful, yet preserve their current way of life. And to be fair, I admire some of the larger vendor approaches for their combination of taking on SDN while at the same time keeping inside their comfort zone. Clever.
2013-02-27T00:26:49
RE: "productizing," I still want my "network in a box," at least from an enterprise perspective. I still believe there's a use case there that I tried to articulate several months back. I need to think that through again now that time has gone by and SDN product announcements are all the rage in SV.
2013-02-27T00:24:13
Marten, RE: "non OpenFlow ways to drive..."
IIRC, your company's product and many others only care about OF tangentially. While I didn't get into it here, that's yet another interesting long-view question I know SDN consumers will have to sort out: just what will the market's commitment to OF be? I'm waiting to see what rumored new merchant silicon brings, how well it maps to OF1.3, and then how vendors translate that into capabilities and performance.
It seems possible that OF specifically might not matter much in the long run, although SDN as an approach clearly will.
2013-02-26T23:17:26
Ethan, very nicely written article.
The underlying L2 multipathing capabilities in today's commercial silicon are fairly limited, and both TRILL and SPB use a variety of (similar) tricks to enable their network multipathing. Where OpenFlow is just a layer on top of these capabilities, the same tricks can be used. And of course there are non-OpenFlow ways to drive the capabilities of network hardware through SDN.
The key, as you note, is centralized control and visibility. Your "In other words..." paragraph is spot on, even beyond path calculations and forwarding...
2013-02-26T20:50:58
Hi Ethan,
I really like your post. Finally, a realistic view of what SDN is today and what it might become in the future. SDN does have huge potential. After all, theoretically, with SDN and proper hardware/software support, your network can become as smart as your programming skills allow, rather than being "limited" to what different CLIs and protocol interops allow you to do. I agree with you that productizing SDN for general consumption, the way it looks today, is challenging except for some low-hanging-fruit use cases, while TRILL-based technologies and SPB have a proven data center deployment record and methodology.
Thank you,
David
@DavidKlebanov
2013-02-26T18:15:47
Great post!
I think SDN will bring great innovations, not only from a networking perspective but also in unified fabric, multitenancy, converged networks, QoS, etc. Great innovation, fascinating ideas, and phenomenal challenges lie ahead. What will be interesting to see is which vendor executes comprehensively first!