When MLAG Is Good Enough
March 07, 2011
Do you need multi-path Ethernet in the data center using TRILL, Shortest Path Bridging (SPB), Cisco's FabricPath or Brocade's VCS in order to maximize network efficiency and reduce congestion? Probably not, unless you manage a very big data center with servers and access ports running into the tens of thousands. If you have fewer than 5,000 access ports in your data center, you can probably avoid routing Ethernet frames and use your vendor's multi-chassis link aggregation (MLAG) product set to interconnect the access layer in an active/active design. MLAG works
Nearly every story, blog, or article about FCoE says that you have to be running TRILL, SPB, or something like it--yes, I have made that claim as well--so that you can have an efficient mesh network where traffic flows--unidirectional streams of packets--follow the same path through the network and arrive at the destination in the correct order. Unlike other protocols, Fibre Channel--aka SCSI in a frame--doesn't like data arriving out of order.
By using shortest path, we can ensure that latency is reduced, since fewer hops mean less delay, and if there is a change in the network topology, we can find the next best path. That kind of design assumes you have a mesh or somewhat meshed network where there are multiple paths through the network and no way to manage the flows. Lower-level protocols, the data center bridging protocols, will handle the lossless networking.
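To make the shortest-path idea concrete, here is a minimal sketch in Python. It is not how TRILL or SPB are implemented (both rely on a link-state protocol and a shortest-path computation over link costs); it simply treats every link as one hop and uses a breadth-first search. The switch names and the `shortest_path` helper are hypothetical, chosen only for illustration.

```python
from collections import deque

def shortest_path(links, src, dst):
    """Return the fewest-hop path from src to dst, or None if unreachable.

    links is an iterable of (switch_a, switch_b) pairs; every link is
    treated as bidirectional and as costing exactly one hop (an assumption).
    """
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    previous = {src: None}          # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None

# Hypothetical five-switch mesh with two paths from A to D.
mesh = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "E"), ("E", "D")]
best = shortest_path(mesh, "A", "D")

# Simulate a topology change: the B-D link fails.
degraded = [link for link in mesh if link != ("B", "D")]
next_best = shortest_path(degraded, "A", "D")
```

In this toy mesh the two-hop path through B is preferred, and when the B-D link is removed the search falls back to the three-hop path through C and E, which is the "next best path" behavior described above.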
All network vendors today support multi-chassis link aggregation, which is based on the IEEE 802.1AX-2008 Link Aggregation (LAG) standard (the original LAG standard was 802.3ad). LAG allows you to bond two or more physical links into a logical link between two switches or between a server and a switch. Since LAG introduces a loop in the network, Spanning Tree has to be disabled on the LAG ports. LAG doesn't double the capacity, as Ethan Banks points out in "The Scaling Limitations of Etherchannel -Or- Why 1+1 Does Not Equal 2," because LAG implementations place traffic on links based on flows, not packets. This is done so that packet order is preserved end to end and to remove the possibility of packet duplication--two design requirements of 802.1AX-2008. The algorithm that determines which flows go to a specific link should be designed well enough to evenly distribute the load across all available links, but there is no way to predetermine how long a flow will last or how big the frames will be.
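A small Python sketch can show why flow-based placement keeps packets in order but does not guarantee an even split. This is an illustration only: real switches hash on vendor-specific combinations of MAC, IP, and port fields in hardware, whereas the `pick_link` and `link_load` helpers below are hypothetical and simply apply CRC32 to the source and destination MAC addresses.

```python
import zlib

def pick_link(src_mac, dst_mac, num_links):
    """Choose a member link for a flow by hashing its addresses.

    The same (src, dst) pair always yields the same link, which is what
    preserves packet order within a flow. CRC32 is an assumed stand-in
    for whatever hash a given switch actually uses.
    """
    key = f"{src_mac}-{dst_mac}".encode()
    return zlib.crc32(key) % num_links

def link_load(flows, num_links):
    """Return a list of bytes carried per member link.

    flows is an iterable of (src_mac, dst_mac, total_bytes) tuples.
    """
    load = [0] * num_links
    for src, dst, size in flows:
        load[pick_link(src, dst, num_links)] += size
    return load

# One large backup flow plus a few small flows over a two-link LAG.
# The addresses and byte counts are made up for the example.
flows = [
    ("00:00:00:00:00:01", "00:00:00:00:00:aa", 900_000_000),  # elephant flow
    ("00:00:00:00:00:02", "00:00:00:00:00:aa", 1_000_000),
    ("00:00:00:00:00:03", "00:00:00:00:00:aa", 1_000_000),
    ("00:00:00:00:00:04", "00:00:00:00:00:aa", 1_000_000),
]
load = link_load(flows, num_links=2)
print(load)
```

However the small flows happen to hash, the link that carries the large flow ends up with at least 900 MB while the other link carries at most 3 MB. The hash cannot know in advance that one flow will dwarf the others, so two bonded links do not behave like one link of twice the speed.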
Standard LAG is only between two peers. Multi-chassis LAG is proprietary. Ivan Pepelnjak has a short rundown of MLAG and fabric features from Brocade, Cisco, HP, and Juniper. Suffice it to say that the MLAG features look nearly the same, but switches from multiple vendors won't interoperate with MLAG. What is interesting is that MLAG systems all share a common trait--no more than two core switches.