• 03/28/2013
    4:39 PM

WAN Optimization Part 1: TCP Limitations

WAN optimization and application acceleration technologies help get around performance limitations in TCP. Here’s how the numbers add up.
While WAN optimization and data acceleration have existed for years, there are misconceptions around the underlying premise, the technology, and the differences between the various offerings. Understanding each of these points is important because many major IT initiatives, such as hybrid clouds, data replication for active/active data center configurations and improved disaster recovery, and data center interconnect (DCI), place even greater emphasis on the WAN.

To help clarify the issues, I'm going to look at data acceleration technologies in a series of blog posts. I'll examine the problem and the approaches to improving WAN performance. I'll also examine some of the deployment issues and options around this technology, such as its role within a software-defined data center, the choice between appliances and services, and the value of virtual vs. physical implementations.

WAN optimization and application acceleration products will continue to speed up the movement of data over WAN connections between locations, but increasingly, the technology is being adapted to address specific business challenges, namely:

• Workload acceleration, which addresses the challenge of accelerating discrete workloads in a virtual machine.

• Storage acceleration, which looks at the challenges of accelerating storage access from branch offices.

• Replication acceleration, which improves the movement of data from data centers to disaster recovery locations and between data centers.

We'll address the importance of each of these niches in the blogs ahead.

The Basic Problem

Regardless of the network, throughput is constrained by three factors – the amount of available bandwidth, the time needed to move the data (delay), and the quality of the link (packet loss). Whether the network stretches across a room or a continent, all three factors are present (at least, theoretically; loss and latency are often irrelevant in office networks).

But when the network stretches coast-to-coast, all three factors play major roles in determining throughput, and the more bandwidth involved the worse the problem becomes. Consider this: a 10Mbps connection between New York and Los Angeles can theoretically use 75% of the bandwidth (7.5 Mbps), but increasing that bandwidth to 100Mbps still only yields 7.5 Mbps of throughput, despite a ten-fold increase in bandwidth.

One way to understand why this is the case is to resort to the old highway analogy. Suppose a family of five wants to travel from New York to Washington, DC, which is about a four-hour ride in their five-seat sedan. Double their speed and the time drops in half, right? But what happens if there are eight people who need to make the same trip in the same car? It would take three times as long to complete the task: four hours one way, four hours back, and four hours to take the remaining individuals to Washington. And should the car break down on any of those trips, well, the trip will take a lot longer.

Computer networks work the same way. The delay in transporting data and the loss of packets have a significant impact on the amount of data that can be transported across the network. When packet loss is non-zero, the Mathis equation formalizes this as:

Maximum Throughput = (MSS / RTT) * (1 / SQRT(Packet Loss))

Let's look more closely at these points.

Maximum Segment Size (MSS) is the maximum amount of TCP data contained within an IP packet. Typically, stations will use an MSS of 1460 bytes, which is the 1500-byte maximum payload of an Ethernet frame (the Maximum Transmission Unit) minus 40 bytes for TCP and IP header information.

Round trip time (as opposed to just one-way latency) is important because it reflects the delay it takes to send a request and receive an acknowledgment before sending the next set of data. Sort of like those additional trips the car needed to deliver the remaining passengers.

Packet loss is the percentage of packets lost on the network. Each time a packet is lost, TCP backs off, waits, and retransmits the data, decreasing the amount of data that can be sent within a given window of time. Packet loss rates vary on today's networks; 0.1 percent is common for MPLS networks within North America, and 1 percent is very common for Internet VPNs within North America.
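To get a feel for how much the loss rate alone matters, here is a minimal Python sketch. It assumes throughput scales with 1/SQRT(packet loss), as in the Mathis calculation used in this post, and compares the two loss rates mentioned above. The function name loss_penalty is just an illustrative choice.

```python
import math

def loss_penalty(loss_from: float, loss_to: float) -> float:
    """Factor by which the loss-limited throughput ceiling shrinks
    when packet loss rises from loss_from to loss_to.

    Assumes throughput is proportional to 1 / sqrt(loss), with
    MSS and RTT held constant.
    """
    return math.sqrt(loss_to / loss_from)

mpls_loss = 0.001  # 0.1 percent, the MPLS figure from the text
vpn_loss = 0.01    # 1 percent, the Internet VPN figure from the text

factor = loss_penalty(mpls_loss, vpn_loss)
print(f"Ceiling shrinks by a factor of about {factor:.2f}")
```

Under these assumptions, a tenfold increase in loss cuts the ceiling by roughly the square root of ten, not by ten.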

Let's return to our initial example of a 10Mbps connection. Our maximum segment size is 1460 bytes, or 11,680 bits. Packet loss is assumed to be 0.1 percent, and RTT between New York and L.A. runs about 49ms, or 0.049 seconds, so we have:

Maximum Throughput = (11680 / 0.049) * (1 / SQRT(0.001)) = 7.5 Mbps
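For readers who want to plug in their own numbers, the same back-of-the-napkin arithmetic can be sketched in a few lines of Python. The function name mathis_throughput_bps is my own label, not part of any library, and the inputs are the figures used above.

```python
import math

def mathis_throughput_bps(mss_bytes: int, rtt_seconds: float, loss_rate: float) -> float:
    """Loss-limited TCP throughput ceiling in bits per second:
    (MSS / RTT) * (1 / sqrt(loss)), with MSS converted to bits."""
    mss_bits = mss_bytes * 8
    return (mss_bits / rtt_seconds) * (1 / math.sqrt(loss_rate))

# Figures from the New York to Los Angeles example
throughput = mathis_throughput_bps(mss_bytes=1460, rtt_seconds=0.049, loss_rate=0.001)
print(f"{throughput / 1e6:.1f} Mbps")
```

Note that the loss rate goes in as a fraction (0.001), not as a percentage.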

If we increase the line bandwidth tenfold, we're still limited by TCP's constraints, RTT, and packet loss. The result? The performance improvement is marginal at best.
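The comparison can be sketched in Python as well. This is a simplification of my own: it treats achievable throughput as the smaller of the link speed and the loss-limited ceiling from the calculation above, and the name achievable_bps is hypothetical.

```python
import math

def achievable_bps(link_bps: float, mss_bytes: int = 1460,
                   rtt_seconds: float = 0.049, loss_rate: float = 0.001) -> float:
    """Smaller of the link speed and the loss-limited TCP ceiling.

    Assumption: a connection can never exceed either limit, so the
    tighter one determines throughput.
    """
    ceiling = (mss_bytes * 8 / rtt_seconds) / math.sqrt(loss_rate)
    return min(link_bps, ceiling)

for link_mbps in (10, 100):
    result = achievable_bps(link_mbps * 1e6)
    print(f"{link_mbps} Mbps link: about {result / 1e6:.1f} Mbps of throughput")
```

Both link speeds produce the same figure because the ceiling, not the bandwidth, is the binding limit at this RTT and loss rate.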

Note that this is a back-of-the-napkin calculation that illustrates my point. For more on measuring performance, see the post "TCP Performance and the Mathis Equation" by Terry Slattery.

Delivering more data across a coast-to-coast connection can be done, obviously. Data acceleration vendors have been remarkably effective at driving throughput of 60 Mbps and more on a 10Mbps link. We'll look at why that's so important and how it's even possible in our next post.

David Greenfield is a long-time technology analyst. He currently works in product marketing for Silver Peak.
