Does QoS Deliver?

If you need your data packets in 30 milliseconds or less, it's time to implement network Quality of Service policies.

September 2, 2003

Getting Jittery

There are four major reasons for implementing QoS:

• Latency, the amount of time it takes for a packet to get from one place to another, may be caused by bandwidth saturation, lack of resources (CPU or RAM) on a network device, distance or type of connection. You can reduce latency only so much--there's no way to send a message and receive a reply via a satellite link in less than 500 ms, for example.

• Jitter refers to the variance in latency. Unless two nodes are on the same switch, latency will vary greatly from packet to packet. When network bandwidth is saturated, jitter increases. For applications such as file downloads or Web browsing, this is not usually a big deal. However, streaming video and VoIP suffer greatly from high jitter. QoS can be used to help even out jitter by giving streaming traffic a higher bandwidth priority. Another solution is to increase buffer size.

• Random packet loss, which occurs when networks or devices are oversaturated, causes clipping in streaming media, reset and dropped connections, and other transaction difficulties. Worse, dropped packets need to be retransmitted, compounding the problem. QoS methods can limit the amount of bandwidth a protocol or connection uses, thus preventing or limiting oversaturation.

• Controlling bandwidth use is the final reason for QoS. Traffic like FTP and streaming video can siphon off bandwidth like a napkin sponging up grease from a pepperoni pizza. Peer-to-peer file sharing has caused headaches on most college campuses, with some campus networks experiencing 100 percent saturation from P2P traffic alone. Enterprise networks may find that remote-control software is slowing the responsiveness of the Web servers. If fast Web server responsiveness is important to the company's image, this is an issue.
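To make the latency and jitter bullets concrete, here is a minimal Python sketch. It assumes jitter is measured as the mean absolute difference between consecutive packet latencies, which is only one of several definitions in use; the sample values and the function name `latency_stats` are invented for illustration.

```python
from statistics import mean


def latency_stats(latencies_ms):
    """Return (mean latency, jitter) for per-packet latencies in milliseconds.

    Jitter is taken here as the mean absolute difference between
    consecutive packets (an assumption; other definitions exist).
    """
    if len(latencies_ms) < 2:
        raise ValueError("need at least two samples")
    diffs = [abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:])]
    return mean(latencies_ms), mean(diffs)


# Hypothetical samples: both links average 30 ms, but only one is steady.
steady = [30, 31, 30, 29, 30]
bursty = [10, 60, 15, 55, 10]

steady_mean, steady_jitter = latency_stats(steady)
bursty_mean, bursty_jitter = latency_stats(bursty)
print(steady_mean, steady_jitter)
print(bursty_mean, bursty_jitter)
```

Both links report the same average latency, so a latency figure alone would hide the problem that a VoIP call on the second link would run into.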

Being able to control bandwidth is especially critical when there isn't much to go around. QoS can give priority to business-critical applications, like Citrix, network-management tasks or database queries, while leaving the leftover for less important activities.

Never underestimate the damage a Web browser can do to a network: Most QoS devices will let you specify minimum and maximum rates per protocol, while advanced products let you specify minimum and maximum rates per session. You could say that every individual streaming video session will get a minimum of 100 Kbps, for example, but all the streaming video sessions combined can't use more than 1 Mbps. You can also use QoS to give certain users preferred status.

When considering a QoS implementation, you must decide whether you need a product that can do high-level inspection. QoS chiefly works at Layer 4 or Layer 7 of the OSI model.

At Layer 4, only port numbers and IP addresses are examined. Not a big deal in the good old days, when a protocol was mapped to its own port. Today, most firewalls permit traffic to pass through Port 80 from any machine in an organization, so users can watch streaming video, use Web mail, run a P2P service, view Web pages and tunnel SSH connections--all through Port 80. A number of these activities can use regular HTTP or encrypted HTTPS, which will confuse some content filters.

If you're in a locked-down environment where users aren't going to plug in their own machines or install unapproved software, Layer 7 capabilities may not be crucial. College campuses, on the other hand, shouldn't even consider forgoing high-level inspection.

You may also lose out on other cool features by not selecting a Layer 7-enabled QoS device. For example, Sitara's QoSWorks can identify traffic based on HTTP content type. You can rate-limit images or embedded multimedia files to allow faster transfers of HTML text. Some Layer 7 devices can also detect whether a session using the HTTP protocol is delivering a Web page or downloading an MP3. You may want to have separate policies for these two functions.
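The difference between Layer 4 and Layer 7 classification can be sketched in a few lines of Python. This is an illustration only: the port table, the class names and the function names are hypothetical, the payloads are hand-written server responses, and no commercial product is claimed to work this way.

```python
# Layer 4 sees only the port number (hypothetical mapping).
PORT_TO_PROTOCOL = {21: "ftp", 22: "ssh", 80: "http", 443: "https"}


def classify_layer4(dst_port):
    """Classify by port alone."""
    return PORT_TO_PROTOCOL.get(dst_port, "unknown")


def classify_layer7(payload):
    """Classify by peeking at the first bytes a server sends back."""
    head = payload.split(b"\r\n\r\n", 1)[0].decode("latin-1").lower()
    if head.startswith("ssh-"):
        return "ssh"
    if not head.startswith("http/"):
        return "unknown"
    for line in head.split("\r\n")[1:]:
        name, _, value = line.partition(":")
        if name.strip() == "content-type":
            value = value.strip()
            if value.startswith("text/html"):
                return "web-page"
            if value.startswith("audio/"):
                return "audio-download"
            if value.startswith("image/"):
                return "image"
            if value.startswith("video/"):
                return "streaming-video"
    return "http-other"


page = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
mp3 = b"HTTP/1.1 200 OK\r\nContent-Type: audio/mpeg\r\n\r\nID3"
tunnel = b"SSH-2.0-OpenSSH\r\n"

for payload in (page, mp3, tunnel):
    # All three sessions use Port 80, so Layer 4 calls every one "http".
    print(classify_layer4(80), classify_layer7(payload))
```

Every session looks identical at Layer 4, while the Layer 7 pass separates the Web page, the MP3 download and the tunneled SSH session, so each can get its own policy.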

Beyond Layer 4 versus Layer 7, when it comes to choosing a QoS method, you can get basic, or you can get complex. Let's start simple.

Plain-Cheese Techniques

When faced with a shortage of network resources, the easiest solution is to overprovision.

Need more bandwidth? Install a second T1. Routers dropping packets? Buy a RAM upgrade. Although it may sound like the lazy man's answer to QoS, overprovisioning is a valid and sometimes necessary route. For example, you can't transfer camcorder DV (digital video) streams over 802.11b in real time. Consumer-quality DV runs at 25 to 36 Mbps, while 802.11b runs at 11 Mbps (in practice, it's only 4 to 6 Mbps). If you have a 128-Kbps ISDN circuit and want to run 100 simultaneous VoIP sessions at 8 Kbps each, not even the most fully loaded QoS solution will deliver.

Furthermore, enabling any QoS method on routers, firewalls and other network infrastructure devices consumes processing power and RAM, which may force you to buy more powerful equipment anyway.
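The arithmetic behind the ISDN and DV examples is simple enough to check in Python. The helper name `fits` is our own, and the sketch ignores packet-header overhead, which would make the real picture slightly worse.

```python
def fits(link_kbps, sessions, per_session_kbps):
    """Return (does the demand fit on the link?, total demand in Kbps)."""
    demand = sessions * per_session_kbps
    return demand <= link_kbps, demand


# 100 VoIP sessions at 8 Kbps each over a 128-Kbps ISDN circuit.
voip_ok, voip_demand = fits(128, 100, 8)

# One 25-Mbps DV stream over nominal 11-Mbps 802.11b.
dv_ok, dv_demand = fits(11_000, 1, 25_000)

# How many 8-Kbps calls could the ISDN circuit carry at most?
max_calls = 128 // 8

print(voip_ok, voip_demand, dv_ok, max_calls)
```

When demand exceeds the link several times over, as in both cases here, prioritizing traffic cannot help; only more capacity or fewer sessions will.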

Just be careful you don't end up spending more money provisioning than you would using a specialized QoS technique. For example, consider a branch office with a 1.5-Mbps DSL connection that costs $150 per month. Web traffic is hideously slow, so you might decide to upgrade to dual connections rather than listen to users grouse about the lack of bandwidth. First, do a traffic analysis: Streaming Internet radio or something equally frivolous might be the culprit. By using QoS to reduce or eliminate streaming audio, we found that our example branch office could save $1,800 annually.

Another way to conserve bandwidth is to use compression. Graphics can have their resolution or color depth reduced, video can be compressed with efficient codecs such as MPEG or DivX, and audio can be encoded at a lower bit rate or converted from stereo to mono. These lossy compression schemes can be taken only so far before there is a noticeable drop in quality.

As an alternative, you can use a lossless method such as zip, gzip or Aladdin Stuffit. Lossless compression doesn't reduce quality, but it doesn't cut file sizes as dramatically, either. Furthermore, lossless compression doesn't work efficiently on already compressed data, such as JPEGs, MP3s or video files.
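A quick way to see why lossless compression helps text but not already compressed files is to run Python's standard `zlib` module over both kinds of data. In this sketch random bytes stand in for a JPEG or MP3 (an assumption: well-compressed data looks statistically similar to random data), and the repeated HTML snippet is an invented sample that compresses far better than real pages would.

```python
import os
import zlib


def ratio(data):
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, 9)) / len(data)


# Invented, highly repetitive HTML; real pages compress less dramatically.
text = b"<p>Quality of Service policies shape network traffic.</p>\n" * 500

# Random bytes as a stand-in for already compressed media.
noise = os.urandom(len(text))

text_ratio = ratio(text)
noise_ratio = ratio(noise)
print(f"text:  {text_ratio:.3f}")
print(f"noise: {noise_ratio:.3f}")
```

The text shrinks to a small fraction of its size, while the random data comes out no smaller than it went in, which is why compressing a JPEG or MP3 a second time buys nothing.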

Another best practice is to enable HTTP compression on Web servers. Compression was specified in HTTP 1.0 but was left as optional for client support. HTTP 1.1 clients, on the other hand, are required to support compression. HTML compresses efficiently: We once squeezed 8 GB of text onto a standard CD. The only downside is that HTTP compression imposes CPU overhead.

Expand Networks, Packeteer and Peribit Networks all sell site-to-site compression appliances that sit behind the Internet routers at your branch offices and automatically compress all traffic flowing through them. Of course, you must have a compression appliance at each end of a link, and traffic not sent between the appliances is not compressed. But these devices are inexpensive and will maximize your WAN links (see "Smarter Compression Technology").

If none of these simpler methods fills the bill, it's time to get fancy.

Unlike data that goes over the Internet, LAN-only traffic can honor QoS policies across various subnets. Traffic is separated into classes, which can represent a protocol, IP range or MAC address, and flows. Most systems refer to a complete TCP session as a flow--when a Web user performs the TCP three-way handshake and HTTP transfer and then finally closes the session, all that traffic is considered part of the same flow. QoS devices sometimes apply policies to an entire class, to each individual flow or a combination of the two.

To achieve QoS, modify the ToS (Type of Service) bits. ToS is composed of 8 bits, falling between the ninth and the 16th bits of the IPv4 header. Bits 0, 1 and 2 of the ToS field may be used to indicate the relative priority of a packet (see RFC-791) on a scale of 0 to 7. Bit 3 indicates normal or low delay, Bit 4 indicates normal or high throughput, and Bit 5 indicates normal or high reliability. The RFC says a packet should use two of these three options at most. Bits 6 and 7 are reserved for future use.
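The RFC 791 layout described above can be decoded with a few bit operations. In this Python sketch the function and key names are our own; the only thing taken from the RFC is the bit layout, in which Bit 0 is the most significant bit of the octet.

```python
def parse_tos(tos):
    """Split an RFC 791 ToS octet into its fields (Bit 0 = most significant)."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("ToS must fit in one octet")
    return {
        "precedence": tos >> 5,                # Bits 0-2, value 0 to 7
        "low_delay": bool(tos & 0x10),         # Bit 3
        "high_throughput": bool(tos & 0x08),   # Bit 4
        "high_reliability": bool(tos & 0x04),  # Bit 5
        "reserved": tos & 0x03,                # Bits 6-7
    }


# 0xB0 is binary 1011 0000: precedence 5 with the low-delay bit set.
fields = parse_tos(0xB0)
print(fields)
```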

There are no official guidelines for what to do with this information. Networks are assumed to drop lower priority traffic in favor of higher priority. Traffic of the same priority level cannot be differentiated further, and network control traffic (such as RIP or ICMP messages) usually occupies the highest priority, thus limiting us to an effective six levels.

Unfortunately, according to Cisco Systems, Bits 3 through 5 were not implemented consistently between network vendors (see "DiffServ -- The Scalable End-to-End QoS Model"). Even worse, RFC 1349 redefined Bits 3 through 6 a decade later by way of five classifications: minimize delay, maximize throughput, maximize reliability, minimize monetary cost or normal service. We could pick only one, and none of this guaranteed bandwidth capacity. IPv6 has dropped the ToS octet in favor of a "traffic class" octet.

Using ToS bits is often referred to as "coloring," an aging but still used reference to the bronze, silver, gold, diamond and platinum service levels offered by many service providers. ToS is still used occasionally, but rarely on the LAN.
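For comparison with the original RFC 791 layout, the five RFC 1349 classifications mentioned above occupy Bits 3 through 6 as a single 4-bit value. The Python sketch below decodes that field; the function name and the "other" label for unlisted combinations are our own.

```python
# RFC 1349 values for the 4-bit TOS field (Bits 3 through 6 of the octet).
RFC1349_SERVICES = {
    0b1000: "minimize delay",
    0b0100: "maximize throughput",
    0b0010: "maximize reliability",
    0b0001: "minimize monetary cost",
    0b0000: "normal service",
}


def rfc1349_service(tos_octet):
    """Return the RFC 1349 service requested by a full ToS octet."""
    tos_field = (tos_octet >> 1) & 0x0F  # drop Bit 7, keep Bits 3-6
    return RFC1349_SERVICES.get(tos_field, "other")


for octet in (0x10, 0x08, 0x04, 0x02, 0x00):
    print(hex(octet), rfc1349_service(octet))
```

Because the field is read as one value rather than as independent flags, only one service can be requested per packet, which matches the "pick only one" restriction.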

Integrated Services, or IntServ, can provide end-to-end QoS by assuring a level of available bandwidth, so long as every router on the network is set up to support and honor IntServ. There are two QoS levels provided: guaranteed service and controlled load. The guaranteed service level assures that a set amount of bandwidth is available and that there will be no additional delay on account of queuing packets. Even if you oversubscribe your network, IntServ will still make sure you can meet the guaranteed service level.

Controlled load acts like traditional IP traffic on a lightly loaded network--things work on a best-effort basis, and there are no strong guarantees. Non-IntServ traffic gets the leftovers.

An end host initiates an IntServ QoS session by sending out an RSVP (Resource Reservation Protocol; RFC 2205) request. RSVP is a signaling protocol that requests a resource reservation across the network. The network will approve or reject the request, depending on whether every hop can fulfill the request. If approved, the bandwidth is reserved. Each router maintains a state table for every IntServ session. If the original sender crashes or loses connectivity, the IntServ session will time out and the reservation will be canceled.
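The all-or-nothing nature of a reservation can be sketched as follows. This is a heavily simplified Python model, not RSVP itself: it omits the protocol's actual message exchange and refresh timers, and the `Router` class, `reserve` and `release` are hypothetical names.

```python
class Router:
    """A hop that tracks free bandwidth and per-session state."""

    def __init__(self, name, capacity_kbps):
        self.name = name
        self.free_kbps = capacity_kbps
        self.reservations = {}  # session id -> reserved Kbps


def reserve(path, session_id, kbps):
    """Approve only if every hop can fulfill the request; otherwise change nothing."""
    if any(router.free_kbps < kbps for router in path):
        return False
    for router in path:
        router.free_kbps -= kbps
        router.reservations[session_id] = kbps
    return True


def release(path, session_id):
    """Cancel a reservation, as happens when a session ends or times out."""
    for router in path:
        router.free_kbps += router.reservations.pop(session_id, 0)


path = [Router("a", 1000), Router("b", 256), Router("c", 1000)]
first = reserve(path, "call-1", 200)   # fits on every hop
second = reserve(path, "call-2", 100)  # router "b" has only 56 Kbps left
print(first, second, [r.free_kbps for r in path])
```

The `reservations` dictionary on each router is the per-session state table the text refers to, and it is exactly this state that grows with every session on the network.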

One problem with IntServ is the need to maintain state across the entire network. This taxes routers' limited CPU and RAM resources. And, every device in a packet's path, including end nodes, must understand IntServ. In practice, IntServ has been used on small-scale networks but hasn't caught on in highly distributed or large networks because of scalability concerns.

DiffServ (Differentiated Services; RFC 2475) addresses some of the shortcomings of IntServ and ToS. DiffServ is more scalable and can work across multiple networks if implemented correctly. The first six bits of the ToS octet in an IPv4 packet, or of the traffic class octet in an IPv6 packet, are referred to as the DSCP (Differentiated Services Codepoint; RFC 2474). Six bits give the DSCP as many as 64 classes.
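Extracting the DSCP from the old ToS octet is a two-bit shift, as this Python sketch shows. The function names are our own; codepoint 46, used in the example, is the value commonly assigned to Expedited Forwarding.

```python
def dscp_from_octet(octet):
    """Return the 6-bit DSCP carried in the top six bits of the octet."""
    return (octet & 0xFF) >> 2


def octet_from_dscp(dscp, low_bits=0):
    """Build the octet from a DSCP; the two low bits are not part of the DSCP."""
    if not 0 <= dscp < 64:
        raise ValueError("DSCP must be between 0 and 63")
    return (dscp << 2) | (low_bits & 0x03)


ef_octet = octet_from_dscp(46)
print(hex(ef_octet), dscp_from_octet(ef_octet))
print(2 ** 6)  # number of distinct codepoints
```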

A network will form a collection of DiffServ routers, called a "DiffServ cloud." Traffic is classified when it enters the cloud. A provider will usually negotiate with a customer and establish a service-level agreement. For example, a corporation may subscribe to a bronze, silver or gold package from an ISP. The corporation's contract will determine the DiffServ priority setting.

The biggest advantage to DiffServ is that it operates at the boundary. Once data enters the cloud, internal routers don't need to maintain QoS state information. This allows the internal routers to focus only on routing.

However, DiffServ is still unpredictable. Individual internal routers may react oddly to the ToS field or possibly alter it. There are no set standards: One provider's gold standard may be another's bronze. Thus, while you may be paying for the best service, your ISP's peering agreement may say otherwise. Because DiffServ works by dropping packets selectively during high saturation periods, lowest-class members could lose connectivity completely for several seconds during bursts.

DiffServ works well on a larger LAN or WAN because it has a lower overhead and is more scalable than IntServ. Because DiffServ classification is done at the entrance to the cloud, end nodes and intermediate routers don't need to understand or set DiffServ bits.

Traffic shapers are the pinnacle of QoS--sometimes the terms are even used interchangeably. This category includes products from Allot Communications, Lightspeed Systems, Packeteer and Sitara Networks that perform QoS, deep packet inspection, classification and traffic reporting. Although these products can classify data by looking at DiffServ and ToS settings, they don't rely on those technologies. And because these devices operate as standalone appliances, there shouldn't be required configuration changes or interoperability issues with the rest of your network.

Traditionally these products sit near the network edge, although you can use them to shape internal LAN traffic, too. Most operate at Layer 7, solving the "Let's run everything over Port 80 and nobody will notice it" syndrome. This problem occurs when you want to set a policy for a protocol (say HTTP) that runs on the same port as a protocol you don't want (such as P2P or streaming media). Layer 7 devices can tell if traffic going to Port 80 is HTTP, P2P, streaming video, or an HTML or JPG transfer.

Traffic shapers typically are installed in monitor-only mode for a few days. This lets you see what kind of traffic is going across the network and what is taking up the most bandwidth--traffic shapers' reporting tools are perhaps their most valuable assets.

Although definitions and features are vendor-specific, traffic shapers have a few common capabilities: Traffic can be shaped based on class (such as protocol or subnet), flow or both. And you can set minimum and maximum bandwidths, as well as burst.

Burst is permitted only if there is extra bandwidth available. For example, you could assign FTP 10 Mbps normally, but burst to 15 Mbps when there is bandwidth available. You can give some advanced commands to high-end shapers, such as "Allow 8 Kbps for every VoIP session, but let VoIP use up only 1 Mbps. If VoIP is using 1 Mbps, do not allow any new VoIP sessions."
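The VoIP rule quoted above amounts to per-session allocation plus admission control. Here is a minimal Python sketch of that logic; the class name `ClassPolicy` and its methods are hypothetical, and 1 Mbps is taken to mean 1,000 Kbps.

```python
class ClassPolicy:
    """Fixed bandwidth per session, with a cap on the whole class."""

    def __init__(self, per_session_kbps, class_max_kbps):
        self.per_session_kbps = per_session_kbps
        self.class_max_kbps = class_max_kbps
        self.active = 0

    def admit(self):
        """Accept a new session only if the class stays within its cap."""
        if (self.active + 1) * self.per_session_kbps > self.class_max_kbps:
            return False
        self.active += 1
        return True

    def hang_up(self):
        """End one session, freeing its share of the class."""
        if self.active:
            self.active -= 1


voip = ClassPolicy(per_session_kbps=8, class_max_kbps=1000)
admitted = sum(voip.admit() for _ in range(200))  # 200 call attempts
print(admitted)
```

Refusing the next call outright, rather than squeezing every call below 8 Kbps, is what keeps the sessions already in progress usable.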

Lines or Windows

Traffic shapers work by queuing packets or manipulating TCP window sizes. Vendors that use queuing claim that TCP will throttle itself down automatically because of the queue, so using TRS (TCP rate shaping) is unnecessary, unnatural and not specified by the IETF.

Also, TRS cannot handle UDP (User Datagram Protocol) traffic. This is a minor consideration, however, because any traffic-shaping vendor that uses TRS will implement a second, queuing-based algorithm to handle UDP. Allot Communications and Lightspeed both make queuing traffic shapers.

TRS works by manipulating TCP-control data and window sizes. This tricks the TCP session into believing it is communicating to a host on a much slower link. Similar shaping would occur naturally if a modem user connected to a server on a T3. TRS vendors claim that queuing adds latency, causes more dropped packets and isn't as good at slowing down incoming traffic. Packeteer and Sitara both use TRS.
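The window-size side of TRS rests on a basic property of TCP: a sender can have at most one window of data in flight per round trip, so throughput is capped at roughly window size divided by round-trip time. The Python sketch below shows only that arithmetic, not how any vendor's product rewrites packets; the function names and the sample figures are our own.

```python
def window_for_rate(target_bps, rtt_ms):
    """Window (bytes) that caps a flow at target_bps over the given RTT."""
    return target_bps * rtt_ms // 8000  # bits/s x ms -> bytes


def rate_for_window(window_bytes, rtt_ms):
    """Approximate ceiling (bits per second) imposed by a window and RTT."""
    return window_bytes * 8 * 1000 / rtt_ms


# Hold a flow to 256 Kbps across a path with a 100-ms round trip.
window = window_for_rate(256_000, 100)
print(window)

# The same window on a 10-ms LAN path would allow ten times the rate.
print(rate_for_window(window, 10))
```

This is the same effect a modem user sees when connecting to a server on a T3: the slow end of the path, not the server, sets the pace.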

Queuing vendors have four major techniques at their disposal:

• PQ (priority queuing) works just like ToS. Higher-priority queues transmit before lower queues, meaning lowest queues can become starved.

• CBQ (class-based queuing) overcomes some of the starvation problems inherent in PQ. Classes can be configured with a minimum amount of bandwidth, and can borrow bandwidth from other classes if available.

• WFQ (weighted fair queuing) will increase or decrease a queue size based on priority level. Bandwidth utilization is not taken into account.

• HWFQ (hierarchical weighted fair queuing) evaluates the worst-case packet delay under various traffic scenarios based on real-time traffic, and uses this data in evaluating the queue.
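The starvation problem in PQ is easy to reproduce in a few lines of Python. In the sketch below, the weighted round-robin scheduler is a deliberately simplified stand-in for the class-based and weighted schemes in the list, not an implementation of CBQ, WFQ or HWFQ; the queue contents and weights are invented.

```python
from collections import deque


def strict_priority(queues, slots):
    """Send one packet per slot, always from the highest non-empty queue."""
    sent = []
    for _ in range(slots):
        for queue in queues:  # index 0 is the highest priority
            if queue:
                sent.append(queue.popleft())
                break
    return sent


def weighted_round_robin(queues, weights, slots):
    """Visit each queue in turn, sending up to its weight in packets."""
    sent = []
    while len(sent) < slots and any(queues):
        for queue, weight in zip(queues, weights):
            for _ in range(weight):
                if queue and len(sent) < slots:
                    sent.append(queue.popleft())
    return sent


def make_queues():
    """Ten high-priority and ten low-priority packets waiting to go."""
    return [deque(["H"] * 10), deque(["L"] * 10)]


pq_sent = strict_priority(make_queues(), slots=8)
wrr_sent = weighted_round_robin(make_queues(), weights=(3, 1), slots=8)
print("".join(pq_sent))
print("".join(wrr_sent))
```

With strict priority, the low queue sends nothing as long as high-priority packets keep arriving; giving each class a guaranteed share, as the weighted scheduler does, trades a little high-priority throughput for liveness in the low class.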

The queuing versus rate-shaping argument has gone on for many years--but frankly, we feel that management interfaces, reporting quality and performance matter much more than the underlying technology.

Just Queue It

Implementing QoS doesn't have to be difficult or overly time-consuming, but it can be if you try too hard. Pick those applications that you absolutely need, and make sure that they always get enough network resources. Or, simply limit the worst offenders.

Bottom line: Don't automatically associate lack of performance with a need for more bandwidth. There's no reason to upgrade to Gigabit Ethernet when enabling QoS on a router for free could yield satisfactory results. At minimum, QoS can be used in the short term to keep the network up and running until the next purchasing cycle.

Michael J. DeMaria is an associate technology editor based at Network Computing's Syracuse University Real-World Labs®.

Quality of Service

In a perfect world, QoS would have been built into the Internet from the start. Then, inconsistent bandwidth utilization, jittery latency, packet loss and greedy applications wouldn't worry us.

But the world isn't perfect, so for streaming media, such as voice over IP, you need to ensure low and steady latency as well as a specific amount of bandwidth. Applications like FTP and Web traffic may not be sensitive to latency, but they can eat up bandwidth like crazy. An oversaturated network causes packet loss, inefficient data transfer and snarky end users.

QoS can be achieved in several ways. Enabling compression and buying more bandwidth are the simplest solutions, but may only delay the inevitable. More effective ways to stop users from clogging your pipes are to use Type of Service, Integrated Services, Differentiated Services, queuing or TCP rate-shaping. Of these, only the latter two are effective at controlling data sent across the Internet. The rest are best for internal LAN QoS or between partners with service agreements.

Don't fall prey to the philosophy of buying more bandwidth whenever you run short. By implementing a few simple QoS policies, you can tame bandwidth-hungry applications in favor of mission-critical protocols. And because QoS capabilities have found their way onto many pieces of network infrastructure, the cost to enable QoS could be minimal.

Ultimately, true QoS can extend only as far as your network borders, effectively stopping at your Internet routers. ISPs and Internet backbone providers aren't required to honor your QoS scheme, and you can, in fact, expect them to gleefully ignore it.

So should you implement QoS, or just buy more bandwidth?

The "buy more bandwidth" argument is simply this: Bandwidth is cheap, QoS is complicated. If you have enough capacity, QoS becomes irrelevant. The Internet constantly grows in capacity, and technological advances have continued to increase networking speed.

This argument holds more water on the LAN side than on the WAN side. Gigabit Ethernet is affordable, and most systems have a hard time saturating it, let alone 10GigE. WAN speeds, though, haven't increased at such an affordable pace, and it's commonplace to find your Internet connections saturated at least a few times a day.

QoS capabilities now come bundled free in many infrastructure products, including multipurpose security boxes such as FortiNet's FortiGate, VPN gateways and routers.

The biggest oversight in the unlimited supply model is that if you have extra bandwidth, someone will find a use for it. It's like traffic systems: In the 1930s, New York City built two bridges to ease congestion on its existing span. After the bridges were completed, congestion was just as bad as before the project began, despite the increased capacity.

With the proliferation of file sharing, music trading and P2P, bandwidth should no longer be looked at as limitless. Web pages have become bloated as well.

You can block some frivolous downloads with content filters (assuming your users don't wise up to the fact that content filters can't easily block encrypted HTTPS traffic). However, even traffic that conforms to your acceptable use policy can wreak havoc. A 5-MB attachment sent to a 100-recipient distribution list will do some damage. High-speed users connecting to your Web server may be sucking all the bandwidth away from DSL users. QoS can ensure that all users have an equal experience at your site.

Some organizations run Web site analysis tools against the Web log file. These files can consist of multiple gigabytes, so downloading them by FTP off the Web server may suck up so much bandwidth that there is little left for critical SQL lookups.

Then there's VoIP. We hear from our readers that VoIP pilots are on many IT managers' agendas. If you're among them, you need QoS. Even small periods of bandwidth saturation--just a few seconds' worth--can cause clips in phone conversations. Without a guarantee that VoIP calls will be just as good as POTS calls, getting management to approve a VoIP rollout could be difficult.

QoS is full of confusing and sometimes contradictory terms. Here's how Network Computing defines them:

Quality of Service: A way to provide better or stable service for select network traffic through bandwidth or latency control.

Saturation point: The amount of load (packet count, simultaneous sessions or bandwidth utilization) that causes a network device to start dropping an unacceptable percentage of packets.

Flow: A session between two hosts (such as a TCP session). This includes handshaking, data transfer and termination. There can be multiple simultaneous flows between two hosts.

Class: A grouping of flows based on common criteria. May include protocol, source/destination address or subnet.

Classification: Detecting, identifying and potentially marking flows.

Burst rate vs. maximum rate: If a QoS device supports bursting, it can let a class or flow be configured to use more bandwidth than the maximum rate, but only if extra, unused bandwidth is available. Think of it as a second max rate: Burst will always be higher than max rate. If burst equals max rate, then bursting is effectively disabled.
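The relationship between maximum rate and burst rate can be written as a small Python function. This is our own reading of the definition above, not any vendor's algorithm: a class gets up to its max rate in normal operation, and may climb toward its burst rate only by as much as there is spare bandwidth. The function name and the sample figures (an FTP class with a 10-Mbps max and a 15-Mbps burst) are illustrative.

```python
def allowed_rate(demand, max_rate, burst_rate, spare):
    """Rate granted to a class, in the same units as the arguments."""
    if burst_rate < max_rate:
        raise ValueError("burst rate must be at least the max rate")
    base = min(demand, max_rate)
    # Extra bandwidth the class wants beyond max, limited by the burst rate.
    extra_wanted = min(demand, burst_rate) - base
    return base + min(extra_wanted, max(spare, 0))


# FTP wants 20 Mbps; policy is max 10 Mbps, burst 15 Mbps.
print(allowed_rate(20, 10, 15, spare=0))    # link is busy
print(allowed_rate(20, 10, 15, spare=3))    # a little room
print(allowed_rate(20, 10, 15, spare=100))  # plenty of room
print(allowed_rate(20, 10, 10, spare=100))  # burst equals max
```

The last call shows the point made in the definition: when burst equals max rate, spare bandwidth makes no difference and bursting is effectively disabled.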

Web Links

Infrastructure white papers & research reports

Infrastructure books

Definition of the DS Field

DiffServ Architecture

DiffServ white paper

IntServ & DiffServ primer

Revised ToS specification

RSVP specification

"PacketShaper 8500: Traffic Management Gets Smart"

"Fine-Tuning for QoS"