Workshop
Tuning Voice Over the WAN

  January 22, 2001
  By Dave Brown



Tradeoffs at each step

Even if the network-associated delay (access queues, transport, and far end queues) is designed to a bare minimum, voice encoding and decoding delay times still can contribute substantially to the overall budget if users do not carefully choose algorithms for voice compression or, in the case of G.711, for voice sampling and forwarding.

Table 2 (below) lists the most common voice encoding techniques used for packet telephony. Shown for each are the algorithm's International Telecommunication Union (ITU) name, the bandwidth required, the mean opinion score (MOS, a subjective voice quality rating), and the amount of time the encoder requires to examine the input stream before it can emit compressed (or, in the case of G.711, uncompressed) output. Added to this is the processing time the vocoder needs to actually perform the compression, which can vary from only a few milliseconds for a good digital signal processor (DSP) chip to 100 ms or more for a PC that runs the algorithm in software.


Table 2: Vocoder Algorithm Summary

Algorithm | Bandwidth required | Quality score | Look-ahead time

G.711a | 64 Kbps | 4.4 MOS | 0.75 ms
Actually nothing more than pulse code modulation (PCM) without compression, this bandwidth-intensive algorithm introduces less than a millisecond of delay and offers the highest speech quality.

G.726 | 32 Kbps | 4.1 MOS | 5 ms
This type of compression, adaptive differential pulse code modulation (ADPCM), requires less than 10 ms for "look-ahead" sampling. It can double the call capacity of a circuit.

G.729A | 8 Kbps | 4.0 MOS | 10 ms
Algebraic code-excited linear prediction (ACELP) is the most popular compression algorithm found in Voice over DSL integrated access devices (IAD) and frame relay or LAN gateways because it provides high quality over modest bandwidth. To minimize encoding delay, a digital signal processor (DSP) usually is employed in the IAD.

G.723.1 | 5.4 Kbps | 3.4 MOS | 30 ms
Identified as the default speech encoder for IP telephony by the International Multimedia Teleconferencing Consortium (IMTC) VoIP Forum, G.723 offers the best tradeoffs of acceptable voice quality, interoperability with a wide range of IP telephones and conferencing systems, and low bandwidth consumption. (G.723.1 also can run at 6.3 Kbps and earn a slightly better MOS, 3.6.) However, there's a dark side: intellectual property. To use G.723, IAD and gateway manufacturers must pay licensing fees to the developers.
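
To see how look-ahead and processing time combine, the following rough sketch adds the Table 2 look-ahead figures to an assumed processing time. The 5 ms value is only a stand-in for "a few milliseconds" on a DSP, and the function and variable names are illustrative, not part of any product or standard.

```python
# Look-ahead times in milliseconds, taken from Table 2.
LOOKAHEAD_MS = {
    "G.711a": 0.75,
    "G.726": 5.0,
    "G.729A": 10.0,
    "G.723.1": 30.0,
}

def encoder_delay_ms(codec: str, processing_ms: float) -> float:
    """Look-ahead time plus the time spent actually compressing the audio."""
    return LOOKAHEAD_MS[codec] + processing_ms

# Assumed processing times: 5 ms stands in for "a few milliseconds" on a DSP,
# 100 ms for a PC that runs the algorithm in software.
for codec in LOOKAHEAD_MS:
    dsp = encoder_delay_ms(codec, 5.0)
    pc = encoder_delay_ms(codec, 100.0)
    print(f"{codec:8s} DSP: {dsp:6.2f} ms   PC: {pc:6.2f} ms")
```

Under these assumptions the processing platform, not the codec's look-ahead, dominates the encoder's share of the delay budget once the algorithm runs in software.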


Tradeoffs in packet size may be possible in some systems. Smaller packets (in theory) reduce latency because they can be loaded up quickly and sent on their way. Note, however, that 750 packets with two-byte payloads are required to carry as much information as one 1500-byte packet. The network overhead associated with handling floods of short packets could significantly increase access queue and buffering requirements, leaving you with little or no net gain.
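
The cost of those short packets can be estimated with a little arithmetic. The sketch below assumes 40 bytes of header per packet (IPv4, UDP, and RTP, with no link-layer framing) and treats the 1500 bytes as payload; both are assumptions made for illustration, and real overhead depends on the transport in use.

```python
HEADER_BYTES = 40  # assumption: IPv4 (20) + UDP (8) + RTP (12), no link-layer framing

def wire_bytes(total_payload: int, payload_per_packet: int,
               header_bytes: int = HEADER_BYTES) -> tuple[int, int]:
    """Return (packet count, total bytes sent) for a given payload size."""
    packets = -(-total_payload // payload_per_packet)  # ceiling division
    return packets, total_payload + packets * header_bytes

small = wire_bytes(1500, 2)     # 750 packets carrying two bytes each
large = wire_bytes(1500, 1500)  # one packet carrying everything

print("two-byte payloads:", small)
print("one large packet: ", large)
```

With these assumed headers, the two-byte case puts roughly 20 times as many bytes on the wire as the single large packet, which is the overhead the paragraph above warns about.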

DSP-equipped IP telephones like the EF200 from Komodo Technologies (since bought by Cisco) are set by default to load two voice frame samples in each packet. As we've noted, this reduces the apparent compression time, but greatly increases packet overhead. As a result, a pair of EF200s set to use the G.723 algorithm may actually consume bandwidth at rates approaching 15 to 30 Kbps, instead of the 5.4 Kbps theoretically required.
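
A rough way to see why the on-the-wire rate climbs above the codec's nominal rate is to add the header bits sent each second to the codec bit rate. This sketch assumes a 30 ms voice frame and 40 bytes of IP/UDP/RTP header per packet; neither figure comes from the EF200's documentation, and link-layer framing is ignored, so the results illustrate the trend rather than reproduce the exact range quoted above.

```python
def per_call_bps(codec_bps: float, frame_ms: float, frames_per_packet: int,
                 header_bytes: int = 40) -> float:
    """Codec bit rate plus the header bits sent each second."""
    packet_interval_ms = frame_ms * frames_per_packet
    header_bps = header_bytes * 8 * 1000 / packet_interval_ms
    return codec_bps + header_bps

# 5.4 Kbps codec rate from Table 2; 30 ms frame length is an assumption.
for frames in (1, 2, 4):
    rate = per_call_bps(5400, 30, frames)
    print(f"{frames} frame(s) per packet: {rate / 1000:.1f} Kbps")
```

Packing more frames into each packet spreads the header cost over more voice data, but each added frame also adds its duration to the time a packet waits before it can be sent.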

Examining your network's delay budget and likely sources of jitter may help you make basic design decisions. If, for example, you're looking at a DSL or frame relay connection with 256 Kbps or better committed information rate between a branch and home office, and need to support only four simultaneous conversations, G.711 may be a fine choice. It can provide clear, toll-quality voice service in a network that's got bandwidth to burn. But if you've got only a 64 Kbps TCP/IP pipe and no real control over how many simultaneous calls will be attempted from a LAN to your intranet, G.729 or G.723 may have to be used. G.729 introduces some encoder delay, but it consumes only about 8 Kbps per call. G.723 uses the lowest amount of bandwidth (unless packet sizes are very small), but it can introduce large amounts of delay because of its long look-ahead time.
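
As a first-pass check on such decisions, the sketch below divides link capacity by each codec's Table 2 bit rate. It counts the codec rate only and ignores packet headers, signaling, and any other traffic on the link, so treat the results as upper bounds; the function name is illustrative.

```python
# Codec bit rates in Kbps, taken from Table 2.
CODEC_KBPS = {"G.711": 64.0, "G.726": 32.0, "G.729A": 8.0, "G.723.1": 5.4}

def max_calls(link_kbps: float, codec: str) -> int:
    """Upper bound on simultaneous calls, counting codec bit rate only."""
    return int(link_kbps // CODEC_KBPS[codec])

for link in (256, 64):
    for codec in CODEC_KBPS:
        print(f"{link:3d} Kbps link, {codec:8s}: up to {max_calls(link, codec)} calls")
```

On these figures a 256 Kbps link just covers four G.711 conversations, while a 64 Kbps pipe fits only one, which is why the lower-rate codecs become necessary there.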

G.711 and G.723 currently are about the only choices you'll find in the lowest-cost hardware-based IP phones or in PC-based "softphone" implementations. Manufacturers have to implement at least G.711 to be compliant with the ITU-T H.323 standard; many offer G.723 in attempts to be interoperable with the widest range of collaboration systems, including Microsoft's NetMeeting.

