How To Use VoIP On Your Wireless LAN

There's more to consider than you realize, so use this hands-on advice before giving it a try.

May 31, 2005

Network Computing

Technology companies have created WiFi-based phone systems serving niche markets for several years. Early products were introduced with 802.11b-only wireless subsystems that had a maximum capacity of about five voice calls. These systems often used proprietary signaling and QoS techniques. By current expectations, the wireless phone links weren't secure, and there wasn't enough bandwidth to serve data applications and phones simultaneously.

Ongoing improvements in 802.11 technology—high-rate PHYs, WPA security, and QoS methods—promise to bring Voice over IP (VoIP), VoIP-plus-data, video-plus-data, and VoIP-plus-video-plus-data applications into the mainstream. The growth of VoIP service providers, small and large, suggests business opportunities for applying 802.11 technology to VoIP, or for providing VoIP service over 802.11.

Unfortunately, there are no standard practices for providing VoIP/802.11 service. There are high-level problems to resolve such as billing, call processing, and secure rapid handoff between systems. There are 802.11-level problems to resolve, such as how to support both VoIP and data on the same wireless channel while also optimizing handset battery life.

Emerging VoIP/WLAN systems will be compared with, and may compete with, cellular phone systems. The cellular systems are synchronous; the phones, base stations, and backhaul connections have common timing. Thus, capacity and timing are known and unvarying. There's only one class of service, voice; thus, QoS access methods aren't needed to provide differentiated or guaranteed classes of service. Even when data services are added, they're added in a way that's compatible with the time slots, multiplexing, and management of the voice services. Cellular systems use licensed spectrum and have planned deployments that avoid interference between base stations. For all these reasons, cellular systems are predictable to the microsecond level.

802.11 systems aren't synchronous, are seldom planned, use unlicensed spectrum, can experience significant interference from multiple wireless networks and other non-WLAN devices, and are generally unpredictable at the microsecond level even though they may be robust overall.

Talk time, i.e., battery life, is an important point of comparison between cellular telephony and VoIP-over-WLAN. When 802.11 subsystems are added to cellular handsets, they're constrained to use the existing battery system and will be compared directly with the cellular implementation. A well-designed 802.11 subsystem can deliver talk times and power budgets comparable to cellular systems only by placing the 802.11 subsystem into sleep mode between transmitting voice packets, just like the cellular systems.

Microsecond predictability of synchronous cellular systems is conducive to a synchronous sleep-wake scheduling discipline for the hardware and firmware in the handset. The 802.11 universe, instead, uses CSMA contention methods that conspicuously lack centralized synchronous timing. This is the principal strength of 802.11, and can be viewed as yet another replay of the perpetual debate between packet- and circuit-switching, between Ethernet and ATM, and between robust/adaptive channel access (good enough) and rigid timing (perfection). It also means that new protocol engineering is needed to develop power-save timing techniques that work in the non-synchronous 802.11 world.

It's possible to change the 802.11 MAC into a synchronous, slotted-TDMA design either on a full-time or part-time basis. Numerous proposals and proprietary implementations do exactly that. But the result would no longer be Wi-Fi. Although a technically valid approach, a globally synchronous Wi-Fi infrastructure would be at best partly compatible with existing 802.11 devices and, in some cases, not compatible at all. For the immediate future, we must concentrate on how to work with VoIP and Wi-Fi as it's understood today. That means taking full advantage of the toolkit of new features produced by the TGe QoS group, which are now beginning to be certified as WMM (Wi-Fi Multimedia).

VoIP profiles
VoIP is a constant bit rate (CBR) application. VoIP packets, or frames, are generated continually at a constant interval, usually 10, 20, or 30 ms, although there are exceptions (22.5 ms). The CBR frames travel from source to sink, passing through various equipment and links along the path. ITU-T Recommendation G.114 specifies an end-to-end latency budget of 150 ms or less. If there's a wireless LAN at the source and/or sink, each WLAN can have only a small portion of the 150 ms. If the CBR packets traverse the Internet or a busy corporate network, the arrival timing at the sink won't faithfully replicate the injection timing at the source. Packets will arrive late, or sometimes not at all. And packets may arrive in bunches rather than at well-timed CBR intervals.
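To make the timing distortion concrete, here is a minimal sketch that compares injection times at a 20-ms CBR interval with arrival times after a variable network delay. The delay values are invented for illustration (they aren't measurements), and the function name arrival_gaps is hypothetical.

```python
def arrival_gaps(interval_ms, delays_ms):
    """Return inter-arrival gaps (ms) for CBR packets sent every interval_ms.

    delays_ms[i] is the one-way delay of packet i; None marks a lost packet.
    """
    arrivals = [i * interval_ms + d
                for i, d in enumerate(delays_ms) if d is not None]
    arrivals.sort()  # reordering is ignored here; only the gap sizes matter
    return [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]


# Illustrative delays (ms): packet 2 is held up in a queue, packet 4 is lost.
delays = [40, 40, 58, 40, None, 40]
gaps = arrival_gaps(20, delays)
print(gaps)  # a steady 20-ms stream would show a gap of 20 everywhere
```

In this made-up trace the delayed packet lands only 2 ms ahead of its successor, which is the bunching described above, and the lost packet shows up as a 40-ms gap.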

Internet-style latency, jitter, packet loss, and bunching are problems for older codecs. The legacy codecs are far less tolerant of packet loss and jitter than modern codecs designed for Internet use. Some would say the classic codecs are largely intolerant of sub-optimal channels. That's understandable given their history. It's also understandable that there's interest in tweaking wireless LANs to support the stringent needs of legacy codecs. However, that becomes less important with the proliferation of new codec technology.

In the world of VoIP over broadband Internet, conditions are far less than synchronous or optimal. This has spurred development of advanced codec designs that compare favorably with high-end ITU specifications, such as G.729. For example, the iLBC codec from Global IP Sound is now mandatory in the CableLabs PacketCable spec, is an experimental track specification (RFC 3952) within IETF, and is the basis for at least one well-known Internet VoIP product (Skype). This codec claims to withstand 30% packet loss while maintaining voice quality in the presence of Internet-like delay and jitter. It seems to be the perfect answer for a non-synchronous, open system like 802.11.

As codecs improve, the job of providing Telco-quality VoIP service over WLAN becomes easier. There's less motivation to add complex timing and synchronization methods to Wi-Fi just to benefit a codec. However, the VoIP handset still has a problem: it's important to put the Wi-Fi subsystem into sleep mode between VoIP packets. That means the device can't send or receive packets when in this mode. That in turn means the access point (AP) must not transmit downlink VoIP frames to the handset whenever they arrive at the AP. Instead, the AP must know when the handset is in sleep mode and transmit only when the handset is ready.

The need for power-save synchronization makes the all-synchronous network look seductive again. Not to worry. Either the HCCA extension to Wi-Fi or the EDCA extension can be used with emerging power-save signaling methods to achieve the desired goal of synchronizing CBR transfers between an AP and a station without morphing Wi-Fi into a completely synchronous system.

HCCA scenario
Using HCCA polling, once a station is accepted by the AP as a polled client (using protocol handshakes not described here), the station in normal operation sleeps until the expected arrival time for a downlink poll or poll-plus-VoIP frame from the AP (Fig. 1). The station immediately responds in the mandatory time (9 µs) with an uplink VoIP data frame (or a QoS-NULL frame). The AP will respond with an ACK if the station sent uplink data.


1. HCCA polling is illustrated in this view of high-level timing.

The station must come out of sleep mode before the expected downlink poll from the AP. This occupies 0.1 to 1.0 ms depending on the hardware design. Then there will be some waiting time until the downlink poll arrives. The poll can be delayed by many factors, including interference, a long-duration frame on the channel, an internal schedule conflict within the AP (polling another station), a higher-priority operation (the AP must transmit a Beacon), a previous frame exchange that took longer than expected, or relative clock drift between the AP and the STA. All these factors will right-shift the schedule. Once the downlink poll arrives, things are predictable. The uplink/downlink frame exchanges should occur in less than 1 ms, depending on the choice of codec and PHY rate. The main sources of timing uncertainty are a right-shift of the schedule, possible retries after failure, and variable transmission times if variable PHY rates are used. This leads to a high-level estimate of 15 to 18 ms of sleep time for a 20-ms codec period, or an efficiency factor of 75% or better.
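The estimate above can be reproduced with a rough awake-time budget. The sketch below is only that: the function name and the poll-wait values of 0 and 3 ms are assumptions chosen to bracket the 15 to 18 ms range, while the 1.0-ms wake-up and 1-ms exchange figures are the upper bounds quoted in the text.

```python
def hcca_sleep_budget(period_ms, wake_up_ms, poll_wait_ms, exchange_ms):
    """Return (sleep time in ms, sleep fraction) for one codec period."""
    awake_ms = wake_up_ms + poll_wait_ms + exchange_ms
    sleep_ms = period_ms - awake_ms
    return sleep_ms, sleep_ms / period_ms


# Best case: the poll arrives exactly on schedule (no waiting).
print(hcca_sleep_budget(20.0, wake_up_ms=1.0, poll_wait_ms=0.0, exchange_ms=1.0))

# Assumed bad case: the schedule is right-shifted by 3 ms before the poll arrives.
print(hcca_sleep_budget(20.0, wake_up_ms=1.0, poll_wait_ms=3.0, exchange_ms=1.0))
```

Under these assumptions the station sleeps 18 ms (90%) when the poll is on time and 15 ms (75%) when it is 3 ms late, so the waiting time dominates the power budget.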

Some subtle effects should be mentioned: the CBR schedule, and the effects of variable PHY rates, non-uniform codec intervals, packet arrival bursts (bunching), and retransmissions. The CBR schedule is communicated from AP to station when the CBR request is accepted by the AP (via a TSPEC request). It's generally accepted that the average cellular call duration is about 100 seconds. If the AP is provisioned for 20 active calls, we can expect a call setup/teardown every 5 seconds. If we weren't worried about synchronizing the AP polling schedule with the STA, then there would be no effect on the polling schedule. However, the AP must maintain the advertised schedule with each station even though stations will be frequently entering and leaving the polling list. This means that the AP designer must maintain a schedule with fixed time slots.

In this context, a slot is a channel time period earmarked for the polled frame exchange sequence with a particular station. But unless all uplink and downlink frames are sent at the same PHY rate, taking the exact same amount of channel time, then the slots will have variable duration. That contradicts maintaining the fixed timing relationship needed for effective power-save synchronization.
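To see how strongly slot duration depends on PHY rate, the sketch below estimates the airtime of a single voice frame using the 802.11a OFDM transmit-time calculation (20 µs of preamble and SIGNAL field plus 4-µs symbols). The 240-byte frame size is a hypothetical stand-in for a voice payload plus headers, and the estimate ignores the poll, the ACK, and interframe spaces.

```python
# Data bits carried per OFDM symbol at each 802.11a PHY rate (Mbit/s).
DATA_BITS_PER_SYMBOL = {6: 24, 9: 36, 12: 48, 18: 72,
                        24: 96, 36: 144, 48: 192, 54: 216}

PREAMBLE_AND_SIGNAL_US = 20  # 16 us preamble + 4 us SIGNAL field
SYMBOL_US = 4                # duration of one OFDM symbol


def ofdm_airtime_us(frame_bytes, rate_mbps):
    """Approximate airtime (us) of one MAC frame at the given PHY rate."""
    bits = 16 + 8 * frame_bytes + 6  # SERVICE field + frame + tail bits
    symbols = -(-bits // DATA_BITS_PER_SYMBOL[rate_mbps])  # ceiling division
    return PREAMBLE_AND_SIGNAL_US + symbols * SYMBOL_US


FRAME_BYTES = 240  # hypothetical frame size, for illustration only
for rate in (6, 24, 54):
    print(rate, "Mbit/s:", ofdm_airtime_us(FRAME_BYTES, rate), "us")
```

With these assumptions the same frame occupies roughly six times more channel time at 6 Mbit/s than at 54 Mbit/s, which is why a slot sized for one rate doesn't fit another.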

Some vendors prefer to operate all the stations at a fixed rate (6 Mbits/s) to avoid this problem. But, if variable rates are used, then either the schedule must be changed and communicated reliably to each associated station—not a good idea because of extra overhead and reliability issues—or the AP must transmit something in the unused time of each slot so that non-polled stations won't fill in the dead air with Wi-Fi packets, thus further perturbing the attempt to make a synchronous schedule.

Lastly, suppose that an AP is supporting a mix of handsets using different codec intervals, a likely scenario. In this situation, it may be impossible to construct a polling schedule that doesn't include periodic and frequent timing conflicts between different CBR clients, leading to a right-shift of the schedule.
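A contrived example shows how mixed intervals can leave no conflict-free position. The sketch below assumes ten handsets polled every 20 ms in back-to-back 1-ms slots, then tries every whole-millisecond offset for one additional 30-ms handset. The slot width, station count, and function names are all assumptions for illustration.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def hyperperiod(intervals):
    """Least common multiple of the codec intervals (ms)."""
    return reduce(lambda a, b: a * b // gcd(a, b), intervals)


def count_conflicts(stations, slot_ms):
    """Count overlapping slot pairs in one hyperperiod.

    stations is a list of (interval_ms, offset_ms) with offset < interval.
    Overlaps that wrap around the end of the hyperperiod are ignored.
    """
    horizon = hyperperiod([interval for interval, _ in stations])
    conflicts = 0
    for (ia, oa), (ib, ob) in combinations(stations, 2):
        for a in range(oa, horizon, ia):
            for b in range(ob, horizon, ib):
                if abs(a - b) < slot_ms:
                    conflicts += 1
    return conflicts


# Ten 20-ms handsets occupy the first 10 ms of every 20-ms period.
base = [(20, offset) for offset in range(10)]
free = [x for x in range(30) if count_conflicts(base + [(30, x)], 1) == 0]
print("conflict-free offsets for the 30-ms handset:", free)
```

In this setup the list comes back empty: the 30-ms handset's slot alternates between two positions 10 ms apart within the 20-ms frame, so one of them always falls on an occupied slot.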

Another effect that will perturb the ideal HCCA schedule is the occasional need to send more than one downlink VoIP frame to a station. This happens when packets arrive in bunches at the AP because of Internet or routing queuing behavior. When this happens, the AP must transmit multiple downlink frames instead of the usual single frame. This will right-shift the schedule unless every conceptual slot has sufficient extra time budget. The same right-shift happens if there's an uplink retransmission. It's doubtful that the AP designer will reserve extra slot time for every station in every slot. That wastes channel time. Instead, the likely decision is to allow the schedule to shift to the right at the expense of extending the power-on time for each affected downlink station.
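The right-shift can be sketched as a simple queueing rule: a poll can't start before its nominal time, and it can't start before the previous exchange has finished. The slot length, the 800-µs overrun, and the function name below are assumptions for illustration.

```python
def actual_poll_times(nominal_us, slot_us, extra_us):
    """Return the actual poll start times (us) when some slots overrun.

    extra_us[i] is the extra channel time station i needs, for example
    a second bunched downlink frame or an uplink retransmission.
    """
    actual = []
    channel_free_at = 0
    for nominal, extra in zip(nominal_us, extra_us):
        start = max(nominal, channel_free_at)
        actual.append(start)
        channel_free_at = start + slot_us + extra
    return actual


nominal = [0, 1000, 2000, 3000, 6000]  # last station follows an idle gap
extra = [0, 800, 0, 0, 0]              # station 1 overruns by 800 us
actual = actual_poll_times(nominal, 1000, extra)
print([a - n for a, n in zip(actual, nominal)])  # per-station delay in us
```

Here the two stations scheduled directly behind the overrun each wait an extra 800 µs with their radios powered on, while the station after the idle gap is unaffected because the gap absorbs the shift.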

The HCCA approach can be characterized as an N-body synchronization scheme, whereby the AP sets up a polling schedule for N stations that attempt to stay synchronized with the AP in spite of schedule perturbations. It's fair to characterize this as an N-body problem because timing anomalies with any station on the polling list, for either uplink or downlink traffic, affect the timing of the other N-1 participants. The timing interdependence of polling must be compared to the timing independence of the power-save methods applied to EDCA, which is the other QoS extension to Wi-Fi defined within WMM.

EDCA scenario
The EDCA access method provides for prioritized channel access. Each station will select from four sets of priority-controlling parameters for best-effort packets (normal), background packets (very low priority), video traffic, and voice traffic (highest priority). The AP has the same set of priority-controlling parameters, but may also transmit one time period earlier than any station. This is the secret behind HCCA polling (the AP can always transmit before any station) as well as one of the important principles in EDCA (the AP always wins contention for the channel).
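One of the priority-controlling parameters is the arbitration interframe space: how long a sender must see the channel idle before it may contend. The sketch below assumes 802.11a timing (16-µs SIFS, 9-µs slot) and the commonly cited default AIFSN values for stations; treat these numbers as assumptions to check against the specification. It models the AP's head start as an AIFSN one slot smaller than the best station value.

```python
SIFS_US = 16  # 802.11a short interframe space (assumed)
SLOT_US = 9   # 802.11a slot time (assumed)

# Commonly cited default AIFSN per access category for stations (assumed).
STATION_AIFSN = {"voice": 2, "video": 2, "best_effort": 3, "background": 7}


def aifs_us(aifsn):
    """Idle time (us) required before contention can begin."""
    return SIFS_US + aifsn * SLOT_US


for category, aifsn in STATION_AIFSN.items():
    print(f"station {category}: {aifs_us(aifsn)} us")

# The AP may go one slot earlier than the most aggressive station setting.
ap_aifsn = min(STATION_AIFSN.values()) - 1
print(f"AP earliest access: {aifs_us(ap_aifsn)} us")
```

Under these assumptions a station's voice queue waits 34 µs and the AP only 25 µs, so the AP can always seize an idle channel first.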

Unlike HCCA, where the station must slave itself to the AP's polling schedule, the EDCA station may operate in a specialized power-save mode called unscheduled APSD, or UPSD. In this mode, the station sleeps until it has a VoIP frame ready to transmit (Fig. 2). The AP is expecting this behavior because of a prior signaling handshake conducted between the station and AP.


2. Pictured is the EDCA station's specialized power-save mode, called Unscheduled APSD, or UPSD.

The power-up procedure at the station can occur with perfect timing, i.e., there's no schedule right-shift, no waiting for the poll, and no timing effects caused by other stations or conflicting schedules. The station comes to full power and transmits the VoIP frame using the highest priority parameters available to it. It's reasonable to expect the uplink frame to launch with less than 2 ms of power-consuming delay. It's assumed that the AP is configured to avoid long bursts or other behavior that would increase this delay. This assumption must be made for both HCCA and EDCA. Without it, either scheme will experience a schedule right-shift or a delay in the UPSD frame exchange.

The uplink frame is ACKed by the AP. The station can retransmit if necessary, and will stay awake until the AP sends down a VoIP frame or a null indication (meaning that there are no VoIP packets to send). Implementations on conventional AP hardware show that the turnaround time at the AP can be bounded to values less than 100 µs, and improvements to this value should be expected. The lesson is that the effort to add this functionality to an off-the-shelf AP is minimal, especially when compared to the many complexities of maintaining a CBR polling schedule.

As a practical matter, the UPSD scheme will have about the same 75% power-save efficiency for a 20-ms CBR scenario as the HCCA scheme. With a 30-ms CBR interval, the efficiency should improve to 83%. The principal difference is that the N-body synchronization problem is traded for a one-body exercise.
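Both percentages follow from the same awake time per codec interval. The sketch below assumes about 5 ms awake per interval, which is the figure implied by 75% at 20 ms; it is an inference, not a measured value.

```python
def power_save_efficiency(interval_ms, awake_ms):
    """Fraction of each codec interval the radio can spend asleep."""
    return (interval_ms - awake_ms) / interval_ms


AWAKE_MS = 5.0  # assumed awake time per interval, implied by 75% at 20 ms
for interval in (20.0, 30.0):
    efficiency = power_save_efficiency(interval, AWAKE_MS)
    print(f"{interval:.0f}-ms interval: {efficiency:.0%}")
```

The awake cost is fixed per exchange, so longer codec intervals spread it over more sleep time.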

Complexity arguments favor the EDCA scheme, especially because the power-save efficiency of the two approaches is similar. Two other points of comparison between HCCA and EDCA should be mentioned although they're only indirectly related to power-save procedures. These are the hidden node problem and AP interference.

Hidden nodes are stations that can receive frames from an AP, but may not be able to receive frames from all other stations associated with the AP. Without any mitigating procedures, hidden nodes violate the basic premise of CSMA. That is, the stations should sense the medium before transmitting. If sensing is imperfect because they're hidden, they'll create interference when transmitting.
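The definition can be illustrated with a deliberately simple model in which every radio reaches a fixed distance and nothing else matters. Real propagation is far messier; the coordinates, range, and function name below are assumptions for illustration.

```python
from itertools import combinations
from math import dist  # Python 3.8+


def hidden_pairs(ap, stations, radio_range):
    """Return station pairs that both reach the AP but not each other.

    stations maps a station name to its (x, y) position.
    """
    in_cell = {name: pos for name, pos in stations.items()
               if dist(ap, pos) <= radio_range}
    return [(a, b)
            for (a, pos_a), (b, pos_b) in combinations(in_cell.items(), 2)
            if dist(pos_a, pos_b) > radio_range]


# A and B sit on opposite sides of the AP; C is near the middle.
stations = {"A": (-80, 0), "B": (80, 0), "C": (0, 50)}
print(hidden_pairs((0, 0), stations, radio_range=100))
```

In this layout A and B are each within range of the AP and of C, but they are 160 units apart, so neither can sense the other's transmissions before sending.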

Fortunately, EDCA is robust in the presence of interference and collisions from all sources including hidden nodes. Also, hidden nodes are rarely a factor in small or enterprise environments where there's good AP coverage. In outdoor environments, directional antennas can eliminate most of the hidden nodes.

It's often claimed that HCCA is immune to hidden node effects because stations transmit only when polled, thus avoiding collisions. This is true. But the polled stations must also be EDCA stations to communicate with the AP for non-CBR traffic. Thus, a hidden node remains hidden to some extent with HCCA. It's also true that HCCA APs and a CBR polling discipline are particularly vulnerable to interference from nearby APs and other stations. Any appreciable interference rate will right-shift the schedule whereas EDCA will adapt without effort. The bottom line is that the one-body solution with EDCA/UPSD is preferred to an N-body problem with HCCA.

About the author
Dr. Greg Chesson is the director of protocol engineering at Atheros Communications. Chesson earned degrees in computer science at the University of Illinois. He can be reached at [email protected].
