Building Voice over IP

May 8, 2000

by Philip Carden

 

A migration strategy

Moving straight to a full-blown IP telephony solution essentially means loading up your existing voice equipment and moving it out with a forklift, while probably at the same time upgrading your cabling system to ensure there are enough cat-5 outlets available. The sudden equipment and cabling obsolescence combined with the need for retraining of staff quite possibly make this approach a difficult sell to senior management (particularly since the immediate benefits of an IP telephone handset over a traditional phone may not be entirely obvious). There is an alternative approach that can be taken: IP-enabled PBXs.

Rather than replace the cabling and handsets, just upgrade or replace the PBX so that it speaks IP telephony to the outside world (and makes each of the attached phones appear to the outside world like an IP telephony endpoint). The solution would look like that shown in the diagram below.

First of all, check with your existing PBX vendor to determine whether a suitable upgrade is available. Both Lucent and Nortel will shortly be providing such support for their Definity and Meridian PBXs, respectively.

The other alternative is simply to replace your existing PBX with a pure IP PBX or "iPBX." This approach tends to be better suited to smaller office environments. Some models only support IP telephony, while others support both IP telephony and direct PSTN connections (with intelligent routing between the two). Some models are PC-based (typically running on Windows NT), while others are stand-alone units. Representative vendors of this class of product include AltiServ, NetPhone, ShoreTelecom and Vertical Networks.

 

Encoding schemes

When you speak, you cause air molecules to vibrate — that’s how sound is transported. If you were to plot the displacement of air molecules versus time you would be drawing the waveform of human speech, which might look like the graph below.

 

The function of the microphone in the telephone mouthpiece is to convert this waveform to an electrical signal. The electrical signal, if plotted against the same time scale would look the same. This electrical signal is an analog signal — at any one time, the signal may have any value between the top and bottom peak values (as opposed to a digital signal, which is generated using only two signal levels representing the binary digits zero and one). In order to transport this analog voice signal over a digital network it is necessary to convert the analog signal to a digital data stream of ones and zeros. The process used to do this is called voice encoding (and the device or software program used for encoding and decoding is called a codec).

There are two basic approaches you can take to encoding. The first is to sample the signal strength itself at a rate higher than the frequency of the signal, as shown in the following diagram.

Sampling theory tells us that in order to reproduce the original signal from a digital sample we must sample at a rate at least twice the maximum frequency represented in the underlying signal. Since the human voice as carried by the telephone network is made up of frequencies in the range 300Hz to a bit under 4,000Hz, we can use a sampling rate of 8,000 times per second. If for each sample we use 8 bits to represent the signal strength then we’ll need a bandwidth of 8 bits, 8,000 times per second, or 64kbps. Such an approach is called Pulse Code Modulation (PCM) and is the most widely used method to transport voice on today’s digital public telephone networks. This sampling approach can be refined by doing some additional processing. Adaptive Differential Pulse Code Modulation (ADPCM), for example, predicts what the next value will be based on previous values, then sends only the difference between the actual and predicted values. Since the difference is smaller than the signal, fewer bits are required for transport. PCM and ADPCM are used in ITU standards G.711 and G.726, respectively.
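The arithmetic and the "send only the difference" idea can be sketched in a few lines of Python. This is an illustration only: the function names are made up for this example, and the toy differential coder simply predicts that each sample equals the previous one, whereas real ADPCM (G.726) uses an adaptive predictor and quantizer.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample):
    """Bit rate, in bits per second, of uncompressed PCM."""
    return sample_rate_hz * bits_per_sample


def delta_encode(samples):
    """Toy differential coder: predict 'same as the last sample' and
    emit only the prediction error. Not the G.726 algorithm."""
    previous = 0
    differences = []
    for sample in samples:
        differences.append(sample - previous)
        previous = sample
    return differences


def delta_decode(differences):
    """Rebuild the original samples by accumulating the differences."""
    previous = 0
    samples = []
    for difference in differences:
        previous += difference
        samples.append(previous)
    return samples


if __name__ == "__main__":
    print(pcm_bit_rate(8000, 8))  # 64000 bits per second, i.e. 64kbps

    signal = [100, 104, 109, 111, 110, 106]  # made-up sample values
    differences = delta_encode(signal)
    print(differences)  # [100, 4, 5, 2, -1, -4]
    print(delta_decode(differences) == signal)  # True
```

After the first value, every difference in this made-up signal fits in far fewer bits than the samples themselves, which is the effect ADPCM exploits.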

The second approach to encoding is to split the voice signal up into substantially larger chunks, which represent whole, recognizable sounds used in human speech. This is the approach used for Code Excited Linear Prediction (CELP) and its variants; prevalent examples include MP-MLQ/ACELP (ITU standard G.723.1) and CS-ACELP (G.729).

So which encoding scheme should you go for? The standard by which telephone voice quality is measured is so-called "toll quality," which in effect means the quality delivered by PCM (G.711), the encoding predominantly used in today’s phone networks and what you are used to hearing on the public telephone network in developed countries. The great thing about PCM is that the algorithm itself is pretty straightforward, so not much processing power is required, which means high performance and relatively inexpensive encoding equipment. The downside of PCM is that it uses up a whole 64kbps for each voice circuit: not so bad if you’re a carrier with bandwidth to burn, but less than optimal if you’re a corporate customer being charged heavily for that bandwidth. CELP makes a big difference in bandwidth requirements because it sends a compact description of each chunk of speech (codebook indices and filter parameters) rather than every sample, so "near toll quality" can be achieved with 8kbps. In order to do that CELP does a lot more processing than PCM, which originally meant substantially higher costs and lower throughput. All that said, 8kbps is eight times better use of bandwidth than PCM and four times better than ADPCM (which realistically needs 32kbps for near toll quality). For that reason, you’ll generally want to look for G.723.1 or G.729 encoding, the two standard CELP implementations.

If you’re calculating how much bandwidth you’ll need for a particular number of voice circuits, you’ll also need to consider packet overhead. Rule of thumb? Triple the bandwidth requirement. I know that sounds pretty extreme, but here’s the rationale: Voice packets need to be kept relatively small to minimize delay effects. If you assume the use of a G.729 codec, two frames of 10ms each will go into a single packet with a 20-byte payload. But the PPP/IP/UDP/RTP header is 49 bytes, which is not a very efficient arrangement. There are header compression schemes available (RFC 2508 for RTP compression and RFC 2393 for compression of all headers except IP), but these are not widely implemented, and RFC 2508 won’t operate across an IPsec VPN. Unless you’re implementing MPLS (discussed later) you’re just going to have to put up with this overhead for now.
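To see where the rule of thumb comes from, the figures quoted above (20 bytes of voice per packet, 49 bytes of header, one packet every 20ms) can be plugged into a short calculation. The function name and the ten-circuit example are assumptions made for illustration.

```python
def voip_bandwidth(payload_bytes, header_bytes, packet_interval_ms):
    """Return (payload_bps, total_bps, overhead_ratio) for one voice stream.

    The rates cover one direction of a call; a two-way conversation
    needs the same amount again in the other direction.
    """
    packets_per_second = 1000 / packet_interval_ms
    payload_bps = payload_bytes * 8 * packets_per_second
    total_bps = (payload_bytes + header_bytes) * 8 * packets_per_second
    return payload_bps, total_bps, total_bps / payload_bps


if __name__ == "__main__":
    # Figures from the text: G.729, two 10ms frames per packet,
    # 20-byte payload, 49-byte PPP/IP/UDP/RTP header.
    payload_bps, total_bps, ratio = voip_bandwidth(20, 49, 20)
    print(payload_bps)  # 8000.0 bits per second of actual voice
    print(total_bps)    # 27600.0 bits per second on the wire
    print(round(ratio, 2))

    circuits = 10  # hypothetical number of simultaneous calls
    print(circuits * total_bps / 1000, "kbps in each direction")
```

With these inputs each 8kbps voice stream occupies 27.6kbps on the wire, a little more than three times the codec rate.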

Quality of Service

Encoding is not the only driver of voice quality. The human ear is very sensitive to even minor changes in an audio signal (interestingly, the eye is much less sensitive to imperfect video). What this means is that if a signal is to be packetized, the packets must arrive predictably with minimal delay (a specific Quality of Service or QOS). These requirements do not apply to data networking, which is pretty tolerant of variable network performance; even heavily transactional data applications won’t be affected by the odd 500ms delay. Unfortunately, IP was originally designed as a data networking protocol, so until recently IP networks offered little in the way of built-in Quality of Service.

There are several approaches that can be taken to assure Quality of Service:

Data Link Layer QOS. ATM has well-known, built-in quality-of-service capabilities, and with the adoption last year of 802.1Q VLAN tags even Ethernet can provide eight levels of prioritization tagging for each frame. The problem with data link approaches is that they only really work if the whole data path is based on the same data link layer. If your Ethernet LANs are interconnected by Frame Relay, the 802.1Q tags will do little good since they won’t be passed.
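As a rough sketch of where the Ethernet priority lives, the four-byte 802.1Q tag can be built and parsed as follows. The layout used here (a 0x8100 type value followed by 3 priority bits, 1 flag bit and a 12-bit VLAN ID) is the standard tag format; the function names and the example values are assumptions made for illustration.

```python
import struct

TPID_8021Q = 0x8100  # type value that identifies a tagged frame


def build_vlan_tag(priority, vlan_id, flag_bit=False):
    """Pack the four-byte 802.1Q tag: TPID, then priority/flag/VLAN ID."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits (0-7)")
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits (0-4095)")
    tag_control = (priority << 13) | (int(flag_bit) << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tag_control)


def parse_vlan_tag(tag):
    """Return (priority, flag_bit, vlan_id) from a four-byte tag."""
    tpid, tag_control = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return tag_control >> 13, bool(tag_control & 0x1000), tag_control & 0x0FFF


if __name__ == "__main__":
    tag = build_vlan_tag(priority=5, vlan_id=100)  # hypothetical voice VLAN
    print(tag.hex())            # 8100a064
    print(parse_vlan_tag(tag))  # (5, False, 100)
```

Because the tag is part of the Ethernet frame rather than the IP packet, it disappears as soon as the packet leaves Ethernet, which is the limitation described above.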

Type of Service (TOS). Part of every IPv4 header is the TOS byte, which includes three bits allocated to priority (therefore eight levels, the same as 802.1Q) and another four bits used to define the type of service. For this to translate into something resembling a QOS mechanism, two things need to happen. Firstly, your router network needs to have the functionality to recognize TOS fields and provide different classes of service based on them (either automatically or manually, using filters). Secondly, the TOS bits need to be set, either by the IP end system (e.g., our VoIP relay) or by the access router detecting the traffic type and setting the TOS bits.
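A minimal sketch of how an end system might compose the TOS byte and attach it to its traffic is shown below. The bit positions follow the traditional layout (priority in the top three bits, the four type-of-service flags below them, lowest bit unused); the function names are invented for this example, and whether the operating system and the routers along the path actually honor the value is an assumption that has to be checked in each network.

```python
import socket

# The four type-of-service flags (traditional TOS definition).
TOS_LOW_DELAY = 0b1000
TOS_HIGH_THROUGHPUT = 0b0100
TOS_HIGH_RELIABILITY = 0b0010
TOS_LOW_COST = 0b0001


def build_tos_byte(priority, tos_flags=0):
    """Combine a 3-bit priority and the 4-bit TOS flags into one byte."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits (0-7)")
    if not 0 <= tos_flags <= 0b1111:
        raise ValueError("TOS flags must fit in 4 bits")
    return (priority << 5) | (tos_flags << 1)


def split_tos_byte(tos_byte):
    """Return (priority, tos_flags) from a TOS byte."""
    return tos_byte >> 5, (tos_byte >> 1) & 0b1111


def mark_socket(sock, tos_byte):
    """Ask the OS to stamp outgoing packets on this socket with tos_byte.
    Some platforms ignore or restrict this option."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_byte)


if __name__ == "__main__":
    voice_tos = build_tos_byte(priority=5, tos_flags=TOS_LOW_DELAY)
    print(hex(voice_tos))             # 0xb0
    print(split_tos_byte(voice_tos))  # (5, 8)
```

An application would call mark_socket on the UDP socket carrying its voice packets; the priority value of 5 used here is only an example.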

Resource Reservation Protocol (RSVP). The way that RSVP works is that at the start of a session (or voice conversation) control packets are first sent through the network to reserve resources for the connection. If appropriate resources are not available (e.g., there’s insufficient bandwidth available), then the connection is rejected. This approach can provide very strong QOS assurance, but it does not necessarily scale well. It is therefore suitable for intra-corporation requirements (like our toll bypass scenario), but it is strategically unsuitable for a world in which a large proportion of inter-business phone calls are placed via IP.
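The accept-or-reject behavior can be illustrated with a toy admission-control model. This is not the RSVP message exchange itself; the Link class, the function names and the link capacities are all hypothetical, and the 27,600bps per call is simply an example rate.

```python
class Link:
    """One hop on the path, with a fixed amount of reservable bandwidth."""

    def __init__(self, name, capacity_bps):
        self.name = name
        self.capacity_bps = capacity_bps
        self.reserved_bps = 0

    def available_bps(self):
        return self.capacity_bps - self.reserved_bps


def reserve_path(path, rate_bps):
    """All-or-nothing reservation: succeed only if every hop has room."""
    if any(link.available_bps() < rate_bps for link in path):
        return False  # call rejected, nothing is reserved
    for link in path:
        link.reserved_bps += rate_bps
    return True


def release_path(path, rate_bps):
    """Give the bandwidth back when the call ends."""
    for link in path:
        link.reserved_bps -= rate_bps


if __name__ == "__main__":
    lan = Link("office LAN", 10_000_000)
    wan = Link("WAN circuit", 64_000)
    path = [lan, wan]

    call_rate = 27_600  # example per-call rate in bits per second
    print(reserve_path(path, call_rate))  # True
    print(reserve_path(path, call_rate))  # True
    print(reserve_path(path, call_rate))  # False: the WAN circuit is full
    release_path(path, call_rate)
    print(reserve_path(path, call_rate))  # True again after a call ends
```

Every hop has to keep this kind of per-call state, which gives a feel for why the approach becomes harder to scale as the number of calls and networks grows.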

DiffServ. The Differentiated Services working group has refined the use of the TOS byte so that per-hop behaviors can be requested by a sender. This approach focuses on providing classes of service that the network makes available and which applications can choose to use, contrasted with RSVP in which the application dictates its requirements. DiffServ is less complex than RSVP and better suited to meeting long-term, Internet-scale QOS requirements.
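As a sketch of how DiffServ reuses the TOS byte: the top six bits become a code point that selects the per-hop behavior. The code point values below (Expedited Forwarding, the Assured Forwarding classes and the class selectors) are the standard ones; the helper names are made up for this example.

```python
EXPEDITED_FORWARDING = 0b101110  # code point 46, intended for low-delay traffic


def class_selector(priority):
    """Code point that stays compatible with the older 3-bit priority."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must be 0-7")
    return priority << 3


def assured_forwarding(af_class, drop_precedence):
    """Code point for Assured Forwarding class 1-4, drop precedence 1-3."""
    if not 1 <= af_class <= 4 or not 1 <= drop_precedence <= 3:
        raise ValueError("AF class is 1-4 and drop precedence is 1-3")
    return (af_class << 3) | (drop_precedence << 1)


def code_point_to_tos_byte(code_point):
    """Place a 6-bit code point in the top six bits of the TOS byte."""
    if not 0 <= code_point <= 63:
        raise ValueError("code point must fit in 6 bits")
    return code_point << 2


if __name__ == "__main__":
    print(code_point_to_tos_byte(EXPEDITED_FORWARDING))  # 184 (0xb8)
    print(assured_forwarding(4, 1))                      # 34, known as AF41
    print(code_point_to_tos_byte(class_selector(5)))     # 160
```

The last line shows the backward compatibility: class selector 5 produces the same byte that a priority of 5 with no type-of-service flags would under the older TOS interpretation.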

Multi-Protocol Label Switching (MPLS). MPLS comes from another working group, one looking at how to improve network layer performance through switching of packet labels. In an MPLS network, the IP datagram header is replaced (at the access router) with a much shorter (13-byte) label, which (apart from speeding switching performance) can be used to identify the class of service requirement at the network ingress point so that intermediate nodes can prioritize traffic appropriately. MPLS also provides for routes to be chosen for a particular stream in response to the QOS required for that stream.

From a design perspective there are two different issues you need to consider when establishing how to provide the required QOS. First of all, clearly you’ll need to establish the capabilities of your router infrastructure and what, if any, software upgrades would be required to support appropriate QOS capabilities. Secondly, you’ll need to make sure that the IP telephony end systems can also interwork with the router QOS mechanisms. The simplest way for this to work is for the end systems to indicate the class of service they require by setting the TOS byte (either using the traditional settings or the new uses recommended by DiffServ) and having the routers automatically detect this setting and allocate resources accordingly. Alternatively, under RSVP, the application will need to be capable of making resource requests.

Philip Carden is a managing analyst with no-8.capital, an e-business and telecommunications investment firm. He has written numerous feature articles on Internet and telecommunications subjects and has contributed to two books on Internet security. He can be reached at pcarden@no-8.com.
