It's Time To Take a Look At SIP

Session Initiation Protocol is finally here, and it works. We examine how this versatile signaling protocol can help you set up collaborative multimedia conferencing and voice-enabled e-commerce.

April 14, 2003


SIP, like HTTP, is versatile and simple to use. It can set up collaborative multimedia conferencing and voice-enabled e-commerce. SIP is expected to become the norm for VoIP implementations within a couple of years, though today just about all enterprise VoIP vendors would rather keep you locked into their proprietary signaling solutions. The IETF published the first version of SIP, RFC 2543, in 1999 and the most recent version, RFC 3261, in June 2002.

SIP is ideal for VoIP, where a session over the Internet replaces the traditional end-to-end circuit for a voice call in a legacy network. The ITU's H.323 multimedia standard, as well as some vendor-proprietary VoIP phones, also do this. VoIP vendors that built products before SIP emerged have adopted H.323. But SIP is simpler to implement than H.323 and is a lighter-weight protocol with less overhead.

SIP is more than a standards-based replacement for legacy phone connections, though. It makes it easier to implement advanced multimedia services, such as presence, which allows you to determine instantly whether a user can and wants to receive a call on a specific phone, as well as over video and instant messaging sessions. It also lets you ring multiple destinations in a VoIP call.

And SIP is making commercial inroads. Microsoft's WinMessenger IM program, which comes packaged with its XP OS, is based on SIP. WinMessenger also uses SIP to make Internet phone calls. Future 3G wireless WANs, too, will use SIP for setting up and tearing down calls.

Still, there are plenty of misconceptions about what SIP can actually do. SIP does not, for instance, transport digitized voice. That's the job of the RTP (Real-Time Transport Protocol), which transports voice after SIP establishes the call. And before SIP can set up voice, text-messaging or video sessions using various codecs and techniques, you need to determine what features the devices in the session support. That's where the Session Description Protocol comes in: SIP relies on SDP to negotiate the capabilities between two endpoints in a potential conversation.

You're Invited

SIP can use UDP (User Datagram Protocol) or TCP as a transport, but by default it uses UDP on Port 5060. If a SIP packet is dropped by an unreliable protocol like UDP, SIP retransmits its command once it decides it has waited long enough for a response.
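To make the retransmission behavior concrete, here is a minimal sketch of when an unanswered invite would be resent over UDP. It assumes the default timer values as we read them in RFC 3261 (a 500-ms starting interval that doubles each time, with the sender giving up after 64 times that interval); the function name is ours, not part of any SIP stack.

```python
# Sketch only: models the timing of INVITE retransmissions over UDP.
# T1 = 0.5 s and the 64*T1 cutoff are assumed RFC 3261 defaults;
# invite_send_times is a hypothetical helper name.

def invite_send_times(t1=0.5):
    """Return (send_times, give_up_time) in seconds from the first send."""
    give_up = 64 * t1
    times = []
    now, interval = 0.0, t1
    while now < give_up:
        times.append(now)      # the request goes out (again) at this moment
        now += interval        # wait for a response...
        interval *= 2          # ...and wait twice as long next time
    return times, give_up

times, give_up = invite_send_times()
print(times)    # [0.0, 0.5, 1.5, 3.5, 7.5, 15.5, 31.5]
print(give_up)  # 32.0
```

Over TCP none of this is needed, because the transport itself handles lost packets.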

The most common command SIP sends to another endpoint is the "invite" command. When a SIP phone or UA (user agent) wants to connect to another SIP phone or UA, it sends an invite. If the invite is successful, the originator receives a "200" response, which means everything is OK and the session is established.

Like HTTP and SMTP, SIP is in plain text, which makes it easier to parse the commands. And any protocol analyzer can show the actual commands and responses in a simple ASCII translation.

Along with the invite, a SIP header contains a "to" and "from" address similar to that in an e-mail message. Each of these addresses is called a URI (Uniform Resource Identifier) and looks like an e-mail address:

sip:[email protected]

The "to" field in the URI can contain a standard phone number. The SIP header also contains a "call ID," which is a unique number that identifies the SIP transaction, and a "via" field, which tells the UA which IP address to use for sending its response when it's negotiating the initial connection.
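Because SIP is plain text, the header fields described above are easy to build and pick apart. The following is a simplified sketch, not a complete message: it leaves out fields and parameters a real stack must send (such as Max-Forwards and the tag and branch parameters), and the example.com addresses, the call ID value and the helper names are placeholders we made up.

```python
# Sketch only: a stripped-down invite request and a naive header parser.
# build_invite and parse_headers are hypothetical helpers; all addresses
# are placeholders.

def build_invite(to_uri, from_uri, call_id, via_host, contact_uri):
    lines = [
        f"INVITE {to_uri} SIP/2.0",          # request line: command, target, version
        f"Via: SIP/2.0/UDP {via_host}:5060",  # where responses should be sent
        f"To: <{to_uri}>",
        f"From: <{from_uri}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <{contact_uri}>",          # direct address for later requests
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"

def parse_headers(message):
    """Split a message into its first line and a dict of header fields."""
    head = message.split("\r\n\r\n", 1)[0]
    first_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return first_line, headers

msg = build_invite("sip:bob@example.com", "sip:alice@example.com",
                   "a84b4c76e66710", "192.0.2.10", "sip:alice@192.0.2.10")
first_line, headers = parse_headers(msg)
print(first_line)
print(headers["via"])
```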

Once the session is established, the "contact" field--the UA's IP address--is used. That's the destination the recipient UA uses to talk to the originating UA. When NAT (network address translation) is deployed, the endpoint uses an unroutable NAT address inserted in the SIP layer as the return address. But SIP vendors can use various work-arounds: A SIP device can use the IP packet's source address, for instance, rather than the IP address that appears in the SIP header. A SIP-aware firewall can use NAT to change the IP address in the SIP header.
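One of the work-arounds above, preferring the packet's source address over the address written in the SIP header, can be sketched in a few lines. This is one possible policy under our own assumptions (fall back to the packet source only when the header address is private), not how any particular vendor does it; the function name and addresses are illustrative.

```python
# Sketch only: choose where to send a reply when NAT may be in the path.
# reply_address is a hypothetical helper; addresses are examples.
import ipaddress

def reply_address(header_ip, packet_source_ip):
    """Use the address seen on the wire if the header carries an
    unroutable (private) address; otherwise trust the header."""
    if ipaddress.ip_address(header_ip).is_private:
        return packet_source_ip
    return header_ip

# The header says 10.0.0.5, but the packet arrived from the NAT's public side.
print(reply_address("10.0.0.5", "203.0.113.7"))
```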

The invite request uses SDP syntax to tell the UA the caller's media capabilities. When the called party answers, it replies with the OK message, which also includes the supported media capabilities.

Some SIP phones can call each other directly, but if you want scalability, you need servers. SIP servers keep track of directory information and the called parties' locations. There are several SIP servers, all of which can run on the same server or on their own hardware platforms.
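To illustrate the capability exchange that the invite and the OK message carry in SDP, here is a rough sketch that reads the audio line from two SDP bodies and finds the codecs both sides support. The SDP text is a made-up example, the helper names are ours, and only the audio line is examined; the payload numbers 0, 8 and 18 are the standard RTP types for G.711 mu-law, G.711 A-law and G.729.

```python
# Sketch only: compare the audio codecs listed in an SDP offer and answer.
# OFFER, ANSWER and the helper names are illustrative, not from a real call.

OFFER = "\r\n".join([
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 10.0.0.5",
    "s=-",
    "c=IN IP4 10.0.0.5",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0 8 18",   # caller supports payload types 0, 8, 18
]) + "\r\n"

ANSWER = "\r\n".join([
    "v=0",
    "o=bob 2890844730 2890844730 IN IP4 10.0.0.9",
    "s=-",
    "c=IN IP4 10.0.0.9",
    "t=0 0",
    "m=audio 3456 RTP/AVP 18 0",      # called party supports 18 and 0
]) + "\r\n"

def audio_payloads(sdp):
    """Return the RTP payload type numbers on the first m=audio line."""
    for line in sdp.splitlines():
        if line.startswith("m=audio "):
            # m=audio <port> <proto> <payload> <payload> ...
            return [int(pt) for pt in line.split()[3:]]
    return []

def common_payloads(offer, answer):
    """Payload types both sides list, in the caller's order of preference."""
    answered = set(audio_payloads(answer))
    return [pt for pt in audio_payloads(offer) if pt in answered]

print(common_payloads(OFFER, ANSWER))  # [0, 18]
```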

The SIP proxy server handles the SIP phone's or UA's requests; it tries to initiate a connection to the recipient on behalf of the originator and stays in the loop until it receives a 200 OK response. The proxy server places its own IP address in the "via" field so the destination client knows where to send its response. Then the proxy passes that response back to the originator. The address in the "contact" field is used for direct communication between UAs (see "Can You Hear Me Now?").

Once a proxy server passes on an invite request, it immediately sends back a status message called "100," or "trying." That notifies the caller that it's working on the invite request. After the proxy server locates the destination UA and sends it the invite, it sends a 180 "ringing" message to the sender. When the recipient answers, it passes the 200 response to the proxy server, which then sends the message to the originating UA.

The originator then sends an "ack" request to the destination client, letting the client know that it received the 200. All further communication happens between the clients: RTP takes over, transporting digitized audio between the UAs. When the call is completed, a "bye" message is sent to the other UA, which replies with another 200.
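The message sequence just described, as seen from the calling UA, can be written down as a small state table. This is a teaching sketch: the state names are ours and much simpler than the transaction state machines in the SIP specification, and it covers only the successful path.

```python
# Sketch only: the caller's view of a successful call, one event at a time.
# State names and run_call are hypothetical simplifications.

TRANSITIONS = {
    ("idle", "send INVITE"): "calling",
    ("calling", "recv 100"): "calling",       # proxy says it's trying
    ("calling", "recv 180"): "ringing",       # destination phone is ringing
    ("ringing", "recv 200"): "answered",      # called party picked up
    ("answered", "send ACK"): "in call",      # RTP audio flows from here on
    ("in call", "send BYE"): "hanging up",
    ("hanging up", "recv 200"): "idle",       # other UA confirms the bye
}

def run_call(events, state="idle"):
    """Walk through the events and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not expected while {state!r}")
        state = TRANSITIONS[key]
    return state

setup = ["send INVITE", "recv 100", "recv 180", "recv 200", "send ACK"]
print(run_call(setup))                                # in call
print(run_call(setup + ["send BYE", "recv 200"]))     # idle
```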

Although a proxy server, by default, drops out of the loop once a call is completed, it can be configured to stick around. It works like this: The proxy server inserts its own address into the "contact" field when it communicates with the UA, forcing the UA to send the rest of the responses back to the proxy server instead of the originating UA. Keeping the proxy involved during the call lets you enable call-detail recording, where you can track the duration of calls. It also lets you hide network details about endpoints for security purposes.

A request and call setup also can pass through multiple proxy servers. If one proxy server can't connect to a destination UA, for example, it can forward the request to another proxy server, which then locates, or attempts to locate, the UA.

There's also a redirect server, which can receive requests from a UA or a proxy server. The redirect server doesn't make the connection itself--it merely replies with information on where to retry the original request. It's a quick way to give the UA information it needs without adding a large processing burden to the proxy server.
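The redirect server's role can be sketched as a lookup that answers "try over there" and nothing more. In this sketch the lookup table, the addresses and the function names are invented for illustration; 302 and 404 are the SIP status codes for "moved temporarily" and "not found."

```python
# Sketch only: a redirect server hands back a new address; the UA retries.
# LOCATIONS, redirect_server and resolve_target are hypothetical.

LOCATIONS = {"sip:bob@example.com": "sip:bob@branch.example.net"}

def redirect_server(request_uri):
    """Reply with a status code and, if known, where to retry."""
    new_uri = LOCATIONS.get(request_uri)
    if new_uri is None:
        return 404, None
    return 302, new_uri

def resolve_target(request_uri, max_hops=3):
    """UA side: follow redirects (up to a limit) before sending the invite."""
    target = request_uri
    for _ in range(max_hops):
        status, new_uri = redirect_server(target)
        if status != 302:
            break
        target = new_uri
    return target

print(resolve_target("sip:bob@example.com"))  # sip:bob@branch.example.net
```

Note that the redirect server never touches the invite itself; the retry is the requester's job.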

Can You Hear Me Now?

A SIP gateway server translates between the VoIP world and the PSTN so you can make calls outside your organization. The gateway can provide a legacy connection to an internal PBX, for instance, or go directly to the PSTN. It translates the SIP signaling into something the legacy telephone understands, and has a codec that converts the RTP traffic so it can be sent from a legacy voice circuit and vice versa.

Most VoIP calls between organizations will go over the PSTN because there is no global directory service for the Internet. In the future, a protocol called ENUM (E.164 Number Mapping, RFC 2916) may use DNS to provide an Internet-based directory service that tracks phone numbers for calling among different organizations or companies.
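The core of ENUM is a simple transformation: strip a phone number down to its digits, reverse them, separate them with dots and append the e164.arpa domain, which can then be looked up in DNS. A minimal sketch follows; the function name and the phone number are our own examples, and the DNS query itself is not shown.

```python
# Sketch only: turn an E.164 phone number into the DNS name ENUM would query.
# enum_domain is a hypothetical helper; the number is fictional.
import string

def enum_domain(number):
    digits = [c for c in number if c in string.digits]   # drop "+", "-", spaces
    return ".".join(reversed(digits)) + ".e164.arpa"

print(enum_domain("+1-315-555-0100"))  # 0.0.1.0.5.5.5.5.1.3.1.e164.arpa
```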

A registrar server is required if you use proxy servers; it records where each user can currently be reached. For example, when a telecommuter plugs in a VoIP phone, the device automatically registers the telecommuter's location with the registrar server. You can register multiple phones--a SIP phone, mobile phone or even a legacy phone--with the registrar server as long as there is a SIP gateway available when a call is made.

SIP's support for mobility is one of its biggest draws. When a SIP proxy server receives an invite request, for instance, it can contact each registered device sequentially or all devices simultaneously. The registrar server stores the location information for registered users in a location server, which can reside on the same physical server. SIP proxy servers consult the location server to find a UA.
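Here is a rough sketch of how registration and the two ringing strategies fit together. The location store is just a dictionary, the names and addresses are invented, and the "simultaneous" mode only models which devices get rung; a real proxy would send the invites concurrently and cancel the ones that lose.

```python
# Sketch only: a toy location store plus sequential vs. simultaneous ringing.
# location_db, register and ring are hypothetical names.

location_db = {}   # user's public address -> list of registered devices

def register(user, device):
    """What a registrar does with a registration: record the device."""
    location_db.setdefault(user, []).append(device)

def ring(devices, answers, simultaneous=False):
    """Return (device that answered or None, list of devices that were rung)."""
    rung, answered = [], None
    for device in devices:
        rung.append(device)
        if answered is None and answers(device):
            answered = device
            if not simultaneous:
                break          # sequential: stop at the first answer
    return answered, rung

register("sip:bob@example.com", "sip:bob@desk.example.com")
register("sip:bob@example.com", "sip:bob@mobile.example.com")
register("sip:bob@example.com", "sip:bob@home.example.com")

devices = location_db["sip:bob@example.com"]
picks_up_mobile = lambda d: d == "sip:bob@mobile.example.com"

print(ring(devices, picks_up_mobile))                     # home phone never rings
print(ring(devices, picks_up_mobile, simultaneous=True))  # all three ring
```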

Don't be afraid to take a SIP. I promise it'll go down easy.

Peter Morrissey is a full-time faculty member at Syracuse University's School of Information Studies, and a contributing editor and columnist at Network Computing. Write to him at [email protected].


The Header

• To: Contains a display name and URI (Uniform Resource Identifier) of the destination UA (user agent) or endpoint.

• From: Contains a display name and URI of the UA that's sending the request.

• Contact: Information needed so the destination UA can contact the requesting UA once a call is set up.

• Via: Contains the version of SIP and the transport protocol to be used. Also includes an IP address to ensure the sender receives responses to its requests. Each proxy server in a call puts its IP address in the via field so responses get routed through it.


Commands

• INVITE: Main command used by SIP to make a call to another endpoint.

• REGISTER: Used by a SIP endpoint to inform a registrar server of its current location.

Servers

• Proxy Server: Makes requests and sets up connections on behalf of a UA.

• Redirect Server: Provides alternative location information to a UA in response to a request, but doesn't participate in the connection setup.

• Registration Server: Used by a SIP device, such as a phone, to register its current location.

• Location Server: Stores location information in a database. Usually runs on the same physical server as the registration server.
