

Designing Fault-Tolerant TCP/IP WANs
By Chris Lewis
Users are becoming less tolerant of losing access to mission-critical data and applications over WAN links. A few years ago, fault tolerance for WANs meant only a dial backup link. But with the increasing complexity and size of corporate WANs, now there are more issues to consider.
In most cases, a WAN provides remote branches with access to centrally located hosts or servers. In many cases, a WAN also allows for direct branch-to-branch communication. Most organizations have remote branches grouped around the main population centers of the country--near New York, Chicago or Los Angeles, for example. Very few network designers will choose to provide an individual link for every branch back to the central site. This normally proves too costly.
In this workshop, we'll focus on building a WAN, rather than on how to delegate that responsibility to a third party by subscribing to frame relay or some other shared network service. If you do subscribe to a shared network service, your options will be those that your network vendor's capabilities can accommodate.
We'll examine the following areas of WAN design: the backbone, which carries traffic between distribution centers; distribution centers, which connect to each remote branch; the central host site configuration; and dial backup options. Each area has different responsibilities and design goals.
Designing the Backbone
A WAN backbone provides higher-capacity bandwidth, with high levels of availability. Let's consider an organization that has remote branches clustered around six major locations: Chicago (headquarters and home of the corporate hosts), Boston, Miami, Phoenix, Los Angeles and Seattle. These six cities form the distribution centers to the branches and are interconnected by the backbone links. Typically, these six locations would be referred to as points of presence (POPs) on the WAN.
To connect the distribution centers to headquarters, we can use star, ring, fully meshed or partially meshed topologies.

A star network is the least fault tolerant. In our example, it consists of a connection from the Chicago head office to each distribution center. Typically, backbone links will carry more than 128 Kbps, the most a single ISDN Basic Rate Interface (BRI) line can deliver over its two 64-Kbps B channels, which makes them difficult to back up with a typical dial-up ISDN solution. (ISDN inverse multiplexing options and Multilink Point-to-Point Protocol (PPP)-based lines are available that combine multiple ISDN BRI lines into one link, but standards in this area are still evolving.)
The ring topology creates a daisy-chain effect, with each distribution center depicted as one point on a loop of lines connected in a circle. This configuration provides two leased lines to each location and, therefore, an alternate route should one line to the distribution center fail. It also makes dial backup of the backbone links unnecessary. The downside is that as the number of distribution centers grows, the probability that two links will fail simultaneously increases. In a ring topology, if two links fail, the distribution centers between the failures will be denied service.
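The effect of one or two link failures on a ring can be checked with a small model. The following Python sketch is only an illustration: the order in which the six cities are chained is an assumption (the example does not specify it), and the function names are hypothetical.

```python
from collections import deque

# Assumed ring order; the example does not say how the six cities are chained.
RING = ["Chicago", "Boston", "Miami", "Phoenix", "Los Angeles", "Seattle"]


def ring_links(sites):
    """Links of a ring: each site joins the next one, and the last joins the first."""
    return [frozenset((sites[i], sites[(i + 1) % len(sites)])) for i in range(len(sites))]


def reachable(start, links):
    """Breadth-first search over the surviving links, starting from one site."""
    neighbors = {}
    for link in links:
        a, b = tuple(link)
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        site = queue.popleft()
        for nxt in neighbors.get(site, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def cut_off(sites, failed, hub="Chicago"):
    """Sites that can no longer reach the hub once the links in `failed` are down."""
    surviving = [link for link in ring_links(sites) if link not in failed]
    return set(sites) - reachable(hub, surviving)


one_failure = {frozenset(("Chicago", "Boston"))}
two_failures = {frozenset(("Boston", "Miami")), frozenset(("Phoenix", "Los Angeles"))}
print(cut_off(RING, one_failure))   # empty set: Boston is still reached the long way around
print(cut_off(RING, two_failures))  # Miami and Phoenix sit between the two failures
```

Under this assumed ring order, any single failure leaves every center connected to Chicago, while any pair of failures isolates at least one center.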
An additional point of caution with the ring topology is that the links used to connect one distribution center to another need to have capacity for at least twice the normal load. This is necessary for the WAN links to handle fault conditions properly. Consider what happens when a link fails in a ring topology. The traffic that was being routed over the downed link will find an alternate path through other links. If the links now carrying this traffic (in addition to their normal load) do not have sufficient capacity, the operation of the entire ring is adversely affected.
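To make the capacity concern concrete, here is a rough Python sketch of how per-link load shifts when one ring link fails. Everything in it is an assumption made for illustration: the ring order, the idea that every center sends the same amount of traffic to Chicago, the 64-Kbps figure, and the shortest-way-around routing rule. Real traffic patterns and routing protocols will differ.

```python
# Assumed ring order, hub first. Link k joins site k and site (k + 1) % n.
SITES = ["Chicago", "Boston", "Miami", "Phoenix", "Los Angeles", "Seattle"]


def link_loads(demand_kbps, failed_link=None):
    """Load on each ring link when every center sends demand_kbps to the hub (site 0).

    Traffic takes the shorter way around the ring (ties go through the
    lower-numbered sites) and switches to the other way around if its
    usual path crosses the failed link.
    """
    n = len(SITES)
    loads = [0] * n
    for i in range(1, n):
        via_low = list(range(0, i))   # links crossed going through lower-numbered sites
        via_high = list(range(i, n))  # links crossed going the other way around
        if len(via_low) <= len(via_high):
            path, other = via_low, via_high
        else:
            path, other = via_high, via_low
        if failed_link in path:
            path = other
        for k in path:
            loads[k] += demand_kbps
    return loads


normal = link_loads(64)                 # hypothetical 64 Kbps from each center
degraded = link_loads(64, failed_link=0)  # the Chicago-Boston link is down
print(normal)    # [192, 128, 64, 0, 64, 128]
print(degraded)  # [0, 64, 128, 192, 256, 320]
```

In this hypothetical case the Seattle-Chicago link goes from 128 Kbps to 320 Kbps when the Chicago-Boston link fails, which is more than double its normal load. The exact ratio depends entirely on the assumed traffic pattern, but it shows why sizing ring links for the normal load alone is risky.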
A fully meshed network is the ultimate in fault tolerance, but is very costly, so it's rarely implemented. In a fully meshed network, each distribution center has a link to every other, providing multiple alternate paths in the event of line failures. More commonly implemented is a partially meshed network. This topology is a cross between a ring and a fully meshed network in terms of the number of alternate paths implemented between distribution centers.
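The cost difference between these topologies comes largely from the number of leased lines each one needs. As a quick illustration, the Python sketch below counts backbone links for a given number of sites; the function name is hypothetical, and a partial mesh is left out because its link count depends on which extra paths the designer chooses to add.

```python
def link_count(topology, sites):
    """Number of point-to-point backbone links a topology needs for `sites` locations."""
    if topology == "star":
        return sites - 1                  # one link from the hub to every other site
    if topology == "ring":
        return sites                      # each site links to the next, closing the loop
    if topology == "full mesh":
        return sites * (sites - 1) // 2   # one link for every pair of sites
    raise ValueError(f"unknown topology: {topology}")


for name in ("star", "ring", "full mesh"):
    print(name, link_count(name, 6))
# star 5
# ring 6
# full mesh 15
```

For the six-city example, a full mesh needs 15 links against 6 for a ring, and the gap widens quickly as sites are added, since the full-mesh count grows with the square of the number of sites.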