Network Computing

ON THE WIRE

A Bridge Over Very Troubled Waters

by Bill Alderson and J. Scott Haugdahl

There's never a dull moment in the network analysis business, and this time--solving a network problem at a nuclear power plant--was no exception. We were called in to diagnose a problem described in general terms as "users randomly dropping their connections across the campus network." The campus network, a key component of a geographically dispersed corporate network supporting thousands of users, consists of a dual-ring FDDI backbone used to connect several multisegment Ethernet-to-FDDI translational bridges. File servers, DEC VAXes and workstations are connected across the various Ethernet segments. Two of the segments connect to routers for access to the wide-area corporate network.

Scott: Upon our arrival at our customer's site, we were greeted by a huge, cone-shaped cement cooling structure sticking up out of the ground. Not to be intimidated, we gritted our teeth, grabbed our analysis bag of tricks and headed inside.

Bill: Inside the cone?

Scott: Very funny. Shall we get to the core of the problem?

Bill: OK. Anyway, let's tell our readers how we approached the problem. First, we confirmed that users reported dropped connections only for traffic between Ethernet segments over the FDDI backbone. Local Ethernet sessions and sessions over the wide-area routers were not affected.

Scott: For sessions over the backbone, both IP and IPX sessions were affected, which makes sense, since the Ethernet segments were connected by translational bridges. Translational bridging occurs at the data link layer and, as such, is network layer independent, meaning that any higher-level protocol could be impacted by the problem.

Bill: Our initial analysis of several packet traces taken from two Ethernet segments and the FDDI backbone indicated numerous protocol and application deficiencies. While these deficiencies were not the direct cause of the lost connections, they turned out to be a contributing factor.

Scott: Since the problem was intermittent, we had to catch it in the act.

Bill: Therefore, we picked two of the busiest Ethernet segments over which users were communicating with servers across the backbone. We set up a repetitive NetWare server access test at a workstation on one of those segments which talked to a server on the other segment.

Scott: We attached protocol analyzers to those two segments as well as to the FDDI, thus simultaneously capturing on both Ethernets and the backbone in order to follow the session data as it appeared on all three segments. By setting a filter on the workstation running the test, we could focus on this one session and not worry about the megabytes of data flowing across the FDDI that normally filled our analyzer's buffer in 15 seconds.
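The station filter Scott describes can be sketched roughly as follows. This is a minimal illustration, not the analyzers' actual filter syntax: the `Frame` type, the `station_filter` function and the placeholder addresses are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Frame:
    src: str   # source station address (placeholder strings, not real MACs)
    dst: str   # destination station address
    info: str  # short description of the payload

def station_filter(frames, station):
    """Keep only frames sent by or addressed to the given station."""
    return [f for f in frames if station in (f.src, f.dst)]

# Hypothetical capture: the test session is buried in unrelated traffic.
WORKSTATION = "ws"
captured = [
    Frame("ws", "server", "request"),
    Frame("vax1", "vax2", "unrelated data"),
    Frame("server", "ws", "reply"),
    Frame("pc7", "broadcast", "unrelated broadcast"),
]

session = station_filter(captured, WORKSTATION)
for frame in session:
    print(frame.src, "->", frame.dst, frame.info)
```

Applying the same filter on all three analyzers keeps each capture buffer limited to the one test session, so the three traces can later be compared packet by packet.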

Bill: Thus, having cast our lines, we waited for a byte.

Scott: Or even a nibble. We figured we wouldn't have to wait long, as the average utilization on either Ethernet segment stayed in the 20 percent to 40 percent range, with peaks of 60 percent.

Bill: Sure enough, in less than 10 minutes the session failed, and we were able to analyze and compare the packet traces from the three analyzers.

Scott: Whenever a failure occurs, it generally falls into one of two categories: network failure or end-node failure. A network failure means that the network (cable, hubs, bridges, routers and so on) failed to deliver a packet end to end. An end-node failure means that the network delivered the packet to the end segment as promised, but a workstation or server failed to act on the packet.

Bill: In this case, network failure appeared to be the most common cause of dropped sessions. In multiple instances, our analysis of the trace files from the three analyzers showed that a request packet was delivered from the workstation, through the first bridge and over the FDDI, but it never passed through the second bridge onto the target segment, which is where the server resides.
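The comparison of the three traces can be sketched as a walk along the packet's path, stopping at the first segment where the packet is missing. This is only an illustrative sketch: the segment names, the packet identifiers and the `classify` function are assumptions for the example, and real traces would have to be matched on something like sequence numbers or payload contents.

```python
# Segments in the order a request travels from workstation to server.
PATH = ["workstation Ethernet", "FDDI backbone", "server Ethernet"]

def classify(packet_id, traces):
    """traces maps segment name -> set of packet ids captured there.

    Returns (category, where). "delivered" means the network did its job,
    so an unanswered packet would point to an end-node failure instead.
    """
    previous = None
    for segment in PATH:
        if packet_id not in traces[segment]:
            if previous is None:
                return ("not sent", segment)
            return ("network failure", f"{previous} -> {segment}")
        previous = segment
    return ("delivered", PATH[-1])

# Hypothetical capture contents: packet 3 reaches the backbone but not the server side.
traces = {
    "workstation Ethernet": {1, 2, 3},
    "FDDI backbone": {1, 2, 3},
    "server Ethernet": {1, 2},
}

print(classify(2, traces))
print(classify(3, traces))
```

With these made-up traces, packet 3 is reported lost between the backbone and the server segment, which is the pattern described above: the second bridge is the last device the packet should have crossed.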

Scott: The packet in question was considered "good." That is, there were no errors present as the packet appeared on the FDDI backbone. The end segment and bridge were still functional at the time, as evidenced by other traffic passing through, as well as the fact that the workstation broadcast a NetWare RIP packet asking for the server to respond and the server answered.

Bill: Thus, the workstation and server could "communicate" via a broadcast and a directed return packet (from the server to the workstation), but they could not do so if the workstation sent a packet specifically addressed to the server--this packet was effectively blocked by the bridge.

Scott: We also observed that the bridge on the workstation side was from vendor "X," while the bridge on the server side was from vendor "Y." The sudden blocking of a station's packets always seemed to occur at vendor Y's bridge.

Bill: To us the situation smelled like an internal table look-up problem inside the bridge, since other traffic, as well as broadcasts, was still getting through. We examined the settable parameters of the bridge, but they provided no further clue to the problem at hand.
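Why a table look-up fault would block directed packets while broadcasts still pass can be shown with a simplified model of a two-port bridge's forwarding decision. This is a hypothetical sketch, not the vendor's implementation: the `forward` function, the port names and the "faulty" table are assumptions, and the frame-format conversion a translational bridge also performs is left out.

```python
BROADCAST = "broadcast"        # placeholder for the all-stations address
PORTS = ("fddi", "ethernet")   # a simplified two-port bridge

def forward(table, dst, in_port):
    """Return the ports the bridge would send a frame out of.

    table maps station address -> port the bridge believes it lives on.
    """
    others = [p for p in PORTS if p != in_port]
    if dst == BROADCAST or dst not in table:
        return others                         # flood: no look-up result needed
    out_port = table[dst]
    # A frame whose destination is believed to be on the arrival port is filtered.
    return [] if out_port == in_port else [out_port]

healthy = {"server": "ethernet", "ws": "fddi"}
faulty = {"server": "fddi", "ws": "fddi"}    # assumed bad entry for the server

print(forward(healthy, "server", "fddi"))    # directed request, healthy table
print(forward(faulty, "server", "fddi"))     # directed request, bad entry
print(forward(faulty, BROADCAST, "fddi"))    # broadcast bypasses the table
print(forward(faulty, "ws", "ethernet"))     # server's directed reply
```

In this model a single wrong entry is enough to reproduce the symptoms: the workstation's directed request is silently filtered, while its broadcast and the server's directed reply both get through.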

Scott: After six phone calls to the vendor's technical support hotline, we finally got a response.

Bill: The vendor hotline was at a loss to explain what was happening, but the vendor did eventually admit that early releases of the bridge software had problems handling high levels of Ethernet traffic.

Scott: And just how high are these traffic levels?

Bill: Merely 30 percent to 40 percent! Apparently the bridges could even "go to sleep" if the traffic got too heavy. The vendor claimed the problem was solved in the latest release of its bridge software, so our customer had several solutions to consider.

Scott: Basically, the solutions (short of a network rearchitecture) were to upgrade the bridge, replace the bridge with a vendor "X" bridge or drastically reduce the load across the suspect bridge. In the end, our client chose to replace the bridge with the vendor "X" bridge.

Bill: We felt pretty good about definitively solving another network problem. On top of that, while fishing for the big one, we also identified several other problems for our customer to work on.

Scott: Needless to say, we were glowing with success as we left the power plant.

Bill and Scott are principals of Pine Mountain Group. They can be reached at otw@pmg.com. Portions of the actual trace files from selected columns are available in the Network Computing On the Wire CompuServe forum or via Pine Mountain Group's Home Page (http://www.pmg.com).

September 1, 1995






