Network Computing

Network Baselining and Performance Management


Reporting and Alarms

For performance-management purposes, you want to be able to produce meaningful reports that describe how a metric is trending relative to a baseline. For the most critical service elements you may examine such reports every day or once a week. But you don't have the time to check a report on every measured service element with such frequency -- so you need some mechanism by which you are alerted when a particular metric has changed in a significant manner. This is achieved by means of thresholds and alarms. A threshold is a baseline set to a level of the metric at which you want to become aware of trends in that metric.
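As a rough illustration of the threshold-and-alarm idea, here is a minimal Python sketch. The names `Threshold` and `check_threshold` are hypothetical, and comparing the mean of recent samples against the threshold level is an assumption made for the example, not a method prescribed by any particular tool.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Threshold:
    metric: str    # name of the measured metric (illustrative)
    level: float   # level at which you want to be alerted


def check_threshold(samples, threshold):
    """Return an alarm message if the mean of the samples exceeds the
    threshold level, otherwise None."""
    if not samples:
        return None
    observed = mean(samples)
    if observed > threshold.level:
        return (f"ALARM: {threshold.metric} averaged {observed:.1f}, "
                f"above threshold {threshold.level:.1f}")
    return None


# Example: three utilization samples checked against a 45% threshold.
print(check_threshold([50, 60, 70], Threshold("utilization_pct", 45.0)))
```

In practice the returned message would be handed to whatever "pushed" notification channel you use rather than printed.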

I use the term baseline in the generic sense of "yardstick" or "standard for comparison." While the term is often used in this sense, I have heard several capacity-planning people argue over the semantics of this use. Many prefer to reserve the term "baseline" for the current state at the time of the original measurement and use the term "threshold" only for the level at which awareness becomes necessary. Unfortunately, the term "baseline" is already used to mean something much more similar to "threshold" in the related field of service-level management -- the target level of performance.

When a threshold is exceeded you want to be notified by means of an alarm, e-mail, page or other "pushed" indicator. As we discussed previously, there is a capability in SNMP to send traps from devices in a network to a network management system. This approach is used to report faults such as a line down or an interface not responding, but it can also be used to send alerts when certain thresholds are exceeded. For example, this mechanism is frequently used by the SNMP agents in Ethernet hubs to report when preset error thresholds have been exceeded. However, such mechanisms tend to be focused on real-time changes in the operating environment rather than trends which develop over time. The threshold/alarm functions used in performance management are, in fact, usually provided by the reporting tools.

Let's take a quick look at some of the actual tools that can be used to collect and summarize the data and generate these alarms.

There are two classes of reporting tool that we're interested in. The first class of tools is used to collect and report on data from SNMP agents. The second class is the polling systems, which usually combine data collection and reporting capabilities.

Examples of the first class of tools are Kaspia Network Audit Technology from Kaspia Systems and NetworkHealth from Concord Communications. These tools interrogate MIB data from a wide variety of SNMP agents and can provide a large variety of summary reports showing trends in the collected data over time. They also support various threshold mechanisms so that a network manager can be notified when a particular service element requires attention.

The second class of tools is quite diverse. If you have identified a number of different services that require this type of approach, you may be best served by a tool that provides a wide range of polling alternatives. For example, IP.Check from Baranoff Software addresses a wide range of IP-based applications, as well as providing simple TCP/IP network-level polling using ICMP (Internet Control Message Protocol). On the other hand, if you are more focused on a particular type of service element then you may want to investigate tools that offer more depth in a particular area. For example, AlertPage from Geneva Software is focused on response and availability of networks and servers (at the network level) while MailCheck, also from Baranoff, is focused purely on messaging systems.
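To make the polling idea concrete, here is a minimal Python sketch of a single availability and response-time poll. It is not how any of the products named above work. It also swaps in a plain TCP connection attempt for ICMP, because sending ICMP echo requests typically needs raw-socket privileges. The function name `poll_tcp` and the host and port in the usage lines are hypothetical.

```python
import socket
import time


def poll_tcp(host, port, timeout=2.0):
    """Attempt one TCP connection to (host, port).

    Returns (True, seconds_to_connect) if the connection succeeds,
    or (False, None) if it is refused, unreachable or times out.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        return False, None


# Hypothetical usage: poll a local mail server's SMTP port once.
available, response = poll_tcp("127.0.0.1", 25, timeout=1.0)
print("available" if available else "unavailable", response)
```

A real polling system would repeat this on a schedule, store the results and feed them into the reporting and threshold functions described earlier.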

It is beyond our scope to provide an in-depth analysis of all the tools available in each category. Similarly, it is not practical to offer exact advice on what threshold values should be used for each type of network technology. In many cases the tools that offer threshold capabilities will have default values already set, and those are a good place to start. Otherwise, since utilization seems to be the metric that causes the most confusion, here are some guidelines based on my own experience for four common classes of service element. I have assumed in each case that you will set thresholds against metrics for both average utilization and peak utilization.

  1. Leased lines. Average utilization: 45% of line speed, peak utilization: 70%. Measurement period of one day.

  2. Frame relay. Since frame relay allows burst rates above the committed information rate (CIR) you can afford a smaller margin of error. Average utilization: 55% of CIR, peak utilization: 80%. Measurement period of one day.

  3. Ethernet LANs. The way Ethernet works is that a device on a shared LAN which needs to send data simply waits for the wire to go quiet, then places its data on the LAN segment. If another device on the same logical segment attempts to do this at the same time, both devices detect a "collision" and back off for a short random (yes random!) period of time before waiting for another quiet period to try again. In practice this actually works very well while the utilization remains low. However, as the utilization increases, the performance gets exponentially worse. At around 40% utilization all that's happening is collisions and no actual data is getting sent. Therefore you never want to get anywhere near 40%. Set thresholds at 15% for average utilization and 25% for peak utilization. Those numbers can be increased to 25% and 40% for purely switched networks (since collisions are no longer a consideration). Measurement period of 15 minutes.

  4. Other LAN technologies. No such problem for token ring and ATM. Average utilization: 50%, peak utilization: 70%. Measurement period of 15 minutes.
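
The guidelines above can be captured in a small lookup table and checked against collected samples. The following Python sketch is illustrative only. The dictionary keys, the `GUIDELINES` name and the `exceeded` function are hypothetical, and it assumes the samples passed in are utilization percentages gathered over one measurement period.

```python
# Guideline thresholds from the list above:
# (average utilization %, peak utilization %, measurement period in minutes)
GUIDELINES = {
    "leased_line":       (45, 70, 24 * 60),  # percent of line speed
    "frame_relay":       (55, 80, 24 * 60),  # percent of CIR
    "ethernet_shared":   (15, 25, 15),
    "ethernet_switched": (25, 40, 15),
    "other_lan":         (50, 70, 15),       # token ring, ATM
}


def exceeded(element_class, utilization_samples):
    """Return a list naming which thresholds ("average", "peak") the
    samples from one measurement period exceed."""
    avg_limit, peak_limit, _period_minutes = GUIDELINES[element_class]
    average = sum(utilization_samples) / len(utilization_samples)
    peak = max(utilization_samples)
    findings = []
    if average > avg_limit:
        findings.append("average")
    if peak > peak_limit:
        findings.append("peak")
    return findings


# Example: a shared Ethernet segment sampled three times in 15 minutes.
print(exceeded("ethernet_shared", [10, 12, 30]))
```

Keeping the values in a table like this makes it easy to replace them with a tool's defaults or with thresholds tuned from your own experience.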


Control

Tools that report on performance are just that -- tools. They are of no value unless it is clearly understood how the output from those tools will be used. For each key service and measured service element, you must define who is responsible for generating and analyzing reports, who receives alarms, who tunes thresholds and (most importantly) the mechanisms by which exceeded thresholds lead to network capacity or topology changes.


Copyright © 2008 United Business Media LLC