Network Computing
Network Baselining and Performance Management


Data Needs Analysis and Instrumentation

If we are going to produce meaningful reports of trends against the baselines, we need to ensure that the right kind of management information is available from the devices in the network. Most of the metrics require that data be gathered from different parts of the network (the notable exceptions being metrics related to services provided and, in some cases, services used). The near-universal mechanism for gathering data from network devices is SNMP. Let's quickly review the basics of the SNMP framework, which is the cornerstone of most network management solutions.

In a typical SNMP management scenario, any device that needs to be managed (a router, for example) must run an SNMP agent. The agent lets the device exchange information with any SNMP-based network management system, and it can deliver different types of information in different ways. For real-time events, such as failures or exceeded thresholds, the agent sends an SNMP "trap" to the management system so that the management system can flash an icon or page an operator. In baselining our service elements, we are more concerned with building a picture of network performance and trends over time. Since it would be very inefficient to send every piece of performance data over the network in real time, the agent accumulates performance information that a management system can retrieve at regular intervals using the SNMP "get" command. The agent stores this information in a standard format defined by the MIB for that class of device. The important design issue, then, is to ensure that the kinds of information we want to baseline are available from the SNMP agents in place; in other words, that the network is properly instrumented.
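As a rough illustration of what a management system might do with two successive "get" results, the sketch below turns a pair of octet-counter readings into a link utilization figure. It assumes the readings come from a 32-bit counter such as ifInOctets, that the counter wrapped at most once between polls, and that the link speed is known. The names CounterSample, counter_delta and utilization_percent are hypothetical and do not belong to any particular product; the SNMP retrieval itself is left out.

```python
from dataclasses import dataclass

# Assumption: the polled object is a 32-bit counter (for example ifInOctets),
# which rolls over to zero after reaching 2**32 - 1.
COUNTER32_MODULUS = 2 ** 32


@dataclass
class CounterSample:
    """One polled reading: when it was taken and the counter value seen."""
    timestamp: float  # seconds
    octets: int       # raw counter value returned by the agent


def counter_delta(earlier: int, later: int, modulus: int = COUNTER32_MODULUS) -> int:
    """Octets counted between two readings, tolerating a single counter wrap."""
    return (later - earlier) % modulus


def utilization_percent(first: CounterSample, second: CounterSample,
                        link_speed_bps: float) -> float:
    """Average utilization over the polling interval, as a percentage."""
    elapsed = second.timestamp - first.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    bits = counter_delta(first.octets, second.octets) * 8
    return 100.0 * bits / (elapsed * link_speed_bps)
```

Polling too rarely on a fast link can let the counter wrap more than once between readings, in which case this calculation undercounts, so the polling interval has to be chosen with the link speed in mind.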

Fortunately, most routers and other network devices have SNMP agents that can provide most of the information needed for capacity planning and network health purposes (including availability, utilization and frame relay CIR). It's important to differentiate between "capacity planning" and analysis of "network use." The information available from standard network devices is limited (it doesn't usually go beyond RMON Groups 1 to 4 because of the processing overhead that would be required). You probably won't need the additional information for simple capacity planning purposes, but you may need it if you wish to monitor trends in network use (like changes in the level of use of a particular protocol or application). If you do need greater granularity of information than that available through the SNMP agents in your network devices, you must install network probes in strategic locations in your network (probably on those links that you've classed "critical service elements"). Network probes, from vendors like NetScout and Visual Networks, can provide detailed breakdowns of traffic but tend to be expensive, so location is important.

The SNMP agents perform the function of accumulating real-time data, and management systems periodically poll the agents to collect accumulated (and usually summarized) data. This approach works well for pure bandwidth-service elements (LAN backbones, frame relay service or leased lines) and may be used for managed services that are provided to you (application-level monitoring from RMON2 and proprietary approaches like NetScout Enterprise RMON). However, in most cases a different instrumentation approach is required for the end-to-end metrics needed for services provided and some classes of services used.

In general, end-to-end measurement requires a polling system that uses service-specific actions to measure the delay between sending a request and getting a response. For example, to measure network response time, a simple polling system might send out pings to specific IP addresses (maybe router ports) and measure the time each ping takes to get a response. If the key service you are providing goes beyond the network level to include servers, you might use application-level monitoring. In that case the polling system needs to perform the same series of actions a user would perform: for example, connect to a specific URL and use FTP to download a known-size file, or use HTTP to download a Web page. In most cases the polling system is actually built into the tool used to generate the reports. We'll look at the specific tools below.
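The core of such a polling system can be sketched as a timer wrapped around a service-specific action. The example below is a minimal sketch, not a description of any reporting tool: timed_poll and http_fetch are hypothetical names, the five-second timeout is an arbitrary assumption, and a real poller would also schedule the polls and store the results.

```python
import time
import urllib.request
from typing import Callable, Optional


def timed_poll(action: Callable[[], None],
               clock: Callable[[], float] = time.perf_counter) -> Optional[float]:
    """Run one service-specific action and return the elapsed seconds.

    Returns None when the action fails with a network-level error, so that
    the caller can count the poll against availability instead of delay.
    """
    start = clock()
    try:
        action()
    except OSError:  # covers socket errors, timeouts and urllib's URLError
        return None
    return clock() - start


def http_fetch(url: str, timeout_s: float = 5.0) -> Callable[[], None]:
    """Build an action that downloads a page the way a user's browser would."""
    def action() -> None:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            response.read()  # read the whole body so transfer time is included
    return action
```

A poll of a Web page would then look like timed_poll(http_fetch(url)) for whatever URL is being monitored. Keeping the action separate from the timer means an FTP download or a ping could be swapped in without changing the measurement logic.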

But first, we must address the main challenge with the polling approach to instrumentation: the difficulty of eliminating from the measurement those factors beyond the scope (and therefore control) of the service you are providing (or the service being provided to you). For example, let's say that one of your critical service elements is a managed WAN backbone consisting of both the frame relay backbone and the premises routers, all supplied and managed by your service provider (either an external telecommunications provider or the corporate WAN group). Assuming you don't have access to the provider-controlled routers, you need to locate the polling devices as close as possible to the service being measured, which for all practical purposes means on the LAN segment the router is attached to. In such a case, the measurement factors beyond the control of the service provider are the performance and availability of the LAN segment, and the performance and availability of the polling system itself. Both of these factors may be eliminated from the measurement by introducing a control measurement, such as the response time to another device on the same LAN. If the WAN performance/availability metrics change without a significant change in the LAN performance/availability metrics, then that change is due to the service being provided. The metric in this case should be the delta between the overall measurement and the control measurement (for both performance and availability).
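One way to turn the control measurement into a metric is sketched below. It is an interpretation rather than a prescribed method: each poll is assumed to record a response time across the WAN and a response time to the control device on the local LAN, with None meaning no response. Polls where the control device did not answer are discarded, on the assumption that the LAN or the poller was at fault. The name provider_metrics is hypothetical.

```python
from statistics import mean
from typing import Optional, Sequence, Tuple

# (wan_ms, control_ms) for one polling cycle; None means "no response".
Poll = Tuple[Optional[float], Optional[float]]


def provider_metrics(polls: Sequence[Poll]) -> Tuple[float, Optional[float]]:
    """Return (availability %, mean added delay in ms) attributable to the WAN.

    Only polls where the control device answered are counted, which removes
    LAN and polling-system failures from the provider's figures.
    """
    valid = [(wan, ctl) for wan, ctl in polls if ctl is not None]
    if not valid:
        raise ValueError("no polls with a usable control measurement")
    answered = [(wan, ctl) for wan, ctl in valid if wan is not None]
    availability = 100.0 * len(answered) / len(valid)
    # The delta between overall and control response time is the WAN's share.
    added_delay = mean(wan - ctl for wan, ctl in answered) if answered else None
    return availability, added_delay
```

With this convention, a slow or failed LAN segment lowers the number of usable polls but never counts against the provider.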

In this example, eliminating the beyond-scope factors is relatively straightforward because the control measurement takes place across a network we own. While this is usually the case for network services, many network managers are providing more complex application services. Take, for instance, a Web-hosting service that has been outsourced to a service provider. An organization's Internet presence may represent the most visible electronic interface to its customers, yet your customers will access the Web-hosting network via the Internet, and neither you nor your service provider will have any control over a major part of the data path. In this case, the major beyond-scope factor is the Internet, which neither party controls, so a more complex approach is required. The factors within your control are the Web server and all elements of the network between the Web server and the point of connection to the Internet backbone. A triangulated approach to polling instrumentation can easily provide availability data: two polling systems are connected to the Internet via different ISPs, and they poll both the Web site and each other for control purposes. However, instrumenting for performance is more challenging. Ideally, a third polling system would be introduced near the point of Internet backbone connection. This would allow measurement of the response time attributable to those parts of the data path under your control. However, this would require locating equipment on service provider premises, which is probably impractical. A more realistic alternative is to introduce a control measurement to another Web site on a different ISP network (preferably one dedicated to the purpose so that other traffic can be discounted as a factor in the control measurement).
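The triangulated availability check can be read as a small decision rule. The sketch below is one plausible reading, not a rule stated by any tool: the two pollers are called A and B, each reports whether it reached the Web site and whether it reached the other poller, and the names Verdict and triangulate are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    SITE_UP = "site up"
    SITE_DOWN = "site down"
    INCONCLUSIVE = "inconclusive"


def triangulate(a_sees_site: bool, b_sees_site: bool,
                a_sees_b: bool, b_sees_a: bool) -> Verdict:
    """Classify one polling cycle from two pollers on different ISPs."""
    if a_sees_site or b_sees_site:
        # At least one independent path reached the site, so the site is up;
        # a failure seen by the other poller lies somewhere on the Internet.
        return Verdict.SITE_UP
    if a_sees_b and b_sees_a:
        # Both pollers have working Internet paths to each other yet neither
        # reaches the site, so the outage is attributed to the hosted service.
        return Verdict.SITE_DOWN
    # The pollers cannot confirm their own connectivity; do not blame the site.
    return Verdict.INCONCLUSIVE
```

Cycles classed as inconclusive would normally be left out of the availability percentage rather than counted as downtime.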

For each of the metrics determined in the previous step, you need to establish a means for collecting the required data. Use SNMP wherever practical and supplement it with polling systems where necessary. If you don't collect the right data you won't be able to generate the reports you need. Perhaps the reporting tools will do some of the required measurements for you. Perhaps you'll determine that some of the desired metrics are simply too hard to measure. In short, expect to go back and forth between these steps a few times as you establish what is practical.


Copyright © 2008 United Business Media LLC