I’ve been working with two large organizations recently on projects which include identifying gaps in network management tools (in the broad sense, including UC management) and related processes. That inspired me to list some best practices that I’ve identified over the past 20+ years. I want to say the tools come and go, but then again, some tools just won’t fade away. For example, I’m beginning to think HP OpenView/NNMi will outlive me. Let’s not get into whether any of the other effects of aging apply to either NNMi or me.
Let’s start with some basic facts:
- Network management tool vendors will likely sell only one copy of their product to each customer, unless the company is huge. Overall, not that many copies get sold.
- Writing good code for SNMP and NetFlow data, let alone other technologies (QoS, IP SLA, MediaNet…), is complex.
- There is little standardization across vendors’ products, let alone integration and sharing of data.
- On top of that, many hardware vendors have done very poor implementations of SNMP, e.g. making a whole new copy of their MIB tree every time they add things in a new version, or sending traps every time someone logs in or out. That’s not how you’re supposed to do things! It makes the management software programmers’ job that much harder.
All those items drive the economics for the vendors. It’s a tough market. You need specialized niche programmers, you have to invest a lot to manage each new technology, and the relatively limited revenue will limit your R&D budget.
Results of this:
- Incumbents have an advantage, in that they’ve already built the platform and base functionality. If they didn’t architect it well, though, that can hinder them as technologies change and new capabilities arise.
- Due to the difficulty of deploying and learning a new network management tool, companies have substantial inertia about replacing existing products that work, but perhaps could work better.
- Due to the difficulty of learning what a new tool can do better, there is a similar disincentive to try and buy new products. (As I write this, I’ve just romped through seven one-hour UC management product presentations with demos. There are a lot of details to assimilate about what each product does and does not do.)
Another basic factor to bear in mind is the way we technologists have been trained by vendors: buy a new product or technology, and it’ll make things better. Sometimes it does. Sometimes it just adds work, or the labor costs exceed the benefits. Cisco’s Martin Roesch, founder of Sourcefire, talked about that a bit at our recent CMUG seminar, concerning security products: as you add new sensors and products, you do not want exponential growth of complexity. Cisco is architecting for that.
The reason I bring this point up is that one common thread across a lot of companies is the lack of ownership and time to maintain and add value by tuning network management products. That is, the tools we have don’t solve our problems, so we add another, making the time constraint worse, so it too fails to add the value we hope for.
To succeed with network management tools, all of the following are needed:
- A strong, lucid understanding of the requirements up front.
- Emphasis placed on tools with a lot of built-in automation of common tasks, since maintenance gets slighted in most organizations. This should be considered a major requirement.
- Tools that are relatively easy to use, providing good built-in monitoring, alerting, and reports out of the box.
- Tools that are carefully selected to get the job done, delivering on the main requirements. Avoid overlaps.
- Strong testing of tools before purchase. You should test over, say, a 60- to 90-day period rather than a 30-day one. Make sure you’re getting the functionality you thought you would get. Sales claims can be… overly enthusiastic.
- Adequate funding and staffing to maintain the tools. That includes not only making sure devices are accessible and managed by the tools, but also having someone look for ways to improve canned reports, monitoring, and log or trap filtering, and otherwise add value to the tools.
- A process to ensure all devices are properly managed, along with periodic inventory review to verify that all devices are being managed by the tool suite.
- Funding, time, and an “evangelist” or “tool champion” for each tool – someone who learns it inside out, tweaks it, and can provide ad hoc just-in-time training to other staff. (Note I did not say “champion tool.”)
- Ease of access to tools (portal, logins based on an Active Directory login and group membership for access to each tool). Barriers to use have to be removed or tools won’t get used.
- Short training sessions with demos of using a tool to solve a problem. Five- to 10-minute video recordings may help scale this. Train people on how to solve one small problem with the tool, rather than on everything the tool can do, i.e. train in the context in which people will be approaching the tool.
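The periodic inventory review mentioned in the list above can be as simple as a set comparison. Here is a minimal sketch, assuming you can export a device list from your authoritative inventory and another from a management tool. The function name `reconcile` and the device names are hypothetical, made up for illustration; how you actually export the lists depends on your tools.

```python
def reconcile(inventory, managed):
    """Compare an authoritative device inventory against a tool's managed list.

    Both arguments are iterables of device names. Names are normalized
    (trimmed, lowercased) so that case differences don't cause false gaps.
    """
    inv = {name.strip().lower() for name in inventory}
    mgd = {name.strip().lower() for name in managed}
    return {
        # In the inventory, but the tool is not managing them.
        "unmanaged": sorted(inv - mgd),
        # Managed by the tool, but missing from the inventory.
        "unknown": sorted(mgd - inv),
    }


# Hypothetical sample data, standing in for real exports.
inventory = ["core-sw1", "core-sw2", "edge-rtr1", "wlc1"]
managed = ["CORE-SW1", "edge-rtr1", "lab-sw9"]

report = reconcile(inventory, managed)
print(report)
# {'unmanaged': ['core-sw2', 'wlc1'], 'unknown': ['lab-sw9']}
```

Both directions matter: "unmanaged" devices are monitoring gaps, while "unknown" devices suggest the inventory itself is stale. Running something like this per tool, on a schedule, gives the review a concrete starting list.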
Focus on requirements
For some reason, people seem to get really attached to a particular network management tool. This might be because there is so much involved in learning any tool, so the tool you know becomes a personal comfort zone. Or maybe people just end up really liking one tool, perhaps the one they chose.
In any case, this can become the enemy of progress. Focusing on requirements helps. Do ask questions like “what do you want the tool to do?” Otherwise, discussion can become a “my tool/your tool” battle, and get very emotional.
Reasonably good network documentation, including diagrams, is the “A No. 1” basic network management tool, the real starting point for network manageability. If you don’t have the basics, you need to fix that.
I’m a fan of a network overview document, oriented around the OSI layers. Physical sites and links. Where are the VLAN extents? How many are there? What’s the big picture IP addressing scheme? Routing scheme? Points of route redistribution? Where do filters live, and what is their purpose? All words and diagrams, not so heavy on tech details. Think “orientation and big picture.”
I also like “rationale” documents. As in “why did we build the routing this way?” These documents explain choices – why certain decisions were made, particularly ones that others might wonder about, and then try to “fix” – breaking something, or running into the problem you cleverly avoided with your undocumented design feature.
In my latest network management tool assessment, I’m realizing I like tightly integrated suites, when done reasonably well. They can do things that isolated products cannot. Some vendors get that, others do not. How to tell: loose integration usually means both tools have a shared web front end, but do not really share much data.
As an example, having both NetScout Infinistream for packet capture and Riverbed for NetFlow collection at a site creates two islands of information. Users would have to jump back and forth. They are likely to just use one or the other, but not both. Using one vendor for both sources of data increases the tool’s effectiveness, and the user’s as well.
This article originally appeared on the NetCraftsmen blog.