Can Service Management Save Your Network?
With a set of new protocols, service management may let you manage your network as a unified whole, rather than on a component-by-component basis.
December 1, 2004
Pity the beleaguered network architect. They jump from crisis to crisis, making sure everything is running well--or at least just running. What they need is a new way to proactively monitor and manage their network, a way that helps them anticipate problems and resolve them before users notice. What they need is service management.
Service management is a set of technologies and organizational principles that promises to save IT from the tyranny of component-level management. Instead of monitoring the operation of separate networking devices--switches, servers, routers, and the like--service management looks at the interrelationships among them. With IT looking to weld applications together through Web services and virtualize processing across pools of servers, network architects will need this capability to monitor interactions and ensure that the underlying platforms can adapt automatically to new requirements.
While service management was first conceived in the 1980s by mainframe administrators, networking deployments were limited because the means for gathering configuration information were incomplete or ineffective. However, new protocols are being developed that will replace SNMP and allow service management systems to query and modify a broader range of network-connected devices.
With that information pulled into a common database, network managers will have a holistic view of the network. They'll be able to translate application-level requirements into network-level requirements, opening the way for firm SLAs based not on arcane metrics such as packet delay, but on user-level terms such as transaction response times.
MANAGEMENT BLUEPRINTS
In the 1980s, the U.K. government saw the chaos that was coming in networked IT and developed a set of best-practices documents, based in part on work done by IBM with its Yellow Books. The documents were collectively known as the IT Infrastructure Library (ITIL).
The specification was divided into two sections: service support and service delivery. Service support focused on getting a service running, keeping it running, and knowing the network resources available. Service delivery, on the other hand, dealt with the financial side of deploying a service and ensuring that the service lived up to agreed terms and conditions.
ITIL has gotten a lot of attention in the United Kingdom and is gaining favor in Europe. Adoption in the United States has been slower, but is expected to grow quickly. In fact, all the major network management vendors have adopted ITIL to some degree, and some have consulting services that can assist in implementing the ITIL best practices. For example, IBM's Tivoli management products incorporate ITIL's best practices approach to network management. HP, Computer Associates, BMC Software (along with Remedy, a BMC company), and Mercury Interactive are among the largest management product vendors to specifically incorporate ITIL. Axios Systems, Compuware, and others also offer applications based on ITIL best practices. Then there's Microsoft, which has introduced its own version of ITIL, dubbed the Microsoft Operations Framework (MOF).
There's even speculation that the ISO may create a specification based on ITIL that will standardize components of service management. Companies that already support ITIL concepts will be ahead of the competition if this indeed happens.
WOE UNTO SNMP
ITIL is built around comprehensive configuration management, whereby information about all networking devices is brought into a common database called the Configuration Management Database (CMDB). The service desk can use information in the CMDB to handle change requests and problem reports (called incidents). Changes can be identified by comparing data in the CMDB to data acquired by regular sweeps of the network.
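That comparison is easy to automate once both sides live in structured records. Here's a minimal Python sketch of the idea; the device names and attributes are hypothetical, and a production CMDB tracks far richer data.

```python
# Minimal sketch of CMDB-style change detection: compare the recorded
# configuration of each device against what a discovery sweep reports.
# Device names and attributes here are hypothetical.

cmdb = {
    "core-switch-1": {"os_version": "6.2", "vlan_count": 12},
    "edge-router-3": {"os_version": "12.1", "vlan_count": 4},
}

discovered = {
    "core-switch-1": {"os_version": "6.3", "vlan_count": 12},  # upgraded
    "edge-router-3": {"os_version": "12.1", "vlan_count": 4},
}

def find_changes(recorded, observed):
    """Yield (device, field, old, new) for every attribute that drifted."""
    for device, attrs in observed.items():
        baseline = recorded.get(device, {})
        for field, value in attrs.items():
            if baseline.get(field) != value:
                yield device, field, baseline.get(field), value

for device, field, old, new in find_changes(cmdb, discovered):
    print(f"{device}: {field} changed from {old!r} to {new!r}")
```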
The problem is that there's no standard means for populating the CMDB with configuration management data, or pushing out the necessary changes to network devices. SNMP is the most commonly used protocol for this purpose, but its design assumptions create a whole new set of problems.
SNMP was designed to stand alone, independent of surrounding protocols. That independence means the protocol can't easily adapt to new devices and can be expensive to implement. The SNMP security model, for example, works independently of the network security model, forcing architects to implement additional key management schemes. This is a big reason why SNMP security was never widely deployed, and why the IETF's Integrated Security Model for SNMP (ISMS) Working Group is now looking to address the issue.
SNMP is also built around a minimum guaranteed message size of just 484 bytes, limiting the size of a transaction that can be carried in a single SNMP exchange. Stringing together messages is possible, but impractical for some tasks, such as configuring Access Control Lists (ACLs). As a result, network architects often end up interacting with networking equipment through Web interfaces or Command Line Interfaces (CLIs).
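The round-trip pattern is easy to see in code. The sketch below uses pysnmp, an open-source Python SNMP library that postdates this article; the device address and community string are placeholders. One exchange retrieves one object--pulling a full configuration this way means many such exchanges.

```python
# A minimal SNMP GET: fetch a device's system description.
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, getCmd)

errorIndication, errorStatus, errorIndex, varBinds = next(getCmd(
    SnmpEngine(),
    CommunityData('public'),                 # v2c community string
    UdpTransportTarget(('192.0.2.1', 161)),  # hypothetical device
    ContextData(),
    ObjectType(ObjectIdentity('SNMPv2-MIB', 'sysDescr', 0)),
))

if errorIndication:
    print(errorIndication)                   # e.g., request timed out
elif errorStatus:
    print(errorStatus.prettyPrint())         # device returned an error
else:
    for varBind in varBinds:
        print(' = '.join(x.prettyPrint() for x in varBind))
```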
SNMP also lacks standardized mechanisms to manage most of the interesting features of network elements. Vendors like it this way because they prefer to provide proprietary MIBs to fully manage their products. However, this forces third-party management consoles to run applets from equipment vendors in order to make use of the proprietary MIBs. Such an arrangement virtually ensures that a management console can't holistically manage a network with SNMP.
THOSE DARN SERVERS
SNMP's problems haven't gone unnoticed by the standards community, and efforts are under way to address its limitations for managing server, client, and network infrastructure.
Server management efforts have been particularly effective with the introduction of Intelligent Platform Management Interface (IPMI) 2.0 by Dell, HP, Intel, and NEC. IPMI provides a standard interface to the Baseboard Management Controller (BMC), a monitoring chip now included in most new servers. The controller monitors sensors within the server, measuring environmental conditions such as thermal and voltage levels, as well as the server's fan operation.
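Because the controller answers over the network regardless of the state of the host OS, readings can be collected with a short script. Here's a sketch that shells out to the open-source ipmitool utility (assumed installed); the BMC address and credentials are placeholders.

```python
# Pull the sensor data repository from a BMC out-of-band via ipmitool.
import subprocess

def read_sensors(bmc_host, user, password):
    """Return the raw sensor listing (temperatures, voltages, fans)."""
    result = subprocess.run(
        ["ipmitool", "-I", "lanplus",            # IPMI 2.0 RMCP+ transport
         "-H", bmc_host, "-U", user, "-P", password,
         "sdr", "list"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# Hypothetical BMC address and credentials:
print(read_sensors("192.0.2.10", "admin", "secret"))
```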
The Distributed Management Task Force (DMTF) is building on IPMI with its Systems Management Architecture for Server Hardware (SMASH), a collection of efforts aimed at enabling network managers to gather hardware and low-level firmware and software information. SMASH in turn augments Web-Based Enterprise Management (WBEM), the standard released by the DMTF in 1998. WBEM introduced the Common Information Model (CIM), a platform- and technology-independent standard for describing compute and networking environments that underlies nearly all the DMTF's work.
At the same time, Microsoft is pushing its own proposed management framework, code-named Web Services for Management Extension (WMX). The protocol is positioned to supplant IPMI, SMASH, and OASIS' Web Services Distributed Management (WSDM) by describing a generic Simple Object Access Protocol (SOAP)-based management protocol that reuses Microsoft's existing Web Services (WS) specifications and security models to support server management operations (see "Server Management Systems," Technology Roadmap, October 2004).

Client management will become simpler with the recently announced Active Management Technology (AMT) from Intel. AMT works by installing a purpose-specific management processor on the motherboard (see "AMT: The Secret Sauce"). Unlike IPMI, which is primarily for servers, AMT is designed for servers as well as lower-end devices such as desktop and notebook computers. As long as the computer has power and a connection to the network, management software will be able to receive information from the AMT. In addition to supporting a device inventory, the AMT chip can be used by managers to receive reports on device status. It can even be used out-of-band to shut down and restart systems, or to re-install OSs or software on otherwise "dead" machines.
SNMP REPLACEMENTS
Meanwhile, at the IETF, the folks who developed SNMP are working on a replacement protocol. The Network Configuration (NETCONF) Working Group aims to replace proprietary CLIs and Web interfaces with a standard configuration protocol. The protocol will encode configuration information as XML documents and provide basic operations to upload, retrieve, and edit configurations over Secure Shell (SSH), SOAP, or the Blocks Extensible Exchange Protocol (BEEP), a middleware protocol.
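To get a feel for the protocol's shape, here's a sketch using ncclient, an open-source Python client for the working group's protocol (later standardized as NETCONF), running over SSH. The host and credentials are placeholders; the reply carries the device's running configuration as a single XML document.

```python
# Retrieve a device's running configuration as XML over SSH.
from ncclient import manager

with manager.connect(
    host="192.0.2.1", port=830,           # 830 is the protocol's SSH port
    username="admin", password="secret",  # hypothetical credentials
    hostkey_verify=False,
) as m:
    reply = m.get_config(source="running")  # issues a <get-config> RPC
    print(reply.data_xml)                   # configuration as an XML document
```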
According to a report to the Internet Research Task Force (IRTF), the IETF's sister organization, XML messaging is faster than SNMP for large amounts of data. This is because it can combine hundreds of SNMP messages into a single XML document. XML's structure also allows vendors to embed metadata describing their own extensions, simplifying the process of gathering information about a new device.
Backward interoperability with SNMP-based devices is an obvious concern, and one that the IRTF has already addressed. One approach makes use of a gateway between the SNMP device and the XML management console. The gateway converts SNMP MIBs to XML code and vice versa. This also provides faster communications between the devices and the management console because the gateway can combine and compress MIBs from numerous managed devices into a single XML string.

Management data is already being encapsulated in XML documents. An early example of the use of XML for network device management is JUNOScript, developed by Juniper Networks for its routers. Using JUNOScript, client applications can manage Juniper routers using XML.
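A toy version of the gateway's translation step shows the idea: fold variable bindings gathered from several devices into one XML document. The OIDs and values below are hypothetical, and a real gateway would translate full MIB structure rather than just wrap name/value pairs.

```python
# Wrap SNMP-style varbinds from multiple devices in a single XML document.
import xml.etree.ElementTree as ET

varbinds = {
    "core-switch-1": [("1.3.6.1.2.1.1.3.0", "86400"),      # sysUpTime
                      ("1.3.6.1.2.1.1.5.0", "core-sw1")],  # sysName
    "edge-router-3": [("1.3.6.1.2.1.1.5.0", "edge-r3")],
}

root = ET.Element("devices")
for device, pairs in varbinds.items():
    node = ET.SubElement(root, "device", name=device)
    for oid, value in pairs:
        obj = ET.SubElement(node, "object", oid=oid)
        obj.text = value

print(ET.tostring(root, encoding="unicode"))
```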
The problem with XML-based messaging is that it can produce very large documents. Vendor concerns to this effect led the IRTF to run a comparison between SOAP and SNMP. The IRTF found SOAP stacks to be very efficient, consuming only a small amount of memory and CPU resources. However, the size of SOAP messages required compression, which in turn consumed precious CPU cycles on the device.
As XML appliances are deployed in the infrastructure, vendor concerns over SOAP message sizes may disappear. Appliances from vendors such as DataPower Technology, Forum Systems, and Conformative Systems dramatically accelerate the processing of XML documents. XML compression could become just another service that these appliances perform before documents are sent across major junctures in the network.
Until XML appliance penetration grows, however, users are likely to see the Network Configuration protocol implemented through SSH, says Juergen Schoenwaelder, a professor in the Electrical Engineering and Computer Science department at International University Bremen in Germany and chairman of the IRTF's Network Management Research Group.
PROPRIETARY APPROACHES
As the standards groups develop new protocols to address SNMP's problems, vendors are racing ahead with proprietary approaches for product implementation. LANDesk Software, for example, sweeps the network for its own agents and uses other methods to assess network conditions. For instance, to determine the OS installed on a device, it can analyze the packets the device sends back in response to a probe such as a ping.
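One simple version of that trick reads the Time To Live (TTL) field of the reply, since different OS families ship with different initial TTLs. The sketch below uses Scapy, an open-source Python packet library (sending raw packets requires root privileges); the address and thresholds are illustrative heuristics, not LANDesk's actual method.

```python
# Guess a rough OS family from the TTL of a ping reply.
from scapy.all import IP, ICMP, sr1

def guess_os(host):
    reply = sr1(IP(dst=host) / ICMP(), timeout=2, verbose=False)
    if reply is None:
        return "no reply"
    ttl = reply.ttl          # remaining hops; compare to common initial TTLs
    if ttl > 128:
        return "network device (initial TTL ~255)"
    if ttl > 64:
        return "Windows (initial TTL ~128)"
    return "Unix-like (initial TTL ~64)"

print(guess_os("192.0.2.1"))  # hypothetical target address
```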
Meanwhile, others are focused on holistic management without using the ITIL model. UXComm, for example, has developed tools for low-level discovery, with the ability to dynamically discover assets, dependencies, and faults, as well as monitor performance. Discovery isn't the main push, however. UXComm focuses on automating configuration changes and, where possible, repairing crashed or underperforming devices. Its method, which makes heavy use of IPMI, was developed without concern for ITIL or MOF.
UXComm's CEO, Mark Sigal, doesn't see the need for ITIL. The company's AutonomIQ application suite uses agents to perform system discovery and to automate reprovisioning, maintenance, and other management tasks. AutonomIQ can also help managers remotely detect and repair some system problems. Sigal says it's difficult to choose a single standard for management because so many exist, and because no single standard has become "standard."
NetQoS' SuperAgents are designed to monitor an entire network. According to the company, SuperAgents are installed on a single data center server, where they monitor system processes anywhere on the network. Data collected by the SuperAgents can be used to determine where actual problems are occurring on the network.
One of the more promising areas of application management is the use of embedded software that can interact with management software. Motive has developed a set of management applets that can be integrated into applications, allowing managers to monitor those applications while they're running. If necessary, a manager can modify an application, force updates, remove buggy updates, and remotely perform other management tasks.
NONSTANDARD CMDB
If these standards and products are effective, companies could find themselves facing two other challenges. The first: for all its value in managing a network, the CMDB has a major problem looming over it. Namely, there's no standard.
Although the data types are relatively consistent from one vendor to another, field definitions, field lengths, relationships between field data, and other characteristics may not map directly from one vendor to another. Mapping an OpenView CMDB onto a BMC CMDB is nontrivial. Enabling a Veritas application performance management package to also integrate with the OpenView CMDB is similarly a challenge.
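In practice, that integration means building and maintaining field-by-field mappings between schemas. Here's a minimal Python sketch of the chore; the field names on both sides are hypothetical, and real mappings must also reconcile field lengths, types, and inter-record relationships.

```python
# Translate a record from one vendor's CMDB layout to another's.
# Field names on both sides are hypothetical.
FIELD_MAP = {
    "HostName": "ci_name",
    "IPAddr":   "ip_address",
    "OSRev":    "os_version",
}

def translate_record(source_record, field_map=FIELD_MAP):
    """Rename mapped fields; flag anything with no counterpart."""
    target, unmapped = {}, []
    for field, value in source_record.items():
        if field in field_map:
            target[field_map[field]] = value
        else:
            unmapped.append(field)
    return target, unmapped

record = {"HostName": "core-sw1", "IPAddr": "192.0.2.1", "AssetTag": "A-1001"}
print(translate_record(record))  # AssetTag has no target field
```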
There are no formal efforts to establish a standard structure for CMDB, either. However, many point product vendors work closely with developers of large applications to assure transparent compatibility between products. Down the road, nonstandard CMDBs could still be a management problem, especially for merged companies trying to integrate management systems.
SUCCESS
Even if CMDB merging can be avoided, network architects are likely to face another challenge--success. Companies that have implemented various levels of service management report that the approach can work. However, the process takes planning and commitment from top management. Staff or specialists familiar with implementing service management are also crucial to the overall success of such a transition.
Ultimately, service management would place IT in the role of a service provider. Internal SLAs could help departments decide the level of services they truly require, and what they're willing to pay for them. Such SLAs could help tune their expectations and establish the IT department as a provider of services, rather than as a cost center.
For all the promise that service management principles hold, inertia will probably keep the transition from happening at many organizations. If what's working now seems adequate, why change?
A survey of Network Magazine readers confirms this view. More than half the readers polled indicated that they're satisfied with their network management hardware and software. They also didn't have plans to increase spending on management in the next 12 months.
The switch to service management requires support and planning from the top level down. It's not free, but it often offers rapid returns. Without support at the top and a consistent implementation plan, the chances of a successful implementation are slim.

Although the benefits can be substantial for even the smallest organizations, making the switch takes effort and involves process changes. Inertia will continue to be a strong force, slowing the change to service management.
Mark Brownstein can be reached at [email protected]. Send comments to [email protected].
Pink Elephant, an ITIL training and consulting firm, has developed a program called PinkVerify that evaluates products against the ITIL compatibility criteria and maintains an ongoing list of the compatibility status for products submitted. Go to www.pinkelephant.com/consulting/toolsets.
AMT: The Secret Sauce
Service management is a powerful concept looking for effective methods of implementation. Today, most management traffic runs in-band. When a user turns the computer off for the day, asset management, inventory, and many other elements of service management disappear from the network.
Intel's AMT, announced in September, can change that by moving reporting and intervention capabilities onto a separate processor. As long as a computer with an AMT chip is connected to the network and a power supply, network administrators can work with it. Unlike IPMI, which can only report basic system information such as CPU fan failures and turn a computer's power on or off, AMT enables a wide range of service management tasks.
The AMT chip features nonvolatile memory and is always available. The nonvolatile memory stores asset information and alerting filters. It also includes reserved areas that will allow select vendors to offer additional or specially tuned capabilities beyond those the AMT supplies.
Unlike software agents, the AMT is expected to be resistant to events that can disable software-based management products. Intel claims that its design makes the AMT hardware and firmware "tamper-resistant."
Asset management and tracking is expected to become easier and more accurate once computers with AMT start making their way onto desktops and into servers. A manager can poll the network and receive accurate information from AMT-enabled computers. The AMT also enables accurate tracking of assets, allowing managers to target maintenance, upgrades, and revisions to specific computers and, presumably, to workgroups or classes of users.

Perhaps even more important are the AMT's alerting and remote troubleshooting and recovery capabilities. The AMT can alert the manager to hardware failures and impending failures, OS lockups, system boot failures, and hardware sensor reports similar to those handled by IPMI.
The AMT also supports Serial over LAN, which allows a computer's keyboard input and text output to be redirected to the manager. Upon receiving an alert from the AMT, the manager can diagnose the machine remotely. This might involve reinstalling failed drivers, or even a complete OS if necessary.
AMT is expected to start appearing on hardware in the first half of 2005.
Risk Assessment: Service Management
Service management concepts are new for many network managers. Products from HP, IBM, BMC, Computer Associates, and other vendors provide varying levels of service management capabilities. The area still needs better integration of products across vendors.
Service management can be implemented in organizations of any size, and many large and mid-sized organizations have already done so successfully. What's needed is a new focus on network management, one from the top down that recognizes IT as a service provider, not just as an entity that keeps things running.
Service management's impact can be enormous. Companies that have implemented one or more aspects of service management report increased productivity and user satisfaction and reduced downtime, freeing network staff for more useful work than just keeping things running. This proactive approach to management has paid off well for current adopters.
Where there's fundamental change, there's also some risk. A rollout of service management practices and tools can support and incorporate current network management tools. However, for some, changing from reactive management to proactive management can be a leap. Some managers may have problems with the transition. Without commitment from top management, implementing service management efforts may not be completely successful.