There's little doubt that monitoring and alerting tools showing device and interface up/down status and historical utilization statistics are important. After all, they are a foundational part of ensuring the network is up and running from a high-level perspective.
That said, the role of the enterprise network administrator is evolving to the point where new monitoring tools are required. Thanks to cloud computing, the Internet of Things (IoT), and digital transformation (DX), network monitoring is now being pushed beyond traditional physical boundaries. Let's look at several reasons why network monitoring tools in the enterprise must now track application, device, and software-as-a-service (SaaS) related performance.
Until recently, enterprise network infrastructures were relatively static in nature. Physical boundaries separated the corporate network that contained most end-user applications, data, and services within the LAN, WLAN, and WAN. Thus, from a network perspective, if the network devices were up and pushing packets, relatively little added visibility was required. SNMP, ping, traceroute, and syslog reporting were all that was needed. The only intelligent routing capabilities were found in spanning tree, dynamic routing, and hardware/software failure recovery protocols. Lastly, most adds or changes to the network were considered major and required a great deal of time to architect and implement.
About a decade ago, things began to change with the mass acceptance of cloud computing. This is when physical network boundaries began to dissolve. Suddenly, networks stretched into third-party managed infrastructure-as-a-service (IaaS) clouds, and apps and data moved into platform-as-a-service (PaaS) and SaaS environments. It is also when visibility gaps in network monitoring and alerting tools began to form.
As cloud expansion continued over the years, new technologies and business trends emerged that put further pressure on the need for advanced network visibility. These include the deployment of IoT devices onto managed or unmanaged LANs, WANs, and carrier cellular networks. Additionally, the shift towards digital transformation meant that network monitoring had to track not only interface performance statistics but also the performance of individual applications that are now considered mission-critical thanks to DX.
Finally, networks are now being designed to be scalable and more intelligent from a resiliency perspective. Architectures have reached the point where major, highly automated changes to the network, and thus to data flows, are considered normal. All of this is creating a situation where the network team must be ready to monitor, troubleshoot, and remediate highly specific problems when the slightest hint of a performance problem is revealed.
Fortunately, there are tools available that network administrators can tap into to achieve the level of cloud, device, and application performance visibility required. Some of the data sources, such as NetFlow and IPFIX, have been around for years but are often ignored from a monitoring perspective. Other monitoring data sources are new, including streaming network telemetry and network-based deep packet inspection (DPI). Thus, old network monitoring sources combined with the new are providing entirely new levels of visibility.
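To make the idea of application-level visibility from flow data more concrete, here is a minimal Python sketch. It assumes flow records have already been exported and decoded by a collector; the `FlowRecord` type, the `APP_BY_PORT` mapping, and the sample addresses are all hypothetical and are not part of NetFlow, IPFIX, or any particular product. It only shows how per-flow byte counts might be rolled up by application instead of by interface.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical, already-decoded flow record (not a real NetFlow/IPFIX parser)."""
    src_ip: str
    dst_ip: str
    dst_port: int
    byte_count: int


# Illustrative mapping only; real classification is usually far richer than a port lookup.
APP_BY_PORT = {443: "https", 53: "dns", 1883: "mqtt"}


def bytes_per_application(flows):
    """Sum byte counts per application label, using the destination port as a rough hint."""
    totals = defaultdict(int)
    for flow in flows:
        app = APP_BY_PORT.get(flow.dst_port, "other")
        totals[app] += flow.byte_count
    return dict(totals)


# Made-up sample data using documentation address ranges.
sample_flows = [
    FlowRecord("10.0.0.5", "203.0.113.10", 443, 1200),
    FlowRecord("10.0.0.6", "203.0.113.10", 443, 800),
    FlowRecord("10.0.0.7", "198.51.100.2", 53, 90),
    FlowRecord("10.0.0.8", "198.51.100.9", 1883, 300),
    FlowRecord("10.0.0.9", "198.51.100.9", 8080, 50),
]

if __name__ == "__main__":
    for app, total in sorted(bytes_per_application(sample_flows).items()):
        print(f"{app}: {total} bytes")
```

A port lookup is a crude stand-in for application identification; it is used here only because it keeps the sketch short.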
While added monitoring and alerting capabilities are great, they can add to the workload of an already busy network admin. That’s why we’re seeing a shift away from separate network, application, and device monitoring tools towards what’s being referred to as artificial intelligence (AI) for IT operations, or AIOps for short. AIOps platforms combine traditional monitoring tools with streaming telemetry and DPI and analyze all of it using AI. The AI analyzes each data source and correlates multiple anomalies to automate the identification of problems while also providing detailed information on how to fix the issue. Thus, if an AIOps platform is properly implemented, not only does it provide more visibility into potential problems, it also eliminates many manual troubleshooting and remediation tasks.
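The correlation step can be illustrated with a deliberately simple Python sketch. It is not how any particular AIOps platform works: it swaps in a plain z-score test for the AI models such platforms use, and the function names, metric names, threshold, and sample values are all assumptions made for illustration. The point is only to show the shape of the idea, which is to flag anomalies per data source and then keep the time slots where more than one source agrees.

```python
from statistics import mean, stdev


def anomalous_indices(samples, threshold=2.0):
    """Return the sample indices whose z-score exceeds the threshold."""
    mu = mean(samples)
    sigma = stdev(samples)
    if sigma == 0:
        return set()
    return {i for i, value in enumerate(samples) if abs(value - mu) / sigma > threshold}


def correlated_anomalies(series_by_source, threshold=2.0):
    """Map each time index to the sources flagging it, keeping indices flagged by two or more."""
    flagged = {}
    for name, samples in series_by_source.items():
        for i in anomalous_indices(samples, threshold):
            flagged.setdefault(i, []).append(name)
    return {i: sorted(names) for i, names in flagged.items() if len(names) >= 2}


# Made-up metrics sampled at the same eight intervals from three hypothetical sources.
metrics = {
    "interface_util": [40, 42, 41, 39, 40, 95, 41, 40],
    "app_latency_ms": [20, 21, 19, 20, 22, 180, 20, 21],
    "syslog_errors": [0, 1, 0, 0, 1, 0, 0, 1],
}

if __name__ == "__main__":
    for index, sources in correlated_anomalies(metrics).items():
        print(f"interval {index}: anomalies in {', '.join(sources)}")
```

In this sample, the utilization spike and the latency spike fall in the same interval, so they would be reported together as one likely incident rather than as two unrelated alerts.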