After examining the benefits and disadvantages of both internal and external monitors, we have found that the best monitoring solution is a hybrid. For troubleshooting, testing and detailed management, you will need an internal solution, but to really know how your site is performing, you will also need an external monitor to provide the all-important customer view of performance. Without this data, you can't justify the cost of scaling initiatives or show that a customer's "slow" experience is indicative of a last-mile issue and not a problem with your site. You need data to back up your claims, and it needs to be from the customer point of view. External monitoring offers this and could provide a very reasonable justification to move ahead with your scaling efforts if necessary. Or the data could give you ammunition to refute claims that your site is not performing up to snuff.
Internet service monitors provide the capability to assess the performance and availability of Internet services, such as HTTP, mail, news, FTP and the most important aspect of e-business, transactions. Whether they are services provided by a third party or software installed at strategic points on your network, these monitors provide the information you need to ensure that your customers are receiving top-notch performance, and they make it possible to fix performance issues proactively by scaling your services.
Performance statistics make trend analysis possible. This statistical information -- response times, number of visits, number of transactions and so on -- can offer valuable insight into the ability of your site to support customers as they wish to be supported. For instance, degrading response times from your servers may signal a steady increase in customers, as the servers strain to support the greater load. A monitor can alert you to such a degenerating performance trend, and you can then use that information to determine how best to scale your e-business site before customers click away.
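As a rough illustration of the trend analysis described above, the following Python sketch fits a least-squares line to a series of response-time samples and flags the series when the slope exceeds a limit. The function names and the 0.05-second-per-sample threshold are assumptions made for this example; they do not come from any particular monitoring product.

```python
from statistics import mean


def response_time_slope(samples):
    """Return the least-squares slope of response time per sample interval.

    `samples` is a sequence of response times in seconds, taken at
    evenly spaced intervals (an assumption of this sketch).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to compute a trend")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(samples)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, samples))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


def is_degrading(samples, slope_threshold=0.05):
    """Flag a series whose response time grows faster than the threshold.

    The default threshold is a hypothetical value; a real deployment
    would tune it against its own baseline.
    """
    return response_time_slope(samples) > slope_threshold
```

A monitor built along these lines would call is_degrading on a sliding window of recent samples and raise an alert when it returns True, which is a different trigger from a fixed response-time ceiling: the trend can be caught while individual samples are still acceptable.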
Innies and Outies
Two basic types of monitoring are used to maintain your e-business infrastructure performance:
- Internal monitoring. This type of monitoring is provided by software installed locally on your LAN. Generally, some type of agent is distributed to provide performance and availability statistics from strategic points within your corporate LAN.
- External monitoring. This type of monitoring is provided as a service by a third party and is often referred to as "hosted" monitoring. These services provide performance and availability statistics from a number of strategic points across the country and often globally, depending on the service provider.
The most significant differences between the two types of monitoring are the types of data you can expect to receive and the way in which poor performance is reported. Internal monitoring systems often provide e-mail, pager and SNMP alerts when performance falls below a specified threshold. External monitoring can provide e-mail and pager alerts, but the services cannot be configured for SNMP because, among other reasons, many routers and switches are configured to block SNMP traffic into and out of most networks. Therefore, external providers cannot guarantee delivery of an SNMP alert and do not provide this option. Without SNMP, integrating an external monitor into existing performance-management systems is difficult.
The type of data received from external monitoring systems is often not granular enough to fully analyze performance issues. An external monitor may break down the total response time for a Web page into DNS time and server time, and it may offer client, server and network details as well. But an external monitor cannot examine in detail the performance of a single server. Internal systems can often be configured to monitor and capture not only the performance statistics of a particular Web page but any database or other transaction-based performance data associated with the page.
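To make the DNS-time versus server-time breakdown concrete, here is a minimal Python sketch of how a single probe might split the total response time for a plain HTTP page. It is not how any hosted service is known to work: the function names are invented for this example, "server time" here lumps together connection setup, server processing and transfer, and the sketch handles only http:// URLs.

```python
import http.client
import socket
import time
from urllib.parse import urlsplit


def _fetch_http(address, port, host, path):
    """Fetch `path` from an already-resolved address, sending the real Host header."""
    conn = http.client.HTTPConnection(address, port, timeout=10)
    try:
        conn.request("GET", path, headers={"Host": host})
        return conn.getresponse().read()
    finally:
        conn.close()


def measure_page(url, resolve=socket.gethostbyname, fetch=_fetch_http,
                 clock=time.perf_counter):
    """Time DNS resolution and the fetch separately for one page.

    `resolve`, `fetch` and `clock` are injectable so the sketch can be
    exercised without network access.
    """
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    started = clock()
    address = resolve(host)                    # DNS lookup only
    resolved = clock()
    body = fetch(address, port, host, path)    # connect + request + transfer
    finished = clock()

    return {
        "dns_s": resolved - started,
        "server_s": finished - resolved,
        "total_s": finished - started,
        "bytes": len(body),
    }
```

Resolving the name first and then connecting to the returned address keeps the second measurement free of a repeated DNS lookup. Note what this probe cannot see: it measures the page from outside, so it says nothing about which server in a cluster answered or how long a back-end database call took.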
Among the most alluring aspects of a hosted monitor is that you don't have to install software on the LAN or change code on Web pages. Hosted monitors generally need nothing more than a URL and a few options configured to begin monitoring. Internal monitors, on the other hand, often require agents to be distributed across the LAN. Some recent offerings also require code changes to deliver performance statistics to the central server. Code changes are bad -- especially if you need to make them to sites already in production. Upgrades or changes to the service will force more revisions to production sites, and every time you modify those sites, you incur the overhead of staging, testing and deployment. The potential cost in time and quality is not worth the few benefits of a service that requires coding changes.
But perhaps the most important aspect of hosted monitors is that the data provided by these services, while rarely as detailed or informative as administrators and developers would like, closely approximates the customer's experience. External monitors are more likely to offer performance statistics that provide insight into why sales or visits from certain areas of the world may be lagging behind those from other areas. This information, in turn, can lead to tactical decisions in Web site deployment that can turn around these performance issues -- such as the implementation of a global load-balancing scheme and the introduction of caching appliances into the network architecture.
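The regional comparison described above can be sketched in a few lines of Python. The example groups response-time samples by monitoring location and reports the locations whose average is well above the median of all locations. The function name, the sample format and the 1.5x factor are assumptions chosen for illustration only.

```python
from collections import defaultdict
from statistics import mean, median


def lagging_regions(samples, factor=1.5):
    """Return regions whose mean response time exceeds `factor` x the median region.

    `samples` is an iterable of (region, response_time_seconds) pairs,
    a hypothetical format for data exported from an external monitor.
    """
    by_region = defaultdict(list)
    for region, seconds in samples:
        by_region[region].append(seconds)

    averages = {region: mean(times) for region, times in by_region.items()}
    if not averages:
        return {}

    baseline = median(averages.values())
    return {
        region: average
        for region, average in averages.items()
        if average > factor * baseline
    }
```

Comparing against the median rather than the overall mean keeps one very slow region from dragging the baseline up and hiding itself.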
When data is returned, it is often presented initially as a summary of overall response time and availability from multiple locations around the world or at least the country. The data can then be examined in greater detail, including DNS time, response time and even network latency, to present a more accurate view of the Web site's performance. Holistix Remote Monitor (see "Holistix Offers Solid Alternative to In-House Site Monitoring") is a hosted monitoring service that provides a heavily detailed view of your site's performance from varied global locations -- including a breakdown of statistics by object response time and size, network latency and DNS times. Keynote Systems, one of the first providers of hosted monitoring services, also provides a detailed view of performance data (see "How Healthy Is Your Site?").
High network latency may indicate the need to insert caching appliances at the edge of your network or in strategic locations around the country to reduce the number of hops between client and data. If response times for specific Web objects are high, reducing the size of the objects may be necessary. For instance, large images can often be reduced to a lower resolution and thus a smaller size, providing quicker response times. If response times for all objects appear to be deteriorating, it may be time to add a server to your farm or to implement a farm.
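The rules of thumb above can be written down as a small decision function. This Python sketch is a simplification under stated assumptions: the thresholds are hypothetical, and "all objects deteriorating" is approximated by "all objects currently over the limit" rather than by a real trend over time.

```python
def suggest_actions(latency_s, object_times, latency_limit_s=0.3,
                    object_limit_s=2.0):
    """Map monitor readings to the scaling options discussed in the text.

    `latency_s` is the measured network latency in seconds and
    `object_times` maps each Web object name to its response time in
    seconds. Both limits are illustrative defaults, not recommendations.
    """
    suggestions = []

    # High network latency: move content closer to the client.
    if latency_s > latency_limit_s:
        suggestions.append("consider caching appliances closer to clients")

    slow = sorted(name for name, t in object_times.items() if t > object_limit_s)

    if object_times and len(slow) == len(object_times):
        # Everything is slow: the servers themselves are likely overloaded.
        suggestions.append("consider adding a server or implementing a farm")
    elif slow:
        # Only some objects are slow: look at the objects, not the servers.
        suggestions.append("consider reducing the size of: " + ", ".join(slow))

    return suggestions
```

The order of the checks matters only for the object rules: a site where every object is slow gets the capacity suggestion instead of a long list of objects to shrink.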
If you've implemented a homogeneous Web farm (all servers in the cluster are identical in content), and specific pages have high response times -- specifically those requiring dynamic content or executing CGI-based scripts -- you might want to reconfigure your cluster topology so machines executing CGI-based scripts are separated from those providing static content. This could be done by separating the machines into separate clusters or using the services of a Layer 7-capable load-balancing solution (see "Web Server Director Comes Out on Top of the Pile").
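The idea of steering CGI requests and static requests to separate groups of machines can be sketched as a toy Layer 7 router in Python. This is only an illustration of the routing decision, not the behavior of any load-balancing product: the class name, the URL patterns treated as dynamic and the round-robin policy are all assumptions of the example.

```python
from itertools import cycle
from urllib.parse import urlsplit

# Hypothetical patterns that mark a request as dynamic (CGI) content.
DYNAMIC_PREFIXES = ("/cgi-bin/",)
DYNAMIC_SUFFIXES = (".cgi", ".pl")


def is_dynamic(path):
    """Decide from the URL path alone whether a request needs a CGI server."""
    return path.startswith(DYNAMIC_PREFIXES) or path.endswith(DYNAMIC_SUFFIXES)


class Layer7Router:
    """Round-robin within two pools, chosen by inspecting the request URL."""

    def __init__(self, static_servers, dynamic_servers):
        self._static = cycle(static_servers)
        self._dynamic = cycle(dynamic_servers)

    def route(self, url):
        """Return the name of the server that should handle `url`."""
        path = urlsplit(url).path
        pool = self._dynamic if is_dynamic(path) else self._static
        return next(pool)
```

Because the decision depends on the URL rather than on the IP address and port, it has to be made at Layer 7; a Layer 4 device sees only the connection and cannot tell a CGI request from a request for an image.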
Still, internal monitors can provide a view of performance that external monitors cannot, because internal monitors can look at the individual servers within a cluster. To monitor the services hidden in the cluster's depths, internal monitors require that agents be deployed on the servers themselves. This additional view of performance provides an edge when you're looking to scale up your e-business.