Web SLA Managers: The View From There

We tested eight Web-site monitoring services on our NWC Inc. site and found Gomez Performance Network the best.

November 7, 2003

Network Computing

A Web monitoring service should offer a flexible and sensitive threshold mechanism. If you have specific objectives spelled out in your service-level agreement, make sure that your service provider can collect data in a way that lets you verify conformance.

Pricing for these services typically is based on the number of URLs monitored and the frequency of that monitoring. The pricing scheme may take into account the number of monitoring locations, the site location, the transactions monitored and even the number of Web pages involved in a transaction. For our purposes, we averaged price across a matrix that considers these pricing variables and split them into two groups, one for single URL monitoring and one for transactional monitoring. This is not a realistic price model; it's a price average, and actual costs are likely to be higher. We couldn't ignore price, but in our Report Card we gave it a low weighting of only 2.5 percent. "Web Monitor Pricing Chart" shows a simple chart of average prices. You can find a more detailed pricing chart here.

The Watchers

For our tests we monitored our NWC Inc. site, a real site housed at our lab in Green Bay, Wis. NWC Inc. produces and sells the all-important ... and fake ... widget, but it is a real enterprise-scale business application testing lab with a real Web-based storefront for transactions. For more detailed information about NWC Inc., see inc.networkcomputing.com.

For three months, we pointed Web monitoring services from Alert Me First, AlertSite, BMC Software, Computer Techniques, Dana Consulting, Elk Fork Technologies, Gomez and Keynote at our site. We monitored single URL pages and multiple pages strung together, aka transactions. Our transactions looked for widgets, bought widgets and even shipped widgets. We figure we bought more than 1 million widgets during our tests. Watch eBay for some good deals.

Our winner, Gomez, nudged out stalwart Keynote and Web monitoring newcomer BMC. All three offer good service monitoring and reporting. But Gomez outpaced the competition with its impressively granular performance-data gathering and service-monitoring controls. Keynote, which has the most experience in this area, charges too much. BMC has done a good job focusing on services while maintaining its strengths--network and systems monitoring. In the middle of the pack, we logged some excellent offerings from Elk Fork, AlertSite and Dana Consulting. Bringing up the rear, Alert Me First and Computer Techniques' 1stMonitor monitor only single URLs and only from a single location. But for certain price-sensitive shoppers, they might fit the bill.

Gomez Performance Network simply gave us more information than any of the other services we tested. It isn't the cheapest service, but Gomez's pricing makes sense and is predictable.

The MyYahoo-like start page offered obvious links to existing single-URL and transaction monitors, real-time tests, benchmarked comparisons of competitors and charts and graphs. To help us determine the root cause of a Web site's trouble, the GPN service led us through its considerable data collection by linking every report to some contextual information. It didn't solve all our problems, but it did provide the deepest insight of all the services we tested.

Most of the services tracked how long it took to resolve DNS names, connect to the server, download the first byte of data, then download all of the content. Gomez goes a bit further by breaking out offsite redirects, root pages and each object on a page. This last service--called page object or element tracking--is offered in real time and provides historical data. It gave us a unique view into the exact makeup of each page and our transaction performance.
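To make the phase breakdown concrete, here is a minimal Python sketch of how a probe's total time might be split into the DNS, connect, first-byte and content phases. The `PageTiming` record and `phase_breakdown` function are hypothetical names of our own, and the sketch assumes each probe records cumulative timestamps in seconds; it does not represent how any of the tested services computes its figures.

```python
from dataclasses import dataclass


@dataclass
class PageTiming:
    """Cumulative timestamps (seconds since the probe started). Hypothetical record."""
    dns_done: float    # DNS name resolved
    connected: float   # TCP connection established
    first_byte: float  # first byte of the response received
    last_byte: float   # all content received


def phase_breakdown(t: PageTiming) -> dict:
    """Split one probe's total time into the phases most services report."""
    return {
        "dns": t.dns_done,
        "connect": t.connected - t.dns_done,
        "first_byte": t.first_byte - t.connected,
        "content": t.last_byte - t.first_byte,
        "total": t.last_byte,
    }
```

A probe that finishes in 1.4 seconds might, for example, spend most of that time downloading content, which points at page weight rather than DNS or the server's connection handling.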

Gomez's trending reports provide a quick gauge for longer periods. The reports offer many and varied options for time, chart type (line or bar histogram) and data aggregation. We liked the data view, a quick summary of the single-URL and transaction monitoring, linked to underlying daily detail summaries, which were in turn linked to daily details for each monitoring location. The Gomez interface let us choose multiple transactions or single URLs to display concurrently for comparison. Gomez retains data for one year--one of the longest periods offered among the services we evaluated.

The alerts log on Gomez's product has a nice filter--by test, time, error type, subject line and progression. The usual historical time slice and sort by monitored device is similar to that offered by Dana Consulting's, Elk Fork's and Keynote's products.

Our tests left us without a doubt as to our winner. Gomez's service was more detailed, easier to work with and more sophisticated than the others. For example, all the services have some static thresholds, like the number of tests that have to fail before a fault is alerted and notification sent. Gomez offers this feature and gathers performance data over time to form a baseline. Percentage deviation from this baseline can be used to set thresholds for notification.
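The difference between a static threshold and a baseline threshold can be sketched in a few lines of Python. This is an illustration only: the function names are ours, and we assume the baseline is a simple mean of past response times, which may not match how Gomez actually derives it.

```python
from statistics import mean


def consecutive_failures_alert(results, limit):
    """Static threshold: alert once `limit` tests in a row have failed.

    `results` is a list of booleans, oldest first; True means the test passed.
    """
    streak = 0
    for passed in results:
        streak = 0 if passed else streak + 1
        if streak >= limit:
            return True
    return False


def baseline_deviation_alert(history, latest, max_deviation_pct):
    """Baseline threshold: alert when `latest` exceeds the mean of `history`
    by more than `max_deviation_pct` percent (assumed baseline: plain mean)."""
    baseline = mean(history)
    deviation_pct = (latest - baseline) / baseline * 100.0
    return deviation_pct > max_deviation_pct
```

The static check knows nothing about what is normal for a page, while the baseline check adapts as the historical data changes.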

Gomez also sets what it calls Health Monitors for measuring service impact. Both the "at risk" value and "critical" value are defined by the monitoring locations. A status report will show a health value for a page or transaction and give you a quick check as to whether you're meeting your agreed-upon service delivery.

Items such as response time, content match, transaction failure alert, page object alerts, page inaccessible and server unreachable all had separate values that could be set to trigger an alert. And each could be specified as a percentage of failing nodes or a fixed number of nodes. We would have liked to have notification groupings like these in the services from BMC, Dana Consulting and Elk Fork. This would have let them send different notifications to different Web site or network administrators based on their need to know when particular transactions or URLs failed.
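A trigger expressed as either a fixed number of failing nodes or a percentage of failing nodes might look like the following Python sketch. The function and parameter names are hypothetical and chosen for illustration; they are not taken from the Gomez interface.

```python
def should_alert(failing_nodes, total_nodes, *, min_count=None, min_percent=None):
    """Return True if enough monitoring nodes report a failure.

    Either limit may be given: `min_count` is a fixed number of failing
    nodes, `min_percent` a percentage of all nodes. Names are illustrative.
    """
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    if min_count is not None and failing_nodes >= min_count:
        return True
    if min_percent is not None:
        return failing_nodes / total_nodes * 100.0 >= min_percent
    return False
```

A percentage limit keeps its meaning as monitoring locations are added or removed, whereas a fixed count is easier to reason about when only a handful of locations are in use.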

User-access security is a two-tiered model, with an admin group and user read-only group. More groups, finer granularity and delegated authority would improve this model. We weren't thrilled with the fact that inaccessible options were offered in the user interface. It says the administrator is allowed to change preferences only, but the interface let us enter edit mode and appeared to save the changes without any warning that they might be invalid due to access restrictions. It turned out that the changes weren't saved and security was enforced. But these unavailable UI actions should be removed.

GPN service 4.0; Last Mile URL Monitoring. Gomez, (877) 372-6732, (781) 768-2100. www.gomez.com

BMC Patrol Express is a new offering--production began just as we started testing. Ordinarily, testing any "1.0" release gives us the heebie-jeebies, but we were pleasantly surprised with this service. Patrol Express provided the best views among those services tested, and it consistently demonstrated maturity and stability. The BMC legacy of deep system and network monitoring is evident in Patrol Express.

Patrol Express starts with a status screen organized around groups of service. Each service, which can consist of single URLs and transactions, shows an overview of color-coded bars for the underlying Web page, and the number of monitoring locations having trouble executing the URL or transaction. These services are then applied across the product for performance and fault views.

The services are organized like a directory. After you select a branch representing a service or transaction, the right-hand window displays the aggregation of the service at whatever level you select. There is no dumbed-down explanation for the CEO and no clicking through meaningless happy-face graphics for those who know where they want to go.

We first looked at alerts for our entire account; then we drilled down by service, which consisted of URLs and transactions related by the site they were running against, then by the separate URLs and transactions within the service. It's more useful to see high-level views when they're organized into logical groups this way.

Patrol Express has current-status and health-status options, which display historical performance in relation to service goals or thresholds as primary service-monitoring screens. Its reporting isn't as granular as that offered by Gomez, but the service shows performance over time, in comparison to thresholds, for availability, MTTR (Mean Time To Repair), page download and transaction path time. We created groups that included similar services and Web pages. We could then run all of the above reports on the entire group, or a particular transaction, on a page-by-page basis extending back as far as 12 months.
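For readers who want to check a provider's availability and MTTR figures against their own records, the arithmetic is simple. The Python sketch below assumes outages are logged as durations in minutes over a fixed reporting period; the function names are ours, not BMC's.

```python
def availability_pct(total_minutes, outage_minutes):
    """Percentage of the reporting period during which the service was up."""
    downtime = sum(outage_minutes)
    return (total_minutes - downtime) / total_minutes * 100.0


def mttr_minutes(outage_minutes):
    """Mean Time To Repair: average outage duration, or 0.0 with no outages."""
    if not outage_minutes:
        return 0.0
    return sum(outage_minutes) / len(outage_minutes)
```

With two outages of 30 and 90 minutes in a 30-day month (43,200 minutes), availability works out to roughly 99.72 percent and MTTR to 60 minutes.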

A log of alerts gave us a precise history of what happened. As alerts are resolved, they are displayed with an "X" icon. Of course, if an intermittent problem continues to occur, there's an obvious trail in the log.

Real-time tests are handled a bit differently in Patrol Express. Errors are linked with a label called "Diagnose." This feature e-mails a traceroute and a breakdown of connect, first-byte and last-byte times for whatever location is experiencing the error. We ran a test and received a response within five minutes.

We tested Patrol Express monitoring infrastructure behind one of our firewalls. The 35-MB Remote Service Monitor (RSM) downloads off the Patrol Express site to a PC. The RSM will ping and perform SNMPv1, v2 and v3 gets. SNMPv3 is implemented and includes authentication and privacy support. The PC running RSM then shows up in our account on the Patrol Express Web site as an available monitor, just like the ones that are running collocated at service providers.

Patrol Express will not only monitor routers and switches, but also a large number of storage devices, including those from Brocade Communications Systems, Cisco Systems, Hitachi, McData Corp. and Qlogic Corp.

PATROL Express 3.0. BMC Software, (800) 841-2031, (713) 918-8800. www.bmc.com

Keynote takes the roll-your-own approach to service presentation through its MyKeynote start page. We were able to save graphs, quickly access crucial overviews of availability and performance, dig deeper into graphs and reports and generate new ones, all while dictating the mix of current and historical data. Although the Keynote interface isn't as simple and straightforward as BMC's, it's usable and flexible.

To diagnose a problem using the Keynote service, we picked an alert and then ran reports around that URL or transaction until we found something to compare. Then we clicked through the summaries and sifted through the details to get a sense of which problems had occurred.

Real-time tests are housed under a "diagnose" tab and include ping, nslookup, traceroute and page download. This last test provides data on download time for each element, connect time, DNS lookup, redirect and error in a grid. When you pause the mouse over each breakdown, the site shows the element or procedure. This helped us understand how a particular page was performing.

The Keynote reports are linked logically to underlying data. For example, we ran a report showing availability and response time for the last month and noticed a period when there were no measurements at all. By selecting a monitored summary point on the overview graph, we were able to drill into the underlying data points for that day and see the availability and response in relationship to service goals for each hour.

Keynote's error-log display summarizes counts by days and provides links to the underlying details. Each line-item detail provides the status and time and notes to whom the alert was sent. Like Elk Fork's, Keynote's display links to a graphical representation of the error and shows details about the monitoring location and error.

Keynote limits data storage and display. In one example, we selected scatter graphs of all measurements and were limited to four hours of data. Keynote retains data for six weeks--longer than the four weeks Dana Consulting offers, but much shorter than Gomez's one year. The reports had the usual line or scatter graphs with threshold overlays for service comparisons. Keynote adds a "tear" publishing function to all reports, which lets the report live in a new window, making it easier to compare reports. The service also has a post option that maintains the report on the Keynote server for up to a week and gives access to anyone with the appropriate URL.

Keynote's trend reports, like those of Gomez, can display multiple transactions or URLs on the same report. The service's over-time options are granular, showing from one hour to six weeks' worth of line graphs for availability and response time, with all data displayed on a single graph. This display let us easily gauge the performance of different URLs, especially when compared with the separate graphs Gomez provides.

Baseline reports, which pitted our transactions against similar sites, also are available. Keynote, Gomez and AlertSite all offer this feature. Like Gomez, Keynote compares utilization and availability but adds regional and service-provider views.

Keynote supports alert triggers based on monitoring city or monitoring agent. You can specify all agents/cities or just some agents/cities, and define the availability and/or response threshold violation.

Instead of grouping alerts or notification, Keynote creates alert definitions. We like being able to attach many alert definitions to a URL or transaction monitor, and then attach the notification to many URL and transaction monitors. The alarm display allowed us to easily assign and audit those alarms we had applied to our monitoring.

Site administration is tightly controlled by Keynote and was not completely available to us. Keynote says it likes to review all scripts prior to letting them go live on its production network. And they charge more for full access. We offered to trade access for widgets, but Keynote hasn't gotten back to us yet.

Keynote Web Site Perspective & Keynote Transaction Perspective. Keynote Systems, (800) KEYNOTE, (650) 403-2400. www.keynote.com

Elk Fork's ElkMonitor is simple to use and offers good value. That said, it's definitely a second-tier service in line with AlertSite and Dotcom-Monitor.com. We were impressed with the fact that the data collected by ElkMonitor rivaled that received from Gomez and Keynote. But Elk Fork needs to add monitoring locations.

The initial console is simple but functional and got us quickly to the data collection. The front page of this interface shows each defined test and scrollable actions for each. For example, for each test you can create a report, edit the test, run an on-demand real-time test, show the status as of the last poll or manage the notifications.

Data is retained for four weeks, but when we selected time periods by actual dates we were able to go back two months. Elk Fork doesn't charge for the additional storage but doesn't guarantee it either.

ElkMonitor. Elk Fork Technologies, (866) 355-3675, (828) 682-2843. www.elkmonitor.com

Our impression of AlertSite can be summed up with the word good. AlertSite offers good service insight, data collection, pricing and monitoring locations. Its service is straightforward to use and gave us a quick sense about our site's current status while providing adequate details.

We found the service easy to administer, and it didn't take a lot of work to get going. We also found the site easy to navigate; it's split into segments for status, diagnostics, setup, transactions, alerts and notifications.

The AlertSite monitoring console displays the latest poll results in a compressed, easy-to-understand format. The monitoring-locations status is rolled up into a single line, which can be expanded to show specific location failures. Error codes--with context-sensitive definitions--are part of the status display.

However, we expected more summary, trend and error reports linked to underlying details within the diagnostic page. Instead, AlertSite lets you request a poll on a monitored site. Part of a notification definition can be a network snapshot in the form of a ping or a traceroute with the results included with the alert. Using this feature, we correlated some network knowledge at the time of a fault.

AlertSite's baseline report shows DNS, connect, first byte, last byte and error times. Neither Gomez nor Keynote offer a baseline. The AlertSite performance reports break out response time by monitoring location and roll these stats into daily and hourly averages. The regional reports are an average over a selected period of time or a more detailed version with averages recorded per hour.

AlertSite doesn't have any distribution options--pages cannot be mailed or posted to a Web site. Its Web interface is the primary way to get reports. By default, response time and availability data is kept for six months for single-page URL monitoring and three months for transactions. You can pay more for longer retention periods.

AlertSite has a useful notification-management feature that lets you group users by notification method, trigger or schedule. This makes it easy to add someone to a list for notification. It also takes the pain out of auditing who is on call for what site.

AlertSite Web Site Monitoring. AlertSite, (877) 302-5378, (561) 218-5527. http://www.alertsite.com

Dotcom-Monitor.com has a start page similar to ElkMonitor's: It shows the currently configured single URL and its transactions. You can edit, schedule and report on each. Also like ElkMonitor, Dotcom-Monitor.com doesn't allow for customization of this initial page.

We ran reports, edited scripts and managed monitoring schedules all for the devices listed. The service's reporting offers straightforward options regarding the monitoring location, the amount of time to report, site status and summary reports. The detail report displayed the 50 previous polls with status and response times.

Dotcom-Monitor.com does not offer many links between reports. Reports are simple and are not linked to the granular data. The summary report has links, but those underlying availability and response-time reports are static displays. They can't be e-mailed, shared or saved. The service will retain your data for 60 days with the standard offering--longer periods are available with custom pricing.

Dotcom-Monitor.com. Dana Consulting, (888) 479-0741, (763) 577-9668. www.dotcom-monitor.com

Alert Me First wants to be known for its quickness. But this single-location, single-URL monitor is not much of a bargain. Its service is simple to set up even if its interface is a bit cluttered. It's not very expensive, but it lacks reporting and flexibility. Its service thresholds are OK, but not as deep as those offered by the products in the middle of the pack.

Alert Me First takes pride in its name, pointing out that the frequency of its polling--at two, four and nine minutes--is a full minute earlier than most other services' default polling values, and hence it gets its data first. It is a step up in sophistication from 1stMonitor, the other single-URL monitoring service we tested, but it's also a big step up in cost. It's priced at more than $22 for a single URL monitored every 15 minutes, compared with 1stMonitor, which costs $6.95 for similar functionality.

The Alert Me First interface gives you a choice of conventional top-level selections--which devices to monitor, who to alert and so on. The device portion shows an overview status and a link to more details. As we monitored devices, the detail status displayed green for good connections, yellow for warnings and red for failures.

Alert Me First reports summary activity in daily increments, displaying minimum/maximum/average response times and availability percentages on a per-day basis, as well as overall for up to 30 days. The details of the entire day's monitoring probes are linked to this summary and displayed graphically as a horizontal line graph.
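A per-day summary of this kind is easy to reproduce from raw probe data. The Python sketch below assumes each probe is a `(day, response_seconds)` pair, with `None` standing in for a failed probe; both the data layout and the `daily_summary` name are our own assumptions, not Alert Me First's format.

```python
from collections import defaultdict


def daily_summary(probes):
    """Roll raw probes up into min/max/avg response time and availability per day.

    `probes` is an iterable of (day, response_seconds) pairs; a response of
    None marks a failed probe (an assumed convention for this sketch).
    """
    by_day = defaultdict(list)
    for day, response in probes:
        by_day[day].append(response)

    summary = {}
    for day, responses in by_day.items():
        ok = [r for r in responses if r is not None]
        summary[day] = {
            "min": min(ok) if ok else None,
            "max": max(ok) if ok else None,
            "avg": sum(ok) / len(ok) if ok else None,
            "availability_pct": len(ok) / len(responses) * 100.0,
        }
    return summary
```

Failed probes count against availability but are left out of the response-time statistics, so one timeout does not distort the day's average.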

Alert Me First. Alert Me First, (403) 399-4318. www.alertmefirst.com

1stMonitor, an Estonia-based one-location, one-URL monitoring service, is as simple and cheap as they come. If all you need to know is whether your site is up or down, this is the service for you.

The 1stMonitor console is simplicity itself. The entire Web interface is a single page where configuration amounts to specifying a URL, how often you want the service to check on its availability and where to send an e-mail alert. The online status indicators are green (OK) after a successful page download, yellow (warning) when the page download fails, and red (critical) when contact with the server fails or the specified "keyword" isn't found. That's it for the online Web site. We were in and out of there in less than 10 minutes.

The service sends notifications as text e-mail. Alerts show the date, time and host name. When the server became available, we received an "OK" message with the server's current response time. We also got weekly text reports showing the actual time and percent of time the URL was in available, warning and critical states. As with Alert Me First, data retention is not available.
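The three-color status scheme described for the 1stMonitor console can be written as a small decision function. This Python sketch is our reading of the rules as stated in the review; the function name, its inputs and the order in which the checks run are assumptions, not the vendor's implementation.

```python
def probe_status(server_reachable, page_downloaded, keyword_found):
    """Map one probe's outcome to green (OK), yellow (warning) or red (critical).

    Assumed order of checks: an unreachable server is critical, a failed
    download is a warning, and a missing keyword on a downloaded page is critical.
    """
    if not server_reachable:
        return "red"
    if not page_downloaded:
        return "yellow"
    if not keyword_found:
        return "red"
    return "green"
```

The keyword check matters because a server can return a page successfully while serving an error message or stale content.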

1stmonitor. Computer Techniques, (775) 429-8330 (Salem, Ore). www.1stmonitor.com

Bruce Boardman, executive editor of Network Computing, tests and writes about network management and systems. He has 12 years' IT experience managing networks and distributed computing for a financial service provider. Write to him at [email protected].

Web monitoring services, by downloading Web pages from computers collocated at various telecommunication service providers around the world, can give companies a sense of how their Web site looks and behaves for real end users outside the firewall.

We tested services ranging from those that can only monitor a single Web page from a single site to more expensive services that have locations worldwide and can follow transactions and report on how long it takes to download each element of a Web page.

Our Editor's Choice, Gomez Performance Network, just edged out Keynote and BMC Patrol Express for top honors. In the middle of the pack, a number of solid offerings from Elk Fork, Dana Consulting and AlertSite performed well; Dana Consulting's service earned our Best Value award. At the lower end of the spectrum, inexpensive but reliable services 1stMonitor and Alert Me First may fit the bill for those with shallow pockets.

How We Tested



[Figure: Detailed Pricing Chart]

We engaged each service as though we were an actual customer and used our NWC Inc. widget-manufacturing site as the target. We don't really build widgets, but users can browse our catalog, purchase widgets and track their orders. The site uses Web services, Java beans and back-end databases and feeds a separate financial-tracking site. For more info on NWC Inc., see inc.networkcomputing.com.



[Figure: List of Cities]

We recorded and specified URLs for each of the services to exercise the critical paths necessary for widget purchase. We gathered data on the availability and responsiveness of the site for several months. We also ran transactions against our nwc.com site. We created service thresholds and reports, and received alerts for the duration of the test. We also administered the site, changing thresholds and tracking problems that we were alerted to or noticed in the reports.

Our in-depth pricing chart can be found at left. A full list of cities for which each company provides service is at right.



From the Editors:

update 12/05/03

The prices listed for Gomez GPN were in error for the pricing scenarios we proposed. Gomez pricing averaged $295 for single URL and $795 for transactions. This represented per-unit pricing and not total prices. Total per-month cost increases as the number of URLs and transactions increases, but remains flat as the number of sites from which monitoring is performed increases, as long as the number of these monitoring sites is 10 or more.

Due to this error, we asked Gomez to resubmit its pricing in response to the specific scenario specified, which covered up to six monitoring locations worldwide. This changes the pricing score for Gomez. URL price moves from a score of 3.5 to 1 and transaction pricing moves from 3.75 to 2.5. The total Gomez score moves from 4.51 to 4.41. The report card now reflects this change. (See the complete and updated pricing matrix for all vendors here).

From the Editors:

Due to an Interactive Report Card constraint, the following weights were changed. The alterations did not impact the outcome or grading, however.

  • Service Monitoring: From 30% to 31%

  • Transaction Monitoring Price: From 2.5% to 2%

  • URL Monitoring Price: From 2.5% to 2%

