Feature
Review: NetIQ Shows Analysis Smarts

  January 21, 2002
  By Michael J. DeMaria


What's your sign? I'm a Capricorn. According to most astrologers, that means I'm hard-working, organized and efficient. Let's say you run an astrology Web site and find that 60 percent of your traffic goes to the Capricorn horoscope page. You want to maximize advertising revenue, so you cleverly seek advertisers that sell products ideal for Capricorns. Hammocks and vacations, no. Organizational software and day planners, yes. Your site becomes immensely profitable, you retire at age 36, and, because you're a Libra (artistic, peace-loving and idealistic), you move to a commune in Vermont and paint pictures of trees.


There's one problem with this scenario: The stars won't reveal the demographics of your Web site visitors. That's why you need Web traffic analysis tools -- software that will analyze Web site log files and generate reports on factors such as visitor count, entry and exit pages, most viewed products and so forth. And we're not referring only to Web sites with shopping carts; sites that rely on repeat visitors, banner ads and quality content -- even ISPs trying to determine which user is getting the most hits this month -- can benefit from these tools too. Therefore, for this article, we wanted to test products offering a range of transaction analysis on Web server log files.

We performed our tests at our Syracuse, N.Y., Real-World Labs® using log files -- not simulated data -- from actual Web sites. For our tests we sought Web traffic analysis tools that could deliver analysis of page hits, latency, browser type, paths and drop-off rates, for a start. We eliminated products that required us to modify the Web pages by, for example, adding tagging or extra cookie data.

Imposing this limit was a tough decision, but we wanted a real-world benchmark based on real data gathered over a month, and we refused to do simulated testing. No simulation tool can generate the depth and diversity of Web users we were looking for, and we could not have tested a live site that required modifications. It costs money to modify all your Web pages and ensure that the new JavaScript doesn't crash users' browsers or make the page render differently. If your site contains thousands of pages, this task could be enormous, especially if the site is using static pages with no server-side includes.



Web Traffic Analysis Tool Features (chart)



Dynamic-content sites may have an easier time with this task, assuming they can embed the code into the templates. We also wouldn't be able to test our ISP's user pages, as that would require all the users to add the code to their Web pages. For a true VRM (visitor-relationship management) solution, which would merge offline data, credit transactions and so forth, you'd need a custom application. We didn't want this article to be an RFP -- we wanted to test shrink-wrapped solutions. And because we did not consider solutions that would require consultation beyond how to use the product, we disqualified companies, such as Accrue Software, NetGenesis or Personify, that offer highly customized, in-depth log analysis obtained by going to organizations' offices and working with their Web staffers.

These services are great if you want to merge offline data and correlate Web site visitors with brick-and-mortar customers, but the dollar cost is steep -- up to six figures -- and the time required to implement such solutions could range into months. In addition, such custom analysis would be impossible for most ISPs to implement as a service to their Web hosting customers because it is too labor-intensive and may require modifying Web pages; some vendors we talked to use little Java applets or custom cookies on every page.

In one of our test scenarios, we pretended to be an ISP looking at which user had the most hits or bandwidth utilization. Our rule was that the software can look at the log files only. With this in mind, we ended up with four entries: Microsoft Corp.'s Commerce Server, NetIQ Corp.'s WebTrends Reporting Center, Quantified Systems' Urchin 3.3 and Sane Solutions' NetTracker eBusiness Edition.

Data, Data Everywhere

We gathered approximately one month's worth of log files from four Web sites, all supplied by Syracuse-based ISP and Web hosting provider Dreamscape Online. We got logs for zodiac-x-files.com, virtualfreesites.com, hotshoppe.com and dreamscape.com, which includes dreamscape.com users' pages. In total, we had almost 9 GB of real-world log files to sort through. During our tests, we took the stance of a provider looking to offer log analysis as an added service. All but one of the Web sites run on Sun Microsystems Solaris and Apache Web servers. The exception is hotshoppe.com, which runs on Microsoft Windows NT and Internet Information Server (IIS).

Hotshoppe.com is the online side of a store in downtown Syracuse, and through the site you can purchase spices, wood chips and sauces so hot you need to sign a waiver. We were interested in seeing the most viewed products and product categories. For example, was chicken-wing sauce viewed more often than steak sauce?

In all, we got a good, diverse group of actual Web log data, though we did accept some limits on data access because of privacy concerns. For example, the purchase page of hotshoppe.com is on a separate secure server, so we didn't have those log files.

One of the easiest ways to see shopping-cart abandonment rates is to compare the number of people who reach the cart page with the number who go on to reach the thank-you page. However, the thank-you page for hotshoppe.com is also on the secure server. This made determining exact shopping-cart abandonment rates impossible, but we weren't about to ask for log files for a credit-card transaction server. Also, the banner ads for virtualfreesites.com were pulled from a remote server, so we couldn't see the number of clicks, but we could see the number of impressions.
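The cart-to-thank-you comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log is presumed to be already reduced to (visitor, page) pairs, each distinct visitor key (an IP address here) is treated as one person, and the paths /cart.asp and /thankyou.asp are made up rather than taken from the hotshoppe.com logs.

```python
def abandonment_rate(requests, cart_path, thanks_path):
    """Share of visitors who reached the cart page but never the thank-you page.

    `requests` is an iterable of (visitor, page) pairs. Returns None when
    nobody reached the cart page, since the rate is undefined in that case.
    """
    requests = list(requests)
    cart_visitors = {visitor for visitor, page in requests if page == cart_path}
    # Only count thank-you views from visitors who were also seen on the cart page.
    buyers = {visitor for visitor, page in requests if page == thanks_path} & cart_visitors
    if not cart_visitors:
        return None
    return 1 - len(buyers) / len(cart_visitors)


# Hypothetical sample data: four visitors reach the cart, one completes the order.
sample_requests = [
    ("10.0.0.1", "/cart.asp"),
    ("10.0.0.1", "/thankyou.asp"),
    ("10.0.0.2", "/cart.asp"),
    ("10.0.0.3", "/cart.asp"),
    ("10.0.0.4", "/cart.asp"),
    ("10.0.0.5", "/index.asp"),
]

rate = abandonment_rate(sample_requests, "/cart.asp", "/thankyou.asp")
print(f"Abandonment rate: {rate:.0%}")
```

Identifying visitors by IP address is a crude proxy, because proxies and shared addresses blur the count, so any figure produced this way is an estimate at best.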

For the dreamscape.com site, we wanted to find out information such as which user account was generating the most hits or using the most bandwidth. We also looked for the most popular Web browsers and platforms.
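A per-account tally of hits and bandwidth can be derived from the log files alone. The Python sketch below is a rough illustration, not how any of the reviewed products work. It assumes the logs are in Apache's Common Log Format and that user pages live under /~username/, which is a common layout but an assumption here; the sample lines are invented.

```python
import re
from collections import defaultdict

# Common Log Format: host ident authuser [date] "method path protocol" status bytes
CLF_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3}) (\d+|-)'
)
# Assumption: user pages are served from /~username/ (hypothetical layout).
USER_PATTERN = re.compile(r"^/~([^/]+)")


def usage_by_user(lines):
    """Return {username: [hits, bytes_sent]} for requests under /~username/."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = CLF_PATTERN.match(line)
        if not match:
            continue  # skip lines that are not valid Common Log Format
        path, size = match.group(3), match.group(5)
        user = USER_PATTERN.match(path)
        if not user:
            continue  # request was not for a user page
        entry = totals[user.group(1)]
        entry[0] += 1
        entry[1] += 0 if size == "-" else int(size)  # "-" means no body was sent
    return dict(totals)


# Invented sample log lines for illustration only.
sample_log = [
    '10.0.0.1 - - [01/Dec/2001:10:00:00 -0500] "GET /~alice/index.html HTTP/1.0" 200 2048',
    '10.0.0.2 - - [01/Dec/2001:10:00:05 -0500] "GET /~alice/pic.gif HTTP/1.0" 200 1024',
    '10.0.0.3 - - [01/Dec/2001:10:00:09 -0500] "GET /~bob/ HTTP/1.0" 304 -',
    '10.0.0.4 - - [01/Dec/2001:10:00:12 -0500] "GET /index.html HTTP/1.0" 200 4000',
]

usage = usage_by_user(sample_log)
busiest = max(usage, key=lambda name: usage[name][0])
print(busiest, usage[busiest])
```

Ranking by the second element of each entry instead of the first would give the heaviest bandwidth user rather than the account with the most hits.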

Pulling Out the Big Guns

In our initial discussions with vendors, many advised us to use a powerful machine to analyze large amounts of traffic. On average, the vendors, especially the consulting-based vendors, said we should use a dual- or quad-processor machine with at least 1 GB of RAM. No problem there: We had a beefy machine for our tests -- a Compaq Computer Corp. ProLiant DL580. With quad Xeon processors, 4 GB of RAM and 54.6 GB of striped RAID disk space, this machine was übercool. It also contained much more power than we needed for our tests (but would make an awesome Quake server). CPU utilization was rather low, around 25 percent at most, and RAM usage stayed at less than 1 GB. Of course, your actual horsepower needs will vary depending on your site's size and usage levels.

You also can take advantage of the added RAM for more efficient caching, possibly storing the log files or databases, for example. We also heard recommendations to use a separate machine as a database server. To perform our biggest test -- on the dreamscape.com site, which had approximately 40 million hits -- the products took between 2.5 hours and 16 hours: Commerce Server took 16 hours, NetTracker 12.5 hours, WebTrends six hours and Urchin 2.5 hours.
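To put those run times in perspective, the following Python sketch converts them into approximate processing rates. The hit count and hours come from the test above; the per-second figures are simple arithmetic on the rounded 40 million estimate, not measurements we took.

```python
TOTAL_HITS = 40_000_000  # approximate size of the dreamscape.com test

# Run times in hours, as reported for the dreamscape.com test.
RUN_HOURS = {
    "Commerce Server": 16,
    "NetTracker": 12.5,
    "WebTrends": 6,
    "Urchin": 2.5,
}


def hits_per_second(total_hits, hours):
    """Average number of log entries processed per second."""
    return total_hits / (hours * 3600)


rates = {name: round(hits_per_second(TOTAL_HITS, hours))
         for name, hours in RUN_HOURS.items()}

# Print from slowest to fastest.
for name, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{name}: about {rate} hits per second")
```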

Commerce Server and NetTracker use a Microsoft SQL Server database and are heavily affected by the speed of the disk subsystem. However, these two products work a bit differently from the others as well: Commerce Server and NetTracker shove much more information into the SQL database. For example, you can use NetTracker to see the entry and exit pages of each and every visitor, as opposed to seeing just the most common among all the users.

This also means that NetTracker and Commerce Server require more disk space. NetTracker used 10 GB of drive space for the database files; Urchin used only 50 MB. But you can do advanced and home-built data mining with a SQL-based analyzer, which some organizations might find desirable. WebTrends and Urchin use proprietary databases.
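As a rough idea of what home-built mining against a SQL store might look like, the Python sketch below finds each visitor's entry and exit pages. Everything in it is an assumption for illustration: the one-table schema (`hits` with `visitor`, `ts` and `page` columns) is hypothetical and is not the schema used by NetTracker or Commerce Server, and it runs on Python's built-in SQLite module rather than Microsoft SQL Server.

```python
import sqlite3

# Hypothetical schema: one row per page view.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (visitor TEXT, ts INTEGER, page TEXT)")
conn.executemany(
    "INSERT INTO hits VALUES (?, ?, ?)",
    [
        ("10.0.0.1", 100, "/index.html"),
        ("10.0.0.1", 160, "/sauces.html"),
        ("10.0.0.1", 220, "/cart.html"),
        ("10.0.0.2", 130, "/spices.html"),
        ("10.0.0.2", 190, "/index.html"),
        ("10.0.0.3", 300, "/index.html"),
    ],
)

# For every visitor, pick the earliest and the latest page viewed.
ENTRY_EXIT_SQL = """
SELECT v.visitor,
       (SELECT page FROM hits WHERE visitor = v.visitor
         ORDER BY ts ASC LIMIT 1) AS entry_page,
       (SELECT page FROM hits WHERE visitor = v.visitor
         ORDER BY ts DESC LIMIT 1) AS exit_page
FROM (SELECT DISTINCT visitor FROM hits) AS v
ORDER BY v.visitor
"""

entry_exit = conn.execute(ENTRY_EXIT_SQL).fetchall()
for visitor, entry_page, exit_page in entry_exit:
    print(f"{visitor}: entered at {entry_page}, left from {exit_page}")
```

Per-visitor detail of this kind is what requires keeping every hit in the database, which is consistent with the larger disk footprint of the SQL-based products.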

