Network Computing
Network & Systems Infrastructure
FEATURE
Review: NetIQ Shows Analysis Smarts

  January 21, 2002
  By Michael J. DeMaria




Hi Ho, Hi Ho, Data Mining We Will Go

You'd be amazed how much information can be found in the log files of Web servers: the IP address behind each hit, access dates, HTTP GET and POST requests, byte sizes of file transfers, error codes, the browser the client used, the referrer that sent the visitor, and search-engine keywords.

Of course, to perform log analysis, you need to turn on logging. During periods of heavy load, a site might turn off logging because writing log files adds considerable disk and processor overhead. Likewise, to get information about referrers and search-engine keywords, you need to turn on the referrer log option. Not all Web servers will log referrers by default, and some servers will log the referrer in a separate text file.

For simplicity, you may want to combine the referrer and access log into one file; Apache calls this the combined log format. In fact, Apache once received a Big Brother award for its default logging capabilities. Whatever. We feel the information in the log files is useful for technical and security reasons, such as finding out if someone is linking to a deleted page or attempting to hack your Web server with buffer overflows. Although many feel the Big Brother people have a point, singling out Apache isn't fair: Most Web servers include some form of logging capability.
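To make the fields concrete, here is a minimal sketch of splitting one log line into the items listed above. It assumes the layout Apache's documentation calls the combined log format (the access fields followed by the referrer and the browser string); the sample line, its address and the function name are invented for illustration, and real logs may need a more forgiving pattern.

```python
import re

# Assumed layout: Apache combined log format (access fields + referrer + user agent).
COMBINED_RE = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if the line does not match."""
    match = COMBINED_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # Servers write "-" for the size when no response body was sent.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


# Hypothetical sample line; the address comes from a documentation-only range.
SAMPLE = (
    '192.0.2.10 - - [21/Jan/2002:10:15:32 -0500] '
    '"GET /product.php?ProductCategory=5&ProductID=12 HTTP/1.0" 200 2326 '
    '"http://www.google.com/search?q=hot+sauce" "Mozilla/4.0 (compatible; MSIE 6.0)"'
)

if __name__ == "__main__":
    for field, value in parse_line(SAMPLE).items():
        print(field, "=", value)
```

If the server writes referrers to a separate file, the last two groups would simply be absent and the pattern would need to be shortened to match.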

Of course, while our having free rein to use logged data is ideal from an IT standpoint, corporate privacy policies may come into play. Suppose you took information about users (based on IP address, login ID or cookies, for example) and sold it to another retailer. Then you could start to build a user profile on someone. Remember the trouble DoubleClick got into when it tracked people via cookies across multiple Web sites? For our tests, we assumed that the privacy policies of the sites forbade revealing information about individual visitors to outside sources. We kept this in mind during our tests, and all data and printouts were returned or deleted.

So what can you find out with these log analyzers? A lot. Each product supports a standard set of items -- hit count, page views, path analysis, entry pages, exit pages, duration of stay, visitor count and so forth. You also can see error messages and find broken or outdated links, assuming you have set up your Web server to log HTTP error codes.
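As a rough illustration of the broken-link report, the sketch below counts 404 responses and groups them by the page that referred the visitor. The record layout (dicts with `url`, `status` and `referrer` keys) and the sample data are assumptions made for this example, not the format any of the reviewed products uses.

```python
from collections import Counter


def broken_links(records):
    """Count (missing URL, referring page) pairs among 404 responses, most frequent first."""
    counts = Counter()
    for rec in records:
        if rec["status"] == 404:
            counts[(rec["url"], rec["referrer"])] += 1
    return counts.most_common()


# Hypothetical, already-parsed log records.
RECORDS = [
    {"url": "/old-catalog.html", "status": 404, "referrer": "http://example.com/links.html"},
    {"url": "/index.html", "status": 200, "referrer": "-"},
    {"url": "/old-catalog.html", "status": 404, "referrer": "http://example.com/links.html"},
    {"url": "/promo.html", "status": 404, "referrer": "-"},
]

if __name__ == "__main__":
    for (url, referrer), hits in broken_links(RECORDS):
        print(hits, url, "linked from", referrer)
```

A referrer of "-" means the browser sent none, so that entry points to a typed or bookmarked URL rather than a fixable link on another page.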

All the products we tested let you see a good deal of information relating to the referrer log. Part of the HTTP header includes referrer information, which tells the Web server where you are coming from. This would be like someone calling you on the phone and saying "Hello. Mr. Zizek said you might be able to assist me." You know where the caller came from.

The referrer also reveals search-engine keywords. Marketing people love this feature. Say the top search engine for hotshoppe.com is Google, and the most common keyword is "hot sauce." Google lets you purchase advertisements based on what a person searched for. So to maximize traffic, hotshoppe.com would purchase "hot sauce" on Google. While that might seem so obvious it's silly, there are many other possible keywords, such as "chicken," "spicy," "pepper," "chili" and "wings." You want advertising to be as cost-effective as possible, which means limiting the number of keywords you purchase. This capability lets you get the most bang for your buck by identifying the top keywords.
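The keyword report boils down to pulling the search terms out of the referrer's query string and counting them. The sketch below shows the idea; the table of search-engine hosts and their query parameters is an assumption that a real tool would maintain far more thoroughly, and the function names are made up for this example.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Assumed mapping of search-engine host to the query parameter that holds the search terms.
SEARCH_PARAMS = {"www.google.com": "q", "search.yahoo.com": "p"}


def search_phrase(referrer):
    """Return the lowercased search phrase in a referrer URL, or None if there is none."""
    parts = urlsplit(referrer)
    param = SEARCH_PARAMS.get(parts.netloc.lower())
    if param is None:
        return None
    values = parse_qs(parts.query).get(param)
    return values[0].lower() if values else None


def top_phrases(referrers, n=3):
    """Return the n most common search phrases with their counts."""
    phrases = (search_phrase(r) for r in referrers)
    return Counter(p for p in phrases if p).most_common(n)


if __name__ == "__main__":
    sample = [
        "http://www.google.com/search?q=hot+sauce",
        "http://www.google.com/search?q=Hot+Sauce",
        "http://search.yahoo.com/search?p=chili",
        "http://example.com/recipes.html",
        "-",
    ]
    print(top_phrases(sample))
```

Ranking whole phrases rather than single words keeps "hot sauce" from being split into two unrelated keywords.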

But don't get too excited. These tools also produce information that is inexact or just plain wrong. How do you count a visitor? By IP address, session time or browser type? Twenty people sitting behind a proxy may count as one visitor. Dial-up ISPs constantly reassign IP addresses depending on who is dialed in, so it's possible for any number of different people to have the same IP address. Although cookies help greatly in tracking visitors, not every site or user accepts cookies, so the total number of visits may be off.
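The effect of the counting rule is easy to demonstrate. In this sketch the same three hits produce different visit totals depending on whether a visitor is identified by IP address alone or by IP address plus browser string; the 30-minute session timeout and the sample hits are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed cutoff between two visits


def by_ip(hit):
    return hit["ip"]


def by_ip_and_agent(hit):
    return (hit["ip"], hit["agent"])


def count_visits(hits, identity):
    """Count visits: a new visit starts when an identity is new or has been idle too long."""
    last_seen = {}
    visits = 0
    for hit in sorted(hits, key=lambda h: h["time"]):
        who = identity(hit)
        previous = last_seen.get(who)
        if previous is None or hit["time"] - previous > SESSION_TIMEOUT:
            visits += 1
        last_seen[who] = hit["time"]
    return visits


# Hypothetical hits: two people behind one proxy address, one of whom returns later.
HITS = [
    {"ip": "192.0.2.1", "agent": "Browser A", "time": datetime(2002, 1, 21, 10, 0)},
    {"ip": "192.0.2.1", "agent": "Browser B", "time": datetime(2002, 1, 21, 10, 5)},
    {"ip": "192.0.2.1", "agent": "Browser A", "time": datetime(2002, 1, 21, 11, 0)},
]

if __name__ == "__main__":
    print("by IP:", count_visits(HITS, by_ip))
    print("by IP and browser:", count_visits(HITS, by_ip_and_agent))
```

Neither total is right in any absolute sense; the point is that the answer moves with the definition, which is why the figures from different tools rarely agree.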

WebTrends lets you see the most active cities, and it turns out that about 85 percent of Internet users live in Vienna, Va. Perhaps you should advertise there. But wait, Vienna's also the home of America Online. The other vendors don't include a most active city report, primarily because such reports are so inaccurate.

It's also a bit disappointing to see most of the visitors to your site staying for less than 10 seconds. Time of visit can be misleading, especially if someone just wants to read a 5,000-line article. This is why JavaScript, applets, session IDs and other little tricks are commonly used for tracking visitors more accurately. Of course, users can turn off applets and JavaScript, just as they can do with cookies.
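One common way to estimate length of stay from logs alone is to subtract the time of a visit's first request from the time of its last. The sketch below, with made-up timestamps, shows why that shortchanges readers: the time spent on the final page never appears in the log, so a one-page visit measures as zero seconds no matter how long the article was.

```python
from datetime import datetime


def visit_duration_seconds(hit_times):
    """Seconds between the first and last request of a visit (0.0 for one or no requests)."""
    if not hit_times:
        return 0.0
    return (max(hit_times) - min(hit_times)).total_seconds()


if __name__ == "__main__":
    one_page = [datetime(2002, 1, 21, 10, 0, 0)]
    two_pages = [datetime(2002, 1, 21, 10, 0, 0), datetime(2002, 1, 21, 10, 0, 8)]
    print(visit_duration_seconds(one_page))
    print(visit_duration_seconds(two_pages))
```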

Web analysis tools are a boon for the marketing department, and one of the most useful features they provide is the ability to determine which products are being viewed most often on an e-commerce site. Many Web sites will use dynamic pages for their product catalogs, so a URL might look like /product.php?ProductCategory=5&ProductID=12. All the products can parse the parameters in such a URL, so each will show the most accessed product categories and the top products. WebTrends also lets you create a text file matching up the IDs, so instead of seeing products 1, 5 and 28, you'd see black binder, stapler and 20 oz. coffee mug, for example.
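The product report can be sketched in a few lines: parse the query string of each requested URL, count the product IDs, and translate them through a lookup table. The `ProductID` parameter comes from the example URL above; the dictionary stands in for the ID-to-name text file and is not WebTrends' actual file format.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Stand-in for the ID-to-name lookup file; the layout here is an assumption.
PRODUCT_NAMES = {"1": "black binder", "5": "stapler", "28": "20 oz. coffee mug"}


def top_products(urls, names=PRODUCT_NAMES):
    """Count product views by name, falling back to the raw ID when no name is known."""
    counts = Counter()
    for url in urls:
        query = parse_qs(urlsplit(url).query)
        for product_id in query.get("ProductID", []):
            counts[names.get(product_id, "product " + product_id)] += 1
    return counts.most_common()


if __name__ == "__main__":
    requests = [
        "/product.php?ProductCategory=5&ProductID=12",
        "/product.php?ProductCategory=2&ProductID=5",
        "/product.php?ProductCategory=2&ProductID=5",
        "/index.html",
    ]
    print(top_products(requests))
```

The same loop, keyed on `ProductCategory` instead, would give the most accessed categories.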

These tools can be invaluable in determining Web site visitor demographics. All the products we looked at work in a similar manner, but each has distinguishing features as well. We recommend that you consult with marketing and sales when deciding which to purchase.

Report Card

We graded the products in five areas, with each category given equal weight: number and customizability of reports; detail level, which measures how specific the reports could get, with additional points awarded if the product supports custom application data mining; speed; price; and user interface. Pricing is all over the board for these products; for more details, see "If You Have To Ask, You're Probably Buying an Analysis Tool." All the products we tested operate through a Web browser, but we also considered the quality of printouts and how easy it would be to print directly to transparency.

Picking a winner was a bit tough because any given company's choice will depend on its needs. In the end, though it was a tight race and everyone made a strong showing, we gave NetIQ's WebTrends our Editor's Choice award. WebTrends had better e-commerce reporting capabilities, and it was fast and compact.

Urchin is great for those who need speed -- it blew the other products away with how fast it analyzed data. If you need to process 8 GB or more of log files every day and don't need as great a level of detail, Urchin is the best choice. NetTracker is great for those who want to do very in-depth data mining. It retains lots of information, such as the entry and exit pages of each and every visitor. However, this depth comes at a cost of speed.

Finally, there's Microsoft Commerce Server, which is an entire e-commerce suite that runs on IIS. For this review, we looked only at the product's traffic-analysis capabilities, but keep in mind that it does more than just log analysis. For those who use Commerce Server as their e-commerce software, the product's log-analysis capabilities are such that you won't need a third-party solution. However, if you use Apache or some other e-commerce software, you should steer clear.

All the vendors except Urchin allow for more advanced data mining. Sane Solutions (the maker of NetTracker) and Microsoft use a SQL Server database, which you can query with your own custom code. WebTrends also offers this capability in an add-on package.

