REVIEW
Panning for Gold

  September 18, 2003
  By Sean Doherty



CSIRO Panoptic Enterprise Search Engine 4.2.0

Panoptic achieved the best performance in our navigational searches, proved to be the best indexer out of the box and offers the best price-to-feature ratio. Best of all, installation and configuration were a breeze compared with those of Kanisa Site Search, which requires a number of postinstallation steps to configure IIS and enable the file system for use.

Although Panoptic is not as full-featured as Kanisa and MondoSearch, it has the most intuitive administrative interface for managing the search process. Once the installation from CD-ROM finishes, the system is almost ready to use with a Red Hat version of Linux.

Panoptic requires an Intel PIII 1-GHz processor with 512 MB of RAM and at least two 40-GB disk drives. Panoptic is flexible when it comes to the OS: It is the only participant that supports Linux, Windows and Sun Solaris, and the only product that supports SSL out of the box. The admin interface and the sample user interface support Internet Explorer, Mozilla and Netscape Web browsers. By default, the admin interface is available from a secure port (HTTPS, 443), while the sample user interface is available under the default port (HTTP, 80).


Like those of MondoSearch and dtSearch, Panoptic's sample user interface can be configured from the administrative interface. Even without any configuration, the advanced-search form contains entries that leverage author and title metatags. If you receive too many hits, you can refine your query to search within your results, and you can limit your search by document type and date. Panoptic supports all the major standards for metadata, including the Dublin Core. Our other participants support metadata but do not detail their support.

To begin the search process, you create a collection--a finite set of Web pages to index and search. If you have logical divisions in your Web content, you can distinguish them by collection to facilitate search and retrieval. For example, you can create separate "collections" distinguished by content type: news, sales support. This can narrow a user's search and increase the number of relevant documents returned.

We created a Web collection by giving it an external display name, "Network Computing Magazine," and a unique internal name, "nwcmag." Then we identified the collection as our Network Computing production site. As with Kanisa and MondoSearch, you can confine the content collection to specific pages, such as those on the www.networkcomputing.com site or its alias www.nwc.com. That way, the Web crawler will not detour and follow off-site links. You can also limit the discovery depth from the starting URL. All four search engines in this review support limits on link depth.
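To illustrate the idea of confining a crawl to a site and its alias while capping link depth, here is a minimal Python sketch. It is not Panoptic's configuration format or code; the function name `in_scope` and the depth limit of 3 are assumptions made for the example. Only the two host names come from our test setup.

```python
from urllib.parse import urlparse

# Hosts taken from the review's test site; everything else is hypothetical.
ALLOWED_HOSTS = {"www.networkcomputing.com", "www.nwc.com"}
MAX_DEPTH = 3  # assumed limit on clicks away from the starting URL


def in_scope(url, depth, allowed_hosts=ALLOWED_HOSTS, max_depth=MAX_DEPTH):
    """Return True if a discovered link should be queued for crawling."""
    host = (urlparse(url).hostname or "").lower()
    # Reject off-site links and links that are too many clicks deep.
    return host in allowed_hosts and depth <= max_depth
```

A crawler would call a check like this on every link it discovers, so off-site links are dropped before they are ever fetched.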

Panoptic supports its own Java-based crawler, called FunnelBack. When you set up a Web collection, you define how a crawler will gather data for the search engine to index. In the advanced settings, you can directly edit a collection configuration file that contains the options for FunnelBack. For example, you can limit the length of time the crawler runs. You can also configure a maximum number of pages to store, limit the number of clicks (links) away from the home page and define many other settings. We excluded a file type to disregard Netgravity links. All the crawlers have a similar feature that excludes certain directories or files from a crawler's scrutiny. This is in addition to following the directives in a robots.txt file.
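The kinds of limits described above (a time budget, a page cap, exclusion patterns and robots.txt directives) can be sketched in Python as follows. This is an illustration under stated assumptions, not FunnelBack's implementation or option names: `should_fetch`, its parameters and the `*.ads` pattern are hypothetical. The robots.txt handling uses Python's standard `urllib.robotparser` module.

```python
import fnmatch
from urllib.robotparser import RobotFileParser

# Hypothetical exclusion patterns; the review does not give the actual one.
EXCLUDE_PATTERNS = ["*.ads", "*/private-drafts/*"]


def build_robots(lines):
    """Parse robots.txt content that has already been downloaded."""
    robots = RobotFileParser()
    robots.parse(lines)
    return robots


def should_fetch(url, robots, pages_stored, max_pages,
                 elapsed_seconds, max_seconds, exclude=EXCLUDE_PATTERNS):
    """Apply crawl limits, then exclusions, then robots.txt directives."""
    if pages_stored >= max_pages or elapsed_seconds >= max_seconds:
        return False  # page or time budget exhausted
    if any(fnmatch.fnmatchcase(url, pattern) for pattern in exclude):
        return False  # excluded file type or directory
    return robots.can_fetch("*", url)
```

The exclusion patterns are checked in addition to robots.txt, mirroring the behavior described for the crawlers in this review.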



Search Engine Features

FunnelBack took just under nine hours to crawl our production Web site and index 34,720 documents--more than any other participant. Once it completed the crawl, Panoptic made the resulting collection immediately available to the default search form.

Unlike Kanisa and MondoSearch, Panoptic does not provide a preview or prepublishing database to test before going live. Instead, it has two options that protect you from putting a partially collected database into production. A changeover-percentage option specifies the minimum size a newly gathered collection must reach, relative to the collection it is replacing, before it is made available. In addition, Panoptic has a "vital_servers" option, which prevents an update from overwriting your production database if a server is down during the collection process.
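The logic behind these two safeguards can be sketched in a few lines of Python. This is our interpretation of the behavior described above, not Panoptic's code: the function `safe_to_swap` and its parameters are hypothetical, and the exact semantics of the changeover-percentage and "vital_servers" options may differ in the product.

```python
def safe_to_swap(new_count, old_count, changeover_percent,
                 vital_servers, reachable_servers):
    """Decide whether a freshly gathered collection may replace the live one.

    new_count / old_count: documents in the new and current collections.
    changeover_percent: minimum size of the new collection, as a
        percentage of the one it replaces (assumed meaning).
    vital_servers: hosts that must have been reachable during the crawl.
    reachable_servers: hosts the crawler actually reached.
    """
    # Refuse the update if any vital server was down during collection.
    if any(server not in reachable_servers for server in vital_servers):
        return False
    # With nothing to replace, any new collection is acceptable.
    if old_count == 0:
        return True
    # Require the new collection to reach the minimum relative size.
    return new_count * 100 >= old_count * changeover_percent
```

For example, with a threshold of 50 percent, a recrawl that gathers only 10,000 documents would not replace a 34,720-document collection.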

Panoptic's easy-to-use administrative interface sets it apart from Kanisa and MondoSearch. In addition to setting parameters, you can use a form to schedule collection updates through the crontab file; this is a multistep process for Kanisa and MondoSearch. Panoptic also has extensive log files, but does not provide the reporting that Kanisa does.

Panoptic Enterprise Search Engine, CSIRO (Commonwealth Scientific and Industrial Research Organisation). +61-2-6216-7060. www.panopticsearch.com


