
FEATURE

Open-Source Search Engines

October 16, 2000


Open-source search-engine efforts are alive and well. They may not handle the very largest collections, but they are almost infinitely configurable. Most offer little in the way of an administrative user interface and must be controlled from the command line and through configuration files, but they are powerful and flexible.

Ht://Dig (www.htdig.org) was developed at San Diego State University and released under the GPL (GNU General Public License). It's a solid search engine for Unix machines. Ht://Dig's robot follows links on Web pages, and the indexer calls external open-source code to read PDF and Microsoft Word files. Response is fast and the relevance ranking is reasonable (it should improve in version 3.2, under development as of this writing). There are several options for "fuzzy" text searching, including soundalikes, common word endings and synonyms. Administration has required editing configuration files, but an open-source ConfigDig interface now provides access to many of the features from a Web browser. The core development team is active and responsive, and there's a friendly community mailing list.
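To make the "soundalike" idea concrete, here is a minimal Python sketch of classic American Soundex, one well-known way to group words that sound similar. It illustrates the general technique only; it is not Ht://Dig's code, and the function names `soundex` and `build_soundalike_index` are invented for this example.

```python
# Illustrative sketch of soundalike matching using classic Soundex.
# Not taken from Ht://Dig; all names here are hypothetical.

# Consonant groups that share a Soundex digit.
CODES = {}
for digit, group in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in group:
        CODES[ch] = str(digit)


def soundex(word: str) -> str:
    """Return the 4-character Soundex key for an ASCII word ("" if none)."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate same-coded consonants
        code = CODES.get(ch, "")  # vowels and y have no code
        if code and code != prev:
            key += code
        prev = code  # a vowel resets prev, so a repeated consonant counts again
        if len(key) == 4:
            break
    return key.ljust(4, "0")


def build_soundalike_index(words):
    """Map each Soundex key to the set of indexed words that share it."""
    index = {}
    for w in words:
        index.setdefault(soundex(w), set()).add(w)
    return index


if __name__ == "__main__":
    index = build_soundalike_index(["Robert", "Rupert", "Rubin", "search"])
    # A query for "Robbert" finds every indexed word with the same key.
    print(sorted(index.get(soundex("Robbert"), set())))
```

A search engine using this approach would look up the query word's key at search time and treat every word stored under that key as a candidate match, usually ranked below exact matches.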

UdmSearch (search.mnogo.ru) was also developed under the GPL and can index Web pages, FTP sites, Usenet newsgroups and local files. For index storage, it can use almost any SQL server. Because it was developed in Udmurtia, a republic of the Russian Federation, it's very good at supporting multiple character sets and languages. In addition to simple HTML forms, UdmSearch provides PHP3, Perl and C CGI access to the search engine, offering significant flexibility in how search results are arranged. There's an active online community, and the developers answer questions quickly.
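As a rough picture of what keeping a search index in a SQL database can look like, the Python sketch below stores a tiny inverted index in SQLite. It is an assumption-laden illustration, not UdmSearch's actual schema or code: the `postings` table, the functions `open_index`, `index_document` and `search`, and the use of SQLite as the SQL back end are all invented for this example.

```python
# Illustrative sketch of a SQL-backed inverted index.
# Hypothetical schema and function names; not UdmSearch's design.
import re
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) the index database and its single postings table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS postings ("
        " word TEXT NOT NULL,"
        " url  TEXT NOT NULL,"
        " hits INTEGER NOT NULL,"
        " PRIMARY KEY (word, url))"
    )
    return conn


def index_document(conn, url, text):
    """Replace any existing postings for url with word counts from text."""
    counts = {}
    for word in re.findall(r"\w+", text.lower()):
        counts[word] = counts.get(word, 0) + 1
    conn.execute("DELETE FROM postings WHERE url = ?", (url,))
    conn.executemany(
        "INSERT INTO postings (word, url, hits) VALUES (?, ?, ?)",
        [(w, url, n) for w, n in counts.items()],
    )
    conn.commit()


def search(conn, query):
    """Return URLs containing every query word, most total hits first."""
    words = sorted(set(re.findall(r"\w+", query.lower())))
    if not words:
        return []
    placeholders = ",".join("?" * len(words))
    rows = conn.execute(
        f"SELECT url, SUM(hits) AS score FROM postings"
        f" WHERE word IN ({placeholders})"
        f" GROUP BY url HAVING COUNT(*) = ?"
        f" ORDER BY score DESC, url",
        (*words, len(words)),
    )
    return [url for url, _score in rows]


if __name__ == "__main__":
    conn = open_index()
    index_document(conn, "http://example.com/a", "open source search engine search")
    index_document(conn, "http://example.com/b", "open source database")
    print(search(conn, "open source"))
```

Because the index is ordinary SQL data, a front end written in any language with a database driver can query it and arrange the results as it likes, which is the kind of flexibility the SQL storage approach allows.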


