What Is Natural-Language Searching?
October 16, 2000
Wouldn't it be nice if you could ask your search engine a question in plain English and have it answer with the pages you want? The next wave of search engines may let you do just that--and do it in an even more useful way than Ask Jeeves or Excalibur RetrievalWare does.

Search-engine designers are working with linguists to understand and act on the meanings in text--a field known as natural-language processing, or NLP. The designers use algorithms that rely on phonetics, morphology, syntax, semantics, discourse and pragmatic context. Search engines can use these linguistic aspects to improve searching by expanding search terms--for example, finding France, Germany and Belgium when searching for Europe.

In the most complex form of natural-language search, the system analyzes each query and each document and distills it to a formal representation; the representations can then be compared to locate the best matches. Such systems are designed to handle open questions and vague information needs ("Why is the sky blue?" for example) better than traditional engines, which require exact word matches and are better suited to a query such as "What articles have been published in Network Computing about search engines?"

Most search engines today use NLP only to interpret queries written in question form and to regularize singular and plural forms of words in the index. Excalibur RetrievalWare, for example, sets up a knowledge base for synonyms and provides conceptual searching. Ask Jeeves employs human subject experts to invent questions and locate their associated answer pages; this engine's designers use NLP mainly to match user questions to predefined questions. The Web search engine Northern Light uses NLP to recognize and separate kinds of documents (press releases or FAQs, for example), which provides helpful clustering of results.

But even as NLP promises to make searches better, there's still the issue of retraining users. Analysis of search-engine logs shows that very few users type whole questions. Instead, they tend to enter two or three key words and hope for the best. As a result, the search engines miss contextual clues as to the nature of the questions, which makes NLP less useful for interpreting queries. Still, in the long term, NLP for document analysis will locate more useful pages and greatly improve the relevance ranking and categorization of results.
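
Two of the simpler techniques described above, regularizing singular and plural forms and expanding a search term such as Europe to France, Germany and Belgium, can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not how any of the engines named here actually works: the `EXPANSIONS` table, the crude `normalize` rule and the sample documents are all hypothetical, and a production system would rely on a real stemmer and a much larger knowledge base.

```python
import re

# Hypothetical knowledge base: a broad term maps to narrower terms.
EXPANSIONS = {
    "europe": ["france", "germany", "belgium"],
}


def normalize(word):
    """Crude singular/plural regularization (illustrative, not a real stemmer)."""
    word = word.lower()
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"          # "cities" -> "city"
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]                # "engines" -> "engine"
    return word


def tokenize(text):
    """Split text into alphabetic words and normalize each one."""
    return [normalize(w) for w in re.findall(r"[A-Za-z]+", text)]


def expand_query(query):
    """Return the query terms plus any narrower terms from the knowledge base."""
    terms = set()
    for term in tokenize(query):
        terms.add(term)
        terms.update(normalize(t) for t in EXPANSIONS.get(term, []))
    return terms


def search(query, documents):
    """Return the ids of documents sharing at least one term with the expanded query."""
    wanted = expand_query(query)
    return sorted(
        doc_id for doc_id, text in documents.items()
        if wanted & set(tokenize(text))
    )


# Hypothetical sample collection.
documents = {
    "d1": "Wine regions of France",
    "d2": "Search engines in Germany",
    "d3": "Hiking trails in Canada",
}

print(search("Europe", documents))   # ['d1', 'd2']
print(search("engine", documents))   # ['d2']
```

Neither matching document contains the word "Europe"; both are found only because the query was expanded. Likewise, the query "engine" matches "engines" only because both sides pass through the same normalization step.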