Let's talk concept, or conceptual, search. This type of search does not replace familiar keyword and Boolean searches; it expands what search can do. Most eDiscovery collection and search products support concept search, although they differ in the specifics. Here's the question: with most meet-and-confer sessions still concentrating on keyword searches, is concept search worth the investment in time and supporting products? The answer is yes, if you know what you're doing.
Can it be useful? Yes, of course. The New Jersey Law Journal reported a dramatic example from an internal investigation into suspected embezzlement. The company had its suspicions, but a keyword search for terms related to banks, accounts and deposits turned up nothing meaningful. Then the company ran a search for clustered and threaded terms. That search surfaced a large number of baseball-related discussions between two men who were not sports fans. The company matched the terms and email dates to bank transfers, exposing the embezzlers and their code words.
However, this entertaining example is not the major reason for using concept search. One of the real challenges of keyword and Boolean searches is that a party may insist that the opposing party search on dozens, even hundreds, of keywords. The reasoning is that such searches are simple to carry out. In practice, though, the returned data sets are very large. This strains storage resources and processing cycles and, most importantly, adds a huge burden to the already expensive manual review process. Ideally, concept search fixes this problem by significantly improving search accuracy, which yields smaller and more relevant data sets without requiring lawyer involvement. When the lawyers do review the results, they are dealing with a much smaller and far more accurate set of data.
As usual, Sedona weighs in with useful guidance.
"Alternative search tools are available to supplement simple keyword searching and Boolean search techniques. These include using fuzzy logic to capture variations on words; using conceptual searching, which makes use of taxonomies and ontologies assembled by linguists; and using other machine learning and text mining tools that employ mathematical probabilities."
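To make two of the techniques in that passage a little more concrete, here is a minimal Python sketch of fuzzy matching combined with a taxonomy lookup. It is only an illustration under stated assumptions: the `TAXONOMY` table, the function names and the 0.8 similarity cutoff are all hypothetical, and commercial eDiscovery products use far richer linguistic and statistical models than this.

```python
from difflib import get_close_matches

# Hypothetical, hand-built taxonomy mapping a concept to related terms.
# Real products rely on much larger taxonomies and ontologies.
TAXONOMY = {
    "banking": ["bank", "account", "deposit", "transfer", "wire"],
}


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [w.strip(".,;:!?\"'()").lower() for w in text.split()]


def keyword_search(keyword, documents):
    """Exact keyword match: indices of documents containing the keyword."""
    return [i for i, doc in enumerate(documents) if keyword in tokenize(doc)]


def concept_search(concept, documents, cutoff=0.8):
    """Indices of documents containing any term tied to the concept,
    allowing close spelling variations (a simple form of fuzzy matching)."""
    terms = TAXONOMY.get(concept, [])
    hits = []
    for i, doc in enumerate(documents):
        vocabulary = sorted(set(tokenize(doc)))
        # get_close_matches returns words whose similarity ratio >= cutoff.
        if any(get_close_matches(t, vocabulary, n=1, cutoff=cutoff) for t in terms):
            hits.append(i)
    return hits


documents = [
    "Please confirm the depsoit went through.",  # misspelled "deposit"
    "Lunch at noon?",
    "Wire the funds to the new acount.",         # misspelled "account"
]

exact_hits = keyword_search("deposit", documents)
concept_hits = concept_search("banking", documents)
```

In this toy data set, the exact keyword search for "deposit" misses the misspelled message entirely, while the concept search catches it along with the wire-transfer message. Note that neither technique would have found the baseball code words in the embezzlement example; that took clustering of the actual message content.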