Christine Taylor

Network Computing Blogger



A Quick Look at Concept Search

Let's talk concept search, also called conceptual search. This type of search does not replace familiar keyword and Boolean searches; it expands on them. Most eDiscovery collection and search products support concept search, although they differ in the specifics. Here's the question: with most meet-and-confer sessions still concentrating on keyword searches, is concept search worth the investment in supporting products and time? The answer is sure -- if you know what you're doing.

Can it be useful? Yes, of course. The New Jersey Law Journal reported a dramatic example from an internal investigation into suspected embezzlement. The company had its suspicions, but a keyword search for terms related to banks, accounts and deposits turned up nothing meaningful. Then the company ran a search for clustered and threaded terms. It turned up a large number of baseball-related discussions between two men who were not sports fans. The company matched the terms and email dates to bank transfers, exposing the embezzlers and their code words.

However, this entertaining example is not the major reason for using concept search. One of the real challenges of keyword and Boolean searches is that a party may insist that the opposing party search on dozens, even hundreds, of keywords. The idea is that such searches are simple to carry out. In practice, the returned data sets are very large. This strains storage resources and processing cycles and, most importantly, adds a huge burden to the already expensive manual review process. Concept search ideally fixes this problem by significantly improving search accuracy, which yields smaller and more relevant data sets before the lawyers get involved. When the lawyers do review the results, they are dealing with a much smaller and far more accurate set of data.

As usual, The Sedona Conference weighs in with useful guidance:

"Alternative search tools are available to supplement simple keyword searching and Boolean search techniques. These include using fuzzy logic to capture variations on words; using conceptual searching, which makes use of taxonomies and ontologies assembled by linguists; and using other machine learning and text mining tools that employ mathematical probabilities."
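To make the difference concrete, here is a minimal Python sketch of two of the ideas in that quote: expanding a query through a taxonomy and using fuzzy matching to catch word variations. Everything in it is an assumption for illustration only. The `TAXONOMY` table, the sample documents and the function names are hypothetical, and the fuzzy matching uses the standard library's `difflib`, which is a simple string-similarity measure and not how any particular eDiscovery product works.

```python
import difflib
import re

# Hypothetical, hand-built taxonomy: one concept mapped to related terms.
# Real products rely on far larger taxonomies assembled by linguists.
TAXONOMY = {
    "banking": {"bank", "account", "deposit", "transfer", "wire"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_search(docs, keywords):
    """Return indexes of documents containing any keyword exactly."""
    wanted = {k.lower() for k in keywords}
    return [i for i, doc in enumerate(docs) if wanted & set(tokenize(doc))]


def concept_search(docs, concept, cutoff=0.8):
    """Return indexes of documents with a word close to any taxonomy term."""
    terms = sorted(TAXONOMY[concept])
    hits = []
    for i, doc in enumerate(docs):
        for token in tokenize(doc):
            # get_close_matches tolerates small spelling variations.
            if difflib.get_close_matches(token, terms, n=1, cutoff=cutoff):
                hits.append(i)
                break
    return hits


# Hypothetical sample documents.
docs = [
    "Please wire the funds before Friday.",
    "The depositt cleared yesterday.",
    "Lunch at noon?",
]

exact_hits = keyword_search(docs, ["deposit"])
concept_hits = concept_search(docs, "banking")
print(exact_hits, concept_hits)
```

In this toy data set the exact keyword "deposit" matches nothing, because one message uses a related word and another misspells the term, while the concept query picks up both messages and still skips the unrelated one.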





 
