David Hill

Network Computing Blogger


Text Analytics Key To Unlocking Big Data Value

The talk these days is all about big data, but extracting insights that lead to value is the role of analytics, not of big data alone. And since much of that data is textual, the responsibility for delivering value falls to text analytics. That is a big deal.

I recently attended the Text and Social Media Analytics Summit in Cambridge, Mass. The conference highlighted the increasing importance of text analytics. Let me touch upon just a few of the ideas discussed at the meeting.

Good Analytics Crucial To Deriving Big Data Value

Gary King of Harvard University gave one of the most thought-provoking presentations, with the challenging title of “Big Data Is Not About the Data.” His thesis was that the value of big data actually lies in the analytics. To prove his point, he cited an examination of the solvency of the U.S. Social Security Administration (SSA). The SSA had used essentially the same statistical methods for 75 years, and overall SSA forecasts were inaccurate, inconsistent and overly optimistic. Through the use of customized analytics that King’s group at Harvard developed, forecasts using publicly available information showed that the SSA Trust needs over $1 trillion more than it thought.

King said this type of analysis would also apply to the insurance industry and public health, among other areas. His argument on the value of analytics over data seems to be that the data was already available, but extracting value depended upon building analytical techniques that could unlock that value.

Although King makes a strong point, both data and analytics matter. All the analytics in the world will be of no help if the data does not exist or you cannot access it. Still, King’s thesis really speaks to the need for creativity in the use of analytics to take advantage of data.

Integrating Structured And Unstructured Data Concepts

Traditional analytics has tended to focus on structured data--that is, data held in relational databases (for example, analyses run against traditional data warehouses). Much of big data tends to fall into the unstructured data category. (I distinguish between semi-structured and unstructured data, but I won’t push the difference here.) Unstructured data tends to respond to analytical techniques such as text analytics, rather than the analytics typically applied to SQL data. This has led to the thesis that the two are separate and distinct (as well as to the thesis that non-SQL techniques will dominate).

Ralph Winters of Emblem Health and other speakers at the conference vigorously disagreed with this point of view. In his presentation, “Practical Text Mining with SQL -- Using Relational Databases,” Winters clearly showed the value in mapping unstructured data to structured data with a full-text search that led to a weighted word matrix and other types of structured analyses. This could be used to spot churn or conduct a sentiment analysis. Tying a relational database to such things as a Hadoop connector, open source text mining tools and file interfaces can lead to increased analytical richness.
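
A minimal sketch of the idea, not taken from Winters's presentation: free text is reduced to rows in a relational table (document, term, count), and plain SQL then produces a weighted word matrix. The function name, table name and the TF-IDF weighting are assumptions chosen for illustration, and simple tokenizing in Python stands in for a database's full-text search feature.

```python
import math
import re
import sqlite3


def build_weighted_word_matrix(docs):
    """Return {(doc_id, term): weight} using TF-IDF computed in SQL.

    docs maps a document id to its raw text. All names here are
    hypothetical; the weighting scheme is an assumption, not the
    method described at the conference.
    """
    conn = sqlite3.connect(":memory:")
    # Register a natural-log function so the query does not depend on
    # SQLite's optional built-in math functions.
    conn.create_function("ln", 1, math.log)
    conn.execute(
        "CREATE TABLE term_counts (doc_id INTEGER, term TEXT, n INTEGER)"
    )

    # Unstructured text becomes structured rows: one row per (doc, term).
    for doc_id, text in docs.items():
        counts = {}
        for term in re.findall(r"[a-z']+", text.lower()):
            counts[term] = counts.get(term, 0) + 1
        conn.executemany(
            "INSERT INTO term_counts VALUES (?, ?, ?)",
            [(doc_id, term, n) for term, n in counts.items()],
        )

    # weight = term count * ln(total documents / documents containing term)
    rows = conn.execute(
        """
        SELECT tc.doc_id, tc.term,
               tc.n * ln(CAST(total.docs AS REAL) / df.docs) AS weight
        FROM term_counts AS tc
        JOIN (SELECT term, COUNT(*) AS docs
              FROM term_counts GROUP BY term) AS df
          ON df.term = tc.term
        CROSS JOIN (SELECT COUNT(DISTINCT doc_id) AS docs
                    FROM term_counts) AS total
        ORDER BY tc.doc_id, tc.term
        """
    ).fetchall()
    conn.close()
    return {(doc_id, term): weight for doc_id, term, weight in rows}


if __name__ == "__main__":
    sample = {1: "cancel my plan, cancel now", 2: "love my plan"}
    for key, weight in build_weighted_word_matrix(sample).items():
        print(key, round(weight, 3))
```

Terms that appear in every document get a weight of zero, while terms concentrated in a few documents (such as "cancel" in the sample) stand out, which is the kind of signal a churn or sentiment analysis could then build on.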

The whole field of text (and other) analytics continues to evolve. Integrating analytic concepts traditionally applied to structured data with techniques traditionally applied to unstructured data shows great promise.

Text Analytics Has Many Practical Uses

Many examples were discussed during the summit, so I hesitate to focus on just one presentation, but Sergei Ananyan, CEO of Megaputer Intelligence, did a good job of discussing the business applications of text analytics.

In the 21st century, text analytics is taking advantage of machine learning, semantic analysis and deep linguistic parsing. All that can lead to useful applications, such as loan default analyses and sentiment analyses, according to Ananyan. One of the more important areas is medical diagnostics, where early diagnostics can eliminate common sources of error. Another example is the use of text analytics in e-discovery, which is the examination of electronic information for evidence in a legal case.

Mesabi Musings

We have been subject to an application-driven software intelligence perspective of IT--where applications have dominated our consciousness as to where we derive value from IT--for most of our lives. So a data-driven software intelligence perspective such as big data, where value in IT is squeezed from the data itself, is not only unfamiliar and hard to comprehend but also a little uncomfortable. Yet the world of data-driven software intelligence is the world of text analytics and will transform our view of how to get value from the IT infrastructure.

Pay attention to what is happening, as it will affect your business life more and more.

