How To Tackle The Big Data Challenge (Part 1)
April 23, 2012
Big data is a term getting bandied about a lot these days. It describes the phenomenon of information that keeps growing in organizations, thanks in part to the growth in social media. According to the InformationWeek Research: The Big Data Management Challenge survey of technology professionals, the five top data drivers, regardless of industry, are financial transactions, email, imaging data, Web logs, and Internet text and documents. According to survey respondents, the two main benefits of big data management are the ability to standardize procedures and services, and the ability to organize data so it can be searched, browsed, navigated, analyzed and visualized.
"If you’re creating large data sets, you have no choice but to embrace big data management," says Michael Biddick, CEO of Fusion PPT, and author of the InformationWeek report. "Without the right tools and architectures, your business won’t be able to effectively use the information it has collected."
However, big data management has its challenges: 57% of respondents note that budget constraints are the main barrier preventing them from taking action. Biddick points out that storage alone can quickly consume an enterprise budget, and that most data centers double their storage capacity requirements every two to three years. "To manage big data, companies need to figure out the right mix of policies and technologies to balance access, performance with capacity, security, and short and long-term costs," he says.
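To put the doubling figure in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 100 Tbyte starting capacity and the six-year horizon are hypothetical values chosen for the example, not figures from the survey, and the function names are made up here.

```python
def projected_capacity(start_tb: float, years: float, doubling_years: float) -> float:
    """Capacity needed after `years` if requirements double every `doubling_years`."""
    return start_tb * 2 ** (years / doubling_years)


def annual_growth_rate(doubling_years: float) -> float:
    """Compound annual growth rate implied by a given doubling period."""
    return 2 ** (1 / doubling_years) - 1


if __name__ == "__main__":
    start_tb = 100.0   # hypothetical starting capacity, in Tbytes
    horizon_years = 6  # hypothetical planning horizon

    # The article cites a doubling period of two to three years.
    for doubling_years in (2, 3):
        rate = annual_growth_rate(doubling_years)
        needed = projected_capacity(start_tb, horizon_years, doubling_years)
        print(
            f"Doubling every {doubling_years} years: "
            f"about {rate:.0%} growth per year, "
            f"{needed:.0f} Tbytes after {horizon_years} years"
        )
```

Under these assumptions, a two-year doubling period implies roughly 41% growth per year and a three-year period roughly 26%, which is why storage can so quickly dominate a budget.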
And what even constitutes big data is not easy to define, Biddick says. He identifies four elements that determine whether a data set qualifies: size (30 Tbytes is a good starting point, he says); type of data, whether structured, unstructured or semi-structured; latency, since big data changes rapidly and the new data sets that result need to be analyzed quickly; and complexity. The characteristics of complex data, says Biddick, include large single-log files, sparse data and inconsistent data.
According to a recent study by the Enterprise Strategy Group, the cure can be almost as painful as the problem. The study defines big data as data sets that exceed the boundaries and sizes of normal processing capabilities, forcing organizations to take a non-traditional approach. Managing big data is difficult because the platforms are expensive and require new server and storage purchases, training in new technologies, building up an analytics tool set, and finding people with the expertise to work with it.
Another study from Infineta Systems, a provider of WAN optimization systems for big traffic, found that data center-to-data center connectivity is a "silent killer" for big data deployments. And a third recent study, IDC Predictions 2012: Competing for 2020, said big data analytics technologies will be one of the driving forces for IT spending through 2020.