Big Data Analytics And Why Datacenter Infrastructure Matters

We are witnessing a remarkable trend in the IT world: there are now more devices, more connections to the Internet, and more demand generated by devices than by people. At the same time, IT environments face a new challenge: managing the data all these devices produce.

The average employee already uses three to five devices to access a corporate datacenter. As more devices come online and connect to the cloud, they will produce more data -- data that will need to be managed and quantified. Let’s consider some numbers:

• According to IBM, the end-user community has so far created more than 2.7 zettabytes of data. In fact, so much data has been created so quickly that 90 percent of the data in the world today was created in the last two years alone.

• As of 2012, the U.S. Library of Congress had collected more than 240 terabytes of information.

• Facebook handles 300 million photos a day and about 105 terabytes of data every 30 minutes. It also stores 100 petabytes of data in a single Hadoop cluster, according to published reports.

• In 2012, the Obama administration officially announced the Big Data Research and Development Initiative; more than $200 million has since been invested in big data research projects.

• Research firm IDC forecasts the big data market will grow from $3.2 billion in 2010 to $32.4 billion in 2017.

• Finally, the 2012 IDC Digital Universe report predicts that we'll reach 40 zettabytes by 2020, a 50-fold growth from the beginning of 2010.
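
A quick back-of-the-envelope check in Python shows what that last IDC figure implies, assuming the 50-fold multiple and the 2010-to-2020 window the report cites:

```python
# Rough check of the IDC Digital Universe forecast: 40 ZB in 2020,
# described as a 50-fold increase over the beginning of 2010.
baseline_zb = 40 / 50              # implied 2010 starting point: 0.8 ZB
years = 10                         # beginning of 2010 through 2020
cagr = 50 ** (1 / years) - 1       # compound annual growth for a 50x increase
print(f"Implied 2010 baseline: {baseline_zb:.1f} ZB")
print(f"Implied annual growth rate: {cagr:.0%}")  # roughly 48% per year
```

In other words, the forecast amounts to data volumes growing by nearly half again every year for a decade.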

My questions to readers are: What good is a large amount of data if it can’t be analyzed or quantified? Furthermore, how is it possible to manage such large data sets over a cloud or WAN environment? Hadoop, for example, has become the unofficial standard for big data management engines, but platforms like it are still young and require a new skillset to deploy and manage.
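
To give a concrete sense of what that skillset looks like, here is a minimal word-count sketch in the Hadoop Streaming style, which lets any executable that reads stdin and writes tab-separated key/value lines act as a mapper or reducer. The single-file layout and the map/reduce command-line switch are illustrative choices, not part of any particular Hadoop distribution.

```python
#!/usr/bin/env python3
"""Minimal word count in the Hadoop Streaming style (illustrative sketch).

Hadoop Streaming pipes each input split to the mapper over stdin and feeds
the sorted, grouped mapper output to the reducer the same way, so plain
scripts like this can run as MapReduce tasks via hadoop-streaming.jar.
"""
import sys


def mapper():
    # Emit one "word<TAB>1" line per token in the input split.
    for line in sys.stdin:
        for word in line.strip().split():
            print(f"{word}\t1")


def reducer():
    # Input arrives sorted by key, so a running total per word suffices.
    current_word, count = None, 0
    for line in sys.stdin:
        word, value = line.rstrip("\n").split("\t", 1)
        if word == current_word:
            count += int(value)
        else:
            if current_word is not None:
                print(f"{current_word}\t{count}")
            current_word, count = word, int(value)
    if current_word is not None:
        print(f"{current_word}\t{count}")


if __name__ == "__main__":
    # Select the phase from the command line, e.g. pass "wordcount.py map"
    # as the -mapper argument and "wordcount.py reduce" as the -reducer.
    mapper() if sys.argv[1] == "map" else reducer()
```

The logic itself is trivial; the new skills lie in provisioning the cluster, sizing HDFS storage, and tuning how jobs like this one are scheduled across it.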

[Read about a new generation of cloud-based log aggregation and analysis services that promise to help enterprise IT security teams filter gigabytes of log data in "Big Data Analytics, Cloud Services and IT Security."]

Another challenge is that there aren’t many people who can speak the language of big data. Remember, all of these technologies are still emerging. But surprisingly, a 2011 Ventana Research survey, conducted with analytics company Karmasphere, showed that big data uptake was happening fast.

Ventana polled IT managers, developers, and data analysts working with Hadoop in hundreds of companies of various sizes and industries and found:

• More than one-half (54%) of organizations surveyed are using or considering Hadoop for large-scale data processing needs.

• Over 82% of respondents said they benefit from faster analyses and better utilization of computing resources.

• Eighty-seven percent of respondents are performing or planning new types of analyses with large-scale data.

• A whopping 94% of Hadoop users surveyed say they now perform analytics on large volumes of data they could not handle before; 88% analyze data in greater detail; and 82% report they now retain more of their data.

• More than two-thirds of Hadoop users perform advanced analysis -- data mining or algorithm development and testing.

Now back to reality. With the advances in cloud computing, IT consumerization, and the growth of data, where will all of this be controlled? The answer is simple: Behind all of these technologies sits the datacenter. High-density computing, advanced networking methodologies, and more efficient ways to manage datacenter infrastructure are the only means to meet today’s business demands.