David Hill

Network Computing Blogger


EMC Sees Big Opportunity In Big Data

According to an EMC-sponsored IDC report, the amount of data amassed by consumers and businesses is expected to grow 44-fold this decade. A lot of that information will be what many, including EMC, call big data. Big data requires storage and other products and services that the company provides, so it should come as no surprise that, in its recent blizzard of announcements, EMC targeted big data as one of its key markets. Let's try to understand big data and what it means, and then briefly illustrate how EMC is addressing the big data market through its recent acquisitions of Isilon and Greenplum.

EMC's working definition of big data is "data sets, or information, whose scale, distribution, location in separate silos or timeliness require customers to employ new architectures to capture, store, integrate (into one data set), manage and analyze to realize business value." That is quite a mouthful, takes some time to digest and, of course, is shaped around what EMC can or wants to do. However, the definition covers the essence of the subject and makes some valid points. Let's look at some examples to gain a better perspective on the breadth of where big data resides in the real world:

  • Medical information--including medical images, such as MRIs, as well as electronic health records (EHRs);
  • Increased use of broadband on the Web--including the 2 billion photos each month that Facebook users currently upload, as well as the innumerable videos uploaded to YouTube and other multimedia sites;
  • Video surveillance--this is a booming business with a need for enormous volumes of storage, as well as the advanced analytics to make sense of it;
  • Increased global use of mobile devices--the torrent of texting is not likely to cease;
  • Smart devices--sensor-based collection of information has a tremendous future enabling smart electric grids, smart buildings and many other public and industry infrastructures;
  • Non-traditional IT devices--including the use of RFID readers and GPS navigation systems;
  • Non-traditional use of traditional IT information--including the transformation of OLTP into, say, a data warehouse for applying analytics, e-discovery and Web-generated information tools; and
  • Industry-specific requirements, including high-performance computing solutions in genomic research, oil and gas exploration, entertainment media, etc.

Now, a critic might say that there is nothing new here. For example, medical images and broadband Web access have been around for a long time. The reply is that big data-related changes are probably mostly a matter of degree but also, to some extent, a matter of kind. The matter of degree comes about because of the increased intensity of usage and higher scale--sheer volume of petabytes of storage--than we have ever had. The matter of kind relates to the transformation of data from analog to digital and the need to get business value in new ways. But the key point to remember is that big data is a huge market that translates into "big money." From an IT business perspective, that is why big data matters.

There have been, roughly speaking, three major waves in the kind of structure that information has had from an IT perspective. Note that a new wave does not replace the older ones, which continue to grow. All three types of data structure have always been present, but in each wave one type tends to dominate the others:

  • Structured information. This is the information that finds a home in relational databases and has dominated the use of IT for many years. It is still the home of the mission-critical OLTP systems that businesses depend upon. Among other things, structured database information can be sorted as well as queried.
  • Semi-structured information. This was the second major wave in IT and includes e-mails, word processing documents and a lot of information stored and presented on the Web. Semi-structured information is content-based and can be searched, which is the raison d'etre of Google.
  • Unstructured information. This can be thought of as primarily bit-mapped data in its native form. The data has to be put in a form that can be sensed (such as seen or heard in audio, video and multimedia files). A lot of big data is unstructured, and its sheer size and complexity require advanced analytics to create or impose structure that makes perceiving and interacting with it easier for humans.
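
To make the three categories above more concrete, here is a minimal Python sketch of what each one lets you do. Everything in it is an illustrative assumption rather than anything from EMC or the IDC report: the `orders` table, the `documents` dictionary, the `search` helper and the six-byte `pixels` sample are all hypothetical stand-ins.

```python
import sqlite3

# Structured: rows in a relational table can be queried and sorted by column.
# The "orders" table and its contents are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "acme", 250.0), (2, "globex", 75.5), (3, "acme", 120.0)],
)
acme_totals = [
    row[0]
    for row in conn.execute(
        "SELECT total FROM orders WHERE customer = ? ORDER BY total", ("acme",)
    )
]

# Semi-structured: free text has no fixed columns, but its content can be searched.
# The documents below are hypothetical e-mail and memo bodies.
documents = {
    "email-1": "Quarterly storage budget attached for review",
    "memo-7": "Video surveillance rollout needs more storage",
}


def search(term):
    """Return the names of documents whose text contains the term."""
    needle = term.lower()
    return sorted(name for name, text in documents.items() if needle in text.lower())


storage_hits = search("storage")

# Unstructured: raw bytes (here, a tiny made-up strip of grayscale pixels).
# Nothing can be queried or searched until analysis imposes some structure,
# such as computing an average brightness.
pixels = bytes([0, 64, 128, 255, 32, 16])
mean_brightness = sum(pixels) / len(pixels)

print(acme_totals, storage_hits, mean_brightness)
```

The point of the sketch is only the contrast: the table answers column-based questions directly, the text answers content searches, and the bytes answer nothing until some analytic step derives a value from them.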

Unfortunately, this classification scheme is not perfect. First, there are numerous hybrid and composite forms, such as a photo embedded in a word processing document. Second, while "records" is a term that applies to databases, and much of the semi-structured information is stored in files, other information resides in streams, such as those captured by a video camera. And then there is the entirely separate concept of objects.

The bottom line, though, is that traditional IT infrastructures--including servers, storage and networks--were built around structured information and bent to adapt to semi-structured information. They were not designed for the multifaceted structure requirements, scale and analytical demands of big data.

That is why EMC underlined new architectures in its definition of big data, and that is also why it acquired Isilon and Greenplum. Much has been written about these acquisitions, so I will focus briefly on how the companies illustrate the need for different architectures for big data.

