Network Computing is part of the Informa Tech Division of Informa PLC


7 Storage Vendors Keeping Up With Big Data

  • One of the more confusing aspects of a big-data project is figuring out what type of storage to use. Wading through all the technical terms and marketing-speak is time-consuming, and you often end up confused and frustrated rather than enlightened. In this slideshow, we provide some shopping guidance for big-data storage. There are many storage vendors pushing their particular products and architectures. We list seven that are adept at supporting big data and explain how they can be good options in certain scenarios.

    Your choice of storage platform will mostly depend on the type of data being stored, how it is collected, and which analytics framework is used to process and mine it. Popular analytics frameworks and data stores include Spark, Hadoop, Flink, and NoSQL databases.

    There are several different types of storage solutions for big data, depending on your environment. From a network-attached storage perspective, clustered/scale-out NAS remains a popular choice. Object-storage systems are also gaining traction as storage optimized for hyperscale computing environments. Finally, there's a niche market in big-data analytics that can take advantage of preconfigured, easy-to-deploy hyperconverged platforms, which include compute, network, and storage in one neat little single-vendor bundle.

    That said, let's take a look at which storage vendors and products we're excited about when it comes to supporting big-data initiatives. Our selection includes the typical players that have been in the storage space for years. But we also include some up-and-coming vendors and technologies that are making an impact on the rapidly growing and changing world of big data.

    (Image: Sehenswerk/iStockphoto)

  • EMC 


    EMC this spring announced Virtustream Storage Cloud (VSC), a hyperscale storage cloud available as a service or on premises, to take on rivals such as Microsoft, Oracle, and IBM. VSC is an object-storage platform that provides a number of unique controls and automation tools as well as a laser-like focus on applications that are I/O intensive. EMC acquired Virtustream last year.

  • IBM


    If your big-data ambitions require the speed of an all-flash storage array, you may want to check out IBM’s FlashSystem portfolio. IBM claims storage access latency of 250 microseconds or less. The company looks to be targeting big-data deployments in the cloud that would suffer if storage latency were too high.

  • Infinite IO


    Performing data analytics on ultra-fast flash storage arrays is where all the action is. Yet most big data is going to sit dormant for long periods of time. Cold storage is an interesting topic right now, and new companies such as Infinite IO are tackling it. The company’s unique storage architecture helps bridge the gap between active (hot) data and inactive (cold) data held in low-cost cloud storage tiers.

  • Hitachi Data Systems 


    HDS appears to be focusing its big-data efforts on its Virtual Storage Platform (VSP) line, which combines NAS and cloud storage tiering to lower overall storage costs for customers.

  • Seagate 


    If you’re in the market for a high-performance computing environment with high I/O requirements, and you're looking for a storage platform that supports Intel’s Enterprise Edition for Lustre (IEEL) platform, Seagate's ClusterStor might be right for you. Seagate announced the IEEL support only a few months ago; it helps bring the widely popular ClusterStor portfolio to new customers who prefer Intel’s distribution of Lustre.

  • Cisco 


    When it comes to hyperconverged systems that combine compute, network, and storage, Nutanix used to be the biggest game in town. But since hyperconvergence has taken off of late, Cisco has entered the arena with its HyperFlex line of products. From a storage perspective, Cisco’s product combines solid-state and spinning-platter drives, architected to provide optimal performance and high availability. While HyperFlex may not be right for large-scale big-data projects, it allows for a decent amount of scaling and sufficient east-west throughput for big-data workloads.

  • Pure Storage 


    If you haven’t noticed by now, many big-data initiatives require the raw I/O speed of solid-state drives. One of the small, agile players in the big-data SSD storage market is Pure Storage. Its latest Evergreen storage product line can be upgraded not only from a storage capacity standpoint, but also from a performance standpoint. Users can easily upgrade controllers, and the company claims customers can deploy the technology “once and keep expanding and improving it for 10 years or more, all without any downtime, performance impact, or data migrations.”