GridIron Systems: Mining Big Data 'Gold' in a Flash

Trends in the IT industry sometimes resemble gold rushes as vendors pan for revenue "nuggets." The use of solid state devices (SSDs)--most notably, flash memory--is the central point of one of these, but just as with the real 19th century gold rushes in California and Alaska, not all prospectors (that is, vendors) will be successful. Where the claims are staked can make all the difference in the world, and GridIron Systems is staking one with a focus on accelerating big data analyses.

David Hill

January 12, 2012


The IT industry loves to give trends labels ("cloud," anyone?) and "big data" is the buzz label for one recent trend. Three distinguishing characteristics that are often noted with respect to big data are volume, variety and velocity. Volume is the quantity of data. Variety describes the fact that big data is not merely structured information (the familiar SQL type found in relational databases) but also includes semi-structured data (which is content-searchable) and unstructured data (which is bit-mapped data, such as video surveillance files). Velocity relates to the speed required to both capture big data and analytically process it.

Now, big data is nothing new. Large data warehouses have been around for quite a while, and specialized vertical market information, such as seismic data, has been captured and analyzed for years. But large volumes of new sensor-based information (such as more utility readings captured with "smart" meters) and new sources of semi-structured and unstructured information (such as that generated by and stored on the Web and the Human Genome Project) have led to big data being added to the IT lexicon.

But big data is highly complex for a number of reasons beyond volume, variety and velocity. Sometimes the data is transient: it is captured, analyzed and deleted quickly, as with very frequent RFID sensor readings, where the value of the information is extracted quickly and there is no ongoing need for storage or archiving. Sometimes it is persistent: the data is kept for a long period of time, as with historical sales information. Where data falls on this continuum from transient to persistent has serious implications for how value is derived from big data through processing (such as the speed of processing and when it needs to take place). It also affects how the data is best stored. Standard relational data warehouses were not designed to handle many of the new big data workloads or analytical processes; this is why the term "virtual data warehouse" is coming into vogue.

Another major problem in working with big data is the "I/O gap." While server performance has continued to improve, storage performance has remained essentially flat. For example, while the capacity of disk drives has increased dramatically, their rotational speed (revolutions per minute) has remained essentially constant. Practically, this means that when servers can process more I/Os than the storage can deliver, processing slows because the servers cannot process data that they don't have.
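To put rough numbers on why a constant rotational speed caps disk performance, here is a minimal back-of-the-envelope sketch in Python. The function names and the seek times are illustrative assumptions, not vendor figures; the only fixed relationship is that a disk waits, on average, half a revolution for the requested sector to come around.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for the target sector: half of one revolution, in ms."""
    ms_per_minute = 60_000
    return 0.5 * ms_per_minute / rpm


def approx_random_iops(rpm, avg_seek_ms):
    """Rough ceiling on random I/Os per second for one disk.

    avg_seek_ms is an assumed figure supplied by the caller; transfer
    time and controller overhead are ignored in this estimate.
    """
    service_time_ms = avg_seek_ms + avg_rotational_latency_ms(rpm)
    return 1000.0 / service_time_ms


# Hypothetical drives: (rpm, assumed average seek time in ms)
for rpm, seek_ms in [(7200, 8.5), (15000, 3.5)]:
    print(rpm, round(avg_rotational_latency_ms(rpm), 2),
          round(approx_random_iops(rpm, seek_ms)))
```

Under these assumptions even a fast drive delivers only a few hundred random I/Os per second, however much capacity it holds, which is the gap that flash-based devices aim to close.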

This I/O bottleneck is a performance problem that is not exclusive to big data--increasingly, robust enterprise applications run into the same problem. Additionally, because SSDs promise to solve this problem (since they have no mechanical parts that impact I/O) they are also a generally hot topic across enterprise storage. For its part, GridIron Systems focuses on the big data aspects of the I/O gap. Let's see what it has to offer.

GridIron Systems targets businesses' need to accelerate the processing of big data workloads, which require high bandwidth, high IOPS or both. A workable solution must also enable highly concurrent data access since there may be a large number of both users and applications active simultaneously. In addition, volatility is a big issue, as the queries that access the data may be rapidly changing, and there is often a high data ingest rate.

This concurrent access requirement and increasing volatility render traditional caching and tiering solutions, even those utilizing SSDs, ineffective when dealing with performance-constrained big data stores. Traditional caching is designed for read/write production systems and makes certain assumptions, such as a relatively fixed size data set, which benefits uniformly from lower latency.

GridIron recognizes the different requirements of big data workloads and has designed its algorithms to take advantage of the differences. Traditional tiering works best with large, stable data sets, where taking some time to move data from one group to another is reasonable. Many big data workloads require concurrent bandwidth (which GridIron delivers) to process data more quickly than traditional tiering can handle.

GridIron Systems' solution is the TurboCharger, an SSD-based (primarily flash, but with as little RAM as necessary) appliance that resides in the storage area network (SAN) between servers and storage arrays. GridIron's objective is to provide solid state performance to data in the SAN without requiring IT administrators to change a thing--no software, database, server, storage or process changes of any sort are required. This is important not only because the TurboCharger can be deployed without administrative burden, but also because it lets IT feel comfortable knowing that it could remove the TurboCharger if necessary (although performance would revert to the pre-TurboCharger state).

GridIron currently offers two models of the TurboCharger appliance--the GT1100 with an SSD capacity of 2.5 Tbytes and the GT1100A with an SSD capacity of 6.5 Tbytes. Each has a bandwidth of 1.6 Gbps and 100K IOPS.

Note that since the GridIron TurboCharger is entirely separate from servers and storage, current arrays can still be used; the TurboCharger is designed to complement existing systems and does not have to hold all the big data simultaneously. This means that the current storage array itself could have a Tier 0 SSD layer, a Tier 1 FC/SAS layer, and a Tier 2 SATA layer. This external SAN-based approach frees server and storage system processing from the complexity and mechanics of SSD operation and management.

So how can simply having additional SSDs as a front end in an appliance lead to GridIron's claims of speeding up applications two to 10 times and reducing read latency by a factor of 10 to 100? There are some hardware advantages in GridIron's approach, as the appliance can add concurrent bandwidth, enabling multiple applications to concurrently access the same array without interference. But GridIron's secret sauce lies in proprietary software algorithms and heuristics that enable better caching of performance-enhancing data.

Obviously, GridIron does not make a lot of details available, but basically the TurboCharger appliance allows big data to be read cached (as big data analyses are on data that has been captured, or already written to disk). The TurboCharger examines I/O patterns quickly and sees what, when and how big data is actually used. This can be done in real time, which is important because tiering in an array often examines patterns that occur over a day or more and this is simply too slow to be effective for many big data applications.
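Since GridIron's algorithms are proprietary, the following is only a generic sketch of the read-caching idea, not the TurboCharger's method. It swaps in a plain least-recently-used (LRU) policy in Python; the class name `ReadCache`, the dictionary standing in for the storage array, and the choice to invalidate a cached block on write are all assumptions made for illustration.

```python
from collections import OrderedDict


class ReadCache:
    """Hypothetical read-only cache sitting between servers and an array."""

    def __init__(self, backing, capacity):
        self.backing = backing        # stands in for the storage array
        self.capacity = capacity      # max number of cached blocks
        self.cache = OrderedDict()    # block id -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.cache:
            self.cache.move_to_end(block)   # mark as most recently used
            self.hits += 1
            return self.cache[block]
        self.misses += 1
        data = self.backing[block]          # slow path: go to the array
        self.cache[block] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return data

    def write(self, block, data):
        # Writes are not cached: they pass straight through to the array,
        # and any stale cached copy is dropped (an assumed policy).
        self.backing[block] = data
        self.cache.pop(block, None)


array = {i: f"data{i}" for i in range(10)}
cache = ReadCache(array, capacity=2)
for block in (0, 0, 1, 2):
    cache.read(block)
print(cache.hits, cache.misses)
```

A real appliance would presumably decide what to keep from observed I/O patterns rather than simple recency, but the shape is the same: repeated reads are served from fast memory, while writes still land on the array.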

In addition, GridIron provides the concurrent bandwidth among applications to prevent "thrashing," which occurs because of resource contention (that is, a storage bottleneck), meaning that less work gets done as servers spend more time waiting for those resources. The effect of this concurrent bandwidth capability is conceptually equivalent to maintaining a real-time DBA whose sole function is to continuously re-lay out data sets to match server processing demand, at no performance cost, and who is continually responsive to changes in usage or data growth.

Among the benefits of a GridIron deployment are that both CPU and storage utilization are maximized. Moreover, the same storage array can be used for both production and data warehousing applications. Even though GridIron does not cache writes, it can improve write performance indirectly by offloading the read I/O work that the array would have had to perform otherwise, thus allowing the array to process the writes more efficiently.
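The indirect write benefit comes down to simple arithmetic, sketched below in Python. The function name and the workload figures are hypothetical, chosen only to show the effect: every read served by the cache is one fewer I/O the array must handle, leaving more of its capacity for writes.

```python
def array_load_iops(read_iops, write_iops, read_hit_ratio):
    """I/Os per second that still reach the array behind a read cache.

    All writes pass through; only read misses reach the array.
    """
    read_misses = read_iops * (1.0 - read_hit_ratio)
    return read_misses + write_iops


# Hypothetical workload: 8,000 reads/s and 2,000 writes/s.
without_cache = array_load_iops(8000, 2000, read_hit_ratio=0.0)
with_cache = array_load_iops(8000, 2000, read_hit_ratio=0.75)
print(without_cache, with_cache)
```

In this made-up case the array sees less than half the I/O it did before, even though the write rate itself is unchanged.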

GridIron cites a number of client examples. In one, an Oracle data warehouse for an online comparison shopping site was able to cut the time to produce critical reports from six hours to about 30 minutes. In the process, GridIron says, about $2 million was also saved from storage and server consolidation. In another case, a financial services hosting provider achieved a threefold improvement in response time and doubled its overall hosting capacity, leading to savings of more than $1 million.

I/O bottlenecks are a very real problem that prevents servers from exploiting all of their capabilities while trying to process information from storage. That is why so many vendors are rushing in with SSD-based solutions to try to solve this problem. GridIron Systems does not try to solve every I/O bottleneck problem, but instead focuses on addressing the problem as it relates to the analysis of big data.

GridIron seeks to accomplish this through its TurboCharger line of appliances. On the business side, the advantage of faster analyses is the ability to act on the results sooner (and, in some cases, that may be the difference between the data being of material value or not, since late-arriving data may be worthless). On the IT side, better CPU and storage utilization leads to better returns on IT investments. That can include savings, which makes GridIron's approach to accelerating big data analysis worthwhile.

At the time of this publication, GridIron is not a client of David Hill and the Mesabi Group.