Special Coverage Series

Network Computing

Commentary

Howard Marks Network Computing Blogger

A Tale of Two Object Stores

New storage systems have joined the ranks of object stores, but they're really general-purpose storage devices that borrow some object storage concepts.

As the volume of unstructured data has grown over the past few years, organizations have discovered that their data is pushing up against, or past, the limits of classic block- and file-based storage systems. Object storage systems, such as Amplidata’s AmpliStor, DataDirect Networks' WOS, and Amazon’s S3, can store huge numbers of objects across exabytes of disk.

More recently, the designers of new storage systems have begun using object storage concepts on the back end of their systems while presenting more traditional block or file access on the front end.

Traditional object stores -- although it seems a bit strange to call object stores traditional -- present their data through RESTful, HTTP-based Get/Put APIs. Relieved of the overhead of maintaining a hierarchical directory structure, and of supporting in-place data updates with all the locking that implies, object stores can more easily scale out to enormous dimensions.

No current object store is as pure as Seagate’s new Kinetic drives, which use a native key-value store and actually lay data out on the disk drive sequentially by key. Some of the best-known object stores, such as OpenStack Swift and EMC's ViPR, run on top of much more conventional file systems, scaling beyond the limitations of any one file system by spreading the objects across many of them. Others translate objects to block IDs before writing them to local SATA drives.

The new class of object storage systems isn't out to create hugely scalable systems with RESTful interfaces but to take advantage of the power and scalability of object storage to build block- and/or file-based systems. Rather than map an incoming file or object to a single object in their back end, storage systems from vendors like Exablox, SolidFire and Coho Data break incoming logical volumes and/or files into smaller objects and then store those.

[Read how Coho Data uses an integrated OpenFlow controller to reduce the latency associated with traditional scale-out storage designs in "Coho Applies SDN To Scale-Out Storage."]

Several of these new-age storage systems break the data into fixed-size objects of 4KB-64KB, calculate a hash for each one and then use the hash value as the URI for the data chunk, turning their back end into CAS (content addressable storage). Each node, and ultimately each disk drive in the system, is responsible for storing the objects that fall within a range of hash values.
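
To make the chunk-and-hash idea concrete, here is a minimal Python sketch. It illustrates the general technique under stated assumptions and is not any vendor's implementation: the 4KB chunk size, the choice of SHA-256, the four-node cluster and the names `chunk_and_hash` and `owner_for` are all hypothetical.

```python
import hashlib

CHUNK_SIZE = 4 * 1024  # assumed; the systems described use 4KB-64KB
NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical cluster


def chunk_and_hash(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks and yield (hash, chunk) pairs.

    The hash doubles as the chunk's address, which is what makes the
    back end content addressable. The final chunk may be short.
    """
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        yield hashlib.sha256(chunk).hexdigest(), chunk


def owner_for(digest: str, nodes=NODES) -> str:
    """Return the node responsible for the hash range containing digest.

    Assumption: the hash space is split evenly by the digest's first byte.
    """
    first_byte = int(digest[:2], 16)  # 0..255
    return nodes[first_byte * len(nodes) // 256]


if __name__ == "__main__":
    volume = b"A" * 8192 + b"B" * 4096  # two identical chunks, one different
    for digest, chunk in chunk_and_hash(volume):
        print(digest[:12], len(chunk), owner_for(digest))
```

Because the address is derived from the content, the two identical 4KB chunks in the sample data get the same hash and land on the same node.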

Data protection is typically provided by assigning two or three disk drives, in separate nodes, to hold each range of hash values and replicating the object across them. As the more observant reader will have already figured out, systems using this type of small object CAS get data deduplication as a side benefit of the architecture.
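
The replication and deduplication behavior described above can be sketched in a few lines of Python. This is a toy model under assumptions, not a description of any shipping product: the two-way replication, the two hash ranges, the node and disk names, and the `ChunkStore` class are all hypothetical.

```python
import hashlib

# Hypothetical layout: each hash range is held by drives in separate nodes.
RANGE_OWNERS = [
    [("node-a", "disk0"), ("node-b", "disk0")],  # first hash byte 0x00-0x7f
    [("node-c", "disk1"), ("node-d", "disk1")],  # first hash byte 0x80-0xff
]


def owners_for(digest: str):
    """Return the replica drives responsible for this hash value."""
    return RANGE_OWNERS[int(digest[:2], 16) * len(RANGE_OWNERS) // 256]


class ChunkStore:
    def __init__(self):
        # One dict per drive, keyed by content hash.
        self.drives = {d: {} for owners in RANGE_OWNERS for d in owners}
        self.writes_skipped = 0

    def put(self, chunk: bytes) -> str:
        """Store a chunk on every replica drive for its hash range."""
        digest = hashlib.sha256(chunk).hexdigest()
        for drive in owners_for(digest):
            if digest in self.drives[drive]:
                # Same content, same hash, same drive: nothing to write.
                self.writes_skipped += 1
            else:
                self.drives[drive][digest] = chunk
        return digest

    def get(self, digest: str, failed=()) -> bytes:
        """Read from the first surviving replica."""
        for drive in owners_for(digest):
            if drive not in failed and digest in self.drives[drive]:
                return self.drives[drive][digest]
        raise KeyError(digest)
```

Writing the same chunk twice leaves exactly one copy per replica drive, which is the deduplication side benefit; a read still succeeds if one of the replica drives is marked as failed.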

The object back end, like a more traditional object store, doesn’t modify objects in place. When a volume or file is updated, new objects are created to store the new data and the volume's or file’s metadata is modified to point to the new objects. Logical volumes and files are defined by their metadata, which makes snapshots and file versioning essentially free in terms of both performance and capacity consumed.
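
A small Python sketch shows why snapshots come almost for free when a volume is nothing more than metadata. It is an illustration under assumptions: the `Volume` class, its method names, the 4KB block size and the in-memory dict standing in for the chunk store are all hypothetical, and the sketch omits garbage collection of unreferenced objects.

```python
import hashlib


class Volume:
    """A logical volume defined purely by an ordered list of chunk hashes."""

    def __init__(self, store: dict, block_count: int, block_size: int = 4096):
        self.store = store            # shared content-addressed chunk store
        self.block_size = block_size
        zero = self._put(bytes(block_size))
        self.block_map = [zero] * block_count  # the volume's metadata

    def _put(self, chunk: bytes) -> str:
        digest = hashlib.sha256(chunk).hexdigest()
        self.store.setdefault(digest, chunk)   # existing objects are never modified
        return digest

    def write_block(self, index: int, data: bytes) -> None:
        """Create a new object and repoint the metadata; no in-place update."""
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        self.block_map[index] = self._put(data)

    def snapshot(self) -> tuple:
        """Copy the metadata only; no data objects are duplicated."""
        return tuple(self.block_map)

    def read_block(self, index: int, snapshot=None) -> bytes:
        """Read from the live volume, or from a snapshot if one is given."""
        block_map = self.block_map if snapshot is None else snapshot
        return self.store[block_map[index]]
```

Taking a snapshot copies a list of hashes rather than any data, and a later write to the live volume leaves the snapshot's objects untouched because the write creates a new object instead of overwriting the old one.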

The real question is whether the vendors of these systems should call them object stores. From a technical point of view, they do use object technology, but when I hear "object store," I think of flat namespaces, RESTful interfaces and essentially unlimited scaling, all with limited random access.

These new systems, all of which use a significant amount of flash, are more scalable general-purpose storage systems that just happen to use object storage as their underlying technology. Do I care? Sure, but I’m not convinced we should lump them in with the RESTful crowd.
