11/25/2013 12:46 PM

Object Storage: The Next Storage Paradigm

Object storage is evolving from a data archive to the primary form of storage in large systems.

I remember my first object store in 2007: a COTS x86 server with 6 TB of storage, powered by Caringo software. A cluster of four units made a decent starter kit, and it promised metadata searching and replication. The setup worked fine.

Work I'd done previously on Replicus-derived solutions had set my expectations: the store wasn't very fast and seemed best suited to archiving data. A year later, a bunch of my systems were storing Human Genome Project data at Johns Hopkins, and it was clear that this was no longer just an interesting technology.

Since then, we've come a long way. Many more vendors are in the game, and open-source programs like Ceph and OpenStack's Swift promise reduced prices. Amazon uses the S3 object store to underpin a good portion of its AWS cloud service. Other cloud providers now offer similar solutions.

None of this is occurring in a vacuum. There are many other changes in the storage market. For instance, solid-state disks (SSDs) are finally beginning to overtake mechanical disk drives, especially in the performance market, but the impact on bulk-oriented object storage has been slower than in other areas of storage.

Experience with large installations has made it clear that object has much longer legs in scaling out than blockIO or NAS systems. The traditional SAN becomes unwieldy at scale, and NAS is hard to manage.

This has resulted in renewed interest in object as the primary storage form in large systems. Performance has been addressed: storage tiering lets SSDs provide a much faster path to active data, just as in hard disk drive (HDD) arrays. With lower drive prices, deduplication of objects, and the advent of erasure-code systems -- which use about half the space of replication-based stores -- SSD is now economical as a high-speed tier.
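The "half the space" claim is easy to see with a little arithmetic. Here is a minimal sketch comparing raw capacity per user byte; the 10+6 erasure-code geometry is an illustrative assumption, not any particular vendor's layout:

```python
# Compare raw capacity needed per user byte under 3-way replication
# versus a hypothetical 10+6 erasure code (10 data shards, 6 parity).

def replication_overhead(copies: int) -> float:
    """Raw bytes stored per user byte with N full copies."""
    return float(copies)

def erasure_overhead(data_shards: int, parity_shards: int) -> float:
    """Raw bytes stored per user byte with a k+m erasure code."""
    return (data_shards + parity_shards) / data_shards

if __name__ == "__main__":
    print(f"3x replication:    {replication_overhead(3):.1f}x raw capacity")
    print(f"10+6 erasure code: {erasure_overhead(10, 6):.1f}x raw capacity")
```

With these numbers, the erasure-coded pool needs 1.6x raw capacity against replication's 3.0x, roughly the factor-of-two savings cited above, while still surviving the loss of any six shards.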

Cheap HDD bulk storage makes it possible to have more archive-class capacity per node for the more than 80% of data that is inactive. This also improves economics, making the latest stores attractive on both a cost-per-IOPS and cost-per-TB basis.

Ceph has already begun the next step. The RADOS object store that underpins the system also has BlockIO (iSCSI) and NAS gateways and can support S3 and Swift APIs. This unified approach erodes the need for separate storage silos for each protocol. A management and tools ecosystem is growing around Ceph, which is now mainstreamed in the Linux OS release.

EMC's ViPR system addresses a different point in unification. It brings the legacy gear typical of a datacenter into a single unified pool of storage. This is more a management approach than a data flow fix, since the legacy machines maintain their protocols, but it seems reasonable that, at some point, there will be gateways to bring them together, so that the pool can present as any of the standard options.

Ceph will likely evolve in ViPR's direction to provide the same capability. Theoretically, a Ceph server could expand to connect iSCSI or Fibre-Channel storage devices, though today it can't manage them on a single screen.

Another aspect of object is the way that host systems connect to storage. A typical OS storage stack shows layers of inefficiency. As a result, work is happening on new interfaces and modes of access. Seagate just announced an object-mode interface for its new shingled bulk drives. This is not an object store in its own right, but it is a drastic simplification of the stack. The NVMe interface, aimed primarily at SSD, is an attempt to slim the stack further. These two interface changes should help object store performance.

In the area of unified object stores, the storage industry is working on a key/data access mechanism for database work. This is already in its early stages and should tie in to content and metadata searching as time goes on. If that happens, I'd expect to see graphics processing units added to object storage nodes.

To speed up operation and reduce network traffic, some storage features need to migrate to the server. Compression and deduplication hashing must be done before shipping data over the network, for instance. Encryption of data at rest seems a logical addition on the store side, but encryption in transit needs to be resolved, as with all storage.
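A minimal sketch of that server-side preprocessing: chunk the payload, hash each chunk for deduplication, and compress only the chunks the store has not already seen. The fixed chunk size and the use of SHA-256 are illustrative assumptions; real products vary in chunking policy and wire format.

```python
# Client/server-side preprocessing before data crosses the network:
# dedup hashing first, then compression of only the novel chunks.
import hashlib
import zlib

CHUNK = 64 * 1024  # illustrative fixed chunk size


def prepare_upload(data: bytes, known_hashes: set):
    """Return (new_chunks, manifest).

    new_chunks maps chunk hash -> compressed bytes for chunks the store
    lacks; manifest is the ordered list of chunk hashes that describes
    the whole object, so the store can reassemble it.
    """
    manifest, new_chunks = [], {}
    for off in range(0, len(data), CHUNK):
        chunk = data[off:off + CHUNK]
        digest = hashlib.sha256(chunk).hexdigest()
        manifest.append(digest)
        if digest not in known_hashes and digest not in new_chunks:
            new_chunks[digest] = zlib.compress(chunk)  # ship only novel data
    return new_chunks, manifest
```

Note the ordering: hashing must precede compression, since two identical chunks compress to identical bytes anyway, and the hash of the raw chunk is what the store indexes on.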

Looking further out, I predict that the Ceph/ViPR unified object model will be the norm for new storage products by 2020, and Ethernet-based access will be used in new installations. SAN and NAS will be absorbed into the unified storage pool, and storage management will be simpler.


Object store different from object database?

Is this different from the sort of object database you'd use to store objects from an OO programming language?

Re: Object store different from object database?

There are similarities. Object storage uses a database to identify where an object's segments are stored, and it carries an extended amount of metadata on the object. The main difference is that an object database may keep its data in a wide variety of formats, while object storage adheres to a structure of replicated objects broken down into chunks that are distributed over the storage pool by an algorithm (usually CRUSH).

An OOD accesses data as objects, while an object store saves them as objects. In a way, the two complement each other.
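To illustrate the algorithmic placement mentioned above: CRUSH itself handles failure-domain hierarchies and device weights, but its core idea, that any client can compute a chunk's locations from a deterministic hash with no central lookup table, can be sketched with plain rendezvous hashing. This is a simplification, not CRUSH itself:

```python
# Rendezvous-hash placement sketch: rank every node by a hash of
# (chunk id, node name) and take the top N as replica locations.
# Any client running the same code computes the same answer.
import hashlib


def place_chunk(chunk_id: str, nodes: list, replicas: int = 3) -> list:
    """Pick `replicas` nodes for a chunk, deterministically."""
    def score(node: str) -> int:
        h = hashlib.sha256(f"{chunk_id}:{node}".encode()).digest()
        return int.from_bytes(h, "big")
    return sorted(nodes, key=score, reverse=True)[:replicas]
```

Because placement is a pure function of the chunk ID and the node list, there is no metadata server to consult on every read, which is a large part of why these stores scale out so well.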

Re: Object store different from object database?

I do appreciate the explanation. Compared to an object DB, object storage is more linear and structured. That is reasonable -- the goal of object storage is to store massive amounts of data efficiently and compactly, and we must be able to access that data in a straightforward, efficient manner.

Re: Object store different from object database?

Some object storage is built on or uses NoSQL object databases. Basho's Riak CS is an object storage application that runs on top of their Riak object database. Cloudian uses a heavily forked Cassandra implementation to store object metadata and some small objects, in conjunction with their proprietary HyperStore technology for larger objects.

What about everyone else?

Well, Caringo's CAStor goes back to 2007, but so does Sage Weil's Ceph, which was the subject of his PhD thesis in Computer Science at UC Santa Cruz in 2007. Inktank is the commercial sponsor of Ceph.

Aside from the large public object storage providers, like AWS with S3, there are a number of newer object storage software providers, such as Scality (RING), Cloudian (Cloudian), Basho (Riak CS), Cleversafe, Amplidata, and Data Direct Networks (WebOS). Some object storage vendors do use NoSQL databases: Basho's Riak is modeled after AWS Dynamo, and Cloudian uses Cassandra in conjunction with their proprietary HyperStore technology.

Standards are emerging in object storage, but almost every object storage vendor implements some level of API compatibility with S3, which is the de facto standard. SNIA (the Storage Networking Industry Association) is promulgating its CDMI (Cloud Data Management Interface) standard for object storage, which will also be backward compatible with S3. Seagate's Kinetic Open Storage Platform does portend a simplification of the object storage "stack" by eliminating the POSIX file system and storage servers with their SAS expanders and RAID controllers. Object storage is the next storage platform for unstructured data.

Re: What about everyone else?

Hi Tim, the pace of offerings is picking up fast. I think losing the huge multi-layered file stack is a major step we have to take, but Kinetic only addresses the local storage environment, leaving the block distribution and replication issues to some other node. With that limitation, I still see a need for Ceph or some other package to handle the bigger picture.

Re: What about everyone else?

Yes, Seagate Kinetic is all about object storage. The Kinetic Open Storage Platform simplifies the object storage stack by "disaggregating" the application from the storage devices it uses. The application layer is the responsibility of the developers, so some of the things you mentioned as being necessary will be theirs to implement in software. The interconnection layer between the application and Kinetic HDDs uses the LibKinetic libraries for C++, Java, Python, Erlang, and Google Protocol Buffers.

Seagate Kinetic HDDs, which currently are 4 TB Seagate Terascale HDDs spinning at 5,900 RPM, have circuitry added for two SGMII GbE ports, which use the existing SAS connector to plug into a switched Ethernet Layer 2 backplane. While putting Ethernet ports on HDDs is not new, combining Ethernet with a key-value API on a disk drive is new. The tray builders for Seagate Kinetic HDDs include Dell, Newisys, Supermicro, and Xyratex. Their JBOK (Just a Bunch of Kinetics) trays will likely have two or four 10 GbE connections to a ToR (top-of-rack) switch. No storage servers are needed, just application servers, which could be in the same datacenter or co-location site, or not. The Kinetic HDDs also do some additional work having to do with replicating and migrating data. After all, there are over 1M lines of code (soon to be 2M) running on an HDD!

The point of Seagate Kinetic is not to create a "Seagate only" technology. Rather, Kinetic is a way to change how object storage gets done by disaggregating applications from storage devices using plain old TCP/IP over Ethernet and a key-value API.

Re: What about everyone else?

I suspect we'll see Kinetic on SAS too. After all, old SIGs never die; they just invent new standards!

Cloud computing

Cloud storage is the next big thing after mobile phone apps. In fact, mobile usage will be enhanced by it.

Re: Cloud computing

I think the tie between the cloud and mobile is very tight. Without the cloud, the compute bandwidth and the rest won't be available for the expansion of mobile to the Internet of Things that is coming.