Scality: Climbing The Cloud Storage Heights

David Hill

June 24, 2011

Network Computing

The traditional storage business models of NAS and SAN have been worthwhile for many enterprises, but the question remains: Is NAS or SAN (or some combination of both) best for cloud computing environments? Scality, a storage vendor with offices in San Francisco, Paris and Tokyo, asserts that, in many cases, the answer is no. Scality's RING "organic storage software" is, according to the company, the first object-based storage platform to provide the levels of performance necessary for robust Web-based applications.

Scality’s target customers are public email providers and public cloud providers, although large enterprises could use RING for a private cloud. The data best served by RING is not structured data--that is, information that resides in and is managed by relational databases, such as that found in mission-critical online transaction processing systems. Instead, Scality focuses its efforts on unstructured data, which the company defines as anything from emails and word processing documents to photos and videos.

For purposes of this discussion, I will accept Scality's definition of unstructured data, although I differentiate between semi-structured files (such as word processing documents) and unstructured files (such as videos).

Now, the volume of unstructured information greatly outweighs that of structured data in enterprises and clearly dominates consumer markets, so there is and will continue to be plenty of data to manage.

Note that for the purposes of providing online storage services for unstructured data, the IT architecture, including the storage architecture, of public cloud service providers is a black box to the user. That means that the end user generates well-defined inputs (for example, a request for a specific piece of information) and receives well-defined outputs (getting back the requested information) with the required service levels, such as low latency from the time that the request is made until it is satisfied, and high reliability that the specified delivery will be met. But all of this is performed out of sight, if not out of mind, of the end user.

You could think of it in terms of what my good friend Peter Keen, the well-known IT consulting guru, told me many years ago: Technology should be "mundane and opaque." To Peter, mundane means that the technology should be easy to use, while opaque means that you have no idea how the technology works. The bottom line is the user neither knows, nor should he or she care, what IT architecture the service provider uses.

This means that the service provider is not locked into choosing traditional network-attached storage/storage area network (NAS/SAN) technology unless it wants to, but can choose virtually any architecture so long as it supports the service’s performance and service-level agreement (SLA) requirements. Scality argues that the service provider should use multiple server/storage nodes, where a portion of the overall storage is inside each server as dedicated direct-attached storage (DAS) arrays.

Note that this back-to-the-future DAS model is antithetical to the concept of shared storage as found in a conventional SAN.

So how does this architecture work? How does a user, who may be anywhere in the "cloud," access his or her data? How does the system deliver the required performance?

Scality views each instance of an x86 server and its DAS as a "cell," although "node" is a more generally used term for this approach. Scality uses the term "organic storage" to convey a visual image of how the cells can be managed in concert.

The cells are connected by a meshed network topology, typically with Gigabit Ethernet. The company’s software provides the intelligence to manage RING-attached assets, such as finding the information that meets a specific user request. Each cell has a key address, and the cells are logically organized on a circle--hence, Scality’s use of the term RING--and the RING system supports a peer-to-peer addressing scheme. This means that there is no central point that manages all the addresses, as that would be a bottleneck. It also means that there is no inherent limit in the number of servers (as, once again, there are no choke points) that can be contained in a single RING.
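To make the idea of cells with key addresses arranged on a circle more concrete, here is a minimal sketch of the general technique usually called consistent hashing. It is not Scality's code: the article does not say how RING derives key addresses, so the SHA-1 hash, the 32-bit key space and the names `ring_key`, `Ring` and `owner` are all assumptions made for illustration. For simplicity the sketch also keeps a sorted list of every cell in one place, whereas in a peer-to-peer scheme each cell would know only part of the ring.

```python
import bisect
import hashlib

RING_BITS = 32  # size of the key space (assumed for illustration)


def ring_key(name: str) -> int:
    """Hash a name onto the circle [0, 2**RING_BITS)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[: RING_BITS // 8], "big")


class Ring:
    """Hypothetical global view of the circle, used only for this sketch."""

    def __init__(self, cell_names):
        # Each cell gets a key address on the circle; keep them sorted.
        self._cells = sorted((ring_key(n), n) for n in cell_names)
        self._keys = [k for k, _ in self._cells]

    def owner(self, object_name: str) -> str:
        """Return the first cell at or clockwise after the object's key."""
        k = ring_key(object_name)
        # The modulo wraps past the highest key back to the first cell.
        i = bisect.bisect_left(self._keys, k) % len(self._keys)
        return self._cells[i][1]


ring = Ring([f"cell-{i}" for i in range(8)])
print(ring.owner("photo-1234.jpg"))
```

Because an object's position depends only on its hashed name and the cells' key addresses, any cell that knows the neighborhood of that position can work out where the object belongs without consulting a central directory.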

Now, Scality takes an object-based approach to storage software. That means that what we typically call files are managed as objects. Why the quibbling with words? The use of the term "file" implies that self-contained information, such as a word processing document, is managed through the use of a file system that contains an index table that points to the physical location where the data is stored. That sounds simple but, in practice, there are limits to the number of files that can be managed.

The essential value of finding data using object-based storage is that it is simpler in design and has the potential to scale to billions of individual "objects," including files. This scalability is the reason that a lot of attention is currently being focused on object-based storage. The key limitation with object-based storage has traditionally been its performance; it has been viewed as inherently slow in retrieving data.

However, vendors have increasingly found ways around those limitations. For example, in the Scality RING there can be many cells (up to many thousands). While there is no way to directly identify in which cell a particular piece of data resides, Scality’s technology allows the query to rapidly "hop" from cell to cell. Additionally, there is an effective way of limiting the number of cells that have to be accessed; if a cell does not have the data, it can point to another cell where the data is more likely to reside.
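The hopping behavior described above can be illustrated with a routing scheme from the distributed hash table literature, in which each cell keeps a short table of other cells at exponentially increasing distances around the circle. The article does not say which scheme RING uses, so treat this as a sketch of one well-known approach rather than Scality's method; the 20-bit key space and the names `Overlay`, `finger` and `lookup` are assumptions made for illustration.

```python
import bisect
import random

M = 20            # bits in the key space (assumed)
SPACE = 1 << M


def between(x, a, b):
    """True if x lies in the ring interval (a, b], walking clockwise."""
    if a < b:
        return a < x <= b
    return x > a or x <= b   # interval wraps past zero (a == b: whole ring)


class Overlay:
    def __init__(self, node_ids):
        self.ids = sorted(node_ids)
        # finger[n][i] is the first cell at or after n + 2**i on the circle.
        self.finger = {
            n: [self.successor((n + (1 << i)) % SPACE) for i in range(M)]
            for n in self.ids
        }

    def successor(self, key):
        i = bisect.bisect_left(self.ids, key) % len(self.ids)
        return self.ids[i]

    def lookup(self, start, key):
        """Route from `start` toward the cell owning `key`; return (owner, hops)."""
        node, hops = start, 0
        while True:
            succ = self.finger[node][0]
            if between(key, node, succ):
                return succ, hops + 1      # one final hop to the owner
            # Otherwise jump to the farthest known cell that still precedes key.
            nxt = node
            for f in reversed(self.finger[node]):
                if f != key and between(f, node, key):
                    nxt = f
                    break
            node, hops = nxt, hops + 1


random.seed(0)
ids = random.sample(range(SPACE), 1000)
net = Overlay(ids)
hops = [net.lookup(random.choice(ids), random.randrange(SPACE))[1]
        for _ in range(2000)]
print("max hops:", max(hops), "average:", sum(hops) / len(hops))
```

Each cell stores only `M` pointers rather than the full membership list, yet every hop moves the query to a cell that is known to be closer to the data, which is the sense in which a cell can "point to another cell where the data is more likely to reside."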

For a RING of 10,000 nodes, a lookup takes at most seven hops. Doesn't this lead to slow performance? You would think so, but Scality has devised a method for queries to locate and fetch data from anywhere on the ring in no more than 14 milliseconds. The Scality RING infrastructure also operates in a massively parallel mode, as compared with the serialized mode that occurs on a SAN, so the company believes that it has solved the performance issue.

In addition to performance and capacity scaling (billions of objects and a large number of petabytes can be supported in Scality RINGs), the company’s highly parallel software intelligence has no single point of failure--some software resides permanently on each cell's server. Multiple copies of the data are made for protection and recovery purposes. Altogether, that leads to Scality RING systems being highly resilient and self-healing.
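Scality does not explain in this discussion where the seven-hop figure for 10,000 nodes comes from. One plausible reading, offered here purely as an assumption, is that each hop narrows the set of candidate cells by a fixed factor, so the hop count grows with the logarithm of the ring size. The short sketch below works through that arithmetic; the function name `max_hops` and the factors tried are illustrative choices, not anything the company has published.

```python
def max_hops(n_cells: int, fanout: int) -> int:
    """Hops needed if every hop shrinks the candidate set by `fanout`."""
    hops, reach = 0, 1
    while reach < n_cells:
        reach *= fanout   # one more hop covers `fanout` times as many cells
        hops += 1
    return hops


print(max_hops(10_000, 2))   # 14, since 2**13 < 10,000 <= 2**14
print(max_hops(10_000, 4))   # 7, since 4**6 < 10,000 <= 4**7
```

Under this reading, a narrowing factor of four per hop would be consistent with the quoted seven hops. If the hops happen one after another, the 14-millisecond ceiling would leave roughly 2 milliseconds per hop, although the article does not break the figure down that way.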

Sounds excellent overall, but two key questions remain: How much does Scality’s RING solution cost for primary storage, and how much for long-term storage? Scality claims its five-year total-cost-of-ownership calculations show its solution costs a little more than a third of what a comparable SAN or NAS solution would cost. Scality also says that RING solutions cost only half as much as similarly featured long-term storage solutions offered by existing cloud suppliers such as Amazon S3, which also uses object-based storage.

When a new computing world model, such as cloud computing, arises, the old way of doing business, including architecture, may be subject to radical rethinking and change. That is what Scality is betting on with its Scality RING Organic Storage. Scality is not alone in this space--other vendors are developing their own innovative object-based storage solutions for cloud environments. Scality feels that it can run rings around its competitors (sorry, I couldn’t resist), and who knows? It may be right.

Scality has a good story to tell. Now it comes down to how well it executes and delivers on its promises and whether or not an upstart company that is not yet on the radar screens of many prospective customers has an even better solution. However, Scality has raised the stakes, so any challenger would have a lot of work to do. From a management, architecture and price/performance perspective, service providers may very well want to give Scality a close look.

At the time of publication, Scality is not a client of David Hill and the Mesabi Group.
