Will Ethernet Take Over The Spinning Disk?

With its open Ethernet drive architecture, Western Digital's HGST subsidiary joins Seagate's Kinetic drives in using an Ethernet port to access an object store on the disk.

It must be a tough time to be running a disk drive company. Flash is taking the high-performance, high-margin top end of your market, and since Seagate and Western Digital each hold more than 40% market share, you can't really acquire more share. To ensure the future of a hard disk drive company, you have to diversify into SSDs and add value beyond mere capacity to your spinning disks -- such as an Ethernet interface.

While the ATA-over-Ethernet (AoE) protocol at the heart of Coraid's storage systems was designed to let a server send block storage commands directly to individual disk drives, the concept never caught on in the mainstream. Even in the Fibre Channel world, where each drive was individually addressable, software RAID across JBODs never made sense -- it placed too much of a load on the host's processor.

The new generation of Ethernet drives isn't just using Ethernet as a new connection interface; it's also moving the communications protocol from simple commands to read-and-write data blocks to a higher level of abstraction. At this week’s OpenStack Summit, Western Digital’s HGST subsidiary demonstrated its open Ethernet drive architecture. Like Seagate’s Kinetic drives, it uses a 1 Gbit/s Ethernet port to access an object store on the spinning disk.

By implementing the basic Object Storage Device (OSD) on the disk drives, this new generation of Ethernet drives offloads media management and even data placement for object storage systems from the storage server to the disk drive. This lets architects of object storage systems like OpenStack Swift manage a hundred or more Ethernet-connected drives with a single server that would ordinarily manage 12 to 24 SAS or SATA drives.
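
To make that architectural shift concrete, here's a minimal sketch in Python of how a placement map on one modest server could route objects directly to hundreds of drive endpoints. The drive addresses and class names are invented for illustration; this is the general consistent-hashing idea behind object stores like Swift, not HGST's or Seagate's actual software.

```python
# A minimal consistent-hash ring mapping object names to Ethernet drive
# endpoints. Addresses are invented; real deployments would use Swift's
# own ring or the vendor's placement layer.
import hashlib
from bisect import bisect

class DriveRing:
    def __init__(self, drive_endpoints, replicas=3):
        self.replicas = replicas
        self.ring = sorted(
            (int(hashlib.md5(ep.encode()).hexdigest(), 16), ep)
            for ep in drive_endpoints
        )

    def drives_for(self, object_name):
        """Return the drive endpoints that should hold copies of object_name."""
        h = int(hashlib.md5(object_name.encode()).hexdigest(), 16)
        keys = [k for k, _ in self.ring]
        start = bisect(keys, h) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1]
                for i in range(min(self.replicas, len(self.ring)))]

# One modest server holds the placement map for 100 drives, each
# reachable at its own IP address over the Ethernet fabric.
ring = DriveRing([f"10.0.1.{i}:8080" for i in range(1, 101)])
print(ring.drives_for("videos/cat.mp4"))
```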

There are some significant differences between HGST's open Ethernet drives and Seagate's Kinetic. Seagate decided to implement a simple key-value store and support vendors like SwiftStack and Scality in interfacing their object stores to it. HGST gives developers an even more open playing field by letting them run an instance of Linux on each disk drive.
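
For a sense of the abstraction level a Kinetic-style drive works at, here's a hypothetical sketch of a key-value drive interface. The class and method names are mine, not Seagate's; the real Kinetic API is defined by Seagate's protocol and client libraries. The point is that the drive stores opaque key/value pairs rather than numbered blocks.

```python
# Hypothetical stand-in for one Ethernet drive exposing a key-value
# store. On a real Kinetic drive, puts and gets travel over the drive's
# Ethernet port; here a dict stands in for the platter.
class KeyValueDrive:
    def __init__(self, address):
        self.address = address          # e.g. "10.0.1.7:8123" (invented)
        self._store = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._store[key] = value

    def get(self, key: bytes) -> bytes:
        return self._store[key]

    def delete(self, key: bytes) -> None:
        self._store.pop(key, None)

drive = KeyValueDrive("10.0.1.7:8123")
drive.put(b"obj/0001", b"...object bytes...")
assert drive.get(b"obj/0001") == b"...object bytes..."
```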

Each HGST open Ethernet architecture drive has a 32-bit ARM processor and memory in a system-on-chip ASIC. Storage system developers can recompile their basic storage building-block code for the ARM and run it on the drive. HGST claims this means developers can run their native code rather than writing connectors to Kinetic's key-value API. Given the amount of support I've seen for Kinetic, I don't think this is a big deal for anyone but a developer who hasn't gotten Kinetic working yet.

Unlike Seagate, HGST hasn't announced an actual product and is calling this a technology demonstration. At the OpenStack Summit, it demonstrated the Swift OSD running on 4TB drives. It also showed a 4U, 60-drive chassis with a built-in Ethernet switch and 10 Gbit/s uplinks.

As the storage market bifurcates into performance and capacity segments, Ethernet disk drives may be a great solution for exabyte-scale storage. Ethernet disk drives, like ARM-based microservers, are based on the assumption that it's more cost effective to spread a workload -- in this case a storage system -- across thousands of low-cost processors than to run it on a few more powerful Xeons. Western Digital and Seagate are betting they can make a little more margin on each Ethernet-connected disk drive and still come in less expensive than using Xeon servers to provide OSDs.
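
That bet is easy to sanity-check with back-of-envelope arithmetic. Every number in the sketch below is a placeholder I've made up purely for illustration -- plug in real prices to run the comparison yourself.

```python
# Back-of-envelope cost comparison of the two architectures.
# All prices and ratios below are hypothetical placeholders.
drives = 600                      # capacity drives in one deployment

# Conventional layout: Xeon storage servers front groups of SATA drives.
drives_per_server = 24            # assumed drives behind each Xeon server
server_cost = 6000.0              # assumed cost of one Xeon storage server
sata_drive_cost = 150.0           # assumed cost per SATA drive
conventional = (drives / drives_per_server) * server_cost + drives * sata_drive_cost

# Ethernet-drive layout: each drive carries its own ARM SoC at a premium,
# and a few thin servers handle only the metadata and proxy layers.
ethernet_drive_cost = 190.0       # assumed per-drive premium for the SoC
thin_servers = 2                  # assumed
thin_server_cost = 3000.0         # assumed
ethernet = drives * ethernet_drive_cost + thin_servers * thin_server_cost

print(f"conventional:    ${conventional:,.0f}")
print(f"ethernet drives: ${ethernet:,.0f}")
```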

If they can, there will be a real market for them as the perfect place to keep our never-ending pile of stored rich media in both public and private cloud stores. However, if even exabyte-scale object stores can’t see a significant cost savings, Seagate and Western Digital may not be able to sell enough Ethernet interface drives for the market to find other appropriate use cases, or for them to recoup their development costs.

Howard Marks is founder and chief scientist at Deepstorage LLC, a storage consultancy and independent test lab based in Santa Fe, N.M., concentrating on storage and data center networking. In more than 25 years of consulting, Marks has designed and implemented storage ...
Comments
AbeG,
User Rank: Ninja
5/20/2014 | 4:56:08 PM
Ethernet may save the spinning disk
A fair amount of speculation has occurred where people wonder if SSD will eventually make HDD obsolete. Adding an Ethernet standard to the HDD may make it even more appealing as part of a disaster recovery solution that is more competitive than cold-storage tape backups.
Sean1271,
User Rank: Apprentice
5/17/2014 | 9:39:07 AM
Re: Transferable to SSD?
Lorna,

Traditionally, storage controllers managed the placement of data on spinning disk drives. Now software on top of a server running a Linux OS provides this function across a few drives or potentially tens of thousands. Administering the data placement functions on a drive-by-drive basis would be impossible to manage for almost every company. Even the traditional management of SAN storage devices on an enclosure-by-enclosure basis is operationally too expensive and way too risky to deploy in today's cloud "always on and always available" design architectures.

All types of drives (SSD, SAS, SATA) can be deployed in a server, typically between 12 and 128 drives in a single enclosure (JBOD = just a bunch of disks). Object storage software like OpenStack Swift, which is open source, allows hundreds and even thousands of these enclosures to appear as one gigantic pool of storage that can be distributed literally around the world. The idea is that drives (spinning or memory-based) do two things: (1) store data, and (2) fail. So storing multiple copies of objects in real time or near real time provides globally distributed failure domains that let users access their data even if one of the data centers where their data sits on a disk or SSD has gone offline due to wildfires, hurricanes, floods, tsunamis, or Tommy Backhoe, who cut the fiber line while digging to replace an aged sewer pipe.

As for your direct question, Swift can tier different drive types for different use cases on a policy basis. SSDs have superior latency for acknowledging writes and for retrieving an object back to the user or application. Software layers integrated with Swift can store highly accessed objects early in an object's life on the lowest-latency SSDs, and then, as data ages and is accessed less and less, it can be stored on slower, higher-latency media. SATA drives or these emerging Ethernet drives like Kinetic price out at 1/40th the cost of SSD but still have the same data durability that globally distributed storage like Swift provides.
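
To illustrate the tiering point Sean makes: Swift exposes storage policies to clients, and a container is pinned to one at creation time via the X-Storage-Policy header. Here's a sketch using python-swiftclient; the endpoint, credentials, and policy names below are assumptions for illustration, since the actual policies are whatever the cluster operator has defined.

```python
# Sketch of tiering via Swift storage policies. The auth URL, account,
# and policy names ("ssd-tier", "capacity-tier") are hypothetical; a real
# cluster defines its own policies in swift.conf.
from swiftclient.client import Connection

conn = Connection(authurl="http://swift.example.com/auth/v1.0",
                  user="account:user", key="secret")

# Hot data: a container bound to an SSD-backed policy.
conn.put_container("hot-media", headers={"X-Storage-Policy": "ssd-tier"})

# Cold data: a container bound to cheap SATA or Ethernet drives.
conn.put_container("cold-archive", headers={"X-Storage-Policy": "capacity-tier"})

conn.put_object("hot-media", "clip.mp4", contents=b"...object bytes...")
```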
Brian.Dean,
User Rank: Ninja
5/16/2014 | 11:01:19 PM
Re: Ethernet SSDs?
@Howard, great article looking at the future low-cost SoC storage market. It could prove useful in many applications -- for example, in the IoT, where data needs to be collected over time but accessed infrequently.

As for general computation, I feel that ARM's microservers might result in lower utilization -- and therefore higher cost, as in the pre-virtualization era -- when up against the Xeons and IBM Power architectures of the world. On the other hand, if the economics of microservers are lower by a large margin, then utilization could be overlooked.
HowardMarks,
User Rank: Strategist
5/16/2014 | 1:43:34 PM
Ethernet SSDs?
Lorna,

First of all, SSDs, and to a lesser extent shingled magnetic recording (SMR) HDDs, do manage their own data placement. The flash controller does garbage collection to consolidate active data into some pages and create empty pages for new data. Products like EMC's XtremIO take advantage of this by not doing any system-level garbage collection, leaving that to the distributed processors in the SSDs.
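
A minimal sketch of that garbage-collection step, with block and page structures simplified for illustration:

```python
# Simplified flash garbage collection: pick the block with the fewest
# still-valid pages (cheapest to reclaim), copy the live pages forward,
# and erase the block to make room for new writes.
def garbage_collect(blocks):
    """blocks: list of lists; each inner list holds pages (None = stale)."""
    victim = min(range(len(blocks)),
                 key=lambda i: sum(p is not None for p in blocks[i]))
    live = [p for p in blocks[victim] if p is not None]
    blocks[victim] = [None] * len(blocks[victim])   # erase the victim block
    return live  # live pages get rewritten into a fresh block

blocks = [["a", None, "b", None], [None, None, "c", None], ["d", "e", "f", "g"]]
survivors = garbage_collect(blocks)   # reclaims block 1, keeps page "c"
print(survivors, blocks)
```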

The big difference is in use cases. Flash and newer forms of non-volatile memory are taking over primary storage for applications where random I/O performance is the gating factor in application performance, like OLTP databases. Spinning disks are being relegated to storing less-structured data where the performance factor that matters is throughput, not IOPS -- like medical imaging and rich media distribution.

NVM development is therefore chasing lower latencies by moving NVM to faster interfaces and closer to the CPU, like SanDisk/Diablo's ULLtraDIMM. Since the primary advantage of HDDs over SSDs is cost, the HDD world is working on ways to reduce the total system cost, especially at scale, and increase the fraction of the total system sale that goes to WD or Seagate.
Lorna Garey,
User Rank: Ninja
5/16/2014 | 1:13:51 PM
Transferable to SSD?
Howard, is this object management and data placement technology available only on spinning disk, or could it in theory be transferred to SSDs? Or do SSDs already do this?