To address these limitations, the powers that be in the hard drive business -- that is, the three-and-a-half remaining vendors and the INCITS T10 and T13 committees that define the SCSI and ATA command sets respectively -- have come up with three models for shingled drives. Dumb -- or, more properly, restricted -- drives shingle the whole surface of each disk. They can only be written to by applications that know that, for write purposes, they’re basically sequential devices, like tape.
To allow applications some level of random access to shingled drives, the other two models break the drive into zones of shingled tracks with a guard band between the zones. For each zone, the drive maintains a write pointer, or cursor, that records the highest-numbered block that has been written in that zone. Applications can write starting at the block following the cursor.
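The write-pointer rule can be sketched in a few lines of Python. This is a toy model, not any drive's firmware or the T10/T13 command set: the `Zone` class and its method names are made up for illustration, and for simplicity the pointer here stores the next writable block rather than the last block written.

```python
class Zone:
    """Toy model of one shingled zone (hypothetical, for illustration only)."""

    def __init__(self, start_lba, num_blocks):
        self.start_lba = start_lba
        self.num_blocks = num_blocks
        # Assumption: we track the NEXT writable block, i.e. the block
        # following the highest-numbered block written so far.
        self.write_pointer = start_lba
        self.blocks = {}

    def write(self, lba, data_blocks):
        """Append blocks; the write must begin exactly at the write pointer."""
        if lba != self.write_pointer:
            raise ValueError(
                f"write must start at block {self.write_pointer}, not {lba}"
            )
        if lba + len(data_blocks) > self.start_lba + self.num_blocks:
            raise ValueError("write would run past the end of the zone")
        for offset, block in enumerate(data_blocks):
            self.blocks[lba + offset] = block
        self.write_pointer += len(data_blocks)

    def read(self, lba):
        """Reads are not restricted; any block already written can be read."""
        return self.blocks[lba]
```

Writing two blocks at the start of an empty zone advances the pointer by two. A second write aimed back at the first block is rejected, because it would overwrite the overlapping tracks that follow it.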
When I first heard about zoned shingled drives, I thought that they solved the random write problem by reading a whole zone into the drive’s buffer memory, updating the data in the buffer, and then writing the data back. While this might work if a zone were only a few tracks, such small zones would limit the capacity gains we get from shingling.
Those capacity gains can be significant, but aren’t as big as some folks would lead you to believe. Shingling by itself could result in a 15% or greater capacity boost, depending on the size of the zones, with bigger boosts going to restricted drives. Additional density boosts come from improved linear bit density, which comes from using that higher coercivity media with its smaller superparamagnetic limit, and from using larger data blocks. Writing sequentially allows the drives to write to the disk in blocks bigger than the 512-byte, or more recently 4KB, blocks that standard drives use. Larger blocks can use more efficient ECC, consuming less capacity in inter-record gaps and error-correcting data.
The drive vendors I’ve chatted with suggest that most drives will have zones of about 100 tracks. Since a track on a modern drive holds on the order of a megabyte of data, a zone would therefore hold 100MB or so -- too much for my simplistic read-modify-rewrite model. The T10 and T13 standards bodies are proposing a new set of commands that will allow an operating system or application to query the drive for the number of zones on it and the location of the write cursor for each zone.
Vendors could even make drives that have some zones with normal track layouts and other zones with shingled tracks. A file system could query the drive, discover the standard-track zones, and use them for its metadata while writing files to the shingled zones. Of course, standards bodies move at their own, usually rather glacial, pace, so the projected date for a full first draft of the T10 zoned device standard is November 2016.
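Here is a rough sketch of how a file system might use that zone information on a mixed drive. The standard isn't final, so everything here is an assumption: `ZoneInfo` and both functions are hypothetical names, and a plain list of zone records stands in for whatever the drive's query command actually returns.

```python
from dataclasses import dataclass


@dataclass
class ZoneInfo:
    """Hypothetical per-zone record, standing in for the drive's query reply."""
    start_lba: int
    num_blocks: int
    shingled: bool
    write_pointer: int  # next writable block; only meaningful if shingled


def plan_layout(zones):
    """Split zones: standard-track zones for metadata, shingled for files."""
    metadata_zones = [z for z in zones if not z.shingled]
    data_zones = [z for z in zones if z.shingled]
    return metadata_zones, data_zones


def pick_data_zone(data_zones, blocks_needed):
    """Return the first shingled zone with enough room past its cursor."""
    for zone in data_zones:
        free = zone.start_lba + zone.num_blocks - zone.write_pointer
        if free >= blocks_needed:
            return zone
    return None  # no single zone can hold the file
```

A real file system would also have to handle files larger than one zone and decide when a zone full of stale data is worth rewriting. This sketch only shows the placement decision.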
Some of the drive vendors are also planning, and in fact shipping, drives that, except for lower random write performance, look to the computers they're connected to like normal drives. Like SSD controllers -- which, if you think about it, face a similar problem storing data in flash, where a whole block has to be erased before any of it can be rewritten -- these transparent SMR drives use a log-based data layout, so they can constantly write new data to free space in a shingled zone.
Obviously, managing a log-based data layout requires a bit more intelligence in each drive than simple LBA addressing, and since the logical-to-physical block map will normally be stored in the drive’s RAM, each drive will need a little bit of non-volatile memory and enough capacitance to dump that table in the event of a power failure. The additional couple of bucks in software and electronics should be worth it for the additional capacity.
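To make the log-based layout concrete, here is a minimal sketch of the idea, not a description of any vendor's firmware. The class name, the flat list standing in for the shingled media, and the dictionary used as the logical-to-physical map are all assumptions for illustration.

```python
class LogStructuredDrive:
    """Toy transparent SMR drive: every write is appended to a log."""

    def __init__(self, physical_blocks):
        self.physical = [None] * physical_blocks  # stand-in for the media
        self.next_free = 0  # append point in the log
        self.l2p = {}       # logical block address -> physical block index

    def write(self, lba, data):
        """Append the data and repoint the map; nothing is overwritten."""
        if self.next_free >= len(self.physical):
            # A real drive would have to reclaim stale blocks at this point.
            raise RuntimeError("log is full")
        self.physical[self.next_free] = data
        self.l2p[lba] = self.next_free  # any older copy is now stale
        self.next_free += 1

    def read(self, lba):
        """Look up where the latest copy of this logical block lives."""
        return self.physical[self.l2p[lba]]

    def stale_blocks(self):
        """Physical blocks written so far that no logical block points to."""
        return self.next_free - len(self.l2p)
```

Rewriting a logical block simply lands the new copy at the end of the log and updates the map, which is why the host sees an ordinary random-write device. The `l2p` dictionary is the table that would have to be dumped to non-volatile memory on power loss.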
Shingled recording should be a good solution for capacity-oriented drives where random I/O performance isn’t important. For more performance-oriented uses, and even greater capacity, we’ll have to wait for heat-assisted magnetic recording and/or bit-patterned media to make it out of the lab and into our datacenters.