Howard Marks

Network Computing Blogger



Hot Flash: Researchers Use Heat to Counter NAND Flash Wear-n-Tear

The limited write endurance of NAND flash storage is a significant drawback of the technology, second only to its high cost per gigabyte. The very idea that SSDs will fail after 10,000 write/erase cycles rubs storage administrators the wrong way. Now, according to an article in the IEEE's Spectrum, engineers at Taiwan's Macronix have uncovered a way to extend flash endurance to 100 million or more write/erase cycles.

The Macronix group figured out how to use heat to repair the insulating layers of the flash chip, which degrade with each erasure. Researchers already knew that heat could do this, but previous attempts heated the whole chip to 250°C (482°F) for several hours. The Macronix advance uses itty-bitty heaters, derived from the ones the company builds for phase change memory, that heat small groups of flash pages to 500°C. Macronix also discovered that the elevated temperature speeds up erasures, which wasn't predicted by the materials science geeks. (Before you attempt to revive an old SSD or CF card in the pizza oven, note that the solder that holds the components of an SSD together melts at about 185°C.)

Macronix hasn't announced any product using the technology.

There's certainly some appeal to the idea of using built-in heaters to reset the write endurance odometer every 50 or 100 write/erase cycles for SSDs based on TLC or even QLC (Quad Level Cell, which is flash that stores 4 bits per cell). However, I don't think flash's limited write endurance is that big a problem. Instead, our management processes need to account for the fact that SSDs wear out.

Many people think SSDs just up and stop working, like a dead hard drive, when the 10,000th write/erase cycle completes. That's not true. While SSDs occasionally fail without warning (just like everything else), those failures aren't due to write exhaustion.

The flash controller in each SSD monitors how often each page is erased, and distributes the wear as evenly as possible across all of its flash. Array controllers and host OSes can use SMART (Self-Monitoring, Analysis and Reporting Technology) to check attribute 231, SSD Life Left, which reports what percentage of the SSD's rated life remains. If customers would accept it, array vendors could stop using expensive SLC SSDs, which can be written to as fast as they accept data, and start using MLC flash, which should last for five years. MLC flash should satisfy the performance needs of 80% of array vendors' customers; the others, who need SLC, could get new SSDs shipped to arrive 60 days before the old ones reach the end of their rated life.
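As a rough sketch of how that kind of planning could work, the Python below takes two readings of the SSD Life Left percentage, extrapolates linearly to the date the drive would hit zero, and backs off the 60-day lead time mentioned above. The function names are hypothetical and the constant-wear assumption is a simplification; how the percentage is actually collected from the drive is deliberately left out.

```python
from datetime import date, timedelta


def projected_end_of_life(first_day, first_pct, second_day, second_pct):
    """Linearly extrapolate two 'SSD Life Left' readings to the 0% date.

    Assumes wear accrues at a constant rate, which real workloads may not do.
    Returns None if no wear was observed between the two readings.
    """
    elapsed_days = (second_day - first_day).days
    if elapsed_days <= 0:
        raise ValueError("second reading must be later than the first")
    pct_used = first_pct - second_pct
    if pct_used <= 0:
        return None  # no measurable wear, so no projection is possible
    days_per_pct = elapsed_days / pct_used
    return second_day + timedelta(days=round(second_pct * days_per_pct))


def replacement_arrival_date(end_of_life, lead_days=60):
    """Date a replacement should arrive: lead_days before projected end of life."""
    return end_of_life - timedelta(days=lead_days)


if __name__ == "__main__":
    # Illustrative readings only: 90% left on Jan. 1, 85% left 100 days later.
    eol = projected_end_of_life(date(2013, 1, 1), 90, date(2013, 4, 11), 85)
    print("Projected end of rated life:", eol)
    print("Replacement should arrive by:", replacement_arrival_date(eol))
```

A drive that burns through its rated life unevenly would need more than two readings and a smarter fit, but even this crude projection turns wear-out into a date on the calendar rather than a surprise.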

Of course, the flash in an SSD doesn't self-destruct on erase 10,001, although at least one controller vendor allows SSD makers to switch the device to read-only when a threshold is reached. Ten thousand cycles is just the point at which the flash has degraded enough that the manufacturer no longer wants to guarantee it will work. As the flash insulating layers break down, individual cells get stuck and will no longer hold data properly. At some point after 10,000 cycles--and there's no knowing if it's 10,317 or 30,000--there will be too many broken cells on a given page for the controller to correct, and the controller will mark that page as bad. Once too many pages go bad, the SSD will not have any place left to write new data. But this is a gradual degradation that can be monitored, not a fatal failure with data loss.

We should treat SSDs like the timing belts in our cars. They're just parts we replace every 60,000 miles. We know when 60,000 miles is coming, and we can plan for it.

