Howard Marks

Network Computing Blogger



Hot Flash: Researchers Use Heat to Counter NAND Flash Wear-n-Tear

The limited write endurance of NAND flash storage is a significant drawback of the technology, second only to its high cost per gigabyte. The very idea that SSDs will fail after 10,000 write/erase cycles rubs storage administrators the wrong way. Now, according to an article in IEEE Spectrum, engineers at Taiwan's Macronix have found a way to extend flash endurance to 100 million or more write/erase cycles.

The Macronix group figured out how to use heat to repair the insulating layers of the flash chip, which degrade with each erasure. Researchers already knew that heat could do this, but previous attempts heated the whole chip to 250°C (482°F) for several hours. The Macronix advance uses itty-bitty heaters, derived from the ones the company builds for phase change memory, that heat small groups of flash pages to 500°C. Macronix also discovered that the elevated temperature speeds up erasures, which the materials science geeks hadn't predicted. (Before you attempt to revive an old SSD or CF card in the pizza oven, note that the solder holding the components of an SSD together melts at about 185°C.)

Macronix hasn't announced any product using the technology.

There's certainly some appeal to the idea of using built-in heaters to reset the write endurance odometer after 50 or 100 write/erase cycles on SSDs based on TLC or even QLC (Quad Level Cell, which is flash that stores 4 bits per cell). However, I don't think flash's limited write endurance is that big a problem. Instead, our management processes need to account for the fact that SSDs wear out.

Many people think SSDs just up and stop working, like a dead hard drive, when the 10,000th write/erase cycle completes. That's not true. While SSDs occasionally fail without warning (just like everything else), those failures aren't due to write exhaustion.

The flash controllers in each SSD monitor how often each page is erased, and distribute the wear as evenly as possible across all their flash. Array controllers and host OSes can use SMART (Self-Monitoring, Analysis and Reporting Technology) to check attribute 231, SSD Life Left, which reports the percentage of the SSD's rated life that remains. If customers would accept it, array vendors could stop using expensive SLC SSDs, which can be written to as fast as they accept data, and start using MLC flash, which should last for five years. MLC flash should satisfy the performance needs of 80% of array vendors' customers; the others, who need SLC, could get new SSDs shipped to arrive 60 days before the old ones reach the end of their rated life.
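As a rough illustration of that kind of monitoring, the Python sketch below pulls attribute 231 out of a table laid out like the output of `smartctl -A` and projects when a replacement should ship. The sample values, the function names, and the assumption that wear accumulates linearly are all invented for illustration, and attribute numbering and meaning vary by SSD vendor, so treat this as a sketch and not a monitoring tool.

```python
from datetime import date, timedelta

# Hypothetical excerpt in the column layout printed by `smartctl -A`;
# the values are invented for illustration.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       8760
231 SSD_Life_Left           0x0013   080   080   010    Pre-fail  Always       -       0
"""

def life_left_percent(table, attr_id=231):
    """Return the normalized VALUE column of the given attribute, or None if absent."""
    for line in table.splitlines():
        fields = line.split()
        if fields and fields[0] == str(attr_id):
            return int(fields[3])
    return None

def order_date(installed, today, life_left, lead_days=60):
    """Extrapolate wear linearly and return the date a replacement should arrive."""
    used = 100 - life_left
    if used <= 0:
        return None  # no measurable wear yet, nothing to extrapolate from
    days_in_service = (today - installed).days
    projected_total_days = days_in_service * 100 / used
    end_of_life = installed + timedelta(days=round(projected_total_days))
    return end_of_life - timedelta(days=lead_days)

pct = life_left_percent(SAMPLE)
when = order_date(date(2012, 7, 1), date(2013, 7, 1), pct)
print(pct, when)
```

The 60-day lead time comes from the paragraph above; everything else is a placeholder to be replaced with real readings from the drives in question.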

Of course, the flash in an SSD doesn't self-destruct on erase 10,001, although at least one controller vendor allows SSD makers to switch the device to read-only when a threshold is reached. Ten thousand cycles is just the point at which the flash has degraded enough that the manufacturer no longer wants to guarantee it will work. As the flash insulating layers break down, individual cells get stuck and will no longer hold data properly. At some point after 10,000 cycles (and there's no knowing whether it's 10,317 or 30,000) there will be too many broken cells on a given page for the controller to correct, and the controller will mark that page as bad. Once too many pages go bad, the SSD will have no place left to write new data. But this is a gradual, monitorable degradation, not a fatal failure with data loss.
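The failure path described above can be sketched as a toy model. The ECC correction limit, the page counts, and the class and method names below are invented for illustration and do not describe any particular controller.

```python
from dataclasses import dataclass, field

ECC_CORRECTABLE_BITS = 8  # assumed per-page correction limit (hypothetical)

@dataclass
class ToySSD:
    total_pages: int
    min_good_pages: int  # good pages needed to keep accepting writes
    bad_pages: set = field(default_factory=set)

    def report_read(self, page, stuck_bits):
        """Retire a page once its stuck cells exceed what ECC can correct."""
        if stuck_bits > ECC_CORRECTABLE_BITS:
            self.bad_pages.add(page)

    @property
    def good_pages(self):
        return self.total_pages - len(self.bad_pages)

    @property
    def writable(self):
        """False once too many pages have gone bad to place new data."""
        return self.good_pages >= self.min_good_pages

    @property
    def health_percent(self):
        """Share of spare pages still available, clamped at zero."""
        spare = self.total_pages - self.min_good_pages
        return max(0.0, 100 * (spare - len(self.bad_pages)) / spare)

ssd = ToySSD(total_pages=100, min_good_pages=90)
ssd.report_read(page=3, stuck_bits=5)    # within the ECC limit, page stays in service
ssd.report_read(page=7, stuck_bits=12)   # beyond the ECC limit, page is retired
print(ssd.health_percent, ssd.writable)
```

The point of the model is that health falls one retired page at a time, which is what makes the decline something an administrator can watch.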

We should treat SSDs like the timing belts in our cars. They're just parts we replace every 60,000 miles. We know when 60,000 miles is coming, and we can plan for it.
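To know roughly when the 60,000-mile mark is coming before a drive has any wear history, a back-of-the-envelope estimate helps. The sketch below uses the common approximation that the total data a drive can absorb is its capacity times its rated write/erase cycles, divided by write amplification. The drive size, daily write volume, and write-amplification figure are assumptions chosen for illustration, not numbers from Macronix or any SSD vendor.

```python
def rated_life_years(capacity_gb, pe_cycles, daily_writes_gb, write_amplification=1.0):
    """Estimate years until the rated write/erase cycles are used up."""
    # Host data the drive can absorb before every cell hits its rated cycles,
    # assuming perfectly even wear leveling.
    total_host_writes_gb = capacity_gb * pe_cycles / write_amplification
    return total_host_writes_gb / daily_writes_gb / 365

# Hypothetical workload: 200 GB drive, 10,000 cycles, 500 GB written per day,
# with each host write costing two flash writes.
years = rated_life_years(200, 10_000, 500, write_amplification=2.0)
print(f"{years:.1f} years")
```

Doubling the daily write volume halves the estimate, which is why the figure belongs in capacity planning alongside the drive's own SMART readings.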

