Howard Marks

Network Computing Blogger


More on Performance Metrics: The Relationship Between IOPS and Latency

In my last post, "SSDs and Understanding Storage Performance Metrics," I explained how storage users pay too much attention to throughput. The performance of mission-critical applications in data centers usually depends more on the storage system's ability to deliver a large number of IOPS while satisfying each I/O request with minimal latency.

Unfortunately, getting useful IOPS and latency figures from vendors, or even some product reviews, isn't easy. A big part of the problem is there's no such thing as a standard I/O operation. When we talk about IOPS, we have to be really clear about both the size and type of the I/O operation in question. You'll commonly see vendors claim their systems can deliver 100,000 or 1 million IOPS while not saying whether those are 512-byte sequential reads, 4K random writes or some other mythical data I/O chosen simply to make their storage systems look good.
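To see why an IOPS figure means little without the I/O size, here's a rough sketch that converts the same headline number into throughput for two request sizes. The function name and the two sizes chosen are mine, picked only for illustration.

```python
def throughput_mb_per_s(iops, io_size_bytes):
    """Data moved per second in MB/s, taking 1 MB as 1,000,000 bytes."""
    return iops * io_size_bytes / 1_000_000

# The same "100,000 IOPS" claim at two different I/O sizes.
for label, size_bytes in [("512-byte", 512), ("4K", 4096)]:
    mb_s = throughput_mb_per_s(100_000, size_bytes)
    print(f"100,000 {label} IOPS = {mb_s:.1f} MB/s")
```

The 4K case moves eight times as much data per second as the 512-byte case, so the two claims describe very different amounts of work even though the IOPS number is identical.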

As if that weren't bad enough, you'll also see vendors juking the stats by running simple benchmarks like Iometer against very small logical disks. When the test data set is that small, all or most of the I/Os are served by the controller's cache rather than by the disks themselves.

More Than IOPS

Even if a vendor's miraculous new storage system could deliver 1,000,000 4K random IOPS with a 60/40 read/write mix, I'd want to know how much latency there was for each of those million IOPS. In a recent blog post explaining IOPS and latency, Dimitris Krekoukias examined the performance of Oracle on a system that was delivering 15,000 IOPS with an average latency of 25ms. The database engine on that system reported a high level of I/O wait time, even though it was grinding through 15,000 IOPS.

One commenter at Krekoukias's Recovery Monkey blog recounted how the credit card processor he worked for had determined that its fraud prevention application ran fast enough to not significantly slow down the charge authorization process only if the storage array supporting that application had latency under 4ms.

The key to delivering high application performance is a combination of high IOPS and low latency. As an application or benchmark stresses a storage system, it may continue to deliver high IOPS but at higher levels of latency, and that may seriously affect real-world performance.
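One way to put numbers on this is Little's Law, which says that in a steady state the average number of requests in flight equals the request rate times the average time each request takes. The sketch below applies it to the figures quoted above. It's back-of-the-envelope arithmetic that assumes a steady state, not a reconstruction of how either system was measured, and the function names are mine.

```python
def outstanding_ios(iops, avg_latency_ms):
    """Little's Law: average I/Os in flight = rate x time in system."""
    return iops * avg_latency_ms / 1000.0

def iops_at_queue_depth(queue_depth, avg_latency_ms):
    """The same law rearranged: rate = I/Os in flight / time in system."""
    return queue_depth * 1000.0 / avg_latency_ms

# The Oracle example: 15,000 IOPS at an average latency of 25ms.
in_flight = outstanding_ios(15_000, 25)
print(f"I/Os in flight on average: {in_flight:.0f}")

# Hypothetical: the same number of I/Os in flight at 4ms latency.
print(f"IOPS at 4ms: {iops_at_queue_depth(in_flight, 4):.0f}")
```

Sustaining 15,000 IOPS at 25ms means roughly 375 requests are waiting on storage at any moment, which fits the high I/O wait the database reported. If the array could hold latency to 4ms, the same 375 requests in flight would work out to about 93,750 IOPS.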

This relationship between IOPS and latency is one very good reason to pay more attention to published results from benchmarks like JetStress, TPC-C and SPC-1 rather than simple performance tests like Iometer. The rules for reporting results from these benchmarks require vendors to disclose not only the raw performance that their devices achieved, but also the transactional latency. JetStress will fail a system under test if latency exceeds 20ms, while SPC requires reporting latency at several load levels and TPC results include average, 90th percentile and maximum latency.

SSD Caches Change the Game

While average latency is most directly related to typical application performance, as we move to hybrid storage systems that mix flash and spinning disks, the 90th-percentile latency reading will become more significant. Your application's apparent performance may be driven as much by the latency for the 10% of transactions whose data isn't entirely in flash as by the much lower latency for the 90% that are.

That 90th-percentile figure may also expose inconsistent latency. This is a problem with some flash-based systems in write-intensive applications, as SSDs have to perform housekeeping to free up a fresh page to write to. Well-designed systems have sufficient RAM cache and overprovisioned flash to keep latency relatively constant.
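To illustrate why the average alone can mislead on a hybrid system, here's a minimal sketch that builds two synthetic latency sets and reports the mean, 90th percentile and maximum for each. The 0.5ms flash and 12ms disk latencies and the two hit ratios are assumed values chosen for illustration, not measurements from any product, and the helper names are mine.

```python
import math
import statistics

FLASH_MS = 0.5   # assumed latency for an I/O served from flash
DISK_MS = 12.0   # assumed latency for an I/O that goes to spinning disk

def latency_samples(flash_hit_ratio, n=1000):
    """Synthetic latencies: a fixed share served from flash, the rest from disk."""
    flash_count = round(flash_hit_ratio * n)
    return [FLASH_MS] * flash_count + [DISK_MS] * (n - flash_count)

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest value with pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

for ratio in (0.95, 0.85):
    samples = latency_samples(ratio)
    print(f"{ratio:.0%} in flash: mean {statistics.mean(samples):.3f} ms, "
          f"90th percentile {percentile(samples, 90):.1f} ms, "
          f"max {max(samples):.1f} ms")
```

With these assumed numbers, dropping the flash hit ratio from 95% to 85% roughly doubles the mean, from about 1.1ms to about 2.2ms, while the 90th percentile jumps from 0.5ms to 12ms. That jump is the kind of behavior an average hides and a percentile exposes.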

Once you bundle a group of disk drives or SSDs into a system, many factors--from controller CPU and cache size to the way the system organizes its data--can affect both IOPS and latency. Your best bet is to pay attention to both stats.

In our next installment, we'll look at how RAID affects storage system performance and how to configure a set of disks for a given application's performance needs.

