Are Converged Network Adapters Suffocating The Virtual Machine?
May 19, 2010
Last week I reviewed real-world results of a test between the Emulex UCNA and the QLogic 8152 CNA. The testing illuminated a performance delta between the products, and it demonstrated another important issue: a big difference in CPU utilization while running FCoE. For the past few months, Emulex has been making some fairly broad claims that its UCNAs can support nearly one million IOPS, which it says towers over the performance available from all other CNAs in the marketplace. Emulex said that, for virtualized server environments, this leap in network performance is needed to fully realize the capacity of the new Intel Xeon-based servers.
Well, after I reviewed some real-world CNA test results last week, I started thinking that perhaps what Emulex meant by "fully realize the capacity of the new Intel Xeon-based servers" was to choke the CPU so that it had no room left for additional functions. Before we review the test results, let's take a look at the test configuration that was used. The storage units under test were a Texas Memory RAMSAN 325 and RAMSAN 400 connected to a Cisco Nexus 5020 switch, which in turn connected to the FCoE adapters in the server. The server used was a dual-socket, quad-core 2.8GHz Nehalem machine with 24GB of RAM and PCIe connectivity. The performance analysis tool used was IOmeter.
From an overall test environment perspective, standard, generally available FCoE converged network adapters were used for performance testing. The products under test ran the most current software and firmware levels available at the time, and the test environment was well defined, documented and reproducible. In my opinion, the test environment was designed to provide an unconstrained adapter performance analysis and assessment. Tests were run with block sizes ranging from a minuscule 0.5KB up to 1020KB, on both single-port and dual-port FCoE CNA configurations.
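For readers who want to run a similar sweep on their own hardware, here is a minimal sketch. The original testing used IOmeter; this version drives fio, a common open-source load generator, as a stand-in. The target device path, queue depth, and runtime are illustrative assumptions, not values from the original test.

```python
import json
import subprocess

TARGET = "/dev/sdX"  # hypothetical FCoE-attached LUN; replace before use
BLOCK_SIZES = ["512", "4k", "8k", "16k", "64k", "256k", "1m"]

for bs in BLOCK_SIZES:
    result = subprocess.run(
        ["fio", "--name=sweep", f"--filename={TARGET}",
         "--rw=read",              # sequential reads, as in the tests above
         f"--bs={bs}", "--direct=1", "--ioengine=libaio",
         "--iodepth=32", "--runtime=60", "--time_based",
         "--output-format=json"],
        capture_output=True, text=True, check=True)
    job = json.loads(result.stdout)["jobs"][0]
    iops = job["read"]["iops"]
    # fio reports CPU usage of the benchmark process itself; the system-wide
    # utilization the article compares should be sampled separately (e.g.
    # with mpstat) while the run is in flight.
    print(f"bs={bs:>4}  IOPS={iops:,.0f}  "
          f"usr={job['usr_cpu']:.1f}%  sys={job['sys_cpu']:.1f}%")
```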
The test results from an input/output operations per second (IOPS) perspective reveal that in some cases the QLogic 8152 delivers higher IOPS than the Emulex UCNA, and in other cases the Emulex UCNA delivers higher IOPS than the QLogic 8152. The results were similar from a megabytes-per-second perspective. However, there was a striking difference in the percentage of CPU utilization for the Emulex UCNA, especially at block sizes from 0.5KB to 8KB and at 16KB, in both sequential read and sequential write testing.
In one test case, the CPU utilization for the Emulex UCNA was a whopping 80 percent. This case was for sequential reads at a 0.5KB block size: 247K IOPS on the QLogic and 841K IOPS on the Emulex UCNA, with a concerning 80.66 percent CPU utilization for the Emulex UCNA vs. 11.16 percent for the QLogic. In fact, I understand that in some tests the Emulex UCNA drove CPU utilization north of a staggering 90 percent. Now we know why Emulex didn't talk about CPU utilization when it was broadcasting 1M IOPS: the performance was extracted at the expense of the CPU, using a block size so small that it bears no resemblance to a real-world workload. And at real-world block sizes (4K, 8K), the performance of the products is similar, with Emulex's UCNA still exhibiting much higher CPU overhead.
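To put those figures in perspective, it helps to normalize raw IOPS by the CPU consumed to deliver them. The quick calculation below uses the 0.5KB sequential-read numbers quoted above; the IOPS-per-CPU-point metric is my own framing, not something reported in the test.

```python
# Back-of-envelope efficiency check using the 0.5KB sequential-read figures
# quoted above: IOPS delivered per percentage point of CPU consumed
# (a framing of mine, not a metric from the test report).
results = {
    "QLogic 8152": {"iops": 247_000, "cpu_pct": 11.16},
    "Emulex UCNA": {"iops": 841_000, "cpu_pct": 80.66},
}

for name, r in results.items():
    print(f"{name}: {r['iops'] / r['cpu_pct']:,.0f} IOPS per CPU point")

# Output:
# QLogic 8152: 22,133 IOPS per CPU point
# Emulex UCNA: 10,426 IOPS per CPU point
```

By that measure, the Emulex UCNA delivers roughly 3.4 times the IOPS at more than seven times the CPU cost, or about half the I/O per CPU cycle.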
I asked the folks running the tests to take a closer look at why CPU utilization was so high in the Emulex case versus the lower numbers for the QLogic CNA. After a careful review of the lpfc upstream driver, it appears there is a significant difference between the implementation of the Emulex Fibre Channel driver for its 4/8Gb/sec Fibre Channel adapters and the one for its UCNA. It also appears that relevant functions are now offloaded to the driver rather than located in firmware on the UCNA adapter card itself, which explains why the server CPU would be so busy processing I/O.

So what's the big deal with this high level of CPU utilization? For starters, I believe users will expect vendors to carry the same proven stack forward onto FCoE adapters that has served on traditional Fibre Channel adapters as we move into this new era of connectivity. It is unclear how all this will play out across the many operating systems that new FCoE cards need to be deployed within, and how it will affect overall server performance over time.

This also ties back to my analysis of how the Emulex UCNA is basically like trying to build a house on leased land: the leased land comes from ServerEngines in the form of a baseline Ethernet NIC, while the recreated FC stack comes from Emulex, and the two get bolted together in a product that is essentially owned by two companies. That raises concerns from a support perspective. If you have Ethernet problems, Emulex escalates them to ServerEngines, waits for ServerEngines to try to address them, and then carries the fixes back to the customer. Who's responsible for what, and how is this model scalable?
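As a practical aside, Linux makes it easy to see what any adapter reports for its driver and firmware versions, which is a useful first step when trying to judge how much of the stack lives in the driver versus on the card. A minimal sketch, assuming a standard ethtool installation; the interface name is a placeholder.

```python
# Quick check of what an adapter reports for its driver and firmware
# versions on Linux, using ethtool. The interface name below is a
# placeholder for whatever the CNA's Ethernet port enumerates as.
import subprocess

def adapter_info(iface: str) -> dict:
    """Parse `ethtool -i <iface>` output into a field -> value dict."""
    out = subprocess.run(["ethtool", "-i", iface],
                         capture_output=True, text=True, check=True).stdout
    return dict(line.split(": ", 1) for line in out.splitlines() if ": " in line)

info = adapter_info("eth2")  # hypothetical CNA port
print(info.get("driver"), info.get("version"), info.get("firmware-version"))
```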
Also, this extremely high level of CPU utilization for the Emulex UCNA will probably drive up the amount of memory required, and the impact on virtual machine density per physical server needs to be carefully considered. Users will have to choose their adapter vendor carefully so as not to derail their virtualization plans with overwhelming CPU utilization. The business impact of higher CPU utilization rates caused by I/O infrastructure products is a potential hit to application performance, and IT negatively affecting a revenue-generating application will not be well tolerated by the business side of the house.
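To make the density point concrete, consider a rough back-of-envelope calculation. Every number below is an assumption chosen for illustration, not a measurement from the test; the 80 percent case reflects the worst-case microbenchmark figure quoted earlier.

```python
# Illustrative VM-density arithmetic. All numbers are assumptions chosen
# for the example, not measurements from the test; the 80 percent case
# reflects the worst-case 0.5KB microbenchmark figure quoted earlier.
per_vm_budget = 10.0  # assumed steady-state CPU percent per VM

for adapter, io_overhead in [("low-overhead CNA", 11.0),
                             ("high-overhead CNA", 80.0)]:
    headroom = 100.0 - io_overhead
    vms = int(headroom // per_vm_budget)
    print(f"{adapter}: {headroom:.0f}% CPU headroom -> ~{vms} VMs "
          f"at {per_vm_budget:.0f}% each")

# low-overhead CNA: 89% CPU headroom -> ~8 VMs at 10% each
# high-overhead CNA: 20% CPU headroom -> ~2 VMs at 10% each
```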
In related UCNA news, Emulex recently announced "OneCommand Vision," an IT management tool that Analytico, Inc. believes competes with management tools from Emulex's OEMs, including EMC Ionix, HP Insight Control and various software tools from IBM. I don't expect any of these OEMs to carry this software product from Emulex, as it appears to step on their toes from an overall management perspective. Clearly, Emulex is trying to expand its market opportunity, and in this tough economy all vendors are trying to do the same; in the process, however, the company also appears to be creeping into the front yard of its top customers. IT management software is in high demand this year, and many storage systems vendors are rolling out their own unique solutions.
As always, we encourage users to review vendor benchmark activity and, if at all possible, to conduct testing of their own prior to deploying I/O infrastructure hardware. Also, remember to look at, or ask for, all of the performance numbers so that overall performance can be fully assessed before deploying these new technologies.