Tripping On Power

Blade systems, 1U servers, and high-density storage systems are straining the power and cooling capabilities of data centers. Here's practical advice on what you can do about it.

June 1, 2005

It wasn't all that long ago that, as microprocessor-based servers took over, raised-floor data centers with chilled water cooling and airflow systems that could knock over a small child seemed like overkill. Not so today: Rapid increases in server and storage system densities are straining existing data center power and cooling capabilities, yet our hunger for processing power and storage shows no signs of abating. So now that summer and the brown-out season are here, how's your data center power and cooling posture?

For many, the answer is uncertain, and for good reason. While application specialists and IT architects have been packing data centers with blade systems and 1U and 2U servers, it's only been by the grace of mainframe forebears that most facilities still function adequately. But don't count on data center designs of the previous century to do the job much longer.

WRECKED RACKS

For most enterprises, existing data centers aren't full, and in a gross sense, existing cooling and electrical systems are often sufficient for the overall load currently presented. The immediate problem is that today's high-density systems can require much more localized power and cooling than is typically available. For example, using HP's online configuration tool, we were able to build a single rack of HP ProLiant BL p-Class blade servers that would require over 30kW of power and would weigh in at over a ton. (We picked on HP here because it provides some of the nicest power requirement calculators, but all the other blade system vendors share a similar story.)
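
To put a 30kW rack in perspective, here's a quick back-of-the-envelope sketch converting that load into circuit amperage and cooling tonnage. The 208V three-phase feed and 0.95 power factor are our own assumptions, not output from HP's configurator; the thermal conversions (3.412 BTU/hr per watt, 12,000 BTU/hr per ton of cooling) are standard.

```python
import math

# Assumed electrical service for the rack (not from HP's tool)
rack_kw = 30.0          # IT load from the HP configurator example
volts_ll = 208.0        # line-to-line voltage, three-phase (assumption)
power_factor = 0.95     # assumed power factor

# Current draw on a three-phase circuit: I = P / (sqrt(3) * V * PF)
amps = rack_kw * 1000 / (math.sqrt(3) * volts_ll * power_factor)

# Nearly all of that power becomes heat the room must remove
btu_per_hr = rack_kw * 1000 * 3.412          # 1 W = 3.412 BTU/hr
cooling_tons = btu_per_hr / 12000            # 1 ton of cooling = 12,000 BTU/hr

print(f"Current draw: {amps:.0f} A at {volts_ll:.0f} V three-phase")
print(f"Heat output:  {btu_per_hr:,.0f} BTU/hr (~{cooling_tons:.1f} tons of cooling)")
```

In other words, one rack at this density needs the better part of a 100A three-phase feed and roughly eight and a half tons of cooling all to itself.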

In practice, such a configuration is difficult to deploy. According to Neil Rassmussen, CTO at Uninterruptible Power Supply (UPS) and cooling giant APC, "Not only will you need to size the UPS to power the data equipment, but you'll also need to include the air handling units. In the 10 to 30 seconds required for a backup generator to kick in, a 30kW system will overheat itself." To avoid the additional cost of battery backup for AC units, Rassmussen says a single rack's power budget shouldn't exceed 15kW.

Paul Frountan, vice president of engineering at managed hosting provider Rackspace, sees it slightly differently. "Our maximum rack configurations draw up to 20kW, but to achieve that we tightly control the airflow into and out of the rack. As long as systems are running at normal temperatures, they'll survive for the 30 seconds needed for generators to take over."

There are other data center problems with packing racks full. Besides localized cooling, there's the issue of weight. Present-day raised floors are rated for a uniform loading of 250lbs/sq ft or less. If you're dealing with one 10 years old or older, it may only be rated for 75lbs/sq ft. Certain earthquake-prone communities may require even lower loading. They may also require that racks maintain a lower center of gravity than usual.

The heaviest configuration cited in our table on page 36 works out to a load of 744lbs/sq ft, or almost triple the best rating of modern raised floors. To address this, Rackspace always builds its data centers on grade and uses extra floor bracing for heavy systems, according to Frountan. In high-rise buildings, maximum floor point loads must also be considered, along with uniform loading restrictions.

Problematically, raised floors that can handle a ton in each rack are less likely to allow for the required cooling airflow. Delivering enough air can require a sub-floor-to-raised-floor gap of approximately three feet, whereas most raised floors are designed with less than a two-foot gap. That clearance is sometimes reduced even further to increase the weight-bearing capacity, cutting cooling capacity further still. In most instances, the problem can be addressed by pressurizing overhead air ducts as well as the floor gap; when floor-based airflow isn't sufficient, it can be augmented from ceiling-based ducts.

There's also the issue of power distribution. Data center design in the mainframe era called for power densities as low as 30W/sq ft, though 75W/sq ft is more typical. Using the 75W reference, our fully stocked HP blade rack would require dedicating the power capacity of 413 square feet of floor space--hardly a practical proposition.

The conclusion here is self-evident: High-density deployments will require a substantial investment in new power and cooling systems--AC, electrical wiring, UPSs, and generators. To that end, the Meta Group recommends that new data centers be designed to support 500W/sq ft. But rare is the data center that meets this criterion, and even one that did, once aisle clearances are factored in, would supply less than half the power our full rack of HP blade servers requires.
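
The floor-space arithmetic behind those figures is straightforward enough to sketch. The power densities come from the text; the rack footprint and aisle allowance in the second half are our own rough assumptions.

```python
rack_kw = 31.0            # "over 30kW" from the HP configurator example

# Power the rack needs, expressed as floor area at a given design density
for density_w_per_sqft in (75, 500):
    area_needed = rack_kw * 1000 / density_w_per_sqft
    print(f"At {density_w_per_sqft} W/sq ft: {area_needed:.0f} sq ft of floor "
          f"must be dedicated to this one rack")

# Flip it around: how much power does the space a rack actually occupies buy?
rack_footprint_sqft = 7      # ~2 ft x 3.5 ft cabinet (assumption)
aisle_share_sqft = 21        # front/rear aisle and clearance share (assumption)
usable_area = rack_footprint_sqft + aisle_share_sqft
budget_kw = usable_area * 500 / 1000
print(f"A {usable_area} sq ft slice of a 500 W/sq ft room supports ~{budget_kw:.0f} kW "
      f"-- less than half the {rack_kw:.0f} kW the full rack draws")
```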

Nonetheless, Frountan says Rackspace designs its facilities so that its racks can be filled. He agrees with the 500W/sq ft recommendation and offers some additional advice: "It's typical to design cooling systems with N+2 or N+3 redundancy, but what happens if three chillers or blowers next to each other fail? You'll also need to make sure adjacent cooling systems aren't on the same electrical bus." When sizing cooling systems for each rack, he recommends oversizing by 10 to 15 percent. "If you remove exactly 10kW of heat from a rack of equipment that produces 10kW, it will remain at temperature, but it won't cool down. If it gets up to 100 degrees while techs have doors open, the extra cooling will return it to operating temperature quickly."

SPREAD IT OUT

For existing data centers, the notion of fully packed racks pumping out trillions of compute cycles per second or storing hundreds of terabytes is an attractive one, but not very practical. A more realistic arrangement is to spread out the equipment and leave a good bit of open rack space.

We stated at the outset that existing power and cooling systems are, in a great many cases, adequate for the gross loads data centers present. What's new is that spot loads, both for cooling and power, can easily outstrip the design of current facilities. Power can be moved around, but cooling is another matter. In the good old days of mainframes, much of cabinet design was based not on the real estate the electronics required, but on their cooling requirements under typical data center designs. Given a set of processing and storage requirements, IBM not only happily mapped out the mainframe and storage systems needed, but also specified the physical systems of many data centers. That often included chilled water runs for CPU cabinets.

That was then. Now, however, the modular design of modern compute and storage systems makes such a specification almost impossible. First, there's the heterogeneous nature of the data center environment, which prevents vendors from reasonably specifying room designs for equipment other than their own. Then there's the fact that these systems can be arbitrarily located in racks. Vendors specify clearances, but position within the rack is just as important as proximity to other equipment. Systems higher in the rack are prone to pulling the already heated air from systems beneath them.

As a result, localized cooling systems from third parties are often the best way to support higher system densities. In many cases, this simply means ingress and egress fans that force cooled air into and pull hot air out of the racks. In some environments, specialized rack-based cooling systems may make sense. Such cooling systems may require external refrigerant compressors or chilled water to operate, so while the systems themselves are fairly easy to assemble, there's likely to be some plumbing work required as well.

Using these systems alone, vendors claim to support up to 10kW per rack. APC sells one such solution, which provides power and cooling for eight usable racks at up to 10kW each. The entire system has a footprint the size of 18 racks and a cool list price of $400,000. New plumbing and electrical work will add to the cost.
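
Some quick arithmetic on that list price, using only the figures quoted above, shows what self-contained high-density power and cooling works out to per kilowatt of supported load and how much of the footprint actually holds IT gear.

```python
usable_racks = 8
kw_per_rack = 10
footprint_in_rack_positions = 18
list_price = 400_000

supported_kw = usable_racks * kw_per_rack
print(f"Supported IT load: {supported_kw} kW")
print(f"Cost per supported kW: ${list_price / supported_kw:,.0f}")
print(f"Rack positions usable for IT gear: "
      f"{usable_racks / footprint_in_rack_positions:.0%} of the footprint")
```

At roughly $5,000 per supported kilowatt before the plumbing and electrical work, the convenience isn't cheap.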

THE BIG PICTURE

Given all this, there's no "right" design or retrofit for data centers. If you're building a new data center, it makes sense to plan for high-density power and cooling from the outset. But Rackspace's Frountan recommends against going overboard. "Cooling systems are always getting more efficient, so it's often cost-effective to plan for retrofits or added capacity in three to five years."

There's a lot of discussion among cooling and power system vendors and data center designers about just how to deploy power backup and cooling systems. No one doubts that oversizing UPSs, generators, and cooling systems costs money in terms of wasted energy, wasted battery life, and initial capital outlay. Yet the economic realities of project planning may simply require that new data centers be initially equipped with the power and cooling they're ever likely to need. It's just the nature of corporate politics that physical infrastructure is most easily budgeted when the building or remodeling actually occurs.

While a more modular design will allow for growth in reasonable increments, corporate overseers need to understand and agree that their investment in a data center isn't a one-time capital expense. They must recognize that incremental designs will lead to lower operating expenses initially, and that expansion will require new capital. If management doesn't think this way, then go for the whole data center enchilada at once.

PROCESSOR, COOL THYSELF

While there are certainly ways to meet almost any cooling requirement, it makes sense to conserve power to the extent possible. That rack of HP blade servers will run up a yearly electrical bill of between $13,000 and $26,000 depending on locality--and that's just for the IT gear. The cooling system will suck up about half as much electricity on top of that. Over a three-year useful life, electricity can easily be the most costly part of running this rack of gear. Luckily, Intel and AMD finally understand that power consumption in data centers is an issue.
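
Those dollar figures fall out of a simple energy calculation. The 30kW load and the cooling overhead of roughly half again come from the discussion above; the electricity rates are our assumptions, roughly in line with 2005 commercial rates.

```python
rack_kw = 30.0
hours_per_year = 24 * 365                 # the rack runs around the clock
kwh_per_year = rack_kw * hours_per_year   # 262,800 kWh

for rate in (0.05, 0.10):                 # assumed $/kWh, varies by locality
    it_cost = kwh_per_year * rate
    cooling_cost = it_cost * 0.5          # cooling draws roughly half as much again
    total_3yr = (it_cost + cooling_cost) * 3
    print(f"At ${rate:.2f}/kWh: IT gear ${it_cost:,.0f}/yr, "
          f"cooling ~${cooling_cost:,.0f}/yr, three-year total ~${total_3yr:,.0f}")
```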

There are essentially three things that CPU designers can do to decrease power demand without sacrificing performance. One very effective measure is to reduce the size of the transistor gates. Moving from a 130nm fabrication process to a 90nm one reduced the Itanium 2's power budget by about 25 percent. AMD did at least as well with its move to 90nm.

Of course, if simple power reduction were really the goal, CPU designers would have to resist the temptation to use the smaller transistors to pack more of them onto a chip. They'd also have to leave the clock rate where it is. These two things almost never happen, but still, smaller is better. The roadmap for integrated circuit fabrication processes calls for 65nm and eventually 35nm. Intel and AMD will likely use these new processes to further enhance performance and advance their multicore plans. It's fair to say, however, that both realize chip power budgets will remain approximately where they are--in the neighborhood of 100W per chip.

GOOD TO THE CORE

Last year, both AMD and Intel made the pivotal announcement that they would begin building multicore CPUs. While Intel's first dual-core chips are initially heading for desktop systems, the company will be producing dual-core Itanium and server-appropriate Xeon chips later this year or early next. AMD is pursuing the opposite path, shipping dual-core Opterons intended for multiprocessor systems now, with dual-core Athlon chips due out later.

On the power budget front, both Intel and AMD say their new dual-core chips will draw no more power than their single-core predecessors. To achieve this bit of magic, both companies lowered the CPU clock rate by about 20 percent. So while dual-core systems won't yield double the performance, they'll deliver a substantial boost without needing more power. Most server manufacturers are planning systems based on the dual-core parts. Sun Microsystems has been planning support from the outset and permits its Sun Fire servers to be field-upgraded to use AMD's new dual-core chips.
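
The reason a modest clock reduction buys so much headroom is that dynamic CMOS power scales roughly with capacitance times voltage squared times frequency (P ~ C x V^2 x f), and supply voltage can usually drop along with frequency. The sketch below is an idealized illustration of that relationship, not Intel or AMD data; in practice, process shrinks and shared on-chip resources do part of the work, so the real voltage reduction needed is smaller.

```python
import math

# Dynamic CMOS power: P ~ C * V^2 * f. Question: how far must voltage fall for
# two cores at 80 percent clock to fit in one full-speed core's power budget?
# All scaling here is an idealized approximation, not vendor data.

single_core_power = 100.0   # watts, a typical 2005 server-chip budget
freq_scale = 0.80           # dual-core parts clocked ~20 percent lower

# Equal budget requires: 2 * V_scale^2 * freq_scale = 1
volt_scale = math.sqrt(1.0 / (2 * freq_scale))

relative_throughput = 2 * freq_scale   # assuming well-threaded work

print(f"Voltage must scale to ~{volt_scale:.0%} of nominal")
print(f"Power stays at ~{2 * volt_scale**2 * freq_scale * single_core_power:.0f} W")
print(f"Parallel throughput rises to ~{relative_throughput:.1f}x")
```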

Intel's claims for the new Itanium dual-core chip are even grander. Because of its enhanced parallel processing capability, the 1.72 billion-transistor Montecito chip will boast 2.9 times the performance of its single-core ancestor--all while running at the same clock speed, but consuming about 25 percent less power. Intel accomplishes this counterintuitive feat with a new technology called Foxton.

Announced in February, Foxton is essentially Intel's SpeedStep technology on steroids. SpeedStep is the technology behind the radical drop in power consumption by Intel's mobile Pentium chips. Intel recently introduced Enhanced Intel SpeedStep for its higher-clock-speed Xeon chips. Where SpeedStep decreased processor speed and voltage together, Enhanced Intel SpeedStep allows for the two parameters to be adjusted independently.

Formerly known as Demand-Based Switching, Foxton takes Enhanced Intel SpeedStep up a notch by allowing voltage to be adjusted in 32 increments and clock frequency in 64. Decoupling voltage and clock speed transitions allows each to be made more quickly and more strategically, and the processor can dynamically change these parameters based on the instructions being executed.

Because of the sheer number of power states and clock rates, and the ability to switch among them quickly, power consumption can be regulated almost down to the instruction level. So while SpeedStep may have been analogous to turning off cylinders in an engine, Foxton is more like a gas pedal, allowing for more rapid performance changes. The result is the impressive power savings announced for the dual-core Itanium. Intel says the dual-core Xeon chips bound for servers will also get Foxton technology.
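
To see why so many discrete states matter, here's a toy sketch of the decision a Foxton-style controller makes: from a grid of voltage and frequency steps, pick the lowest-power setting that still meets the current performance demand. The 32-by-64 state counts come from the description above; the voltage and frequency ranges, the stability rule, and the selection policy are all hypothetical.

```python
# Hypothetical operating ranges -- illustrative only, not Intel specifications
V_MIN, V_MAX, V_STEPS = 0.9, 1.3, 32     # volts, 32 increments (per the article)
F_MIN, F_MAX, F_STEPS = 0.6, 1.8, 64     # GHz,   64 increments (per the article)

def states():
    """Enumerate every (voltage, frequency) operating point."""
    for i in range(V_STEPS):
        v = V_MIN + i * (V_MAX - V_MIN) / (V_STEPS - 1)
        for j in range(F_STEPS):
            f = F_MIN + j * (F_MAX - F_MIN) / (F_STEPS - 1)
            yield v, f

def pick_state(demand_ghz, v_per_ghz=0.55):
    """Lowest-power state meeting demand, subject to a (made-up) stability
    rule that frequency f needs at least v_per_ghz * f volts."""
    feasible = [(v * v * f, v, f) for v, f in states()
                if f >= demand_ghz and v >= v_per_ghz * f]
    return min(feasible)   # relative dynamic power ~ V^2 * f

for demand in (0.8, 1.2, 1.7):
    power, v, f = pick_state(demand)
    print(f"demand {demand:.1f} GHz -> run at {f:.2f} GHz, {v:.2f} V "
          f"(relative power {power:.2f})")
```

With roughly 2,000 possible operating points, such a controller can track demand in very fine steps--the "gas pedal" described above--rather than toggling between a handful of coarse modes.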

Not to be outdone, AMD has adapted its laptop PowerNow! technology for use on the Opteron. Dubbed PowerNow! with Optimized Power Management (OPM), the technology allows for independent control of voltage and clock frequency, each in 32 increments. PowerNow! with OPM should perform similarly to Foxton.

Both power management technologies operate partly on their own and partly with the involvement of the OS, and that OS support is only now arriving. In the future, Intel and AMD are likely to follow a course pioneered by IBM: shutting off various parts of the CPU when they're not in use. Banks of cache memory or floating point units, for example, can be powered down when idle. As the multicore strategy plays out, entire cores can be shut down when not needed.

COOL DRIVES

Drive makers are also getting into the act, offering a variety of power-saving states on their devices. Hitachi, for example, offers three lower-power states for its Serial ATA (SATA) drives: sending the heads to their rest position, slowing the drive mechanism, or stopping it altogether. The table above shows the four drive states offered on SATA drives, along with the resultant power savings and recovery times. Originally intended for Energy Star-compliant desktop computers, these modes can also be useful in some data center applications. Power-saving modes typically aren't available on Fibre Channel or Serial-Attached SCSI (SAS) drives; those drives are intended to run constantly, whereas SATA drives were designed to be frequently powered on and off.

In fact, according to Bob Wambach, senior product business manager at EMC, when it comes to trading off power usage for performance and reliability, the latter always wins. Though it's not specifically a power-saving initiative, Wambach sees EMC's Information Lifecycle Management (ILM) strategy as one that will also result in power savings. "Our tier-2 storage systems can not only use much higher capacity SATA drives, but they also employ smaller caches and less I/O processing power," he explains. Though EMC offers SATA drives, it doesn't presently take advantage of the low-power operation modes typically available on them.

The storage industry is, on the other hand, investigating storage models that will use dramatically less power for some applications. Massive Array of Idle Disks (MAID) systems are now offered by Copan Systems and others. Billed as an alternative to tape backup, these systems store data on high-capacity SATA drives, then shut off the drives until the data is accessed. The technology is specifically intended to save costs in terms of power and the amount of controller circuitry required.
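
A rough sketch of the savings argument for mostly idle disks: if a backup or archive tier only spins its drives up a small fraction of the time, energy use drops almost in proportion. The per-drive wattages and shelf size below are round-number assumptions for 2005-era SATA hardware, not vendor specifications.

```python
# Assumed per-drive draw (round numbers, not from any vendor datasheet)
ACTIVE_WATTS = 12.0      # spinning and seeking
STANDBY_WATTS = 1.0      # spun down, electronics idle

def yearly_kwh(drives, active_fraction):
    """Energy for a shelf of drives spun up only active_fraction of the time."""
    per_drive_watts = (ACTIVE_WATTS * active_fraction
                       + STANDBY_WATTS * (1 - active_fraction))
    return drives * per_drive_watts * 24 * 365 / 1000

drives = 224   # one densely packed SATA array (assumption)
always_on = yearly_kwh(drives, 1.0)
mostly_idle = yearly_kwh(drives, 0.05)   # spun up ~5 percent of the time

print(f"Always spinning: {always_on:,.0f} kWh/yr")
print(f"Spun up 5% of the time: {mostly_idle:,.0f} kWh/yr "
      f"({1 - mostly_idle / always_on:.0%} less)")
```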

POWER GRAB

While CPU and drive-power management are important, they aren't the whole story. The wisdom of generations of fathers can be applied to the data center--namely, if you aren't using it, turn it off. Efforts such as IBM's Autonomic Computing initiative seek to better instrument both systems and applications. On the systems side, the emphasis is on performance monitoring, imminent failure assessment, and power management. On the software side, much of the focus is on event notification and management.

The going is harder for these initiatives than it would be on proprietary hardware and OS combinations, where hardware and OS architects can agree on how to instrument the hardware and what to do with the resulting information. In the non-proprietary world, it becomes a matter of working with standards bodies--and most of those efforts end up being extended in proprietary ways anyway.

While getting consistent information on server system states can be a challenge, the task of determining policy adherence and the application-level effect of an action such as turning off a server is even more daunting. Vendors typically can't count on owning enough of the infrastructure to determine all the possible ramifications of actions such as a system shutdown. As a result, it's easier for events to trigger powering up new servers than it is to determine when it's safe to shut them off. Nonetheless, efforts in this regard are taking place in the Web services and Service-Oriented Architecture (SOA) standards groups.

Editor-in-Chief Art Wittmann can be reached at [email protected].

Risk Assessment: Data Center Power and Cooling

While power and cooling technologies are extremely mature, packaging and, to a certain degree, efficiency continue to evolve and improve. Component manufacturers are becoming keenly aware of the challenges that their devices present for power and cooling. Look for advances in power management technology and greater adoption by equipment vendors.

Depending on the age and condition of your existing data center, it may not be feasible or cost-effective to retrofit it to handle a large number of high-density servers and storage systems. Consider alternatives such as building a new remote data center or using colocation facilities to handle some of the high-density load.

If your data center is out of cooling capacity or can't meet spot cooling demands, you need to do something about it. As energy-saving features become more readily available on data center equipment, there's no excuse not to capitalize on them. There's a lot of money to be saved on electrical costs.

Perhaps the biggest risk is in not doing anything to manage power and cooling requirements and having your boss find out what your company could have saved. At least do the legwork to know how much of your capacity you're currently using. Most IT staffers aren't proficient here, so seek the help of professionals, particularly in designing retrofits.

Data Center Gotchas

I remember installing first-generation routers back in the 1980s. It struck me as funny that the front panels of the routers would stick in place without screws because of the vacuum produced by the fans--even then, cooling issues were a fact of life. In the 40 years that our industry has been building data centers and laying out equipment, there have been a lot of lessons learned. Here's an abridged list:

1) Watch for equipment with side-to-side or top-to-bottom airflow requirements. Devices that require loads of back-panel connections sometimes use side-to-side airflow for cooling. These devices will have special rackspace requirements.

2) Deploy rack rows in "hot aisle" and "cool aisle" configurations. That means equipment exhaust sides should face each other, as should equipment front panels in adjacent rows. This prevents equipment from pulling in air already warmed by other gear.

3) Unless the rack is equipped with its own air-handling equipment, don't use racks with solid doors. Air must be allowed to enter to cool the equipment. Glass doors look cool, but they turn racks into ovens.

4) Use racks with plenty of room for cable runs. Cables can obstruct airflow, so keep them neat and bundled to the side.

5) Use panel blanks in the front and back of the rack. Panel blanks prevent hot air from being sucked through the rack and back to the front side of the equipment.

6) Oversize cooling capacity by 10 to 15 percent per rack. You need enough capacity to lower the temperature of the equipment, not just maintain it.

7) Cut floor tiles carefully. Lopping an oversized hole in your floor tile for cable runs seems benign enough, but it can easily impair local airflow. Thus, cut with caution and use brushes or other airflow obstructers around wiring holes.

8) Leave enough room between racks. You need to be able to walk down both cold and warm aisles. More space means better airflow, too.

9) Buy a thermometer. Better yet, buy some sensors and deploy them so that airflow and temperature can be tracked. A fried server shouldn't be your first indication of trouble.

10) Get an audit. Particularly if you're planning for a pile of new high-density servers, you'll need an audit to determine the remaining power and cooling capacity.
