Special Coverage Series

Network Computing


In the Cloud, Distance Matters for Compute Efficiency

Researchers from UC San Diego and Google found cloud-based applications will run as much as 20% more efficiently when data is located close by.

Researchers studying the architecture of cloud-computing services have found that cloud applications run up to 20% more efficiently when the data they need to access is located nearby.

The finding may help to create more efficient data-center and cloud-based computing systems, but it strikes a blow to the notion of the cloud as a ubiquitous, universally accessible computing resource that makes the location of processors and data irrelevant.

The finding is the result of tests conducted in cooperation with Google by researchers Lingjia Tang and Jason Mars of the Jacobs School of Engineering at UC San Diego. The tests were conducted on Google Web servers, whose performance and data-access status were measured in real time.

The two found that software running on Google cluster servers ran significantly faster and more efficiently when the data it used was stored close by rather than in a remote location. They tested Gmail, search and other applications running in a warehouse-sized Google server installation, then compared those results with tests on similar servers running in isolation rather than as part of a cloud.

They found that, in large clusters, long distances between server and data caused apps to run more slowly because individual processes had to wait longer for data they requested to arrive in cache where it could be processed.

"It's an issue of distance between execution and data," Tang said in an announcement from UC San Diego.

By testing the apps in isolation, where it was easier to identify confounding factors that might have come from other servers, other applications or the network, the two discovered that competition for computing resources within the server--especially competition for space in the CPU cache--also plays a major role. However, distance remains the primary factor affecting efficiency.

On multicore systems, applications running on one core will run more slowly if they have to access data accessible through controllers running on another core, they found. Loading the data into RAM, as most applications do, makes it available to apps running on any core. However, applications will still show more latency when they have to use data controlled by software running on a different core.

Part of the reason appears to be latency due to distance, even the almost negligible distance between processor and data inside a single server; part is competition for space on the bus connecting the various cores and for space in the cache, the two found.

Most of the physical servers on which the cloud is built use Non-Uniform Memory Access (NUMA), an architecture designed to allow efficient multiprocessing using servers with multiple cores or clusters with many servers.

Under NUMA, a processor can access its own on-board memory, or the memory in its own server, more quickly than it can access memory caches on another chip or another server. NUMA compensates for that lag by allowing the processor core to switch tasks to favor threads running on local memory while it waits for responses from tasks using more distant memory.
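The cost of that local-versus-remote gap can be sketched as a weighted average. The short Python example below is only an illustration: the function name and the 100 ns and 150 ns latencies are assumptions chosen for the example, not figures from the study.

```python
def average_access_ns(local_fraction, local_ns, remote_ns):
    """Expected memory latency when only part of the accesses stay on the local node."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


# Hypothetical latencies (assumed for illustration): 100 ns local, 150 ns remote.
all_local = average_access_ns(1.0, 100.0, 150.0)
half_remote = average_access_ns(0.5, 100.0, 150.0)

# Ratio of the two averages: how much slower memory looks when half the accesses are remote.
slowdown = half_remote / all_local
print(all_local, half_remote, slowdown)
```

Under these assumed numbers, sending half of the accesses to a remote node raises the average latency by a quarter, which is why placement matters even inside one server.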

That makes multiprocessing possible in x86-based servers, but doesn't compensate for the greater distances involved in cloud-based computing. The greater the distance between executable code in active memory and the processor running it, the greater the lag time, Tang and Mars found.

The lag for applications using memory on another core of their own server is almost unnoticeable; applications using memory or processors in another data center, at the far end of I/O buses, Ethernet switches and WAN connections, will almost always run more slowly.

The only exception is when all the threads running on a processor are local and have to fight for space in the processor's memory cache. Latency from those collisions can be greater than the lag caused by distance alone, so an application can come out ahead if it is designed to spread its threads among many processors and memory caches to reduce the level of conflict, the report states.
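The trade-off between cache contention and distance can be illustrated with a toy comparison. The Python sketch below is not the researchers' model: the additive slowdown formula and every penalty value are assumptions picked only to show how a spread-out placement could beat a packed one.

```python
def runtime(base_s, contention_penalty, remote_penalty):
    """Toy model: each penalty is a fractional slowdown added to the base runtime."""
    return base_s * (1.0 + contention_penalty + remote_penalty)


# Assumed values: packing all threads on one processor avoids remote access
# but causes heavy cache contention; spreading them trades some contention
# for some remote-access lag.
packed = runtime(10.0, contention_penalty=0.30, remote_penalty=0.0)
spread = runtime(10.0, contention_penalty=0.05, remote_penalty=0.15)

better = "spread" if spread < packed else "packed"
print(packed, spread, better)
```

With different assumed penalties the packed placement would win instead, which is the point: neither locality nor dispersion is best in every case.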

Working from their results, Mars and Tang created a metric called the NUMA Score that measures the amount of dispersion and potential for added latency in applications running on cloud or multicore systems. Keeping the NUMA Score within the right parameters can improve efficiency and speed by 15% to 20%, Mars said.
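The article does not say how the NUMA Score is calculated, so the following is not the researchers' formula. As a stand-in, this Python sketch computes a simple locality fraction: the share of a task's memory accesses that land on the same node as the CPU running it. The function name, the node numbering and the access counts are all hypothetical.

```python
def locality_score(accesses_by_node, cpu_node):
    """Fraction of memory accesses served by the node the task runs on (1.0 = fully local)."""
    total = sum(accesses_by_node.values())
    if total == 0:
        return 1.0  # no memory traffic, so nothing is remote
    return accesses_by_node.get(cpu_node, 0) / total


# Hypothetical counts: the task runs on node 0; 750 accesses are local, 250 go to node 1.
score = locality_score({0: 750, 1: 250}, cpu_node=0)
print(score)
```

A scheduler could, in principle, watch a number like this and move threads or data when it drops, but how the actual NUMA Score is defined and used is described only in the report itself.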

See a quick-view copy of the report here, or download it here as a PDF.

The score only measures how efficiently servers, processors and data are located; it doesn't map a cloud installation in detail to show which servers are using which pools of data or the ideal location to which applications, servers or databases must be moved to make them run as efficiently--as quickly--as possible.

Moving the data, applications or physical systems across warehouse-sized data centers--or even knowing for sure which applications are accessing which pools of data and when--could be trickier than measuring the end results, however. Tang and Mars presented their findings at an IEEE meeting in China last month and will present them again at UCSD's Research Expo on April 18.


