07:00 AM
Jim O'Reilly
Commentary

New CPU Architectures Promise Performance Boost

Intel and others are developing chip structures that aim to unlock memory bottlenecks and improve interconnect capabilities.

Moore's Law is far from dead. While making transistors even smaller is becoming difficult, creative ways to speed up overall system performance are about to have a major impact on the design of CPU chips and the surrounding elements they control.

These new approaches will make it possible to sustain, and perhaps even accelerate, system performance growth, so that new demands such as the Internet of Things can be handled economically. At the same time, the boundaries of high-performance computing will expand dramatically.

Much faster computing also will provide mobile interactions with much better graphics and a truly responsive speech recognition system. Internet searches and website loading should speed up, too.

The art of CPU design is to get as much functionality as possible as close together as feasible, with the fastest intercommunication, and without frying the chip. Until recently, this meant shrinking the die area used by each function through die shrinks with ever-smaller transistors. The external interconnect electronics on the die slowly expanded as a result, as faster connections were needed to keep up. Consequently, chips have gotten hotter, and the balance between CPU cores and memory/IO has been drifting out of line for a while.

Within the CPU, we are seeing two major new initiatives to address this imbalance. The memory bottleneck is being addressed by a completely new serial channel approach, Hybrid Memory Cube, while interconnect is seeing changes both in external interconnect and in the way that cores and CPU modules talk to each other.

As a result, CPUs can sustain more cores within a given power envelope. The better news is that system-level performance will increase by much more, thanks to the memory and interconnect acceleration.

Hybrid Memory Cube envisions a 3D stack of memory devices mounted on a management die. The stacking process involves through-silicon vias -- connections through each layer of the stack.

Partnering with Micron, Intel recently announced some preliminary HMC products. The Knights Landing CPUs are 72-core versions of the Xeon Phi, which is aimed at HPC. These promise to have as much as 15x the memory bandwidth of today's DDR3. For applications that use an in-memory approach, such as databases, this will be a massive boost in performance.

The first of these HMC products will offer relatively small memory configurations, so memory mapping will still be critical, but we can expect a rapid evolution to terabyte-sized memories with terabyte-per-second performance. This will revolutionize high-end processing.
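To put the "15x" and "terabyte-per-second" figures in perspective, here is a rough back-of-envelope sketch. The DDR3 baseline is an assumption (the peak rate of a single DDR3-1600 channel, 12.8 GB/s), since the article does not say which DDR3 configuration the comparison uses, and `scan_seconds` is a hypothetical helper that ignores latency and access patterns.

```python
# Back-of-envelope sketch. The DDR3 baseline is an assumption, not a figure
# from the article: one DDR3-1600 channel peaks at 1600 MT/s x 8 bytes.
DDR3_1600_CHANNEL_GBPS = 12.8
HMC_SPEEDUP = 15  # "as much as 15x", per the article


def scan_seconds(dataset_gb: float, bandwidth_gbps: float) -> float:
    """Time to stream a dataset once at a given sustained bandwidth."""
    return dataset_gb / bandwidth_gbps


hmc_gbps = DDR3_1600_CHANNEL_GBPS * HMC_SPEEDUP

# Compare how long one pass over a 1 TB (1000 GB) in-memory dataset takes.
for label, bandwidth in [
    ("DDR3-1600, one channel", DDR3_1600_CHANNEL_GBPS),
    ("15x that baseline", hmc_gbps),
    ("1 TB/s target", 1000.0),
]:
    seconds = scan_seconds(1000.0, bandwidth)
    print(f"{label:>24}: {bandwidth:7.1f} GB/s, 1 TB scan in {seconds:6.2f} s")
```

Real systems use several memory channels and rarely sustain peak rates, so the absolute numbers are illustrative only. The ratio between the rows is the point.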

However, Intel doesn't have a monopoly on HMC. There's a large industry group working on the standards. Nvidia has talked up GPU solutions based on the same technology, and AMD is working on the issue. HMC technology may have two other flavors. In one design concept, the memory, which uses up to 90% less power than traditional DIMMs, sits on top of the CPU chip and connects to it with TSVs. This has applications in the space- and power-sensitive mobile market.

Another flavor is the idea of hybrid memory. It will take a while, but we can expect stacks to incorporate flash modules, which would outperform today's NVDIMMs and PCIe flash by large factors. The gating technology for this is 3D flash, which increases packing density enormously.

On the interconnection front, the internal links between memory and cores are getting a makeover. Intel is planning an on-chip silicon-photonic "Omni-Scale" fabric, which will speed I/O while radically lowering power on Knights Landing and more traditional Xeon processors.

Nvidia, which is also planning extensive use of HMC technology to overcome GPU memory bottlenecks, has announced its own inter-GPU connection plan. Though it is not as ambitious as Intel's Omni-Scale fabric, there is still a major performance boost involved, and implementation is likely to be easier.

Radical new approaches don't happen overnight, but the evolution of systems to much higher performance clearly will happen over the next few years. There will be major software impacts, both in the way operating systems tackle the hardware layer and in the way applications are tuned to take advantage of the new modes of operation.

Beyond these changes, research labs are looking at graphene interconnects and transistors, which would allow 3D processor dies with much faster clock speeds than silicon. Flash will migrate to a far faster base technology, too. System innovation is back in high gear.

Jim O'Reilly was Vice President of Engineering at Germane Systems, where he created ruggedized servers and storage for the US submarine fleet. He has also held senior management positions at SGI/Rackable and Verari; was CEO at startups Scalant and CDS; headed operations at PC ...
Comments
joreilly925
User Rank: Ninja
7/31/2014 | 7:31:38 PM
Re: Ambient temperature and cpu performance
The key issue is the maximum air temperature at the inlet to the heat sink, which is usually the same as the inlet temperature of the system.

45W processors can run at 50C in a 1U cabinet with good airflow design. 75W CPUs need 2U, to allow tall heatsinks, and drop to around 45C inlet.

Anything above that generally needs a heatpipe cooler or a 3U cabinet.

These high-core solutions from Intel save on power by using low-power techniques for off-chip interconnect, so they will likely be around 75 - 95W.
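
The rule of thumb in this reply can be sketched as a small lookup, applied to the 80 F case raised in the question. This is only an illustration: treating 45W and 75W as upper thresholds is an assumption (the reply gives them as data points), the room temperature is assumed to equal the inlet temperature, and `cooling_guidance` is a hypothetical helper.

```python
# Sketch of the rule of thumb from the reply above. Treating 45 W and 75 W as
# upper thresholds is an assumption; the reply only gives them as data points.

def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# (max CPU watts, suggested chassis, approximate max inlet temperature in C)
AIR_COOLING_TIERS = [
    (45.0, "1U", 50.0),
    (75.0, "2U with tall heatsinks", 45.0),
]


def cooling_guidance(cpu_watts: float, inlet_c: float) -> str:
    """Return a rough chassis suggestion for a CPU power and inlet temperature."""
    for max_watts, chassis, max_inlet_c in AIR_COOLING_TIERS:
        if cpu_watts <= max_watts:
            if inlet_c <= max_inlet_c:
                return chassis
            return f"{chassis}, but inlet exceeds ~{max_inlet_c:.0f}C"
    # Anything above the last tier needs more than a standard air-cooled 2U.
    return "heatpipe cooler or 3U"


inlet_c = fahrenheit_to_celsius(80.0)  # the 80 F case raised in the question
for watts in (45.0, 75.0, 95.0):
    print(f"{watts:.0f}W at {inlet_c:.1f}C inlet: {cooling_guidance(watts, inlet_c)}")
```

Under these assumptions, an 80 F room is roughly 27C, well below the 45-50C inlet limits quoted, so the chassis height rather than the room temperature would be the deciding factor.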
tekedge
User Rank: Apprentice
7/31/2014 | 6:47:03 PM
Ambient temperature and cpu performance
As core counts and CPU complexity increase, so does the need to dissipate heat away from the CPU, necessitating better ventilation and a cooler room temperature. Just wondering how well these chips would perform in areas where average temperatures are closer to 80 F.