• 07/31/2014
    7:00 AM

New CPU Architectures Promise Performance Boost

Intel and others are developing chip structures that aim to unlock memory bottlenecks and improve interconnect capabilities.

Moore's Law is far from dead. While making transistors even smaller is becoming difficult, creative ways to speed up overall system performance are about to have a major impact on the design of CPU chips and the surrounding elements they control.

These new approaches will make it possible to continue, and perhaps even accelerate, system performance growth, so that new demands such as the Internet of Things can be handled economically. At the same time, the boundaries of high-performance computing will expand dramatically.

Much faster computing also will provide mobile interactions with much better graphics and a truly responsive speech recognition system. Internet searches and website loading should speed up, too.

The art of CPU design is to get as much functionality as possible as close together as feasible, with the fastest intercommunication, and without frying the chip. Until recently, this meant shrinking the die area each function occupies through die shrinks that use ever-smaller transistors. The external interconnect electronics on the die slowly expanded as a result, as faster connections were needed to keep up. Consequently, chips have gotten hotter, and the relationship between CPU cores and memory/IO has been moving out of balance for a while.

Within the CPU, we are seeing two major new initiatives to address this imbalance. The memory bottleneck is being addressed by a completely new serial channel approach, Hybrid Memory Cube, while interconnect is seeing changes both in external interconnect and in the way that cores and CPU modules talk to each other.

As a result of this, CPUs can sustain more cores within a given power envelope, but the better news is that the system-level performance will increase much more due to the memory and interconnect acceleration.

Hybrid Memory Cube envisions a 3D stack of memory devices mounted on a management die. The stacking process involves through-silicon vias -- connections through each layer of the stack.

Partnering with Micron, Intel recently announced some preliminary HMC products. The Knights Landing CPUs are 72-core versions of the Xeon Phi, which is aimed at HPC. These promise to have as much as 15x the memory bandwidth of today's DDR3. For applications that use an in-memory approach, such as databases, this will be a massive boost in performance.
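To put the 15x figure in perspective, here is a back-of-the-envelope sketch in Python. The baseline is an assumption, not something stated above: a single DDR3-1600 channel, whose theoretical peak is 12.8 GB/s (1600 mega-transfers per second times 8 bytes). The 256 GB dataset size and the function names are likewise illustrative, and sustained bandwidth in a real system will be lower than the theoretical peak.

```python
# Back-of-the-envelope estimate; every input here is an illustrative assumption.

# Assumed baseline: theoretical peak of one DDR3-1600 channel
# (1600 mega-transfers/s * 8 bytes per transfer), expressed in GB/s.
DDR3_1600_CHANNEL_GBPS = 1600e6 * 8 / 1e9

# "As much as 15x" is the upper bound quoted in the article, not a measurement.
HMC_SPEEDUP = 15.0


def scaled_bandwidth(baseline_gbps: float, factor: float = HMC_SPEEDUP) -> float:
    """Return the bandwidth obtained by applying a speedup factor to a baseline."""
    return baseline_gbps * factor


def scan_seconds(dataset_gb: float, bandwidth_gbps: float) -> float:
    """Return the time to stream a dataset through memory once."""
    if bandwidth_gbps <= 0:
        raise ValueError("bandwidth must be positive")
    return dataset_gb / bandwidth_gbps


if __name__ == "__main__":
    dataset_gb = 256.0  # hypothetical in-memory database size
    fast = scaled_bandwidth(DDR3_1600_CHANNEL_GBPS)
    slow_t = scan_seconds(dataset_gb, DDR3_1600_CHANNEL_GBPS)
    fast_t = scan_seconds(dataset_gb, fast)
    print(f"baseline: {slow_t:.1f} s per full scan")
    print(f"15x:      {fast_t:.1f} s per full scan")
```

Under these assumptions, a full scan of the hypothetical 256 GB dataset drops from about 20 seconds to a little over one second, which is why bandwidth-bound workloads such as in-memory databases stand to gain the most.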

The first of these HMC products will offer relatively small memory configurations, so memory mapping will still be critical, but we can expect a rapid evolution to terabyte-sized memories with terabyte-per-second performance. This will revolutionize high-end processing.

However, Intel doesn't have a monopoly on HMC. There's a large industry group working on the standards. Nvidia has talked up GPU solutions based on the same technology, and AMD is working on the issue. HMC technology may have two other flavors. One of the design concepts is that the memory, using up to 90% less power than traditional DIMMs, can sit on top of the CPU chip and connect to it with TSVs. This has applications in the space- and power-sensitive mobile market.

Another flavor is the idea of hybrid memory. It will take a while, but we can expect stacks to incorporate flash modules, which would outperform today's NVDIMMs and PCIe flash by large factors. The gating technology for this is 3D flash, which increases packing density enormously.

On the interconnection front, the internal links between memory and cores are getting a makeover. Intel is planning an on-chip silicon-photonic "Omni-Scale" fabric, which will speed I/O while radically lowering power on Knights Landing and more traditional Xeon processors.

Nvidia, which is also planning extensive use of HMC technology to overcome GPU memory bottlenecks, has announced its own inter-GPU connection plan. Though it is not as ambitious as Intel's Omni-Scale fabric, there is still a major performance boost involved, and implementation is likely to be easier.

Radical new approaches don't happen overnight, but the evolution of systems to much higher performance clearly will happen over the next few years. There will be major software impacts, both in the way operating systems tackle the hardware layer and in the way applications are tuned to take advantage of the new modes of operation.

Beyond these changes, research labs are looking at graphene interconnect and transistors, which would allow 3D processor die with much faster clock speeds than silicon. Flash will migrate to a far faster base technology, too. System innovation is back in high gear.


Ambient temperature and cpu performance

As core counts and CPU complexity increase, so does the need to dissipate heat away from the CPU, necessitating better ventilation and a cooler room temperature. Just wondering how well they would perform in areas where average temperatures are closer to 80 F.

Re: Ambient temperature and cpu performance

The whole of the issue is the maximum temperature at the inlet to the heat sink, which is usually the same as the inlet temperature of the system.

45W processors can run at 50C in a 1U cabinet with good airflow design. 75W CPUs need 2U, to allow tall heatsinks, and drop to around 45C inlet.

Anything above that generally needs a heatpipe cooler or a 3U cabinet.

These high-core solutions from Intel save on power by using low-power techniques for off-chip interconnect, so they will likely be around 75 - 95W.
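As a rough illustration, the rules of thumb above can be written down as a small Python lookup. This is only a sketch: treating 45W and 75W as hard thresholds, placing in-between wattages in the 2U bucket, and the function names themselves are all assumptions made for the example, not vendor specifications.

```python
from typing import Optional, Tuple


def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def cooling_guideline(tdp_watts: float) -> Tuple[str, Optional[float]]:
    """Return (enclosure, max inlet temperature in C) for a CPU power rating.

    The thresholds mirror the rules of thumb in this comment; treating them
    as hard cut-offs is an assumption made for illustration.
    """
    if tdp_watts <= 0:
        raise ValueError("tdp_watts must be positive")
    if tdp_watts <= 45.0:
        return ("1U with good airflow", 50.0)
    if tdp_watts <= 75.0:
        return ("2U with tall heatsink", 45.0)
    # No inlet figure is given above 75W, so none is invented here.
    return ("3U or heatpipe cooler", None)


def inlet_headroom_c(tdp_watts: float, ambient_f: float) -> Optional[float]:
    """Return degrees C between the ambient temperature and the inlet limit.

    Returns None when the guideline gives no inlet limit for this power level.
    """
    _, limit_c = cooling_guideline(tdp_watts)
    if limit_c is None:
        return None
    return limit_c - fahrenheit_to_celsius(ambient_f)


if __name__ == "__main__":
    for watts in (45.0, 75.0, 95.0):
        enclosure, _ = cooling_guideline(watts)
        print(watts, enclosure, inlet_headroom_c(watts, ambient_f=80.0))
```

An 80 F room is about 26.7 C, so under these assumptions a 75W part with a 45 C inlet limit would still have roughly 18 C of headroom, provided the inlet air really is at room temperature.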

Re: Ambient temperature and cpu performance

I wonder how long it will be before we get to say, "Back in the old days, a supercomputer would fill an entire room. Now it fits in this relatively small box on that table over there."

I for one am rooting for Moore's law to continue its course for many years to come.

Re: Ambient temperature and cpu performance

If room-temperature superconductors (or even liquid-nitrogen-temperature ones) work, or if graphene comes of age, we might get that refrigerator-sized supercomputer.

With low-resistance interconnect, small is good!

Re: Ambient temperature and cpu performance

What are your thoughts on the new players in this market, the Qualcomms and Nvidias of the world?

Re: Ambient temperature and cpu performance

Nvidia has a very nice niche and is exploiting it well. If dedicated GPU stacks with, say, an ARM for housekeeping work better than AMD's APU approach, Nvidia will do well.

Qualcomm is a much harder read. They haven't defined their niche yet.

Re: Ambient temperature and cpu performance

@ joereilly925 - Thank you for your input. I get the Nvidia comment, and maybe I am not as well informed, but I always looked at Qualcomm as a wireless play.

Re: Ambient temperature and cpu performance

Qualcomm's chip business is broadening, with single-chip solutions for mobile

Re: Ambient temperature and cpu performance

The future of the datacenter might very well become interesting and progressive, thanks to HMC. I wonder if HMC or another new architecture can be implemented for end devices, i.e., PCs, mobile devices, SoCs, etc.

Re: Ambient temperature and cpu performance

It comes down to the economics of the through-silicon via process. If it's cheap enough, we'll see purpose-built stacks of CPU, DRAM and flash in phones and tablets. The space and power savings would be a big plus in mobiles.

road maps

Jim, when do you expect the first HMC products to arrive on the market?

Re: road maps

Micron has early versions of the memory stack, but they are low capacity, and more a proof of concept than a mainstream product. I think the pace will pick up in a few months' time. Intel's next-gen very-high-core processor will use the technology.