IPUs (Infrastructure Processing Units) take their place in the data center alongside other specialty processors designed to accelerate workloads and offload tasks traditionally performed by Central Processing Units (CPUs). In much the same way Graphics Processing Units (GPUs) are leveraged to speed up non-graphic calculations involving highly parallel problems due to their parallel structure, IPUs, introduced by Intel, accelerate network infrastructure and infrastructure services like virtual switching. Shifting these operations to a dedicated processor designed specifically to handle these tasks frees up CPU cycles. The end result is improved application performance and the ability to run more workloads with fewer CPUs.
Diving into the World of IPU
An IPU, like a Data Processing Unit (DPU) and Compute Express Link (CXL), makes a new type of acceleration technology available in the data center. GPUs, FPGAs, ASICs, and other hardware accelerators offload computing tasks from CPUs. IPUs, DPUs, and CXL instead focus on speeding up data handling, movement, and networking chores.
There is growing interest in the general area of infrastructure task acceleration. In the last year, many vendors have introduced solutions that try to address the same issues.
For example, NVIDIA recently introduced a SuperNIC, which it described as a “new class of networking accelerator designed to supercharge AI workloads in Ethernet-based networks.” It is designed to provide ultra-fast networking for GPU-to-GPU communications.
Additionally, there are many other new accelerators designed to speed up particular workloads. Examples include Graphcore’s Intelligence Processing Unit (also called an IPU) and Google Cloud’s Tensor Processing Unit (TPU).
Understanding the Basics of an IPU
An IPU is a specially designed networking device that includes accelerator elements and Ethernet connectivity. Its aim is to use dedicated programmable cores to accelerate and manage infrastructure functions.
It is often compared to a SmartNIC in that both have comparable networking and offloading features and capabilities. However, an IPU offloads all infrastructure functions from a CPU, whereas a SmartNIC works as a peripheral to a CPU.
The Importance of an IPU in Today's Tech Landscape
Two general changes have occurred that make IPUs (and other solutions that do similar things) a necessity.
The first is the wide-scale adoption of virtualization and software-defined technology in data centers. Tasks that used to be carried out manually or hard-wired in are now done by switching operations from one state to another in software. That involves performing tasks and moving data, infrastructure chores that consume a large number of CPU cycles in traditional servers and switches.
The other is the change in data center traffic flows due to new application architectures. There has been a shift from traditional client-server applications to more cloud-native, microservices-based apps and services. These apps and services create significant amounts of what is called east-west traffic. Essentially, there are great volumes of server-to-server traffic within and traversing a data center. That data is moved by network controllers, virtual machines, and other devices. These devices typically perform, in software, various functions that previously ran on physical hardware, consuming many CPU cycles.
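To make the east-west idea concrete, here is a minimal sketch that labels a flow as east-west when both endpoints sit inside the data center and as north-south otherwise. The address ranges, the sample flows, and the function names are assumptions chosen for illustration. They do not come from any particular product.

```python
import ipaddress

# Hypothetical address ranges owned by the data center (illustrative assumption).
DATA_CENTER_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_internal(address: str) -> bool:
    """Return True if the address falls inside one of the data center ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in DATA_CENTER_NETWORKS)


def classify_flow(src: str, dst: str) -> str:
    """Label a flow east-west (server to server) or north-south (crosses the edge)."""
    if is_internal(src) and is_internal(dst):
        return "east-west"
    # Any flow with at least one external endpoint is treated as north-south here.
    return "north-south"


# Made-up sample flows as (source, destination) pairs.
flows = [
    ("10.1.2.3", "10.4.5.6"),
    ("10.1.2.3", "203.0.113.7"),
    ("192.168.1.10", "10.0.0.5"),
]

east_west_share = sum(
    classify_flow(src, dst) == "east-west" for src, dst in flows
) / len(flows)
```

In this made-up sample, two of the three flows never leave the data center. That server-to-server traffic is the kind that lands on virtual switches and other software-based infrastructure functions.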
The impact of these two changes on CPUs can be offset using IPUs.
Anatomy of an IPU
IPUs typically combine FPGAs, ASICs, and other accelerators with processor cores. They essentially set up a hardware-based data path that handles infrastructure processing chores at the speed of hardware (rather than performing these tasks in software). That allows a system to keep data moving as networking speeds increase.
Main Features that Make IPU Stand Out
From a conceptual standpoint, IPUs have several distinct architectural components. One element of an IPU is an intelligent infrastructure accelerator. Both the hardware and software elements are programmable, so an IPU can be customized to meet the performance requirements of different applications and environments.
The elements are combined on a single card that includes a high-speed Ethernet controller and a programmable data path. Such a product lets a vendor match and optimize hardware components and software to each application or service the IPU is intended to support. Some call this capability a function-based infrastructure.
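One common way to think about a programmable data path is as a match-action table: a packet's header fields are looked up in a table, and a matching entry decides what happens to the packet. The sketch below models that idea in software, purely as an illustration. The article does not describe how any specific IPU implements its data path, and the class, field, and action names here are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

Packet = Dict[str, object]
FlowKey = Tuple[str, int]  # (destination IP, destination port)
Action = Callable[[Packet], str]


def forward_to(port: int) -> Action:
    """Build an action that forwards a packet out of the given port."""
    def action(packet: Packet) -> str:
        return f"forward:{port}"
    return action


def drop(packet: Packet) -> str:
    """Action that discards the packet."""
    return "drop"


class DataPath:
    """Toy model of a programmable match-action data path (illustrative only)."""

    def __init__(self) -> None:
        self.table: Dict[FlowKey, Action] = {}

    def program(self, key: FlowKey, action: Action) -> None:
        """Install or replace a table entry; this is the 'programmable' part."""
        self.table[key] = action

    def process(self, packet: Packet) -> str:
        """Look up the packet and apply the matching action, if any."""
        key = (str(packet["dst_ip"]), int(packet["dst_port"]))
        action: Optional[Action] = self.table.get(key)
        if action is None:
            # No entry: hand the packet to general-purpose cores (slow path).
            return "punt_to_cores"
        return action(packet)


data_path = DataPath()
data_path.program(("10.0.0.5", 443), forward_to(2))
data_path.program(("10.0.0.9", 23), drop)

result = data_path.process({"dst_ip": "10.0.0.5", "dst_port": 443})
```

In real hardware the lookup runs in dedicated logic rather than a Python dictionary, which is how a data path can keep up with line rate. The sketch only shows the control flow.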
How does an IPU Work?
A good way to see how an IPU works is to compare, at a very abstract level, a data center structure with and without IPUs. For this comparison, let's look at three key elements of a server.
The server hardware in a traditional data center has general-purpose compute components (CPUs) and NICs that connect to a virtualized network. The CPUs perform many infrastructure tasks. In an IPU-centric data center, those tasks and the functions of the NICs are handled by an IPU.
As data volumes grow and data transmission rates increase, there is a huge increase in the number of packets transferred per second. That strains the capabilities of the NIC. And as mentioned above, there is a growing use of software-defined networking (SDN). In such scenarios, CPUs perform virtual switching, load balancing, encryption, packet inspection, and other I/O-intensive tasks. These networking tasks can consume up to 30 percent of a server's CPU capacity.
In an IPU-centric data center, the overhead associated with running infrastructure tasks is offloaded from the server's CPUs to the IPU, which uses ASICs or FPGAs to accelerate those infrastructure chores.
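A quick back-of-the-envelope calculation shows what that offload could mean for capacity. The sketch below takes the "up to 30 percent" figure as an upper bound and assumes, for illustration only, a 64-core server and a complete offload. The function and parameter names are hypothetical, and real savings depend on the workload.

```python
def cores_reclaimed(total_cores: int, infra_fraction: float,
                    offload_fraction: float = 1.0) -> float:
    """Cores freed for applications when infrastructure work moves to an IPU.

    infra_fraction:   share of CPU capacity spent on infrastructure tasks.
    offload_fraction: share of that infrastructure work the IPU takes over.
    """
    if not 0.0 <= infra_fraction < 1.0:
        raise ValueError("infra_fraction must be in [0, 1)")
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be in [0, 1]")
    return total_cores * infra_fraction * offload_fraction


def app_capacity_gain(infra_fraction: float,
                      offload_fraction: float = 1.0) -> float:
    """Relative increase in CPU capacity available to applications."""
    if not 0.0 <= infra_fraction < 1.0:
        raise ValueError("infra_fraction must be in [0, 1)")
    if not 0.0 <= offload_fraction <= 1.0:
        raise ValueError("offload_fraction must be in [0, 1]")
    before = 1.0 - infra_fraction
    after = 1.0 - infra_fraction * (1.0 - offload_fraction)
    return after / before - 1.0


# Assumed scenario: 64-core server, 30% infrastructure overhead, full offload.
freed = cores_reclaimed(total_cores=64, infra_fraction=0.30)
gain = app_capacity_gain(infra_fraction=0.30)
```

Under those assumptions, about 19 of the 64 cores return to application work, and the capacity available to applications grows by roughly 43 percent (from 70 percent of the server to all of it). A partial offload scales the result down proportionally.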
IPUs in Action
A single IPU can perform multiple acceleration functions, depending on its design. For example, a common scenario might use an IPU to:
- Accelerate networking by offloading virtual switching, which is quite common in software-defined and virtualized systems. Those tasks are normally performed by the processor(s) running the application.
- Accelerate storage by transferring the storage stack from the host application processor onto the IPU.
- Accelerate security by offloading encryption/decryption, compression, and other security functions that would otherwise be performed by the CPU.
- Handle all infrastructure processing tasks, offloading them from the application processor down to the IPU.
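The division of labor in the list above can be sketched as a simple placement rule: infrastructure tasks that the IPU supports run on the IPU, and everything else stays on the host CPU. The task categories mirror the list. The class names, task names, and capability set are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Set


class Target(Enum):
    HOST_CPU = "host_cpu"
    IPU = "ipu"


@dataclass(frozen=True)
class Task:
    name: str
    kind: str  # e.g. "virtual_switching", "storage_stack", "application"


# Capabilities of a hypothetical IPU; a real device's mix depends on its design.
IPU_CAPABILITIES: Set[str] = {
    "virtual_switching",  # networking offload
    "storage_stack",      # storage offload
    "encryption",         # security offload
    "compression",
}


def place(task: Task, ipu_capabilities: Set[str]) -> Target:
    """Send a task to the IPU if the IPU supports its kind, else keep it on the CPU."""
    return Target.IPU if task.kind in ipu_capabilities else Target.HOST_CPU


def plan(tasks: List[Task], ipu_capabilities: Set[str]) -> Dict[str, Target]:
    """Map each task name to the processor that would run it."""
    return {task.name: place(task, ipu_capabilities) for task in tasks}


tasks = [
    Task("vswitch", "virtual_switching"),
    Task("block-storage", "storage_stack"),
    Task("tls-termination", "encryption"),
    Task("web-app", "application"),
]

placement = plan(tasks, IPU_CAPABILITIES)
```

With this rule, only the application task remains on the host CPU. Passing an empty capability set models a server without an IPU, where every task falls back to the CPU.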
IPU vs. CPU/GPU
What makes an IPU different from a regular CPU/GPU?
IPUs are specially designed to accelerate infrastructure chores. They are commonly customized depending on the application. That means one IPU might contain a different mix of ASICs, FPGAs, or other processing elements than another IPU.
The hardware acceleration of IPUs has one role: to offload compute-intensive infrastructure tasks from CPUs. In contrast, CPUs and GPUs can support many different functions. For example, a GPU might be used to accelerate graphics generation in one application or speed machine learning training in another.
Simply put, the use of IPUs offers significant improvements in overall system performance and utilization. That translates to reduced latency from an end-user perspective for the same compute load. Additionally, an enterprise or cloud provider gets higher resource efficiency since more work can be accommodated on the same resources.
Are there any disadvantages of using an IPU over a CPU/GPU?
There are two main issues where CPUs/GPUs have an advantage over IPUs.
First, IPUs are designed to accelerate specific functions (e.g., packet processing, traffic shaping, security, and virtual switching). They use different combinations of ASICs, processing cores, and FPGAs depending on the applications they will serve and the functions they are designed to accelerate. So, an IPU installed for one scenario might not be the right fit for another.
Second, many of the elements (e.g., the ASICs and FPGAs) might need to be programmed to customize how they work. Many organizations may lack expertise in that level of programming. To be fair, CPU cores and GPUs have similar issues. However, more organizations have the programming skills and expertise to work with CPUs and GPUs.