NHR@KIT launches first NVIDIA Arm Development Cluster worldwide

The new NVIDIA-Arm cluster joins an ever-growing number of development systems for future technologies.

One of the new NVIDIA Arm HPC Developer Kits. The two NVIDIA A100 GPUs are at the top and the two BlueField-2 DPUs are on the right. The Arm processor is underneath the board at the bottom. (Picture: S. Raffeiner)

Whether in engineering, the life sciences, astrophysics or materials research, cutting-edge research in many fields is only possible with powerful supercomputers. The National High-Performance Computing Center at the Karlsruhe Institute of Technology (NHR@KIT) operates several high-performance computing systems for researchers from all over Germany.
Over the past decade, the majority of the supercomputers in the world have relied on just two different hardware architectures from three different manufacturers. Intel and AMD dominate the market for high-performance processors (CPUs) with their so-called "x86" architecture, while accelerator chips (GPUs) are almost always supplied by NVIDIA. This "monoculture" makes it easier for users to switch between systems, but the potential of alternative architectures, which may be able to achieve much higher performance and energy efficiency, remains untapped. 

To tap this potential, it is important to give users and operators the opportunity to evaluate these alternative architectures easily and under real conditions. A central component of NHR@KIT is therefore the so-called "Future Technologies Partition", a hardware and software test bed for novel, disruptive technologies that have not yet achieved market penetration and are therefore not yet available in the large high-performance computers. This category also includes processors based on the Arm architecture. Arm chips are no longer used only in mobile devices, but also, for example, in the Japanese "Fugaku", currently the fastest supercomputer in the world, and in current Apple systems.

NVIDIA is planning to enter the market for high-performance processors based on the Arm architecture in 2023. These chips are to be used together with its next GPU generation, codenamed "Hopper", in future supercomputers. To allow applications to be ported right now, NVIDIA offers its partners special development kits. Each of these kits consists of an Arm processor from Ampere Computing with 80 CPU cores, two NVIDIA A100 accelerators and two BlueField-2 Data Processing Units (DPUs) with InfiniBand connections.

One of the racks of the Future Technologies Partition, where the new systems were installed.

"The transferability of the results obtained in the Future Technologies Partition is very important to us," said Simon Raffeiner, HPC Operations Manager at NHR@KIT. "Most of the computations on the large main systems like HoreKa run on more than a single server system at the same time. That's why KIT is the only site in the world that has procured not just a single NVIDIA Arm HPC Developer Kit, but an entire cluster. Only in this way is it possible for our users to test their applications under realistic conditions."

The new systems join a steadily growing number of development systems in the Future Technologies Partition. These include special accelerators for artificial intelligence and machine learning (AI/ML) from Graphcore, existing Arm systems from other manufacturers and new types of all-flash data storage.

"We try to staff the Future Technologies Partition symmetrically, as far as possible," Raffeiner explains further. "For example, if there is a system with an x86 CPU and NVIDIA GPUs, there is also a system with an Arm CPU and NVIDIA GPUs." To round out the current matrix of systems, Arm systems with AMD GPUs are also in influx - a combination that is not in use anywhere else. "We're also porting our own cluster software stack to the Arm architecture, so that the differences on the software side are as small as possible."

Ideally, users should not immediately notice that they are using a different hardware architecture and should be able to focus on porting and optimising their applications. But there is often a long way to go before that is the case, Raffeiner says. "For example, we can currently only access one of the large parallel file systems using a workaround because the manufacturer does not yet offer its software for Arm systems. Here we are jointly working on a solution." The result will then also benefit other operators who switch to such novel architectures only much later.

The new systems are currently being equipped with suitable software in cooperation with NVIDIA and are expected to be available to users in a few weeks. For more information on the Future Technologies Partition, HoreKa and the National High-Performance Computing Center NHR@KIT, please visit www.nhr.kit.edu/

Contact at SCC: Simon Raffeiner



Achim Grindler