Algorithms and methods from the fields of Artificial Intelligence and Machine Learning (AI/ML) are taking on an increasingly important role in modern science. The successful application of these methods depends on a very large amount of computing power. Since the performance of conventional standard processors is usually not sufficient, researchers and data center operators often rely on special accelerator processors that can perform the required computing operations (such as matrix multiplications) very quickly.
Up to now, these processors have almost exclusively been close relatives of the graphics processors used in conventional PCs and gaming consoles. The internal structure of these chips is also well suited to many scientific calculations, including those in the field of AI/ML. Data centers, however, use more powerful models with additional functions required in professional environments. The Steinbuch Centre for Computing (SCC) now operates more than 1,000 of these accelerator processors in total, including almost 700 in the new "Hochleistungsrechner Karlsruhe" (Karlsruhe High-Performance Computer, HoreKa) and more than 130 in the bwUniCluster 2.0 system.
The accelerators currently in use, from NVIDIA (A100 and V100) and AMD (MI100), deliver computing performance and energy efficiency that are each about a factor of 10 higher than those of conventional standard processors. Thanks to its A100 accelerators, HoreKa made it to 13th place on the list of the world's most energy-efficient computers (as of June 2021). However, there is still great potential for further optimization in the field of AI/ML.
UK-based startup Graphcore, founded in 2016, is one of several companies working on products that aim to exploit this potential. Graphcore calls its processors "Intelligent Processing Units" (IPUs). The current GC200 model has 59 billion transistors, making it one of the largest chips in production worldwide, and bears the nickname "Colossus".
In contrast to the chips from NVIDIA and AMD, which support the entire range of scientific applications, "Colossus" processors focus primarily on computational operations with the reduced-precision data types that are important for AI/ML. A single chip can perform up to 250 trillion of these special computational operations per second (AI Floating Point Operations, AI FLOPS). The IPU-POD16 system, now put into operation by the National High Performance Computing Center NHR@KIT as part of the so-called "Future Technologies Partition", has 16 Colossus processors. It is the first system of its kind in Germany.
Researchers who have access to HoreKa or who belong to the National High Performance Computing Alliance (NHR-Verbund) can obtain access to the Future Technologies Partition on request. This hardware and software test bed for innovative and disruptive technologies also includes systems with AMD processors, AMD accelerators, ARM processors and other hardware.
Contact: Simon Raffeiner