2021-11-18

Successful Machine Learning Performance Tests with HoreKa

Researchers and Helmholtz AI members from the SCC and the Jülich Supercomputing Centre have jointly submitted their results from a competitive initiative based on the HPC benchmarking software MLPerf™ to the Supercomputing Conference 2021 (SC21).

HoreKa is equipped with almost 700 NVIDIA A100 GPU accelerators and is optimally prepared for AI applications.

In the Helmholtz AI platform, Germany's largest research centers have teamed up to bring cutting-edge AI methods to scientists from other fields. With this in mind, researchers and Helmholtz AI members of the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT) and the Jülich Supercomputing Centre (JSC) at Forschungszentrum Jülich have jointly submitted their results for the MLPerf™ HPC benchmarking suite. This initiative was started in 2020 by companies such as Baidu, Google and Graphcore as well as researchers from Stanford, Harvard and Berkeley to precisely study large-scale AI applications. The Helmholtz AI team of SCC successfully executed training runs of the DeepCAM application on up to 512 NVIDIA A100 GPUs on the HoreKa supercomputer located at SCC, as well as of the CosmoFlow application on the JUWELS Booster system at JSC.

DeepCAM, the application that won the Gordon Bell Prize in 2019, is deep-learning software that recognizes cyclones in climate data. Early detection of these rotating tropical storm systems in the Indian and South Pacific Oceans is key to preventing material damage and loss of life, as well as to identifying potentially arable dry regions. Using HoreKa, Daniel Coquelin of Markus Götz's team was able to complete the more than 100 quadrillion calculations required by DeepCAM in just 4 minutes and 21 seconds.

While striving for performance, it is vital to also keep the climate impact of such large-scale measurements in check. With HoreKa and JUWELS ranking among the top 15 on the worldwide Green500 list of energy-efficient supercomputers, the computing resources in Helmholtz AI are both computationally efficient and energy-efficient, making them Europe’s fastest and greenest systems for AI workloads. These benchmarks have not only helped us better understand our current systems, but also show us pathways for improving future systems by providing testing tools that show administrators and users alike the carbon footprint of each individual computing job.

As Helmholtz AI, we hope to be part of this challenging yet exciting competition again next year. We plan not only to take part in the so-called closed competition, i.e. measuring existing, official applications as-is, but also to showcase some advanced large-scale training approaches in the open competition.

Further Information:

Contact at SCC: Markus Götz

Achim Grindler