Supercomputing News

Newsbites-Kategorie:HPC,Newsbites-Kategorie:Veranstaltung
KIT Celebrates 40 Years of High-Performance Computing in Karlsruhe

On September 14, KIT celebrated 40 years of high-performance computing in Karlsruhe with a festive colloquium. The invited guests took the opportunity to learn about and discuss the entire range of HPC in Karlsruhe.

On September 14, 2023, the SCC celebrated 40 years of high-performance computing in Karlsruhe with an internal festive colloquium. The invited guests from research policy and management, from data center planning, construction and operation, as well as former and active scientists, learned first-hand about the entire range of 40 years of high-performance computing in Karlsruhe in lectures, panel discussions, an exhibition, and guided tours. There was plenty of room to review successes and challenges, to philosophize a bit about the future, and of course to celebrate together and share interesting stories.
In his welcoming remarks, Peter Castellaz, head of the department responsible for HPC at the Baden-Württemberg Ministry of Science, Research and the Arts (MWK), highlighted the state's bwHPC strategy, in which the SCC has not only implemented important aspects of the content but also played a leading role in an intensively practiced culture of cooperation. "In addition to the HPC-specific resources and the associated methodological knowledge, KIT has successfully contributed in particular with its expertise in the field of data and research software," said Peter Castellaz. He praised the state-wide federated identity management bwIDM, with which the SCC, together with other state institutions, has created decisive foundations for cooperatively provided services, also beyond HPC. He also announced that a comprehensive state strategy through 2032 is being developed to accentuate not only technological developments but also the topics of research software and sustainability.
In her welcoming remarks, KIT Vice President for Digitization and Sustainability Kora Kristof was impressed by the community that has developed over many years in the HPC environment, from the Gauss Centre for Supercomputing (GCS) to the centers at the Tier-2 (Gauss Alliance) and Tier-3 levels, nationally and in Baden-Württemberg. "What has been developed in HPC by KIT and other institutions at the state level has also helped shape developments nationally, and special thanks are due for that," noted Kora Kristof. In addition, KIT has combined high-performance computing with energy efficiency and achieved outstanding successes: the German Data Center Award 2017 for ForHLR II and 13th place in the international ranking of the most energy-efficient computers for HoreKa speak for themselves. "And in the future, many interesting topics that shape sustainability will occupy us in the HPC environment as well; this concerns sustainable buildings and infrastructures, the use of sustainable materials and resource conservation, as well as aspects of sustainable software," predicted Kora Kristof.
Transitioning to the technical presentations, Martin Frank, director of the SCC, characterized the HPC business as a mix of the very dynamic and the conservative. "The dynamic side can be seen very clearly in the development of HPC systems over the last 40 years; the conservative side shows, for example, in the handling of very complex processes such as procurement and the secure operation of the infrastructure," Martin Frank explained in his welcoming speech, knowing that experience and innovation are the two poles of the HPC business that make the SCC an important player in national high-performance computing. "All this ensures that scientists are supported in the best possible way in their research with high-performance computers and research software."
In the first technical lecture, Eric Schnepf introduced the beginnings of high-performance computing in Karlsruhe and traced developments up to the present. He gained his first IT experience in the 1970s at the University of Karlsruhe with Algol programs, which he created on punched tape using a Siemens T100 teletype and ran on the Zuse Z 23. In the 1980s, in addition to general-purpose computers, he was able to familiarize himself with vector computers, on which applications could be computed faster and more accurately. Eric Schnepf dates the beginning of the HPC era in Karlsruhe to May 1983, when a state-funded vector computer, a Cyber 205, was installed and operated at the University Computing Center in Karlsruhe after previous tests on a similar machine at the University of Bochum. User support was provided by a team from the University of Karlsruhe and what was then the Karlsruhe Nuclear Research Center (KfK). "The procurement only came about because a large community worked very well together: university, KfK, and industry partners," affirmed Eric Schnepf in his presentation. In addition to many interesting technical excursions into the world of the computer systems installed at the university at that time, Eric Schnepf also gave examples of applications, for instance from climate research, and went into detail about the ODIN cooperation between Siemens-Nixdorf and the university, which stands for Optimal Data Models and Algorithms for Engineering and Natural Sciences on High-Performance Computers. A milestone was the first TOP500 list of supercomputers, which appeared in 1993; the original handout shows the German list with the two top-placed S600/20 systems from Karlsruhe and Aachen. Eric Schnepf rounded off his presentation with an overview slide of the most important HPC systems of the last 40 years in Karlsruhe, placing them within the bounds of the TOP500 (rank 1 to rank 500). "From the Cyber 205 (1983) to HoreKa (2023), that is a performance increase of eight orders of magnitude, on average a factor of 100 every 10 years. I think that's something to be proud of," said Eric Schnepf, appreciating the development of HPC in Karlsruhe.
Klaus-Peter Mickel, physicist and former director of the SCC, was already working as a programmer for the IBM machines at the computing center of the Karlsruhe Research Center (FZK) at the end of the 1960s, and he experienced and shaped the development of high-performance computing in Karlsruhe from the very beginning. When he accepted a position at the Karlsruhe computing center in 1970, he took over the supervision of university employees who wanted to use the FZK machines. After several career stages, Klaus-Peter Mickel took over the management of the computing center at the FZK in 1996. In his review of the years between 1966 and 1996, Klaus-Peter Mickel described the intensive cooperation between the computer experts at the university and the FZK, which finally, starting in 1996, led to the planning of a sophisticated technical and organizational cooperation between the two scientific computing centers: the Virtual Computing Center Karlsruhe was founded. Virtual, because the two institutions did not set up a joint computing center at one location, as originally considered, but instead formed an association with a legally secured cooperation agreement. There was a joint management committee, and each side contributed resources with a different architectural focus: the university concentrated on massively parallel computers and the FZK on vector computers, which were very powerful at the time. A dedicated data line connected the two computing centers, 10 km apart as the crow flies, at what was then a respectable speed of 155 megabits per second. Setting up the virtual computing center had many positive effects. In addition to a high level of efficiency, because neither side had to maintain both architectures, there was great benefit for the user groups, who had both architectures at their disposal.
In his lecture, Rudolf Lohner gave an in-depth insight into the origins of the university's computing center and the development and operation of the massively parallel computers in Karlsruhe, the so-called computing clusters. Rudolf Lohner worked for 20 years for the mathematics professors Alefeld and Kulisch, whom he credited as pioneers of the first hour and founders of the university computing center. He then moved from the Mathematics Institute to the Computing Center at the University of Karlsruhe in the 2000s and was an expert on energy efficiency in high-performance computing centers at the SCC until the end of his active career. In the mid-1990s, massively parallel computing clusters became increasingly common, and such systems have been operated for research purposes at the university's computing center ever since. Rudolf Lohner presented, in an entertaining tour, not only the different cluster systems but also some important projects and application scenarios. The spectrum ranged from the mathematical simulation of sailboat characteristics for the America's Cup, to the generation of precise weather forecasts, to the development of in-house cluster management systems. For the Karlsruhe High Performance Computer (HoreKa) operated today at the SCC and its predecessor systems ForHLR I and ForHLR II, Rudolf Lohner designed the extremely efficient energy and operating concept together with the HPC team. The associated new building was completed in 2015 and houses HPC systems at KIT's North Campus that can be used throughout Germany. A few months ago, the necessary structural and technical preparations were completed to make the building fit for future, even much more powerful computing systems with up to 2 megawatts of power consumption.
Following the exciting technical presentations, which highlighted the entire 40 years of HPC in Karlsruhe from various angles, guests were able to take part in guided tours of the HPC infrastructure and admire data center exhibits from the last 40 years in a specially designed exhibition room.
The SCC would like to thank the Ministry of Science, Research and the Arts of Baden-Württemberg, the KIT Presidium, all those involved in shaping and consistently developing 40 years of HPC in Karlsruhe, the organization team of this festive colloquium led by Simon Raffeiner (see photo), and of course all its guests.
 
Achim Grindler
Photos: Markus Breig (KIT)

2023-09-20
Newsbites-Kategorie:Dienste,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:Wissenschaftler
GridKa online storage massively expanded

The GridKa Tier-1 center in the Worldwide LHC Computing Grid is massively expanding its storage: an additional 71 petabytes of newly installed online storage are available. Data migration is now complete for almost all experiments.

In spring 2023, the expansion of the online storage system for the GridKa Tier-1 center in the Worldwide LHC Computing Grid at KIT was put into operation. The newly installed 71 petabytes are available to the LHC experiments ALICE, ATLAS, CMS, and LHCb as well as the Belle II, Pierre Auger Observatory, IceCube, and DARWIN experiments, and they also replace 30 petabytes of storage hardware that will be decommissioned after six years. In total, GridKa now has 99 petabytes of online storage.
Unfortunately, commissioning was delayed by a year due to the chip and logistics crisis following the COVID-19 pandemic and the war in Ukraine. The new installation consists of high-density Seagate CORVAULT systems with a total of 4664 18-terabyte hard drives, 70 servers, and InfiniBand switches that were integrated into the existing InfiniBand network fabrics. IBM Storage Scale is used as the software-defined storage tier. The existing file systems were not extended; instead, new file systems were created. This allows new NVMe-based metadata storage systems to be deployed and new features of IBM Storage Scale to be used. The data for almost all experiments has already been migrated and the systems are in productive operation.
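As a back-of-the-envelope plausibility check (not part of the original announcement), the raw capacity of the new drives can be compared with the quoted 71 petabytes of usable storage; attributing the gap to redundancy overhead is an assumption:

```python
# Rough check of the GridKa expansion figures. The ~15% gap between raw
# and usable capacity is assumed to be redundancy/erasure-coding
# overhead; the article does not state this explicitly.
drives = 4664
tb_per_drive = 18

raw_pb = drives * tb_per_drive / 1000   # ~84 PB of raw drive capacity
usable_pb = 71                          # usable capacity quoted in the article

overhead = 1 - usable_pb / raw_pb
print(f"raw: {raw_pb:.1f} PB, usable: {usable_pb} PB, overhead: {overhead:.1%}")
```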
 
Contact at SCC: Dr. Serge Sushkov
 
Achim Grindler

2023-09-26
Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:Wissenschaftler
Genetic algorithms solve optimization problems

Propulate is a software package that solves very general optimization problems using genetic algorithms. It is specifically designed for high-performance systems, easy to use, and publicly available.

With Propulate, we provide software for solving optimization problems that is especially adapted to the HPC setting. It is openly accessible and easy to use. Compared to a widely used competitor, Propulate is faster (by at least an order of magnitude for a set of typical benchmarks) and in some cases even more accurate.
Propulate uses mechanisms inspired by biological evolution: selection, recombination, and mutation. Evolution begins with a population of solution candidates, each with randomly initialized genes. It is an iterative process in which the population at each iteration is called a generation. For each generation, the fitness of each candidate in the population is evaluated. The genes of the fittest candidates are incorporated into the next generation (see explanatory video).
 
As in nature, Propulate does not wait for all compute units to finish evaluating the current generation. Instead, the compute units communicate the currently available information and use it to breed the next candidate immediately. This avoids waiting idly for other units and, in consequence, load imbalance. Each unit is responsible for evaluating a single candidate. The result is a fitness level belonging to the genes of that candidate, allowing the candidates to be compared and ranked. This information is sent to the other compute units as it becomes available. When a unit has finished evaluating a candidate and communicating the resulting fitness, it breeds the candidate for the next generation using the fitness values of all the candidates it has evaluated and received from other units so far.
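For illustration, the following toy sketch shows the basic loop of selection, recombination, and mutation described above. It is not Propulate's actual code: Propulate additionally distributes the candidate evaluations asynchronously over parallel workers, as explained in the previous paragraph.

```python
# Minimal genetic-algorithm sketch: selection, recombination, mutation.
import random

GENES = 8          # genes per candidate
POP = 20           # population size
GENERATIONS = 50

def fitness(genes):
    # Toy objective: maximize the negative sphere function (optimum at 0).
    return -sum(g * g for g in genes)

def breed(parent_a, parent_b):
    # Recombination: uniform crossover of the parents' genes ...
    child = [random.choice(pair) for pair in zip(parent_a, parent_b)]
    # ... followed by mutation: a small random perturbation of one gene.
    i = random.randrange(GENES)
    child[i] += random.gauss(0.0, 0.1)
    return child

population = [[random.uniform(-1, 1) for _ in range(GENES)] for _ in range(POP)]
for _ in range(GENERATIONS):
    # Selection: rank candidates by fitness and keep the fitter half.
    population.sort(key=fitness, reverse=True)
    parents = population[: POP // 2]
    # The fittest candidates' genes are incorporated into the next generation.
    population = parents + [
        breed(random.choice(parents), random.choice(parents))
        for _ in range(POP - len(parents))
    ]

best = max(population, key=fitness)
print("best fitness:", fitness(best))
```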
 
Propulate can be used for hyperparameter optimization and neural architecture search. It has already been applied successfully in several accepted scientific publications. Applications include grid load forecasting, remote sensing, and structural molecular biology.
 
Further Information:
 
Propulate code repository
Explanatory video
Paper: Massively Parallel Genetic Optimization through Asynchronous Propagation of Populations
Contact at SCC: Dr. Marie Weiel, Oskar Taubert
Achim Grindler

2023-06-07
Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC
MLPerf™ HPC Benchmark: Helmholtz AI computing infrastructure put to the test

As in the previous year, Helmholtz AI researchers joined a benchmarking study to analyze the centers' position in the computing infrastructure market — and the results are finally out!

The rapid development of AI methods and tools can make it difficult to keep up with all available computing options, and even more difficult to identify the best alternative for a given task. This is why benchmarks are key to comparing and choosing the best available AI hardware. Benchmarking platforms give an overall view of relevant aspects like performance, environmental impact, efficiency, and training speed.
That’s why, as in the previous year, Helmholtz AI members from the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT) and the Jülich Supercomputing Centre (JSC) at Forschungszentrum Jülich have jointly submitted their results to the MLPerf™ HPC benchmarking suite. And we are proud to announce that our infrastructures run on the best performing AI chips!
The initiative to submit was jointly coordinated by Helmholtz AI members Daniel Coquelin, Katharina Flügel, and Markus Götz from SCC and Jan Ebert, Chelsea John, and Stefan Kesselheim from JSC. The results cover our two systems at those centers: the HoreKa supercomputer at SCC and the JUWELS Booster at JSC. Both run NVIDIA A100 GPUs, among the best performing according to the benchmark. The JUWELS Booster in particular used up to 3,072 NVIDIA A100 GPUs during these measurements.
The MLPerf™ HPC benchmarking suite is a great opportunity to fine-tune both code-based and system-based optimization methods and tools. For example, on the CosmoFlow benchmark (physical quantity estimation from cosmological image data), we were able to improve our submission by over 300% compared to last year! While fine-tuning our IO operations, for example, we discovered ways for our filesystems to deliver read and write performance more rapidly and reliably. Thanks to this, in the recent CosmoFlow benchmark results, featured by IEEE Spectrum and HPCWire, HoreKa achieved the runner-up position behind NVIDIA's Selene system and the top spot among academic and research institutions in terms of fastest training time, outcompeting even larger systems like RIKEN's Fugaku.
As the impacts of climate change become more apparent, it is also imperative to be more conscious about our environmental footprint, especially with respect to energy consumption. To that end, the system administrators at HoreKa have enabled the use of the Lenovo XClarity Controller to measure the energy consumption of the compute nodes*. For the submission runs on HoreKa, 1,127.8 kWh were used. This is slightly more than what it takes to drive an average electric car from Portugal to Finland. 
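As a rough plausibility check of that comparison (the route length and consumption figures below are assumptions, not from the submission):

```python
# Sanity check of the electric-car comparison. Assumed figures:
# ~4,500 km driving distance from Portugal to Finland and a typical
# consumption of 0.20 kWh/km for a mid-size electric car.
submission_kwh = 1127.8            # energy used for the HoreKa submission runs

route_km = 4500
kwh_per_km = 0.20

trip_kwh = route_km * kwh_per_km   # ~900 kWh for the whole trip
print(f"trip: {trip_kwh:.0f} kWh, submission: {submission_kwh} kWh")
print(f"ratio: {submission_kwh / trip_kwh:.2f}x")   # ~1.25x, i.e. slightly more
```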
The MLPerf™ HPC benchmarking suite is vital to determining the utility of our HPC machines for modern workflows. We look forward to submitting again next year!
 
Contact at SCC: Dr. Markus Götz
 
*This measurement does not include all parts of the system and is not an official MLCommons methodology, however it provides a minimum measurement for the energy consumed on our system. As each system is different, these results cannot be directly transferred to any other submission.
 
 

2022-11-23
News:AI,News:HAICORE,News:HPC,Newsbites-Kategorie:Dienste,Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:KIT-Mitarbeiter,Newsbites-Zielgruppe:Wissenschaftler
Simplified access to more AI resources in the Helmholtz Association

For more than two years, the SCC has been operating dedicated AI resources for AI research within the Helmholtz Association, HAICORE@KIT for short. A new operating model now simplifies access even further and increases capacities.

Artificial intelligence (AI) and machine learning (ML) encompass technologies that will impact industry, science and society in unprecedented ways. Speech and image recognition are just two of the tangible examples of applications that have proven to be reliably applicable in recent years.
HAICORE: Resources for AI/ML
Research and education in AI and ML primarily require large amounts of computing power, most of which is provided by GPUs. To meet the short-term demand for AI hardware, dedicated hardware platforms for all researchers working on AI in the Helmholtz Association have been created with the "Helmholtz AI Computing Resources" (HAICORE) at both the SCC (HAICORE@KIT) and the Forschungszentrum Jülich (HAICORE@FZJ).
HAICORE@KIT, with its 72 NVIDIA A100-40 GPUs, is primarily geared towards prototypical use of the resources, e.g. interactive use with Jupyter, and towards the easiest possible access. For projects with higher demand, large GPU systems such as HoreKa at SCC are available in addition to HAICORE@FZJ.
Self-registration and more capacity
Access to HAICORE@KIT already had a very low barrier to entry, but it required some manual steps, such as filling out a short application form or maintaining guest and partner accounts for all non-KIT users.
Therefore, as of September 22, 2022, the previous operating model was changed. Employees of all 18 Helmholtz institutions can now log in to the Federated Login Service (FeLS) of the SCC via the Helmholtz AAI with their usual accounts and register themselves for the new HAICORE@KIT service. Access to up to four GPUs simultaneously per job is enabled immediately; an increase of this limit is possible on request.
The new feature "Multi Instance GPU" (MIG) is used to further increase the capacity of HAICORE@KIT. This allows multiple users to access the same GPU at the same time without processes interfering with each other.
More information about HAICORE@KIT can be found on the Helmholtz AI website or in the user documentation.

Simon Raffeiner

2022-11-08
News:HoreKa,News:NHR,Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC
NHR@KIT launches first NVIDIA Arm Development Cluster worldwide

The new NVIDIA-Arm cluster joins an ever-growing number of development systems for future technologies.

Whether in engineering, the life sciences, astrophysics or materials research - cutting-edge research in many fields is only possible using powerful supercomputers. The National High-Performance Computing Center at the Karlsruhe Institute of Technology (NHR@KIT) operates several High-Performance Computing systems for researchers from all over Germany. 
Over the past decade, the majority of the supercomputers in the world have relied on just two different hardware architectures from three different manufacturers. Intel and AMD dominate the market for high-performance processors (CPUs) with their so-called "x86" architecture, while accelerator chips (GPUs) are almost always supplied by NVIDIA. This "monoculture" makes it easier for users to switch between systems, but the potential of alternative architectures, which may be able to achieve much higher performance and energy efficiency, remains untapped. 
To exploit this potential, it is important to give users as well as operators the opportunity to evaluate these alternative architectures in a simple way and under real conditions. A central component of NHR@KIT is therefore the so-called "Future Technologies Partition", a hardware and software test bed for novel, disruptive technologies that have not yet achieved market penetration and are therefore not yet available in large high-performance computers. This category also includes processors with the Arm architecture. Arm chips are no longer used only in mobile devices, but also, for example, in the currently fastest supercomputer in the world, the Japanese "Fugaku", and in current Apple systems.
NVIDIA is planning to enter the market for high-performance processors based on the Arm architecture in 2023. These chips are to be used together with its next GPU generation, codenamed "Hopper", in future supercomputers. To enable the porting of applications right now, NVIDIA offers its partners special development kits. Each of these kits consists of an Ampere Arm Processor with 80 CPU cores, two NVIDIA A100 accelerators and two BlueField-2 Data Processing Units (DPU) with InfiniBand connections. 
[Photo: One of the racks of the Future Technologies Partition, where the new systems were installed.]
"The transferability of the results obtained in the Future Technologies Partition is very important to us," said Simon Raffeiner, HPC Operations Manager at NHR@KIT. "Most of the computations on the large main systems like HoreKa run on more than a single server system at the same time. That's why KIT is the only site in the world that has procured not just a single NVIDIA Arm HPC Developer Kit, but an entire cluster. Only in this way is it possible for our users to test their applications under realistic conditions."
The new systems join a steadily growing number of development systems in the Future Technologies Partition. These include special accelerators for artificial intelligence and machine learning (AI/ML) from Graphcore, existing Arm systems from other manufacturers, and new types of all-flash data storage.
"We try to staff the Future Technologies Partition symmetrically, as far as possible," Raffeiner explains further. "For example, if there is a system with an x86 CPU and NVIDIA GPUs, there is also a system with an Arm CPU and NVIDIA GPUs." To round out the current matrix of systems, Arm systems with AMD GPUs are also in influx - a combination that is not in use anywhere else. "We're also porting our own cluster software stack to the Arm architecture, so that the differences on the software side are as small as possible."
Ideally, users should not immediately notice they are using a different hardware architecture and be able to focus on porting and optimising their applications. But it's often a long way to get there, Raffeiner says. "For example, we can currently only access one of the large parallel file systems using a workaround because the manufacturer does not yet offer its software for Arm systems. Here we are jointly working on a solution." The result will then also benefit other operators, who only switch to such novel architectures much later.
The new systems are currently being equipped with suitable software in cooperation with NVIDIA and are expected to be available to users in a few weeks. For more information on the Future Technologies Partition, HoreKa and the National High Performance Computing Center NHR@KIT, please visit www.nhr.kit.edu/
 
Contact at SCC: Simon Raffeiner
 
 
Achim Grindler

2022-04-08
Newsbites-Kategorie:HPC
HoreKa mirror image wins 1st place in dpa photo competition

The dpa photographer Uli Deck from Karlsruhe succeeded particularly well in capturing the LED illumination of the new supercomputer HoreKa at KIT. The photo received 1st place in the competition "dpa Pictures of the Year 2021".

A visit to the computer room of the supercomputer HoreKa (Karlsruhe High Performance Computer), which was officially inaugurated in July 2021, makes a big impression on photography enthusiasts. The ceiling lighting in the room initially remains off when visitors enter. The darkness, the noise of the computing machine, the warmth, and the smell in the room put the senses on alert. Slowly, the eyes get used to the twilight. You cannot see the computing power, but you are all the more impressed by the flickering play of lights from thousands upon thousands of green and blue LEDs on the back of the computing cluster, which you can already see from the entrance door. The inner workings of the computer are accessible via a so-called cold aisle. Entering it is a highlight of every visit to the SCC at the KIT North Campus.
Not only the technology is fascinating, but also the effective play of light with which it is illuminated. A remote control elegantly adjusts the light sources in the various areas of HoreKa's interior. Color and light intensity can be tuned so that different-looking photos are always possible. Technology can be so photogenic! If it weren't so cold and noisy there, this unusual place could invite people to linger and inspire ideas and thoughts.

The resulting "mirror image" of the high-performance computer captured by Karlsruhe-based dpa photographer Uli Deck at HoreKa's inauguration ceremony is particularly impressive. With this picture, he won 1st prize in the category of symbolic images in the highly acclaimed "dpa Picture of the Year" competition. The SCC is happy with him. Congratulations!

To the press release: dpa Picture of the Year
 
Achim Grindler
 

2022-03-25
News:HPC,News:NHR,Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:Wissenschaftler
NHR@KIT Call for Collaboration

Seven successful projects in the second NHR@KIT Call for Collaboration

The National High-Performance Computing Center at KIT (NHR@KIT) has recently concluded the review of the second NHR@KIT Call for Collaboration. In this call, scientists from the application fields Earth System Science, Materials Science, Engineering in Energy and Mobility, Particle and Astroparticle Physics, and other disciplines were invited to submit proposals for collaborative research projects bridging the expertise of domain scientists and HPC experts. Following a competitive, external review of all submitted proposals, seven projects were selected and will receive funding for a period of up to three years.
The successful proposals cover the full range of scientific domains at NHR@KIT and will be carried out in close collaboration within either the Simulation and Data Life Cycle Labs (SDLs) or the Software Sustainability and Performance Engineering Team (SSPE). We congratulate the successful applicants and look forward to the start of the collaborative projects.
 
Contact at SCC: René Caspart

2022-02-17
News:Authentication,News:Authorisation,News:OIDC,News:oidcagent,News:OpenIDConnect,Newsbites-Kategorie:Dienste,Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:IT-Beauftragte,Newsbites-Zielgruppe:KIT-Mitarbeiter,Newsbites-Zielgruppe:Studierende,Newsbites-Zielgruppe:Wissenschaftler
oidc-agent now part of the Linux distribution Debian

With the oidc-agent software developed by SCC, OpenID Connect tools are available on the command line under Debian Linux.

Since January 2, 2022, the SCC software "oidc-agent" has been part of the Linux distribution Debian. Although currently still in the branch for newly included software, the so-called unstable branch, the SCC development can now easily be installed on Debian Linux.
oidc-agent comprises a set of tools for managing OpenID Connect tokens and makes them easily usable from the command line. It is based on the design of ssh-agent, so that users can handle OIDC tokens in a similar way to SSH keys. oidc-agent is started at the beginning of an X session or a login session. Through environment variables, the agent can be found and used to manage OIDC tokens.
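As an illustration of this agent pattern, the following sketch fetches a token from a running agent by shelling out to the bundled oidc-token command-line tool; the account name "my-account" is a placeholder for an account previously configured with oidc-gen.

```python
# Minimal sketch of using an oidc-agent-managed token from a script.
import os
import subprocess

def get_access_token(account: str) -> str:
    # oidc-agent exports its socket in the OIDC_SOCK environment
    # variable (analogous to SSH_AUTH_SOCK for ssh-agent).
    if "OIDC_SOCK" not in os.environ:
        raise RuntimeError("oidc-agent does not appear to be running")
    result = subprocess.run(
        ["oidc-token", account],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

if __name__ == "__main__":
    token = get_access_token("my-account")  # placeholder account name
    # Use the token as a Bearer token, e.g. in an HTTP Authorization header.
    print(f"Authorization: Bearer {token[:16]}...")
```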
The software is also available for a number of other Linux distributions via a repository server at SCC.
Further information can be found on the oidc-agent homepage and in the GitBook.

Contact person for oidc-agent: Gabriel Zachmann.
 
Dr. Marcus Hardt

2022-01-14
Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:Wissenschaftler
Graphcore: NHR@KIT offers state-of-the-art accelerators for AI/ML

National High Performance Computing Center NHR@KIT first operator in Germany to give researchers access to an IPU-POD16 system.

Algorithms and methods from the fields of Artificial Intelligence and Machine Learning (AI/ML) are taking on an increasingly important role in modern science. The successful application of these methods depends on a very high amount of computing power. Since the performance of conventional standard processors is usually not sufficient, researchers and data center operators often rely on special accelerator processors that can perform the required computing operations (such as matrix multiplications) very quickly.
Up to now, these processors have almost exclusively been close relatives of the graphics processors also used in conventional PCs and gaming consoles. The internal structure of these chips is also suitable for many scientific calculations, including in the field of AI/ML. For use in data centers, however, more powerful models with additional functions required for professional environments are used. The Scientific Computing Center (SCC) now operates more than 1,000 of these accelerator processors in total, including almost 700 in the new „Hochleistungsrechner Karlsruhe" (Karlsruhe High-Performance Computer, HoreKa) and more than 130 in the bwUniCluster 2.0 system.
The currently used accelerators from NVIDIA (A100 and V100) and AMD (MI100) deliver both a level of computing performance and an energy efficiency that is about a factor of 10 higher than what conventional standard processors can achieve. Thanks to its A100 accelerators, HoreKa made it to 13th place on the list of the world's most energy-efficient computers (as of June 2021). However, there is still great potential for further optimization in the field of AI/ML.
UK-based startup Graphcore, founded in 2016, is one of several companies working on corresponding products. Graphcore calls its processors "Intelligent Processing Units" (IPUs). The current GC200 model has 59 billion transistors, making it one of the largest chips in production worldwide, and bears the nickname "Colossus".
In contrast to the chips from NVIDIA and AMD, which support the entire range of scientific applications, "Colossus" processors are primarily focused on computational operations with reduced-precision data types important for AI/ML. A single chip can perform up to 250 trillion of these special computational operations per second (AI Floating Point Operations, AI FLOPS). The IPU-POD16 system now put into operation by the National High Performance Computing Center NHR@KIT as part of the so-called "Future Technologies Partition" has 16 Colossus processors. It is the first system of its kind in Germany.
Researchers with access to HoreKa or from the National High Performance Computing Alliance (NHR-Verbund) can get access to the Future Technologies Partition on request. This hardware and software test bed for innovative and disruptive technologies also sports systems with AMD processors, AMD accelerators, Arm processors, and other hardware.
More information about the new Graphcore systems is available in the NHR@KIT user documentation. Information on National High Performance Computing at KIT can be found on the NHR@KIT website.
 
Contact: Simon Raffeiner

2021-12-08
Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:Wissenschaftler
Successful Machine Learning Performance Tests with HoreKa

Researchers and Helmholtz AI members from the SCC and the Jülich Supercomputing Centre have jointly submitted their results from a competitive initiative using the HPC benchmarking suite MLPerf™, announced at the Supercomputing Conference 21.

In the Helmholtz AI platform, Germany's largest research centers have teamed up to bring cutting-edge AI methods to scientists from other fields. With this in mind, researchers and Helmholtz AI members of the Scientific Computing Center (SCC) at Karlsruhe Institute of Technology (KIT) and the Jülich Supercomputing Centre (JSC) at Forschungszentrum Jülich have jointly submitted their results for the MLPerf™ HPC benchmarking suite. This initiative was started in 2020 by companies like Baidu, Google and Graphcore as well as researchers from Stanford, Harvard and Berkeley to precisely study large-scale AI applications. The Helmholtz AI team of SCC successfully executed training runs of the DeepCAM application with up to 512 NVIDIA A100 GPUs on the HoreKa supercomputer located at SCC and additionally the CosmoFlow application on the JUWELS Booster system at JSC.
DeepCAM, winner of the Gordon Bell Prize in 2019, is an artificial-intelligence deep-learning application that allows the recognition of cyclones in climate data. Early detection of these rotating tropical storm systems in the Indian and South Pacific Oceans is key to preventing material damage and loss of life, as well as to identifying potentially arable dry regions. Using HoreKa, Daniel Coquelin from Markus Götz's team was able to complete the more than 100 quadrillion calculations required by DeepCAM in just 4 minutes and 21 seconds.
While striving for performance, it is vital to also consider the climate impact of such large-scale measurements. With HoreKa and JUWELS ranking among the top 15 on the worldwide Green500 list of energy-efficient supercomputers, the computing resources in Helmholtz AI are both computationally and energy efficient, making them Europe's fastest and greenest systems for AI workloads. These benchmarks have not only helped us better understand our current systems, but also show us pathways for improving future systems and provide us with testing tools to show administrators and users alike the carbon footprint of each individual computing job.
As Helmholtz AI, we hope to be part of this challenging, yet exciting, competition again next year. For this, we plan to not only partake in the so-called closed competition, i.e. measuring existing, official applications as-is, but also to showcase some advanced large-scale training approaches in the open competition.
Further Information:
www.fz-juelich.de/SharedDocs/Meldungen/PORTAL/DE/2021/2021-11-18-mlperf-hpc.html
blogs.nvidia.com/blog/2021/11/17/mlperf-hpc-ai/
https://developer.nvidia.com/blog/mlperf-hpc-v1-0-deep-dive-into-optimizations-leading-to-record-setting-nvidia-performance/
Contact at SCC: Markus Götz


Achim Grindler

2021-11-18
News:HoreKa,News:HPC,News:NHRKIT,Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:Wissenschaftler
NHR@KIT Call for Collaborations

Possibility for collaborative projects within the framework of NHR@KIT

The National High Performance Computing (NHR) Center at KIT is opening a second round of its call for collaborative research projects. In the scope of these projects, collaborative research activities of PhD students and postdocs can be funded. The call for proposals is open for collaborative projects in which scientists from NHR@KIT and from the fields of Earth System Science, Materials Science, Engineering in Energy and Mobility, and Particle and Astroparticle Physics collaborate.
For more information and the call for proposals, please visit nhr.kit.edu/collaboration-call
Contact: René Caspart

2021-11-16
Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Kategorie:Studium, Lehre, Bildung,Newsbites-Zielgruppe:Wissenschaftler
NHR alliance establishes graduate school

The National High Performance Computing Network (NHR Network) has founded its own graduate school. The NHR Center at KIT, NHR@KIT, is also participating.

The NHR Alliance has been organizing high-performance computing on a national level since January 1, 2021. The nine NHR centers, including the NHR center at KIT, NHR@KIT, jointly coordinate the resources and services offered to researchers from all over Germany. In order to adequately train young scientists in high-performance computing and to promote their networking, a dedicated graduate school has now been established.
The offer is addressed to graduates of a master's program in computer science, mathematics, natural sciences, or engineering who aim at a doctorate in one of the research areas covered by the NHR centers. At KIT, these include, for example, efficient numerical methods for exascale systems, sustainable software development of scientific applications, or data-intensive computing.
Members of the graduate school will be accepted into a regular PhD program at the location of one of the NHR centers, but are also expected to work at another NHR center for at least six months. In addition, the NHR Graduate School offers its own curriculum in areas of particular relevance to HPC, individual supervision (mentoring) and courses to teach "soft skills".
Up to nine applicants per year will be accepted, who will also receive a stipend for 36 months. Applications for the 2022 graduate programme are open until 15 December 2021 and are also open to interested parties from outside the European Union.
Further information: www.nhr-gs.de/ueber-uns/nhr-graduiertenschule.
 
Contact at the SCC: Martin Frank
 
Achim Grindler

2021-11-08
Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:KIT-Mitarbeiter,Newsbites-Zielgruppe:Studierende
KIT supercomputer ranked 14th in Europe

The Karlsruhe HoreKa high-performance computer is one of the fastest computers in Europe. On the TOP500 list, the HPC system is ranked 52nd. In terms of energy efficiency, it is in an excellent 13th place in the international ranking.

On June 1, KIT, as the National High Performance Computing Center (NHR@KIT), started scientific operation of the new high-performance computer "HoreKa". In the current TOP500 spring list, the system is among the fifteen fastest computers in Europe; in a worldwide comparison, it ranks 52nd. In terms of energy efficiency, HoreKa reaches 13th place in the international ranking.
The hybrid system consists of a partition with compute accelerators based on graphics processing units (GPUs) and a partition equipped with standard processors (CPUs). The GPUs from NVIDIA deliver the extremely high performance required for certain computing operations, such as equation-system solvers or algorithms for applications in artificial intelligence. The latest generation of Intel CPUs, which were only officially introduced with the start of HoreKa's trial operation, are also optimized for certain operations. HoreKa cleverly combines the strengths of both architectures to achieve maximum performance. Overall, the system achieves a peak performance of 17 PetaFLOP/s. The Lenovo ThinkSystem was supplied by pro-com Datensysteme GmbH.
 
Contact: Dr. Jennifer Buchmüller
Further information: kit.edu/kit/pi_2021_059_supercomputer-of-the-kit-one-of-the-15-fastest-in-europe.php
 
Achim Grindler

2021-07-07
Newsbites-Kategorie:Dienste,Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:KIT-Mitarbeiter,Newsbites-Zielgruppe:Nutzer SCC-Dienste,Newsbites-Zielgruppe:Wissenschaftler
NHR@KIT puts HoreKa into operation for research purposes

New supercomputer goes into operation for research purposes after successful pilot phase

The National High Performance Computing Center at KIT (NHR@KIT) has put the new "HoreKa" supercomputer into operation for research purposes today after successful completion of the trial operations. The system is now available to scientists from all over Germany for research projects.
Thanks to the new supercomputer, researchers will be able to gain a more detailed understanding of highly complex natural and technical processes, particularly in materials science, earth system science, energy and mobility research in engineering, and particle and astroparticle physics.
Innovative high-performance system with a big hunger for data
HoreKa is an innovative hybrid system with nearly 60,000 Intel processor cores, more than 220 terabytes of main memory, and 668 NVIDIA A100 GPUs. A 200 GBit/s non-blocking InfiniBand HDR network is used as the communication network, and two parallel Spectrum Scale file systems with a total capacity of more than 15 petabytes are used for data storage.
A key consideration during the design of the system was also the enormous amount of data generated by scientific research projects. To keep up with the growing needs, HoreKa's compute nodes, InfiniBand network, and parallel file systems each deliver up to four times the storage throughput of its predecessor ForHLR. A multi-level data storage architecture additionally guarantees high-throughput processing on external storage systems.
HoreKa is housed in a dedicated computer building on KIT's North Campus, which was newly constructed in 2015 for its predecessor ForHLR. The award-winning, energy-efficient hot water cooling concept is continued with the new system.
New platform for project applications
The application for computing time projects on HoreKa is now possible via the digital application platform. In addition, the new NHR Support Portal provides an integrated platform for all questions related to application submission as well as technical and professional support. Organizational questions about HoreKa can also be sent to horeka-info@nhr.kit.edu.
The official inauguration ceremony of HoreKa will take place in mid-July. An invitation will follow.
_ _ _
More information about HoreKa:
https://www.nhr.kit.edu/userdocs/horeka/
With bwUniCluster 2.0, KIT operates a second supercomputer in state service:
https://www.scc.kit.edu/dienste/bwUniCluster_2.0.php


Dr. Jennifer Buchmüller

2021-06-01
Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:Wissenschaftler
NHR@KIT Call for Collaborations

Possibility for collaborative projects within the framework of NHR@KIT

Within the scope of National High Performance Computing (NHR) at KIT, research activities of PhD students and postdocs can be funded in collaborative research projects. In these projects, scientists from NHR@KIT and from the fields of Earth System Science, Materials Science, Engineering in Energy and Mobility, and Particle and Astroparticle Physics collaborate. We are opening a call for project proposals from researchers in these fields.
For more information and the call for proposals, please visit nhr.kit.edu/collaboration-call

Contact: René Caspart

2021-05-25
News:HoreKa,Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:Wissenschaftler
Change at the top: Pilot operation of Supercomputer HoreKa begins

Startup of HoreKa also marks the beginning of the shutdown of predecessor ForHLR II

Karlsruhe - SCC has started pilot operation of the new supercomputer "Hochleistungsrechner Karlsruhe" (High Performance Computer Karlsruhe), or “HoreKa” for short. While HoreKa will be ramped up to its full capacity in the coming weeks, this also marks the beginning of the shutdown of its predecessor, ForHLR II, after five years of successful operation.
When KIT's “Forschungshochleistungsrechner II” (High Performance Research Computer II, short ForHLR II) was commissioned in March 2016, it was one of the few computers in the world that could reach a computing power of more than one PetaFLOPS - i.e. one quadrillion computing operations per second. More than 1150 compute nodes with almost 24,000 CPU cores and 74 terabytes of main memory in total were required to achieve this level of performance.
KIT was not only at the forefront in terms of computing power with ForHLR II, though: The system was not cooled with cold water, but used “hot” water at temperatures of up to 45 degrees Celsius. This was a novelty in the field of High-Performance Computing and also the reason for the construction of a new data center building for the supercomputer at KIT's Campus North. The SCC was awarded with the German Data Center Award in 2017 for the energy-efficient overall concept.
Three years of preparations for HoreKa
The planning for a successor system to ForHLR II already started in 2018. In 2019, the project was named "High Performance Computer Karlsruhe" - HoreKa for short - and the procurement process was started. But HoreKa will not just be a worthy successor to ForHLR II. With 769 compute nodes, nearly 60,000 CPU cores, more than 220 terabytes of main memory and 668 GPUs, the system will achieve a theoretical peak performance of more than 17 petaFLOPS, making it 17 times faster than its predecessor. The system is thus expected to be among the ten fastest computers in Europe in mid-2021.
Thanks to the new supercomputer, researchers will be able to gain a more detailed understanding of highly complex natural and technical processes, particularly in materials sciences, earth system sciences, energy and mobility research in engineering, and particle and astroparticle physics. Of course, HoreKa can also be used by scientists conducting research to understand the SARS-CoV-2 virus, helping to combat the COVID-19 disease.
Computing and storage go hand in hand
A key consideration in the design of the system were the enormous amounts of data created by scientific research projects. Depending on the application, several hundreds of terabytes can be generated from a single simulation. To keep up with the growing needs, HoreKa's compute nodes, InfiniBand network and parallel file systems each deliver up to four times the throughput of the previous ForHLR II system.
A multi-level data storage concept also guarantees fast data processing on external storage systems. HoreKa is connected at speeds of up to 45 gigabytes per second to the "Large Scale Data Facility" (LSDF) at SCC, which has been providing a modern infrastructure for the storage, management, archiving, and analysis of research data since 2010.
Full operation of HoreKa begins on June 1, 2021
HoreKa was built up alongside ForHLR II over the past months. The first user groups have already been granted access to be able to port and optimize their applications. Over the next few weeks, the system will be ramped up to its full capabilities and pilot operation is expected to seamlessly transition into full operation. Starting on June 1, 2021, HoreKa will then be available to scientists from all over Germany. Applications for computing time can already be submitted right now.
Since the data center building at Campus North cannot supply both systems at the same time, ForHLR II has to be gradually shut down in parallel. By mid-April, the system will no longer be available. The track record after five years of operation is extremely positive: almost one billion CPU hours of computing time were provided to more than 140 different research projects.
More Information about HoreKa and how to file applications for computing time on the new system: www.nhr.kit.edu/userdocs/horeka/
 
More information on COVID-19 research at KIT:
www.kit.edu/kit/corona-pandemie-forschung-und-hilfsaktivitaeten-am-kit.php (in German)
www.scc.kit.edu/en/aboutus/13531
With bwUniCluster 2.0, KIT operates a second supercomputer as a state service: www.scc.kit.edu/en/dienste/bwUniCluster_2.0
 
Jennifer Buchmüller
Simon Raffeiner

2021-03-29
News:Exascale,News:GPU,News:HPC,Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Kategorie:Publikationen,Newsbites-Zielgruppe:Wissenschaftler
Fit for the upcoming high-performance GPUs from Intel

Hartwig Anzt, head of the Helmholtz junior research group FiNE, presents first experiences in software porting for Intel Xe GPUs.

In the article "Preparing for the Arrival of Intel's Discrete High-Performance GPUs" published on HPCwire, Hartwig Anzt, head of the Helmholtz junior research group FiNE, presents first experiences in software porting for Intel's new Xe GPUs.


Hartwig Anzt's team is one of the first worldwide to develop software for the expected discrete Intel high-performance GPUs. In close collaboration with Intel and Argonne National Laboratory, which is planning the first exascale supercomputer based on these GPUs, the research team has developed a backend in the programming language DPC++ for the open-source software library Ginkgo, which is already capable of executing numerical methods entirely on Intel GPUs.


The above-mentioned HPCwire article discusses a workflow for converting CUDA code to DPC++ code and the challenges involved. Even though Intel's GPUs and the oneAPI ecosystem are still struggling with various teething problems, the open-source strategy Intel has chosen could succeed in bringing the scientific community along.


2021-03-24
Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC
KIT Becomes Center for National High Performance Computing

Millions of Euros in Funding for Future Supercomputers at the SCC of KIT - Researchers throughout Germany can use enormous computing power from Karlsruhe.

The Karlsruhe Institute of Technology (KIT) becomes a center for National High Performance Computing (NHR). This was decided today (13.11.2020) by the Joint Science Conference, the body that coordinates the science funding of the federal and state governments. This will enable scientists to use even more powerful high-performance computers at KIT in the future. With HoreKa, one of the most powerful supercomputers in Europe will be available at the Scientific Computing Center of KIT in spring 2021. The NHR Alliance has an annual budget of 62.5 million euros, of which a high single-digit million euro amount goes to KIT every year.
 
Press Release of the KIT: pi_2020_101_kit-will-be-center-fur-national-high-performance computing.php

Press release of the GWK: pm-2020-11.pdf
 
 

2020-11-13
Newsbites-Kategorie:Dienste,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:Nutzer SCC-Dienste,Newsbites-Zielgruppe:Wissenschaftler
Interactive supercomputing with JupyterLab

With JupyterLab, KIT researchers can now very easily interact with the SCC's high-performance computers directly from any workstation using a web browser. This also opens up new possibilities for courses.

The Jupyter project [1] started some years ago and has evolved into a complete open-source ecosystem for programming, data exploration, and code execution. Most importantly, Jupyter offers a new way of supercomputing that allows interactive work with kernels, text editors, and data visualization on HPC systems.
Since the end of October, the SCC has offered Jupyter as a service [2]. In addition to the classic access via SSH, interactive access via web browser to all HPC systems of the SCC is now also possible. For the Tier-3 system bwUniCluster 2.0 + GFB-HPC, the new AI/ML GPUs of the HAICORE partition, and the existing Tier-2 system ForHLR II, dedicated queues are reserved for Jupyter.
This minimizes waiting times and gives new users of our HPC systems, in particular, a low barrier to entry.
To use Jupyter on the HPC resources of the SCC, the respective access requirements apply. Registration for access via bwidm.scc.kit.edu is required. The Jupyter service is only available within the KIT network; if the service is to be used from outside, a VPN connection to the KIT network must first be established.
References:
[1] A detailed documentation of the Jupyter project can be found at jupyter.readthedocs.io
[2] Further information on the Jupyter service can be found in the service description
 
Contact at SCC: Jennifer Buchmüller, Samuel Braun
 
Achim Grindler
 

2020-11-09
Newsbites-Kategorie:Dienste,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:Nutzer SCC-Dienste,Newsbites-Zielgruppe:Wissenschaftler
KIT introduces 2-factor authentication for its HPC systems

A large-scale IT security incident forced many operators, including KIT, to take their HPC systems offline in mid-May. With the introduction of a 2-factor authentication scheme, the systems can now be made available without restrictions again.

In mid-May, an IT security incident became known that affected a large number of HPC systems worldwide. It took several weeks until the systems could be made available to users again, often only with significant restrictions. The two high-performance computers ForHLR II (Tier-2) and bwUniCluster 2.0 (Tier-3) at KIT were put back into operation in mid-June. During the first of three phases of the recommissioning process, coordinated with the other operators in the federal state of Baden-Württemberg, the use of SSH keys was no longer possible. This caused severe restrictions for the scientific communities, especially on the Tier-2 system, since the HPC systems could no longer be integrated into automated scientific workflows.
Within just a few weeks, the SCC has now successfully introduced 2-factor authentication (2FA) for all HPC systems using time-based one-time passwords (TOTP). So-called hardware or software tokens can be used to generate the one-time passwords. KIT employees already receive a hardware token for access to critical services such as the SAP portal or campus management. These can now also be used to access the HPC systems. A wide range of software token solutions, including apps for mobile devices, is available to users from other institutions. Registration and management of the tokens is handled by the web portal of the federal identity management system bwIDM.
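For illustration, the following sketch shows how such a time-based one-time password is derived from a shared secret and the current time (RFC 6238); the base32 secret is a made-up example, not a real token seed.

```python
# Illustrative sketch of TOTP (RFC 6238) as used by hardware and
# software tokens: HMAC over a 30-second time counter, then dynamic
# truncation (RFC 4226) down to a 6-digit code.
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32: str, digits: int = 6, period: int = 30) -> str:
    key = base64.b32decode(secret_b32)
    counter = int(time.time()) // period          # time steps since the Unix epoch
    msg = struct.pack(">Q", counter)              # 8-byte big-endian counter
    digest = hmac.new(key, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                    # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

print(totp("JBSWY3DPEHPK3PXP"))  # e.g. '492039', changes every 30 seconds
```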
In combination with the 2FA, the use of SSH keys is now possible again. These keys must also be managed via bwIDM. There are two types of SSH keys: those for interactive use and those for workflow automation (so-called “command keys”). SSH keys registered for interactive use allow the execution of any commands and require additional authentication with a time-based one-time password as a second factor. The second factor has to be entered once per hour at maximum. Command keys can be used without 2FA and thus in an automated fashion. However, they must be restricted to a single command and have to be cleared by the HPC operations team.
Other operators from the bwHPC project are planning to introduce 2-factor authentication based on the new components developed for bwIDM. The source code is available to interested parties under an open source license.
Further information on 2-factor authentication for the HPC systems can be found in the user documentations for ForHLR II and bwUniCluster 2.0.

2020-08-13
Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC
Super-fast AI system installed and commissioned at SCC

To ensure that researchers of the Helmholtz Association continue to be at the forefront of AI and machine learning development, KIT is now the first location in Europe to commission state-of-the-art NVIDIA DGX A100 AI systems.

As a tool for cutting-edge research, artificial intelligence (AI) is indispensable today. For its successful use – whether in energy research or in the development of new materials – specialized hardware is, alongside the algorithms, becoming an ever more important factor. The Karlsruhe Institute of Technology (KIT) is now the first site in Europe to have commissioned the novel NVIDIA DGX A100 AI system. The new DGX A100 computer systems are high-performance servers, each equipped with eight NVIDIA A100 Tensor Core GPUs. Together, the eight accelerators deliver a computing power of 5 AI-PetaFLOP/s. The system was procured with funds from the Helmholtz AI Computing Resources (HAICORE) initiative, which is closely linked to Helmholtz AI.

Further information can be found in the KIT press release of July 6, 2020
 
Achim Grindler


2020-07-06
Newsbites-Kategorie:Auszeichnungen,Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC
KIT becomes part of the US Exascale Computing Project

The Helmholtz Junior Research Group "Fixed-Point Methods for Numerics at Exascale" has been included in the US Exascale Computing Project and receives research funding of more than one million euros until 2022.

On July 29, 2015, then US President Barack Obama established the National Strategic Computing Initiative (NSCI) by executive order. This national initiative aims to develop a strategic vision across institutional boundaries and a federal investment strategy, in collaboration with scientific research institutes and industry partners, to maximize the benefits of High Performance Computing (HPC) for the US.
The Exascale Computing Project (ECP) is part of the NSCI and aims to provide powerful exascale supercomputers and to develop a sustainable software ecosystem for scientific simulations. Thanks to its expertise in the development of numerical software libraries and in sustainable software development, the Helmholtz Junior Research Group Fixed-Point Methods for Numerics at Exascale (FiNE), headed by Dr. Hartwig Anzt and located at the Scientific Computing Center, has been included in the ECP consortium and receives research funding of more than one million euros until 2022.
The team headed by Hartwig Anzt will focus in particular on the development of numerical methods and production-ready codes that operate in lower arithmetic precision while still producing results of high precision. These so-called "mixed precision algorithms" can profit from the high performance of modern, AI-focused hardware and can thus be used efficiently for numerical methods and scientific simulations.
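A classic example of this idea is mixed-precision iterative refinement: the expensive solves are carried out in low precision, while residuals and corrections are accumulated in high precision. The following NumPy sketch illustrates the general technique; it is an illustration under simplifying assumptions, not code from the FiNE group or the Ginkgo library.

    import numpy as np

    def mixed_precision_solve(A, b, iters=5):
        # Solve Ax = b: cheap float32 solves, float64 residual correction.
        A32 = A.astype(np.float32)                  # low-precision working copy
        x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
        for _ in range(iters):
            r = b - A @ x                           # residual in float64
            d = np.linalg.solve(A32, r.astype(np.float32))
            x += d.astype(np.float64)               # high-precision correction
        return x

    rng = np.random.default_rng(0)
    A = rng.random((500, 500)) + 500 * np.eye(500)  # well-conditioned test matrix
    b = rng.random(500)
    x = mixed_precision_solve(A, b)
    print(np.linalg.norm(A @ x - b))                # residual close to float64 accuracy

A production code would factorize the low-precision matrix once and reuse the factorization in every refinement step; the sketch merely shows why a low-precision solver can still deliver a high-precision result.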
 
Contact persons at SCC: Hartwig Anzt, Martin Frank
Further Information:
Multiprecision Effort (exascale podcast with Hartwig Anzt): insidehpc.com/2019/12/podcast-developing-multiprecision-algorithms-with-the-ginkgo-library-project/

2020-07-01
Newsbites-Kategorie:Dienste,Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:Wissenschaftler
bwUniCluster 2.0 goes into operation

As part of the Initiative for High Performance Computing, Data Intensive Computing and Large Scale Scientific Data Management in Baden-Württemberg (bwHPC), the SCC will put the new parallel computer system bwUniCluster 2.0 into operation on 17.03.2020.

As part of the Initiative for High Performance Computing, Data Intensive Computing and Large Scale Scientific Data Management in Baden-Württemberg (bwHPC), the SCC will put the new parallel computer system "bwUniCluster 2.0+GFB-HPC" (bwUniCluster 2.0) into operation as a federated service on 17.03.2020.
 
The bwUniCluster 2.0 replaces the predecessor system "bwUniCluster" and also integrates the extension procured for the predecessor system in November 2016. The modern, advanced HPC system consists of more than 840 SMP nodes with 64-bit Intel Xeon processors. It provides the universities of the state of Baden-Württemberg with basic computing power and can be used free of charge by the staff of all universities in Baden-Württemberg.
 
Access is granted via the federated identity management system bwIDM and is regulated by each university individually. Users who currently have access to bwUniCluster 1 will automatically also have access to bwUniCluster 2.0; there is no need to apply for new entitlements or to re-register.
Further information:
Technical description of bwUniCluster 2.0
Details on registration and access: wiki.bwhpc.de/e/bwUniCluster_2.0

2020-02-28
Newsbites-Kategorie:Dienste,Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:KIT-Mitarbeiter,Newsbites-Zielgruppe:Nutzer SCC-Dienste,Newsbites-Zielgruppe:Wissenschaftler
KIT Procures New Supercomputer

In fall 2020, KIT will make available the first stage of a new supercomputer for many scientific fields. The full system will be handed over to the scientific communities by summer 2021. The procurement that has now been signed has a volume in the order of 15 million euros.

“Research using supercomputers contributes to a modern and sustainable society,” explained Professor Holger Hanselka, President of KIT. “With the help of supercomputers, research in key areas, such as energy, environment, mobility, and medicine, will find new solutions faster. HoreKa thus fits perfectly into KIT's strategy to make significant contributions to managing the challenges facing society.”
“High-performance computing stands for rapid developments. With their ever increasing peak performances, supercomputers are crucial to both leading-edge research and the development of innovative products and processes in key economic areas. Thanks to institutions like KIT, Baden-Württemberg is a European leader in supercomputing and internationally competitive in this area. It is not only the impressive computing power of the machines, but also the concentrated methodological expertise that enables our computer-assisted top-level research to achieve breathtaking results,” said Baden-Wuerttemberg Science Minister Theresia Bauer.
The new “Hochleistungsrechner Karlsruhe” (German for Karlsruhe high-performance computer), HoreKa for short, is expected to be one of the ten most powerful computers in Europe in 2021 and will have a computing power of more than 17 PetaFLOPS – 17 quadrillion computing operations per second, which corresponds to the performance of more than 150,000 laptops.
The system will be available to scientists from all over Germany. Thanks to the new supercomputer, researchers in the areas of materials sciences, earth system science, energy and mobility research in engineering, life sciences and particle and astroparticle physics will be able to gain a more detailed understanding of highly complex natural and technical processes. Of course, HoreKa can also be used by scientists studying the SARS-CoV-2 virus and, thus, will contribute to fighting the COVID-19 disease.
Computing and Storage Go Hand in Hand
With HoreKa, researchers can analyze far more details in larger systems, thereby extending normal simulations to so-called multiscale simulations. “Climate simulations and Earth system models, for example, will achieve much finer resolutions and, thus, a higher level of detail,” explains Professor Martin Frank, Director of the Scientific Computing Center (SCC) of KIT. “However, in addition to pure computing power, the demands on file systems are also increasing in terms of both capacity and latency. With HoreKa, we are consistently continuing the strategic orientation of SCC towards data-intensive computing.”
“Currently, highly diverse technical developments are taking place on the hardware market,” says Dr. Jennifer Schröter, Head of the High-performance Computing Group of SCC. “Our technical requirements were demanding, but the tendering process was deliberately kept open with respect to the exact technologies used to give our bidders the opportunity to design the most powerful systems possible.”
Two Innovative Chip Technologies – One High-Performance System
The result is an innovative hybrid system with almost 60,000 next-generation Intel Xeon Scalable Processor cores and 220 terabytes of main memory as well as 740 NVIDIA A100 Tensor Core GPUs. A non-blocking NVIDIA Mellanox InfiniBand HDR network with 200 GBit/s per port is used for communication between the nodes. Two Spectrum Scale parallel file systems offer a total storage capacity of more than 15 petabytes. The computer systems are made by Lenovo, while general contractor pro-com Datensysteme GmbH from Eislingen near Stuttgart is responsible for project coordination, system integration, delivery, and customer support.
“We are looking forward to putting this system into operation together with our partners Lenovo and KIT and to handing it over to the users,” says Oliver Kill, Managing Director of pro-com. With HoreKa, pro-com is not only celebrating its 30th anniversary in 2020, but also the largest order in the company's history.
Machine Learning Supports Human Researchers
“Artificial intelligence and machine learning can dramatically accelerate scientific computations in the most significant areas of research, where the world’s problems are being solved,” says Marc Hamilton, Vice President of Solutions Architecture and Engineering at NVIDIA. “NVIDIA A100 Tensor Core GPUs further support this accelerated research, and together with NVIDIA Mellanox InfiniBand technology, KIT’s new supercomputer will speed up scientific discovery for a broad range of important research.”
Another central aspect in the system design has been the enormous amount of data generated by scientific research projects. Depending on the application, several hundred terabytes of data can be generated by a single simulation. To keep up with the growing amounts of data, the computing nodes, the InfiniBand network, and the parallel file systems of HoreKa will each provide up to four times the throughput of the predecessor system ForHLR.
A multi-level data storage concept will guarantee high-throughput processing of data on external storage systems. With a data rate of up to 45 GByte/s, HoreKa will also be connected to the “Large Scale Data Facility” (LSDF) of the SCC which has been providing a modern infrastructure for the storage, administration, archiving, and analysis of research data since 2010.
Award-winning Energy Efficiency
HoreKa will be installed in a state-of-the-art data center constructed for its predecessor ForHLR on KIT's Campus North in 2015. The award-winning, energy-efficient hot water cooling concept based on the Lenovo Neptune Direct Water Cooling (DWC) technology will also be used for the new system.
The SCC employees chose the name HoreKa in reference to “GridKa”, the “Grid Computing Centre Karlsruhe”. It is also located at SCC and has successfully provided data storage and analysis capacities for large-scale experiments all over the world, including the Large Hadron Collider (LHC) at CERN in Switzerland, for more than 15 years. One of GridKa's greatest successes is its participation in the discovery of the Higgs particle in July 2012. GridKa is the largest and most powerful data center of its kind. 

More information about HoreKa: www.scc.kit.edu/en/services/horeka.php
More information on COVID-19 research at KIT: www.kit.edu/kit/corona-pandemie-forschung-und-hilfsaktivitaeten-am-kit.php (in German) and www.scc.kit.edu/en/aboutus/13531.php
With bwUniCluster 2.0, KIT operates a second supercomputer as a state service: www.scc.kit.edu/en/services/bwUniCluster_2.0.php
KIT Press Release: www.kit.edu/kit/english/pi_2020_035_kit-procures-new-supercomputer.php

Achim Grindler

2020-05-14
Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:Wissenschaftler
Computing resources of the SCC support distributed protein computing projects

Resources of the Grid Computing Centre Karlsruhe and the HPC systems of KIT support the distributed computing projects Folding@Home and Rosetta@home, which help to improve the understanding of proteins, including those of the SARS-CoV-2 virus.

Since the end of March, computers of the Grid Computing Centre Karlsruhe (GridKa) and the HPC systems of KIT have been supporting the distributed computing projects Folding@Home and Rosetta@home, for example to improve the understanding of the proteins of the SARS-CoV-2 virus.
In close cooperation with the Institute for Experimental Particle Physics (ETP) at KIT, the CPU resources of GridKa were integrated into the COBalD/TARDIS system, which is otherwise used to dynamically manage the opportunistic provision of resources for research in high-energy physics. Up to 10,000 logical CPU cores of GridKa are now made available for COVID-19 research.
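The pattern behind such a system can be summarized in a few lines: a controller observes how much capacity the regular workload leaves idle and starts or drains additional worker jobs accordingly. The following Python loop is a deliberately simplified conceptual sketch, not COBalD/TARDIS code; idle_cores() merely simulates what would in reality be a query against the batch system, and the thresholds are made-up values.

    import random, time

    HEADROOM = 1000          # cores to keep free for regular jobs (assumption)
    MAX_WORKERS = 100        # cap on opportunistic worker jobs (assumption)

    def idle_cores() -> int:
        # Placeholder: a real controller would query the batch system here.
        return random.randint(0, 2000)

    def control_loop() -> None:
        workers = 0
        while True:
            idle = idle_cores()
            if idle > HEADROOM and workers < MAX_WORKERS:
                workers += 1    # submit one more opportunistic worker job
            elif idle < HEADROOM and workers > 0:
                workers -= 1    # drain a worker and return its resources
            print(f"idle cores: {idle}, opportunistic workers: {workers}")
            time.sleep(60)      # re-evaluate once per minute

    control_loop()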
The Scientific Computing and Simulation (SCS) department has recently put the new HPC system bwUniCluster 2.0+GFB-HPC into operation. Folding@Home served as a useful example of a so-called "burn-in" process and of the opportunistic provision of resources that would otherwise temporarily remain unused due to the regular scheduling mechanisms of an HPC system. In addition to the CPU resources, up to 132 energy-efficient NVIDIA Tesla V100 GPU accelerators are included in the calculations.
The teams "KIT-ETP" with GridKa and "KIT-SCS" already occupy ranks among the first 1000 of the more than 250.000 teams of Folding@Home.
In the short term, these resources will also be made available to the WeNMR project. Via the HADDOCK portal operated there, individual research groups can now also use GridKa resources for calculations to combat COVID-19.
 
Contact SCS: Jennifer Schröter
Contact GridKa: Andreas Petzold


Achim Grindler

2020-04-08
News:HPC_ForHLR_Hochleistungsrechner,Newsbites-Kategorie:Dienste,Newsbites-Kategorie:Forschung,Newsbites-Kategorie:HPC,Newsbites-Zielgruppe:IT-Beauftragte,Newsbites-Zielgruppe:Nutzer SCC-Dienste,Newsbites-Zielgruppe:Wissenschaftler
Shutdown of the research high performance computer Phase I (ForHLR I)

In mid-April 2020, the HPC system "Research High-Performance Computer Phase I" (ForHLR I for short) will be shut down after more than five years of operation.

The HPC system "Research High-Performance Computer Phase I" (ForHLR I for short) will be shut down for good on 14 April 2020, after more than five years of operation, and will no longer be available after that date.
The file systems of ForHLR I will remain in operation until 12.05.2020, after which they will also be shut down and their contents deleted. There will be no automatic migration of data to other storage systems; users must migrate their data themselves.
To enable this migration, the login and data mover nodes of the former cluster will also remain in operation until 12.05.2020.
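For the copy itself, standard tools suffice. A transfer of a project directory via one of these nodes could, for example, look as follows; the host name and paths are placeholders, and the actual addresses can be found in the user documentation.

    # Run on a ForHLR I login or data mover node; -a preserves permissions and
    # timestamps, -P shows progress and allows resuming interrupted transfers.
    rsync -aP /project/myproject/ username@<target-system-login-node>:/project/myproject/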
Possible replacement systems are the Forschungshochleistungsrechner II (ForHLR II), the bwUniCluster 2.0+GFB, and the bwForCluster systems operated within the bwHPC initiative.
The SCC asks users to apply in good time for any access rights they may need and, depending on the system, to submit requests for computing time.

2020-03-11