Researchers at the International Laboratory for Supercomputer Atomistic Modeling and Multiscale Analysis at the Higher School of Economics (HSE), the Joint Institute for High Temperatures of the Russian Academy of Sciences and the Moscow Institute of Physics and Technology have compared the performance of popular molecular modeling programs on AMD and Nvidia GPU accelerators. The scientists also ported LAMMPS to HIP, AMD's new open-source GPU technology, for the first time. This evolving technology shows great promise because it allows the same code to run efficiently on both Nvidia accelerators and AMD's new GPUs.
In a paper published in the International Journal of High Performance Computing Applications, the researchers present detailed performance analyses of three molecular modeling programs, LAMMPS, Gromacs and OpenMM, on Nvidia and AMD GPUs with comparable peak performance.
For the tests, the scientists used a model of ApoA1 (apolipoprotein A1), a blood plasma protein that is the main carrier of "good cholesterol". It turned out that the performance of scientific calculations depends not only on the characteristics of the hardware but also on the software environment: inefficient operation of the AMD drivers in complex scenarios where GPU kernels are launched in parallel can introduce significant delays. Open-source solutions still have their drawbacks.
The modified version of LAMMPS developed in this work is published under an open license and is available in the official repository, so users around the world can use it to speed up their calculations.
“We conducted a detailed analysis and comparison of the memory subsystems of GPU accelerators of the Nvidia Volta and AMD Vega20 architectures. I discovered a difference in the logic of parallel execution of GPU kernels and demonstrated it by visualizing program profiles. The bandwidth and latency of the GPU accelerator's internal memory hierarchy and the efficient parallel execution of GPU kernels all have a very large impact on the real performance of GPU programs,” says Vsevolod Nikolsky, one of the authors of the article and an HSE graduate student. According to the authors of the article, this technological race between the titans of modern microelectronics demonstrates a clear trend towards a growing variety of GPU accelerator technologies.
“On the one hand, this is good news for end users: it stimulates competition, increases efficiency and reduces the cost of supercomputers. On the other hand, developing efficient programs for hybrid computing systems will become even more complex, because developers will have to take into account several different types of GPU architectures and programming technologies,” says HSE professor Vladimir Stegailov.
“Even supporting the portability of programs for conventional processors across various architectures (x86, Arm, POWER) is often non-trivial. Portability of programs between different GPU platforms is a much more complex issue. The open-source paradigm removes many barriers and helps developers of large and complex supercomputing programs.”
In 2020, the shortage in the market for graphics accelerators worsened. Their popular uses are well known: cryptocurrency mining and machine learning. However, GPU accelerators are also needed in science for the mathematical modeling of new materials and biological molecules. “Building powerful supercomputers and developing fast and efficient programs means preparing the tools to tackle the world's toughest challenges, such as the COVID-19 pandemic. Computational tools for molecular modeling are used today all over the world to find ways to combat this virus,” says Nikolai Kondratyuk, one of the authors of the article and a research fellow at the Higher School of Economics.
The most important programs for mathematical modeling are developed by international teams of scientists from dozens of organizations. Development is carried out in the open, under free licenses. The competition between the two titans of modern microelectronics, Nvidia and AMD, has led to the emergence of ROCm, AMD's new open-source infrastructure for programming GPU accelerators.
The openness of the platform gives hope that code developed with it will be as portable as possible across supercomputers of various types. AMD's strategy differs from that of Nvidia, whose CUDA technology is a proprietary standard.
The response from the scientific community was not long in coming. Projects to build the largest new supercomputers with AMD GPU accelerators are nearing completion. Construction of the Lumi supercomputer in Finland, with a performance of 0.5 exaflops (equivalent, for example, to the total performance of one and a half million laptops), is in full swing. In the same year, the more powerful Frontier supercomputer (1.5 exaflops) will appear in the United States, followed in 2023 by the even more powerful El Capitan (2 exaflops).
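As a rough check of the laptop comparison, the short Python sketch below works out what performance each laptop would need for the comparison to hold, and how the three quoted systems relate to one another. It uses only the figures given in the article; the function names `per_laptop_gflops` and `ratio_to` are illustrative helpers and do not come from the published work.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the article.
EXA = 1e18   # floating-point operations per second in one exaflops
GIGA = 1e9   # floating-point operations per second in one gigaflops

# Performance figures as quoted in the article, in exaflops.
systems = {"Lumi": 0.5, "Frontier": 1.5, "El Capitan": 2.0}

LAPTOPS = 1.5e6  # "one and a half million laptops", as quoted


def per_laptop_gflops(system_exaflops: float, laptops: float) -> float:
    """Gigaflops each laptop must deliver for the comparison to hold."""
    return system_exaflops * EXA / laptops / GIGA


def ratio_to(baseline: str, name: str) -> float:
    """Quoted performance of `name` relative to `baseline`."""
    return systems[name] / systems[baseline]


if __name__ == "__main__":
    implied = per_laptop_gflops(systems["Lumi"], LAPTOPS)
    print(f"Implied laptop performance: {implied:.0f} gigaflops")
    for name in systems:
        print(f"{name}: {ratio_to('Lumi', name):.0f}x Lumi")
```

On these figures, the comparison implies roughly 333 gigaflops per laptop, and the quoted performance of Frontier and El Capitan is three and four times that of Lumi, respectively.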