摘要: |
Modern HPC software packages are rarely self-contained. They depend on a large number of external libraries, and many spend large fractions of their runtime in external subroutines. Performance portability depends not only on the effort of application teams, but also on the availability of well-tuned libraries. At most sites, the burden of maintaining libraries is shared by code teams and facilities. Facilities typically provide well-tuned default versions, but code teams frequently build with bleeding-edge compilers to achieve high performance. For this reason, HPC has no “standard” software stack, unlike other domains where performance is not critical. Incompatibilities among compilers and software versions force application teams and facility staff to re-build custom versions of libraries for each new toolchain. Because the number of potential configurations is combinatorial, and because HPC software is notoriously difficult to port to new machines [3, 7, 8], the tuning effort required to support and maintain performance-portable libraries outstrips the available manpower at most sites. Software complexity is a growing obstacle to performance portability for HPC. |