************ Introduction ************ This is benchmark documentation for a Department of Energy (DOE) National Nuclear Security Administration (NNSA) Advanced Simulation and Computing (ASC) **Future Computing Resource (FCR)**. Benchmark Overview ================== .. list-table:: * - **Benchmark** - **Description** - **Language** - **Parallelism** - **Libraries** * - AMG2023 - AMG solver of sparse matrices - C - | MPI+CUDA/HIP/SYCL | OpenMP on CPU - Hypre * - Kripke - | Scalable 3D Sn deterministic | particle transport code - C++ - MPI+RAJA - RAJA, CHAI, Camp * - Laghos - | LAGrangian High-Order Solver, | unstructured high-order finite | element compressible gas dynamics - C++ - MPI+RAJA/CUDA/HIP - MFEM, Hypre * - RAJA Performance Suite - | Collection of loop-based computational | kernels found in HPC applications - C++ - | MPI+RAJA | /CUDA/HIP/OpenMP - RAJA * - ScaFFold - | Scale-Free Fractal Benchmark, | Proxy for emerging models such as | programmatic inverse-design projects - Python - | MPI/NCCL/RCCL | CUDA/HIP - PyTorch * - Branson - Implicit Monte Carlo transport - C++ - MPI+CUDA/HIP - N/A * - Sparta - Direct Simulation Monte Carlo - C++ - MPI+Kokkos - Kokkos * - LAMMPS ACE - | Molecular dynamics using | Atomic Cluster Expansion (ACE) - C++ - MPI+Kokkos - Kokkos * - Remhos - | REMap High-Order Solver, unstructured | high-order finite element advection - C++ - MPI+RAJA/CUDA/HIP - MFEM, Hypre * - MiniEM - Electro-Magnetics solver - C++ - MPI+Kokkos - Kokkos * - MLPerf - Llama 3.1 405B training - Python - NCCL+CUDA - NVIDIA NeMo .. _GlobalRunRules: Run Rules Synopsis ================== Source code modification categories: 1. Baseline: “out-of-the-box” performance * Code modifications not permitted * Compiler options can be modified, library substitutions permitted unless prohibited for a specific benchmark (see details on benchmark pages), problem decomposition may be changed * If provided code cannot run on the proposed architecture as-is, limited source code modifications are permitted to port and tune for the target architecture using directives or commonly used interfaces. 2. Optimized: "speed of light" * Aggressive code changes that enhance performance are permitted. Optimizations that will be applicable to mission applications are of more value. * Algorithms fundamental to the program may not be replaced. Wholesale algorithm changes or manual rewriting of loops that become strongly architecture specific are of less value. * The modified code must still pass validation tests. * Optimizations will be reviewed by subject matter experts for applicability to the larger application portfolio and other goals such as performance portability and programmer productivity. Approvals ========= - Benchmarks is released under the Creative Commons Attribution 4.0 International Public License. For more details, see the https://github.com/LLNL/benchmarks/blob/develop/LICENSE and https://github.com/LLNL/benchmarks/blob/develop/NOTICE files. SPDX-License-Identifier: CC-BY-4.0. LLNL-DATA-2007856. - Content from Sandia National Laboratories considered unclassified with unlimited distribution under SAND2023-12176O, SAND2023-01069O, and SAND2023-01070O.