..
    Copyright 2023 Lawrence Livermore National Security, LLC and other
    Benchpark Project Developers. See the top-level COPYRIGHT file for details.

    SPDX-License-Identifier: Apache-2.0

=======================
Tutorial: Benchpark 101
=======================

This tutorial will guide you through using Benchpark to run a strong scaling experiment with the `Kripke benchmark `_ on an AWS instance.
It was presented at the `International Symposium on High-Performance Parallel and Distributed Computing (HPDC) `_ on July 20, 2025.
The event was a half-day tutorial along with Caliper and Thicket.

.. image:: tutorial/ReproduciblePerfAnalysis-HPDC25-Tutorial-Slide-Preview.jpg
   :target: _static/slides/ReproduciblePerfAnalysis-HPDC25-Tutorial-Slides.pdf
   :height: 72px
   :align: left
   :alt: Slide Preview

:download:`Download Slides <_static/slides/ReproduciblePerfAnalysis-HPDC25-Tutorial-Slides.pdf>`.

**Full citation:** Pearce, O., Scott, A., Becker, G., Haque, R., Hanford, N., Brink, S., Jacobsen, D., Poxon, H., Domke, J., & Gamblin, T. (2023, November 12–17). Towards Collaborative Continuous Benchmarking for HPC.

By the end of this tutorial, you will be able to use Benchpark to:

* Initialize a system configuration and experiment configuration
* Build and run a scaling experiment
* Perform pre-defined performance analysis on the results of the scaling experiment

**Prerequisites**

* Access to a terminal with Benchpark installed (provided automatically by the infrastructure in our `Benchpark Tutorial repository `_)
* Basic familiarity with command-line interfaces

--------------------------------------
Step 1: Verify Benchpark Installation
--------------------------------------

First, ensure Benchpark is installed and working correctly by running:

.. code-block:: bash

    benchpark --version

You should see a version number like ``0.1.0``.

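
If you script this check, for example before running the remaining steps unattended, a small guard can stop early when a tool is missing. The sketch below is a hypothetical helper (``require_cmd`` is not part of Benchpark) that relies only on the shell built-in ``command -v``:

.. code-block:: bash

    # Hypothetical helper, not part of Benchpark: report whether a command is on PATH.
    require_cmd() {
        if command -v "$1" >/dev/null 2>&1; then
            echo "found: $1"
        else
            echo "missing: $1" >&2
            return 1
        fi
    }

    # Check the tool used throughout this tutorial.
    require_cmd benchpark || echo "Benchpark is not on PATH; revisit the prerequisites."

The same helper could be called for ``ramble`` or ``flux`` once later steps have made those tools available.
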
----------------------------------------------------
Step 2: Explore Available Benchmarks and Experiments
----------------------------------------------------

Next, list all available benchmarks and experiments in Benchpark by running:

.. code-block:: bash

    benchpark list experiments

You should see output like:

.. code-block:: text

    Experiments:
        ...
        hpcg+single_node
        hpcg+openmp
        hpcg+strong
        hpcg+weak
        hpl+single_node
        hpl+openmp
        hpl+strong
        hpl+weak
        ior+single_node
        ior+strong
        ior+weak
        kripke+single_node
        kripke+openmp
        kripke+cuda
        kripke+rocm
        laghos+single_node
        laghos+cuda
        laghos+rocm
        lammps+single_node
        lammps+openmp
        lammps+cuda
        lammps+rocm
        lammps+strong
        ...

From this output, you can see that Benchpark experiments are specified using Spack-like conventions (e.g., ~, +).
For example, the spec ``kripke+single_node`` describes an experiment using the Kripke benchmark running on a single node.

Additionally, you can get only the experiments associated with a particular benchmark by adding :code:`--experiment ` to the above command.
For example, to get only the experiments associated with Kripke, run:

.. code-block:: bash

    benchpark list experiments --experiment kripke

You should see the following:

.. code-block:: text

    Experiments:
        kripke+single_node
        kripke+openmp
        kripke+cuda
        kripke+rocm

.. note::

    For Kripke, the default experiment is ``kripke+single_node``.
    The strong scaling experiment is selected on the command line with ``kripke scaling=strong``, as shown in Step 4.

.. _step3_label:

------------------------------------------
Step 3: Initialize Your System Description
------------------------------------------

Next, initialize the description of the AWS system by running the commands below:

.. code-block:: bash

    cd benchpark
    benchpark system init --dest=hpdc-tutorial aws-tutorial instance_type=c7i.12xlarge

The :code:`benchpark system init` command generates configuration files that describe the system on which you are running.
The system is specified in a system specification (``system.py``).
In the command above, the spec (i.e., :code:`aws-tutorial instance_type=c7i.12xlarge`) defines a system running with `our tutorial infrastructure on AWS `_ that uses the `c7i.12xlarge instance type `_.

After running the command above, you should see the following files in the ``hpdc-tutorial`` directory:

* ``system_id.yaml``: a Benchpark configuration file that contains high-level metadata about the system
* ``software.yaml``: a Ramble configuration file specifying the default packages to use for software like compilers and MPI
* ``variables.yaml``: a Ramble configuration file defining variables that are needed for job script generation and scheduling (e.g., type of scheduler, number of cores per node)
* ``auxiliary_software_files/compilers.yaml``: a Spack configuration file defining available compilers on the system
* ``auxiliary_software_files/packages.yaml``: a Spack configuration file defining available software on the system

----------------------------------
Step 4: Initialize Your Experiment
----------------------------------

Next, initialize the Kripke strong scaling experiment used in this tutorial by running:

.. code-block:: bash

    benchpark experiment init --dest=kripke-benchmark kripke scaling=strong caliper=time,mpi

Similar to :code:`benchpark system init`, the :code:`benchpark experiment init` command generates the Ramble configuration file to describe the experiment to be run.
The experiment is specified in an experiment specification (``experiment.py``).
In the command above, the spec (i.e., :code:`kripke scaling=strong caliper=time,mpi`) defines a strong-scaling experiment running Kripke with the performance measurement tool known as `Caliper `_ enabled to collect performance metrics.
The ``caliper=time,mpi`` specification enables execution time measurement and MPI library instrumentation.

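
To make the spec syntax concrete, the sketch below splits the experiment spec into its benchmark name and ``key=value`` variants. It is a hypothetical illustration in plain shell: ``describe_spec`` is not a Benchpark command, and Benchpark's own parsing may differ.

.. code-block:: bash

    # Illustrative only: this is not Benchpark's parser.
    describe_spec() {
        local name="$1"
        shift
        echo "benchmark: $name"
        local variant
        for variant in "$@"; do
            # Split each key=value token at the first '='.
            echo "variant: ${variant%%=*} -> ${variant#*=}"
        done
    }

    describe_spec kripke scaling=strong caliper=time,mpi

For the tutorial spec, this prints the benchmark ``kripke`` followed by the variants ``scaling -> strong`` and ``caliper -> time,mpi``. The helper is only a reading aid; :code:`benchpark experiment init` remains the only command to run in this step.
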
After running the command above, you should see a Ramble configuration file (``ramble.yaml``) in the ``kripke-benchmark`` directory.

---------------------------------------
Step 5: Set Up Your Benchpark Workspace
---------------------------------------

After initializing the system description and experiment, set up a Benchpark workspace by running:

.. code-block:: bash

    benchpark setup kripke-benchmark/ hpdc-tutorial/ wkp/

This command takes the configuration files stored in the output directories of :code:`benchpark experiment init` (i.e., ``kripke-benchmark/``) and :code:`benchpark system init` (i.e., ``hpdc-tutorial/``) and combines them to generate a Benchpark workspace.
A Benchpark workspace contains everything that Benchpark, Ramble, and Spack need to build and run your experiment, including:

* Clones of Spack and Ramble
* A ``setup.sh`` script that calls Spack and Ramble's setup scripts
* A Ramble workspace

To start using your Benchpark workspace, run:

.. code-block:: bash

    . /home/jovyan/benchpark/wkp/setup.sh

.. _step6_label:

-----------------------------------------------------------------
Step 6: Build Software Dependencies and Generate Experiment Files
-----------------------------------------------------------------

Next, build any necessary software and generate all necessary files for the Kripke scaling experiment by running:

.. code-block:: bash

    ramble --disable-progress-bar \
        --workspace-dir /home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace \
        workspace setup

This command does two things. First, it builds all necessary software using Spack.
Building the software may take a while to complete, depending on how many external packages are contained in the system definition from :ref:`Step 3 <step3_label>`.
For this tutorial, it should take roughly 2 minutes.

Second, this command generates batch scripts (e.g., submission scripts) for executing the experiment.
For each run in the experiment, a directory containing the files necessary for the run will be created under ``/home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace/experiments/kripke/kripke``.

If the command is successful, you should see something like:

.. code-block:: text

    ==> Streaming details to log:
    ==> /home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace/logs/setup.2025-07-09_18.08.23.out
    ==> Setting up 4 out of 4 experiments:
    ==> Experiment #1 (1/4):
    ==> name: kripke.kripke.kripke_kripke_single_node_strong_scaling_caliper_time_mpi_2_2_1_64_64_32_64_1_128_128_4_4
    ==> root experiment_index: 1
    ==> log file: /home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace/logs/setup.2025-07-09_18.08.23/kripke.kripke.kripke_kripke_single_node_strong_scaling_caliper_time_mpi_2_2_1_64_64_32_64_1_128_128_4_4.out
    ==> Returning to log file: /home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace/logs/setup.2025-07-09_18.08.23.out
    ==> Experiment #2 (2/4):
    ==> name: kripke.kripke.kripke_kripke_single_node_strong_scaling_caliper_time_mpi_2_2_2_64_64_32_64_1_128_128_4_8
    ==> root experiment_index: 2
    ==> log file: /home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace/logs/setup.2025-07-09_18.08.23/kripke.kripke.kripke_kripke_single_node_strong_scaling_caliper_time_mpi_2_2_2_64_64_32_64_1_128_128_4_8.out
    ==> Returning to log file: /home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace/logs/setup.2025-07-09_18.08.23.out
    ==> Experiment #3 (3/4):
    ==> name: kripke.kripke.kripke_kripke_single_node_strong_scaling_caliper_time_mpi_4_2_2_64_64_32_64_1_128_128_4_16
    ==> root experiment_index: 3
    ==> log file: /home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace/logs/setup.2025-07-09_18.08.23/kripke.kripke.kripke_kripke_single_node_strong_scaling_caliper_time_mpi_4_2_2_64_64_32_64_1_128_128_4_16.out
    ==> Returning to log file: /home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace/logs/setup.2025-07-09_18.08.23.out
    ==> Experiment #4 (4/4):
    ==> name: kripke.kripke.kripke_kripke_single_node_strong_scaling_caliper_time_mpi_4_4_2_64_64_32_64_1_128_128_4_32
    ==> root experiment_index: 4
    ==> log file: /home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace/logs/setup.2025-07-09_18.08.23/kripke.kripke.kripke_kripke_single_node_strong_scaling_caliper_time_mpi_4_4_2_64_64_32_64_1_128_128_4_32.out
    ==> Returning to log file: /home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace/logs/setup.2025-07-09_18.08.23.out

------------------------------------------
Step 7: Run Kripke Experiment using Ramble
------------------------------------------

Next, run the Kripke strong scaling experiment with the following command:

.. code-block:: bash

    ramble --disable-progress-bar \
        --workspace-dir /home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace \
        on

This command submits the batch scripts (e.g., submission scripts) generated in :ref:`Step 6 <step6_label>` to the system's resource manager (which is specified in the files generated by :code:`benchpark system init`).
For the AWS infrastructure used in this tutorial, the resource manager is LLNL's `Flux resource manager `_.

If the above command is successful, you should see something like:

.. code-block:: text

    ==> Streaming details to log:
    ==> /home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace/logs/execute.2025-07-09_18.14.08.out
    ==> Executing 4 out of 4 experiments:
    ==> Log files for experiments are stored in: /home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace/logs/execute.2025-07-09_18.14.08
    ==> Running executors...
    ƒV54uD5o5
    ƒV57fKkEK
    ƒV5ANUS6s
    ƒV5D498h5

The final lines printed by the :code:`ramble on` command are the job IDs produced by the system resource manager.
You can use these IDs to track the progress of the jobs in your experiment.
For example, with Flux, you can see job status by running:

.. code-block:: bash

    flux jobs -a

This command will produce output like:

.. image:: ./flux_jobs_a_output.png
   :alt: Example output of flux jobs -a
   :width: 750px
   :align: center

.. note::

    If you are running on our `AWS infrastructure `_, it should take roughly 8 minutes for all jobs to finish running.
    Additionally, only one job will run at a time under our infrastructure because each user only has 1 node.
    If you are running on an HPC system, expect the jobs to complete faster.

After all the jobs are finished, each job directory (i.e., subdirectories of ``/home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace/experiments/kripke/kripke``) will contain a Caliper output file (i.e., a ``.cali`` file) containing performance data for the job.

------------------------
Step 8: Analyze Results
------------------------

Finally, we perform pre-defined analysis on the Caliper files generated by the scaling study we defined in the experiment.
The ``benchpark analyze`` command uses the `Thicket `_ performance analysis tool to compose the Caliper performance profiles and visualize the scaling performance.
We focus on the application-level function calls, specifying the ``--no-mpi`` flag to hide MPI function calls in the resulting graph.
We also specify ``--chart-fontsize`` to increase the overall font size in the resulting graph, which improves readability in presentations.

.. code-block:: bash

    benchpark analyze \
        --workspace-dir /home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace \
        --no-mpi \
        --chart-fontsize 15

The command above reads in the Caliper files generated by the experiment and outputs several files, such as the stacked area chart and Caliper calling context tree shown below.
These files can be found in ``/home/jovyan/benchpark/wkp/kripke-benchmark/hpdc-tutorial/workspace/analyze``.

.. image:: ./graph-and-tree.png
   :width: 900px
   :align: center

----------
Next Steps
----------

Now that you know how to initialize, run, and analyze the performance of an experiment, check out our :doc:`Benchpark Workflow <./benchpark-workflow>` page for more information on how to interact with Benchpark.
We have guides for users wanting to add or modify a system, add a new benchmark, or define new experiment parameters for a benchmark.
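
For reference, the commands from Steps 3 through 7 can be collected into one script. The sketch below is a hypothetical convenience wrapper, not something Benchpark ships: the ``run`` helper and the ``DRY_RUN`` variable are assumptions made here. It assumes the paths used in this tutorial and, by default, only prints each command so you can review the sequence before executing anything.

.. code-block:: bash

    #!/usr/bin/env bash
    # Hypothetical recap script: prints the tutorial's commands in order.
    # Set DRY_RUN=0 to execute them instead (assumes the tutorial's paths).
    DRY_RUN="${DRY_RUN:-1}"
    WKP=/home/jovyan/benchpark/wkp
    WORKSPACE="$WKP/kripke-benchmark/hpdc-tutorial/workspace"

    # Print a command; execute it only when DRY_RUN is not 1.
    run() {
        echo "+ $*"
        if [ "$DRY_RUN" != "1" ]; then
            "$@" || exit 1
        fi
    }

    run cd benchpark
    run benchpark system init --dest=hpdc-tutorial aws-tutorial instance_type=c7i.12xlarge
    run benchpark experiment init --dest=kripke-benchmark kripke scaling=strong caliper=time,mpi
    run benchpark setup kripke-benchmark/ hpdc-tutorial/ wkp/

    # Source the workspace setup script at top level rather than inside run().
    echo "+ . $WKP/setup.sh"
    if [ "$DRY_RUN" != "1" ]; then
        . "$WKP/setup.sh"
    fi

    run ramble --disable-progress-bar --workspace-dir "$WORKSPACE" workspace setup
    run ramble --disable-progress-bar --workspace-dir "$WORKSPACE" on

The script stops after submitting the jobs. Once ``flux jobs -a`` shows that they have finished, run the ``benchpark analyze`` command from Step 8 separately.
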