Under a new memorandum of understanding, researchers at LLNL, IBM, and Red Hat will aim to enable next-generation workloads by integrating LLNL’s open source Flux scheduling framework with Red Hat OpenShift to allow more traditional HPC jobs to take advantage of cloud and container technologies. “Cloud systems are increasingly setting the directions of the broader computing ecosystem, and economics are a primary driver,” said Bronis de Supinski, CTO of Livermore Computing at LLNL. “With the growing prevalence of cloud-based systems, we must align our HPC strategy with cloud technologies, particularly in terms of their software environments, to ensure the long-term sustainability and affordability of our mission-critical HPC systems.” Read more at LLNL News.
Called to a Valuable Function, Stephanie Brink Streamlines the Lab’s Code
April 27, 2021 (profile)
LLNL Computing relies on engineers like Stephanie Brink to keep the legacy codes running smoothly. “You’re only as fast as your slowest processor or your slowest function,” says Stephanie, who works in the Center for Applied Scientific Computing. By analyzing a legacy code’s performance, Stephanie and her team can reduce the amount of time it takes to run and allow for more critical science to be accomplished. Stephanie is a frequent contributor to open source software, including Hatchet and Variorum. Read the full profile at LLNL Computing.
SAMRAI 4.1.0 Released
April 24, 2021 (release)
SAMRAI (Structured Adaptive Mesh Refinement Application Infrastructure) is an object-oriented C++ software library that enables exploration of numerical, algorithmic, parallel computing, and software issues associated with applying structured adaptive mesh refinement (SAMR) technology in large-scale parallel application development. The current release features a new alias tbox::ResourceAllocator to clean up the API for usage of Umpire allocators in pdat classes. This provides a valid type name that can be used and passed through the pdat classes’ APIs regardless of the status of the configuration.
The Exascale Computing Project, a joint effort between the DOE Office of Science and NNSA, brings together several national laboratories to address many hardware, software, and application challenges inherent in the organizations’ scientific and national security missions. The ECP’s annual meeting was held virtually this year on April 12-16. Several sessions are available in a YouTube playlist. LLNL’s highlights feature open source projects that are crucial to the ECP’s collaborative goals:
Spack BoF (runtime 1:00:40): This “birds of a feather” gathering details major developments in Spack releases, collaborative work with the E4S team, roadmap for future development, and results from a community survey.
Using Spack to Accelerate Developer Workflows (runtime 6:14:42): This tutorial focuses on developer workflows, covering installation, package authorship, Spack’s dependency model, and Spack environments and configuration. Participants can learn new skills in this tutorial, even if they have participated in Spack tutorials in the past.
Characterizing Performance Improvements in the Center for Efficient Exascale Discretizations (runtime 1:00:04, CEED section begins at 25:05): Speakers from ECP Application Development areas talked about how they set figures of merit, determined key performance parameters, and calculated efficiency of codes. CEED is a co-design center led by LLNL and focusing on discretization algorithms that better exploit the hardware and deliver a significant performance gain over conventional low-order methods. The video concludes with a panel discussion with the speakers.
S&TR Features Software for the Exascale Era
April 21, 2021 (story)
The latest issue of LLNL’s Science & Technology Review magazine showcases Computing in the cover story (see abstract below) and Commentary. Open source software plays a prominent role in the initiatives described in the story. The cover art shows an advection simulation powered by open source repos MFEM and GLVis.
Cover story: The Exascale Software Portfolio by Holly Auten and featuring Lori Diachin, Rob Neely, Jeff Hittinger, Ulrike Meier Yang, David Beckingsale, and Tzanio Kolev
Abstract: As a leader in high-performance computing, Lawrence Livermore wields a large portion of the Department of Energy’s HPC resources to advance national security and foundational science. The Sierra supercomputer supports the National Nuclear Security Administration’s Stockpile Stewardship Program by enabling more accurate, more predictive simulations. This generation of computers is known as heterogeneous, or hybrid, because their architectures combine graphics processing units and central processing units to achieve peak performance well above 100 petaflops. (A petaflop is 10^15 floating-point operations per second.) The next generation’s processing capability—at least an exaflop (10^18 flops)—will be many times greater. HPC software must adjust to these new hardware standards. As the exascale era begins, two major initiatives leverage and expand Livermore’s HPC capabilities, with a spotlight here on software. The Exascale Computing Project, a joint effort between the DOE Office of Science and NNSA, brings together several national laboratories to address many hardware, software, and application challenges inherent in the organizations’ scientific and national security missions. Within the Laboratory, the RADIUSS project aims to benefit scientific applications through a robust software infrastructure.
LLNL's Spring Hackathon Coming Up
April 20, 2021 (event)
Held since 2012, LLNL’s hackathons are 24-hour opportunities to brainstorm, foster creativity, prototype, and explore. Participants work in groups or individually and often strive to learn new skills, programming languages, and tools in service to LLNL’s missions. Like the hackathons of the past year, the spring event (April 29-30) will be held virtually using WebEx and Mattermost for collaboration. LLNL sponsors are two Computing divisions: Enterprise Applications Services and National Ignition Facility Computing.
New Web and Jupyter Versions of GLVis
April 20, 2021 (release)
GLVis is a lightweight OpenGL tool for accurate and flexible finite element visualization. It’s been recently upgraded for wider usability:
The full feature set is available on Linux, Mac, and Windows.
In Jupyter, installation is as easy as pip install glvis to use the full power of GLVis in an interactive Python notebook.
SCR 3.0 Release Candidate
April 16, 2021 (release)
The Scalable Checkpoint/Restart (SCR) library enables MPI applications to utilize distributed storage on Linux clusters to attain high file I/O bandwidth for checkpointing, restarting, and writing large datasets. The 3.0 release candidate includes:
improved support for large datasets and shared access to files
new API calls
new logging options
assists for application developers when integrating the SCR API
MTtime (Time Domain Moment Tensor Inversion in Python) is a Python package developed for time domain inversion of complete seismic waveform data to obtain the seismic moment tensor. It supports deviatoric and full moment tensor inversions, and 1-D and 3-D basis Green’s functions. Documentation and a working example are provided.
Charliecloud 0.23 Released
April 16, 2021 (release)
Led by LANL with LLNL contributors, Charliecloud provides user-defined software stacks for HPC centers. It uses Linux user namespaces to run containers with no privileged operations or daemons and minimal configuration changes on center resources. Version 0.23 includes new functionality for ch-image, ch-image build, and ch-image push.
New Project Aims to Solve the Software Complexity Puzzle
April 14, 2021 (story)
The HPC world is full of complexity—from applications to the software components they rely on and the hardware they need to run. With the first three exascale machines, including Livermore’s El Capitan, slated to come online in the next few years, addressing complexity challenges will be a heavier, more urgent lift. Like our current Sierra system, these exascale systems will derive most of their computational power from secondary accelerator processors called GPUs. Traditionally, HPC systems have used only CPUs. With these machines, developers will need to accommodate not just NVIDIA accelerators but also new offerings from AMD and Intel. Harnessing the power of these devices entails using rapidly evolving programming environments, which require new compilers, runtime libraries, and software packages whose relationships are not always well understood. Without automated approaches to integration, developers will fight these software stacks by hand—but manual integration and maintenance are unsustainable.
A new effort kicking off in fiscal year 2021 aims to develop a machine-verifiable model of package compatibility that will enable automated integration, reducing human labor and errors. The Binary Understanding and Integration Logic for Dependencies (BUILD) project will run for three years with computer scientist Todd Gamblin at the helm. He states, “This project will develop techniques that enable rapid integration of HPC software systems, especially for upcoming exascale machines.” The project will build on Spack—the widely adopted package manager with a repository of more than 5,000 packages. Created by Gamblin in 2013 and today supported by a core development team, Spack already incorporates package configuration capabilities with dependency solving techniques.
LLNL’s Rob Falgout Named to 2021 Class of SIAM Fellows
April 09, 2021 (profile) (story)
The Society for Industrial and Applied Mathematics (SIAM) has announced its 2021 Class of Fellows, including LLNL computational mathematician Rob Falgout. Falgout is best known for his development of multigrid methods and for Hypre, one of the world’s most popular parallel multigrid codes. LLNL News has the full story about this honor.
BLT 0.4.0 Released
April 09, 2021 (release)
BLT (Building, Linking, and Testing) is a streamlined CMake build system foundation for developing HPC software. BLT makes it easy to get up and running on a wide range of HPC compilers, operating systems, and technologies. The repo includes unit testing and benchmarking. The v0.4.0 release includes added variables, support for clang-tidy static analysis check, user option for enforcing specific versions of autoformatters, new macros, and much more.
PairScore is a preliminary code to predict binding affinity from the pairwise distances between protein and ligand atoms. The repo contains an example.
CCT 1.0.11 Released
April 02, 2021 (release)
The Coda Calibration Tool (CCT) calculates reliable moment magnitudes for small- to moderate-sized seismic events. This release includes updates to the measurement Mw fitting algorithm, autopicking, constraints (e.g., maximum stress changed from 10 to 100 MPa), and the spectra truncation feature.
The Center for Efficient Exascale Discretizations (CEED) within the US Department of Energy’s ECP is helping applications leverage future architectures by developing state-of-the-art discretization algorithms that better exploit the hardware and deliver a significant performance gain over conventional methods. libCEED is a high-order API library that provides a common algebraic low-level operator description, allowing a wide variety of applications to take advantage of the efficient operator evaluation algorithms in the different CEED packages. This major release includes support for matrix assembly (mainly intended for low order and coarse grids), new HIP and MAGMA backends with kernel fusion, Julia and Rust interfaces, and more.
HiOp is a solver for certain mathematical optimization problems expressed as nonlinear programming problems. This lightweight HPC solver leverages an application’s existing data parallelism to parallelize the optimization iterations by using specialized linear algebra kernels. This version centers on a sparse optimization solver and enhanced support for device computations, including:
Development of a sparse NLP solver and associated sparse NLP interface
Update of the mixed dense-sparse NLP solver to support full GPU compute mode
UnifyFS is a user-level burst buffer file system under active development. The repo supports scalable and efficient aggregation of I/O bandwidth from burst buffers while having the same life cycle as a batch-submitted job. Version 0.9.2 includes updates for newer versions of dependencies, support for setting cores-per-server via environ, config option changes, new unit and CI tests, and more.
From HPC Tech Shorts, this video (25:09) shows Amazon Web Services team members discussing the NoTearsHPC cluster solution for 1-click launches. Evan Bollig and Sean Smith talk about how the cluster works, what it provides, and how to do complicated tasks quickly. They used Spack for installation.
Kosh 1.2 Released
March 24, 2021 (release)
Kosh allows codes to store, query, and share data via an easy-to-use Python API. Kosh lies on top of Sina and, as a result, can use any database backend supported by Sina. This software aims to make data access and sharing as simple as possible. In this backwards-compatible release:
Operators are introduced allowing the composition of features from one or many sources.
Feature selection without extraction is now possible via the new execution graphs introduced in this release.
Execution graphs are the recommended way to use Kosh going forward, as reflected in the updated notebooks.
STAT, the Stack Trace Analysis Tool, is a highly scalable, lightweight tool that gathers and merges stack traces from all of the processes of a parallel application to form call graph prefix trees. STAT generates two prefix trees termed 2D-trace-space and 3D-trace-space-time. STAT’s source code also includes STATBench, a tool to emulate STAT. STATBench enables the benchmarking of STAT on arbitrary machine architectures and applications by fully utilizing parallel resources and generating artificial stack traces. Version 4.1.0 adds DynInst 10.2 support and initial rocgdb support.
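The idea of merging per-process stack traces into a call graph prefix tree can be illustrated with a minimal Python sketch. This is not STAT's implementation or API: the function names, the sample frame names, and the rank-to-trace dictionary are all assumptions made for illustration, and the sketch covers only a single snapshot per process.

```python
def merge_stack_traces(traces):
    """Merge stack traces into a prefix tree keyed by call path.

    traces: dict mapping a process rank to its list of frame names,
            outermost frame first (hypothetical input format).
    Returns a nested dict: frame -> {"ranks": [...], "children": {...}}.
    """
    root = {}
    for rank, frames in sorted(traces.items()):
        level = root
        for frame in frames:
            # Processes that share a call-path prefix share the same node.
            node = level.setdefault(frame, {"ranks": [], "children": {}})
            node["ranks"].append(rank)
            level = node["children"]
    return root


def format_tree(tree, depth=0):
    """Render the prefix tree as indented lines, one node per line."""
    lines = []
    for frame, node in tree.items():
        lines.append("  " * depth + f"{frame} {node['ranks']}")
        lines.extend(format_tree(node["children"], depth + 1))
    return lines


# Illustrative traces from four processes (made-up frame names).
traces = {
    0: ["main", "solve", "MPI_Waitall"],
    1: ["main", "solve", "MPI_Waitall"],
    2: ["main", "solve", "compute_flux"],
    3: ["main", "io_write"],
}
tree = merge_stack_traces(traces)
print("\n".join(format_tree(tree)))
```

Grouping ranks by shared call path makes outliers easy to spot: in this toy input, rank 3 is the only process outside solve.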
LLNL’s computing website recently underwent a major overhaul to its design and information architecture. The site now features a taxonomy of Focus Areas that connect related content. These topics are tagged on News, People Highlights, and Projects. One of the topics is open source software. The site’s Livermore Computing page also directs users to this website for more information about open source projects.
Novel Deep Learning Framework Includes New Repo
March 18, 2021 (new-repo) (story)
LLNL computer scientists have developed a new framework and an accompanying visualization tool that leverages deep reinforcement learning for symbolic regression problems, outperforming baseline methods on benchmark problems. The paper was recently accepted as an oral presentation at the International Conference on Learning Representations (ICLR 2021), one of the top machine learning conferences in the world. The conference takes place virtually May 3-7, and the team’s deep symbolic regression code is available in a GitHub repo.
Aluminum 1.0.0 Released
March 06, 2021 (release)
Aluminum provides a generic interface to high-performance communication libraries with a focus on allreduce algorithms. Blocking and non-blocking algorithms and GPU-aware algorithms are supported. Aluminum also contains custom implementations of select algorithms to optimize for certain situations. This, the v1.0.0 release, is stable and includes refactored communicators, a barrier operation, and support for vector collectives.
Variorum is a platform-agnostic library exposing monitor and control interfaces for several features in hardware architectures. Its general interfaces provide privileged functionality for monitoring and controlling various hardware-level features of multiple hardware architectures. The latest release includes:
Juqbox.jl is a package for solving quantum optimal control problems in closed quantum systems, where the evolution of the state vector is governed by Schrodinger’s equation. See the project’s README for installation instructions and workflow details. A few examples are also provided.
PolyClipper 1.1 Released
February 23, 2021 (release)
PolyClipper is a C++ reimplementation of the geometric clipping operations in the R3D library originally written by Devon Powell, as documented in the paper Powell & Abell (2015). This release transforms PolyClipper to a header-only C++ library.
Saloon is a Vim plugin that simplifies Python code linter/fixer configuration and usage. Saloon’s menu lets developers toggle which static analysis tools to use and delegates those changes to ALE’s API. Since prospector already handles multiple tools, and is integrated with ALE, most of the actual linting will initially be handled via prospector calls. See the project’s README for information about how to get started using Saloon.
MFEM GPU Tips & Tricks
February 17, 2021 (story)
The MFEM team has created a helpful page of tips and tricks that explain how to make the most of GPUs when running finite element algorithms. This support documentation includes information about optimizing porting and performance. Learn more about these features and processes:
MFEM’s internal memory manager to simplify the use of host/device memory
MFEM_FORALL macro to enable performance portability
Maximizing the main memory bandwidth
Profiling on NVIDIA GPUs to improve the performance of a memory bound kernel
Roofline model for predicting the peak performance achievable by a specific algorithm
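As a rough illustration of the roofline model named in the last item, the attainable performance of a kernel is commonly estimated as the smaller of the peak compute rate and the memory bandwidth multiplied by the kernel's arithmetic intensity (floating-point operations per byte moved). The sketch below is not taken from the MFEM page; the device numbers are hypothetical placeholders, not measurements of any real GPU.

```python
def arithmetic_intensity(flops, bytes_moved):
    """Floating-point operations performed per byte of memory traffic."""
    return flops / bytes_moved


def roofline_bound(peak_gflops, bandwidth_gb_per_s, intensity):
    """Upper bound on attainable GFLOP/s under the roofline model."""
    return min(peak_gflops, bandwidth_gb_per_s * intensity)


# Hypothetical device characteristics (assumed for illustration only).
PEAK_GFLOPS = 7000.0
BANDWIDTH_GB_PER_S = 900.0

# Double-precision y = a*x + y: 2 flops per element and 24 bytes moved
# (read x, read y, write y at 8 bytes each).
axpy_intensity = arithmetic_intensity(flops=2, bytes_moved=24)
axpy_bound = roofline_bound(PEAK_GFLOPS, BANDWIDTH_GB_PER_S, axpy_intensity)

# A kernel with a high intensity of 25 flops per byte hits the compute roof.
dense_bound = roofline_bound(PEAK_GFLOPS, BANDWIDTH_GB_PER_S, 25.0)

# Intensity at which the two roofs meet; kernels below it are memory bound.
ridge_point = PEAK_GFLOPS / BANDWIDTH_GB_PER_S

print(f"axpy bound:  {axpy_bound:.1f} GFLOP/s (memory bound)")
print(f"dense bound: {dense_bound:.1f} GFLOP/s (compute bound)")
print(f"ridge point: {ridge_point:.2f} flops/byte")
```

For a memory-bound kernel such as the vector update, the bound depends only on bandwidth, which is why maximizing main memory bandwidth matters more than raw compute in that regime.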
Held since 2012, LLNL’s hackathons are 24-hour opportunities to brainstorm, foster creativity, prototype, and explore. Participants work in groups or individually and often strive to learn new skills, programming languages, and tools in service to LLNL’s missions. Like the spring and summer hackathons of 2020, this year’s winter event (February 11-12) was held virtually using WebEx and Mattermost for collaboration. With LLNL’s Data Science Institute (DSI) sponsoring the hackathon, the agenda included guest speakers (below) discussing data science topics relevant to the Lab’s missions as well as a deep learning tutorial. (Participants were not required to attend the talks.) Read the recap on the DSI website.
Brian Van Essen: COVID-19 Rapid Drug Discovery
Jose Cadena Pico: Modeling the Temporal Network Dynamics of Neuronal Cultures
Benjamin Priest: Querying Massive Graphs with Sketching Algorithms
Kelli Humbird: Data-Driven Design for Inertial Confinement Fusion
Cindy Gonzales and Luke Jaffe: Introduction to Deep Learning for Image Classification
Conduit 0.7.0 Released
February 08, 2021 (release)
Conduit provides an intuitive model for describing hierarchical scientific data in C++, C, Fortran, and Python. It is used for data coupling between packages in-core, serialization, and I/O tasks.
Led by LANL with LLNL contributors, Charliecloud provides user-defined software stacks for HPC centers. It uses Linux user namespaces to run containers with no privileged operations or daemons and minimal configuration changes on center resources. Version 0.22 includes enhancements to ch-image pull and ch-image build and more.
mpibind is a memory-driven algorithm to map parallel hybrid applications to the underlying heterogeneous hardware resources transparently, efficiently, and portably. Unlike other mappings, its primary design point is the memory system, including the cache hierarchy. Compute elements are selected based on a memory mapping and not vice versa. Several new incremental versions have been released simultaneously:
Aluminum provides a generic interface to high-performance communication libraries with a focus on allreduce algorithms. Blocking and non-blocking algorithms and GPU-aware algorithms are supported. Aluminum also contains custom implementations of select algorithms to optimize for certain situations. In this release, the testing and benchmarking infrastructure has been rewritten to be significantly more comprehensive and cleaner. The repo also now includes scripts for nicely plotting benchmark results.
SUNDIALS is a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. This release includes a new NVECTOR implementation based on the SYCL abstraction layer and a new SUNMatrix and SUNLinearSolver implementation.
Award-Winning Computer Vision Research Includes New Repo
January 08, 2021 (new-repo) (story)
The 2021 IEEE Winter Conference on Applications of Computer Vision (WACV 2021) announced that a paper co-authored by Rushil Anirudh received the conference’s Best Paper Honorable Mention award based on its potential impact to the field. The paper, titled “Generative Patch Priors for Practical Compressive Image Recovery,” introduces a new kind of prior—a characterization of the space of natural images—for compressive image recovery that is trained on patches of images instead of full-sized images. Unlike existing generative methods that are applicable only to images similar to the training dataset—i.e., similar kinds of objects, image sizes or aspect ratios—the generative patch prior (GPP) can recover a wide variety of natural images and compares favorably to other existing methods, researchers said. Anirudh presented the paper on behalf of the group during an awards session hosted by the virtual conference, the premier event of its kind in the world. The conference received about 1,100 paper submissions—only 5 were honored with awards. The code used in the paper is available on the open source repository GPP on GitHub.
New Repo: SPOT Suite
January 07, 2021 (new-repo)
SPOT is a web-based visualization tool for performance data. One use case is to link the Caliper performance library into an application, and every run of the application will produce a .cali performance data file. SPOT will then visualize the collective performance of an application across many runs. This could involve tracking performance changes over time, comparing the performance achieved by different users, running scaling studies across time, and so on.
We now have three repos that provide different functionality around SPOT:
To get started, fork spot2_container and let it check out the other two repos as submodules.
New Repo: ns3-if77-module
December 17, 2020 (new-repo)
The ns3-if77-module provides an implementation of the ns3::PropagationLossModel, which uses the If77 propagation model developed by G. D. Gierhart and M. E. Johnson. The If77 propagation model was developed in the 1970s to estimate the service coverage for radio systems. It can be used to calculate propagation loss for ground/air, air/air, ground/satellite, and air/satellite systems for frequencies in the 0.1 MHz to 20 GHz range.
NeurIPS Features LLNL Papers and Software
December 07, 2020 (event) (story)
The 34th Conference on Neural Information Processing Systems (NeurIPS) features two LLNL papers advancing the reliability of deep learning for the Lab’s mission-critical applications. The most prestigious machine learning conference in the world, NeurIPS began virtually on December 6. The first paper describes a framework for understanding the effect of properties of training data on the generalization gap of machine learning (ML) algorithms, that is, the difference between a model’s observed performance during training and its “ground-truth” performance in the real world. The second NeurIPS paper introduces an automatic framework to obtain robustness guarantees of any deep neural network structure using the open source Linear Relaxation-based Perturbation Analysis (LiRPA) repo. Developed with colleagues at Northeastern University, China’s Tsinghua University, and UCLA, LiRPA algorithms can provide guaranteed upper and lower bounds for a neural network function with perturbed inputs.
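To make "guaranteed upper and lower bounds for a neural network function with perturbed inputs" concrete, here is a minimal sketch using interval bound propagation, a simpler and generally looser technique than the linear relaxation used by LiRPA. It does not use the LiRPA repo or its API; the tiny network weights, the input, and the perturbation size are made-up values chosen for illustration.

```python
def linear_bounds(weights, biases, lower, upper):
    """Propagate elementwise input bounds through y = W x + b."""
    out_lower, out_upper = [], []
    for row, bias in zip(weights, biases):
        lo = hi = bias
        for w, l, u in zip(row, lower, upper):
            # A positive weight maps low to low; a negative one flips them.
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_bounds(lower, upper):
    """ReLU is monotonic, so it can be applied to each bound directly."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


def network_bounds(layers, x, eps):
    """Bound the output for every input within eps of x (per coordinate)."""
    lower = [xi - eps for xi in x]
    upper = [xi + eps for xi in x]
    for index, (weights, biases) in enumerate(layers):
        lower, upper = linear_bounds(weights, biases, lower, upper)
        if index < len(layers) - 1:  # no activation after the last layer
            lower, upper = relu_bounds(lower, upper)
    return lower, upper


def forward(layers, x):
    """Plain forward pass, used to check that the bounds hold."""
    for index, (weights, biases) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


# Hypothetical two-layer network with made-up weights.
layers = [
    ([[1.0, -1.0], [0.5, 2.0]], [0.0, -1.0]),
    ([[1.0, 1.0]], [0.0]),
]
x0, eps = [1.0, 0.5], 0.1
lower, upper = network_bounds(layers, x0, eps)
print("output bounds:", lower, upper)
print("nominal output:", forward(layers, x0))
```

Every input within the perturbation box is guaranteed to produce an output inside the printed interval, although the interval may be wider than the true output range.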
New Repo: [Boost].MPI3
December 04, 2020 (new-repo)
MPI is a large library for run-time parallelism where several paradigms coexist. It was originally designed as a standardized and portable message-passing system to work on a wide variety of parallel computing architectures. The last standard, MPI-3, uses a combination of techniques to achieve parallelism: Message Passing (MP), Remote Memory Access (RMA), and Shared Memory (SM). [Boost].MPI3 is a C++ library wrapper for standard MPI3. While not an official MPI3 library, [Boost].MPI3 is designed following the principles of Boost and the STL. This repo provides a uniform interface and abstractions for MPI3 features by means of wrapper function calls and concepts familiar from C++ and the STL.
Spack User Survey Results
December 02, 2020 (story)
The Spack development team ran a user survey from September 28 to October 9 and received 169 responses. The survey covered user demographics, use cases, feature priorities, community involvement, and more. For example, responses indicated strong interest in a future virtual workshop. The full analysis is available on the Spack website, and the survey data is housed in its own repo.
New Repo: Lestofire
November 30, 2020 (new-repo)
Lestofire implements the level set method in topology optimization. It combines the automated calculation of shape derivatives in Firedrake and pyadjoint with the Null space optimizer to find the optimal design.
New Templates for Community Health Files
November 24, 2020 (this-website)
Our .github repo houses file templates and other content that can be used by LLNL open source projects. The goal is to help standardize the presentation and organization of certain types of content across the LLNL organization. New this month are community health files that developers can copy and/or modify as needed to ensure their repos adhere to certain guidelines regarding licenses and other aspects of releasing and maintaining open source software. New files include Contributing Guidelines, a Notice, a Code of Conduct, and templates for opening issues and submitting pull requests. More information is available on the .github README.
LLNL's First Computing Virtual Expo
November 11, 2020 (event-report)
The LLNL Computing Virtual Expo was an end-to-end digital experience with interactive booths, networking opportunities, and on-demand presentations, held on September 30. Lab employees and the public were invited to learn about new initiatives while networking and engaging with the Computing community, including computer scientists, IT experts, HPC contacts, and software developers and engineers.
PF3DCOMM is an MPI benchmark that tests the performance of communication patterns used in pF3D, a laser-plasma simulation code developed at LLNL. The benchmark is intended for use in evaluating the effectiveness of HPC interconnects.
New Repo: Admiral
November 04, 2020 (new-repo)
Admiral is a package for developing agent-based simulations and training them with multiagent reinforcement learning. A reinforcement learning experiment contains two main components: (1) a simulation environment and (2) learning agents, which contain policies that map observations to actions. These policies may be hard-coded by the researcher or trained by the RL algorithm. In Admiral, these two components are specified in a single Python configuration script. The components can be defined in-script or imported as modules.
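The two components described above can be sketched generically in Python. This is not Admiral's API: the class, the function names, and the layout of the experiment dictionary are hypothetical, and the policy is hard-coded rather than trained.

```python
class CorridorSim:
    """Toy simulation: an agent walks a corridor toward the right end."""

    def __init__(self, size=5):
        self.size = size
        self.position = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an action (-1 or +1); return observation, reward, done."""
        self.position = min(self.size - 1, max(0, self.position + action))
        done = self.position == self.size - 1
        reward = 1.0 if done else -0.1
        return self.position, reward, done


def always_right_policy(observation):
    """Hard-coded policy: maps any observation to the action +1."""
    return 1


def run_episode(sim, policy, max_steps=50):
    """Run one episode; return the steps taken and the total reward."""
    observation = sim.reset()
    total_reward = 0.0
    for step in range(1, max_steps + 1):
        observation, reward, done = sim.step(policy(observation))
        total_reward += reward
        if done:
            return step, total_reward
    return max_steps, total_reward


# Both components gathered in one place, loosely mirroring the idea of a
# single configuration script (the dictionary layout is an assumption).
experiment = {"sim": CorridorSim(size=5), "policy": always_right_policy}
steps, total = run_episode(experiment["sim"], experiment["policy"])
print(f"finished in {steps} steps with total reward {total:.1f}")
```

Swapping the hard-coded function for a learned mapping from observations to actions would leave the simulation side unchanged, which is the separation the two-component description relies on.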
FAROS Team Wins Best Paper Award at OpenMP Workshop
October 30, 2020 (story)
A team of LLNL computer scientists and a collaborator from Argonne National Laboratory (ANL) won the Best Paper Award at the International Workshop on OpenMP (IWOMP) 2020 in September. Giorgis Georgakoudis, Ignacio Laguna, Tom Scogland (LLNL), and Johannes Doerfert (ANL) accepted the award for their paper, “FAROS: A Framework to Analyze OpenMP Compilation Through Benchmarking and Compiler Optimization Analysis.” The paper showcases the new Livermore-developed framework, FAROS, which pinpoints missing compiler optimizations due to OpenMP compilation and measures the impact on performance. FAROS is the result of a previously funded Laboratory Directed Research and Development Feasibility Study, which was led by Georgakoudis. Read more about the award.
LLNL Heads to SC20
October 28, 2020 (event)
The 32nd annual Supercomputing Conference (SC20) will be held virtually throughout November 9–19. This year the conference groups event types (e.g., tutorials, posters) into specific days. Much of the content is pre-recorded and will remain available online for six months. Tutorials will be live-streamed. As always, LLNL teams are looking forward to the event.
FAROS is a framework for benchmarking and analysis of compiler optimization. The repo contains a benchmark harness to fetch, build, and run programs with different compilation options for analyzing the impact of compilation on performance. The use case is contrasting OpenMP compilation with its serial elision to understand performance differences due to different compilation. The README file contains information about how the software works. The team has published a paper with more details and recently won the Best Paper Award at the 2020 International Workshop on OpenMP (IWOMP).
Open-Source Software Community Welcomes Virtual Internships
October 01, 2020 (story) (this-website)
LLNL hosts hundreds of student interns annually—even during a year distinguished by the COVID-19 pandemic. This summer, the Computing Scholar Program welcomed 160 undergraduate and graduate students into virtual internships. The Lab’s open source community was already primed for student participation. A Computing news article describes three open source projects that benefitted from interns’ help: Ascent, MFEM, and this website. Mentors discuss the challenges of mentoring remotely, while students describe their experiences including skill development.
Kosh allows codes to store, query, and share data via an easy-to-use Python API. Kosh lies on top of Sina and, as a result, can use any database backend supported by Sina. This software aims to make data access and sharing as simple as possible. The repo includes installation instructions.
New Repo: Spheral++
September 22, 2020 (new-repo)
Spheral++ (documentation) provides a steerable parallel environment for performing coupled hydrodynamical and gravitational numerical simulations. Hydrodynamics and gravity are modeled using particle-based methods (SPH and N-Body). Features include:
PYB11Generator is a Python-based code generator that creates pybind11 code for binding C++ libraries as extensions in Python. The software parses input that is very close to writing the desired interface in native Python, turning this into the corresponding pybind11 C++ code. The repo includes documentation.
Celebrate Exascale Day
September 18, 2020 (event) (multimedia)
Exascale computing will transform the ability to tackle some of the world’s most important challenges. The Exascale Computing Project (ECP) celebrates this new era of scientific discovery with Exascale Day on October 18, or “10^18” to represent the exascale threshold of floating-point operations per second. This virtual event will provide videos, audio discussions, and articles that will educate participants about impact areas of exascale computing from the Department of Energy national laboratories, HPC manufacturers, and leading universities and industrial organizations. LLNL will be participating, and much of the ECP’s software stack is open source.
This site’s News and Archive pages have been updated with filters for selecting news posts by category. These categories appear next to the date on each post. We have nearly five years’ worth of news, so this feature improves the navigation of different types of news. These filters were implemented by our 2020 summer intern.
New Visualizations of Popular Repositories
August 31, 2020 (this-website)
The Explore section of this website has again expanded to include a new page that breaks down the popularity (i.e., stars) of LLNL repositories in a few ways: repos with the highest number of stars, creation history of those repos, increase of stars over time, commit activity of popular repos, and licenses of those repos. This new page, created by our 2020 summer intern, helps us better understand repos that have made a big impact in the open source community.
New Repo: CAMP
August 25, 2020 (new-repo)
CAMP is a compiler-agnostic metaprogramming library providing concepts, type operations, and tuples for C++ and CUDA. The project collects a variety of macros and metaprogramming facilities for C++ projects.
CEED held its 4th annual meeting on August 11-12 using ECP Zoom for videoconferencing and Slack for side discussions. The goals of the meeting were to report on the progress in the Center; deepen existing and establish new connections with ECP hardware vendors, ECP software technologies projects, and other collaborators; plan project activities; and brainstorm/work as a group to make technical progress. In addition to gathering together many CEED researchers, the meeting included representatives of ECP management, hardware vendors, software technology, and other interested projects. The full meeting agenda is available on the CEED website.
New Dependencies Page on Software Portal
July 28, 2020 (this-website)
The Explore section of this website has grown to include a new page that visualizes our software catalog’s dependencies. LLNL software repos are shown in the context of repositories with dependencies, External Packages, and internal packages. You can move the slider to change the connections between repos, organizations, and dependencies as well as click on a circle to isolate its specific connections in an expansion panel on the right side of the page. This work, which enables us to learn more about our repos and how they are related, was done by our 2020 summer intern.
Tool Time: Caliper - A Performance Analysis Toolbox in a Library
July 27, 2020 (story)
The Performance Optimisation and Productivity blog published a post by LLNL’s David Boehme, who described the open source Caliper program instrumentation and performance measurement framework. Caliper can be used for lightweight always-on profiling or advanced performance engineering use cases, such as tracing, monitoring, and auto-tuning. It is primarily aimed at HPC applications, but works for any C/C++/Fortran program on Unix/Linux. The blog post outlines Caliper’s instrumentation and API with examples.
New Repo: mpibind
July 22, 2020 (new-repo)
mpibind is a memory-driven algorithm to map parallel hybrid applications to the underlying heterogeneous hardware resources transparently, efficiently, and portably. Unlike other mappings, its primary design point is the memory system, including the cache hierarchy. Compute elements are selected based on a memory mapping and not vice versa.
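To make the memory-first idea concrete, here is a toy sketch. It is not mpibind's algorithm or API: the function name `map_tasks_memory_first`, the NUMA-domain dictionary, and the round-robin policy are all assumptions chosen for illustration. It only shows the ordering described above, where tasks are first placed on memory domains and cores are then chosen from within each domain.

```python
def map_tasks_memory_first(numa_cores, num_tasks):
    """Toy memory-first mapping (hypothetical, not mpibind's real algorithm).

    numa_cores: dict mapping a NUMA domain id to the list of core ids
                attached to that domain (an assumed, simplified topology).
    Returns a dict mapping each task id to the list of cores it may use.
    """
    domains = sorted(numa_cores)

    # Step 1: place tasks on memory domains (round-robin across domains).
    tasks_in_domain = {d: [] for d in domains}
    for task in range(num_tasks):
        tasks_in_domain[domains[task % len(domains)]].append(task)

    # Step 2: only now pick compute elements, from inside each domain.
    mapping = {}
    for d in domains:
        cores = numa_cores[d]
        tasks = tasks_in_domain[d]
        k, c = len(tasks), len(cores)
        for i, task in enumerate(tasks):
            if k <= c:
                # Split the domain's cores evenly among its tasks.
                mapping[task] = cores[i * c // k:(i + 1) * c // k]
            else:
                # More tasks than cores: tasks share cores.
                mapping[task] = [cores[i % c]]
    return mapping


topology = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}
print(map_tasks_memory_first(topology, 4))
```

With two NUMA domains of four cores each, four tasks end up with two cores apiece, and no task straddles a memory domain.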
Spack Tutorial on AWS
July 20, 2020 (event-report)
Amazon Web Services hosted a free two-day Spack tutorial broadly targeted at HPC users, developers, and user support teams. Each day consisted of two 1.5-hour sessions with a 30-minute break in the middle. The first day covered Spack basics, while the second day drilled down on advanced features. Videos from day 1 (3:19:18) and day 2 (3:30:18) are available.
LLNL's Summer Hackathon Will Be Virtual
July 18, 2020 (event)
Held since 2012, LLNL’s hackathons are 24-hour opportunities to brainstorm, foster creativity, prototype, and explore. Participants work in groups or individually and often strive to learn new skills, programming languages, and tools in service to LLNL’s missions. Like the spring hackathon earlier this year, the summer event (August 6-7) will be held virtually using WebEx and Mattermost for collaboration. LLNL sponsors are Livermore Computing and the Center for Applied Scientific Computing. Registration closes on July 31.
Webinar: What’s New in Spack?
July 15, 2020 (event-report) (multimedia)
The IDEAS Productivity project, in partnership with the DOE Computing Facilities of the ALCF, OLCF, and NERSC and the DOE Exascale Computing Project, hosts a webinar series on Best Practices for HPC Software Developers. A webinar titled “What’s New in Spack?” was presented by LLNL’s Todd Gamblin on July 15. Slides and a video (1:26:33) from the session are available.
New Repo: SPIFY
July 13, 2020 (new-repo)
SPIFY is a C++ library for parsing input files to be used in scientific computing applications. The library allows an application developer to define a full set of required and optional input variables of different types and handles all of the parsing and validation. Examples are included in the repo.
New Repo: pLiner
July 08, 2020 (new-repo)
Compiler optimizations can significantly alter the numerical results of scientific computing applications. When numerical results differ significantly between compilers, optimization levels, and floating-point hardware, these numerical inconsistencies can impact programming productivity. pLiner is a framework that helps programmers identify locations in the source of numerical code that are highly affected by compiler optimizations. It uses a novel approach to identify such code locations by enhancing the floating-point precision of variables and expressions.
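The underlying effect can be illustrated with a small, self-contained sketch. This is not pLiner's implementation: the helper names `to_f32` and `accumulate` are hypothetical, single precision is emulated with Python's `struct` module, and reversing the summation order stands in for the kind of reassociation an optimizing compiler might perform. The point is only that an expression whose result depends on evaluation order at low precision can become order-independent once its precision is raised, which marks it as a sensitive location.

```python
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(values, rnd):
    """Sum values in the given order, rounding after every addition."""
    total = 0.0
    for v in values:
        total = rnd(total + v)
    return total


values = [1.0e8, 1.0, -1.0e8, 1.0]
reordered = list(reversed(values))  # stand-in for compiler reassociation

# Emulated single precision: the two evaluation orders disagree.
low_a = accumulate(values, to_f32)
low_b = accumulate(reordered, to_f32)

# Double precision (float() leaves the value unchanged): they agree.
high_a = accumulate(values, float)
high_b = accumulate(reordered, float)

print(low_a, low_b)    # 1.0 0.0
print(high_a, high_b)  # 2.0 2.0
```

Here the single-precision sums differ (1.0 versus 0.0) because adding 1.0 to 1.0e8 is lost to rounding, while both double-precision sums give 2.0.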
New Consolidated FAQ on Software Portal
July 08, 2020 (this-website)
Much of the content under the About section of this website has been consolidated into an easy-to-navigate FAQ page. The FAQ explains how to get started on GitHub, become part of the LLNL organization, manage repositories, and much more. We encourage readers to provide feedback or new questions by contacting the LLNL GitHub admins or submitting a pull request.
New Data Visualizations on Software Portal
July 07, 2020 (this-website)
The Explore section of this website is benefitting from new development by our summer intern. Data we collect from GitHub is visualized in various ways, with additional visualizations planned. These efforts help us understand our repos’ activity, how they are being used, development trends, and more. Check out the new “Repo Licenses” viz and stay tuned for more!
LLNL to Host Online Developer Day
June 09, 2020 (event)
Initiated in 2017, Developer Day is a day-long, annual event that brings software developers together from all over LLNL. The fourth installment of the popular event will be held virtually on July 30. Read more about Dev Day in last year’s recap.
ISC Is Going Virtual
June 08, 2020 (event)
Although in-person conferences are not feasible this summer, LLNL will participate in the online ISC High Performance Conference (ISC20) on June 22–25. The event brings together the HPC community—from research centers, commercial companies, academia, national laboratories, government agencies, exhibitors, and more—to share the latest technology of interest to HPC developers and users. View details about LLNL’s papers, poster, and workshops.
Webcast: Open Source Doesn't Have to Be Scary
May 23, 2020 (event-report) (multimedia)
LLNL’s Ian Lee recently appeared on the Thought Leadership Consortium webcast entitled “Open Source Doesn’t Have to Be Scary.” Registration is free to watch the Zoom replay (01:25:00) on demand.
Working Remotely: The Spack Team
May 16, 2020 (story)
Better Scientific Software’s blog features a post about the Spack team’s experience working remotely and interacting with the Spack community. LLNL’s Todd Gamblin offers insight into making the most of online communication opportunities and stresses the importance of providing robust documentation so users can help themselves.
Maestro Workflow Conductor
May 04, 2020 (story)
Maestro Workflow Conductor is a lightweight Python tool that can launch multi-step software simulation workflows in a clear, concise, consistent, and repeatable manner. It does this locally as well as on supercomputers. LLNL Computing recently published a project description, highlighting the challenges in scientific workflows that Maestro solves. “Before Maestro, it took a long time to stand up new workflows. Maestro has changed that by providing a consistent framework that can break down workflows into smaller pieces, and facilitate automated execution,” said project leader Frank Di Natale. Check out the Maestrowf repo.
LLNL to Host First Virtual Hackathon
April 12, 2020 (event)
Held since 2012, LLNL’s hackathons are 24-hour opportunities to brainstorm, foster creativity, prototype, and explore. Participants work in groups or individually and often strive to learn new skills, programming languages, and tools in service to LLNL’s missions. This year’s spring hackathon (April 30 through May 1) will be held virtually. In true hackathon spirit, several tech solutions will enable participants to collaborate remotely. Charalynn Macedo, division leader for LLNL’s Enterprise Applications Services, will kick off the event with a brief keynote presentation.
Video: Spack at FOSDEM '20
February 02, 2020 (event-report) (multimedia)
FOSDEM is an annual two-day event promoting the widespread use of free and open source software. The 2020 conference took place in Brussels, Belgium, on February 1–2. Videos of speakers, lightning talks, and other sessions are available on the FOSDEM website. LLNL’s Todd Gamblin led two sessions about the package manager Spack:
MFEM and VisIt Benefits Engineer in LLNL’s Design Optimization Laboratory
December 10, 2019 (story)
MFEM and VisIt are key design codes in LLNL’s Center for Design and Optimization, which is developing algorithms that can optimize immensely complex systems in HPC environments. The MFEM library enables application scientists to prototype parallel physics application codes quickly, based on partial differential equations discretized with high-order finite elements. VisIt, a visualization, animation, and analysis tool, helps scientists and engineers interactively visualize and analyze data, from small (<10^1 core) desktop-sized projects to large (>10^5 core) leadership-class computing facility simulation campaigns. Learn more about the Center in the Science & Technology Review article “Leading a Revolution in Design.”
On the Spack Track at SC19
December 06, 2019 (story)
At the annual supercomputing conference (SC19) in Denver, Colorado, Spack events were held each day. As a reflection of its grassroots heritage, nine sessions were planned by more than a dozen thought leaders from seven organizations, including three DOE laboratories and Sylabs, the company behind Singularity. Thirteen thousand six hundred conference attendees had the chance to learn about Spack from two meet-and-greets, three birds-of-a-feather meetings, three papers, and more. This HPCwire article describes Spack’s history, functionality, impact, and user community through the many Spack-related events at SC19.
LLNL’s Presence in HPC Shines Bright at SC19
December 05, 2019 (event-report)
The 2019 International Conference for High Performance Computing, Networking, Storage, and Analysis—better known simply as SC19—returned to Denver, and once again LLNL made its presence known as a force in supercomputing. The conference, held November 17 through 22, was attended by nearly 14,000 people representing 118 countries.
On November 22, a panel of judges at the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19) awarded a multi-institutional team led by LLNL computer scientists with the conference’s Best Paper award. The paper, entitled “Massively Parallel Infrastructure for Adaptive Multiscale Simulations: Modeling RAS Initiation Pathway for Cancer,” describes the workflow driving a first-of-its-kind multiscale simulation on predictively modeling the dynamics of RAS proteins—a family of proteins whose mutations are linked to more than 30 percent of all human cancers—and their interactions with lipids, the organic compounds that help make up cell membranes.
The team’s software, called MuMMI (Multiscale Machine-Learned Modeling Infrastructure), will soon be released as open source. Read more about the award on LLNL news.
Software Engineering 101: I have some code! Now what?
November 12, 2019 (event-report) (story) (this-website)
As part of LLNL’s Computing 101 speaker series, Ian Lee gave a talk to employees on November 12 titled “Software Engineering 101: I have some code! Now what?” The presentation reviewed the Lab’s resources for supporting software engineering and open source development.
Lee, who manages this website and leads many initiatives in the Lab’s open source community, aimed his remarks at relative newcomers to the software development landscape. He also updated the audience on the state of open source development at the Lab.
The Lab provides a wide range of support and solutions for just about any task a developer does: programming languages, package managers, computing platforms, code editors, version control systems, project communication, project tracking, documentation, and much more. Lee provided an overview of these options, offered advice about how to navigate the Lab’s software resources, and encouraged developers to take advantage of colleagues’ knowledge and experience.
Lee summarized the Lab’s recent open source activity, which echoes a trend toward developing “out in the open,” i.e., not waiting for code to mature before releasing it for community feedback and contributions. (As this website shows, the Lab and affiliated GitHub organizations have almost 600 repos.) Accordingly, the Lab has updated its open source release policies to support modern code development practices.
Lee also demoed this website’s category-driven design changes, LLNL’s open source logo (and stickers), the @LLNL_OpenSource Twitter account, and Slack channels. He noted that LLNL may have a booth at PyCon 2020, which will be held April 15-23 in Pittsburgh. (Conferences such as PyCon provide LLNL’s open source software community with opportunities for networking, collaboration, and technical skills development. Lab employees interested in attending similar events may contact Ian Lee for funding.)
ESGF Architecture Workshop
November 08, 2019 (event-report)
Members of the Earth System Grid Federation (ESGF) gathered in Abingdon, England, on November 5-7 to kick off the redesign process for the Federation’s computing architecture. Since the original system was designed a decade ago, the number of ESGF’s supported projects and disciplines has grown and diversified. Furthermore, operational requirements are clearer for the ESGF to support an international federated archive of this size. Many of the ESGF nodes now have other functions beyond CMIP (the Coupled Model Intercomparison Project), and the landscape of data repository and science needs has changed.
Led by ESGF’s Executive Committee, the workshop team discussed improvements to the user experience, data repository and management, data compute requirements, and platform and system administration. This workshop concluded with a high-level roadmap for future architecture directions, which will be presented at the larger ESGF conference in March. LLNL’s delegates to the workshop were Ghaleb Abdulla (principal investigator and co-chair of the Executive Committee), Sasha Ames (member of multiple ESGF Working Teams) and Jason Boutte (Compute Working Team member).
Two Repos among 2019 R&D 100 Award Finalists
October 24, 2019 (story)
The annual R&D 100 Awards finalists have been announced. Among them are six LLNL-developed or co-developed technologies. In the Software/Services category, two open source projects have been recognized: SCR (Scalable Checkpoint/Restart) and Spack. Winners will be announced on October 29.
JuliaCon Recap and Videos
August 22, 2019 (event-report) (multimedia)
LLNL’s Seth Bromberger attended JuliaCon 2019 on July 22–25 in Baltimore, Maryland. He gave a talk on July 24 to a full house: “Using Julia in Secure Environments” (abstract, YouTube video). The focus of the presentation was engaging the community in thinking about transitive package dependencies and the security of the source code supply chain.
Other notable events at the conference included a keynote address by Steven Lee, applied mathematics program manager for Advanced Scientific Computing Research (ASCR) within the U.S. Department of Energy’s Office of Science. He presented his office’s computing priorities and mentioned related LLNL work (YouTube video). In addition, LLNL’s Jane Herriman received a Julia Community Prize for her “teaching, outreach, and community stewardship.”
Conferences such as JuliaCon provide LLNL’s open source software community with opportunities for networking, collaboration, and technical skills development. Lab employees interested in attending similar events may contact Ian Lee for funding.
Software Portal Redesign and GitHub Integration
July 30, 2019 (this-website)
Recently this website received several changes that improve the user’s experience, keep the content fresh, and help the admin team monitor and track all repositories under the LLNL organization on GitHub. We are excited to improve user access to LLNL’s 500+ open source repositories and appreciate the help of our summer intern, Angela Flores, who is pursuing a B.S. in computer science with a minor in cybersecurity from Cal State Long Beach.
LLNL’s RADIUSS project (Rapid Application Development via an Institutional Universal Software Stack) aims to broaden usage across LLNL and the open source community of a set of libraries and tools used for HPC scientific application development.
LLNL's Third Annual Developer Day Focuses on Career Lifecycle and Best Practices
July 26, 2019 (story)
Initiated in 2017, Developer Day is a day-long, annual event that brings software developers together from all over LLNL. This year’s Dev Day included a panel discussion about onboarding new hires; short talks on topics ranging from staying engaged at work to learning Unicode characters; and deep dives on software quality assurance and cloud services. The event featured a keynote address by Dr. Jeffrey Carver from the University of Alabama, who spoke about “Contemporary Peer Code Review Practices in Research Software.”
Why Do We Need Supercomputers and Who Is Using Them?
July 10, 2019 (story)
PC Magazine recently featured LLNL’s supercomputing facility to find out how the supercharged machines handle everything from virtual nuclear weapons tests to weather modeling. The article highlights examples of simulations performed on the Lab’s computers, such as a fusion energy research experiment generated by the MFEM-based BLAST shock hydrodynamics code and visualized with VisIt.
Redesign of Cardioid's Heartbeat Simulation Brings Code One Step Closer to Clinical Use
June 12, 2019 (multimedia) (story)
LLNL researchers have successfully optimized a code that models the human heartbeat for next-generation, GPU-based supercomputers, with an eye on developing it for virtual drug screening and modeling heart activity in clinical settings. Cardioid, a suite merging mathematical solvers for electrophysiology, fiber-generation, cardiac mechanics, torso-electrocardiograms (ECGs) and cardiac meshing tools, simulates the electrical current running through the heart tissue, triggering cells to contract like cascading dominoes and causing the heart to beat. It was originally developed by LLNL and IBM for Sequoia, at one time the world’s fastest supercomputer, and was a finalist for the 2012 Gordon Bell Prize, supercomputing’s top honor.
At the recent Red Hat Summit in Boston, LLNL’s Robin Goldstone discussed open-source technologies and the Sierra supercomputer. Goldstone, an HPC solutions architect, said “open source makes perfect sense” for scalability and performance in an HPC center like LLNL’s. She stated, “We have all that visibility and that software. If it doesn’t work for our needs, we can make it work for our needs. And then we can give it back to the community because even though people aren’t doing things at the scale that we are today, a lot of the things that we’re doing really do trickle down and be used by a lot of other people.” A transcript of her interview is included with the video, which runs 15:28.
OSS Project Lead Kathryn Mohror Completes Tenure as S&TR Scientific Editor
May 07, 2019 (profile)
Like many LLNL computer scientists, Kathryn Mohror juggles multiple responsibilities both at her workplace and in the scientific community. She recently completed a 12-month term as scientific editor of LLNL’s Science & Technology Review magazine. Read about her experience with the publication and how she kept up with her own research in scalable fault-tolerant computing and input/output for next-generation computing systems, not to mention her two open source projects, SCR and UnifyCR.
How Machine Learning Could Change Science
May 03, 2019 (story)
Artificial intelligence tools are revolutionizing scientific research and changing the needs of high performance computing. In an article from Data Center Dynamics, LLNL’s Fred Streitz and Brian Van Essen discuss the future of scientific computing, highlighting the Exascale Computing Project (ECP) and the Livermore Big Artificial Neural Network (LBANN).
The ECP is a multi-institutional Department of Energy collaboration aimed at achieving exascale computing capability. Many open source software projects, from LLNL and elsewhere, are crucial components of the ECP ecosystem.
LBANN is an open source deep learning toolkit developed at the Lab. It provides model-parallel acceleration through domain decomposition to optimize for strong scaling of network training.
Spack Team Visits RIKEN
April 23, 2019 (event-report)
Spack’s first tutorial in Japan took place on April 23, 2019. With more than 40 participants, the onsite tutorial at RIKEN’s Kobe research center was the latest international event for the Spack team and collaborators. Read more about Spack’s European tour of HPC facilities. Everything you need to get started with Spack is available on the website.
Caliper Library Highlighted at 31st VI-HPS Tuning Workshop
April 15, 2019 (event-report)
The Virtual Institute – High Productivity Supercomputing (VI-HPS) conducts a long-running series of tuning workshops, where participants can learn about programming tools developed by the institute partners. Morning sessions consist of tool presentations and hands-on exercises. In the afternoon, users can apply the tools to their own codes with the help of the instructors. Whilst most of the workshops take place in Europe, the 31st tuning workshop was held at the University of Tennessee, Knoxville (UTK), on April 9–12, 2019.
As part of the workshop, LLNL computer scientist David Boehme conducted a 75-minute tutorial on Caliper, an open-source performance profiling library for HPC software. The session included hands-on exercises using the Lulesh proxy application as an example. There were around 15–20 participants, primarily HPC software developers from UTK and Oak Ridge National Laboratory, as well as the other HPC tool presenters. This tutorial marked the first time Caliper was presented within the VI-HPS tuning workshop series. Boehme’s tutorial was well received, and several participants were able to successfully apply Caliper to their programs.
The workshop also provided an opportunity to discuss common software infrastructure as well as integration and interoperability possibilities with other performance analysis tools. For example, the PAPI team plans to explore using Caliper’s data collection and processing functionality. Finally, as a VI-HPS member organization, LLNL’s participation in the tuning workshop series helped showcase the Lab’s strong portfolio of open-source programming tools among the VI-HPS partners and in the HPC community at large.
The Linux Foundation's Open Source Leadership Summit
March 15, 2019 (event-report)
The Linux Foundation’s Open Source Leadership Summit occurred in Half Moon Bay, California, on Thursday, March 14. LLNL’s Todd Gamblin presented “Open Source in the Exascale Computing Project: Building a Software Ecosystem for Science.” Check out the conference schedule.
This presentation covered the challenges of building software for machines that don’t exist yet, and how government laboratories, academia, and industry are collaborating to build a highly optimized software distribution. The talk summarized the open source challenges in DOE’s largest-ever HPC software project: deploying services like GitLab CI and JupyterHub in high-security HPC centers, addressing challenges for architecture-specific containers, using Spack to package and distribute optimized binaries, and overcoming the social hurdles of scientists and developers working together.
Inaugural NAHOMCon19 Coming to San Diego
February 14, 2019 (event)
To all computational scientists, mathematicians, scientists, and engineers interested in high-order methods and PDEs: Several institutions have joined together to organize the inaugural North American High Order Methods Conference (NAHOMCon19). The conference will be held in San Diego in the summer of 2019 and will focus on the many developments in high-order discretizations and applications taking place in North America.
The DOE co-design Center for Efficient Exascale Discretizations (CEED) is pleased to participate in the conference. CEED is a partnership between two U.S. DOE laboratories (Livermore & Argonne) and five universities in support of the Exascale Computing Project.
Held in Washington, DC, the Earth System Grid Federation’s (ESGF) 8th annual face-to-face conference was a lively, fruitful affair. The event packed 40 presentations, several plenary sessions, a poster session, guest speakers, an awards ceremony, and an executive committee meeting into the week.
The federation houses an enormous database of global observational and simulation data—more than 5 petabytes—and manages the HPC hardware and software infrastructure necessary for scientific climate research. In the nearly two decades since its launch, ESGF has grown to serve 25,000 users on 6 continents.
Among ESGF’s 2018 milestones were support for CMIP6 data (thanks to input4MIPs and obs4MIPs initiatives), beta v3.0 of the software stack installer, OAuth single sign-on integration, and progress in containerized architecture. Read more about the conference and check out ESGF’s GitHub repo.
DOE Machines Dominate Record-Breaking SC18
November 20, 2018 (event-report)
Supercomputing ‘18 (SC18), held Nov. 11–16 in Dallas, broke records for attendees and exhibitors and saw LLNL once again make its presence felt on the world’s biggest HPC stage. For the first time in five years, the U.S. captured the top two spots on the TOP500 List of the world’s fastest supercomputers: Summit at ORNL and Sierra at LLNL.
P3HPC (performance, portability, and productivity) workshop
Talks by LLNL experts at industry booths (e.g., Penguin Computing, NVIDIA)
Earth System Grid Federation's Annual Conference Coming Up
November 03, 2018 (event)
The LLNL-led international Earth System Grid Federation (ESGF) will meet December 3-7 in Washington, DC, to plan the future of Earth system data analysis and more. Registration info is available on the ESGF website along with the conference agenda. Fork this 2017 R&D 100 winner on GitHub.
Good Times at GitHub Universe
November 01, 2018 (event-report)
LLNL open-source champions Laura Weber, Ian Lee, and David Beckingsale attended the 2018 GitHub Universe conference in San Francisco, billed as “a conference for the builders, planners, and leaders defining the future of software.” The team enjoyed hearing about upcoming GitHub enhancements and networking with GitHub Federal employees and other GitHub users.
One recurring theme was inner source, the use of open source software development best practices and the establishment of an open-source-like culture within organizations. With this practice the organization may still develop proprietary software, but internally opens up its development.
Flux and Spack Events Coming to Supercomputing '18
October 27, 2018 (event)
LLNL staff are heading to Dallas, Texas, for the 30th annual Supercomputing Conference (SC18) on November 11–16. LLNL is leading 6 tutorials and 16 workshops with topics ranging from data analytics and data compression to performance analysis and productivity. LLNL-developed open-source tools Flux and Spack are subjects of a workshop and a tutorial, respectively. We hope to see you there!
Read more about our past experiences and tips for first-timers, and a complete list of LLNL-led sessions can be found on the LLNL Computing website (links unpublished in 2020). All times are listed in Central Standard Time.
Open-Source Developer Greg Becker Scales Projects and Mountains
October 26, 2018 (profile)
Is there a connection between rock climbing and software development? In this profile, LLNL’s Greg Becker describes his career path, motivation for improving HPC tools, and recent work with open-source projects like SCR, Caliper, and Spack.