Video: Fast, Accurate Data Compression for Modern Supercomputing Applications
May 31, 2023 (multimedia)
Developed at LLNL, the zfp software library provides a comprehensive solution to both lossy and lossless data compression. zfp reduces the storage space of high-precision floating-point data without sacrificing its accuracy. Unique among data compressors, zfp is designed to be a compact number format for storing data arrays in-memory in compressed form while still supporting high-speed random access. zfp divides multidimensional arrays into small blocks that are independently compressed and decompressed on demand when an array element is accessed, without the application’s knowledge. This flexibility allows applications to work with zfp arrays as though they were regular uncompressed arrays while saving storage, time, and compute power. Watch a new video about zfp on YouTube (6:49).
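The block-wise access pattern described above can be illustrated with a toy sketch. This is not zfp's codec or API: it assumes a 1D array, a block size of 4 values, and uses Python's lossless `zlib` as a stand-in compressor. The class name `BlockCompressedArray` is hypothetical. At this block size zlib will not actually save space; the point is only to show independently compressed blocks that are decompressed on demand behind an ordinary indexing interface.

```python
import struct
import zlib
from typing import List, Sequence

BLOCK = 4  # values per block; an assumption for this 1D toy example


class BlockCompressedArray:
    """Toy array that stores each block compressed and decodes it on access.

    zlib is a stand-in here; it is NOT the compression scheme zfp uses.
    """

    def __init__(self, values: Sequence[float]) -> None:
        self._n = len(values)
        self._blocks: List[bytes] = []
        # Compress every block independently so any block can be decoded alone.
        for start in range(0, self._n, BLOCK):
            chunk = values[start:start + BLOCK]
            raw = struct.pack(f"<{len(chunk)}d", *chunk)
            self._blocks.append(zlib.compress(raw))
        # One-block cache so repeated accesses to the same block are cheap.
        self._cached_index = -1
        self._cached_values: List[float] = []

    def __len__(self) -> int:
        return self._n

    def _load(self, block_index: int) -> List[float]:
        """Decompress a block only if it is not the one already cached."""
        if block_index != self._cached_index:
            raw = zlib.decompress(self._blocks[block_index])
            self._cached_values = list(struct.unpack(f"<{len(raw) // 8}d", raw))
            self._cached_index = block_index
        return self._cached_values

    def __getitem__(self, i: int) -> float:
        if not 0 <= i < self._n:
            raise IndexError(i)
        return self._load(i // BLOCK)[i % BLOCK]

    def __setitem__(self, i: int, value: float) -> None:
        if not 0 <= i < self._n:
            raise IndexError(i)
        block_index = i // BLOCK
        block = self._load(block_index)
        block[i % BLOCK] = value
        # Recompress only the block that changed.
        self._blocks[block_index] = zlib.compress(
            struct.pack(f"<{len(block)}d", *block)
        )


if __name__ == "__main__":
    data = BlockCompressedArray([float(i) for i in range(10)])
    data[5] = 2.5            # touches only the second block
    print(data[4], data[5])  # both reads are served from the cached block
```

The caller indexes the object like a regular list and never sees the compression step, which is the property the paragraph above attributes to zfp arrays.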
Video: Vendor-Agnostic Power Management
May 30, 2023 (multimedia)
Pushing our supercomputers to their limits requires getting closer to the metal than standard software and operating systems allow. Variorum provides robust, portable interfaces that allow us to measure and optimize computation at the physical level: temperature, cycles, energy, and power. With that foundation, we can get the best possible use of our world-class computing resources—from purchasing to runtime systems to job scheduling. Watch a new video about Variorum on YouTube (6:57).
New Repo: DFTF
May 26, 2023 (new-repo)
DFTF, or Drink From The Firehose, is a Python program that subscribes to Redfish events on Cray/HPE hardware and republishes them to topics in Kafka. In an attempt to tame the “firehose” of information from CrayTelemetry, DFTF drops any repeated metrics so only the most recent value for each unique metric is maintained. In effect, this usually means values are reported roughly every five seconds rather than every second. See the example.conf file for a sample configuration. DFTF is being used during Livermore Computing’s efforts to site the Lab’s upcoming exascale machine El Capitan.
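The "keep only the most recent value" idea can be sketched in a few lines of Python. This is an illustration, not DFTF's actual code: the class `LatestValueBuffer`, its method names, and the timer-based flush are assumptions, and the Redfish subscription and Kafka publishing steps are left out entirely.

```python
import time
from typing import Callable, Dict, Hashable, List, Tuple


class LatestValueBuffer:
    """Hold only the newest value per metric key between flushes.

    Hypothetical helper for illustration; not part of DFTF.
    """

    def __init__(
        self,
        flush_interval: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.flush_interval = flush_interval
        self._clock = clock
        self._latest: Dict[Hashable, float] = {}
        self._last_flush = clock()

    def update(self, key: Hashable, value: float) -> None:
        """Record a reading, overwriting any older reading for the same key."""
        self._latest[key] = value

    def flush_if_due(self) -> List[Tuple[Hashable, float]]:
        """Return the buffered readings once the interval has elapsed, else []."""
        now = self._clock()
        if now - self._last_flush < self.flush_interval:
            return []
        batch = list(self._latest.items())
        self._latest.clear()
        self._last_flush = now
        return batch


if __name__ == "__main__":
    # A fake clock makes the example deterministic.
    fake_now = [0.0]
    buf = LatestValueBuffer(flush_interval=5.0, clock=lambda: fake_now[0])

    # Five once-per-second readings for the same metric...
    for second, temp in enumerate([40.0, 41.0, 42.5, 43.0, 44.0]):
        fake_now[0] = float(second)
        buf.update(("node1", "temperature"), temp)

    # ...collapse to a single record when the five-second window closes.
    fake_now[0] = 5.0
    print(buf.flush_if_due())
```

In a real pipeline the returned batch would be handed to a publisher; here it is only printed.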
Hatchet 2023.1.0 Released
May 24, 2023 (release)
Hatchet is a Python-based library that allows Pandas dataframes to be indexed by structured tree and graph data. This release adds features to caliperreader, enables support for multi-indexed DataFrames, adds Cython to the build-system requires list, and more.
Spack is a flexible, configurable, Python-based, and open-source HPC package manager. Spack automates the installation and fine-tuning of simulations and libraries, operating on a wide variety of HPC platforms and enabling users to build many code configurations. Version 0.20.0 includes:
Conduit provides an intuitive model for describing hierarchical scientific data in C++, C, Fortran, and Python. It is used for data coupling between packages in-core, serialization, and I/O tasks. The latest release includes Python 3 Stable ABI compatibility, new blueprint classes, and more.
UnifyFS is a user-level burst buffer file system under active development. The repo supports scalable and efficient aggregation of I/O bandwidth from burst buffers while having the same lifecycle as a batch-submitted job. This release includes updates to Mochi-Margo usage, a config for optional sleep, various config option changes, new unit tests, and more.
ExaCA is a cellular automata (CA) code for grain growth under additive manufacturing conditions by ExaAM within the Exascale Computing Project. This release adds optional JSON input and output files, updates analysis routines, and deprecates a few features.
Surveys to Understand Developer Health and Happiness
May 10, 2023 (story)
Computer scientist Vanessa Sochat talked to BSSw about an effort to survey software developer needs at LLNL. She discusses how developer needs can be grouped into two categories—those that are practical (e.g., tools and resources) and those that are intangible (e.g., happiness, communication, or motivation)—as well as the survey methodology and some of the results, which were published on the RADIUSS website in February.
Flux 0.50.0 Released
May 03, 2023 (release)
Flux is a flexible framework for resource management consisting of a suite of projects, tools, and libraries which may be used to build site-custom resource managers for HPC centers. This version enables Flux to communicate with a systemd user instance, supports flipping through queues via left/right arrow keys, includes test suite updates, and much more.
Register for CEED's Seventh Annual Meeting in August
May 03, 2023 (event)
As part of the Exascale Computing Project (ECP), the Center for Efficient Exascale Discretizations (CEED) is a research partnership between two U.S. Department of Energy laboratories and five universities. LLNL leads the Center. All of CEED’s software is open source.
MaPPeRTrac (Massively Parallel, Portable, and Reproducible Tractography) is a probabilistic tractography workflow using structural DW-MRI and designed for high performance computing. The latest release includes updated toolkit versions, compatibility with DWI data, enhanced parallelization, and more.
The Coda Calibration Tool (CCT) calculates reliable moment magnitudes for small- to moderate-sized seismic events. This release provides an alpha version of the new Coda Envelope Ratio Tool (CERT), which allows users to measure amplitudes for loaded envelopes, generate ratios between events, and then attempt to invert the selected events for source parameters.
Tribol is a modular interface physics library featuring state-of-the-art contact physics methods. High-fidelity simulations modeling complex interactions of moving bodies require specialized contact algorithms to enforce constraints between surfaces that come into contact in order to prevent penetration and to compute the associated contact response forces. Tribol aims to provide a unified interface for various contact algorithms, specifically, contact detection and enforcement, and serve as a common infrastructure enabling the research and development of advanced contact algorithms.
KubeCon: Enabling HPC and ML Workloads with the Latest Kubernetes Job Features
April 21, 2023 (event-report)
On April 21, LLNL computer scientist Vanessa Sochat and Google software engineer Michał Woźniak presented “Enabling HPC and ML Workloads with the Latest Kubernetes Job Features” at KubeCon. View the slides and watch the video. The abstract follows:
In this talk, we present the new features in Kubernetes Job API and how they can be used to stand up to challenges of running distributed Batch/AI/HPC workloads at scale, based on real-world experiences from DeepMind and the Flux Operator from Lawrence Livermore National Laboratory. We showcase the Indexed Jobs feature by presenting its production use. First, we demonstrate how it simplifies running parallel workloads which require pod-to-pod communication, including distributed machine learning examples based on its use by DeepMind. Next, we demonstrate the orchestration of HPC workloads using the Flux Operator. Here, we create a “Mini Cluster” within Kubernetes built on top of an indexed job, providing a rich ecosystem for orchestration of batch workloads, related user interfaces, and APIs. We also discuss the challenge of handling pod failures for long-running workloads. We show how Pod Failure Policy can be used to continue job execution despite numerous pod disruptions (caused by events such as node maintenance or preemption), yet reduce costs by avoiding unnecessary pod retries when there are software bugs.
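As a rough illustration of how a worker might use the Indexed Jobs feature mentioned in the abstract, the sketch below reads the `JOB_COMPLETION_INDEX` environment variable that Kubernetes sets in each pod of an indexed Job and uses it to pick a share of the work. Everything else is an assumption for illustration: the `TOTAL_WORKERS` variable, the input file names, and the round-robin split are hypothetical and are not taken from the talk or from the Flux Operator.

```python
import os
from typing import List, Sequence


def shard_for_index(items: Sequence[str], index: int, total: int) -> List[str]:
    """Return the items assigned to worker `index` out of `total` (round-robin)."""
    if total <= 0 or not 0 <= index < total:
        raise ValueError(f"index {index} is out of range for {total} workers")
    return list(items[index::total])


def main() -> None:
    # Set by Kubernetes for pods that belong to an indexed Job.
    index = int(os.environ.get("JOB_COMPLETION_INDEX", "0"))
    # Hypothetical variable; it would have to be provided in the pod spec.
    total = int(os.environ.get("TOTAL_WORKERS", "1"))

    # Hypothetical input names, used only to show the split.
    inputs = [f"input-{i}.dat" for i in range(10)]
    for name in shard_for_index(inputs, index, total):
        print(f"worker {index} processing {name}")


if __name__ == "__main__":
    main()
```

Because every pod derives its share from its own index, the workers need no coordinator to divide the inputs among themselves.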
Kosh 3.0 Released
April 20, 2023 (release)
Kosh allows codes to store, query, and share data via an easy-to-use Python API. This software aims to make data access and sharing as simple as possible. The latest release includes support for clustering, updates to curves and ensembles, and more.
UEDGE is an interactive suite of physics packages using the Python or BASIS scripting systems. The plasma is described by time-dependent 2D plasma fluid equations that include equations for density, velocity, ion temperature, electron temperature, electrostatic potential, and gas density in the edge region of a magnetic fusion energy confinement device. Slab, cylindrical, and toroidal geometries are allowed, and closed and open magnetic field-line regions are included. Classical transport is assumed along magnetic field lines, and anomalous transport is assumed across field lines. Multi-charge state impurities can be included with the corresponding line-radiation energy loss.
Merlin 1.10.0 Released
April 13, 2023 (release)
Merlin is a tool for running machine learning based workflows. The goal of Merlin is to make it easy to build, run, and process the kinds of large scale HPC workflows needed for cognitive simulation. Version 1.10.0 includes added Flux support, new commands, reformatted integration tests, refactored batch.py, and more.
Open Source Is Fueling the Future of Nuclear Physics
April 11, 2023 (story)
A recent GitHub blog post considers the role of open source software in nuclear fusion research, including LLNL’s ignition breakthrough. The article notes that many nuclear science organizations have released open source software in recent years, which is a big change from business as usual in the field.
New Repo: ezAlign
April 04, 2023 (new-repo)
ezAlign provides coarse-grain to atomistic molecular coordinate and topology conversion for molecular dynamics simulations. ezAlign is designed to convert complex, solvated biological systems including lipid membranes with drug-like molecules using GROMACS. Single molecule coordinate, topology, and mapping may be specified during execution using the appropriate command-line options. A number of commonly simulated biological molecules are currently provided.
VisIt 3.3.3 Released
April 03, 2023 (release)
VisIt is an open source, interactive, scalable, visualization, animation, and analysis tool. Version 3.3.3 includes:
From Compact Plasma Particle Sources to Advanced Accelerators with Modeling at Exascale
March 30, 2023 (story)
In a new paper (“From Compact Plasma Particle Sources to Advanced Accelerators with Modeling at Exascale”), a collaboration between LLNL, Lawrence Berkeley National Lab, Deutsches Elektronen-Synchrotron, and CEA-Universite Paris-Saclay explores computational modeling in particle accelerator research. The team’s research was supported by the Exascale Computing Project and includes the Beam, Plasma & Accelerator Simulation Toolkit (BLAST) of scientific open source codes and applications. The abstract follows:
Developing complex, reliable advanced accelerators requires a coordinated, extensible, and comprehensive approach in modeling, from source to the end of beam lifetime. We present highlights in Exascale Computing to scale accelerator modeling software to the requirements set for contemporary science drivers. In particular, we present the first laser-plasma modeling on an exaflop supercomputer using the US DOE Exascale Computing Project WarpX. Leveraging developments for Exascale, the new DOE SCIDAC-5 Consortium for Advanced Modeling of Particle Accelerators (CAMPA) will advance numerical algorithms and accelerate community modeling codes in a cohesive manner: from beam source, over energy boost, transport, injection, storage, to application or interaction. Such start-to-end modeling will enable the exploration of hybrid accelerators, with conventional and advanced elements, as the next step for advanced accelerator modeling. Following open community standards, we seed an open ecosystem of codes that can be readily combined with each other and machine learning frameworks. These will cover ultrafast to ultraprecise modeling for future hybrid accelerator design, even enabling virtual test stands and twins of accelerators that can be used in operations.
mappgene 1.3.0 Released
March 27, 2023 (release)
mappgene is a SARS-CoV-2 genomic sequence analysis pipeline designed for parallel HPC. Since the initial release in 2021, updates have included improved memory handling, default support for iVar plus additional parameters, and more.
The Livermore Big Artificial Neural Network toolkit (LBANN) is an open-source, HPC-centric, deep learning training framework that is optimized to compose multiple levels of parallelism. This release includes new training algorithms, new network structures, new layers, updates to the C++ API, and more.
Led by LANL with LLNL contributors, Charliecloud provides user-defined software stacks for HPC centers. It uses Linux user namespaces to run containers with no privileged operations or daemons and minimal configuration changes on center resources. Version 0.32 includes updates to ch-image, ch-test, ch-run, and other executables.
Please note the team’s request for feedback: We are considering removing ch-ssh, a utility program to facilitate SSH connections from one Charliecloud container into an equivalent container on another host. Please respond to and/or comment on our poll, especially if you use this tool, in discussion #1600.
MFEM is a lightweight, general, scalable C++ library for finite element methods. It enables high-performance scalable finite element discretization research and application development on a wide variety of platforms, from laptops to supercomputers. The v4.5.2 release includes:
Conduit provides an intuitive model for describing hierarchical scientific data in C++, C, Fortran, and Python. It is used for data coupling between packages in-core, serialization, and I/O tasks. The latest release includes DataType support for the Fortran API, updates to DataAccessor and blueprints, and more.
hypre is a library of high-performance preconditioners and solvers featuring multigrid methods for the solution of large, sparse linear systems of equations on massively parallel computers. Version 2.28.0 includes multiple new functions (e.g., matrix scaling, hypre_IntArray, vector resizing), updated docs, and more.
Equation of state (EOS) data provide necessary information for accurate multiphysics modeling, which is necessary for fields such as inertial confinement fusion. Here, we suggest a neural network surrogate model of energy and entropy and use thermodynamic relationships to derive other necessary thermodynamic EOS quantities. We incorporate phase information into the model by training a phase classifier and using phase-specific regression models, which improves the modal prediction accuracy. Our model predicts energy values to 1% relative error and entropy to 3.5% relative error in a log-transformed space. Although sound speed predictions require further improvement, the derived pressure values are accurate within 10% relative error. Our results suggest that neural network models can effectively model EOS for inertial confinement fusion simulation applications.
AWS Blog: Install Optimized Software with Spack Configs for AWS ParallelCluster
March 14, 2023 (story)
Amazon Web Services (AWS) recently announced the availability of Spack configs for AWS ParallelCluster. Users can use these configurations to install optimized HPC applications quickly and easily on their AWS-powered HPC clusters. Spack configs for AWS ParallelCluster represent validated best practices developed by the AWS HPC Performance Engineering Team. They contain fixes and general optimizations that can increase the performance of any compiled application on a wide range of architectures. Read more in an AWS blog post.
Aluminum 1.3.0 Released
March 11, 2023 (release)
Aluminum provides a generic interface to high-performance communication libraries with a focus on allreduce algorithms. Blocking and non-blocking algorithms and GPU-aware algorithms are supported. Aluminum also contains custom implementations of select algorithms to optimize for certain situations. The latest release adds in-place SendRecv support.
LaSDI (Latent Space Dynamics Identification) enables a fast and accurate solution process on various partial differential equations (e.g., Burgers’ equations, a radial advection problem, a nonlinear heat conduction problem), achieving greater speed-ups and lower relative error with respect to the corresponding full-order models. The repo includes four examples: 1D Burgers, 2D Burgers, a radial advection example from MFEM, and a time-dependent diffusion example.
The accompanying repo gLaSDI (“greedy” LaSDI, or Parametric Physics-informed Greedy Latent Space Dynamics Identification) provides a framework for accurate, efficient, and robust data-driven reduced-order modeling of high-dimensional nonlinear dynamical systems. gLaSDI’s autoencoder discovers intrinsic nonlinear latent representations of high-dimensional data, while dynamics identification models capture local latent-space dynamics.
The Elasticsearch Excellence Awards recognize the efforts of visionary teams. LLNL was recognized as a public sector organization leading the way in innovative, sustainable, and critical use cases. The winning entry for the Public Sector Award supports the Lab’s HPC center and enables extreme-scale work across multiple research domains and scientific applications. With a single tool for system security and monitoring, the organization can search large datasets with improved visualizations providing the insight needed to protect its environments.
Flux 0.48.0 Released
March 07, 2023 (release)
Flux is a flexible framework for resource management consisting of a suite of projects, tools, and libraries which may be used to build site-custom resource managers for HPC centers. This version includes support for RFC 36 submission directives, flux-core configuration in ascii-only mode, flux-mini subcommands available as top-level flux commands, and more.
SAMRAI (Structured Adaptive Mesh Refinement Application Infrastructure) is an object-oriented C++ software library that enables exploration of numerical, algorithmic, parallel computing, and software issues associated with applying structured adaptive mesh refinement (SAMR) technology in large-scale parallel application development. The latest release includes RAJA-based kernel fusion features and an option to change how small patches are treated during load balancing.
Elastic Webinar: You Must Unlearn What You Have Learned
March 01, 2023 (event-report) (multimedia)
LLNL security operations team lead Ian Lee recently gave a webinar describing how the Lab uses Elasticsearch for HPC. The webinar is available to watch on demand (19:27); the abstract follows:
High Performance Computing (HPC) systems generate massive amounts of data and logs. In addition, the retention requirements are only increasing to ensure data remains available for incident response, audits, and other business needs. Ingesting and making sense of all the data takes a correspondingly large amount of computing power and storage. With El Capitan, a 2 Exaflop computer arriving and being deployed at LLNL in 2023, we’ll have even larger processing needs in the future. Therefore over the past year, Livermore Computing at LLNL has been migrating our current logging infrastructure to Elasticsearch and Kibana in an effort to handle the increasing amount of data even faster than before. This talk will focus on the changes we’ve made, why we decided to go with Elastic, and address some of the bumps we’ve hit along the way.
AWS Leverages LLNL Finite Element Discretization for Electromagnetics Simulations of Quantum Computing Hardware
February 22, 2023 (story)
Amazon Web Services (AWS) recently introduced Palace (PArallel, LArge-scale Computational Electromagnetics), a parallel finite element code for full-wave electromagnetics simulations. Palace is used at the AWS Center for Quantum Computing to perform large-scale 3D simulations of complex electromagnetics models and enable the design of quantum computing hardware. This open source, parallel finite element code for full-wave 3D electromagnetic simulations in the frequency or time domain uses the MFEM finite element discretization library and can be installed via Spack. Learn more about Palace in AWS’s blog post.
We are interested in estimating the uncertainties of deep neural networks, which play an important role in many scientific and engineering problems. In this paper, we present a striking new finding that an ensemble of neural networks with the same weight initialization, trained on datasets that are shifted by a constant bias, gives rise to slightly inconsistent trained models, where the differences in predictions are a strong indicator of epistemic uncertainties. Using the neural tangent kernel (NTK), we demonstrate that this phenomenon occurs in part because the NTK is not shift-invariant. Since this is achieved via a trivial input transformation, we show that this behavior can therefore be approximated by training a single neural network – using a technique that we call ∆−UQ – that estimates uncertainty around prediction by marginalizing out the effect of the biases during inference. We show that ∆−UQ’s uncertainty estimates are superior to many of the current methods on a variety of benchmarks: outlier rejection, calibration under distribution shift, and sequential design optimization of black box functions.
New Repo: LUAR
February 15, 2023 (new-repo)
With LUAR (Learning Universal Authorship Representations), researchers conduct the first large-scale study of cross-domain transfer for authorship verification considering zero-shot transfers involving three disparate domains: Amazon Reviews, fanfiction short stories, and Reddit comments. This code accompanies the paper “Learning Universal Authorship Representations” from the Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP). The abstract follows:
Determining whether two documents were composed by the same author, also known as authorship verification, has traditionally been tackled using statistical methods. Recently, authorship representations learned using neural networks have been found to outperform alternatives, particularly in large-scale settings involving hundreds of thousands of authors. But do such representations learned in a particular domain transfer to other domains? Or are these representations inherently entangled with domain-specific features? To study these questions, we conduct the first large-scale study of cross-domain transfer for authorship verification considering zero-shot transfers involving three disparate domains: Amazon reviews, fanfiction short stories, and Reddit comments. We find that although a surprising degree of transfer is possible between certain domains, it is not so successful between others. We examine properties of these domains that influence generalization and propose simple but effective methods to improve transfer.
New Repo: dbComm
February 15, 2023 (new-repo)
dbComm is a module for integrating a Python codebase with MongoDB using the PyMongo library. The methods contained are used to connect to a Mongo database and push/pull data.
Results from the First LLNL Developer Survey
February 08, 2023 (this-website)
With software development underpinning much of LLNL’s mission-driven science, the Lab’s developer community has a long history of figuring out solutions to a huge number of complex challenges. But how do we know what’s working and what isn’t? What tools or other support do developers need to do their best work? The RADIUSS project team conducted a survey of LLNL developers in late 2022 to assess their happiness and needs, both for awareness and for targeted action that might be pursued in the next year. Some of this insight has been published for the purposes of knowledge sharing and community building. Additionally, since RADIUSS products are open source, external users can benefit from improvements that LLNL makes internally and pushes out to publicly available projects. Read the survey report.
Spack Videos from FOSDEM
February 06, 2023 (event-report) (multimedia)
FOSDEM is a free event for software developers to meet, share ideas, and collaborate. The Spack team had three presentations at this year’s event in February. If you missed them, you can watch the videos here:
Skywing: Open Source Software Aids Collaborative Autonomy Applications
January 25, 2023 (new-repo) (story)
Modern U.S. critical infrastructure, from the electrical grid that sends power to homes to the pipelines that deliver water and natural gas and the railways and roadways we travel, is full of digitized components. In a power grid, this includes distributed energy resources such as smart meters, solar inverters, power-quality sensors, and protection devices that are geographically spread out, programmable, and network connected. These networks, as currently designed, typically rely on a single control center for analysis and decision making.
To defend against cyber-attacks and harden the system, LLNL mathematicians, systems analysts, power engineers, cybersecurity experts and computational scientists have turned to collaborative autonomy—a new class of computational techniques that teach networked devices how to self-organize into a collective whole. And recently, an LLNL research team has developed Skywing—an open-source, high-reliability, real-time, decentralized software platform for domain scientists, mathematicians, and computer scientists exploring collaborative-autonomy applications for critical infrastructure. Skywing provides approaches and solutions for real-world applications that solve problems and allow for confidence in the results. It also helps lower the barrier to entry for those who may lack fluency in decentralized software development.
Discrete resource event modeling and multi-cluster scheduling simulator (DR_EVT) aims to provide a computational environment for simulating job scheduling and resource management using a set of heterogeneous clusters. Currently, the repo provides means to load and process job trace files, and the team is working on a workload model based on the trace data. DR_EVT is designed for Linux-based systems and contains information specific to Livermore Computing (LC) platforms.
Johannes Doerfert Wins Fellowship from Better Scientific Software Organization
December 22, 2022 (profile) (story)
Johannes Doerfert was selected as one of six 2023 Better Scientific Software fellows, recognizing his leadership and advocacy of high-quality scientific software. The Better Scientific Software (BSSw) community is an international group of researchers, practitioners, and stakeholders from national laboratories, academic institutions, and industry who are dedicated to curating, creating, and disseminating information that leads to improved software for the advancement of computational science and engineering (CSE) and related technical computing areas, with a particular interest in CSE on high-performance (parallel) computers. The BSSw Fellowship Program gives recognition and funding to leaders and advocates of high-quality scientific software. Each 2023 Fellow receives up to $25,000 for an activity that promotes better scientific software. Doerfert will focus his fellowship funding on improving developer productivity by demystifying the compiler black box. His plans include making short introductory videos on compiler technology with a focus on improved interaction and available tooling. The new fellows will be recognized at the 2023 Exascale Computing Project Annual Meeting in January.
OpenZFS: Improving File System Efficiency
December 15, 2022 (story)
Large-scale parallel computing systems generate massive amounts of data every second, and file systems are crucial for handling data transfer at LLNL’s HPC center. Livermore Computing’s (LC’s) Scalable Storage group helps manage the hardware and software necessary to keep file systems—and therefore supercomputers—running smoothly and efficiently. One key tool in the group’s portfolio is the ZFS (Zettabyte File System) project, which controls I/O operations and optimizes storage volume capacity. In 2013, the original ZFS project and ZFS on Linux evolved into what is now known as OpenZFS, which is maintained by a global developer community that includes LC staff. The Scalable Storage group has adapted OpenZFS for the Lab’s needs, especially as new generations of Lustre-based HPC systems—including the upcoming El Capitan exascale supercomputer—are designed and installed. Read more about OpenZFS on the LLNL Computing website.
Automated Cache for Container Executables
December 15, 2022 (story)
LLNL’s Vanessa Sochat and collaborators from the Wellcome Sanger Institute, the Pawsey Supercomputing Research Institute, and the University of Texas at Dallas have written a paper about the Singularity Registry HPC (“shpc”). Sochat breaks down the team’s methods in a Twitter thread. You can also download the preprint PDF; the abstract follows:
Linux container technologies such as Docker and Singularity offer encapsulated environments for easy execution of software. In high performance computing, this is especially important for evolving and complex software stacks with conflicting dependencies that must co-exist. Singularity Registry HPC (“shpc”) was created as an effort to install containers in this environment as modules, seamlessly allowing for typically hidden executables inside containers to be presented to the user as commands, and as such significantly simplifying the user experience. A remaining challenge, however, is deriving the list of important executables in the container. In this work, we present new automation and methods that allow for not only discovering new containers in large community sets, but also deriving container entries with important executables. With this work we have added over 8,000 containers from the BioContainers community that can be maintained and updated by the software automation over time. All software is publicly available on the GitHub platform, and can be beneficial to container registries and infrastructure providers for automatically generating container modules to lower the usage entry barrier and improve user experience.
Best Paper Winner Improves Scientific Workflow Performance
December 07, 2022 (story)
The IEEE international eScience conference, which emphasizes compute- and data-intensive research methods, bestowed the 2022 Best Paper Award on a multidisciplinary team that includes LLNL staff and external collaborators. The paper, “Scalable Composition and Analysis Techniques for Massive Scientific Workflows,” details the optimization of a drug screening workflow for the American Heart Association (AHA). The AHA Molecule Screening (MoleS) workflow combines specialized software tools to manage HPC hardware heterogeneity. The MoleS end-to-end workflow relies on both general-purpose and domain-specific software tools, some of which are open source and/or developed at LLNL: Maestro for workflow execution, Flux for workload management and scheduling, RabbitMQ for message brokering (orchestrated by Kubernetes), ConveyorLC for docking automation tasks, Fusion machine learning algorithms for binding affinity predictions, and GMD (Generative Molecular Design) for the small-molecule discovery loop. Read more on the LLNL Computing website.
Carol Woodward Helps Scientists Solve Diverse Challenges
December 01, 2022 (profile)
Carol Woodward joined the Lab’s Center for Applied Scientific Computing (CASC) in 1996, first as a postdoc and then as a staff researcher. CASC has developed a reputation over the years, she notes, as an organization that can solve tough problems, so she and her colleagues are asked to consult on a diverse array of projects. “It’s nice because it means I can work at the same place and not just do one thing for a long time—I get to keep changing what I work on,” she says. She is the principal investigator for SUNDIALS, a package of time integrators and nonlinear solvers that garners more than 100,000 downloads annually and is used in myriad simulation-dependent applications. Her group’s technical contributions to the software have modernized it and upgraded its functionality for more than two decades, enabling it to scale to DOE’s highest-end computing systems. With its innovative solvers and flexibility for a variety of computing systems, SUNDIALS was awarded the prestigious 2023 SIAM/ACM Prize in Computational Science and Engineering. In 2022 Woodward was promoted to Distinguished Member of Technical Staff, LLNL’s highest technical job classification level. Read more about her work.
CASC Newsletter Highlights Open Source Projects
November 21, 2022 (story)
LLNL’s Center for Applied Scientific Computing (CASC) has published a new issue of its external newsletter. The issue includes articles featuring open source software projects:
Compiler Co-Design with the RAJA Team: The RAJA Performance Suite, developed during the Sierra platform procurement, is a key tool for the Lab’s interactions with compiler vendors and GPU vendors, such as NVIDIA and AMD, for current LLNL supercomputers.
HPCwire Award for Applying Cognitive Simulation to Inertial Confinement Fusion
November 17, 2022 (story)
The high-performance computing publication HPCwire announced LLNL as the winner of its Editor’s Choice award for Best Use of HPC in Energy for applying cognitive simulation (CogSim) methods to inertial confinement fusion (ICF) research. The award recognizes the team for progress in their machine learning-based approach to modeling ICF experiments performed at the National Ignition Facility (NIF) and elsewhere, which has led to the creation of faster and more accurate models of ICF implosions. The CogSim work addresses the need for better models that can fully utilize available datasets, can accurately estimate uncertainty, and can improve with additional data. Much of the CogSim work has been done on HPC machines including Sierra, Lassen, and Corona, using the open source projects Merlin, a custom deep-learning workflow tool, and the Livermore Big Artificial Neural Network toolkit (LBANN), a deep-learning training framework optimized for HPC.
Recap of MFEM's Second Community Workshop
November 16, 2022 (event-report) (multimedia)
The MFEM team hosted the second annual MFEM Community Workshop on October 25, 2022. The goal of the workshop was to foster collaboration among all MFEM users and developers, share the latest MFEM features with the broader community, deepen application engagements, and solicit feedback to guide future development directions for the project. If you missed the workshop, check out these resources: 2022 agenda (with speakers’ slides linked as PDFs) and news article. Videos of the talks are in production and will be posted soon.
SC22 Twitter Space: Open Source for HPC
November 15, 2022 (event-report) (multimedia)
The @Livermore_Comp account hosted a Twitter Space during the 34th annual Supercomputing Conference (SC22) on November 15. LLNL’s Meg Epperly moderated a panel featuring Elsa Gonsiorowski (SCR and other projects), Greg Becker (Spack), and David Gardner (SUNDIALS). The panel answered questions about why open source software is important for HPC centers, who benefits from using and contributing to it, and why LLNL actively develops and nurtures open source software. Listen to the recording.
Optimizing Workflow with Flux
November 10, 2022 (story)
The latest issue of LLNL’s Science & Technology Review magazine showcases the R&D 100 award–winning Flux software framework. Honored with a 2021 R&D 100 Award, Flux is a scalable, flexible next-generation workload management framework that maximizes resource utilization while allowing scientific applications and workflows to run faster and more efficiently. Developed in collaboration with university partners, Flux also enables new resource types, schedulers, and services to be deployed at data centers as they continue to evolve.
Spack: Sustaining the HPC Software Ecosystem
November 09, 2022 (event-report) (multimedia)
Todd Gamblin, an LLNL Distinguished Member of Technical Staff, gave a presentation on November 9 for the Dell Technologies HPC Community. His talk, “Sustaining the HPC Software Ecosystem,” described how HPC software can be managed more easily for all customers and users with Spack and included an overview of recent developments in the Spack community such as a partnership with AWS to provide infrastructure for a worldwide binary cache, a recent machine learning special interest group within Spack, and work to handle the complexities of installing software for GPUs. Slides can be downloaded, and a recording is available with free registration.
SUNDIALS Wins 2023 SIAM/ACM Prize in Computational Science and Engineering
November 07, 2022 (story)
The Society for Industrial and Applied Mathematics (SIAM) and Association for Computing Machinery (ACM) announced that they have awarded the 2023 SIAM/ACM Prize in Computational Science and Engineering to the team behind the LLNL-developed SUNDIALS software suite. The prestigious award is handed out every two years and recognizes outstanding contributions to the development and use of mathematical and computational tools and methods for the solution of science and engineering problems. It is one of SIAM’s most significant awards and will be presented to the team at the 2023 SIAM Conference on Computational Science and Engineering in Amsterdam next February. Because of its ease of use, flexibility, and extensive documentation, SUNDIALS has become internationally recognized as one of the most effective and efficient time integration libraries. It is widely used by government laboratories and in academic institutions and industry, leading to advances in a variety of applications including fusion device modeling, watershed modeling, and reacting flow simulations.
As the creator of Spack, Todd Gamblin states, “Open source tools developed at LLNL enable people around the world to use HPC resources more effectively and to do better science.” In turn, these tools provide a framework for an entire community to maintain software needed by LLNL and its programs—for instance, LLNL could never maintain Spack’s thousands of software packages alone, and it benefits from the work of Spack’s enthusiastic contributors. Ultimately, he notes, “Sustaining open source communities is about finding leverage. It’s worthwhile for LLNL to put in the effort to build and maintain something like Spack if it incites a community of thousands to work together for the benefit of all.” In 2022 Gamblin was promoted to Distinguished Member of Technical Staff, LLNL’s highest technical job classification level. Read more about his recent work.
Celebrate Exascale Day 2022
October 18, 2022 (event) (multimedia)
The Exascale Computing Project (ECP) celebrates a new era of scientific discovery with Exascale Day on October 18, or “10^18” to represent the exascale threshold of floating-point operations per second. The event runs all week and provides multimedia and articles that explain the impact areas of exascale computing from the Department of Energy national laboratories (including LLNL), HPC manufacturers, and universities and industrial organizations. Much of the ECP’s software stack is open source.
The MSU Disentanglement Analysis Software (MDAS) is used to disentangle the forced and unforced components of tropospheric temperature change over the satellite era (after 1979) using maps of surface temperature change as a predictor. Input data is available via Zenodo (doi: 10.5281/zenodo.7199961). An accompanying publication is forthcoming from the Proceedings of the National Academy of Sciences.
New Project to Improve Differentiation of Extreme-Scale Science Applications
October 10, 2022 (event-report)
Under a recently funded project, researchers at LLNL and the Massachusetts Institute of Technology (MIT) will build on advances in LLNL’s MFEM finite element library and MIT’s Enzyme AD tool to address the challenge of efficiently differentiating large-scale Department of Energy (DOE) applications, that is, predicting how adjustments in design parameters will impact the output of a code. Knowledge of optimal outputs is increasingly needed for complex simulation codes to be used for design optimization, machine learning, uncertainty quantification, and sensitivity analysis, among other applications. While automatic differentiation (AD) has made the differentiation process easier, traditional AD tools require significant changes that are not feasible for many existing large-scale DOE applications. Read more about the project at LLNL News.
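To make the idea of differentiating a code concrete, here is a minimal sketch of forward-mode automatic differentiation using dual numbers. It is only an illustration of the general technique and is not how Enzyme or MFEM implement differentiation. The names `Dual`, `simulate`, and `sensitivity` are hypothetical, and `simulate` is a toy stand-in for a simulation code with a single design parameter.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one input."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sin(x):
    """Sine with the chain rule applied to the derivative part."""
    x = _lift(x)
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)


def simulate(p):
    """Hypothetical 'simulation': output as a function of design parameter p."""
    return 3.0 * p * p + sin(p)


def sensitivity(f, p):
    """Derivative of f at p, obtained by seeding the input derivative with 1."""
    return f(Dual(p, 1.0)).deriv
```

Calling `sensitivity(simulate, 2.0)` returns the derivative of the toy output with respect to `p` at `p = 2`, which matches the hand-derived expression `6 * p + cos(p)`. Note that the toy function had to be written against the `Dual` type; this kind of code change is the burden the paragraph above attributes to traditional AD tools.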
New Repo: Rollerball
October 06, 2022 (new-repo)
The Rollerball Notepad++ Plugin repo contains code for a benignware (pseudo malware) plugin for the Notepad++ editor based on this plugin template. The plugin purports to be an AutoSave feature, but in fact it exfiltrates file content based on keywords to a webserver that you specify. This code was designed as a way to test detection capabilities.
The Assured Timing Detector (ATD) software provides an implementation in C++ of an adaptable, model-based system for monitoring a timing signal for anomalies versus a reference time source ensemble. This system is designed to be model-based, adaptive, and customizable. A companion status display software package, ATD-SD, is also available for use with this software.
New Repo: Skywing
September 09, 2022 (new-repo)
Skywing is a high-reliability, real-time, decentralized platform for collaborative autonomy–focused applications. The repo includes installation instructions, examples, and a tutorial; some dependencies are managed as git submodules.
Celebrating 10 Years of Hackathons
September 07, 2022 (event-report)
After 10 years and 33 hackathons, nothing can stop this beloved tradition. “Hackathons are wildly popular not just because they allow employees to try new things and develop new skills, but also because they are so much fun!” says Computing’s associate director Bruce Hendrickson. Read an article about this milestone, and check out the LLNL Flickr album of hackathon photos throughout the years.
Dev Day Returns for the Sixth Year
September 02, 2022 (event-report)
Held in a hybrid format for the first time, LLNL’s Developer Day 2022 convened more than 70 people for an agenda of lightning talks, a town hall discussion, and guest speakers. Dev Day provides different ways for the audience to learn about and engage in topics of interest to the developer community (such as lightning talks about small projects, summaries of literature, deep dives into project planning and implementation, discussions of career opportunities and challenges, and networking and brainstorming sessions), and the format varies every year. This year’s event spotlighted guest speakers who explained the impact of unique software projects on their organizations. For instance, Miranda Mundt, a research software engineer at Sandia National Laboratories, described her team’s tiered approach to software quality practices, while Dr. Arun Viswanathan from NASA’s Jet Propulsion Laboratory presented work that keeps space missions resilient against cyber threats.
Sandia Leverages LLNL's Open Source Software for New Website
August 31, 2022 (meta) (story)
A Sandia software development team has launched a new website based on one of LLNL’s open-source projects. The new software.sandia.gov website (repository: github.com/sandialabs/sandialabs.github.io) serves as a portal into Sandia’s GitHub repositories, providing more than a dozen browsable categories alongside data visualizations of GitHub data—repository relationships, commit and pull request activity, common licenses, and more. Forked from this very website (repository: github.com/LLNL/llnl.github.io), Sandia’s site is also built on a Jekyll template and served via GitHub pages. Read more about the implementation process and the ways the two websites vary.
LLNL ATDM Addresses Software Infrastructure Needs for Multiple Communities
August 30, 2022 (story)
The Advanced Technology Development and Mitigation (ATDM) program within the Exascale Computing Project (ECP) shows that the best way to support the mission is through open collaboration and a sustainable software infrastructure. Although ATDM primarily supports NNSA’s traditionally closed mission of national security, LLNL’s ATDM Software Technology (ST) project contributes key open source components of a full-featured, integrated, and maintainable software stack for exascale systems that will impact both the ECP and the broader HPC community. A new article on the ECP website describes the project’s goals and technical focus areas in an interview with Livermore Computing division leader Becky Springmeyer and ATDM ST deputy lead Todd Gamblin.
Materials Available from RADIUSS AWS Tutorials
August 29, 2022 (event-report) (multimedia)
This summer, LLNL’s RADIUSS team conducted a series of tutorials in collaboration with Amazon Web Services (AWS), demonstrating how to use several GPU-ready projects in the cloud and on premises. The tutorials were open to everyone and held every week throughout August. Participants followed along on their own AWS EC2 instance (provided). No previous experience was necessary. Materials are accessible on the RADIUSS website.
ECP Annual Meeting Videos Now Available
August 24, 2022 (event-report) (multimedia)
The Exascale Computing Project, a joint effort between the DOE Office of Science and NNSA, brings together several national laboratories to address many hardware, software, and application challenges inherent in the organizations’ scientific and national security missions. The ECP’s annual meeting was held this year on May 2–6. Each day’s sessions are available in a dedicated YouTube playlist. Individual sessions highlighted below feature LLNL staff and open source projects.
As part of the Exascale Computing Project (ECP), the Center for Efficient Exascale Discretizations (CEED) is a research partnership between two U.S. Department of Energy laboratories and five universities. LLNL leads the Center. All of CEED’s software is open source.
CEED held its sixth annual meeting (CEED6AM) on August 9-11 in a hybrid format: in-person at the Siebel Center for Computer Science on the University of Illinois Urbana-Champaign campus in Urbana and virtually using ECP Zoom for videoconferencing and Slack for side discussions. Learn more about the agenda on the CEED6AM event website.
The number of smart meters and the demand for energy are expected to increase by 50% by 2050, and so will the amount of data those smart meters produce. While energy standards have enabled large-scale data collection and storage, maximizing this data to mitigate costs and consumer demand has been an ongoing focus of energy research. GridDS is an open source data science toolkit for power and data engineers that will provide an integrated energy data storage and augmentation infrastructure, as well as a flexible and comprehensive set of state-of-the-art machine learning models. By providing an integrative software platform to train and validate machine learning models, GridDS will help improve the efficiency of distributed energy resources, such as smart meters, batteries, and solar photovoltaic units. GridDS also is designed to leverage advanced metering infrastructure, outage management systems data, supervisory control and data acquisition, and geographic information systems to forecast energy demands and detect incipient grid failures. Read more about GridDS at LLNL News.
GEOSX Simulates Carbon Dioxide Storage
August 02, 2022 (story)
GEOSX and its predecessor, GEOS, were developed from the ground up with the help of experts from across LLNL, combining a range of disciplines including engineering, seismology, hydrology, computational geoscience, and oil- and gas-industry expertise to build a tool that can take advantage of advanced computing platforms. A Research Highlight article in the latest issue of Science & Technology Review describes how GEOSX will improve the management and security of geological repositories and support planning for the widespread implementation of CO2 storage at an industrial scale by simulating how fluids flow and rocks break deep underground. Read “GEOSX Simulates Carbon Dioxide Storage” on the S&TR website.
New Repo: PDBspheres
July 29, 2022 (new-repo)
PDBspheres is a structure-based method for finding and evaluating structural similarities in protein regions relevant to ligand binding. PDBspheres comprises an exhaustive library of protein structure regions (“spheres”) adjacent to complexed ligands derived from the Protein Data Bank (PDB), along with methods to find and evaluate structural matches between a protein of interest and spheres in the library.
New Repo: Wintap
July 22, 2022 (new-repo)
Wintap is an extensible host-based agent for Windows. Wintap provides a singular and extensible service-based runtime environment, a unified data model, API abstraction, and data discovery for an integrated, locally hosted web-based analytic “workbench” from where real-time event streams can be queried and explored.
New Repo: DFTT
July 21, 2022 (new-repo)
DFTT (Detection Framework Testbed and Toolkit) is a database and associated Java programs intended to facilitate the development and testing of algorithms for operating suites of correlation and subspace detectors. This framework is a generalization of the system described in Harris and Dodge (2011). It allows retrospective processing of sequences of data using various system configurations. Results are saved in a database, so it is easy to compare the results obtained using different configurations of the system.
NAHOMCon22 Explores High Order Methods for PDEs
July 19, 2022 (event-report)
After a few years off, the North American High Order Methods Conference (NAHOMCon) returned on July 18-19 in San Diego. NAHOMCon provides a North American forum for computational scientists, mathematicians, and engineers to share ideas and techniques on, and further the state of the art of, high order methods for the solution of partial differential equations across a broad range of scientific and engineering applications. The DOE co-design Center for Efficient Exascale Discretizations (CEED) participates in this conference. CEED is a partnership between two U.S. DOE laboratories (Livermore & Argonne) and five universities in support of the Exascale Computing Project.
The NAHOMCon22 program featured several LLNL speakers, whose GitHub profiles and abstracts are linked here:
LLNL’s Data Science Institute launched the Open Data Initiative (ODI) a few years ago. The ODI shares LLNL’s rich, challenging, and unique datasets with the larger data science community. The goal is for these datasets to help support curriculum development, raise awareness around LLNL’s data science efforts, foster new collaborations, and be leveraged across other learning opportunities. The ODI currently has 13 publicly available datasets in its collection, the newest of which builds off the Cardioid open source code, which simulates the electrophysiology of the human heart. A research team has conducted a computational study to generate a dataset of cardiac simulations at high spatiotemporal resolutions. This dataset was built using real cardiac bi-ventricular geometries and clinically inspired endocardial activation patterns under different physiological and pathophysiological conditions. It consists of pairs of computationally simulated intracardiac transmembrane voltage recordings and electrocardiogram signals. Read more and download the data from the ODI web page.
New Repo: EchemFEM
July 13, 2022 (new-repo)
EchemFEM provides finite element solvers for electrochemical transport. Both continuous Galerkin (CG) and discontinuous Galerkin (DG) schemes are provided. The following transport mechanisms are available: diffusion, advection, and electromigration. EchemFEM supports both non-porous and porous cases. The ionic potential can be described using either an electroneutrality constraint or a Poisson equation. In the porous case, the electronic potential can be described by a Poisson equation.
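For readers unfamiliar with these terms, the three transport mechanisms are conventionally combined in the textbook Nernst-Planck equations for dilute solutions, written out below. Whether EchemFEM uses exactly this form or this notation is an assumption. Here c_i is the concentration of species i, D_i its diffusivity, z_i its charge number, u the fluid velocity, Φ the ionic potential, F the Faraday constant, R the gas constant, T the temperature, and ε the permittivity.

```latex
% Species flux: diffusion + advection + electromigration
\mathbf{N}_i = -D_i \nabla c_i + c_i \mathbf{u}
               - \frac{z_i F}{R T} D_i c_i \nabla \Phi

% Conservation of each species (no bulk reactions assumed)
\frac{\partial c_i}{\partial t} + \nabla \cdot \mathbf{N}_i = 0

% Closure for the ionic potential, option 1: electroneutrality constraint
\sum_i z_i c_i = 0

% Closure for the ionic potential, option 2: Poisson equation
-\nabla \cdot \left( \varepsilon \nabla \Phi \right) = F \sum_i z_i c_i
```

Only one of the two closures is used at a time; electroneutrality replaces the Poisson equation when charge separation is negligible at the scale of interest.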
New Repo: SAC2000
July 05, 2022 (new-repo)
SAC2000 (Seismic Analysis Code for the third millennium) is a general purpose interactive program designed for the study of sequential signals, especially time-series data. Emphasis has been placed on analysis tools used by research seismologists in the detailed study of seismic events. Analysis capabilities include general arithmetic operations, Fourier transforms, three spectral estimation techniques, IIR and FIR filtering, signal stacking, decimation, interpolation, correlation, and seismic phase picking.
New RADIUSS Activity Portal
June 23, 2022 (this-website)
The RADIUSS project has a new activity portal (source code for this GitHub action) that displays contributors’ activities, such as pull requests, issue comments, and releases. Visitors to this page can browse the entire list or filter by activity type to learn more about who contributes to RADIUSS and how they do so. RADIUSS aims to develop and deploy a common base of foundational scientific software with opt-in adoption from LLNL applications in order to reduce long-term software costs and increase agility.
@LLNL_OpenSource Account Is Twitter Verified
June 10, 2022 (this-website)
Our @LLNL_OpenSource Twitter account is now officially verified with a blue checkmark. This badge affirms the account’s authenticity and notability to users and confirms that our account is active. Read more about verified accounts on Twitter’s help website.
LLNL and AWS to Cooperate on Standardized HPC Software Stack
May 26, 2022 (story)
LLNL and Amazon Web Services (AWS) have signed a memorandum of understanding (MOU) to define the role of leadership-class HPC in a future where cloud HPC is ubiquitous. Under the MOU, LLNL and AWS will explore software and hardware solutions spanning cloud and on-premises HPC environments, with the goal of establishing a common stack of open source software components that can run equally well at both large HPC centers and on cloud resources. LLNL and AWS have an existing open source collaboration involving Spack; building off that collaboration, the organizations will look to better understand how HPC centers can best utilize cloud resources to support HPC and explore models for cloud-bursting, data staging, and data migration for deploying both on-site and in the cloud. Read more about the MOU at LLNL News.
LLNL at ISC22
May 06, 2022 (event)
ISC High Performance Conference (ISC22) returns on May 29 through June 2, with in-person events held in Hamburg, Germany. The event brings together the HPC community—from research centers, commercial companies, academia, national laboratories, government agencies, exhibitors, and more—to share the latest technology of interest to HPC developers and users. View LLNL’s lineup of tutorials, BOFs, and workshops.
LLNL's Spring Hackathon (and 32nd Overall) Coming Up
May 06, 2022 (event)
Held since 2012, LLNL’s hackathons are 24-hour opportunities to brainstorm, foster creativity, prototype, and explore. Participants work in groups or individually and often strive to learn new skills, programming languages, and tools in service to LLNL’s missions. This year’s spring event (May 26-27) will be held in person at the Livermore Valley Open Campus. Sponsors are two Computing divisions: Enterprise Applications Services and National Ignition Facility Computing.
New RADIUSS Catalog and Repo
May 03, 2022 (new-repo) (this-website)
The RADIUSS project has a new look including an About page and an interactive catalog of open-source products. These new web pages are managed in a dedicated repo under the LLNL GitHub organization. RADIUSS aims to develop and deploy a common base of foundational scientific software with opt-in adoption from LLNL applications in order to reduce long-term software costs and increase agility.
Exascale Computing Project Community BOF Days
April 25, 2022 (event)
The Exascale Computing Project (ECP) 2022 Community Birds-of-a-Feather (BOF) Days will take place May 10–12 with multiple sessions per day. The BOF Days provide an opportunity for the HPC community to engage with ECP teams to discuss the latest development efforts. Each BOF will be a 60- to 90-minute session on a given topic, with a brief overview followed by Q&A. All sessions will be conducted via Zoom. View the schedule; each session has its own registration link.
Julian Andrej Applies Mathematics and Engineering to Support LLNL Missions
March 09, 2022 (profile)
Computational mathematician Julian Andrej began using LLNL-developed, open source software while in Germany. Now at Livermore, he lends his expertise to the Center for Applied Scientific Computing, developing code for next-generation computing hardware. “I know that every part of my work is contributing to a mission, and I can see a clear traceable path of its impact,” he says. Andrej contributes to the MFEM and SUNDIALS projects. Read more about his work.
Peter Lindstrom: Then and Now
January 25, 2022 (profile)
The Department of Energy’s (DOE’s) Office of Science interviewed LLNL computer scientist Peter Lindstrom about his work since receiving the 2011 Early Career Award. Lindstrom leads the zfp project, which provides a compressed format for representing multidimensional floating-point and integer arrays. zfp is currently used in the DOE’s Exascale Computing Project.
FEM@LLNL Seminar Series
January 05, 2022 (multimedia) (story)
The MFEM team has announced a new FEM@LLNL seminar series focusing on finite element research and applications talks of interest to the MFEM community. Visit the MFEM website to see the full lineup of speakers. Seminars will be hosted and recorded via WebEx; videos of the recordings will be available from the MFEM website.
The MFEM team held the first annual MFEM Community Workshop on October 20, 2021. MFEM, which stands for Modular Finite Element Methods, is an open source C++ software library that provides high-order mathematical algorithms for large-scale scientific simulations. The project’s discretization methods enable HPC systems to run these simulations more efficiently. More than 150 researchers from dozens of organizations and countries attended the one-day virtual workshop organized by Aaron Fisher, Tzanio Kolev, Will Pazner, and Mark Stowell. According to the registration survey, more than half of the participants were new users. An article about the workshop is available on LLNL’s Computing website, and links to videos of the presenters can be found at the MFEM website.
Celebrate Exascale Day 2021
October 18, 2021 (event) (multimedia)
Exascale computing will transform the ability to tackle some of the world’s most important challenges. The Exascale Computing Project (ECP) celebrates this new era of scientific discovery with the now-annual Exascale Day on October 18, or “10^18” to represent the exascale threshold of floating-point operations per second. This virtual event runs all week and provides multimedia and articles that explain the impact areas of exascale computing from the Department of Energy national laboratories (including LLNL), HPC manufacturers, and universities and industrial organizations. Much of the ECP’s software stack is open source.
Variorum is a platform-agnostic software library exposing monitor and control interfaces for several features in hardware architectures from IBM, ARM, and NVIDIA. In a two-part lecture series, the Variorum team demonstrates everything necessary to start using Variorum to write portable power management code. These videos were recorded as part of the Exascale Computing Project’s lecture series.
Reduced order models (ROMs) combine data and underlying first principles to accelerate physical simulations, reducing computational complexity without losing accuracy. The C++ software library called libROM provides data-driven physical simulation methods from intrusive projection-based ROMs to non-intrusive black-box approaches. The project has a new website that contains documentation and examples. Additionally, computational scientist Youngsoo Choi has recorded three user tutorials, with plans to record more in the future:
The Python-based library Hatchet will be part of a new Supercomputing 2021 (SC21) tutorial on performance tools. Hatchet allows Pandas dataframes to be indexed by structured tree and graph data and is intended for analyzing hierarchical performance data. The development team has released a video titled “User-Centric Automated Performance Analysis of Hybrid Parallel Programs” to preview the tutorial and give attendees an idea of what to expect. SC21 will take place in a hybrid format on November 14–19.
Vanessa Sochat Presents Keynote at SeptembRSE
September 06, 2021 (event-report) (multimedia)
LLNL computer scientist and open source advocate Vanessa Sochat delivered a keynote presentation titled “The Stories We Tell Ourselves” at the 5th Conference of Research Software Engineers on September 6. Sochat’s work includes developing container technologies, supporting tools, and fostering open source communities. She founded and hosts the Research Software Engineer Stories podcast and is an active member of the U.S. Research Software Engineer Association.
How to Spack a Software Package
August 27, 2021 (multimedia)
At the AWS/Arm Cloud Hackathon, LLNL’s Todd Gamblin and Greg Becker discussed (video, 31:42) the essential skills and concepts needed to understand how to create and deploy Spack recipes to build scientific codes. The Hackathon was held July 12-16 and aimed to assemble the HPC community around a common goal of beginning the porting, testing, and tuning processes for dozens of codes to use Arm-based processors.
Summer Hackathon Tradition Continues Virtually
August 23, 2021 (event-report)
Each new season brings another hackathon, and LLNL’s summer event took place on August 12–13. The event was sponsored by the Center for Applied Scientific Computing (CASC) and Livermore Computing (LC) divisions and organized by Stephanie Brink (CASC), Tammy Dahlgren (LC), and Stephen Herbein (LC). Additionally, Computing’s summer interns were encouraged to participate in the event. Ian Lee, open source advocate and Computing’s Alternate Organizational Information System Security Officer, kicked off the hackathon with a presentation titled “When a Hackathon Project Ends…Does It Make a Sound?” He gave participants a larger picture beyond the event’s concentrated 24 hours, detailing how he has shepherded hackathon projects into real-world applications with benefits to a wide group of users and developers across the Lab.
The goal of the workshop is to foster collaboration among all MFEM users and developers, share the latest MFEM features with the broader community, deepen application engagements, and solicit feedback to guide future development directions for the project.
Additionally, we are looking for users to present the work they are doing with MFEM. If you are interested in presenting, please indicate that in the registration form.
CEED's Fifth Annual Meeting Recap
August 05, 2021 (event-report)
As part of the Exascale Computing Project (ECP), the Center for Efficient Exascale Discretizations (CEED) is a research partnership between two U.S. Department of Energy laboratories and five universities. LLNL leads the Center. All of CEED’s software is open source.
CEED held its fifth annual meeting (CEED5AM) virtually on August 3-4. The goals of the meeting were to report on recent progress; deepen existing and establish new connections with ECP hardware vendors, ECP software technologies projects, and other collaborators; plan project activities; and work as a group to make technical progress. Presentations covered activities related to GPU support and GPU-enabled solvers, high-order methods and finite elements, software products including the AmgX linear solver library and libCEED algebraic library, benchmarking and optimization, various types of simulations enabled by CEED development, and much more.
Attendance included 97 researchers from 36 organizations:
8 national labs
ECP Annual Meeting Videos Now Available: Spack, CEED, Flux
July 29, 2021 (event-report) (multimedia)
The Exascale Computing Project, a joint effort between the DOE Office of Science and NNSA, brings together several national laboratories to address many hardware, software, and application challenges inherent in the organizations’ scientific and national security missions. The ECP’s annual meeting was held virtually this year on April 12-16. Several sessions are available in a dedicated YouTube playlist. LLNL’s highlights feature open source projects that are crucial to the ECP’s collaborative goals:
Spack BoF (runtime 1:00:40): This “birds of a feather” gathering details major developments in Spack releases, collaborative work with the E4S team, roadmap for future development, and results from a community survey.
Using Spack to Accelerate Developer Workflows (runtime 6:14:42): This tutorial focuses on developer workflows, covering installation, package authorship, Spack’s dependency model, and Spack environments and configuration. Participants can learn new skills in this tutorial, even if they have participated in Spack tutorials in the past.
Characterizing Performance Improvements in the Center for Efficient Exascale Discretizations (runtime 1:00:04, CEED section begins at 25:05): Speakers from ECP Application Development areas talked about how they set figures of merit, determined key performance parameters, and calculated efficiency of codes. CEED is a co-design center led by LLNL and focusing on discretization algorithms that better exploit the hardware and deliver a significant performance gain over conventional low-order methods. The video concludes with a panel discussion with the speakers.
Using Flux to Overcome Scheduling Challenges of Exascale Workflows (runtime 2:16:48): The Flux team provides an in-depth tutorial that demonstrates how Flux is used in challenging HPC workflows, how to unify Flux with other scheduling and resource management software tools, and how Flux’s job and resource model works, along with hands-on use cases and testing.
Dev Day Makes the Most of Virtual Format
July 29, 2021 (event-report)
Held virtually on July 15, our fifth annual Developer Day was a success. The morning session included lightning talks, a security-focused technical deep dive, and “quick takes” on remote-development resources. The afternoon session provided presentations about career paths and the Lab’s diversity and inclusion goals, capped by a career development panel discussion co-sponsored by the Data Science Institute.
Spack on CppCast
May 28, 2021 (multimedia)
The CppCast podcast recently hosted Spack creator Todd Gamblin and core developer Greg Becker on an episode (59:13) to discuss a documentation tool, a blog post about floating point numbers, and ABI changes. The podcast is created by and for C++ developers.
Flux: Enabling Modern Supercomputing Workflows
May 26, 2021 (multimedia)
Flux is an open-source software framework that manages and schedules computing workflows to maximize available resources to run applications faster and more efficiently. Flux’s fully hierarchical resource management and graph-based scheduling features improve the performance, portability, flexibility, and manageability of both traditional and complex scientific workflows on many types of computing systems—in the cloud, at remote locations, on a laptop, or on next-generation architectures. Watch this video to learn more about Flux (runtime 7:14).
Vanessa Sochat Is Building Research Software and Open Source Engagement
May 18, 2021 (profile)
Vanessa Sochat has built her software engineering and computer science career in an unconventional way. After earning an undergraduate degree in Psychology, her first research assistant job involved using command line software and writing scripts. “I had no idea what I was doing, nor did anyone teach me, but I thrived in this environment,” she says. Vanessa recently joined LLNL to work on the BUILD project, Spack package manager, and other open-source initiatives. She was one of the original developers of the Singularity container technology, and she created and continues to produce the RSE Stories podcast. Read Vanessa’s profile at LLNL Computing.
Called to a Valuable Function, Stephanie Brink Streamlines the Lab’s Code
April 27, 2021 (profile)
LLNL Computing relies on engineers like Stephanie Brink to keep the legacy codes running smoothly. “You’re only as fast as your slowest processor or your slowest function,” says Stephanie, who works in the Center for Applied Scientific Computing. By analyzing a legacy code’s performance, Stephanie and her team can reduce the amount of time it takes to run and allow for more critical science to be accomplished. Stephanie is a frequent contributor to open source software, including Hatchet and Variorum. Read the full profile at LLNL Computing.
LLNL's Spring Hackathon Coming Up
April 20, 2021 (event)
Held since 2012, LLNL’s hackathons are 24-hour opportunities to brainstorm, foster creativity, prototype, and explore. Participants work in groups or individually and often strive to learn new skills, programming languages, and tools in service to LLNL’s missions. Like the hackathons of the past year, the spring event (April 29-30) will be held virtually using WebEx and Mattermost for collaboration. LLNL sponsors are two Computing divisions: Enterprise Applications Services and National Ignition Facility Computing.
LLNL’s Rob Falgout Named to 2021 Class of SIAM Fellows
April 09, 2021 (profile) (story)
The Society for Industrial and Applied Mathematics (SIAM) has announced its 2021 Class of Fellows, including LLNL computational mathematician Rob Falgout. Falgout is best known for his development of multigrid methods and for hypre, one of the world’s most popular parallel multigrid codes. LLNL News has the full story about this honor.
Spack and the NoTearsHPC Cluster at AWS
March 25, 2021 (multimedia) (story)
From HPC Tech Shorts, this video (25:09) shows Amazon Web Services team members discussing the NoTearsHPC cluster solution for 1-click launches. Evan Bollig and Sean Smith talk about how the cluster works, what it provides, and how to do complicated tasks quickly. They used Spack for installation.
New Computing Website Tags Content as Open Source
March 19, 2021 (story) (this-website)
LLNL’s computing website recently underwent a major overhaul to its design and information architecture. The site now features a taxonomy of Focus Areas that connect related content. These topics are tagged on News, People Highlights, and Projects. One of the topics is open source software. The site’s Livermore Computing page also directs users to this website for more information about open source projects.
Videos from Wild West Hackin' Fest
February 24, 2021 (multimedia)
LLNL computer engineer Ian Lee presented at Wild West Hackin’ Fest (WWHF) 2020, and both of his talks are now available on YouTube:
Intro to Git for Security Professionals (2:01:58). This workshop provides an overview and introduction to the version control system Git for security professionals who may have no background in software development and who would like to start using their favorite open source tool.
Releasing Your First (Python) Open Source Project to the Masses! (2:08:02). This video picks up from having just learned how to start using Git, and works through how to take that knowledge to start your own first open source project. This Hackin’ Cast is appropriate for attendees of all levels, and no prior knowledge (other than very basic command line and git usage) is expected.
WWHF offers high-quality information security education to beginners and seasoned professionals alike. A stated goal is to lower the barrier to entry for those seeking to enter into the world of information security.
NeurIPS Features LLNL Papers and Software
December 07, 2020 (event)
The 34th Conference on Neural Information Processing Systems (NeurIPS) features two LLNL papers advancing the reliability of deep learning for the Lab’s mission-critical applications. The most prestigious machine learning conference in the world, NeurIPS began virtually on December 6. The first paper describes a framework for understanding the effect of properties of training data on the generalization gap of machine learning (ML) algorithms, meaning the difference between a model’s observed performance during training and its “ground-truth” performance in the real world. The second NeurIPS paper introduces an automatic framework to obtain robustness guarantees of any deep neural network structure using the open source Linear Relaxation-based Perturbation Analysis (LiRPA) repo. Developed with colleagues at Northeastern University, China’s Tsinghua University, and UCLA, LiRPA algorithms can provide guaranteed upper and lower bounds for a neural network function with perturbed inputs.
New Templates for Community Health Files
November 24, 2020 (this-website)
Our .github repo houses file templates and other content that can be used by LLNL open source projects. The goal is to help standardize the presentation and organization of certain types of content across the LLNL organization. New this month are community health files that developers can copy and/or modify as needed to ensure their repos adhere to certain guidelines regarding licenses and other aspects of releasing and maintaining open source software. New files include Contributing Guidelines, a Notice, a Code of Conduct, and templates for opening issues and submitting pull requests. More information is available on the .github README.
LLNL's First Computing Virtual Expo
November 11, 2020 (event-report) (multimedia)
The LLNL Computing Virtual Expo was an end-to-end digital experience with interactive booths, networking opportunities, and on-demand presentations, held on September 30. Lab employees and the public were invited to learn about new initiatives while networking and engaging with the Computing community, including computer scientists, IT experts, HPC contacts, and software developers and engineers.
In addition to an exhibit booth about open source projects in general, the Expo featured booths for the RADIUSS project and the Maestro Workflow Conductor. Ian Lee presented a lightning talk about the Lab’s open source community and policies, and Rob Neely gave a talk on RADIUSS. Materials:
Releasing Open Source Software at the Lab (poster)
Podcast: Power Up Your Java Using Python With JPype
October 26, 2020 (multimedia)
Python and Java are two of the most popular programming languages in the world, and have both been around for over 20 years. In that time there have been numerous attempts to provide interoperability between them, with varying methods and levels of success. One such project is JPype, which allows you to use Java classes in your Python code. In this Python podcast episode, lead developer Karl Nelson from LLNL explains why he chose it as his preferred tool for combining these ecosystems, how he and his team are using it, and when and how you might want to use it for your own projects. He also discusses the work he has done to enable use of JPype on Android, and what is in store for the future of the project. The episode runs 48:39.
Video: Build All the Things with Spack
October 20, 2020 (multimedia)
LLNL computer scientist Todd Gamblin presented a brief overview of Spack at CppCon. CppCon is an annual, week-long gathering for the C++ community. This accompanying video runs 6:53.
Exascale computing will transform the ability to tackle some of the world’s most important challenges. The Exascale Computing Project (ECP) celebrates this new era of scientific discovery with Exascale Day on October 18, or “10^18” to represent the exascale threshold of floating-point operations per second. This virtual event will provide videos, audio discussions, and articles that will educate participants about impact areas of exascale computing from the Department of Energy national laboratories, HPC manufacturers, and leading universities and industrial organizations. LLNL will be participating, and much of the ECP’s software stack is open source.
The FLOSS for Science podcast showcases open source software uses in science. Episode 30 covered the philosophy of Spack, its package management capabilities in HPC clusters, supported operating systems, and much more. The episode runs 52:26.
News Filters Added to Software Portal
September 02, 2020 (this-website)
This site’s News and Archive pages have been updated with filters for selecting news posts by category. These categories appear next to the date on each post. We have nearly five years’ worth of news, so this feature improves the navigation of different types of news. These filters were implemented by our 2020 summer intern.
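As a rough illustration of how category filtering of this kind can work, the sketch below parses the "date (category)" line that accompanies each post and keeps the posts tagged with a chosen category. It is not the site's actual implementation, which is not described here; the names `parse_meta` and `filter_posts` are hypothetical, and the assumed metadata format is inferred only from the posts on this page.

```python
import re
from datetime import datetime

# Matches a metadata line such as "July 20, 2020 (event-report) (multimedia)".
# The format is an assumption based on how posts appear on this page.
META_LINE = re.compile(
    r"^(?P<date>[A-Z][a-z]+ \d{2}, \d{4})(?P<cats>(?: \([a-z-]+\))+)$"
)


def parse_meta(line):
    """Return (date, categories) for a metadata line, or None if it does not match."""
    match = META_LINE.match(line.strip())
    if match is None:
        return None
    date = datetime.strptime(match.group("date"), "%B %d, %Y").date()
    categories = re.findall(r"\(([a-z-]+)\)", match.group("cats"))
    return date, categories


def filter_posts(posts, category):
    """Return the titles of (title, meta_line) pairs tagged with the given category."""
    selected = []
    for title, meta in posts:
        parsed = parse_meta(meta)
        if parsed is not None and category in parsed[1]:
            selected.append(title)
    return selected


# Titles and metadata lines copied from posts on this page.
posts = [
    ("News Filters Added to Software Portal", "September 02, 2020 (this-website)"),
    ("Video: Flux Framework Featured on Next Platform TV", "August 18, 2020 (multimedia)"),
    ("Spack Tutorial on AWS", "July 20, 2020 (event-report) (multimedia)"),
]

print(filter_posts(posts, "multimedia"))
```

Because a post can carry more than one category, the sketch checks membership in the parsed list rather than comparing against a single value.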
New Visualizations of Popular Repositories
August 31, 2020 (this-website)
The Visualize section of this website has again expanded to include a new page that breaks down the popularity (i.e., stars) of LLNL repositories in a few ways: repos with the highest number of stars, creation history of those repos, increase of stars over time, commit activity of popular repos, and licenses of those repos. This new page, created by our 2020 summer intern, helps us better understand repos that have made a big impact in the open source community.
Video: Flux Framework Featured on Next Platform TV
August 18, 2020 (multimedia)
LLNL computer scientist Stephen Herbein discusses the open-source Flux Framework HPC software on this video episode of Next Platform TV. His segment begins at 27:34.
New Dependencies Page on Software Portal
July 28, 2020 (this-website)
The Visualize section of this website has grown to include a new page that visualizes our software catalog’s dependencies. LLNL software repos are shown in the context of repositories with dependencies, External Packages, and internal packages. You can move the slider to change the connections between repos, organizations, and dependencies as well as click on a circle to isolate its specific connections in an expansion panel on the right side of the page. This work, which enables us to learn more about our repos and how they are related, was done by our 2020 summer intern.
Spack Tutorial on AWS
July 20, 2020 (event-report) (multimedia)
Amazon Web Services hosted a free two-day Spack tutorial broadly targeted at HPC users, developers, and user support teams. Each day consisted of two 1.5-hour sessions with a 30-minute break in the middle. The first day covered Spack basics, while the second day drilled down on advanced features. Videos from day 1 (3:19:18) and day 2 (3:30:18) are available.
LLNL's Summer Hackathon Will Be Virtual
July 18, 2020 (event)
Held since 2012, LLNL’s hackathons are 24-hour opportunities to brainstorm, foster creativity, prototype, and explore. Participants work in groups or individually and often strive to learn new skills, programming languages, and tools in service to LLNL’s missions. Like the spring hackathon earlier this year, the summer event (August 6-7) will be held virtually using WebEx and Mattermost for collaboration. LLNL sponsors are Livermore Computing and the Center for Applied Scientific Computing.
Webinar: What’s New in Spack?
July 15, 2020 (event-report) (multimedia)
The IDEAS Productivity project, in partnership with the DOE Computing Facilities of the ALCF, OLCF, and NERSC and the DOE Exascale Computing Project, hosts a webinar series on Best Practices for HPC Software Developers. A webinar titled “What’s New in Spack?” was presented by LLNL’s Todd Gamblin on July 15. Slides and a video (1:26:33) from the session are available.
New Consolidated FAQ on Software Portal
July 08, 2020 (this-website)
Much of the content under the About section of this website has been consolidated into an easy-to-navigate FAQ page. The FAQ explains how to get started on GitHub, become part of the LLNL organization, manage repositories, and much more. We encourage readers to provide feedback or new questions by contacting the LLNL GitHub admins or submitting a pull request.
New Data Visualizations on Software Portal
July 07, 2020 (this-website)
The Visualize section of this website is benefitting from new development by our summer intern. Data we collect from GitHub is visualized in various ways, with additional visualizations planned. These efforts help us understand our repos’ activity, how they are being used, development trends, and more. Check out the new “Repo Licenses” viz and stay tuned for more!
Video: MFEM: Advanced Simulation Algorithms for HPC Applications
June 24, 2020 (multimedia)
MFEM is an open-source software library that provides advanced mathematical algorithms for use by scientific applications. By relying on MFEM, application scientists can quickly develop highly accurate physics simulation codes on a variety of platforms—from laptops to the world’s largest supercomputers. MFEM version 4.0 incorporates the most advanced techniques from the scientific computing research community, and its methods are widely applicable, highly impactful, and easy to use. A new video (7:07) features members of the LLNL development team, who describe how the software library works.
Podcast: The MFEM Finite Element Library Broadens GPU Support
June 08, 2020 (multimedia) (release)
MFEM is a lightweight, general, scalable C++ library for finite element methods. Version 4.1 was released in March. LLNL computational mathematician and MFEM PI Tzanio Kolev joined the Let’s Talk Exascale podcast to talk about the release and MFEM’s expanded GPU support. The podcast episode runs 6:28.
Held since 2012, LLNL’s hackathons are 24-hour opportunities to brainstorm, foster creativity, prototype, and explore. Participants work in groups or individually and often strive to learn new skills, programming languages, and tools in service to LLNL’s missions. This year’s spring hackathon (April 30 through May 1) will be held virtually. In true hackathon spirit, several tech solutions will enable participants to collaborate remotely. Charalynn Macedo, division leader for LLNL’s Enterprise Applications Services, will kick off the event with a brief keynote presentation.
Software Engineering 101: I have some code! Now what?
November 12, 2019 (event-report) (story) (this-website)
As part of LLNL’s Computing 101 speaker series, Ian Lee gave a talk to employees on November 12 titled “Software Engineering 101: I have some code! Now what?” The presentation reviewed the Lab’s resources for supporting software engineering and open source development.
Lee, who manages this website and leads many initiatives in the Lab’s open source community, aimed his remarks at relative newcomers to the software development landscape. He also updated the audience on the state of open source development at the Lab.
The Lab provides a wide range of support and solutions for just about any task a developer does: programming languages, package managers, computing platforms, code editors, version control systems, project communication, project tracking, documentation, and much more. Lee provided an overview of these options, offered advice about how to navigate the Lab’s software resources, and encouraged developers to take advantage of colleagues’ knowledge and experience.
Lee summarized the Lab’s recent open source activity, which echoes a trend toward developing “out in the open,” i.e., not waiting for code to mature before releasing it for community feedback and contributions. (As this website shows, the Lab and affiliated GitHub organizations have almost 600 repos.) Accordingly, the Lab has updated its open source release policies to support modern code development practices.
Lee also demoed this website’s category-driven design changes, LLNL’s open source logo (and stickers), the @LLNL_OpenSource Twitter account, and Slack channels. He noted that LLNL may have a booth at PyCon 2020, which will be held April 15-23 in Pittsburgh. (Conferences such as PyCon provide LLNL’s open source software community with opportunities for networking, collaboration, and technical skills development. Lab employees interested in attending similar events may contact Ian Lee for funding.)
Software Portal Redesign and GitHub Integration
July 30, 2019 (this-website)
Recently this website received several changes that improve the user’s experience, keep the content fresh, and help the admin team monitor and track all repositories under the LLNL organization on GitHub. We are excited to improve user access to LLNL’s 500+ open source repositories and appreciate the help of our summer intern, Angela Flores, who is pursuing a B.S. in computer science with a minor in cybersecurity from Cal State Long Beach.
*LLNL’s RADIUSS project—Rapid Application Development via an Institutional Universal Software Stack—aims to broaden usage across LLNL and the open source community of a set of libraries and tools used for HPC scientific application development.
Inaugural NAHOMCon19 Coming to San Diego
February 14, 2019 (event)
To all computational scientists, mathematicians, scientists, and engineers interested in high-order methods and PDEs: Several institutions have joined together to organize the inaugural North American High Order Methods Conference (NAHOMCon19). The conference will be held in San Diego in the summer of 2019 and will focus on the many developments in high-order discretizations and applications taking place in North America.
The DOE co-design Center for Efficient Exascale Discretizations (CEED) is pleased to participate in the conference. CEED is a partnership between two U.S. DOE laboratories (Livermore & Argonne) and five universities in support of the Exascale Computing Project.
Earth System Grid Federation's Annual Conference Coming Up
November 03, 2018 (event)
The LLNL-led international Earth System Grid Federation (ESGF) will meet December 3-7 in Washington, DC, to plan the future of Earth system data analysis and more. Registration info is available on the ESGF website along with the conference agenda. Fork this 2017 R&D 100 winner on GitHub.
Flux and Spack Events Coming to Supercomputing '18
October 27, 2018 (event)
LLNL staff are heading to Dallas, Texas, for the 30th annual Supercomputing Conference (SC18) on November 11–16. LLNL is leading 6 tutorials and 16 workshops with topics ranging from data analytics and data compression to performance analysis and productivity. LLNL-developed open-source tools Flux and Spack are subjects of a workshop and a tutorial, respectively. We hope to see you there!
Read more about our past experiences and tips for first-timers. A complete list of LLNL-led sessions can be found on the LLNL Computing website (links unpublished in 2020). All times are listed in Central Standard Time.
Open-Source Developer Greg Becker Scales Projects and Mountains
October 26, 2018 (profile)
Is there a connection between rock climbing and software development? In this profile, LLNL’s Greg Becker describes his career path, motivation for improving HPC tools, and recent work with open-source projects like SCR, Caliper, and Spack.