LLNL’s Vanessa Sochat and collaborators from the Wellcome Sanger Institute, the Pawsey Supercomputing Research Institute, and the University of Texas at Dallas have written a paper about Singularity Registry HPC (“shpc”). Sochat breaks down the team’s methods in a Twitter thread. You can also download the preprint PDF. The abstract follows:
Linux container technologies such as Docker and Singularity offer encapsulated environments for easy execution of software. In high performance computing, this is especially important for evolving and complex software stacks with conflicting dependencies that must co-exist. Singularity Registry HPC (“shpc”) was created as an effort to install containers in this environment as modules, seamlessly allowing for typically hidden executables inside containers to be presented to the user as commands, and as such significantly simplifying the user experience. A remaining challenge, however, is deriving the list of important executables in the container. In this work, we present new automation and methods that allow for not only discovering new containers in large community sets, but also deriving container entries with important executables. With this work we have added over 8,000 containers from the BioContainers community that can be maintained and updated by the software automation over time. All software is publicly available on the GitHub platform, and can be beneficial to container registries and infrastructure providers for automatically generating container modules to lower the usage entry barrier and improve user experience.
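To make the abstract's two ideas concrete, here is a minimal Python sketch of (1) picking candidate executables out of a container and (2) exposing them to the user as ordinary commands. It is an illustration under stated assumptions, not the method from the paper: the baseline-difference heuristic, the function names `derive_candidates` and `alias_lines`, the image name `samtools.sif`, and the example paths are all hypothetical, and plain shell aliases stand in for the module files that shpc installs. The only external command assumed is `singularity exec`.

```python
import shlex


def derive_candidates(found, baseline):
    """Return executables that are not part of a baseline set.

    found:    dict mapping command name -> path inside the container
    baseline: set of command names expected in any base image (assumed input)
    This is a simple heuristic for illustration, not the paper's approach.
    """
    return {name: path for name, path in sorted(found.items()) if name not in baseline}


def alias_lines(image, executables):
    """Build one shell alias per executable, each wrapping `singularity exec`."""
    lines = []
    for name, path in executables.items():
        command = f"singularity exec {shlex.quote(image)} {shlex.quote(path)}"
        lines.append(f"alias {name}={shlex.quote(command)}")
    return lines


# Hypothetical listing of executables found on a container's PATH.
found = {
    "ls": "/bin/ls",
    "bash": "/bin/bash",
    "samtools": "/usr/local/bin/samtools",
}
baseline = {"ls", "bash"}

candidates = derive_candidates(found, baseline)
aliases = alias_lines("samtools.sif", candidates)
```

With these inputs, only `samtools` survives the filter, so a user who sources the resulting alias could type `samtools` without knowing that the command runs inside a container.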