JustinPrivitera opened issue visit-dav/visit#20411.
Update AMR test baselines
Update baselines once #20385 is merged…
spackbot pushed to spack/spack
[@spackbot] updating style on behalf of tgamblin
tgamblin opened a pull request to spack/spack
psakievich opened issue spack/spack#50699.
Issues with sparse checkout
### Steps to reproduce…
upsj commented on issue spack/spack#50698.
Interestingly enough, this also happens when pushing to a source cache…
spackbot-app[bot] commented on issue spack/spack#50695.
…
Address some implicit conversion warnings
Co-authored-by: John Camier <camierjs@gmail.com>
adamqc opened issue mfem/mfem#4874.
LOR Solver: order 1 uses more memory than order 2 with the same number of DoFs
I encountered a strange issue when using the LOR solver with HypreAMS. It seems to use more memory for order 1 than for order 2 with the same number of DoFs, while the memory usage for order 3 is similar to that of order 2. This is unexpected, as one would typically assume that a lower order would require less memory. How can this behavior be explained?…
github-actions[bot] closed issue mfem/mfem#4017.
Overlap halo exchange with process interior computations
Hi everyone, …
victorapm commented on issue hypre-space/hypre#1289.
@oseikuffuor1 @rfalgout I updated the Elm function names to Pointwise. Please let me know if this looks good…
jameshcorbett opened issue flux-framework/flux-core#6842.
Tolerate some node-specific failures in job prolog
In a meeting today, several members of the team discussed how Flux could tolerate individual nodes failing some sort of prolog check while still allowing the job to proceed, ideally without those nodes. This would be useful on systems whose prologs have a significant chance of failing, especially on systems with long queue times. Then if a user required 50 nodes, they could submit a request for 60 nodes and somehow indicate that they would be satisfied with 10 of those nodes failing a prolog…
codecov[bot] commented on issue flux-framework/flux-core#6669.
## Codecov Report…
balos1 commented on issue LLNL/sundials#706.
> > @gardner48 I believe this is ready to be merged. …
najlkin commented on issue GLVis/glvis#339.
JS seems to be working: https://github.com/GLVis/glvis-js/actions/runs/15313952476 😉 Thanks for the cleanup 👍 I need to take a deeper look…
markcmiller86 pushed to visit-dav/visit
snapshot
brugger1 opened a pull request to visit-dav/visit
Sam-Briney commented on issue visit-dav/visit#20385.
@JustinPrivitera Take your time, thanks for taking a look….
homeomorfismo commented on issue mfem/mfem#4769.
From https://github.com/mfem/mfem/pull/4769#issuecomment-2822389336 @v-dobrev’s comments …
rfalgout pushed to hypre-space/hypre
Merge branch ‘recmat’ into recmat-pbnd
grondo opened a pull request to hpc/Spindle
mergify[bot] pushed to flux-framework/flux-core
Merge pull request #6841 from grondo/t2602-fix
testsuite: fix potential failure in t2602-job-shell.t
chu11 reviewed a flux-framework/flux-core pull request
seems like a good fix to me. do you want to put a comment? So cut & paste in the future will know why it’s done this way. …
grondo commented on issue flux-framework/flux-core#6835.
Ok, a patch for that test issue will be merged as soon as tests pass (#6841)….
vicentebolea opened a pull request to LLNL/zfp
maggul pushed to LLNL/sundials
arnoldi to domeig name change
sundials-users-github-account commented on issue LLNL/sundials#711.
Hi Dan,…
ibaned commented on issue LLNL/sundials#711.
Yes, there are other ODEs for those two equations as well (algebraic for voltage veq and differential for current ieq). I’ve put diagnostics into my residual callback and notice that the residual for the energy equation eeq is quite high (10000 compared to 1.0e-13 for others) when the solver fails…
lucpeterson opened issue LLNL/merlin#508.
Remove annoying .slurm.out
https://github.com/LLNL/merlin/blob/0eccc86aa7bb08e414f7e72a66af96aeaf31a89d/merlin/study/script_adapter.py#L681…
JustinPrivitera pushed to LLNL/conduit
oops
rhornung67 reviewed a LLNL/RAJA pull request
Thank you @rchen20 …
tzanio commented on issue GLVis/glvis#339.
To clarify: “DOF” here means the numbering of the Mesh nodes (as a high-order grid function), right?…
siramok opened a pull request to Alpine-DAV/ascent
sambo57u commented on issue visit-dav/visit#20408.
Yes, I had a long closed thread where Biagas helped me build VisIt on Fedora 41 with GL support. This continues that work on Fedora 42….
oseikuffuor1 pushed to hypre-space/hypre
Tux regression fixes (#1165)
Tux regression fixes and astyle
jwake commented on issue LLNL/zfp#268.
For what it’s worth, I managed to get the parallel decoder working fairly well on CUDA (~70GB/sec on an A100 with buffers in device memory, though I’ll note I’m only testing 3D at the moment) between this change and manually bodging the stream_rseek call at the end of zfp_internal_cuda_decompress to reset the pointer offset without trying to read from the stream (in my case, the stream was in device memory, so rseek internally trying to read the buffer would fail; short of a separate device-aware bitstream implementation I’m not sure what the best approach here would be from an API standpoint), and then manually saving/restoring the appropriate index data externally after compression / before decompression. I did note that further compressing the index data with e.g. LZ4 or Deflate netted a further ~15% size improvement on the index data, but with appropriate granularity the index was already pretty small compared to the bulk of the compressed data.
…
cyrush opened issue LLNL/conduit#1437.
consider changing CONDUIT_EPSILON to std::numeric_limits
artv3 commented on issue LLNL/RAJA#1751.
Done, we added a helper function to simplify construction. …
JorgeG94 opened issue LLNL/Caliper#670.
question about potential conda package
hi, I am interested in using Caliper in some of my programs, specifically a Python-based one. All of the dependencies of this Python code are administered through conda. I realize Caliper uses Spack, but the user community of this code does not. Is there any interest in, or opposition to, creating a package that can be installed with conda? …
codecov[bot] commented on issue flux-framework/flux-sched#1376.
## Codecov Report…
mplegendre opened issue hpc/Spindle#83.
Handle interception of openat and related calls
Need to investigate whether we can safely intercept and handle openat (and the related *at family of I/O calls). We would need to do so without breaking the atomicity of the calls, which will be a challenge. But we’re also missing I/O redirection when applications use openat…
cyrush pushed to visit-dav/visit
Merge remote-tracking branch ‘origin/biagas/update_vtk_to_9.5’ into biagas/update_vtk_to_9.5
trws commented on issue flux-framework/flux-sched#1372.
Ok, that sounds like a failure misidentified as ENOMEM then… will have to look and see if we have some that might go rogue….
grondo commented on issue flux-framework/flux-sched#1372.
The sar data is still there, here’s a report on memory around that time:…
sam-maloney commented on issue flux-framework/flux-sched#1373.
On my Rocky9 VM with 8 cores and default configuration (just running flux start -s3), I get successful allocations for the following (with the jobspec resource dict in the first lines followed by the allocated R_lite and nodelist arrays in the second lines):…
thartland reviewed a LLNL/hiop pull request
Looks good to me….
mkre opened issue LLNL/hatchet#162.
Cannot process simple Caliper sampling file
I created a Caliper sampling profile with…
davidbeckingsale pushed to LLNL/Umpire
Small refactor
github-actions[bot] pushed to LLNL/Umpire
Apply style updates
ksaurabh-amd commented on issue LLNL/Umpire#966.
Hi,…
cmcrook5 pushed to GEOS-DEV/LvArray
Merge branch ‘develop’ into feature/crook5/math
zekemorton opened a pull request to flux-framework/flux-sched
evaleev opened issue LLNL/Umpire#965.
"Namespace" CMake target names to be able to build Umpire via CMake's FetchContent
When trying to build Umpire using CMake’s FetchContent module from within a large project we run into many target name clashes, e.g.…
devreal commented on issue LLNL/Umpire#822.
We’re running into issues with missing namespaces as well. We want to include Umpire through CMake FetchContent, but the clashes make this more complicated than it needs to be. @evaleev can add details if needed….
BagchiS6 opened issue flux-framework/flux-sched#1375.
Using flexible GPU counts with JobspecV1
Interested to know whether it’s possible to schedule jobs where all allocated MPI tasks will not use GPUs but a few of them might. I am specifically interested in flux-python commands to achieve this. Currently I see that flux.job.Jobspec.JobspecV1.from_command has only options such as gpu-per-task, which might not allow for heterogeneous jobs that use a mix of CPU+GPU tasks. …
wihobbs commented on issue flux-framework/flux-sched#1372.
Ah, thanks for the tip! The “around the same time” comment made me think you were grep’ping for timestamps or something. Now I see you were just commenting that fluxion reported the memory error around the same time as core reported an exception….
michaelmckinsey1 opened issue LLNL/hatchet#161.
from_caliper Examples Not Working for Caliper>=2.10
Node ordering breaks the examples shown in https://llnl-hatchet.readthedocs.io/en/latest/analysis_examples.html#read-in-a-caliper-cali-file, with the error No hierarchy column in input file. This is also documented in https://github.com/LLNL/hatchet/issues/123. We can recommend that users use the from_caliperreader reader (https://github.com/LLNL/hatchet/pull/160) instead, which does not have this issue….
slabasan commented on issue LLNL/hatchet#156.
Can you please rebase?…
jbschroder pushed to XBraid/xbraid
adding the user-defined MPI buffer alloc/free functions to the C++ interface
gzagaris commented on issue LLNL/Umpire#964.
> We don’t currently have a way to “destroy” an allocator. The release function will deallocate all of the memory iff it is not currently used (e.g. it will deallocate all the chunks that the pool has allocated provided they aren’t being used to satisfy a user allocation request)…
djagodzinski commented on issue LLNL/maestrowf#430.
Hi @jwhite242,…
jwhite242 commented on issue LLNL/maestrowf#430.
No need for email: I am that LLNL person….
dtaller commented on issue LLNL/maestrowf#430.
Hi @djagodzinski . No, unfortunately I have not pushed this. I ended up not doing a scaling study on the machine with the PBS scheduler. Perhaps in the future, but I don’t have plans to pursue this now. You might be better off emailing one of the project owners who works at LLNL to ask….
daboehme commented on issue LLNL/Caliper#636.
Hi @adayton1, I think it fixed a segfault that could happen when tool_init() was called before the map was initialized….
adayton1 commented on issue LLNL/Caliper#636.
What symptoms did this fix?…
cyrush opened issue LLNL/blt#720.
in the future, will CMAKE_CUDA_HOST_COMPILER continue to be required?
As we work to update CUDA support, see if we really need to require CMAKE_CUDA_HOST_COMPILER…
benson31 pushed to LBANN/lbann
A new take on the LBANN toolkit.
This line of development will focus on providing optimizations for machine learning applications in PyTorch with a particular focus on HPC systems. The initial capability introduces the beginnings of an LBANN “backend” to PyTorch, primarily implementing memory optimizations for the MI300A architecture of El Capitan.
tl;dr: initial commit.
bvanessen created a new branch, v1.x-master at LBANN/lbann
github-actions[bot] released v0.45.0.
View Release Notes for flux-sched v0.45.0
victorapm pushed to GEOS-DEV/LvArray
Add tuo-cce-19-rocm-6.4.0 host config
pauleonix opened issue LLNL/Caliper#668.
Build failure with CUDA 12 due to legacy NVTX dependency
> NVTX v2 has been removed from CUDA Toolkit after being previously deprecated. Migrate to NVTX v3 by changing your code from #include <nvtoolsext.h> to #include "nvtx3/nvtoolsext.h", which is included in the toolkit…
BradWhitlock opened a pull request to LLNL/conduit
tdrwenski commented on issue LLNL/blt#719.
BTW I don’t have permission to re-run the tioga jobs (or I can, but then I get Error encountered during job: you are currently unauthorized to use this runner)…