Commit Graph

4 Commits

Author SHA1 Message Date
Victor A. Paludetto Magri
9415d6aa08
FSAI implementation on CPUs (#610)
This PR adds a factorized sparse approximate inverse (FSAI) implementation to hypre, which can be used as a standalone solver, a preconditioner for Krylov methods, or a complex smoother for BoomerAMG. In particular, we consider the adaptive version of the algorithm, where the sparsity pattern of the lower triangular factor G is built dynamically, i.e., during an iterative procedure that tries to find the best nonzero positions for a given row of G. The implementation is built on top of the IJ interface. It uses the diagonal portion of A for constructing G, i.e., it is a block-Jacobi method in the MPI sense. List of additional changes:

* Add caliper instrumentation to FSAI.
* Add ZeroGuess option to FSAI.
* Performance optimizations.
* Add OpenMP support to FSAI.
* Make internal BLAS/LAPACK functions thread-safe. 
* Update CMake build.
* Add new test cases: beam_tet_dof459_np1, beam_hex_dof459_np2, and beam_tet_dof2475_np4.
* Add documentation for FSAI.

Co-authored-by: Heather Switzer <switzer4@lassen36.coral.llnl.gov>
Co-authored-by: heatherms27 <hmswitzer@email.wm.edu>
Co-authored-by: Sarah Osborn <30503782+osborn9@users.noreply.github.com>
2022-04-05 11:18:39 -07:00
Victor A. Paludetto Magri
6fd043c9c2
(S)Struct IO on GPUs (#599)
This PR extends the (semi)-struct matrix/vector IO functions added in #583 with GPU support. Additionally:

* Fix regression tests on Lassen.
* Read data values into host memory
* Update Umatrix read algorithm when the ParCSRMatrix is expected to live on the device
* Reset deallocated pointers at hypre_IJMatrixDestroyParCSR to NULL
* Clone rownnz info if present on a CSRMatrix
* Reduce memory transfer and remove unused variables
* Fix bug with -print option
* Build rownnz info also when the ParCSRMatrix is in assembled state
* Remove a few instances of "return ierr"
* Refactor (s)struct IO - code works with cuda and without UM
* Add executables to gitignore
2022-03-13 20:14:23 -07:00
Ruipeng Li
3bc7d267ef
Gpu default (#336)
This PR changes AMG defaults regarding GPUs in various places, adds regression tests on GPUs, and simplifies the CUDA boxloop implementations.

Co-authored-by: Sarah Virginia Osborn <osborn9@llnl.gov>
Co-authored-by: PaulMullowney <pmullown@nrel.gov>
Co-authored-by: Daniel Osei-Kuffuor <oseikuffuor1@llnl.gov>
Co-authored-by: Ruipeng Li <li50@euler.llnl.gov>
Co-authored-by: Ruipeng Li <coe0141@redwood.cm.cluster>
2021-05-24 17:16:35 -07:00
Ramesh Pankajakshan
414fa671be
Umpire (#243)
This PR adds support for Umpire pooling allocators for host and GPU memory. When hypre is configured with --with-umpire, device and UVM allocations and deallocations are done with Umpire, whereas the host pool is not enabled by default. This PR also includes some other minor changes:

* Add .gitignore to the repo.
* Remove all malloc/calloc/realloc/free calls and add a regression test that looks for them.
* No longer compile ij.c with a C++ compiler; it is plain C code again.
* Introduce HYPRE_USING_GPU, which is equivalent to HYPRE_USING_CUDA || HYPRE_USING_DEVICE_OPENMP.
* Add a few user-level interfaces: HYPRE_SetMemoryLocation, HYPRE_SetExecutionPolicy, HYPRE_SetGPUMemoryPoolSize, and HYPRE_CSRMatrixSetSpGemmUseCusparse.
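
The HYPRE_USING_GPU equivalence mentioned in the list above can be pictured with the following preprocessor sketch. It only illustrates the stated relation; where and how hypre actually defines the macro may differ, and the helper `sketch_built_with_gpu` is hypothetical.

```c
/* Sketch only: derive one umbrella macro from the two backend macros
 * named in this commit message. hypre's real definition lives in its
 * own headers and may be written differently. */
#if defined(HYPRE_USING_CUDA) || defined(HYPRE_USING_DEVICE_OPENMP)
#define HYPRE_USING_GPU 1
#endif

/* Hypothetical helper (not part of hypre): returns 1 if a GPU backend
 * macro was defined at compile time, and 0 otherwise. */
static int sketch_built_with_gpu(void)
{
#if defined(HYPRE_USING_GPU)
   return 1;
#else
   return 0;
#endif
}
```

Code that must behave the same for both GPU backends can then test the single umbrella macro instead of repeating the two-way condition.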

Co-authored-by: li50@llnl.gov <liruipengblue@gmail.com>
Co-authored-by: Rob Falgout <rfalgout@llnl.gov>
Co-authored-by: Ruipeng Li <li50@llnl.gov>
2021-02-03 12:31:25 -08:00