hypre

CFD/hypre

Author	SHA1	Message	Date
Victor A. Paludetto Magri	9415d6aa08	FSAI implementation on CPUs (#610) This PR adds a factorized sparse approximate inverse (FSAI) implementation to hypre, which can be used as a standalone solver, as a preconditioner for Krylov methods, or as a complex smoother in BoomerAMG. In particular, we consider the adaptive version of the algorithm, where the sparsity pattern of the lower triangular factor G is built dynamically, i.e., during an iterative procedure that tries to find the best nonzero positions for a given row of G. This implementation is built on top of the IJ interface. It uses the diagonal portion of A for constructing G, i.e., it is a block-Jacobi method in the MPI sense. List of additional changes: * Add caliper instrumentation to FSAI. * Add ZeroGuess option to FSAI. * Performance optimizations. * Add OpenMP support to FSAI. * Make internal BLAS/LAPACK functions thread-safe. * Update CMake build. * Add new test cases: beam_tet_dof459_np1, beam_hex_dof459_np2, and beam_tet_dof2475_np4. * Add documentation for FSAI. Co-authored-by: Heather Switzer <switzer4@lassen36.coral.llnl.gov> Co-authored-by: heatherms27 <hmswitzer@email.wm.edu> Co-authored-by: Sarah Osborn <30503782+osborn9@users.noreply.github.com>	2022-04-05 11:18:39 -07:00
Victor A. Paludetto Magri	6fd043c9c2	(S)Struct IO on GPUs (#599) This PR extends the (semi)-struct matrix/vector IO functions added in #583 with GPU support. Additionally: * Fix regression tests on Lassen. * Read data values into host memory. * Update Umatrix read algorithm when the ParCSRMatrix is expected to live on the device. * Reset deallocated pointers at hypre_IJMatrixDestroyParCSR to NULL. * Clone rownnz info if present on a CSRMatrix. * Reduce memory transfer and remove unused variables. * Fix bug with -print option. * Build rownnz info also when the ParCSRMatrix is in assembled state. * Remove a few instances of "return ierr". * Refactor (s)struct IO: code works with CUDA and without UM. * Add executables to gitignore.	2022-03-13 20:14:23 -07:00
Ruipeng Li	3bc7d267ef	Gpu default (#336) This PR changes AMG defaults regarding GPUs in various places, adds regression tests on GPUs, and simplifies the CUDA boxloop implementations. Co-authored-by: Sarah Virginia Osborn <osborn9@llnl.gov> Co-authored-by: PaulMullowney <pmullown@nrel.gov> Co-authored-by: Daniel Osei-Kuffuor <oseikuffuor1@llnl.gov> Co-authored-by: Ruipeng Li <li50@euler.llnl.gov> Co-authored-by: Ruipeng Li <coe0141@redwood.cm.cluster>	2021-05-24 17:16:35 -07:00
Ramesh Pankajakshan	414fa671be	Umpire (#243) This PR adds support for Umpire pooling allocators for host and GPU memory. When hypre is configured with --with-umpire, device and UVM allocations and deallocations are done with Umpire, whereas the host pool is not enabled by default. This PR also includes some other minor changes: * Add .gitignore to the repo. * Remove all malloc/calloc/realloc/free calls and add regression testing that finds them. * No longer compile ij.c with a C++ compiler; it is C code again. * Introduce HYPRE_USING_GPU, which is equivalent to HYPRE_USING_CUDA || HYPRE_USING_DEVICE_OPENMP. * Add a few user-level interfaces: HYPRE_SetMemoryLocation, HYPRE_SetExecutionPolicy, HYPRE_SetGPUMemoryPoolSize, and HYPRE_CSRMatrixSetSpGemmUseCusparse. Co-authored-by: li50@llnl.gov <liruipengblue@gmail.com> Co-authored-by: Rob Falgout <rfalgout@llnl.gov> Co-authored-by: Ruipeng Li <li50@llnl.gov>	2021-02-03 12:31:25 -08:00

4 Commits