hypre_ParCSRMatrixPrintIJ now works for matrices living on the device without requiring UVM support. An explicit copy to host memory is performed in this function before the files are printed.
This PR fixes compilation warnings obtained with gcc-11, clang-12, and clang-14. A list of the warnings is given below:
* -Wundef
* -Wunused
* -Wdouble-promotion
* -Wsometimes-uninitialized
* -Wunused-variable
* -Wunused-but-set-variable
Add -auxfromfile option for reading an auxiliary matrix from file, which is then used to build the preconditioner. This is useful, for example, when a filtered version of A is used to build the preconditioner.
Further unification of GPU implementation across cuda/hip/sycl.
Implements the parallel matrix-matrix product in sycl.
HYPRE_CUDA_LAUNCH and HYPRE_SYCL_LAUNCH macros have
been unified under HYPRE_GPU_LAUNCH for kernel launches.
Replace HYPRE_SetSpGemmUseCusparse with HYPRE_SetSpGemmUseVendor.
* Added new capabilities to allow multilevel assignment of solver options
* New (local) block Jacobi option for smoothers and intergrid operators
* Added capabilities to do CPR in MGR
* Updated non-Galerkin strategy for constructing the coarse grid.
Co-authored-by: Quan Bui <mquan.bui@gmail.com>
This PR fixes a segmentation fault in HYPRE_SStructGraphDestroy. The error occurred when the number of graph entries added to the SStructGraph via HYPRE_SStructGraphAddEntries was larger than 1000.
This PR fixes compilation of the "complex" build variant of hypre. It also adds hypre_csqrt for computing the square root of a HYPRE_Complex number. This function/macro works whether enable-complex is turned on or off.
This PR adds a factorized sparse approximate inverse (FSAI) implementation to hypre, which can be used as a standalone solver, a preconditioner to Krylov methods, or a complex smoother to BoomerAMG. In particular, we consider the adaptive version of the algorithm, where the sparsity pattern of the lower triangular factor G is built dynamically, i.e., during an iterative procedure that tries to find the best nonzero positions for a given row of G. This implementation was built on top of the IJ interface. It uses the diagonal portion of A for constructing G, i.e., it is a block-Jacobi method in the MPI sense. List of additional changes:
* Add caliper instrumentation to FSAI.
* Add ZeroGuess option to FSAI.
* Performance optimizations.
* Add OpenMP support to FSAI.
* Make internal BLAS/LAPACK functions thread-safe.
* Update CMake build.
* Add new test cases: beam_tet_dof459_np1, beam_hex_dof459_np2, and beam_tet_dof2475_np4.
* Add documentation for FSAI.
Co-authored-by: Heather Switzer <switzer4@lassen36.coral.llnl.gov>
Co-authored-by: heatherms27 <hmswitzer@email.wm.edu>
Co-authored-by: Sarah Osborn <30503782+osborn9@users.noreply.github.com>
* fixed MM-multipass interpolation for case of no C-points
* fixed the issue of isolated groups of fine points and added a regression test.
* corresponding changes to the device code
Co-authored-by: Ruipeng Li <li50@llnl.gov>
This PR extends the (semi)-struct matrix/vector IO functions added in #583 with GPU support. Additionally:
* Fix regression tests on Lassen.
* Read data values into host memory
* Update Umatrix read algorithm when the ParCSRMatrix is expected to live on the device
* Reset deallocated pointers at hypre_IJMatrixDestroyParCSR to NULL
* Clone rownnz info if present on a CSRMatrix
* Reduce memory transfer and remove unused variables
* Fix bug with -print option
* Build rownnz info also when the ParCSRMatrix is in assembled state
* Remove a few instances of "return ierr"
* Refactor (s)struct IO - code works with cuda and without UM
* Add executables to gitignore
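The "reset deallocated pointers to NULL" item in the list above follows a common C idiom, sketched below. The type demo_Matrix and the helpers DEMO_FREE, demo_MatrixCreate, and demo_MatrixDestroy are hypothetical and are not hypre's actual code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container with two owned arrays. */
typedef struct {
    double *data;
    int    *rownnz;
    int     size;
} demo_Matrix;

/* Free a pointer and reset it, so stale addresses are never reused. */
#define DEMO_FREE(ptr) do { free(ptr); (ptr) = NULL; } while (0)

int demo_MatrixCreate(demo_Matrix *m, int size)
{
    assert(m != NULL && size > 0);
    m->data   = calloc((size_t) size, sizeof(double));
    m->rownnz = calloc((size_t) size, sizeof(int));
    m->size   = size;
    if (!m->data || !m->rownnz) {
        /* Allocation failed: release whatever was obtained. */
        DEMO_FREE(m->data);
        DEMO_FREE(m->rownnz);
        m->size = 0;
        return 1;
    }
    return 0;
}

void demo_MatrixDestroy(demo_Matrix *m)
{
    if (!m) { return; }
    DEMO_FREE(m->data);    /* free(NULL) is a no-op, so a second */
    DEMO_FREE(m->rownnz);  /* destroy call is harmless           */
    m->size = 0;
}
```

Because the members are NULL after destruction, calling demo_MatrixDestroy twice does not trigger a double free.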
This PR adds support for native print/read functions of SStructMatrix and SStructVector. Other important changes are:
* Add public functions for reading StructMatrix and StructVector.
* Add a new set of regression tests called "io" to the TEST_sstruct folder.
This PR fixes issue #556.
AC_CHECK_FILE was being used to test for the existence of the .git folder. However, according to the Autoconf manual, it does not work when cross-compiling. This PR implements another strategy for locating the .git folder that also works when cross-compiling.
Enable GPU setup for MGR solver.
* Added device specific functionality for interpolation
* Made device and host calls to interpolation consistent
* Edited IJ driver to use GPU capable options for MGR
* Updated saved files for new GPU options
* Updated CMakeLists to support new MGR capabilities
Co-authored-by: Ruipeng Li <li50@llnl.gov>
Co-authored-by: Daniel Osei-Kuffuor <oseikuffuor1@llnl.gov>
This adds matvec, matrix transpose, and vector operations (axpy, inner product, etc.)
with sycl backend (via oneMKL and oneDPL) for running on Intel GPUs. Thus, the AMG
solve phase can now execute entirely on Intel GPUs.
This PR adds Fortran interfaces for hypre_MGR and hypre_ILU. Additionally:
* Add ArrayArray types in `fortran.h`
* Add MGR and ILU options to the fortran interfaces for Krylov solvers
Usually consumers of HYPRE call find_package(HYPRE) and depend on the
HYPRE::HYPRE target. But they may also want to use HYPRE via
add_subdirectory (for instance via a git submodule) or FetchContent,
in which case they have to depend on the HYPRE target.
This alias makes usage more consistent: all users can then
depend on HYPRE::HYPRE. See for instance
https://cmake.org/pipermail/cmake/2018-November/068629.html