11619 Commits

Author SHA1 Message Date
Ruipeng Li
e270c561b0
Spgemm (#639)
This PR includes optimizations for hypre's SpGEMM and ParSpGEMM kernels

Co-authored-by: Wayne Mitchell <mitchell82@llnl.gov>
Co-authored-by: Paul T. Bauman <ptbauman@gmail.com>
Co-authored-by: Sarah Osborn <30503782+osborn9@users.noreply.github.com>
2022-06-24 10:42:16 -07:00
Victor A. Paludetto Magri
8268b9f1e1
hypre_ParCSRMatrixPrintIJ on device (#655)
hypre_ParCSRMatrixPrintIJ now works for matrices living on the device without the need for UVM support. An explicit copy to host memory is performed in this function prior to printing the files.
2022-06-22 20:49:57 -04:00
Victor A. Paludetto Magri
850fd47d07
Fix chebyshev smoother for singular problems (#657)
See PR's description for additional info
2022-06-22 20:47:09 -04:00
Ruipeng Li
b58585e0f0
add a func (#646)
This PR adds a function to perform local transposition of ParCSR.
2022-06-21 08:52:21 -07:00
Ulrike Yang
ac9d7d0d7b updated CHANGELOG 2022-06-14 12:02:25 -07:00
Rob Falgout
03e0150ee4 Change release number to 2.25.0 2022-06-13 17:17:54 -07:00
Rob Falgout
14cfc2db1e Update CHANGELOG for 2.25.0 release 2022-06-13 17:14:43 -07:00
Victor A. Paludetto Magri
07a8def6f8
Fix compilation warnings (#643)
This PR fixes compilation warnings obtained with gcc-11, clang-12, and clang-14. A list of the warnings is given below:

* -Wundef
* -Wunused
* -Wdouble-promotion
* -Wsometimes-uninitialized
* -Wunused-variable
* -Wunused-but-set-variable
2022-05-31 20:32:49 -04:00
Victor A. Paludetto Magri
edb91b4a50
Add -auxfromfile option to IJ driver (#633)
Add -auxfromfile option for reading an auxiliary matrix from file, which is then used to build the preconditioner. This is useful, for example, for the case when a filtered version of A is used to build the preconditioner.
2022-05-26 21:23:31 -04:00
Ruipeng Li
e766e36e76
Add header to remove header-transitivity issue (#636)
Add header for `thrust::remove_if`.
Co-authored-by: Paul T. Bauman <ptbauman@gmail.com>
2022-05-24 14:18:57 -07:00
Ruipeng Li
1c1bf95b10
fix with umpire 2022 (#625)
This PR fixes hypre with umpire 2022
2022-05-24 13:55:08 -07:00
Ruipeng Li
a8d423013b
missing -f in softlink commands (#594)
This PR adds the missing -f to the softlink commands in a few of the Makefiles.

Co-authored-by: Paul Mullowney <Paul.Mullowney@nrel.gov>
2022-05-24 13:39:33 -07:00
Paul T. Bauman
565bbe0511
Need to read start/end indices as HYPRE_BigInt (#605)
This fixes an issue with the ParCSRMatrixRead when compiled with mixed-int enabled.
2022-05-24 13:32:56 -07:00
Ruipeng Li
aa0446d720
ij help (#634)
Minor fix to `ij -help`
2022-05-24 13:30:23 -07:00
Ruipeng Li
ef3f890d4b
Nvcollab (#591)
This PR contains various GPU optimizations made in collaboration with the NVIDIA team.

Co-authored-by: Peng Wang <penwang@nvidia.com>
2022-05-24 13:27:32 -07:00
Ruipeng Li
bec8645cf9
script option for runtest.sh (#632)
This PR changes runtest.sh to run an executable with a "script" and allows valgrind and mpibind on all platforms.
2022-05-19 10:10:03 -07:00
Daniel Osei-Kuffuor
63208e3e34
Hotfix for issues with dsuperlu in regression test. (#631)
Commented out unnecessary memory deallocation check.
2022-05-15 21:42:28 -07:00
Wayne Mitchell
dfdd1cd12f
Sycl par matmat (#611)
Further unification of GPU implementation across cuda/hip/sycl.
Implements the parallel matrix matrix product in sycl.
HYPRE_CUDA_LAUNCH and HYPRE_SYCL_LAUNCH macros have 
been unified under HYPRE_GPU_LAUNCH for kernel launches.
Replace HYPRE_SetSpGemmUseCusparse with HYPRE_SetSpGemmUseVendor.
2022-05-09 15:24:44 -07:00
Daniel Osei-Kuffuor
00d1dfd3f7
Mgr block jacobi (#607)
* Added new capabilities to allow multilevel assignment of solver options
* New (local) block Jacobi option for smoothers and intergrid operators
* Added capabilities to do CPR in MGR
* Updated non-Galerkin strategy for constructing the coarse grid.

Co-authored-by: Quan Bui <mquan.bui@gmail.com>
2022-05-09 08:30:05 -07:00
Victor A. Paludetto Magri
8017ce459b
Fix segfault on HYPRE_SStructGraphDestroy (#617)
This PR fixes a segmentation fault on HYPRE_SStructGraphDestroy. The error occurred when the number of graph entries added to the SStructGraph via HYPRE_SStructGraphAddEntries was larger than 1000.
2022-04-06 21:12:41 -07:00
Victor A. Paludetto Magri
70d055a994
Fix complex build (#616)
This PR fixes compilation of the "complex" build variant of hypre. It also adds hypre_csqrt for computing the square root of a HYPRE_Complex number. This function/macro works whether enable-complex is turned on or off.
2022-04-06 15:02:04 -07:00
Rob Falgout
4c5529810a Updating one missed copyright date in user manual 2022-04-05 16:40:02 -07:00
Victor A. Paludetto Magri
e16167fe46
Fix copyright (#615)
This PR updates Copyright headers from "Copyright 1998-2019 ..." to "Copyright (c) 1998 ..."
2022-04-05 16:19:51 -07:00
Victor A. Paludetto Magri
9415d6aa08
FSAI implementation on CPUs (#610)
This PR adds a factorized sparse approximate inverse (FSAI) implementation to hypre, which can be used as a standalone solver, a preconditioner for Krylov methods, or a complex smoother for BoomerAMG. In particular, we consider the adaptive version of the algorithm, where the sparsity pattern of the lower triangular factor G is built dynamically, i.e., during an iterative procedure that tries to find the best nonzero positions for a given row of G. This implementation was built on top of the IJ interface. It uses the diagonal portion of A for constructing G, i.e., it is a block-Jacobi method in the MPI sense. List of additional changes:

* Add caliper instrumentation to FSAI.
* Add ZeroGuess option to FSAI.
* Performance optimizations.
* Add OpenMP support to FSAI.
* Make internal BLAS/LAPACK functions thread-safe. 
* Update CMake build.
* Add new test cases: beam_tet_dof459_np1, beam_hex_dof459_np2, and beam_tet_dof2475_np4.
* Add documentation for FSAI.

Co-authored-by: Heather Switzer <switzer4@lassen36.coral.llnl.gov>
Co-authored-by: heatherms27 <hmswitzer@email.wm.edu>
Co-authored-by: Sarah Osborn <30503782+osborn9@users.noreply.github.com>
2022-04-05 11:18:39 -07:00
ulrikeyang
303457abae
fixed MM-multipass interpolation for case of no C-points (#606)
* fixed MM-multipass interpolation for case of no C-points

* fixed the issue of isolated groups of fine points and added a regression test.

* corresponding changes to the device code

Co-authored-by: Ruipeng Li <li50@llnl.gov>
2022-03-29 15:14:29 -07:00
Ruipeng Li
5fe37b2286
hypre_ParPrintf (#604)
This PR adds hypre_ParPrintf, which prints to standard output only from the first processor in the communicator. Calls from other processes are ignored.
2022-03-21 09:10:01 -07:00
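The rank-guarded printing described for hypre_ParPrintf (#604) above can be illustrated with a minimal sketch. This is not hypre's implementation: the name `par_printf_sketch` is hypothetical, the rank is passed in explicitly (an MPI code would typically obtain it from `MPI_Comm_rank` on the communicator), and "first processor" is assumed to mean rank 0.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for a rank-guarded printf (not hypre's code).
 * Only the caller with rank 0 prints; every other rank returns 0
 * without writing anything. */
int par_printf_sketch(int rank, const char *format, ...)
{
    va_list args;
    int     written;

    assert(format != NULL);

    if (rank != 0)
    {
        return 0; /* calls from other ranks are ignored */
    }

    va_start(args, format);
    written = vprintf(format, args); /* number of characters printed */
    va_end(args);

    return written;
}
```

Called as `par_printf_sketch(rank, "iterations = %d\n", its)` on every process, only the rank-0 call produces output, which avoids duplicated lines in parallel runs.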
Ruipeng Li
adfd07c509
Fix build on FreeBSD/powerpc*. (#603)
lr collides with lr from machine/frame.h header (link register):
Co-authored-by: Piotr Kubaj <pkubaj@FreeBSD.org>
2022-03-18 10:16:20 -07:00
Rob Falgout
fa43ea82e3
Bug fix in prefix sum for OpenMP IJ interface (#602)
This fixes a bug found in issue #522 for the prefix sum openmp code in IJ.
2022-03-16 10:56:09 -07:00
Ruipeng Li
92faac9748
fix memory location (#600)
This PR fixes a number of memory location issues in memory copy and memset. It also adds stricter checking in memory.c in debug mode.
2022-03-14 11:19:28 -07:00
Victor A. Paludetto Magri
6fd043c9c2
(S)Struct IO on GPUs (#599)
This PR extends the (semi)-struct matrix/vector IO functions added in #583 with GPU support. Additionally:

* Fix regression tests on Lassen.
* Read data values into host memory
* Update Umatrix read algorithm when the ParCSRMatrix is expected to live on the device
* Reset deallocated pointers at hypre_IJMatrixDestroyParCSR to NULL
* Clone rownnz info if present on a CSRMatrix
* Reduce memory transfer and remove unused variables
* Fix bug with -print option
* Build rownnz info also when the ParCSRMatrix is in assembled state
* Remove a few instances of "return ierr"
* Refactor (s)struct IO - code works with cuda and without UM
* Add executables to gitignore
2022-03-13 20:14:23 -07:00
Ruipeng Li
f7787ab0ae
fixes coarsening.jobs.14 (#598)
a temporary "fix"
2022-03-11 18:25:03 -08:00
Ruipeng Li
8c344aee9a
Invalid assumption on exclusive_scan (#575)
This PR fixes a number of initialization problems with exclusive_scan on GPUs caused by invalid assumptions about this function.
2022-03-11 08:32:26 -08:00
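The message for #575 above does not say which assumptions about exclusive_scan were invalid. As general background only, here is a minimal serial sketch of what an exclusive scan computes. The function name `exclusive_scan_int` is hypothetical and this is not the GPU routine hypre calls.

```c
#include <assert.h>
#include <stddef.h>

/* Serial exclusive prefix sum (illustrative only):
 *   out[i] = init + in[0] + ... + in[i-1]
 * out[0] is always init, and in[n-1] is never added to any output,
 * so the grand total is not available in out and must be computed
 * separately by the caller. */
void exclusive_scan_int(const int *in, int *out, size_t n, int init)
{
    int    running = init;
    size_t i;

    assert(n == 0 || (in != NULL && out != NULL));

    for (i = 0; i < n; i++)
    {
        int value = in[i]; /* read first so that in == out (in-place) is safe */
        out[i] = running;
        running += value;
    }
}
```

For the input {3, 1, 4, 2} with init 0 the output is {0, 3, 4, 8}; the total 10 does not appear in the output, and nothing is written at all when n is 0.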
Paul T. Bauman
251cd3d269
Need -O1 instead of -O0 for HIP in debug mode (#588)
This PR changes -O0 in debug mode to -O1 with HIP (at this time).
2022-03-04 12:40:35 -08:00
Ruipeng Li
95e6433fc7
GPU support with single precision (#572)
This PR fixes the GPU support with single precision.
2022-03-04 12:05:32 -08:00
Ruipeng Li
ebd6eb88c3
bug fix; nonsquare rap (#581)
This PR fixes a corner case of the RAP routine where the RAP matrix is globally square but not locally square.
2022-03-03 21:26:17 -08:00
Paul T. Bauman
04af9a4cd9
HYPRE_Int -> HYPRE_BigInt (#585) 2022-02-18 12:16:35 -08:00
Golam Rabbani
94070dd3a9
Updated CMakeLists.txt for SYCL (#577)
With CMake, enable CUDA stream by default when using SYCL.
2022-02-17 18:21:51 -08:00
Victor A. Paludetto Magri
33a5051398
Add SStruct IO functions (#583)
This PR adds support for native print/read functions of SStructMatrix and SStructVector. Other important changes are:
* Add public functions for reading StructMatrix and StructVector.
* Add a new set of regression tests called "io" to the TEST_sstruct folder.
2022-02-17 18:06:23 -08:00
Victor A. Paludetto Magri
49dbf7b60a
Fix cross-compilation problem (#580)
This PR fixes issue #556.

AC_CHECK_FILE was being used to test the existence of the .git folder. However, according to the Autoconf manual, it does not work when cross-compiling. This PR implements another strategy for locating the .git folder that also works when cross-compiling.
2022-02-16 07:55:02 -08:00
Ulrike Yang
e5a82e81e6 specified SYCL support 2022-02-10 17:12:32 -08:00
Rob Falgout
ccd135d8da Updating CHANGELOG 2022-02-10 10:18:32 -08:00
Rob Falgout
666f457d2b Bumping release number to 2.24.0 2022-02-10 07:05:43 -08:00
Rob Falgout
4ee737b53c Initial CHANGELOG update for new release 2022-02-10 07:02:26 -08:00
Ruipeng Li
ab72d05bd8
Deviceomp (#519)
This PR fixes the build with Kokkos + OMP offload, supports OMP offload without linking CUDA libraries, and supports OMP offload on Intel GPUs.
2022-02-09 06:40:57 -08:00
Ruipeng Li
8ba048c0b5
Forced regeneration of softlinks in shared library builds. (#574)
This PR (copied from #573) adds -f to softlink generation in all the makefiles.
2022-02-07 16:54:44 -08:00
Ruipeng Li
e40f8219a3 fix for last merged PR 2022-02-07 18:26:53 -06:00
Quan Bui
734a10fcb7
Mgr setup gpu (#400)
Enable GPU setup for MGR solver.
* Added device specific functionality for interpolation
* Made device and host calls to interpolation consistent
* Edited IJ driver to use GPU capable options for MGR
* Updated saved files for new GPU options
* Updated CMakeLists to support new MGR capabilities

Co-authored-by: Ruipeng Li <li50@llnl.gov>
Co-authored-by: Daniel Osei-Kuffuor <oseikuffuor1@llnl.gov>
2022-02-07 15:54:52 -08:00
Ruipeng Li
790e8e7826
fix cuda 11 build (#569)
This PR fixes the CUDA 11 build after merging #549 and adds regression tests (build only) with CUDA 11.
2022-02-02 08:40:15 -08:00
Wayne Mitchell
a7bb784a45
SYCL support for AMG solve phase (#549)
This adds matvec, matrix transpose, and vector operations (axpy, inner product, etc.)
with sycl backend (via oneMKL and oneDPL) for running on Intel GPUs. Thus, the AMG
solve phase can now execute entirely on Intel GPUs.
2022-01-31 16:15:30 -08:00
Victor A. Paludetto Magri
b159c7dd58
Fortran interfaces (#566)
This PR adds Fortran interfaces for hypre_MGR and hypre_ILU. Additionally:

* Add ArrayArray types in `fortran.h`
* Add MGR and ILU options to the fortran interfaces for Krylov solvers
2022-01-31 15:32:14 -08:00