Commit Graph

25 Commits

Author SHA1 Message Date
Rob Falgout
f478498295
New error handling feature to print messages to memory (#920)
This allows users to direct hypre's error messages to a memory buffer instead of stderr.  With this, there are now three basic ways to use hypre when configured --with-print-errors:
- Default (mode 0): Errors are printed immediately to stderr (there is no processor information available in this print).
- Store errors in memory (mode 1) and call PrintErrorMessages to print them.
- Store errors in memory (mode 1) and call GetErrorMessages to manage the error messages however you like.
2023-06-13 20:17:25 -07:00
Rui Peng Li
ec86992c4b
Cuda12 (#871)
This PR adds the support for CUDA 12.
2023-05-17 20:01:41 -07:00
Rui Peng Li
5546cc22d4
Runtime switch and memory tracker (#741)
This PR adds "runtime switch" feature to the solvers in hypre.

Co-authored-by: Victor A. P. Magri <paludettomag1@llnl.gov>
Co-authored-by: Wayne Mitchell <mitchell82@llnl.gov>
2022-10-07 08:39:00 -07:00
Ruipeng Li
5eb84ec1db
Fix GPU memory leak (#677)
This PR fixes a memory leak on GPUs.
2022-07-15 11:20:41 -07:00
Ruipeng Li
14ee602fbf
Regression (#668)
This PR updates regression test scripts and benchmark performance results.
2022-07-05 17:10:43 -07:00
Ruipeng Li
ef3f890d4b
Nvcollab (#591)
This PR contains various GPU optimizations in the collaboration with the NVIDIA team. 

Co-authored-by: Peng Wang <penwang@nvidia.com>
2022-05-24 13:27:32 -07:00
Victor A. Paludetto Magri
e16167fe46
Fix copyright (#615)
This PR updates Copyright headers from "Copyright 1998-2019 ..." to "Copyright (c) 1998 ..."
2022-04-05 16:19:51 -07:00
Ruipeng Li
95e6433fc7
GPU support with single precision (#572)
This PR fixes the GPU support with single precision.
2022-03-04 12:05:32 -08:00
Ruipeng Li
ab72d05bd8
Deviceomp (#519)
This PR fixes the build with Kokkos + OMP offload, supports OMP offload without linking CUDA libraries, and supports OMP offload on Intel GPUs.
2022-02-09 06:40:57 -08:00
Ruipeng Li
790e8e7826
fix cuda 11 build (#569)
This PR fixes CUDA 11 build after merging #549, also adds regression tests (build only) with CUDA 11.
2022-02-02 08:40:15 -08:00
Ruipeng Li
8c9f41a4d0
GPU ams ame ads (#398)
This PR adds GPU support for ams, ame and ads, and the following parcsr operations on GPUs, ParCSRAdd, ParCSRTranspose, l1 hybrid G-S/SSOR.

Co-authored-by: Rob Falgout <rfalgout@llnl.gov>
2021-06-21 14:36:46 -07:00
Ruipeng Li
ad5d7e009f
Gpu mixedInt (#380)
This PR adds GPU support for mixedInt. 

Co-authored-by: Rob Falgout <rfalgout@llnl.gov>
2021-06-10 11:10:13 -07:00
Ruipeng Li
3bc7d267ef
Gpu default (#336)
This PR changes AMG defaults regarding GPUs at various places, adds regression tests on GPUs, simplifies CUDA boxloop implementations. 

Co-authored-by: Sarah Virginia Osborn <osborn9@llnl.gov>
Co-authored-by: PaulMullowney <pmullown@nrel.gov>
Co-authored-by: Daniel Osei-Kuffuor <oseikuffuor1@llnl.gov>
Co-authored-by: Ruipeng Li <li50@euler.llnl.gov>
Co-authored-by: Ruipeng Li <coe0141@redwood.cm.cluster>
2021-05-24 17:16:35 -07:00
Sarah Osborn
8a41a42c8d
Cmake cuda updates (#349)
Add CUDA options to CMake. 
* Add CUDA options to CMake, including adding CUDA_SRCS to each sub-directory's CMake list
* Changes so that HYPRE_config.h from CMake is consistent with configure
* Fix to license commenting style
* Add CUDA + CMake tests for building on lassen
2021-05-19 15:39:57 -05:00
Rob Falgout
6f9260b67c
Add saved-file extension to runtest (#271)
This pull request adds a -save <ext> feature to the runtest.sh script to allow testing against different saved files on different platforms such as GPU machines. See Issue #255. A few additional things were done:

- All of the checks against the saved files were moved out of the individual tests and into runtest.sh.
- The output-file sanity checks that are in many of the tests were modified so they no longer depend on the saved files. Several issues were also uncovered and fixed.
2021-02-08 15:11:45 -08:00
Ruipeng Li
2186a8fb34
triangular solve on GPUs; runcheck (#256)
This PR fixes triangular solve on GPUs, and runcheck.sh

Co-authored-by: Daniel Osei-Kuffuor <oseikuffuor1@llnl.gov>
2021-01-15 20:46:59 -08:00
Ruipeng Li
b49727f16b
Cuda triangular smoothers (#240)
* This commit has CUDA based smoothers for AMG based on the triangular parts of sparse matrices. This includes an Gauss-Seidel (relax_type==3), which uses CUSPARSE triangular solvers to invert L. Symmetric Gauss Seidel is implemented in relax_type==6 also via CUSPARSE. Finally, 2 new smoothers are added. THe first is a 2 stage approximation to Gauss Seidel using a parallel MatVec and L (relax_type==11). The second (relax_type==12) is a less effective version of 11. It uses A_diag instead of L for the smoothing. CPU implementations of these new smoothers are also provided. For the two stage algorithms, L and U are NOT explicitly created. This seems faster and saves memory. In the two stage preconditioner, multiply by invdiag rather than divide by diagonal reduces register pressure and yields full occupancy.
Co-authored-by: Paul Mullowney <pmullown@nrel.gov>
Co-authored-by: PaulMullowney <60452402+PaulMullowney@users.noreply.github.com>
2020-12-17 19:37:59 -08:00
Luke
22f4d3f8c6
Cuda 11 API (#163)
This PR adds CUDA-11 support.
2020-11-05 20:57:57 -08:00
Ruipeng Li
aaf5aa564a
Aggressive coarsening and 2- stage MM-ext Interpolations on GPUs (#195)
This PR contains the following changes:
* Aggressive coarsening, i.e, 2nd SoC on GPUs
* 2-stage MM-ext Interpolations (MM-ext, MM-ext+e) on GPUs
* Enhanced abilities of extracting strong FF/FC/CF/CC submatrix with given SoC matrix
* Bug fix in device PMIS
Co-authored-by: Bjorn Sjogreen <sjogreen2@llnl.gov>
Co-authored-by: ulrikeyang <yang11@llnl.gov>
2020-09-23 17:13:23 -07:00
liruipeng
8833bed155 add compile flags in GPU regression test scripts 2020-08-27 18:10:27 -07:00
Ruipeng Li
5d5b75bc02 GPU regression tests 2020-06-05 17:25:45 -07:00
Ruipeng Li
b945553980 updated regression tests on lassen 2020-05-19 22:11:19 -07:00
Ruipeng Li
cf4d9b78b5 bug fix in ILU 2020-02-28 00:32:23 -08:00
Ruipeng Li
43ad3d6703 cub allocator 2020-02-27 22:28:44 -08:00
Ruipeng Li
fb35486114 regression test on lassen 2019-07-23 11:41:34 -07:00