Commit Graph

12219 Commits

Author SHA1 Message Date
Ruipeng Li
6bd19c272f
new benchmark results on tioga (#727)
Improved benchmark performance on tioga.
2022-09-01 08:25:03 -07:00
Ruipeng Li
f48a5ce0e3
Fixing bugs in hypre_BoomerAMGBuildModMultipassDevice (#724)
Fixing bugs in hypre_BoomerAMGBuildModMultipassDevice
2022-08-29 09:53:06 -07:00
Victor A. Paludetto Magri
9f9f2972f8
[Multivec 5/5]: Parallel SpMV updates (#700)
* Add device functions to perform strided copy
* Extend BoomerAMG with L1-Jac to vectors with multiple components
* Add ParVectorSetLocalSize calls to MGR
* Extend hypre_ParCSRCommPkg with a new member (num_components) that allows communicating several vector components at once.
* Add hypre_ParCSRCommPkgCreateAndFill for allocating and filling a communication package with its data.
* Improve parallel SpMV performance for vectors with multiple components

Co-authored-by: Wayne Mitchell <mitchell82@llnl.gov>
2022-08-24 20:17:18 -04:00
Victor A. Paludetto Magri
9a9d68c5d5
[Multivec 3/5]: Sequential SpMV updates (#694)
This is part of a series of PRs for enabling BoomerAMG to be applied to vectors with multiple components.

* Extend hypre's SpMV on CUDA to multivectors and add calls to cusparse's SpMM.
* Add loop unrolling to hypre's SpMV on CPUs when using multivectors with fewer than 5 components.
* Add HYPRE_SPMV_FILL variables to hypre's SpMV on GPUs
* Add SYCL changes to SpMV

Co-authored-by: Wayne Mitchell <mitchell82@llnl.gov>
2022-08-23 09:55:09 -04:00
PaulMullowney
5d1def73d9
slight fix to compile with umpire and pinned memory pools (#682)
Co-authored-by: Paul Mullowney <Paul.Mullowney@nrel.gov>
2022-08-18 09:28:52 -07:00
Victor A. Paludetto Magri
2383e6881d
Fix sign-compare warnings (#714)
This PR fixes warnings from [-Wsign-compare] when building hypre with GPU support.
2022-08-17 17:16:34 -04:00
Wayne Mitchell
c551365fe7
Add config options for target backends for sycl instead of hard coding (#701)
Adds config options for specifying a target backend with sycl (both cmake and autoconf)
2022-08-16 17:43:08 -07:00
Wayne Mitchell
6cf11aaa08
Increase available length of filenames for parcsr mat and vec (#562)
Increase char array lengths to accommodate longer filenames, e.g. long absolute paths.
2022-08-16 15:03:13 -07:00
Victor A. Paludetto Magri
ce3ecb0daf
[Multivec 4/5]: BoomerAMG partially supports vector with multiple components (#695)
* Extend BoomerAMG with L1-Jacobi to vectors with multiple components.
* Add ParVectorSetLocalSize calls to MGR.
2022-08-11 16:21:31 -04:00
Yadong_Zeng
71cdcc89f7
Include missing header file for the CUDA install (#710)
Co-authored-by: yzeng <yzeng@altair.com>
2022-08-07 15:06:53 -07:00
Wayne Mitchell
98ab3445b7
Sycl build fix (#707)
Move fill functions in device utils back into the general functions block for use with sycl backend.
2022-08-02 18:53:35 -07:00
Ruipeng Li
6425099995
Iterative Triangular Solve APIs (#696)
* This commit adds the public and internal APIs for a Jacobi approximation to the iterative triangular solve. At the moment, this code simply sets parameters for the ILU classes.

* Fixing defaults, docs, and comments.

Co-authored-by: Paul Mullowney <Paul.Mullowney@nrel.gov>
2022-08-02 00:19:44 -07:00
Victor A. Paludetto Magri
6845385dd1
Fix sanity checks of hypre_SeqVectorElmdivpy (#704)
This fixes the regression test failure reported in #703.
2022-08-01 10:44:06 -04:00
Fredrik Ekre
fac5ba173f
docs/misc.rst: fix rst syntax for italics (#690)
* docs/README: add tip for viewing local docs using Python webserver.
* docs/misc.rst: fix rst syntax for italics.
2022-07-29 17:10:00 -07:00
Victor A. Paludetto Magri
662e886881
[Multivec 2/5]: Extend multivector support (#693)
* Add new device functions needed by multivectors (`hypreDevice_IntStridedCopy` and `hypreDevice_IVAMXPMY`)
* Extend `hypre_SeqVectorElmdivpy` to work with multivectors.
2022-07-29 15:37:24 -07:00
Victor A. Paludetto Magri
26f334002f
[Multivec 1/5]: Fix code compilation (#692)
This PR fixes a few compilation errors when building hypre with CUDA and without cusparse support.
2022-07-28 18:13:34 -04:00
Wayne Mitchell
922a3ce4df
SYCL IJ without unified memory (#676)
Ports some missing routines to allow the IJ driver to run without unified memory with the sycl backend:
* IJ vector set/add values and assembly
* Routines in par_coarse_parms_device.c
* hypre_ParCSRComputeL1Norms()
2022-07-26 12:01:55 -07:00
Victor A. Paludetto Magri
ea3c43a6b0
Set/Get for multivectors (#665)
This PR enables Set/AddTo/GetValues for multivectors through the IJ interface. It also adds regression tests for the new functionality (TEST_ij/vector) and for the HYPRE_IJVectorInnerProd function.
2022-07-20 16:46:15 -04:00
Ruipeng Li
48de53e675
fix bigint (#684)
Fix BigInt issue #681
2022-07-20 09:28:22 -07:00
Fredrik Ekre
e8990caf7a
Fix a typo in documentation for hybrid solver. (#685)
2022-07-20 06:52:01 -07:00
Ruipeng Li
275d04d987
Cmake cuda update (#675)
This PR contains CMake changes for CUDA.

Co-authored-by: pengwang <penwang@nvidia.com>
Co-authored-by: Sarah Virginia Osborn <osborn9@llnl.gov>
2022-07-19 22:04:42 -07:00
Ruipeng Li
22d35a4d09
Backward compatible ROCm header update. (#680)
Backward compatible update to rocm header include path.

Co-authored-by: Paul T. Bauman <ptbauman@gmail.com>
2022-07-15 23:28:00 -07:00
Ruipeng Li
5eb84ec1db
Fix GPU memory leak (#677)
This PR fixes a memory leak on GPUs.
2022-07-15 11:20:41 -07:00
Ruipeng Li
ad50e4e123
config/update.sh (#671)
Minor changes to `configure`.
2022-07-08 16:33:48 -07:00
Wayne Mitchell
6c4803b90d
Sycl pmis (#664)
Port PMIS coarsening to SYCL
2022-07-08 14:00:23 -07:00
Ruipeng Li
bd514cf998
Cusolver (#653)
This PR adds the option of using cuSolver.

Co-authored-by: Paul Mullowney <Paul.Mullowney@nrel.gov>
2022-07-06 09:37:32 -07:00
Ruipeng Li
14ee602fbf
Regression (#668)
This PR updates regression test scripts and benchmark performance results.
2022-07-05 17:10:43 -07:00
Wayne Mitchell
6f3bccb92c
Sycl interp (#638)
This adds sycl support for interpolation options ExtInterp, ExtPIInterp,
and ExtPEInterp (which correspond to InterpType 6, 14, 16, 17, 18).
Generation of the strength matrix is also ported to sycl.
Further unification of cuda/hip/sycl kernel functions.
Adds regression tests for the sycl backend on arcticus including both ij and struct tests.
2022-07-05 16:10:36 -07:00
Ruipeng Li
63ed624709
Merge pull request #666 from hypre-space/interp_trunc
This PR optimizes interpolation truncation routines on GPUs.
2022-06-30 14:34:59 -07:00
Ruipeng Li
aa153c9c89 astyle 2022-06-30 14:23:24 -07:00
Ruipeng Li
750d4877a4 Merge branch 'interp_trunc' of github.com:hypre-space/hypre into interp_trunc 2022-06-29 11:54:06 -07:00
Ruipeng Li
5dfda6b009 update saved.lassen 2022-06-29 11:42:23 -07:00
Ruipeng Li
6611451694 update tioga saved 2022-06-29 10:41:47 -07:00
Ruipeng Li
ac09576ef9 bug fix 2022-06-29 10:41:22 -07:00
Ruipeng Li
2fa29169c6 bug fix 2022-06-29 10:06:09 -07:00
Ruipeng Li
172787d7d9 delete files 2022-06-28 22:18:39 -07:00
Ruipeng Li
b03f350bf1 fix after merge 2022-06-28 22:15:45 -07:00
Ruipeng Li
4ed68414e5 Merge branch 'master' of github.com:hypre-space/hypre into interp_trunc 2022-06-28 22:10:57 -07:00
Ruipeng Li
b3573fb7a5 .saved 2022-06-28 21:09:23 -07:00
Wayne Mitchell
4411530e76
hypre_Item (#645)
Introduce hypre_DeviceItem to further unify cuda/hip/sycl implementation.
Unify some wrappers for thread/warp-level kernel routines.
2022-06-28 08:27:36 -07:00
Ruipeng Li
e270c561b0
Spgemm (#639)
This PR includes optimizations for hypre's SpGEMM and ParSpGEMM kernels

Co-authored-by: Wayne Mitchell <mitchell82@llnl.gov>
Co-authored-by: Paul T. Bauman <ptbauman@gmail.com>
Co-authored-by: Sarah Osborn <30503782+osborn9@users.noreply.github.com>
2022-06-24 10:42:16 -07:00
Victor A. Paludetto Magri
8268b9f1e1
hypre_ParCSRMatrixPrintIJ on device (#655)
hypre_ParCSRMatrixPrintIJ works for matrices living on the device without the need for UVM support. An explicit copy to host memory is performed in this function prior to printing the files.
2022-06-22 20:49:57 -04:00
Victor A. Paludetto Magri
850fd47d07
Fix chebyshev smoother for singular problems (#657)
See PR's description for additional info
2022-06-22 20:47:09 -04:00
Ruipeng Li
b58585e0f0
add a func (#646)
This PR adds a function to perform local transposition of ParCSR.
2022-06-21 08:52:21 -07:00
Ruipeng Li
322e6a5e6e astyle 2022-06-17 10:14:00 -07:00
Ruipeng Li
18f85886ff remove debug code 2022-06-17 09:15:08 -07:00
Ruipeng Li
b9b93c45ef save debug code of Pass0 2022-06-17 09:13:58 -07:00
Ruipeng Li
8d54b78730 fix some nvtx region names 2022-06-15 23:18:00 -07:00
Ruipeng Li
3509640354 optimized interp_trunc 2022-06-15 22:55:35 -07:00
Ulrike Yang
ac9d7d0d7b updated CHANGELOG 2022-06-14 12:02:25 -07:00