Commit Graph

12101 Commits

Author SHA1 Message Date
Wayne Mitchell
5e90f43cbc astyle 2022-03-08 23:51:41 +00:00
Wayne Mitchell
34c16787b7 Addition of MatMat and TMatMat routines and clean up 2022-03-08 23:50:48 +00:00
Wayne Mitchell
297ff5d5a7 Par matmat is verified correct for a small example 2022-03-08 21:08:47 +00:00
Ruipeng Li
7443a2ac6c missed some sync in the last commit 2022-03-07 23:54:03 -08:00
Ruipeng Li
e1b9a56405 add gpu sync for mpi 2022-03-07 23:34:33 -08:00
Ruipeng Li
8ee20f4812 cudamallocasync 2022-03-07 16:54:56 -08:00
Ruipeng Li
df0f6dbba7 configure options: cublas; cudamallocasync 2022-03-07 16:40:32 -08:00
Ruipeng Li
d7728d0bce updated ij driver for 2nd solve 2022-03-07 15:16:00 -08:00
Ruipeng Li
b97fbc13ed sync device at ending timing 2022-03-07 15:13:08 -08:00
Ruipeng Li
c2e4836c1e bug fix 2022-03-05 10:36:53 -08:00
Ruipeng Li
a51bb880a8 bug fix 2022-03-05 09:46:16 -08:00
Ruipeng Li
03546b428f Merge branch 'master' of github.com:hypre-space/hypre into nvcollab 2022-03-04 22:18:02 -08:00
Paul T. Bauman
251cd3d269
Need -O1 instead of -O0 for HIP in debug mode (#588)
This PR changes -O0 in debug mode to -O1 with HIP (at this time).
2022-03-04 12:40:35 -08:00
Ruipeng Li
95e6433fc7
GPU support with single precision (#572)
This PR fixes the GPU support with single precision.
2022-03-04 12:05:32 -08:00
Ruipeng Li
ebd6eb88c3
bug fix; nonsquare rap (#581)
This PR fixes a corner case of the RAP routine for a RAP matrix that is globally square but not locally square.
2022-03-03 21:26:17 -08:00
Wayne Mitchell
d388a2766e Lots of reorganization. This now has all functionality for par matmat and compiles, but needs debugging. 2022-03-01 01:59:27 +00:00
Wayne Mitchell
8112dd736f Further cleanup and reorg of device_utils.c/h and addition of more functionality needed for par matmat 2022-02-19 00:40:27 +00:00
Paul T. Bauman
04af9a4cd9
HYPRE_Int -> HYPRE_BigInt (#585) 2022-02-18 12:16:35 -08:00
Wayne Mitchell
cda5b10a69 Single processor device rap works 2022-02-18 18:58:28 +00:00
Wayne Mitchell
47ae1c8a22 Start major reorganization of device_utils.h 2022-02-18 17:44:22 +00:00
Golam Rabbani
94070dd3a9
Updated CMakeLists.txt for SYCL (#577)
With CMake, enable CUDA stream by default when using SYCL.
2022-02-17 18:21:51 -08:00
Victor A. Paludetto Magri
33a5051398
Add SStruct IO functions (#583)
This PR adds support for native print/read functions of SStructMatrix and SStructVector. Other important changes are:
* Add public functions for reading StructMatrix and StructVector.
* Add a new set of regression tests called "io" to the TEST_sstruct folder.
2022-02-17 18:06:23 -08:00
Wayne Mitchell
1e289479f1 astyle 2022-02-17 00:56:49 +00:00
Ruipeng Li
9888903445 memory tracker 2022-02-16 15:04:20 -08:00
Ruipeng Li
c336122cc1 move to debug region 2022-02-16 14:19:15 -08:00
Victor A. Paludetto Magri
49dbf7b60a
Fix cross-compilation problem (#580)
This PR fixes issue #556.

AC_CHECK_FILE was being used to test for the existence of the .git folder. However, according to the Autoconf manual, it does not work when cross-compiling. This PR implements another strategy for locating the .git folder that also works when cross-compiling.
2022-02-16 07:55:02 -08:00
Wayne Mitchell
4136c63269 Switch to HYPRE_GPU_LAUNCH 2022-02-15 20:01:08 +00:00
Wayne Mitchell
7883708a4f astyle 2022-02-15 18:05:12 +00:00
Wayne Mitchell
e3e99f3b9d Add back include 2022-02-15 18:04:22 +00:00
Wayne Mitchell
1e2f711d01 Merge branch 'sycl' into sycl_par_matmat 2022-02-15 18:00:07 +00:00
Wayne Mitchell
e312ecbe31 Merge branch 'master' into sycl_par_matmat 2022-02-15 17:41:05 +00:00
Rui-peng Li
df0b1eac20 turn off print 2022-02-12 01:14:53 -06:00
Rui-peng Li
c430559482 remove fast hash 2022-02-12 01:09:15 -06:00
Rui-peng Li
c347b3c9e1 remove some changes 2022-02-12 00:53:55 -06:00
Ruipeng Li
7672c101cd try to optimize hash 2022-02-11 22:17:47 -08:00
Ruipeng Li
c85a060d85 naive row nnz bound with spmv 2022-02-11 17:18:32 -08:00
Ulrike Yang
e5a82e81e6 specified SYCL support 2022-02-10 17:12:32 -08:00
Rob Falgout
ccd135d8da Updating CHANGELOG 2022-02-10 10:18:32 -08:00
Rob Falgout
666f457d2b Bumping release number to 2.24.0 2022-02-10 07:05:43 -08:00
Rob Falgout
4ee737b53c Initial CHANGELOG update for new release 2022-02-10 07:02:26 -08:00
Ruipeng Li
9d99e1fb77 put unroll as a macro 2022-02-10 00:14:18 -05:00
Ruipeng Li
efed76505c rename printf0; change unroll to 1 2022-02-09 23:44:49 -05:00
Wayne Mitchell
f8d482efad Add packing on device with oneDPL for par matvec 2022-02-10 01:51:56 +00:00
Ruipeng Li
6180f1261c minor change 2022-02-09 20:36:53 -05:00
Ruipeng Li
bef2862710 silence compiler warnings 2022-02-09 13:27:18 -05:00
Ruipeng Li
ab72d05bd8
Deviceomp (#519)
This PR fixes the build with Kokkos + OMP offload, supports OMP offload without linking CUDA libraries, and supports OMP offload on Intel GPUs.
2022-02-09 06:40:57 -08:00
Rui-peng Li
e0c24fd1f6 turn off printings 2022-02-08 16:34:05 -06:00
Ruipeng Li
04037377ea debug for HIP 2022-02-08 14:48:58 -06:00
Ruipeng Li
8ba048c0b5
Forced regeneration of softlinks in shared library builds. (#574)
This PR (copied from #573) adds -f to softlink generation in all the makefiles.
2022-02-07 16:54:44 -08:00
Ruipeng Li
e40f8219a3 fix for last merged PR 2022-02-07 18:26:53 -06:00