Commit Graph

12104 Commits

Author SHA1 Message Date
Paul T. Bauman
d55a409bdb Silence uninitialized var HIP warnings
This should be a benign change. What happens is that the first one or two
workitems/threads in each workgroup/block read a value and then broadcast
it (with __shfl_sync or similar) and then code branching happens
based on this value. But the compiler can't follow the value all the
way through, so we get some uninitialized-variable warnings.
2022-03-09 13:31:32 -06:00
Ruipeng Li
9c33e9a263 regression tests 2022-03-09 08:54:42 -08:00
Ruipeng Li
63c9fa65a2 add using hypre's spmv option 2022-03-08 22:11:31 -08:00
Wayne Mitchell
5e90f43cbc astyle 2022-03-08 23:51:41 +00:00
Wayne Mitchell
34c16787b7 Addition of MatMat and TMatMat routines and clean up 2022-03-08 23:50:48 +00:00
Wayne Mitchell
297ff5d5a7 Par matmat is verified correct for a small example 2022-03-08 21:08:47 +00:00
Ruipeng Li
7443a2ac6c missed some sync in the last commit 2022-03-07 23:54:03 -08:00
Ruipeng Li
e1b9a56405 add gpu sync for mpi 2022-03-07 23:34:33 -08:00
Ruipeng Li
8ee20f4812 cudamallocasync 2022-03-07 16:54:56 -08:00
Ruipeng Li
df0f6dbba7 configure options: cublas; cudamallocasync 2022-03-07 16:40:32 -08:00
Ruipeng Li
d7728d0bce updated ij driver for 2nd solve 2022-03-07 15:16:00 -08:00
Ruipeng Li
b97fbc13ed sync device at ending timing 2022-03-07 15:13:08 -08:00
Ruipeng Li
c2e4836c1e bug fix 2022-03-05 10:36:53 -08:00
Ruipeng Li
a51bb880a8 bug fix 2022-03-05 09:46:16 -08:00
Ruipeng Li
03546b428f Merge branch 'master' of github.com:hypre-space/hypre into nvcollab 2022-03-04 22:18:02 -08:00
Paul T. Bauman
251cd3d269
Need -O1 instead of -O0 for HIP in debug mode (#588)
This PR changes -O0 in debug mode to -O1 with HIP (at this time).
2022-03-04 12:40:35 -08:00
Ruipeng Li
95e6433fc7
GPU support with single precision (#572)
This PR fixes the GPU support with single precision.
2022-03-04 12:05:32 -08:00
Ruipeng Li
ebd6eb88c3
bug fix; nonsquare rap (#581)
This PR fixes a corner case of the RAP routine for a RAP matrix that is globally square but not locally square.
2022-03-03 21:26:17 -08:00
Wayne Mitchell
d388a2766e Lots of reorganization. This now has all functionality for par matmat and compiles, but needs debugging. 2022-03-01 01:59:27 +00:00
Wayne Mitchell
8112dd736f Further cleanup and reorg of device_utils.c/h and addition of more functionality needed for par matmat 2022-02-19 00:40:27 +00:00
Paul T. Bauman
04af9a4cd9
HYPRE_Int -> HYPRE_BigInt (#585) 2022-02-18 12:16:35 -08:00
Wayne Mitchell
cda5b10a69 Single processor device rap works 2022-02-18 18:58:28 +00:00
Wayne Mitchell
47ae1c8a22 Start major reorganization of device_utils.h 2022-02-18 17:44:22 +00:00
Golam Rabbani
94070dd3a9
Updated CMakeLists.txt for SYCL (#577)
With CMake, enable CUDA stream by default when using SYCL.
2022-02-17 18:21:51 -08:00
Victor A. Paludetto Magri
33a5051398
Add SStruct IO functions (#583)
This PR adds support for native print/read functions of SStructMatrix and SStructVector. Other important changes are:
* Add public functions for reading StructMatrix and StructVector.
* Add a new set of regression tests called "io" to the TEST_sstruct folder.
2022-02-17 18:06:23 -08:00
Wayne Mitchell
1e289479f1 astyle 2022-02-17 00:56:49 +00:00
Ruipeng Li
9888903445 memory tracker 2022-02-16 15:04:20 -08:00
Ruipeng Li
c336122cc1 move to debug region 2022-02-16 14:19:15 -08:00
Victor A. Paludetto Magri
49dbf7b60a
Fix cross-compilation problem (#580)
This PR fixes issue #556.

AC_CHECK_FILE was being used to test for the existence of the .git folder. However, according to the Autoconf manual, it does not work when cross-compiling. This PR implements another strategy for locating the .git folder that also works when cross-compiling.
2022-02-16 07:55:02 -08:00
Wayne Mitchell
4136c63269 Switch to HYPRE_GPU_LAUNCH 2022-02-15 20:01:08 +00:00
Wayne Mitchell
7883708a4f astyle 2022-02-15 18:05:12 +00:00
Wayne Mitchell
e3e99f3b9d Add back include 2022-02-15 18:04:22 +00:00
Wayne Mitchell
1e2f711d01 Merge branch 'sycl' into sycl_par_matmat 2022-02-15 18:00:07 +00:00
Wayne Mitchell
e312ecbe31 Merge branch 'master' into sycl_par_matmat 2022-02-15 17:41:05 +00:00
Rui-peng Li
df0b1eac20 turn off print 2022-02-12 01:14:53 -06:00
Rui-peng Li
c430559482 remove fast hash 2022-02-12 01:09:15 -06:00
Rui-peng Li
c347b3c9e1 remove some changes 2022-02-12 00:53:55 -06:00
Ruipeng Li
7672c101cd try to optimize hash 2022-02-11 22:17:47 -08:00
Ruipeng Li
c85a060d85 naive row nnz bound with spmv 2022-02-11 17:18:32 -08:00
Ulrike Yang
e5a82e81e6 specified SYCL support 2022-02-10 17:12:32 -08:00
Rob Falgout
ccd135d8da Updating CHANGELOG 2022-02-10 10:18:32 -08:00
Rob Falgout
666f457d2b Bumping release number to 2.24.0 2022-02-10 07:05:43 -08:00
Rob Falgout
4ee737b53c Initial CHANGELOG update for new release 2022-02-10 07:02:26 -08:00
Ruipeng Li
9d99e1fb77 put unroll as a macro 2022-02-10 00:14:18 -05:00
Ruipeng Li
efed76505c rename printf0; change unroll to 1 2022-02-09 23:44:49 -05:00
Wayne Mitchell
f8d482efad Add packing on device with oneDPL for par matvec 2022-02-10 01:51:56 +00:00
Ruipeng Li
6180f1261c minor change 2022-02-09 20:36:53 -05:00
Ruipeng Li
bef2862710 silence compiler warnings 2022-02-09 13:27:18 -05:00
Ruipeng Li
ab72d05bd8
Deviceomp (#519)
This PR fixes the build with Kokkos + OMP offload, supports OMP offload without linking CUDA libraries, and supports OMP offload on Intel GPUs.
2022-02-09 06:40:57 -08:00
Rui-peng Li
e0c24fd1f6 turn off printings 2022-02-08 16:34:05 -06:00