Wayne Mitchell
5e90f43cbc
astyle
2022-03-08 23:51:41 +00:00
Wayne Mitchell
34c16787b7
Addition of MatMat and TMatMat routines and clean up
2022-03-08 23:50:48 +00:00
Wayne Mitchell
297ff5d5a7
Par matmat is verified correct for a small example
2022-03-08 21:08:47 +00:00
Ruipeng Li
7443a2ac6c
missed some sync in the last commit
2022-03-07 23:54:03 -08:00
Ruipeng Li
e1b9a56405
add gpu sync for mpi
2022-03-07 23:34:33 -08:00
Ruipeng Li
8ee20f4812
cudamallocasync
2022-03-07 16:54:56 -08:00
Ruipeng Li
df0f6dbba7
configure options: cublas; cudamallocasync
2022-03-07 16:40:32 -08:00
Ruipeng Li
d7728d0bce
updated ij driver for 2nd solve
2022-03-07 15:16:00 -08:00
Ruipeng Li
b97fbc13ed
sync device at ending timing
2022-03-07 15:13:08 -08:00
Ruipeng Li
c2e4836c1e
bug fix
2022-03-05 10:36:53 -08:00
Ruipeng Li
a51bb880a8
bug fix
2022-03-05 09:46:16 -08:00
Ruipeng Li
03546b428f
Merge branch 'master' of github.com:hypre-space/hypre into nvcollab
2022-03-04 22:18:02 -08:00
Paul T. Bauman
251cd3d269
Need -O1 instead of -O0 for HIP in debug mode (#588)
This PR changes -O0 in debug mode to -O1 with HIP (at this time).
2022-03-04 12:40:35 -08:00
Ruipeng Li
95e6433fc7
GPU support with single precision (#572)
This PR fixes the GPU support with single precision.
2022-03-04 12:05:32 -08:00
Ruipeng Li
ebd6eb88c3
bug fix; nonsquare rap (#581)
This PR fixes a corner case of the RAP routine for a RAP matrix that is globally square but not locally square.
2022-03-03 21:26:17 -08:00
Wayne Mitchell
d388a2766e
Lots of reorganization. This now has all functionality for par matmat and compiles, but needs debugging.
2022-03-01 01:59:27 +00:00
Wayne Mitchell
8112dd736f
Further cleanup and reorg of device_utils.c/h and addition of more functionality needed for par matmat
2022-02-19 00:40:27 +00:00
Paul T. Bauman
04af9a4cd9
HYPRE_Int -> HYPRE_BigInt (#585)
2022-02-18 12:16:35 -08:00
Wayne Mitchell
cda5b10a69
Single processor device rap works
2022-02-18 18:58:28 +00:00
Wayne Mitchell
47ae1c8a22
Start major reorganization of device_utils.h
2022-02-18 17:44:22 +00:00
Golam Rabbani
94070dd3a9
Updated CMakeLists.txt for SYCL (#577)
With CMake, enable CUDA stream by default when using SYCL.
2022-02-17 18:21:51 -08:00
Victor A. Paludetto Magri
33a5051398
Add SStruct IO functions (#583)
This PR adds support for native print/read functions of SStructMatrix and SStructVector. Other important changes are:
* Add public functions for reading StructMatrix and StructVector.
* Add a new set of regression tests called "io" to the TEST_sstruct folder.
2022-02-17 18:06:23 -08:00
Wayne Mitchell
1e289479f1
astyle
2022-02-17 00:56:49 +00:00
Ruipeng Li
9888903445
memory tracker
2022-02-16 15:04:20 -08:00
Ruipeng Li
c336122cc1
move to debug region
2022-02-16 14:19:15 -08:00
Victor A. Paludetto Magri
49dbf7b60a
Fix cross-compilation problem (#580)
This PR fixes issue #556.
AC_CHECK_FILE was being used to test for the existence of the .git folder. However, according to the Autoconf manual, it does not work when cross-compiling. This PR implements another strategy for locating the .git folder that also works when cross-compiling.
2022-02-16 07:55:02 -08:00
Wayne Mitchell
4136c63269
Switch to HYPRE_GPU_LAUNCH
2022-02-15 20:01:08 +00:00
Wayne Mitchell
7883708a4f
astyle
2022-02-15 18:05:12 +00:00
Wayne Mitchell
e3e99f3b9d
Add back include
2022-02-15 18:04:22 +00:00
Wayne Mitchell
1e2f711d01
Merge branch 'sycl' into sycl_par_matmat
2022-02-15 18:00:07 +00:00
Wayne Mitchell
e312ecbe31
Merge branch 'master' into sycl_par_matmat
2022-02-15 17:41:05 +00:00
Rui-peng Li
df0b1eac20
turn off print
2022-02-12 01:14:53 -06:00
Rui-peng Li
c430559482
remove fast hash
2022-02-12 01:09:15 -06:00
Rui-peng Li
c347b3c9e1
remove some changes
2022-02-12 00:53:55 -06:00
Ruipeng Li
7672c101cd
try to optimize hash
2022-02-11 22:17:47 -08:00
Ruipeng Li
c85a060d85
naive row nnz bound with spmv
2022-02-11 17:18:32 -08:00
Ulrike Yang
e5a82e81e6
specified SYCL support
2022-02-10 17:12:32 -08:00
Rob Falgout
ccd135d8da
Updating CHANGELOG
2022-02-10 10:18:32 -08:00
Rob Falgout
666f457d2b
Bumping release number to 2.24.0
2022-02-10 07:05:43 -08:00
Rob Falgout
4ee737b53c
Initial CHANGELOG update for new release
2022-02-10 07:02:26 -08:00
Ruipeng Li
9d99e1fb77
put unroll as a macro
2022-02-10 00:14:18 -05:00
Ruipeng Li
efed76505c
rename printf0; change unroll to 1
2022-02-09 23:44:49 -05:00
Wayne Mitchell
f8d482efad
Add packing on device with oneDPL for par matvec
2022-02-10 01:51:56 +00:00
Ruipeng Li
6180f1261c
minor change
2022-02-09 20:36:53 -05:00
Ruipeng Li
bef2862710
silence compiler warnings
2022-02-09 13:27:18 -05:00
Ruipeng Li
ab72d05bd8
Deviceomp (#519)
This PR fixes the build with Kokkos + OMP offload, supports OMP offload without linking CUDA libraries, and supports OMP offload on Intel GPUs.
2022-02-09 06:40:57 -08:00
Rui-peng Li
e0c24fd1f6
turn off printings
2022-02-08 16:34:05 -06:00
Ruipeng Li
04037377ea
debug for HIP
2022-02-08 14:48:58 -06:00
Ruipeng Li
8ba048c0b5
Forced regeneration of softlinks in shared library builds. (#574)
This PR (copied from #573) adds -f to softlink generation in all the makefiles.
2022-02-07 16:54:44 -08:00
Ruipeng Li
e40f8219a3
fix for last merged PR
2022-02-07 18:26:53 -06:00