Commit Graph

12176 Commits

Author SHA1 Message Date
Ruipeng Li
793b22aaf0 Merge branch 'nvcollab' of github.com:hypre-space/hypre into nvcollab 2022-03-11 08:34:20 -08:00
Ruipeng Li
97f3254d71 Merge branch 'master' of github.com:hypre-space/hypre into nvcollab 2022-03-11 08:33:31 -08:00
Ruipeng Li
8c344aee9a
Invalid assumption on exclusive_scan (#575)
This PR fixes a number of initialization problems with exclusive_scan on GPUs due to invalid assumptions of this function.
2022-03-11 08:32:26 -08:00
Rui-peng Li
700b0328bf bug fix 2022-03-10 22:19:22 -06:00
Ruipeng Li
9498625db4 Merge branch 'nvcollab' of github.com:hypre-space/hypre into nvcollab 2022-03-10 18:31:13 -08:00
Ruipeng Li
50c88ab95d minor changes 2022-03-10 18:30:52 -08:00
Ruipeng Li
c143265c61 regression on ray 2022-03-10 17:41:49 -08:00
Ruipeng Li
14a336c28b accidentally deleted io.sh 2022-03-10 13:30:31 -08:00
Ruipeng Li
2d06b53c4e benchmark ij on lassen 2022-03-10 13:26:15 -08:00
Ruipeng Li
815f2d57e0 add cublas/cusparse precision macros 2022-03-10 12:12:32 -08:00
Ruipeng Li
90cbe64fee saved.lassen 2022-03-10 08:45:27 -08:00
Ruipeng Li
7681f7f180 add cublas to makefile 2022-03-10 08:43:44 -08:00
Ruipeng Li
8ecee0b47d update hypre's spmv 2022-03-09 22:36:16 -08:00
Ruipeng Li
31ca2338d2 saved.lassen 2022-03-09 16:22:22 -08:00
Ruipeng Li
009501d51c bug fix 2022-03-09 16:14:56 -08:00
Ruipeng Li
86dae0be5b a minor change 2022-03-09 14:29:22 -08:00
Ruipeng Li
1dc1261fe8 fix cpu regression 2022-03-09 14:26:40 -08:00
Ruipeng Li
f8fd57ab2a updated saved perf on ray 2022-03-09 13:54:00 -08:00
Ruipeng Li
9b8627ce84 update lassen benchmark saved results 2022-03-09 13:09:36 -08:00
Ruipeng Li
7a8cf68b9a add -repeats 2 for struct benchmark jobs 2022-03-09 13:09:07 -08:00
Ruipeng Li
9dda5af3c4 struct.c driver for reps == 2 2022-03-09 12:31:46 -08:00
Paul T. Bauman
08b901f24d Silence clang warning
Should not be a change in behavior, just making explicit the
order of operations with parentheses and silencing a clang warning.
2022-03-09 13:31:32 -06:00
Paul T. Bauman
d55a409bdb Silence uninitialized var HIP warnings
This should be a benign change. What happens is that the first one or two
workitems/threads in each workgroup/block read a value and then broadcast
it (with __shfl_sync or similar) and then code branching happens
based on this value. But the compiler can't see all the way
through it, so we get some uninitialized var warnings.
2022-03-09 13:31:32 -06:00
Ruipeng Li
9c33e9a263 regression tests 2022-03-09 08:54:42 -08:00
Ruipeng Li
63c9fa65a2 add using hypre's spmv option 2022-03-08 22:11:31 -08:00
Wayne Mitchell
5e90f43cbc astyle 2022-03-08 23:51:41 +00:00
Wayne Mitchell
34c16787b7 Addition of MatMat and TMatMat routines and clean up 2022-03-08 23:50:48 +00:00
Wayne Mitchell
297ff5d5a7 Par matmat is verified correct for a small example 2022-03-08 21:08:47 +00:00
Ruipeng Li
7443a2ac6c missed some sync in the last commit 2022-03-07 23:54:03 -08:00
Ruipeng Li
e1b9a56405 add gpu sync for mpi 2022-03-07 23:34:33 -08:00
Ruipeng Li
8ee20f4812 cudamallocasync 2022-03-07 16:54:56 -08:00
Ruipeng Li
df0f6dbba7 configure options: cublas; cudamallocasync 2022-03-07 16:40:32 -08:00
Ruipeng Li
d7728d0bce updated ij driver for 2nd solve 2022-03-07 15:16:00 -08:00
Ruipeng Li
b97fbc13ed sync device at ending timing 2022-03-07 15:13:08 -08:00
Ruipeng Li
c2e4836c1e bug fix 2022-03-05 10:36:53 -08:00
Ruipeng Li
a51bb880a8 bug fix 2022-03-05 09:46:16 -08:00
Ruipeng Li
03546b428f Merge branch 'master' of github.com:hypre-space/hypre into nvcollab 2022-03-04 22:18:02 -08:00
Paul T. Bauman
251cd3d269
Need -O1 instead of -O0 for HIP in debug mode (#588)
This PR changes -O0 in debug mode to -O1 with HIP (at this time).
2022-03-04 12:40:35 -08:00
Ruipeng Li
95e6433fc7
GPU support with single precision (#572)
This PR fixes the GPU support with single precision.
2022-03-04 12:05:32 -08:00
Ruipeng Li
ebd6eb88c3
bug fix; nonsquare rap (#581)
This PR fixes a corner case of the RAP routine for a RAP matrix that is globally square but not locally square.
2022-03-03 21:26:17 -08:00
Wayne Mitchell
d388a2766e Lots of reorganization. This now has all functionality for par matmat and compiles, but needs debugging. 2022-03-01 01:59:27 +00:00
Wayne Mitchell
8112dd736f Further cleanup and reorg of device_utils.c/h and addition of more functionality needed for par matmat 2022-02-19 00:40:27 +00:00
Paul T. Bauman
04af9a4cd9
HYPRE_Int -> HYPRE_BigInt (#585) 2022-02-18 12:16:35 -08:00
Wayne Mitchell
cda5b10a69 Single processor device rap works 2022-02-18 18:58:28 +00:00
Wayne Mitchell
47ae1c8a22 Start major reorganization of device_utils.h 2022-02-18 17:44:22 +00:00
Golam Rabbani
94070dd3a9
Updated CMakeLists.txt for SYCL (#577)
With CMake, enable CUDA stream by default when using SYCL.
2022-02-17 18:21:51 -08:00
Victor A. Paludetto Magri
33a5051398
Add SStruct IO functions (#583)
This PR adds support for native print/read functions of SStructMatrix and SStructVector. Other important changes are:
* Add public functions for reading StructMatrix and StructVector.
* Add a new set of regression tests called "io" to the TEST_sstruct folder.
2022-02-17 18:06:23 -08:00
Wayne Mitchell
1e289479f1 astyle 2022-02-17 00:56:49 +00:00
Ruipeng Li
9888903445 memory tracker 2022-02-16 15:04:20 -08:00
Ruipeng Li
c336122cc1 move to debug region 2022-02-16 14:19:15 -08:00