79 Commits

Author SHA1 Message Date
Ruipeng Li
3f39f5d4fa update printMM 2022-06-07 15:26:40 -07:00
Ruipeng Li
8b02ab88d3 GPU support for rownnz using IntArray 2022-06-07 12:08:43 -07:00
Ruipeng Li
c09ab567e7 remove a function 2022-06-07 10:13:11 -07:00
Ruipeng Li
4e78801d2a Merge branch 'master' of github.com:hypre-space/hypre into spgemm 2022-05-25 16:36:44 -07:00
Wayne Mitchell
dfdd1cd12f
Sycl par matmat (#611)
Further unification of GPU implementation across cuda/hip/sycl.
Implements the parallel matrix matrix product in sycl.
HYPRE_CUDA_LAUNCH and HYPRE_SYCL_LAUNCH macros have been unified under HYPRE_GPU_LAUNCH for kernel launches.
Replace HYPRE_SetSpGemmUseCusparse with HYPRE_SetSpGemmUseVendor.
2022-05-09 15:24:44 -07:00
Ruipeng Li
224bb78d4f Merge branch 'spgemm' of github.com:hypre-space/hypre into parspgemm 2022-04-05 23:35:19 -07:00
Ruipeng Li
046b278c66 bug fix 2022-04-05 22:26:29 -07:00
Victor A. Paludetto Magri
e16167fe46
Fix copyright (#615)
This PR updates Copyright headers from "Copyright 1998-2019 ..." to "Copyright (c) 1998 ..."
2022-04-05 16:19:51 -07:00
Ruipeng Li
87b0b6669a update hypre's spmv 2022-04-02 14:16:08 -07:00
Ruipeng Li
8ea39950b1 Merge branch 'nvcollab' of github.com:hypre-space/hypre into spgemm 2022-03-31 18:58:36 -07:00
Wayne Mitchell
bb2cb43232 Merge branch 'master' into sycl_par_matmat 2022-03-25 20:27:17 +00:00
Wayne Mitchell
f2fa2e9577 Lots of ugly debugging code in here, but I have also fixed a couple esoteric things. Saving with debugging code in just in case I need to go back and use it. 2022-03-18 23:49:23 +00:00
Ruipeng Li
e5f6655ba0 initial support for pattern only matrices (spgemm only) 2022-03-16 09:32:12 -07:00
Victor A. Paludetto Magri
6fd043c9c2
(S)Struct IO on GPUs (#599)
This PR extends the (semi)-struct matrix/vector IO functions added in #583 with GPU support. Additionally:

* Fix regression tests on Lassen.
* Read data values into host memory
* Update Umatrix read algorithm when the ParCSRMatrix is expected to live on the device
* Reset deallocated pointers at hypre_IJMatrixDestroyParCSR to NULL
* Clone rownnz info if present on a CSRMatrix
* Reduce memory transfer and remove unused variables
* Fix bug with -print option
* Build rownnz info also when the ParCSRMatrix is in assembled state
* Remove a few instances of "return ierr"
* Refactor (s)struct IO - code works with cuda and without UM
* Add executables to gitignore
2022-03-13 20:14:23 -07:00
Ruipeng Li
8c344aee9a
Invalid assumption on exclusive_scan (#575)
This PR fixes a number of initialization problems with exclusive_scan on GPUs caused by invalid assumptions about this function.
2022-03-11 08:32:26 -08:00
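The commit message does not say which assumptions about exclusive_scan were invalid. For reference only, here is a minimal sequential sketch of what an exclusive scan computes, and of one detail that is easy to overlook when building a CSR row pointer from per-row counts. The helper names `exclusive_scan` and `counts_to_row_ptr` are hypothetical and are not hypre or Thrust functions.

```c
#include <assert.h>
#include <stddef.h>

/* Sequential exclusive scan: out[i] = init + in[0] + ... + in[i-1].
 * Every output element is written, including out[0] = init.
 * Reading in[i] before writing out[i] keeps the in-place case
 * (in == out) correct. */
void exclusive_scan(const int *in, size_t n, int init, int *out)
{
    assert(in != NULL && out != NULL);
    int running = init;
    for (size_t i = 0; i < n; i++) {
        int v = in[i];
        out[i] = running;
        running += v;
    }
}

/* Turn n per-row counts into a CSR row pointer with n + 1 entries.
 * The scan itself produces only n values and never the grand total,
 * so the last entry has to be filled in separately. */
void counts_to_row_ptr(const int *counts, size_t n, int *row_ptr)
{
    exclusive_scan(counts, n, 0, row_ptr);
    row_ptr[n] = (n > 0) ? row_ptr[n - 1] + counts[n - 1] : 0;
}
```

The point of the sketch is that an n-element exclusive scan defines exactly n outputs; any extra slot, such as the final row-pointer entry, stays uninitialized unless the caller sets it.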
Wayne Mitchell
a7bb784a45
SYCL support for AMG solve phase (#549)
This adds matvec, matrix transpose, and vector operations (axpy, inner product, etc.)
with sycl backend (via oneMKL and oneDPL) for running on Intel GPUs. Thus, the AMG
solve phase can now execute entirely on Intel GPUs.
2022-01-31 16:15:30 -08:00
Wayne Mitchell
bb136a489b oneMKL sparse matvec now in and functional 2021-11-23 01:09:32 +00:00
Wayne Mitchell
134d2d73b1 Initial onemklsparse matvec implementation 2021-11-19 17:15:06 +00:00
Rob Falgout
805ee77be8
Adding source file indentation with astyle (#498)
This PR adds automatic indentation using Artistic Style (astyle).  The script config/astyle-apply.sh runs the indentation using the configuration file config/astylerc.  The script also runs headers in all of the directories that automatically generate internal _hypre_*.h header files.  Much of this was borrowed from the MFEM project.  A pre-commit git hook was also added.
2021-11-08 19:26:59 -08:00
Wayne Mitchell
e53b5a0270
Add rocsparse triangular solve (#462)
Adds a rocsparse implementation for the upper/lower triangular solve required 
for Gauss-Seidel relaxation when using hip and rocsparse on AMD GPUs.
2021-08-30 13:33:49 -07:00
Ruipeng Li
3bc7d267ef
Gpu default (#336)
This PR changes AMG defaults regarding GPUs in various places, adds regression tests on GPUs, and simplifies CUDA boxloop implementations.

Co-authored-by: Sarah Virginia Osborn <osborn9@llnl.gov>
Co-authored-by: PaulMullowney <pmullown@nrel.gov>
Co-authored-by: Daniel Osei-Kuffuor <oseikuffuor1@llnl.gov>
Co-authored-by: Ruipeng Li <li50@euler.llnl.gov>
Co-authored-by: Ruipeng Li <coe0141@redwood.cm.cluster>
2021-05-24 17:16:35 -07:00
Victor A. Paludetto Magri
3f12d47651
Add support for matrices with many zero rows (#300)
The main objective of this PR is to improve the support of matrices with a large number of zero rows in hypre. More specifically:
* Improve IJMatrixAssemble for matrices with a large number of zero rows.
* Add hypre_AuxParCSRMatrixSetRownnz to build array of nonzero rows in the auxiliary matrix. This saves allocation time for building ParCSRMatrices.
* Improve (Par)CSRMatrix transpose, addition and multiplication operations for matrices with a large number of zero rows.

Secondary changes made in this PR are:
* Update SpMV paths in `csr_matvec` in order to make the calculation of A*x more concise.
* Extend OpenMP support to hypre_CSRMatrixSumElts, hypre_CSRMatrixFnorm and hypre_CSRMatrixReorder.
* Clean gcc-9 warnings
* Update saved files and delete unused variable

Co-authored-by: Ruipeng Li <li50@llnl.gov>
2021-04-27 17:14:17 -07:00
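As an illustration of the rownnz idea described in this PR (an array listing only the rows that hold entries), here is a minimal sketch on a plain CSR layout. The function names `build_rownnz` and `csr_matvec_rownnz` and the raw-array interface are assumptions made for this example; they are not the hypre routines, which operate on hypre's own matrix structs.

```c
#include <assert.h>
#include <stddef.h>

/* Collect the indices of rows that store at least one entry.
 * row_ptr has num_rows + 1 entries (standard CSR row pointer);
 * rownnz must have room for num_rows indices.
 * Returns the number of indices written to rownnz. */
size_t build_rownnz(const int *row_ptr, size_t num_rows, int *rownnz)
{
    assert(row_ptr != NULL && rownnz != NULL);
    size_t count = 0;
    for (size_t i = 0; i < num_rows; i++) {
        if (row_ptr[i + 1] > row_ptr[i]) {
            rownnz[count++] = (int) i;
        }
    }
    return count;
}

/* y = A * x, looping only over the rows listed in rownnz.
 * Rows without entries are set to zero once and never visited again. */
void csr_matvec_rownnz(const int *row_ptr, const int *col_idx,
                       const double *val, size_t num_rows,
                       const int *rownnz, size_t num_rownnz,
                       const double *x, double *y)
{
    for (size_t i = 0; i < num_rows; i++) {
        y[i] = 0.0;
    }
    for (size_t r = 0; r < num_rownnz; r++) {
        int i = rownnz[r];
        double s = 0.0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; k++) {
            s += val[k] * x[col_idx[k]];
        }
        y[i] = s;
    }
}
```

When most rows are empty, the second loop touches only the few rows in rownnz, which is the kind of saving the PR describes for transpose, addition and multiplication.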
Ruipeng Li
25646da905
Mat descr (#331)
This PR (@pbauman #329) addresses #309, allowing each `hypre_csrmatrix` to have a GPU matrix descriptor.

Co-authored-by: Paul T. Bauman <ptbauman@gmail.com>
2021-04-15 19:00:38 -07:00
Ruipeng Li
b49727f16b
Cuda triangular smoothers (#240)
* This commit has CUDA-based smoothers for AMG based on the triangular parts of sparse matrices. This includes a Gauss-Seidel smoother (relax_type==3), which uses CUSPARSE triangular solvers to invert L. Symmetric Gauss-Seidel is implemented in relax_type==6, also via CUSPARSE. Finally, two new smoothers are added. The first is a two-stage approximation to Gauss-Seidel using a parallel MatVec and L (relax_type==11). The second (relax_type==12) is a less effective version of 11; it uses A_diag instead of L for the smoothing. CPU implementations of these new smoothers are also provided. For the two-stage algorithms, L and U are NOT explicitly created, which seems faster and saves memory. In the two-stage preconditioner, multiplying by invdiag rather than dividing by the diagonal reduces register pressure and yields full occupancy.
Co-authored-by: Paul Mullowney <pmullown@nrel.gov>
Co-authored-by: PaulMullowney <60452402+PaulMullowney@users.noreply.github.com>
2020-12-17 19:37:59 -08:00
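To make the two-stage idea concrete, here is a minimal CPU sketch of one common formulation: with A = D + L + U, the exact Gauss-Seidel correction (D + L)^{-1} r is replaced by the first-order Neumann approximation (I - D^{-1} L) D^{-1} r. This formulation, the function name `two_stage_gs_sweep`, and the raw CSR interface are assumptions for illustration and are not taken from the hypre source. As in the commit description, L is never built explicitly and the diagonal is applied by multiplication with a precomputed inverse.

```c
#include <assert.h>
#include <stddef.h>

/* One two-stage Gauss-Seidel sweep on a CSR matrix A = D + L + U:
 *   x <- x + (I - D^{-1} L) D^{-1} (b - A x)
 * inv_diag[i] = 1 / A[i][i] is precomputed; r and z are work arrays
 * of length n. Both stages are matvec-like loops with independent rows. */
void two_stage_gs_sweep(size_t n, const int *row_ptr, const int *col_idx,
                        const double *val, const double *inv_diag,
                        const double *b, double *x, double *r, double *z)
{
    assert(row_ptr && col_idx && val && inv_diag && b && x && r && z);

    /* Stage 0: r = D^{-1} (b - A x), multiplying by inv_diag. */
    for (size_t i = 0; i < n; i++) {
        double s = b[i];
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; k++) {
            s -= val[k] * x[col_idx[k]];
        }
        r[i] = s * inv_diag[i];
    }

    /* Stage 1: z = r - D^{-1} (L r). L is read in place from A as
     * the entries whose column index is below the row index. */
    for (size_t i = 0; i < n; i++) {
        double s = 0.0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; k++) {
            if ((size_t) col_idx[k] < i) {
                s += val[k] * r[col_idx[k]];
            }
        }
        z[i] = r[i] - inv_diag[i] * s;
    }

    /* Apply the correction. */
    for (size_t i = 0; i < n; i++) {
        x[i] += z[i];
    }
}
```

Unlike a true triangular solve, no row here depends on another row's result within a stage, which is what makes this kind of smoother attractive on GPUs; the price is that each sweep is only an approximation of Gauss-Seidel.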
Ruipeng Li
7b2379c0d3
optimization in hypre_CSRMatrixBigJtoJ and JtoBigJ (#204)
* Code optimization in hypre_CSRMatrixBigJtoJ and JtoBigJ
2020-09-28 19:17:35 -07:00
Ruipeng Li
aaf5aa564a
Aggressive coarsening and 2-stage MM-ext Interpolations on GPUs (#195)
This PR contains the following changes:
* Aggressive coarsening, i.e., 2nd SoC on GPUs
* 2-stage MM-ext Interpolations (MM-ext, MM-ext+e) on GPUs
* Enhanced ability to extract strong FF/FC/CF/CC submatrices given an SoC matrix
* Bug fix in device PMIS
Co-authored-by: Bjorn Sjogreen <sjogreen2@llnl.gov>
Co-authored-by: ulrikeyang <yang11@llnl.gov>
2020-09-23 17:13:23 -07:00
Wayne Mitchell
0b80656ce9
AMG-DD implementation (#145)
This includes the implementation of the AMG-DD algorithm, a variant of BoomerAMG designed to limit communication. 

AMG-DD may be used as a standalone solver or a preconditioner for Krylov methods (note that AMG-DD is a non-symmetric preconditioner). For an example of how to set up and use AMG-DD, see the IJ driver (src/test/ij.c).

A list with the parameters of AMG-DD is given below:

* Padding (recommended default 1): HYPRE_BoomerAMGDDSetPadding(...)
* Number of ghost layers (recommended default 1): HYPRE_BoomerAMGDDSetNumGhostLayers(...)
* Number of inner FAC cycles per AMG-DD iteration (default 2): HYPRE_BoomerAMGDDSetFACNumCycles(...)
* FAC cycle type: HYPRE_BoomerAMGDDSetFACCycleType(...)
    1 = V-cycle (default)
    2 = W-cycle
    3 = F-cycle
* Number of relaxations on each level during FAC cycle: HYPRE_BoomerAMGDDSetFACNumRelax(...)
* Type of local relaxation during FAC cycle: HYPRE_BoomerAMGDDSetFACRelaxType(...)
    0 = Jacobi
    1 = Gauss-Seidel
    2 = ordered Gauss-Seidel
    3 = C/F L1-scaled Jacobi (default)

For more details of the algorithm, see Mitchell W.B., R. Strzodka, and R.D. Falgout (2020), Parallel Performance of Algebraic Multigrid Domain Decomposition (AMG-DD).
2020-09-02 17:52:20 -07:00
Ruipeng Li
0d4089bdf8 FFFC on device 2020-05-11 17:23:39 -07:00
liruipeng
4457324247 access global var _hypre_handle via hypre_handle() 2020-03-27 10:35:19 -07:00
Ruipeng Li
1afd4a09a9 memory model, exec policy, etc 2020-02-24 22:16:18 -08:00
Ruipeng Li
213d2390c3 Merge branch 'gpu-assembly' of https://github.com/hypre-space/hypre into mempool
Conflicts:
	src/parcsr_ls/ams.c
	src/parcsr_mv/par_csr_matrix.c
2019-12-05 22:57:11 -08:00
Ruipeng Li
c9755f5d0d less UM 2019-11-19 10:22:47 -08:00
Ruipeng Li
964b65006c remove ismanaged 2019-10-11 21:40:51 -07:00
Ruipeng Li
0af952509c Merge branch 'master' of https://github.com/hypre-space/hypre into amg-setup
Conflicts:
	src/CMakeLists.txt
	src/config/configure.in
	src/parcsr_ls/HYPRE_parcsr_ls.h
	src/parcsr_ls/_hypre_parcsr_ls.h
	src/parcsr_ls/aux_interp.c
	src/parcsr_ls/par_amg.c
	src/parcsr_ls/par_amg_setup.c
	src/parcsr_ls/par_coarsen.c
	src/parcsr_ls/par_cycle.c
	src/parcsr_ls/par_interp.c
	src/parcsr_ls/par_strength.c
	src/parcsr_mv/_hypre_parcsr_mv.h
	src/parcsr_mv/par_csr_assumed_part.c
	src/parcsr_mv/par_csr_communication.c
	src/parcsr_mv/par_csr_communication.h
	src/parcsr_mv/par_csr_matop.c
	src/parcsr_mv/par_csr_matrix.c
	src/parcsr_mv/par_csr_matrix.h
	src/seq_mv/csr_matrix.c
	src/seq_mv/seq_mv.h
	src/utilities/_hypre_utilities.h
	src/utilities/hypre_memory.h
	src/utilities/protos.h
2019-09-25 15:36:06 -07:00
Ruipeng Li
0cd052494d Merge branch 'master' of https://github.com/hypre-space/hypre into AIR
Conflicts:
	src/krylov/krylov.h
	src/parcsr_ls/par_lr_restr.c
	src/parcsr_ls/par_stats.c
	src/utilities/protos.h
2019-07-26 11:32:53 -07:00
Ruipeng Li
8e8eb4f5cb Merge branch 'master' of https://github.com/hypre-space/hypre into amg-setup
Conflicts:
	src/IJ_mv/IJMatrix_parcsr.c
	src/parcsr_ls/par_nongalerkin.c
	src/seq_mv/csr_matrix.c
	src/utilities/_hypre_utilities.h
	src/utilities/binsearch.c
	src/utilities/gpuErrorCheck.c
	src/utilities/gpuErrorCheck.h
	src/utilities/gpuMem.c
	src/utilities/gpuMem.h
	src/utilities/hypre_cuda_reducer.h
	src/utilities/hypre_nvtx.h
	src/utilities/hypre_reducesum.c
	src/utilities/protos.h
2019-07-22 15:36:07 -07:00
Ruipeng Li
59d2809b2d gpu changes 2019-07-12 17:09:44 -07:00
Rob Falgout
48c9f0b972 Changed all of the headers 2019-07-07 19:26:24 -07:00
Ulrike Yang
926577d3fe freed unnecessary array. 2019-06-26 07:58:02 -07:00
Ruipeng Li
4ba901049e a lot of GPU related changes 2019-06-17 10:59:02 -07:00
Ruipeng Li
026b14cf7e code refactor 2019-06-04 16:31:31 -07:00
Ruipeng Li
20af62015e rewrite some kernels with thrust 2019-06-02 20:39:18 -07:00
Ruipeng Li
a2e5c0a7ce device matvec. removed some shared memory usage 2019-06-01 11:07:31 -07:00
Ruipeng Li
e184413f60 Merge branch 'master' of https://github.com/hypre-space/hypre into AIR
Conflicts:
	src/config/Makefile.config.in
	src/config/configure.in
	src/configure
	src/krylov/gmres.c
	src/krylov/krylov.h
	src/parcsr_ls/HYPRE_parcsr_gmres.c
	src/parcsr_ls/_hypre_parcsr_ls.h
	src/parcsr_ls/aux_interp.c
	src/parcsr_ls/par_amg.c
	src/parcsr_ls/par_amg_setup.c
	src/parcsr_ls/par_coarse_parms.c
	src/parcsr_ls/par_coarsen.c
	src/parcsr_ls/par_cycle.c
	src/parcsr_ls/par_lr_restr.c
	src/parcsr_ls/par_relax.c
	src/parcsr_ls/par_relax_interface.c
	src/parcsr_ls/par_restr.c
	src/parcsr_ls/par_stats.c
	src/parcsr_ls/par_strength.c
	src/parcsr_mv/_hypre_parcsr_mv.h
	src/parcsr_mv/par_csr_communication.c
	src/parcsr_mv/par_csr_matop.c
	src/parcsr_mv/par_csr_matrix.c
	src/parcsr_mv/par_csr_matrix.h
	src/test/ij.c
	src/utilities/_hypre_utilities.h
	src/utilities/hypre_qsort.c
2019-05-09 15:08:38 -07:00
Ruipeng Li
5a6fa6b2ac more changes related to bigInt and adding some comments for PMIS 2019-04-12 16:57:48 -07:00
Ruipeng Li
ce63d1c3f1 Merge branch 'master' of https://github.com/hypre-space/hypre into amg-setup
Conflicts:
	src/parcsr_ls/_hypre_parcsr_ls.h
	src/parcsr_ls/aux_interp.c
	src/parcsr_ls/par_coarsen.c
	src/parcsr_ls/par_gsmg.c
	src/parcsr_ls/par_laplace_9pt.c
	src/parcsr_ls/par_lr_restr.c
	src/parcsr_ls/par_rap.c
	src/parcsr_ls/par_relax.c
	src/parcsr_mv/_hypre_parcsr_mv.h
	src/parcsr_mv/new_commpkg.c
	src/parcsr_mv/par_csr_assumed_part.c
	src/parcsr_mv/par_csr_assumed_part.h
	src/parcsr_mv/par_csr_communication.c
	src/parcsr_mv/par_csr_matop.c
	src/parcsr_mv/par_csr_matrix.c
	src/parcsr_mv/par_csr_matrix.h
	src/parcsr_mv/par_vector.c
	src/seq_mv/csr_matrix.c
	src/seq_mv/gpukernels.c
	src/seq_mv/seq_mv.h
	src/utilities/_hypre_utilities.h
	src/utilities/gpuMem.h
	src/utilities/hypre_general.c
	src/utilities/protos.h
2019-04-03 12:47:46 -07:00
Ruipeng Li
61094c1d97 a lot of GPU-related changes 2019-04-02 13:51:01 -07:00
Ulrike Yang
b35bdc16d7 added private variables 2019-03-13 09:34:52 -07:00
Ulrike Yang
f7d6980215 fixed various memory leaks etc 2019-03-08 07:22:00 -08:00
Ulrike Yang
84d50104d9 Added HYPRE_BigInts in various places 2019-02-15 15:04:27 -08:00