hypre

CFD/hypre

Author	SHA1	Message	Date
Victor A. Paludetto Magri	662e886881	[Multivec 2/5]: Extend multivector support (#693 ) * Add new device functions needed by multivectors (`hypreDevice_IntStridedCopy` and `hypreDevice_IVAMXPMY`) * Extend `hypre_SeqVectorElmdivpy` to work with multivectors.	2022-07-29 15:37:24 -07:00
Ruipeng Li	e270c561b0	Spgemm (#639 ) This PR includes optimizations for hypre's SpGEMM and ParSpGEMM kernels Co-authored-by: Wayne Mitchell <mitchell82@llnl.gov> Co-authored-by: Paul T. Bauman <ptbauman@gmail.com> Co-authored-by: Sarah Osborn <30503782+osborn9@users.noreply.github.com>	2022-06-24 10:42:16 -07:00
Victor A. Paludetto Magri	edb91b4a50	Add -auxfromfile option to IJ driver (#633 ) Add -auxfromfile option for reading an auxiliary matrix from file, which is then used to build the preconditioner. This is useful, for example, for the case when a filtered version of A is used to build the preconditioner.	2022-05-26 21:23:31 -04:00
Ruipeng Li	ef3f890d4b	Nvcollab (#591 ) This PR contains various GPU optimizations made in collaboration with the NVIDIA team. Co-authored-by: Peng Wang <penwang@nvidia.com>	2022-05-24 13:27:32 -07:00
Victor A. Paludetto Magri	e16167fe46	Fix copyright (#615 ) This PR updates Copyright headers from "Copyright 1998-2019 ..." to "Copyright (c) 1998 ..."	2022-04-05 16:19:51 -07:00
Quan Bui	734a10fcb7	Mgr setup gpu (#400 ) Enable GPU setup for MGR solver. * Added device specific functionality for interpolation * Made device and host calls to interpolation consistent * Edited IJ driver to use GPU capable options for MGR * Updated saved files for new GPU options * Updated CMakeLists to support new MGR capabilities Co-authored-by: Ruipeng Li <li50@llnl.gov> Co-authored-by: Daniel Osei-Kuffuor <oseikuffuor1@llnl.gov>	2022-02-07 15:54:52 -08:00
Wayne Mitchell	a7bb784a45	SYCL support for AMG solve phase (#549 ) This adds matvec, matrix transpose, and vector operations (axpy, inner product, etc.) with a SYCL backend (via oneMKL and oneDPL) for running on Intel GPUs. Thus, the AMG solve phase can now execute entirely on Intel GPUs.	2022-01-31 16:15:30 -08:00
Wayne Mitchell	4232108a4d	Add SYCL support (#431 ) This sets up basic infrastructure (e.g. memory management, device setup, etc.) and implements the boxloops and structured solvers in SYCL.	2021-11-22 16:54:22 -08:00
Rob Falgout	805ee77be8	Adding source file indentation with astyle (#498 ) This PR adds automatic indentation using Artistic Style (astyle). The script config/astyle-apply.sh runs the indentation using the configuration file config/astylerc. The script also runs headers in all of the directories that automatically generate internal _hypre_*.h header files. Much of this was borrowed from the MFEM project. A pre-commit git hook was also added.	2021-11-08 19:26:59 -08:00
Ruipeng Li	7f2762cffb	Cusparse spmv (#512 ) This PR removes frequent GPU malloc/free in CSRMatvec with cuSPARSE 11. See #507.	2021-11-01 10:33:52 -07:00
Ruipeng Li	eaff5505ed	hypre's GPU SpGemm (#433 ) This PR improves the performance of hypre's sparse matrix-matrix multiplication on NVIDIA GPUs, and fixes it on AMD GPUs with HIP. Co-authored-by: Ruipeng Li <coe0141@redwood.cm.cluster> Co-authored-by: Paul T. Bauman <ptbauman@gmail.com>	2021-09-09 08:34:39 -07:00
Ruipeng Li	dd9f1ea31c	Dcsrmv analysis (#458 ) This PR (by @pbauman #430) is a hook to be able to call rocsparse_dcsrmv_analysis when using rocSPARSE on AMD GPUs. Co-authored-by: Paul T. Bauman <ptbauman@gmail.com>	2021-09-08 13:59:17 -07:00
Wayne Mitchell	e53b5a0270	Add rocsparse triangular solve (#462 ) Adds a rocsparse implementation for the upper/lower triangular solve required for Gauss-Seidel relaxation when using hip and rocsparse on AMD GPUs.	2021-08-30 13:33:49 -07:00
Wayne Mitchell	59fcd47e6d	Air gpu (#425 ) This PR ports the Neumann version of AIR to the GPU. New features include: 1. Construction of Neumann AIR restriction operator R on the GPU 2. Construction of one-point interpolation on the GPU 3. Construction of an absolute value version of the strength of connection matrix on the GPU 4. CF relaxation for Jacobi (relax7) and L1 Jacobi (relax18) on the GPU - note that this does redundant computation since a full matvec is called when only relaxing either C- or F-points 5. Regression tests for AIR 6. Filtering for ParCSR matrices based on tol*row_norm for 1-, 2-, and infinity-norm on the GPU	2021-08-06 16:39:42 -07:00
Luke	cb0c70b163	Fix potentially inconsistent eig estimates (#390 ) (#410 ) This PR reimplements hypre_ParCSRMaxEigEstimate using Gershgorin discs, which ensures that max_eig and min_eig are both allreduced across all ranks so that the return value of the function is the same for all ranks. Co-authored-by: Ruipeng Li <li50@llnl.gov> Co-authored-by: li50@llnl.gov <liruipengblue@gmail.com>	2021-08-05 14:03:04 -07:00
Victor A. Paludetto Magri	ffe4f7384b	Update IJ interface with changes in recmat-merge (#392 ) This PR improves the IJ interface with new functions, performs code reorganization, and simplifies coding by removing ownership info related to the partitioning data members from ParCSRMatrix and ParVector objects. A more comprehensive list of changes is given below: * Add HYPRE_IJMatrixAdd, HYPRE_IJMatrixNorm and HYPRE_IJMatrixTranspose functions * Add ParCSRMatrixInfNorm and ParCSRMatrixReorder functions * Add transpose, add and norm functions to IJMatrix * Add more caliper annotation to BoomerAMG and ParCSR functions * Fix typo in assumed partition function and add caliper annotation * The output matrix from ParTMatmul owns row/col starts. * Build communication package for A at ParTMatmul if it does not exist. * Move hypre_Log2 to utilities * Add HYPRE_ANNOTATE_REGION_[BEGIN,END] to caliper annotation * Phase out [row,col]_starts ownership info in ParCSR matrices * Remove partitioning ownership info from vector * Move partitioning variables to stack memory	2021-07-28 15:42:23 -07:00
Ruipeng Li	8c9f41a4d0	GPU ams ame ads (#398 ) This PR adds GPU support for ams, ame and ads, and for the following parcsr operations on GPUs: ParCSRAdd, ParCSRTranspose, and l1 hybrid G-S/SSOR. Co-authored-by: Rob Falgout <rfalgout@llnl.gov>	2021-06-21 14:36:46 -07:00
Wayne Mitchell	5f8472b05c	Amgdd fixes (#386 ) This removes the masked matvec routine previously used for CF L1 Jacobi relaxation in the AMG-DD solver. There was a bug present in the GPU code and the bsrxmv cusparse routine no longer supports our use-case as of cuda 11. In addition, appropriate regression test results were saved for the GPU implementation of AMG-DD.	2021-06-15 10:44:46 -07:00
Ruipeng Li	3bc7d267ef	Gpu default (#336 ) This PR changes AMG defaults regarding GPUs at various places, adds regression tests on GPUs, simplifies CUDA boxloop implementations. Co-authored-by: Sarah Virginia Osborn <osborn9@llnl.gov> Co-authored-by: PaulMullowney <pmullown@nrel.gov> Co-authored-by: Daniel Osei-Kuffuor <oseikuffuor1@llnl.gov> Co-authored-by: Ruipeng Li <li50@euler.llnl.gov> Co-authored-by: Ruipeng Li <coe0141@redwood.cm.cluster>	2021-05-24 17:16:35 -07:00
Victor A. Paludetto Magri	c38527c455	Add OMP support to Mat/Mat add functions (#341 ) Add OpenMP support to CSRMatrixAddHost and ParCSRMatrixAdd functions. Minor changes are: - Changed name ParcsrAdd to ParCSRMatrixAdd - Add hypre_CSRMatrixAddFirstPass and hypre_CSRMatrixAddSecondPass to reduce code duplication - Update rownnz support in CSRMatrixAddHost, ParCSRMatrixAdd and hypre_ILUParCSRInverseNSH. - Refactor SpMV branches	2021-05-13 17:29:42 -07:00
Ruipeng Li	25646da905	Mat descr (#331 ) This PR (@pbauman #329) addresses #309 by allowing each `hypre_csrmatrix` to have a GPU matrix descriptor. Co-authored-by: Paul T. Bauman <ptbauman@gmail.com>	2021-04-15 19:00:38 -07:00
Ruipeng Li	b3a4a76a5f	Roc sparse (#316 ) This PR (by @pbauman #304) adds the first pass of rocSPARSE support. Co-authored-by: Paul T. Bauman <ptbauman@gmail.com>	2021-03-25 20:11:53 -07:00
Ramesh Pankajakshan	414fa671be	Umpire (#243 ) This PR adds support for Umpire pooling allocators for host and GPU memory. When hypre is configured with --with-umpire, device and UVM allocations and deallocations are done with Umpire, whereas the host pool is not enabled by default. This PR also includes some other minor changes: * Add .gitignore to the repo * Remove all malloc/calloc/realloc/free and add regression testing for finding them * No longer compile ij.c with a C++ compiler; it goes back to being C code * Introduce HYPRE_USING_GPU, which is equivalent to HYPRE_USING_CUDA || HYPRE_USING_DEVICE_OPENMP * Add a few user-level interfaces: HYPRE_SetMemoryLocation, HYPRE_SetExecutionPolicy, HYPRE_SetGPUMemoryPoolSize and HYPRE_CSRMatrixSetSpGemmUseCusparse Co-authored-by: li50@llnl.gov <liruipengblue@gmail.com> Co-authored-by: Rob Falgout <rfalgout@llnl.gov> Co-authored-by: Ruipeng Li <li50@llnl.gov>	2021-02-03 12:31:25 -08:00
Ruipeng Li	2186a8fb34	triangular solve on GPUs; runcheck (#256 ) This PR fixes triangular solve on GPUs, and runcheck.sh Co-authored-by: Daniel Osei-Kuffuor <oseikuffuor1@llnl.gov>	2021-01-15 20:46:59 -08:00
Ruipeng Li	b49727f16b	Cuda triangular smoothers (#240 ) * This commit has CUDA-based smoothers for AMG based on the triangular parts of sparse matrices. This includes a Gauss-Seidel smoother (relax_type==3), which uses cuSPARSE triangular solvers to invert L. Symmetric Gauss-Seidel is implemented in relax_type==6, also via cuSPARSE. Finally, two new smoothers are added. The first is a two-stage approximation to Gauss-Seidel using a parallel MatVec and L (relax_type==11). The second (relax_type==12) is a less effective version of 11; it uses A_diag instead of L for the smoothing. CPU implementations of these new smoothers are also provided. For the two-stage algorithms, L and U are NOT explicitly created. This seems faster and saves memory. In the two-stage preconditioner, multiplying by invdiag rather than dividing by the diagonal reduces register pressure and yields full occupancy. Co-authored-by: Paul Mullowney <pmullown@nrel.gov> Co-authored-by: PaulMullowney <60452402+PaulMullowney@users.noreply.github.com>	2020-12-17 19:37:59 -08:00
Ruipeng Li	804609b6c4	Reorg relax (#237 ) This PR refactors the relaxation routines on CPUs and modularizes the various Jacobi and Gauss-Seidel (G-S) methods into two "core" kernels.	2020-12-07 09:05:36 -08:00
Daniel Osei-Kuffuor	56012897e1	Ilu dev 2019 (#160 ) This merge introduces new features to the parallel ILU solvers in hypre. In particular we have GPU support for BJ-ILU(0) and GMRES-ILU(0). In addition, this merge includes a new option for GMRES-ILU(0) using MILU(0) to build restriction/interpolation operators used to construct the Schur complement matrix by a Galerkin product. This option is also available on the GPU. Key commits include: * ILU updates with bug fixes for compiling the cuda version * Update local RCM ordering option to support nonsymmetric matrices * Update regression tests to test new features * Reference manual updates, Code cleanup and bug fixes Co-authored-by: Tianshi Xu <xu16@ray59.coralea.llnl.gov> Co-authored-by: Tianshi Xu <xu16@lassen708.coral.llnl.gov> Co-authored-by: Xu <xu16@bellsofireland.llnl.gov> Co-authored-by: Kote Hitenze <hitenze@jotenshis-MacBook-Pro.local> Co-authored-by: Tianshi Xu <xuxx1180@umn.edu> Co-authored-by: Ruipeng Li <li50@llnl.gov>	2020-11-22 22:16:56 -06:00
Luke	22f4d3f8c6	Cuda 11 API (#163 ) This PR adds CUDA-11 support.	2020-11-05 20:57:57 -08:00
Ruipeng Li	aaf5aa564a	Aggressive coarsening and 2-stage MM-ext Interpolations on GPUs (#195 ) This PR contains the following changes: * Aggressive coarsening, i.e., 2nd SoC on GPUs * 2-stage MM-ext Interpolations (MM-ext, MM-ext+e) on GPUs * Enhanced abilities of extracting strong FF/FC/CF/CC submatrix with given SoC matrix * Bug fix in device PMIS Co-authored-by: Bjorn Sjogreen <sjogreen2@llnl.gov> Co-authored-by: ulrikeyang <yang11@llnl.gov>	2020-09-23 17:13:23 -07:00
Wayne Mitchell	0b80656ce9	AMG-DD implementation (#145 ) This includes the implementation of the AMG-DD algorithm, a variant of BoomerAMG designed to limit communication. AMG-DD may be used as a standalone solver or a preconditioner for Krylov methods (note that AMG-DD is a non-symmetric preconditioner). For an example of how to set up and use AMG-DD, see the IJ driver (src/test/ij.c). A list with the parameters of AMG-DD is given below: * Padding (recommended default 1): HYPRE_BoomerAMGDDSetPadding(...) * Number of ghost layers (recommended default 1): HYPRE_BoomerAMGDDSetNumGhostLayers(...) * Number of inner FAC cycles per AMG-DD iteration (default 2): HYPRE_BoomerAMGDDSetFACNumCycles(...) * FAC cycle type: HYPRE_BoomerAMGDDSetFACCycleType(...), where 1 = V-cycle (default), 2 = W-cycle, 3 = F-cycle * Number of relaxations on each level during FAC cycle: HYPRE_BoomerAMGDDSetFACNumRelax(...) * Type of local relaxation during FAC cycle: HYPRE_BoomerAMGDDSetFACRelaxType(...), where 0 = Jacobi, 1 = Gauss-Seidel, 2 = ordered Gauss-Seidel, 3 = C/F L1-scaled Jacobi (default) For more details of the algorithm, see Mitchell W.B., R. Strzodka, and R.D. Falgout (2020), Parallel Performance of Algebraic Multigrid Domain Decomposition (AMG-DD).	2020-09-02 17:52:20 -07:00
Ruipeng Li	ce7ef08496	new sparse mat-mat-dist, triple-mat-dist	2020-06-01 18:10:41 -07:00
Ruipeng Li	6f8f513164	added protos.h; bug fix	2020-05-12 23:24:35 -07:00

32 Commits