11524 Commits

Author SHA1 Message Date
Ruipeng Li
51e5a4c6de
SStruct interface without UVM (#170)
In this PR, we removed the dependency on UVM when building with CUDA for the SStruct solvers, and added a "memory tracker" to help debug memory leaks or misuse.
2021-02-09 11:21:39 -08:00
Ruipeng Li
3438132e1a
GPU examples (#268)
This PR adds GPU support to hypre's examples. A new `Makefile_gpu` is provided in `src/examples` as a sample makefile, so one can compile the examples with `make -f Makefile_gpu` if hypre has been built with GPU support.
2021-02-09 11:19:05 -08:00
Rob Falgout
6eb66f8695 Fixed a small mistake in the sludist.sh test 2021-02-09 06:44:39 -08:00
Rob Falgout
be18e595ae
Remove the global partition code from hypre (#273)
This PR removes the global partition code from hypre.
2021-02-08 15:16:29 -08:00
Rob Falgout
6f9260b67c
Add saved-file extension to runtest (#271)
This pull request adds a -save <ext> feature to the runtest.sh script to allow testing against different saved files on different platforms such as GPU machines. See Issue #255. A few additional things were done:

- All of the checks against the saved files were moved out of the individual tests and into runtest.sh.
- The output-file sanity checks that are in many of the tests were modified so they no longer depend on the saved files. Several issues were also uncovered and fixed.
2021-02-08 15:11:45 -08:00
Ramesh Pankajakshan
414fa671be
Umpire (#243)
This PR adds support for Umpire pooling allocators for host and GPU memory. When hypre is configured with --with-umpire, device and UVM allocations and deallocations are done through Umpire, whereas the host pool is not enabled by default. This PR also includes some other minor changes:

- Adding .gitignore to the repo
- Removing all direct uses of malloc/calloc/realloc/free and adding regression testing to find them
- No longer compiling ij.c with the C++ compiler; it is C code again
- Introducing HYPRE_USING_GPU, which is equivalent to HYPRE_USING_CUDA || HYPRE_USING_DEVICE_OPENMP
- Adding a few user-level interfaces: HYPRE_SetMemoryLocation, HYPRE_SetExecutionPolicy, HYPRE_SetGPUMemoryPoolSize and HYPRE_CSRMatrixSetSpGemmUseCusparse

Co-authored-by: li50@llnl.gov <liruipengblue@gmail.com>
Co-authored-by: Rob Falgout <rfalgout@llnl.gov>
Co-authored-by: Ruipeng Li <li50@llnl.gov>
2021-02-03 12:31:25 -08:00
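The HYPRE_USING_GPU macro described in the Umpire commit above can be pictured with a small preprocessor sketch. Only the three macro names come from the commit message; the exact form of the definition in hypre's headers is not shown in this log, and the helper `example_built_with_gpu` is a hypothetical name used purely for illustration.

```c
/* Sketch only: the commit message says HYPRE_USING_GPU is equivalent to
 * HYPRE_USING_CUDA || HYPRE_USING_DEVICE_OPENMP. The form below is an
 * assumption, not a copy of hypre's headers. */
#if defined(HYPRE_USING_CUDA) || defined(HYPRE_USING_DEVICE_OPENMP)
#ifndef HYPRE_USING_GPU
#define HYPRE_USING_GPU 1
#endif
#endif

/* Hypothetical helper (not a hypre function): returns 1 if this file was
 * compiled with either GPU backend enabled, and 0 otherwise. */
static int example_built_with_gpu(void)
{
#if defined(HYPRE_USING_GPU)
    return 1;
#else
    return 0;
#endif
}
```

Code that must behave the same for both backends can then test the single umbrella macro instead of repeating the two-way condition.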
Ruipeng Li
8462f60dc7
Hypre warp bitshift (#267)
This PR adds HYPRE_WARP_BITSHIFT macro, which will allow us to hide instances of '>> 5' for forthcoming HIP changes that will need a bit shift of 6 rather than 5. This PR was copied from #265 by @pbauman

Co-authored-by: Paul T. Bauman <ptbauman@gmail.com>
2021-01-27 11:22:09 -08:00
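To see why hiding '>> 5' behind a macro helps, the sketch below computes warp and lane indices from a thread index using a configurable shift. It is a host-side illustration, not hypre code: `EXAMPLE_WARP_BITSHIFT` is a hypothetical stand-in for HYPRE_WARP_BITSHIFT, whose actual definition is not shown in this log. A shift of 5 corresponds to 32-wide warps and a shift of 6 to 64-wide wavefronts.

```c
#include <assert.h>

/* Hypothetical stand-in for HYPRE_WARP_BITSHIFT: 5 for 32-wide warps,
 * 6 for 64-wide wavefronts. Override with -DEXAMPLE_WARP_BITSHIFT=6. */
#ifndef EXAMPLE_WARP_BITSHIFT
#define EXAMPLE_WARP_BITSHIFT 5
#endif
#define EXAMPLE_WARP_SIZE (1 << EXAMPLE_WARP_BITSHIFT)

/* Index of the warp that contains a given thread of a 1-D block. */
static int example_warp_id(int thread_id)
{
    assert(thread_id >= 0);
    return thread_id >> EXAMPLE_WARP_BITSHIFT;
}

/* Position of the thread inside its warp (its "lane"). */
static int example_lane_id(int thread_id)
{
    assert(thread_id >= 0);
    return thread_id & (EXAMPLE_WARP_SIZE - 1);
}
```

With the default shift of 5, thread 70 is lane 6 of warp 2; compiled with a shift of 6, the same thread is lane 6 of warp 1. Only the macro changes, not the call sites.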
Ruipeng Li
2186a8fb34
triangular solve on GPUs; runcheck (#256)
This PR fixes triangular solve on GPUs, and runcheck.sh

Co-authored-by: Daniel Osei-Kuffuor <oseikuffuor1@llnl.gov>
2021-01-15 20:46:59 -08:00
Daniel Osei-Kuffuor
bd76daf124
Updated saved files to reflect change in NSH solve on ILU Schur system -- See PR#251. (#254) 2021-01-11 20:22:36 -08:00
Daniel Osei-Kuffuor
6a1caf8998
Modification to fix error code warning for coarse level solver (#251)
Set tolerance for ILU Schur solver to zero for NSH. Schur solver convergence will be controlled by the max. number of iterations.
2021-01-08 16:57:22 -08:00
Ruipeng Li
a6c852be52
fixed syntax error with --enable-gpu-aware-mpi (#250)
This PR fixed compile errors with `--enable-gpu-aware-mpi`, and added more comments regarding syncing the CUDA stream when doing GPU-GPU MPI.
2021-01-04 16:05:27 -08:00
Ruipeng Li
950f9f2505
fix GPU SpMV for zero matrices (#246)
This PR fixes the issue with zero-sized matrix SpMV on GPUs.
2020-12-22 21:47:36 -08:00
Ruipeng Li
b49727f16b
Cuda triangular smoothers (#240)
* This commit adds CUDA-based smoothers for AMG built on the triangular parts of sparse matrices. This includes Gauss-Seidel (relax_type==3), which uses CUSPARSE triangular solvers to invert L. Symmetric Gauss-Seidel is implemented in relax_type==6, also via CUSPARSE. Finally, two new smoothers are added. The first is a two-stage approximation to Gauss-Seidel using a parallel MatVec and L (relax_type==11). The second (relax_type==12) is a less effective version of 11 that uses A_diag instead of L for the smoothing. CPU implementations of these new smoothers are also provided. For the two-stage algorithms, L and U are NOT explicitly created, which seems faster and saves memory. In the two-stage preconditioner, multiplying by invdiag rather than dividing by the diagonal reduces register pressure and yields full occupancy.
Co-authored-by: Paul Mullowney <pmullown@nrel.gov>
Co-authored-by: PaulMullowney <60452402+PaulMullowney@users.noreply.github.com>
2020-12-17 19:37:59 -08:00
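One common way to write a two-stage approximation to a forward Gauss-Seidel sweep replaces the triangular solve with a truncated Neumann series, (D + L)^{-1} ≈ (I - D^{-1} L) D^{-1}, so that the update needs only matrix-vector products. The dense, serial sketch below illustrates that idea under stated assumptions: it is not taken from hypre, the name `example_two_stage_gs` and the dense row-major storage are hypothetical, and the relax_type==11 smoother in the commit above may differ in detail. It does mirror two points from the commit message: L is never formed explicitly, and the update multiplies by a precomputed inverse diagonal instead of dividing.

```c
#include <assert.h>

/* One two-stage sweep on a dense n-by-n row-major matrix A:
 *   z = D^{-1} (b - A x)
 *   x = x + z - D^{-1} (L z)
 * where D is the diagonal of A and L its strictly lower triangle.
 * work must hold 2*n doubles. Illustrative sketch, not hypre's kernel. */
static void example_two_stage_gs(int n, const double *A, const double *b,
                                 double *x, double *work)
{
    double *z = work;            /* scaled residual */
    double *inv_diag = work + n; /* precomputed 1 / A[i][i] */
    int i, j;

    for (i = 0; i < n; i++) {
        assert(A[i * n + i] != 0.0);
        inv_diag[i] = 1.0 / A[i * n + i];
    }

    /* Stage 1: z = D^{-1} (b - A x), using the old x everywhere. */
    for (i = 0; i < n; i++) {
        double r = b[i];
        for (j = 0; j < n; j++) {
            r -= A[i * n + j] * x[j];
        }
        z[i] = inv_diag[i] * r;
    }

    /* Stage 2: apply (I - D^{-1} L) to z. L is read directly from the
     * entries of A below the diagonal, so it is never stored separately. */
    for (i = 0; i < n; i++) {
        double lz = 0.0;
        for (j = 0; j < i; j++) {
            lz += A[i * n + j] * z[j];
        }
        x[i] += z[i] - inv_diag[i] * lz;
    }
}
```

Both stages are loops whose iterations are independent of one another, which is what makes this form attractive on GPUs compared with a sequential triangular solve.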
Sarah Osborn
763ea8a5a8
cmake: Optionally accept path to BLAS/LAPACK libraries (#230)
Allow a user to specify the path to BLAS/LAPACK libraries within cmake, and bypass the cmake find_package logic.
2020-12-17 12:21:30 -06:00
Ruipeng Li
804609b6c4
Reorg relax (#237)
This PR refactors the relaxation routines on CPUs and modularizes the various Jacobi and Gauss-Seidel (G-S) methods into two "core" kernels.
2020-12-07 09:05:36 -08:00
Daniel Osei-Kuffuor
9c24f006a6
ILU updates (#239)
* Fix bug in communication for Schur offD element map and update comments on communication pattern options.

* Fix bug to avoid double deletion of solution array.
2020-12-04 11:13:21 -08:00
Ruipeng Li
eae9be29be
bug fix cycle_param (#236)
This PR fixes the improper use of relax_type in relax-18 with CF relax, see #235
2020-11-23 13:09:59 -08:00
Daniel Osei-Kuffuor
56012897e1
Ilu dev 2019 (#160)
This merge introduces new features to the parallel ILU solvers in hypre. In particular we have GPU support for BJ-ILU(0) and GMRES-ILU(0). In addition, this merge includes a new option for GMRES-ILU(0) using MILU(0) to build restriction/interpolation operators used to construct the Schur complement matrix by a Galerkin product. This option is also available on the GPU. Key commits include:

* ILU updates with bug fixes for compiling the cuda version

* Update local RCM ordering option to support nonsymmetric matrices

* Update regression tests to test new features

* Reference manual updates, Code cleanup and bug fixes 

Co-authored-by: Tianshi Xu <xu16@ray59.coralea.llnl.gov>
Co-authored-by: Tianshi Xu <xu16@lassen708.coral.llnl.gov>
Co-authored-by: Xu <xu16@bellsofireland.llnl.gov>
Co-authored-by: Kote Hitenze <hitenze@jotenshis-MacBook-Pro.local>
Co-authored-by: Tianshi Xu <xuxx1180@umn.edu>
Co-authored-by: Ruipeng Li <li50@llnl.gov>
2020-11-22 22:16:56 -06:00
Rob Falgout
2bc4228eca Changed sludist.saved file to correct new superlu-dist autotest errors 2020-11-13 06:20:27 -08:00
David M. Rogers
796ab0af48
Use basename when checking compiler in configure (#225)
Use `basename` to strip off the path when (for example) `CC=/path/to/compiler/gcc` before testing for specific compilers in the `configure` script.
2020-11-09 06:23:03 -08:00
Luke
22f4d3f8c6
Cuda 11 API (#163)
This PR adds CUDA-11 support.
2020-11-05 20:57:57 -08:00
Meisam
641f7a4e31
Minor spelling fix (#222)
This PR fixes a typo in doc.
2020-11-02 08:51:42 -08:00
Ruipeng Li
2e1ccee243
Euclid fix (#218)
This PR fixes the integer overflow problem in Euclid.
2020-11-02 08:50:50 -08:00
Daniel Osei-Kuffuor
5ac2b3a54a
Improve portability for update-release script (for LINUX, UNIX and macOS). (#227)
Modified update-release.sh script to improve portability (for LINUX, UNIX and macOS).
* The use of the 'date' command has been modified to use GNU's 'date' command (if installed).
* In addition, single quotes for sed commands have been replaced by double quotes to allow the use of single quotes
   around internal variables. This appears to be more portable than the use of '\x27'. Note that this means shell
   meta-characters need to be escaped if they need to be treated as string literals. Other sed lines are also
   modified accordingly for consistency.
2020-10-29 08:39:34 -07:00
Ruipeng Li
9fb1b351c3
MS-Windows OMP pragma (#223)
This PR fixes the OpenMP pragma on Windows when not using MSVC.
2020-10-27 08:45:58 -07:00
Ruipeng Li
636706acd7
Fixing compile issues --with-caliper (#216)
This PR fixed compile issues --with-caliper and a region mismatch issue with caliper.

Co-authored-by: Victor A. P. Magri <paludettomag1@llnl.gov>
2020-10-09 20:29:27 -07:00
Rob Falgout
dd4ddba0f3 Added a filter to runtest for 'lrun warning' 2020-10-08 13:43:18 -07:00
Rob Falgout
ff45ecef32
Set default convergence tolerance to 1.0e-6 (#206)
This sets the default convergence tolerance to 1.0e-6 uniformly in hypre.  Only three solvers had to be updated (AMG, ILU, and MGR), along with the corresponding documentation.
2020-09-28 19:28:43 -07:00
Ruipeng Li
7b2379c0d3
optimization in hypre_CSRMatrixBigJtoJ and JtoBigJ (#204)
* Code optimization in hypre_CSRMatrixBigJtoJ and JtoBigJ
2020-09-28 19:17:35 -07:00
Ruipeng Li
1ddd69f27c
Fixed problems when calling HYPRE_Finalize() multiple times (#207)
This PR fixed the problem when calling HYPRE_Finalize() multiple times by setting `_hypre_handle=NULL` after `HYPRE_Finalize`.
2020-09-28 16:05:51 -07:00
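The commit above mentions `_hypre_handle=NULL` after `HYPRE_Finalize`. A plausible reading is the usual idempotent-teardown pattern, sketched below with hypothetical names (`example_handle`, `example_init`, `example_finalize`); it is not hypre's implementation, only a minimal illustration of why resetting the pointer makes repeated finalize calls safe.

```c
#include <stdlib.h>

/* Hypothetical library state; stands in for an internal global handle. */
typedef struct {
    int num_streams;
} example_handle_t;

static example_handle_t *example_handle = NULL;

/* Allocate the handle once; later calls are no-ops. Returns 0 on success. */
static int example_init(void)
{
    if (example_handle == NULL) {
        example_handle = (example_handle_t *) calloc(1, sizeof(example_handle_t));
        if (example_handle == NULL) {
            return 1;
        }
    }
    return 0;
}

/* Release the handle. Resetting the pointer to NULL means a second call
 * passes NULL to free(), which the C standard defines as a no-op, instead
 * of freeing the same memory twice. */
static int example_finalize(void)
{
    free(example_handle);
    example_handle = NULL;
    return 0;
}
```

Without the reset, a second call would pass a dangling pointer to `free`, which is undefined behavior.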
Ruipeng Li
5988a506be
Update CHANGELOG 2020-09-24 10:56:42 -07:00
ulrikeyang
54190a8461
Update CHANGELOG 2020-09-24 09:54:21 -07:00
ulrikeyang
4a5c5aca4e
Update CHANGELOG 2020-09-24 09:49:39 -07:00
Rob Falgout
d257887cd8 Another CHANGELOG update for 2.20.0 2020-09-24 05:43:30 -07:00
Rob Falgout
00b826e845 Update version number and date for release 2.20.0 2020-09-23 21:50:25 -07:00
Rob Falgout
2fe718e11f Update CHANGELOG for release 2.20.0 2020-09-23 21:44:28 -07:00
Ruipeng Li
aaf5aa564a
Aggressive coarsening and 2-stage MM-ext Interpolations on GPUs (#195)
This PR contains the following changes:
* Aggressive coarsening, i.e., 2nd SoC on GPUs
* 2-stage MM-ext Interpolations (MM-ext, MM-ext+e) on GPUs
* Enhanced abilities of extracting strong FF/FC/CF/CC submatrix with given SoC matrix
* Bug fix in device PMIS
Co-authored-by: Bjorn Sjogreen <sjogreen2@llnl.gov>
Co-authored-by: ulrikeyang <yang11@llnl.gov>
2020-09-23 17:13:23 -07:00
Victor A. Paludetto Magri
0fcb670540
Fix AMGDD (#190)
This PR fixes some memory leaks of BoomerAMGDD. Additional topics covered are:

* Perform code refactoring to meet the style used in hypre.
* Add GetIterations and FinalResidual interfaces for BoomerAMGDD.
* Add new regression tests.
* Make FAC Relaxation dependent on the execution policy.
2020-09-07 22:29:26 -07:00
ulrikeyang
37f7a0a3f0
Epe gpu (#187)
* added extended+e interpolation, CUDA implementation
* added two regression tests for extended+e interpolation
* renamed some routines to better reflect type of interpolation
Co-authored-by: Ulrike M. Yang <ulrikey@ray53.coralea.llnl.gov>
Co-authored-by: Ruipeng Li <li50@llnl.gov>
2020-09-03 08:15:12 -07:00
Rob Falgout
5f3141a647
Change issue reporting to use github's issue tracker (#189)
These changes direct users to report issues through GitHub instead of the hypre-support email and Roundup.
2020-09-02 22:35:36 -07:00
Rob Falgout
d5e4eb4bd4 Fixed a few minor autotest errors 2020-09-02 22:21:37 -07:00
Rob Falgout
36d0bfba4e Fixed a compile error. 2020-09-02 21:54:46 -07:00
Wayne Mitchell
0b80656ce9
AMG-DD implementation (#145)
This includes the implementation of the AMG-DD algorithm, a variant of BoomerAMG designed to limit communication. 

AMG-DD may be used as a standalone solver or a preconditioner for Krylov methods (note that AMG-DD is a non-symmetric preconditioner). For an example of how to set up and use AMG-DD, see the IJ driver (src/test/ij.c).

A list with the parameters of AMG-DD is given below:

* Padding (recommended default 1): HYPRE_BoomerAMGDDSetPadding(...)
* Number of ghost layers (recommended default 1): HYPRE_BoomerAMGDDSetNumGhostLayers(...)
* Number of inner FAC cycles per AMG-DD iteration (default 2): HYPRE_BoomerAMGDDSetFACNumCycles(...)
* FAC cycle type: HYPRE_BoomerAMGDDSetFACCycleType(...)
  * 1 = V-cycle (default)
  * 2 = W-cycle
  * 3 = F-cycle
* Number of relaxations on each level during FAC cycle: HYPRE_BoomerAMGDDSetFACNumRelax(...)
* Type of local relaxation during FAC cycle: HYPRE_BoomerAMGDDSetFACRelaxType(...)
  * 0 = Jacobi
  * 1 = Gauss-Seidel
  * 2 = ordered Gauss-Seidel
  * 3 = C/F L1-scaled Jacobi (default)

For more details of the algorithm, see Mitchell W.B., R. Strzodka, and R.D. Falgout (2020), Parallel Performance of Algebraic Multigrid Domain Decomposition (AMG-DD).
2020-09-02 17:52:20 -07:00
liruipeng
7f9d222ed6 run headers 2020-09-02 09:52:12 -07:00
liruipeng
2b2ea39202 should run `headers` to make sure _hypre_parcsr_mv.h is not directly changed 2020-09-02 09:35:50 -07:00
Ruipeng Li
3ae6c7fec3
Merge pull request #172 from hypre-space/PETScFix
PETSc fix
2020-08-31 10:14:52 -07:00
Ruipeng Li
1c0598626c Merge branch 'master' of https://github.com/hypre-space/hypre into PETScFix 2020-08-27 20:12:13 -07:00
liruipeng
8833bed155 add compile flags in GPU regression test scripts 2020-08-27 18:10:27 -07:00
Ruipeng Li
ffe35407ae format change 2020-08-27 14:31:13 -07:00
Ruipeng Li
f6f98cb363 bug fix (hopefully...) 2020-08-27 14:29:06 -07:00