This PR adds device support to various MGR options:
* Non-Galerkin coarse grid correction options (except for option 4)
* Block diagonal interpolation (interp_type = 12)
* Block Jacobi relaxation (level_smooth_type = 0 for global relaxation and interp_type = 12 for F-relaxation)
The main code changes are listed below:
* Add hypre_ParCSRMatrixExtractBlockDiagDevice and respective GPU kernels
* Add hypre_ParCSRMatrixGenerateFFFCHost and respective backend wrapper
* Add device support to hypre_MGRBuildPBlockJacobi
* Add hypre_ParCSRMatrixBlockDiagMatrixDevice
* Add MGRBuildPFromWpDevice
* Add implementation for batched matrix transpose on the device
* hypre_ParCSRMatrixDropSmallEntriesDevice: exit if tolerance is zero
* Add hypre_ParCSRMatrixGenerateCCCFDevice
* Port MGR's Non-Galerkin option to device
* Add L1-Jacobi global smoother to MGR
* Add missing comments about MGR's public APIs
* Add hypre_MGRComputeNonGalerkinCGDevice
* Update style of hypre_MGRCycle
* Add sanity checks to hypre_SeqVectorElmdivpyMarked
* Add hypre_MGRBlockRelaxSolveDevice
* Add GPUProfiling to several places
* MGR setup: simplify computation of l1_norms
* MGR solve: use ParVectorSetZeros to speed up residual computations
* Exit hypre_SeqVectorElmdivpyMarked earlier for vectors with zero size
* Update caliper region names for MGR
* Add wrappers to cublas batched getrf and getri functions
* General performance improvements for MGR
* Updated gcc compiler flags for the strict-checking build option to emit floating-point conversion warnings
* Several minor edits to clean up floating-point conversion warnings and fix minor bugs.
* Updated saved files to reflect changes.
This PR adds HIP support to MGR. Additionally:
* Add sanity checks at Setup and Solve functions
* Fix a bug in the computation of P_FF in MGR when using GPUs.
* Enable AMG level profiling with HIP
* Enable ROCTX regions in the IJ driver
* Added new capabilities to allow multilevel assignment of solver options
* New (local) block Jacobi option for smoothers and intergrid operators
* Added capabilities to do CPR in MGR
* Updated non-Galerkin strategy for constructing the coarse grid.
Co-authored-by: Quan Bui <mquan.bui@gmail.com>
Enable GPU setup for MGR solver.
* Added device specific functionality for interpolation
* Made device and host calls to interpolation consistent
* Edited IJ driver to use GPU capable options for MGR
* Updated saved files for new GPU options
* Updated CMakeLists to support new MGR capabilities
Co-authored-by: Ruipeng Li <li50@llnl.gov>
Co-authored-by: Daniel Osei-Kuffuor <oseikuffuor1@llnl.gov>
This PR adds automatic indentation using Artistic Style (astyle). The script config/astyle-apply.sh runs the indentation using the configuration file config/astylerc. The script also runs headers in all of the directories that automatically generate internal _hypre_*.h header files. Much of this was borrowed from the MFEM project. A pre-commit git hook was also added.
This PR adds the hypre_IntArray data structure, which wraps HYPRE_Int* and includes
memory location and size information. The CFMarker and DofFunc arrays in BoomerAMG
are wrapped with hypre_IntArray, and their default location is moved to the device when
using GPU acceleration, avoiding some copies between host and device.
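As an illustration of the idea, here is a minimal sketch of an integer array that carries its size and memory location alongside the raw pointer. The names `SketchIntArray`, `SketchMemoryLocation`, and the helper functions are hypothetical and are not hypre's actual definitions; allocation is host-only so the example runs without a GPU.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical memory-location tag (not hypre's actual enum). */
typedef enum { SKETCH_MEMORY_HOST, SKETCH_MEMORY_DEVICE } SketchMemoryLocation;

/* Hypothetical wrapper: raw integer data plus its size and location. */
typedef struct
{
   int                  *data;
   int                   size;
   SketchMemoryLocation  location;
} SketchIntArray;

/* Allocate a zero-initialized array on the host.
 * A GPU build would also offer a device allocation path; that is omitted here. */
SketchIntArray *SketchIntArrayCreate(int size)
{
   SketchIntArray *a;

   assert(size >= 0);
   a = (SketchIntArray *) malloc(sizeof(SketchIntArray));
   if (!a) { return NULL; }

   a->data = NULL;
   if (size > 0)
   {
      a->data = (int *) calloc((size_t) size, sizeof(int));
      if (!a->data) { free(a); return NULL; }
   }
   a->size     = size;
   a->location = SKETCH_MEMORY_HOST;
   return a;
}

/* A transfer is only needed when the array does not already live
 * where the consumer wants it. */
int SketchIntArrayNeedsCopy(const SketchIntArray *a, SketchMemoryLocation where)
{
   return a->location != where;
}

void SketchIntArrayDestroy(SketchIntArray *a)
{
   if (a) { free(a->data); free(a); }
}
```

Because the location travels with the array, a caller can check it before deciding whether a host/device transfer is required, which is how keeping the marker arrays on the device can avoid copies.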
This PR addresses initial changes necessary to perform MGR setup on GPUs. It also fixes a bug in the iterative use of AMG for F-relaxation, adds options to print statistics of the F-relaxation solver (for the V-cycle smoother), and places the diagonal of a transposed square matrix as the first element in the row. This last change required a couple of minor changes to the saved files.
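The diagonal-first row layout mentioned above can be sketched as follows for a generic CSR matrix. This is only an illustration under assumed plain-array storage (`row_ptr`, `col_idx`, `values`); the function name `SketchCSRDiagFirst` is hypothetical and this is not the routine used in the PR.

```c
#include <assert.h>

/* For each row of a square CSR matrix, swap the stored diagonal entry
 * into the first position of that row. The other entries keep their
 * relative order except for the one displaced by the swap.
 * Returns the number of rows that have no stored diagonal entry. */
int SketchCSRDiagFirst(int num_rows, const int *row_ptr,
                       int *col_idx, double *values)
{
   int i, j, missing = 0;

   assert(num_rows >= 0);
   for (i = 0; i < num_rows; i++)
   {
      int start = row_ptr[i];
      int end   = row_ptr[i + 1];
      int found = 0;

      for (j = start; j < end; j++)
      {
         if (col_idx[j] == i)
         {
            /* Swap entry j with the first entry of the row */
            int    c = col_idx[start];
            double v = values[start];
            col_idx[start] = col_idx[j];
            values[start]  = values[j];
            col_idx[j]     = c;
            values[j]      = v;
            found = 1;
            break;
         }
      }
      if (!found) { missing++; }
   }
   return missing;
}
```

With this layout, code that needs the diagonal of row i can read `values[row_ptr[i]]` directly instead of searching the row.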
- Put in a temporary fix to disable threading in MGR when extracting a sub-block
of a matrix and when building the interpolation operator.
- Rename function to build A_FF block.
and coarse grid solvers to 0 to avoid non-convergence error return.
- Fixed hard-coded number of blocks option for non-Galerkin coarse grid computation.
- Added block Gauss-Seidel relaxation and changed the function name
from 'hypre_block_jacobi' to 'hypre_blockRelax_solve'.
- Remove 'last_level' argument from hypre_MGRBuildInterp. Now users
can specify the interpolation type at all levels.
- Fixed some memory leaks when using full AMG for F-relaxation.
- Allow for different interpolation and restriction options for each MGR level.
- Add the number of functions for the Frelax V-cycle.
- Fixed a bug for Frelax V-cycle when used as a preconditioner. The RHS should
be obtained from the Solve phase, not the Setup phase.
- Optimize sparsity pattern of interpolation operator. Injection does not need
non-zero mapping for zero block.
- Set some default values for using AMG for F-relaxation to prevent crashes.
- New interface for setting C-F splitting for
matrices with block structure, i.e., the same variables
are ordered contiguously (s_1,s_2,...,s_n,p_1,p_2,...,p_n,...)
- Allow different methods for F-relaxation at different levels.
- Added a test file for MGR to test flow matrices coming from
geocentric.
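
To illustrate the block-structured ordering described in the C-F splitting item above, here is a minimal sketch that builds a C/F marker array when each variable's unknowns are stored contiguously (s_1,...,s_n,p_1,...,p_n,...). The function name `SketchBlockCFMarker`, its arguments, and the marker values (1 for C-points, -1 for F-points) are assumptions made for this example and are not the interface added in the commit.

```c
#include <assert.h>

/* Build a C/F marker for num_vars variables with points_per_var unknowns
 * each, stored contiguously: variable v owns the index range
 * [v * points_per_var, (v + 1) * points_per_var).
 * is_coarse_var[v] != 0 marks every unknown of variable v as a C-point.
 * Assumed convention: 1 = C-point, -1 = F-point. */
void SketchBlockCFMarker(int num_vars, int points_per_var,
                         const int *is_coarse_var, int *cf_marker)
{
   int v, k;

   assert(num_vars >= 0 && points_per_var >= 0);
   for (v = 0; v < num_vars; v++)
   {
      int marker = is_coarse_var[v] ? 1 : -1;
      int offset = v * points_per_var;

      for (k = 0; k < points_per_var; k++)
      {
         cf_marker[offset + k] = marker;
      }
   }
}
```

For example, with three variables of two unknowns each and only the second variable kept on the coarse grid, the marker array is F,F,C,C,F,F.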