This enhances what print_level can achieve in MGR. In particular, linear system info can now be dumped to files according to the print_level code. A sequence of linear systems can also be printed to file, which is useful when hypre is used in a time-stepping application.
A detailed list of changes is given below:
* Add utilities for creating/checking directories
* Add print_level codes to MGR and new info_path member
* Add hypre_MGRDataPrint
* Add call to hypre_MGRDataPrint and logic to update the print_level variable
* Update MGRSolve with new print_level logic
* Remove hypre_MGRWriteSolverParams
* Update documentation for HYPRE_MGRSetPrintLevel
* Implement new logic for HYPRE_MGR_PRINT_MODE_ASCII
This PR improves statistics reporting for MGR.
A list with detailed changes is given below:
* Add MatrixStats and MatrixStatsArray
* Add hypre_squared utility
* Fix divisor line location for Rectangular matrices
* Minor fix on hypre_squared definition
* Move nonzero variable definitions up
* Initialize global number of nonzeros at matrix creation
* Add hypre_ParCSRMatrixStatsArray and helper functions
* Add par_csr_matstats_device
* Add par_mgr_stats
* Print F-relax data only once
* Fix clang-13 build
* IJ driver now passes in print_level option to MGR
* GPU runs always require A_FF in MGR
* Add HYPRE_PRINT_SHIFTED_PARAM macro
* Add hypre_IntArraySetInterleavedValues (host/device implementations)
* Fix F-relaxation reporting + refactoring
* Update global number of nonzeros of the matrix
* Move new BoomerAMG functions to par_stats
* Apply astyle
This PR adds CUDA support to dense direct solver options (98, 99, 198, and 199) of BoomerAMG and MGR:
- Options 98 and 99 compute the LU factorization with pivoting.
- Options 198 and 199 compute the dense inverse matrix explicitly.
Detailed list of changes below:
* Add hypre_ParCSRMatrixToCSRMatrixAll_v2
* Add hypre_SeqVectorMigrate
* Add hypre_ParVectorToVectorAll_v2
* Refactor implementation of BoomerAMG's Gaussian Elimination
* Add hypre_GaussElimAllSetup and hypre_GaussElimAllSolve
* Add device support via MAGMA and cuSOLVER to BoomerAMG's LU coarsest linear solver (options 98, 99)
* Add device support via MAGMA and cuSOLVER to BoomerAMG's exact inverse solver (options 198, 199)
* Add wrappers to MAGMA's getrf and getrs
* Add MAGMA info on AMG stats + code formatting
* Add wrappers to cuSOLVER and cuBLAS functions
* Add wrapper hypre_magma_getri_nb
* Add header file for collecting hypre functors
* Add memory location to Gaussian elimination data structure
* Improve description of coarsest level solver options
* Update GE data structure in MGR
* Change Ainv to Awork
* Update documentation for clarity and clean up a few typos
* Add warning messages to FEI, ParaSails, PILUT, Euclid
* Improve and update GPU information
* Add CMake build information
This fixes an out-of-bounds memory error in `hypre_BoomerAMGSetup` when it is called multiple times without an intervening call to `hypre_BoomerAMGDestroy`. This pull request makes sure that `smooth_num_levels` is reset to `hypre_ParAMGDataSmoothNumLevels(amg_data)` before the smoothers variable is allocated.
This PR adds new Print and Read functions that store and read matrices and vectors in binary format. A detailed list of changes is given below:
* Add IJMatrix/ParCSRMatrix routines for binary I/O
* Add IJVector/ParVector routines for binary I/O
* Add typedefs for unsigned integer types and single-precision floating-point
* Change char sizes to HYPRE_MAX_FILE_NAME_LEN
* Add options to IJ driver for reading binary matrices/vectors
* Add regression tests for IJ input/output
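The general shape of a binary Print/Read pair can be sketched as follows. The file layout used here (two `uint64_t` header words followed by raw doubles) is an assumption made for illustration only and is not hypre's actual binary format; the function names are likewise hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical layout (NOT hypre's format):
 *   uint64_t count; uint64_t bytes_per_value; then `count` raw doubles. */
static int sketch_write_vector(const char *fname, const double *v, uint64_t n)
{
   uint64_t header[2];
   FILE    *fp;
   int      ok;

   assert(fname != NULL && (v != NULL || n == 0));
   fp = fopen(fname, "wb");
   if (!fp) { return -1; }
   header[0] = n;
   header[1] = (uint64_t) sizeof(double);
   ok = fwrite(header, sizeof(uint64_t), 2, fp) == 2 &&
        fwrite(v, sizeof(double), (size_t) n, fp) == (size_t) n;
   if (fclose(fp) != 0) { ok = 0; }
   return ok ? 0 : -1;
}

/* Returns a malloc'ed array (caller frees) and sets *n_out, or NULL on error. */
static double *sketch_read_vector(const char *fname, uint64_t *n_out)
{
   uint64_t header[2];
   double  *v = NULL;
   FILE    *fp;

   assert(fname != NULL && n_out != NULL);
   fp = fopen(fname, "rb");
   if (!fp) { return NULL; }
   /* Reject files whose value size does not match this build's double. */
   if (fread(header, sizeof(uint64_t), 2, fp) == 2 &&
       header[1] == (uint64_t) sizeof(double))
   {
      size_t n = (size_t) header[0];

      v = (double *) malloc((n ? n : 1) * sizeof(double));
      if (v && fread(v, sizeof(double), n, fp) != n)
      {
         free(v);
         v = NULL;
      }
      if (v) { *n_out = header[0]; }
   }
   fclose(fp);
   return v;
}
```

Storing the value size in the header lets a reader detect a precision mismatch early. The sketch does not handle differing byte order between writer and reader.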
Allow the use of MAGMA as the local linear solver for FSAI.
Add `HYPRE_FSAISetLocalSolveType` for choosing the local linear solve type used in FSAI and add `HYPRE_BoomerAMGSetFSAILocalSolveType` for the case when FSAI is used as a smoother to BoomerAMG.
This PR adds CUDA and HIP support to FSAI according to a static pattern generation algorithm. The resulting method can also be used as a preconditioner for BoomerAMG. A detailed list of changes is given below:
* Add par_fsai_device.c
* Add hypre_FSAIApply
* Add function to dump local linear systems in dense format
* Implement static FSAI pattern computation via powers of A
* Improve filtering of candidate pattern
* Improve local linear systems extraction
* Add option for a 125pt matrix (27pt squared)
* Add options to control sizes of the memory pools with umpire
* Add hypre_GpuProfiling calls
* Improve candidate pattern truncation times
* Add max_nnz_row member and its private and public functions to FSAI
* Use max_nnz_row in FSAISetupDevice
* Add num_levels member and its private and public functions to FSAI
* Add threshold member and its public/private functions to FSAI
* Expose FSAI algorithm type to BoomerAMG
* Expose options to control FSAI setup
* Add cuSOLVER variables and calls
* Add batched dense linear solver calls to FSAI
* Improve execution time for generating random numbers
* Show FSAI parameters when amg_print_level >= 1
* Improve output of FSAIPrintStats
* Implement warp calls
* Add hypre_mask type and hypre_ballot_sync wrapper function
* Add hypre_popc and hypre_ffs wrapper functions
* Implement warp_allreduce_max calls
* Change: hypreDevice -> hypre_*Device
* Add rocSOLVER calls
* Apply astyle
* Remove redundant line
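The "static pattern via powers of A" idea can be illustrated on a small dense 0/1 pattern: take the structural pattern of A raised to some power and keep its lower triangle as the candidate pattern for the FSAI factor. The sketch below assumes that reading of the approach, ignores filtering and truncation, and uses a hypothetical function name; it is a host-only toy, not the device implementation in par_fsai_device.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A and pat are n-by-n row-major arrays of 0/1 flags.
 * On return, pat holds the lower triangle of the structural pattern of
 * A^power (power >= 1). Returns 0 on success, -1 on allocation failure. */
static int sketch_fsai_static_pattern(int n, const unsigned char *A,
                                      int power, unsigned char *pat)
{
   size_t         nn = (size_t) n * (size_t) n;
   unsigned char *tmp;

   assert(n >= 0 && power >= 1 && A != NULL && pat != NULL);
   tmp = (unsigned char *) malloc(nn ? nn : 1);
   if (!tmp) { return -1; }

   /* pattern(A^1) */
   for (size_t idx = 0; idx < nn; idx++) { pat[idx] = A[idx] ? 1 : 0; }

   /* pattern(A^(p+1)) = boolean product of pattern(A^p) and pattern(A) */
   for (int p = 1; p < power; p++)
   {
      for (int i = 0; i < n; i++)
      {
         for (int j = 0; j < n; j++)
         {
            unsigned char hit = 0;
            for (int k = 0; k < n && !hit; k++)
            {
               hit = (unsigned char) (pat[i * n + k] && A[k * n + j]);
            }
            tmp[i * n + j] = hit;
         }
      }
      memcpy(pat, tmp, nn);
   }

   /* Keep only the lower triangle, including the diagonal. */
   for (int i = 0; i < n; i++)
   {
      for (int j = i + 1; j < n; j++) { pat[i * n + j] = 0; }
   }
   free(tmp);
   return 0;
}
```

For a tridiagonal A, power 1 keeps the diagonal and first subdiagonal, and power 2 adds the second subdiagonal, so each extra power widens the candidate pattern by one more layer of neighbors.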
Add warnings for Euclid and PILUT redirecting users to hypre-ILU.
Rewrite hypre-ILU overview section.
Add new sections to hypre-ILU documentation: "User-level functions", "ILU as smoother for BoomerAMG", and "GPU support".
Include info about new iterative ILU options.
Update BoomerAMG complex smoothers section.
Change name "hypre-ILU" to "ILU"
Modified from #718, this PR squashes out zero columns of the off-diagonal part of a `hypre_ParCSRMatrix`.
The issue was that offd can contain empty columns (columns with no nonzeros), which correspond to "useless" entries in col_map_offd. This caused problems on coarser grids in communications with a large number of ranks. We added a routine to compress the zero columns out and shorten col_map_offd. This should reduce communication cost even at higher levels.
Two sources of the empty columns have been located and fixed:
- Truncation after building P
- P^T(AP): only the transpose multiplication part.
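The compression step described above can be sketched on plain arrays: mark which local off-diagonal columns are referenced by a stored entry, compact the column map, and renumber the column indices. The function name and array types are hypothetical stand-ins, and the sketch treats every stored entry as a nonzero; it is not the routine added by this PR.

```c
#include <assert.h>
#include <stdlib.h>

/* offd_j  : local column index of each stored entry (length nnz),
 *           renumbered in place.
 * col_map : global column id of each local column (length ncols),
 *           compacted in place; relative order is preserved.
 * Returns the new number of columns, or -1 on allocation failure. */
static int sketch_squeeze_offd(int nnz, int *offd_j, int ncols, long long *col_map)
{
   int *new_id;
   int  kept = 0;

   assert(nnz >= 0 && ncols >= 0);
   new_id = (int *) calloc(ncols > 0 ? (size_t) ncols : 1, sizeof(int));
   if (!new_id) { return -1; }

   /* Pass 1: mark columns referenced by at least one stored entry. */
   for (int k = 0; k < nnz; k++)
   {
      assert(offd_j[k] >= 0 && offd_j[k] < ncols);
      new_id[offd_j[k]] = 1;
   }

   /* Pass 2: compact col_map and record old -> new local ids. */
   for (int c = 0; c < ncols; c++)
   {
      if (new_id[c])
      {
         col_map[kept] = col_map[c];
         new_id[c] = kept++;
      }
      else
      {
         new_id[c] = -1;   /* empty column, dropped */
      }
   }

   /* Pass 3: renumber the column indices of the stored entries. */
   for (int k = 0; k < nnz; k++) { offd_j[k] = new_id[offd_j[k]]; }

   free(new_id);
   return kept;
}
```

Because the surviving columns keep their relative order, a sorted column map stays sorted after compression, and a communication package built afterwards would only see columns that are actually used.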
---------
Co-authored-by: Noel Chalmers <noel.chalmers@gmail.com>
Co-authored-by: Ruipeng Li <li50@llnl.gov>
Co-authored-by: Wayne Mitchell <mitchell82@llnl.gov>
This PR adds HIP support to hypre_ILU (setup and solve phases):
- Algorithm type 0 (BJ-ILU0)
- Algorithm type 10 (GMRES-ILU0)
- Iterative triangular solves for backward and forward substitutions.
---------
Co-authored-by: Paul Mullowney <Paul.Mullowney@nrel.gov>
This allows users to direct hypre's error messages to a memory buffer instead of stderr. With this, there are now three basic ways to use hypre when it is configured with --with-print-errors:
- Default (mode 0): Errors are printed immediately to stderr (there is no processor information available in this print).
- Store errors in memory (mode 1) and call PrintErrorMessages to print them.
- Store errors in memory (mode 1) and call GetErrorMessages to manage the error messages however you like.
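The difference between the modes can be illustrated with a small stand-alone sketch: in mode 0 a message goes straight to stderr, while in mode 1 it is appended to a growing memory buffer that can later be printed or handed to the caller. All names below are hypothetical, and the buffer layout (NUL-terminated messages stored back to back) is an assumption of this sketch rather than a description of hypre's internals.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct
{
   int    mode;   /* 0: print to stderr immediately, 1: store in memory */
   char  *buf;    /* stored messages, each NUL-terminated, back to back */
   size_t size;   /* number of bytes used in buf */
} SketchErrLog;

/* Reports one error message according to the current mode. */
static void sketch_err_report(SketchErrLog *log, const char *msg)
{
   size_t len;
   char  *grown;

   assert(log != NULL && msg != NULL);
   if (log->mode == 0)
   {
      fprintf(stderr, "%s\n", msg);
      return;
   }
   len = strlen(msg) + 1;
   grown = (char *) realloc(log->buf, log->size + len);
   if (!grown) { return; }   /* message is dropped if memory runs out */
   memcpy(grown + log->size, msg, len);
   log->buf = grown;
   log->size += len;
}

/* Prints every stored message, one per line (the "print them" path). */
static void sketch_err_print(const SketchErrLog *log, FILE *out)
{
   for (size_t off = 0; off < log->size; off += strlen(log->buf + off) + 1)
   {
      fprintf(out, "%s\n", log->buf + off);
   }
}

/* Releases the stored messages. */
static void sketch_err_clear(SketchErrLog *log)
{
   free(log->buf);
   log->buf = NULL;
   log->size = 0;
}
```

A caller that wants full control (the third way above) would read `buf` and `size` directly instead of calling the print helper, for example to prefix each message with its own rank information.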
* Use unroll_factor=8 for rocm-5.4.3
* Add SortCSRRocsparse back
* Fix Wunused-variable warnings
* Set _hypre_memory_tracker to NULL after destroy
* Update tioga results after changing default rocm version to 5.2.0