Commit Graph

11604 Commits

Author SHA1 Message Date
Ruipeng Li
bec8645cf9
script option for runtest.sh (#632)
This PR changes runtest.sh to run an executable with a "script" and allows valgrind and mpibind on all platforms.
2022-05-19 10:10:03 -07:00
Daniel Osei-Kuffuor
63208e3e34
Hotfix for issues with dsuperlu in regression test. (#631)
Commented out unnecessary memory deallocation check.
2022-05-15 21:42:28 -07:00
Wayne Mitchell
dfdd1cd12f
Sycl par matmat (#611)
Further unification of GPU implementation across cuda/hip/sycl.
Implements the parallel matrix matrix product in sycl.
HYPRE_CUDA_LAUNCH and HYPRE_SYCL_LAUNCH macros have 
been unified under HYPRE_GPU_LAUNCH for kernel launches.
Replace HYPRE_SetSpGemmUseCusparse with HYPRE_SetSpGemmUseVendor.
2022-05-09 15:24:44 -07:00
Daniel Osei-Kuffuor
00d1dfd3f7
Mgr block jacobi (#607)
* Added new capabilities to allow multilevel assignment of solver options
* New (local) block Jacobi option for smoothers and intergrid operators
* Added capabilities to do CPR in MGR
* Updated non-Galerkin strategy for constructing the coarse grid.

Co-authored-by: Quan Bui <mquan.bui@gmail.com>
2022-05-09 08:30:05 -07:00
Victor A. Paludetto Magri
8017ce459b
Fix segfault on HYPRE_SStructGraphDestroy (#617)
This PR fixes a segmentation fault on HYPRE_SStructGraphDestroy. The error occurred when the number of graph entries added to the SStructGraph via HYPRE_SStructGraphAddEntries was larger than 1000.
2022-04-06 21:12:41 -07:00
Victor A. Paludetto Magri
70d055a994
Fix complex build (#616)
This PR fixes compilation of the "complex" build variant of hypre. It also adds hypre_csqrt for computing the square root of a HYPRE_Complex number. This function/macro works whether enable-complex is turned on or off.
2022-04-06 15:02:04 -07:00
Rob Falgout
4c5529810a Updating one missed copyright date in user manual 2022-04-05 16:40:02 -07:00
Victor A. Paludetto Magri
e16167fe46
Fix copyright (#615)
This PR updates Copyright headers from "Copyright 1998-2019 ..." to "Copyright (c) 1998 ..."
2022-04-05 16:19:51 -07:00
Victor A. Paludetto Magri
9415d6aa08
FSAI implementation on CPUs (#610)
This PR adds a factorized sparse approximate inverse (FSAI) implementation to hypre, which can be used as a standalone solver, as a preconditioner to Krylov methods, or as a complex smoother to BoomerAMG. In particular, we consider the adaptive version of the algorithm, where the sparsity pattern of the lower triangular factor G is built dynamically, i.e., during an iterative procedure that tries to find the best nonzero positions for a given row of G. This implementation was built on top of the IJ interface. It uses the diagonal portion of A for constructing G, i.e., it is a block-Jacobi method in the MPI sense. List of additional changes:

* Add caliper instrumentation to FSAI.
* Add ZeroGuess option to FSAI.
* Performance optimizations.
* Add OpenMP support to FSAI.
* Make internal BLAS/LAPACK functions thread-safe. 
* Update CMake build.
* Add new test cases: beam_tet_dof459_np1, beam_hex_dof459_np2, and beam_tet_dof2475_np4.
* Add documentation for FSAI.

Co-authored-by: Heather Switzer <switzer4@lassen36.coral.llnl.gov>
Co-authored-by: heatherms27 <hmswitzer@email.wm.edu>
Co-authored-by: Sarah Osborn <30503782+osborn9@users.noreply.github.com>
2022-04-05 11:18:39 -07:00
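As background for the FSAI commit above (#610), here is a minimal sketch of the factor G for a fixed sparsity pattern. It is not hypre's code: the commit describes an adaptive algorithm that grows the pattern row by row, whereas this sketch assumes the pattern is given up front. The names `solve_dense`, `static_fsai`, and `g_a_gt` are hypothetical helpers. For each row i, the sketch solves the small system A[P,P] y = e restricted to that row's pattern P, then scales y so that the diagonal of G A G^T equals 1.

```python
import math


def solve_dense(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[r]) + [b[r]] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def static_fsai(A, pattern):
    """Build lower-triangular G for an SPD matrix A (assumed, not checked).

    pattern[i] is a sorted list of column indices <= i that ends with i.
    """
    n = len(A)
    G = [[0.0] * n for _ in range(n)]
    for i, cols in enumerate(pattern):
        # Restrict A to the rows and columns in this row's pattern.
        sub = [[A[r][c] for c in cols] for r in cols]
        rhs = [0.0] * len(cols)
        rhs[-1] = 1.0  # unit vector for the diagonal position
        y = solve_dense(sub, rhs)
        scale = 1.0 / math.sqrt(y[-1])  # makes (G A G^T)[i][i] == 1
        for c, val in zip(cols, y):
            G[i][c] = val * scale
    return G


def g_a_gt(G, A):
    """Return the dense product G A G^T."""
    n = len(A)
    GA = [[sum(G[i][k] * A[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(GA[i][k] * G[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]

# Full lower-triangular pattern: G is the inverse Cholesky factor.
M_full = g_a_gt(static_fsai(A, [[0], [0, 1], [0, 1, 2]]), A)

# Diagonal-only pattern: G A G^T is not the identity, but its diagonal is 1.
M_diag = g_a_gt(static_fsai(A, [[0], [1], [2]]), A)
```

With the full lower-triangular pattern, G A G^T is the identity up to rounding. Sparser patterns trade accuracy for cost while keeping a unit diagonal.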
ulrikeyang
303457abae
fixed MM-multipass interpolation for case of no C-points (#606)
* fixed MM-multipass interpolation for case of no C-points

* fixed the issue of isolated groups of fine points and added a regression test.

* corresponding changes to the device code

Co-authored-by: Ruipeng Li <li50@llnl.gov>
2022-03-29 15:14:29 -07:00
Ruipeng Li
5fe37b2286
hypre_ParPrintf (#604)
This PR adds hypre_ParPrintf, which prints to standard output only from the first process in the communicator. Calls from other processes are ignored.
2022-03-21 09:10:01 -07:00
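The behavior described for hypre_ParPrintf (#604) can be sketched as follows. This is an illustration only: the log does not show the function's signature, so the hypothetical `par_printf` below takes the caller's rank directly instead of looking it up from an MPI communicator, and the `stream` parameter is an assumption added so that the output can be captured.

```python
import io
import sys


def par_printf(rank, fmt, *args, stream=None):
    """Write the formatted text only when rank == 0.

    Returns the number of characters written (0 on every other rank).
    """
    if rank != 0:
        return 0  # calls from other ranks are ignored
    out = sys.stdout if stream is None else stream
    text = fmt % args
    out.write(text)
    return len(text)


# Simulate four ranks making the same call; only rank 0 produces output.
buf = io.StringIO()
written = [par_printf(rank, "iterations = %d\n", 12, stream=buf)
           for rank in range(4)]
```

Routing every diagnostic through one such helper avoids repeating `if rank == 0` guards around print calls in driver code.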
Ruipeng Li
adfd07c509
Fix build on FreeBSD/powerpc*. (#603)
lr collides with lr from machine/frame.h header (link register):
Co-authored-by: Piotr Kubaj <pkubaj@FreeBSD.org>
2022-03-18 10:16:20 -07:00
Rob Falgout
fa43ea82e3
Bug fix in prefix sum for OpenMP IJ interface (#602)
This fixes a bug found in issue #522 for the prefix sum openmp code in IJ.
2022-03-16 10:56:09 -07:00
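The commit above (#602) does not say what the prefix-sum bug was. As general background, a threaded prefix sum is commonly organized in three phases, sketched here in serial Python with chunks standing in for OpenMP threads. The name `chunked_inclusive_scan` and the chunking scheme are assumptions for illustration, not hypre's code.

```python
from itertools import accumulate


def chunked_inclusive_scan(values, num_chunks):
    """Inclusive prefix sum computed chunk by chunk (num_chunks >= 1)."""
    n = len(values)
    # Chunk t owns indices bounds[t] .. bounds[t + 1] - 1 (possibly none).
    bounds = [n * t // num_chunks for t in range(num_chunks + 1)]

    # Phase 1: each chunk sums its own slice independently.
    totals = [sum(values[bounds[t]:bounds[t + 1]]) for t in range(num_chunks)]

    # Phase 2: an exclusive scan of the totals gives each chunk's offset.
    offsets = [0] + list(accumulate(totals))[:-1]

    # Phase 3: each chunk scans its slice, starting from its offset.
    out = [0] * n
    for t in range(num_chunks):
        running = offsets[t]
        for i in range(bounds[t], bounds[t + 1]):
            running += values[i]
            out[i] = running
    return out


values = [3, 1, 4, 1, 5, 9, 2, 6]
result = chunked_inclusive_scan(values, 3)
```

Phases 1 and 3 are independent across chunks. Phase 2 is the only step that needs all partial sums, so it is where threads must synchronize.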
Ruipeng Li
92faac9748
fix memory location (#600)
This PR fixes a number of memory location issues in memory copy and memset. It also adds stricter checking in memory.c in debug mode.
2022-03-14 11:19:28 -07:00
Victor A. Paludetto Magri
6fd043c9c2
(S)Struct IO on GPUs (#599)
This PR extends the (semi)-struct matrix/vector IO functions added in #583 with GPU support. Additionally:

* Fix regression tests on Lassen.
* Read data values into host memory
* Update Umatrix read algorithm when the ParCSRMatrix is expected to live on the device
* Reset deallocated pointers at hypre_IJMatrixDestroyParCSR to NULL
* Clone rownnz info if present on a CSRMatrix
* Reduce memory transfer and remove unused variables
* Fix bug with -print option
* Build rownnz info also when the ParCSRMatrix is in assembled state
* Remove a few instances of "return ierr"
* Refactor (s)struct IO - code works with cuda and without UM
* Add executables to gitignore
2022-03-13 20:14:23 -07:00
Ruipeng Li
f7787ab0ae
fixes coarsening.jobs.14 (#598)
a temporary "fix"
2022-03-11 18:25:03 -08:00
Ruipeng Li
8c344aee9a
Invalid assumption on exclusive_scan (#575)
This PR fixes a number of initialization problems with exclusive_scan on GPUs caused by invalid assumptions about this function.
2022-03-11 08:32:26 -08:00
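The commit above (#575) does not list which assumptions about exclusive_scan were invalid. As a reminder of the semantics, an exclusive scan writes the running sum before adding the current element, so the first output is the initial value and the grand total is not part of the output. The sketch below uses a hypothetical pure-Python `exclusive_scan` and hypothetical array names to show the usual consequence when building CSR-style row pointers.

```python
def exclusive_scan(values, init=0):
    """Return out where out[i] = init + sum(values[:i])."""
    out = []
    running = init
    for v in values:
        out.append(running)  # store the sum *before* adding v
        running += v
    return out


row_nnz = [2, 0, 3, 1]                # nonzeros per row (made-up data)
row_start = exclusive_scan(row_nnz)   # same length as the input

# The total is not produced by the scan; it has to be formed separately
# and appended to obtain the n + 1 entries of a row-pointer array.
total = row_start[-1] + row_nnz[-1]
row_ptr = row_start + [total]
```

Code that expects the last output to hold the total, or expects the output to have one more entry than the input, reads an uninitialized or wrong value.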
Paul T. Bauman
251cd3d269
Need -O1 instead of -O0 for HIP in debug mode (#588)
This PR changes -O0 in debug mode to -O1 with HIP (at this time).
2022-03-04 12:40:35 -08:00
Ruipeng Li
95e6433fc7
GPU support with single precision (#572)
This PR fixes the GPU support with single precision.
2022-03-04 12:05:32 -08:00
Ruipeng Li
ebd6eb88c3
bug fix; nonsquare rap (#581)
This PR fixes a corner case of the RAP routine for a RAP matrix that is globally square but not locally square.
2022-03-03 21:26:17 -08:00
Paul T. Bauman
04af9a4cd9
HYPRE_Int -> HYPRE_BigInt (#585) 2022-02-18 12:16:35 -08:00
Golam Rabbani
94070dd3a9
Updated CMakeLists.txt for SYCL (#577)
With CMake, enable CUDA stream by default when using SYCL.
2022-02-17 18:21:51 -08:00
Victor A. Paludetto Magri
33a5051398
Add SStruct IO functions (#583)
This PR adds support for native print/read functions of SStructMatrix and SStructVector. Other important changes are:
* Add public functions for reading StructMatrix and StructVector.
* Add a new set of regression tests called "io" to the TEST_sstruct folder.
2022-02-17 18:06:23 -08:00
Victor A. Paludetto Magri
49dbf7b60a
Fix cross-compilation problem (#580)
This PR fixes issue #556.

AC_CHECK_FILE was being used to test the existence of the .git folder. However, according to Autoconf manual, it does not work when cross-compiling. This PR implements another strategy for looking for the .git folder which works also when doing cross-compilation.
2022-02-16 07:55:02 -08:00
Ulrike Yang
e5a82e81e6 specified SYCL support 2022-02-10 17:12:32 -08:00
Rob Falgout
ccd135d8da Updating CHANGELOG 2022-02-10 10:18:32 -08:00
Rob Falgout
666f457d2b Bumping release number to 2.24.0 2022-02-10 07:05:43 -08:00
Rob Falgout
4ee737b53c Initial CHANGELOG update for new release 2022-02-10 07:02:26 -08:00
Ruipeng Li
ab72d05bd8
Deviceomp (#519)
This PR fixes the build with Kokkos + OMP offload, supports OMP offload without linking CUDA libraries, and supports OMP offload on Intel GPUs.
2022-02-09 06:40:57 -08:00
Ruipeng Li
8ba048c0b5
Forced regeneration of softlinks in shared library builds. (#574)
This PR (copied from #573) adds -f to softlink generation in all the makefiles.
2022-02-07 16:54:44 -08:00
Ruipeng Li
e40f8219a3 fix for last merged PR 2022-02-07 18:26:53 -06:00
Quan Bui
734a10fcb7
Mgr setup gpu (#400)
Enable GPU setup for MGR solver.
* Added device specific functionality for interpolation
* Made device and host calls to interpolation consistent
* Edited IJ driver to use GPU capable options for MGR
* Updated saved files for new GPU options
* Updated CMakeLists to support new MGR capabilities

Co-authored-by: Ruipeng Li <li50@llnl.gov>
Co-authored-by: Daniel Osei-Kuffuor <oseikuffuor1@llnl.gov>
2022-02-07 15:54:52 -08:00
Ruipeng Li
790e8e7826
fix cuda 11 build (#569)
This PR fixes CUDA 11 build after merging #549, also adds regression tests (build only) with CUDA 11.
2022-02-02 08:40:15 -08:00
Wayne Mitchell
a7bb784a45
SYCL support for AMG solve phase (#549)
This adds matvec, matrix transpose, and vector operations (axpy, inner product, etc.)
with sycl backend (via oneMKL and oneDPL) for running on Intel GPUs. Thus, the AMG
solve phase can now execute entirely on Intel GPUs.
2022-01-31 16:15:30 -08:00
Victor A. Paludetto Magri
b159c7dd58
Fortran interfaces (#566)
This PR adds Fortran interfaces for hypre_MGR and hypre_ILU. Additionally:

* Add ArrayArray types in `fortran.h`
* Add MGR and ILU options to the fortran interfaces for Krylov solvers
2022-01-31 15:32:14 -08:00
Denis Barbier
dcbee14539
Add convenient CMake alias (#563)
Usually consumers of HYPRE call find_package(HYPRE) and depend on
HYPRE::HYPRE target.  But they could also want to use HYPRE via
add_subdirectory (for instance via a git submodule) or FetchContent,
in which case they have to depend on HYPRE target.

This alias makes this usage more consistent, all users could then
depend on HYPRE::HYPRE.  See for instance
  https://cmake.org/pipermail/cmake/2018-November/068629.html
2022-01-25 12:06:47 -08:00
Ruipeng Li
84fa589671
Redwood sh update (#561)
This PR adds a minor update in runtest.sh for redwood
2022-01-19 21:28:46 -08:00
Ruipeng Li
4c3ef2a0b4
Fortran gpu (#470)
This PR adds GPU examples for FORTRAN users, examples ex5f.f and ex12f.f.
2022-01-19 21:24:05 -08:00
Ruipeng Li
ce54070d76
fixed .saved (#560)
This PR updates the smoother.saved.lassen/ray files that were not updated in #534.
2022-01-19 16:30:15 -08:00
Ruipeng Li
514c72be69
add reading x0 from parcsr file back (#548)
This PR adds build_x0_type == 7 (read from parcsr file) back.
2022-01-12 08:57:14 -08:00
Ruipeng Li
bcccb117ef
ldg only for sm >= 35 (#516)
This PR fixes compile issues with CUDA sm_30. See #511
2022-01-12 08:55:58 -08:00
Ruipeng Li
436e09cba2
Early break in CG Eig (#534)
This PR adds early break in CG for eigenvalue estimations.
2022-01-12 08:53:42 -08:00
Wayne Mitchell
a2daaf6722
Fix dof func bad read (#533)
Resolve issue #525. 
This fixes a bad memory access when max_levels == 1.
2021-11-23 15:57:11 -08:00
Wayne Mitchell
4232108a4d
Add SYCL support (#431)
This sets up basic infrastructure (e.g. memory management, device setup, etc.)
and implements the boxloops and structure solvers in sycl.
2021-11-22 16:54:22 -08:00
Ruipeng Li
1a1c7b663e
fix spmv buffer free for hip (#532)
This PR fixes a compile issue for HIP, introduced in #512.
Co-authored-by: Ruipeng Li <coe0141@redwood.cm.cluster>
2021-11-22 15:31:43 -08:00
Dan Ibanez
da32677ae0
improve cuSPARSE version conditional (#530)
This PR fixes cuSPARSE version conditional for CUSPARSE_SPMV_CSR_ALG2.
2021-11-19 12:21:26 -08:00
Dan Ibanez
f86b1f7d81
Link to Caliper's CMake package (#518)
if HYPRE_WITH_CALIPER is enabled,
then actually find the CMake package
for caliper and link to it.
Without this, include files are not
found and the caliper library is not
properly linked.
2021-11-17 15:28:16 -08:00
Rafal
83989186a2
Update CMake: RUNTIME DESTINATION (#528) 2021-11-17 12:04:23 -08:00
Ruipeng Li
66e1f2df45
fixed typo (#521)
This PR fixes typos introduced in #512.
2021-11-09 21:52:38 -08:00
Rob Falgout
805ee77be8
Adding source file indentation with astyle (#498)
This PR adds automatic indentation using Artistic Style (astyle).  The script config/astyle-apply.sh runs the indentation using the configuration file config/astylerc.  The script also runs headers in all of the directories that automatically generate internal _hypre_*.h header files.  Much of this was borrowed from the MFEM project.  A pre-commit git hook was also added.
2021-11-08 19:26:59 -08:00