Commit Graph

330 Commits

Author SHA1 Message Date
Wayne Mitchell
a84fe35022
SYCL triangular solves, Chebyshev relaxation, etc. (#972)
Adding more sycl functionality including chebyshev relaxation and triangular solves,
which in turn enables Gauss-Seidel, ILU, etc.
2023-10-11 15:05:21 -07:00
Victor A. P. Magri
b372b31a11
Change sh to bash (#900)
Change shell scripts from `#!/bin/sh` to `#!/bin/bash`
2023-08-16 20:09:43 -04:00
Rob Falgout
09d4bd8849
Update mac autotest to use a tolerance when diffing residual norms (#926)
2023-06-23 10:42:28 -07:00
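The idea behind #926, comparing residual norms within a tolerance instead of character by character, can be sketched as follows. This is a minimal Python illustration, not the autotest script itself: the function name `lines_match`, the default tolerance, and the sample output line are all assumptions made for the example.

```python
import math
import re

# Matches integers, decimals, and scientific notation such as 1.234e-08.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")


def lines_match(expected, actual, rel_tol=1e-6):
    """Return True if two lines agree, comparing numbers within rel_tol.

    Hypothetical helper: the name and the default tolerance are illustrative.
    """
    # The non-numeric text must be identical once the numbers are masked out.
    if NUMBER.sub("#", expected) != NUMBER.sub("#", actual):
        return False
    exp_nums = [float(tok) for tok in NUMBER.findall(expected)]
    act_nums = [float(tok) for tok in NUMBER.findall(actual)]
    if len(exp_nums) != len(act_nums):
        return False
    # Compare each pair of numbers with a relative tolerance.
    return all(
        math.isclose(e, a, rel_tol=rel_tol) for e, a in zip(exp_nums, act_nums)
    )


# Made-up sample lines that differ only in the last digit of the norm.
saved = "Final Relative Residual Norm = 1.234567e-09"
fresh = "Final Relative Residual Norm = 1.234568e-09"
print(lines_match(saved, fresh, rel_tol=1e-5))
```

A plain textual diff would flag the two sample lines as different, whereas a tolerance-based comparison accepts them as long as the relative difference stays below the chosen threshold.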
Rui Peng Li
72f5f3e136
Cuda versions (#879)
This PR adds support and regression tests for all the versions from CUDA 9.0 to 12.0.
2023-06-15 06:26:12 -07:00
Rui Peng Li
7d1d9ca95c
ame/ams to use Jacobi on GPUs (#924)
This PR updates the ams driver to use Jacobi smoothers on GPUs.
2023-06-14 13:25:59 -07:00
Rob Falgout
f478498295
New error handling feature to print messages to memory (#920)
This allows users to direct hypre's error messages to a memory buffer instead of stderr.  With this, there are now three basic ways to use hypre when configured --with-print-errors:
- Default (mode 0): Errors are printed immediately to stderr (there is no processor information available in this print).
- Store errors in memory (mode 1) and call PrintErrorMessages to print them.
- Store errors in memory (mode 1) and call GetErrorMessages to manage the error messages however you like.
2023-06-13 20:17:25 -07:00
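The difference between the two modes described in #920 can be illustrated with a small Python analogy. This is a sketch of the design only and does not use hypre's C API: the class `ErrorLog`, its method names, and the rank prefix added when printing stored messages are all assumptions made for illustration.

```python
import sys


class ErrorLog:
    """Toy error handler with an immediate mode (0) and a buffered mode (1).

    Hypothetical class: it mirrors the behavior described in the commit
    message, not hypre's actual implementation.
    """

    def __init__(self, mode=0, stream=None):
        self.mode = mode
        self.stream = stream if stream is not None else sys.stderr
        self.messages = []

    def report(self, message):
        if self.mode == 0:
            # Mode 0: print right away, with no processor information.
            print(message, file=self.stream)
        else:
            # Mode 1: keep the message in memory for later handling.
            self.messages.append(message)

    def get_messages(self):
        """Hand the stored messages to the caller and clear the buffer."""
        stored, self.messages = self.messages, []
        return stored

    def print_messages(self, rank=0):
        """Print the stored messages, prefixed with a rank (an assumption)."""
        for message in self.get_messages():
            print(f"{rank}: {message}", file=self.stream)


log = ErrorLog(mode=1)
log.report("example error A")
log.report("example error B")
print(log.get_messages())
```

Buffering the messages lets the caller decide when and where they appear, for example after a collective operation has completed.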
Wayne Mitchell
7ff7f2f60d
oneDPL fixes and Sunspot regressions (#905)
Fixes needed due to recent changes with oneDPL.
Move regression testing of sycl build to sunspot.
2023-05-30 10:23:01 -07:00
Rui Peng Li
ec86992c4b
Cuda12 (#871)
This PR adds support for CUDA 12.
2023-05-17 20:01:41 -07:00
Victor A. P. Magri
8b39b73a52
Fixes for Rocm 5.4.3 (#902)
* Use unroll_factor=8 for rocm-5.4.3
* Add SortCSRRocsparse back
* Fix Wunused-variable warnings
* Set _hypre_memory_tracker to NULL after destroy
* Update tioga results after changing default rocm version to 5.2.0
2023-05-11 09:05:26 -04:00
Rob Falgout
6907852618
Autoconf update (#765)
Updating to autoconf 2.71 for building the 'configure' script

Also updated 'config.guess' and 'config.sub' to the 2023 versions
2023-02-16 09:42:07 -08:00
Wayne Mitchell
a592bbd12b
Sycl matmat (#716)
Fixes for oneMKL sparse matmat and port of our custom spmv and spgemm routines to sycl. Note that this also involves significant updates to basic handling of kernel launches in sycl due to the need to support multi-dimensional kernels and the use of local shared memory.
2023-02-10 08:21:11 -08:00
Wayne Mitchell
c964884062
Update Tioga Regressions (#833)
Fix batch submission and update .saved files for Tioga regression testing.
2023-02-07 15:48:51 -08:00
Rob Falgout
98c4321fee
Modify check-license.sh to ignore '.png' files (#819)
2023-01-12 16:26:54 -08:00
Rob Falgout
45b6cdb1a0
Add mac regression test and modify runtest.sh for portability (#796)
The expression '\t' for <tab> does not work on the mac, so the whitespace
expressions were changed to use the posix standard '[[:space:]]'.
2022-12-21 08:34:48 -08:00
Victor A. Paludetto Magri
1cc7e81e21
Fix strict prototypes (#774)
This PR cleans the code for the warning Wstrict-prototypes. This flag was also added to the debug build of machine-tux.

Co-authored-by: Pierre Jolivet <pierre@joliv.et>
2022-12-14 11:13:20 -08:00
Wayne Mitchell
9bbdd9799f
SYCL regression testing (#778)
Several bug fixes and small changes for the sycl build. Addition of full regression testing on florentia with consistent and correct results for struct and ij tests with sycl backend.
2022-12-05 16:21:03 -08:00
Rui Peng Li
5546cc22d4
Runtime switch and memory tracker (#741)
This PR adds "runtime switch" feature to the solvers in hypre.

Co-authored-by: Victor A. P. Magri <paludettomag1@llnl.gov>
Co-authored-by: Wayne Mitchell <mitchell82@llnl.gov>
2022-10-07 08:39:00 -07:00
Ruipeng Li
5eb84ec1db
Fix GPU memory leak (#677)
This PR fixes a memory leak on GPUs.
2022-07-15 11:20:41 -07:00
Ruipeng Li
14ee602fbf
Regression (#668)
This PR updates regression test scripts and benchmark performance results.
2022-07-05 17:10:43 -07:00
Wayne Mitchell
6f3bccb92c
Sycl interp (#638)
This adds sycl support for interpolation options ExtInterp, ExtPIInterp,
and ExtPEInterp (which correspond to InterpType 6, 14, 16, 17, 18).
Generation of the strength matrix is also ported to sycl.
Further unification of cuda/hip/sycl kernel functions.
Adds regression tests for the sycl backend on arcticus including both ij and struct tests.
2022-07-05 16:10:36 -07:00
Ruipeng Li
e270c561b0
Spgemm (#639)
This PR includes optimizations for hypre's SpGEMM and ParSpGEMM kernels

Co-authored-by: Wayne Mitchell <mitchell82@llnl.gov>
Co-authored-by: Paul T. Bauman <ptbauman@gmail.com>
Co-authored-by: Sarah Osborn <30503782+osborn9@users.noreply.github.com>
2022-06-24 10:42:16 -07:00
Ruipeng Li
ef3f890d4b
Nvcollab (#591)
This PR contains various GPU optimizations made in collaboration with the NVIDIA team.

Co-authored-by: Peng Wang <penwang@nvidia.com>
2022-05-24 13:27:32 -07:00
Victor A. Paludetto Magri
e16167fe46
Fix copyright (#615)
This PR updates Copyright headers from "Copyright 1998-2019 ..." to "Copyright (c) 1998 ..."
2022-04-05 16:19:51 -07:00
Victor A. Paludetto Magri
9415d6aa08
FSAI implementation on CPUs (#610)
This PR adds a factorized sparse approximate inverse (FSAI) implementation to hypre, which can be used as a standalone solver, as a preconditioner for Krylov methods, or as a complex smoother for BoomerAMG. In particular, we consider the adaptive version of the algorithm, where the sparsity pattern of the lower triangular factor G is built dynamically, i.e., during an iterative procedure that tries to find the best nonzero positions for a given row of G. The implementation is built on top of the IJ interface. It uses the diagonal portion of A for constructing G, i.e., it is a block-Jacobi method in the MPI sense. List of additional changes:

* Add caliper instrumentation to FSAI.
* Add ZeroGuess option to FSAI.
* Performance optimizations.
* Add OpenMP support to FSAI.
* Make internal BLAS/LAPACK functions thread-safe. 
* Update CMake build.
* Add new test cases: beam_tet_dof459_np1, beam_hex_dof459_np2, and beam_tet_dof2475_np4.
* Add documentation for FSAI.

Co-authored-by: Heather Switzer <switzer4@lassen36.coral.llnl.gov>
Co-authored-by: heatherms27 <hmswitzer@email.wm.edu>
Co-authored-by: Sarah Osborn <30503782+osborn9@users.noreply.github.com>
2022-04-05 11:18:39 -07:00
Ruipeng Li
95e6433fc7
GPU support with single precision (#572)
This PR fixes the GPU support with single precision.
2022-03-04 12:05:32 -08:00
Ruipeng Li
ab72d05bd8
Deviceomp (#519)
This PR fixes the build with Kokkos + OMP offload, supports OMP offload without linking CUDA libraries, and supports OMP offload on Intel GPUs.
2022-02-09 06:40:57 -08:00
Ruipeng Li
790e8e7826
fix cuda 11 build (#569)
This PR fixes CUDA 11 build after merging #549, also adds regression tests (build only) with CUDA 11.
2022-02-02 08:40:15 -08:00
Ruipeng Li
4c3ef2a0b4
Fortran gpu (#470)
This PR adds GPU examples for FORTRAN users, examples ex5f.f and ex12f.f.
2022-01-19 21:24:05 -08:00
Rob Falgout
805ee77be8
Adding source file indentation with astyle (#498)
This PR adds automatic indentation using Artistic Style (astyle).  The script config/astyle-apply.sh runs the indentation using the configuration file config/astylerc.  The script also runs headers in all of the directories that automatically generate internal _hypre_*.h header files.  Much of this was borrowed from the MFEM project.  A pre-commit git hook was also added.
2021-11-08 19:26:59 -08:00
Rob Falgout
aadfd86de4
Remove '..' directory dependency in test Makefile (#487)
The 'test/Makefile' had the '..' directory in the include path, which caused the 'HYPRE_config.h' file to be included from two different places (in '..' and in 'install').  In the spack autotest, this caused a conflict.

* Check that 'git describe' works in autoconf build
* Updated versioncheck tests to work when 'git describe' fails
* Updated CMake build to work when 'git describe' fails
* Update autotest filters to ignore error messages from 'git describe'
2021-10-06 09:29:30 -07:00
Sarah Osborn
5622749ab2
Add clean-up phase to spack build in autotest (#481)
* Add clean-up phase to spack build in autotest
2021-10-04 18:17:52 -05:00
Ruipeng Li
a1b4dc0717
Multipass gpu (#490)
This PR (modified from #489) adds GPU support for multipass interpolations.

Co-authored-by: ulrikeyang <ulrikey@rzansel61.coral.llnl.gov>
Co-authored-by: Ulrike Yang <yang11@llnl.gov>
Co-authored-by: ulrikeyang <ulrikey@rzansel41.coral.llnl.gov>
Co-authored-by: ulrikeyang <ulrikey@rzansel19.coral.llnl.gov>
Co-authored-by: ulrikeyang <ulrikey@rzansel16.coral.llnl.gov>
Co-authored-by: ulrikeyang <ulrikey@rzansel46.coral.llnl.gov>
Co-authored-by: Paul T. Bauman <ptbauman@gmail.com>
Co-authored-by: Ruipeng Li <coe0141@redwood.cm.cluster>
2021-09-30 11:58:52 -07:00
Rob Falgout
22b1b8a513
Added more filtering to check-license.sh autotest script (#486)
2021-09-24 12:55:30 -07:00
Rob Falgout
408f361bd0
Add HYPRE_DEVELOP variables (#472)
This commit introduces three new variables to the 'HYPRE_config' file through both the autoconf and CMake builds. They are defined only when there is a '.git' directory present, and are otherwise left undefined.  These new variables may help users who work directly with the development branch of hypre to keep their code current and backward compatible with previous releases and also individual commits between those releases.  The new variables are:

HYPRE_DEVELOP_STRING - a string created from the 'git describe' command that indicates the last release tag, the number of commits beyond that last release, and the corresponding commit hash.
HYPRE_DEVELOP_NUMBER - the number of commits since the last release.
HYPRE_DEVELOP_BRANCH - defined only if the main development branch is being used, and is set to the name of that branch (currently master).

The commit also adds runtime regression tests for the variables in the 'src/test' directory.
2021-09-20 20:05:58 -07:00
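Since HYPRE_DEVELOP_STRING in #472 is built from `git describe`, its three parts (last release tag, commits since that tag, commit hash) can be recovered by splitting the usual `<tag>-<count>-g<hash>` form. The following Python sketch assumes that standard `git describe` layout; the function name `parse_describe` and the sample string are made up for illustration.

```python
import re

# Standard 'git describe' layout: <tag>-<commits since tag>-g<abbreviated hash>.
DESCRIBE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")


def parse_describe(text):
    """Split a 'git describe' string into tag, commit count, and hash.

    Hypothetical helper, not part of hypre.
    """
    text = text.strip()
    match = DESCRIBE.match(text)
    if match is None:
        # Exactly on a tag, 'git describe' prints only the tag name.
        return {"tag": text, "count": 0, "sha": None}
    return {
        "tag": match["tag"],
        "count": int(match["count"]),
        "sha": match["sha"],
    }


# Made-up example string in the usual format.
print(parse_describe("v2.22.1-45-g408f361bd"))
```

The `count` field plays the role the commit message assigns to HYPRE_DEVELOP_NUMBER, the number of commits since the last release.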
Ruipeng Li
8c9f41a4d0
GPU ams ame ads (#398)
This PR adds GPU support for ams, ame and ads, and for the following parcsr operations on GPUs: ParCSRAdd, ParCSRTranspose, and l1 hybrid G-S/SSOR.

Co-authored-by: Rob Falgout <rfalgout@llnl.gov>
2021-06-21 14:36:46 -07:00
Wayne Mitchell
5f8472b05c
Amgdd fixes (#386)
This removes the masked matvec routine previously used for CF L1 Jacobi relaxation in the AMG-DD solver. There was a bug present in the GPU code and the bsrxmv cusparse routine no longer supports our use-case as of cuda 11. In addition, appropriate regression test results were saved for the GPU implementation of AMG-DD.
2021-06-15 10:44:46 -07:00
Ruipeng Li
ad5d7e009f
Gpu mixedInt (#380)
This PR adds GPU support for mixedInt. 

Co-authored-by: Rob Falgout <rfalgout@llnl.gov>
2021-06-10 11:10:13 -07:00
Ruipeng Li
b88d965c16
renamed utilities/hypre_*.[c,h] to *.[c,h] (#385)
Renamed 'hypre_*' filenames in 'src/sstruct_ls' and 'utilities/' directories
Fixed AUTOTEST tests that filter on the renamed 'hypre_' files

Co-authored-by: Rob Falgout <rfalgout@llnl.gov>
2021-05-27 16:39:42 -07:00
Ruipeng Li
3bc7d267ef
Gpu default (#336)
This PR changes AMG defaults regarding GPUs at various places, adds regression tests on GPUs, simplifies CUDA boxloop implementations. 

Co-authored-by: Sarah Virginia Osborn <osborn9@llnl.gov>
Co-authored-by: PaulMullowney <pmullown@nrel.gov>
Co-authored-by: Daniel Osei-Kuffuor <oseikuffuor1@llnl.gov>
Co-authored-by: Ruipeng Li <li50@euler.llnl.gov>
Co-authored-by: Ruipeng Li <coe0141@redwood.cm.cluster>
2021-05-24 17:16:35 -07:00
Sarah Osborn
8a41a42c8d
Cmake cuda updates (#349)
Add CUDA options to CMake. 
* Add CUDA options to CMake, including adding CUDA_SRCS to each sub-directory's CMake list
* Changes so that HYPRE_config.h from CMake is consistent with configure
* Fix to license commenting style
* Add CUDA + CMake tests for building on lassen
2021-05-19 15:39:57 -05:00
Rob Falgout
5122196348
Adding filter to check-license test to ignore runtests-* files
2021-02-18 07:00:23 -08:00
Ruipeng Li
68f510c11b
Test jobs for enable-mixedint (#280)
This PR separates the jobs that break with --enable-mixedint (CGC, ParaSail, Euclid) from regular job scripts, so regression tests can selectively run jobs with --enable-mixedint. The new runtests<-option> files contain a list of tests for the runtest.sh script that can be run by passing <-option> to the 'run.sh' autotest script. This should enable more flexibility for building regression test suites.

We should also revisit the notes pointed out by @rfalgout at some point.

Co-authored-by: Rob Falgout <rfalgout@llnl.gov>
2021-02-17 20:24:58 -08:00
Rob Falgout
be18e595ae
Remove the global partition code from hypre (#273)
This PR removes the global partition code from hypre.
2021-02-08 15:16:29 -08:00
Rob Falgout
6f9260b67c
Add saved-file extension to runtest (#271)
This pull request adds a -save <ext> feature to the runtest.sh script to allow testing against different saved files on different platforms such as GPU machines. See Issue #255. A few additional things were done:

- All of the checks against the saved files were moved out of the individual tests and into runtest.sh.
- The output-file sanity checks that are in many of the tests were modified so they no longer depend on the saved files. Several issues were also uncovered and fixed.
2021-02-08 15:11:45 -08:00
Ramesh Pankajakshan
414fa671be
Umpire (#243)
This PR adds support for Umpire pooling allocators for host and GPU memory. When hypre is configured with --with-umpire, device and UVM allocations and deallocations are done with Umpire, whereas the host pool is not enabled by default. This PR also includes some other minor changes:

* Adding .gitignore to the repo
* Removing all malloc/calloc/realloc/free calls and adding regression testing that finds them
* No longer compiling ij.c with a C++ compiler; it goes back to being C code
* Introducing HYPRE_USING_GPU, which is equivalent to HYPRE_USING_CUDA || HYPRE_USING_DEVICE_OPENMP
* Adding a few user-level interfaces: HYPRE_SetMemoryLocation, HYPRE_SetExecutionPolicy, HYPRE_SetGPUMemoryPoolSize and HYPRE_CSRMatrixSetSpGemmUseCusparse

Co-authored-by: li50@llnl.gov <liruipengblue@gmail.com>
Co-authored-by: Rob Falgout <rfalgout@llnl.gov>
Co-authored-by: Ruipeng Li <li50@llnl.gov>
2021-02-03 12:31:25 -08:00
Ruipeng Li
2186a8fb34
triangular solve on GPUs; runcheck (#256)
This PR fixes triangular solve on GPUs, and runcheck.sh

Co-authored-by: Daniel Osei-Kuffuor <oseikuffuor1@llnl.gov>
2021-01-15 20:46:59 -08:00
Ruipeng Li
b49727f16b
Cuda triangular smoothers (#240)
* This commit adds CUDA-based smoothers for AMG based on the triangular parts of sparse matrices. This includes a Gauss-Seidel smoother (relax_type==3), which uses CUSPARSE triangular solvers to invert L. Symmetric Gauss-Seidel is implemented in relax_type==6, also via CUSPARSE. Finally, two new smoothers are added. The first is a two-stage approximation to Gauss-Seidel using a parallel MatVec and L (relax_type==11). The second (relax_type==12) is a less effective version of 11; it uses A_diag instead of L for the smoothing. CPU implementations of these new smoothers are also provided. For the two-stage algorithms, L and U are NOT explicitly created. This seems faster and saves memory. In the two-stage preconditioner, multiplying by invdiag rather than dividing by the diagonal reduces register pressure and yields full occupancy.
Co-authored-by: Paul Mullowney <pmullown@nrel.gov>
Co-authored-by: PaulMullowney <60452402+PaulMullowney@users.noreply.github.com>
2020-12-17 19:37:59 -08:00
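The two-stage idea in #240 can be illustrated with a small dense example. The commit message does not give the formula, so this Python sketch assumes one common formulation: the triangular solve with (D + L) is replaced by a single inner Jacobi step, z0 = D^{-1} r followed by z1 = D^{-1} (r - L z0). It may differ in detail from what relax_type==11 does. The function name and the test matrix are made up.

```python
def two_stage_gs_sweep(A, b, x):
    """One sweep of a two-stage Gauss-Seidel approximation (assumed form).

    A is a dense list-of-lists matrix; D is its diagonal and L its strictly
    lower triangular part. Neither is stored separately: both are read
    directly from A, echoing the remark that L and U are not created.
    """
    n = len(b)
    # Precompute 1/diag so the sweep multiplies instead of divides.
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    # Residual r = b - A x (a plain matrix-vector product).
    r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Stage 1: Jacobi approximation z0 = D^{-1} r.
    z0 = [inv_diag[i] * r[i] for i in range(n)]
    # Stage 2: correct with the lower triangle, z1 = D^{-1} (r - L z0).
    z1 = [
        inv_diag[i] * (r[i] - sum(A[i][j] * z0[j] for j in range(i)))
        for i in range(n)
    ]
    return [x[i] + z1[i] for i in range(n)]


# Small diagonally dominant test system whose exact solution is [1, 1, 1].
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [3.0, 2.0, 3.0]
x = [0.0, 0.0, 0.0]
for _ in range(50):
    x = two_stage_gs_sweep(A, b, x)
print(x)
```

Unlike an exact triangular solve, each stage here is a matrix-vector style operation with no sequential dependency between rows, which is what makes this kind of smoother attractive on GPUs.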
Luke
22f4d3f8c6
Cuda 11 API (#163)
This PR adds CUDA-11 support.
2020-11-05 20:57:57 -08:00
Ruipeng Li
aaf5aa564a
Aggressive coarsening and 2-stage MM-ext Interpolations on GPUs (#195)
This PR contains the following changes:
* Aggressive coarsening, i.e., 2nd SoC on GPUs
* 2-stage MM-ext Interpolations (MM-ext, MM-ext+e) on GPUs
* Enhanced abilities of extracting strong FF/FC/CF/CC submatrix with given SoC matrix
* Bug fix in device PMIS
Co-authored-by: Bjorn Sjogreen <sjogreen2@llnl.gov>
Co-authored-by: ulrikeyang <yang11@llnl.gov>
2020-09-23 17:13:23 -07:00
Ruipeng Li
1c0598626c
Merge branch 'master' of https://github.com/hypre-space/hypre into PETScFix
2020-08-27 20:12:13 -07:00