Chip Kerchner
|
ab8725d947
|
Turn off vectorize version of rsqrt - doesn't match generic version
|
2023-01-27 18:28:54 +00:00 |
|
Charles Schlosser
|
6d9f662a70
|
Tweak atan2
|
2023-01-26 17:38:21 +00:00 |
|
Chip Kerchner
|
6fc9de7d93
|
Fix slowdown in bfloat16 MMA when rows is not a multiple of 8 or columns is not a multiple of 4.
|
2023-01-25 18:22:20 +00:00 |
|
Charles Schlosser
|
7f58bc98b1
|
Refactor sparse
|
2023-01-23 17:55:50 +00:00 |
|
Rasmus Munk Larsen
|
576448572f
|
More fixes for __GNUC_PATCHLEVEL__.
|
2023-01-23 17:04:24 +00:00 |
|
Rasmus Munk Larsen
|
164ddf75ab
|
Use __GNUC_PATCHLEVEL__ rather than __GNUC_PATCH__, according to the documentation https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html
|
2023-01-23 16:56:14 +00:00 |
|
Charles Schlosser
|
5a7ca681d5
|
Fix sparse insert
|
2023-01-20 21:32:32 +00:00 |
|
Antonio Sánchez
|
08c961e837
|
Add custom ODR-safe assert.
|
2023-01-20 17:38:13 +00:00 |
|
Sean McBride
|
d70b4864d9
|
issue #2581: review and cleanup of compiler version checks
|
2023-01-17 18:58:34 +00:00 |
|
Mehdi Goli
|
b523120687
|
[SYCL-2020 Support] Enabling Intel DPCPP Compiler support to Eigen
|
2023-01-16 07:04:08 +00:00 |
|
tttapa
|
bae119bb7e
|
Support per-thread is_malloc_allowed() state
|
2023-01-16 01:34:56 +00:00 |
|
Charles Schlosser
|
fa0bd2c34e
|
improve sparse permutations
|
2023-01-15 03:21:25 +00:00 |
|
Antonio Sánchez
|
2e61c0c6b4
|
Add missing EIGEN_DEVICE_FUNC in a few places when called by asserts.
|
2023-01-15 02:06:17 +00:00 |
|
Charles Schlosser
|
4aca06f63a
|
avoid move assignment in ColPivHouseholderQR
|
2023-01-15 01:34:10 +00:00 |
|
Charles Schlosser
|
68082b8226
|
Fix QR, again
|
2023-01-13 03:23:17 +00:00 |
|
Sergey Fedorov
|
4d05765345
|
Altivec fixes for Darwin: do not use unsupported VSX insns
|
2023-01-12 16:33:33 +00:00 |
|
Rasmus Munk Larsen
|
6156797016
|
Revert "Add template to specify QR permutation index type, Fix ColPivHouseholderQR Lapacke bindings"
This reverts commit be7791e097
|
2023-01-11 18:50:52 +00:00 |
|
Charles Schlosser
|
be7791e097
|
Add template to specify QR permutation index type, Fix ColPivHouseholderQR Lapacke bindings
|
2023-01-11 15:57:28 +00:00 |
|
Charles Schlosser
|
9463fc95f4
|
change insert strategy
|
2023-01-11 06:24:49 +00:00 |
|
Martin Burchell
|
c54785b071
|
Fix error: unused parameter 'tmp' [-Werror,-Wunused-parameter] on clang/32-bit arm
|
2023-01-10 21:15:28 +00:00 |
|
Charles Schlosser
|
81172cbdcb
|
Overhaul Sparse Core
|
2023-01-07 22:09:42 +00:00 |
|
Chip Kerchner
|
d20fe21ae4
|
Improve performance for Power10 MMA bfloat16 GEMM
|
2023-01-06 23:08:37 +00:00 |
|
Ryan Senanayake
|
fe7f527787
|
Fix guard macros for emulated FP16 operators on GPU
|
2023-01-06 22:02:51 +00:00 |
|
Antonio Sánchez
|
262194f12c
|
Fix a bunch of minor build and test issues.
|
2023-01-06 16:37:26 +00:00 |
|
Antonio Sánchez
|
3564668908
|
Fix overalign check.
|
2023-01-05 17:10:48 +00:00 |
|
Charles Schlosser
|
f3929ac7ed
|
Fix EIGEN_HAS_CXX17_OVERALIGN for icc
|
2023-01-03 17:30:10 +00:00 |
|
Charles Schlosser
|
a8bab0d8ae
|
Patch SparseLU
|
2022-12-31 04:52:36 +00:00 |
|
Arthur
|
311cc0f9cc
|
Enable NEON pcmp, plset, and complex psqrt
|
2022-12-22 05:38:34 +00:00 |
|
Antonio Sánchez
|
dbf7ae6f9b
|
Fix up C++ version detection macros and cmake tests.
|
2022-12-20 18:06:03 +00:00 |
|
Antonio Sánchez
|
bb6675caf7
|
Fix incorrect NEON native fp16 multiplication.
|
2022-12-19 20:46:44 +00:00 |
|
Rasmus Munk Larsen
|
dd85d26946
|
Revert "Avoid mixing types in CompressedStorage.h"
|
2022-12-19 20:09:37 +00:00 |
|
Arthur Feeney
|
c4fb6af24b
|
Enable NEON pabs for unsigned int types
|
2022-12-19 17:07:36 +00:00 |
|
Rasmus Munk Larsen
|
04e4f0bb24
|
Add missing colon in SparseMatrix.h.
|
2022-12-16 21:50:00 +00:00 |
|
Rasmus Munk Larsen
|
3d8a8def8a
|
Avoid mixing types in CompressedStorage.h
|
2022-12-16 20:11:02 +00:00 |
|
Charles Schlosser
|
4bb2446796
|
Add operators to CompressedStorageIterator
|
2022-12-16 16:48:50 +00:00 |
|
Alexander Richardson
|
37de432907
|
Avoid using std::raise() for divide by zero
|
2022-12-14 20:06:16 +00:00 |
|
Alexander Richardson
|
62de593c40
|
Allow std::initializer_list constructors in constexpr expressions
|
2022-12-14 17:05:37 +00:00 |
|
Charles Schlosser
|
6d3e3678b4
|
optimize equalspace packetop
|
2022-12-13 01:22:25 +00:00 |
|
Charles Schlosser
|
2004831941
|
add EqualSpaced / setEqualSpaced
|
2022-12-13 00:54:57 +00:00 |
|
Melven Roehrig-Zoellner
|
273f803846
|
Add BDCSVD_LAPACKE binding
|
2022-12-09 18:50:12 +00:00 |
|
Antonio Sánchez
|
03c9b4738c
|
Enable direct access for NestByValue.
|
2022-12-07 18:21:45 +00:00 |
|
Chip Kerchner
|
b59f18b4f7
|
Increase L2 and L3 cache size for Power10.
|
2022-12-07 18:20:33 +00:00 |
|
Charles Schlosser
|
44fe539150
|
add sparse sort inner vectors function
|
2022-12-01 19:28:56 +00:00 |
|
Lianhuang Li
|
d194167149
|
Fix the bug using neon instruction fmla for data type half
|
2022-12-01 17:28:57 +00:00 |
|
Pedro Caldeira
|
31ab62d347
|
Add support for Power10 (AltiVec) MMA instructions for bfloat16.
|
2022-11-30 23:33:37 +00:00 |
|
Antonio Sánchez
|
dcb042a87d
|
Fix serialization for non-compressed matrices.
|
2022-11-30 18:16:47 +00:00 |
|
Antonio Sánchez
|
2260e11eb0
|
Fix reshape strides when input has non-zero inner stride.
|
2022-11-29 19:39:29 +00:00 |
|
Alexandre Hoffmann
|
23524ab6fc
|
Changing BiCGSTAB parameters initialization so that it works with custom types
|
2022-11-29 19:37:46 +00:00 |
|
Antonio Sánchez
|
ab2b26fbc2
|
Fix sparseLU solver when destination has a non-unit stride.
|
2022-11-29 19:37:03 +00:00 |
|
Antonio Sánchez
|
e7b1ad0315
|
Add serialization for sparse matrix and sparse vector.
|
2022-11-21 19:43:07 +00:00 |
|
Charles Schlosser
|
044f3f6234
|
Fix bug in handmade_aligned_realloc
|
2022-11-18 22:35:31 +00:00 |
|
Charles Schlosser
|
02805bd56c
|
Fix AVX2 psignbit
|
2022-11-16 13:43:11 +00:00 |
|
Chip Kerchner
|
399ce1ed63
|
Fix duplicate execution code for Power 8 Altivec in pstore_partial.
|
2022-11-16 13:41:42 +00:00 |
|
Gabriele Buondonno
|
6431dfdb50
|
Cross product for vectors of size 2. Fixes #1037
|
2022-11-15 22:39:42 +00:00 |
|
Antonio Sánchez
|
8588d8c74b
|
Correct pnegate for floating-point zero.
|
2022-11-15 18:07:23 +00:00 |
|
Antonio Sanchez
|
5eacb9e117
|
Put brackets around unsigned type names.
|
2022-11-15 09:09:45 -08:00 |
|
Antonio Sánchez
|
37e40dca85
|
Fix ambiguity in PPC for vec_splats call.
|
2022-11-14 18:58:16 +00:00 |
|
Antonio Sánchez
|
7dc6db75d4
|
Fix typo in CholmodSupport
|
2022-11-08 23:49:56 +00:00 |
|
Charles Schlosser
|
9b6d624eab
|
fix neon
|
2022-11-08 20:03:01 +00:00 |
|
Rasmus Munk Larsen
|
7e398e9436
|
Add missing return keyword in psignbit for NEON.
|
2022-11-04 16:13:09 +00:00 |
|
Charles Schlosser
|
82b152dbe7
|
Add signbit function
|
2022-11-04 00:31:20 +00:00 |
|
Antonio Sánchez
|
8f8e36458f
|
Remove recently added sparse assert in SparseMapBase.
|
2022-11-03 17:29:05 +00:00 |
|
Antonio Sanchez
|
01a31b81b2
|
Remove unused parameter name.
|
2022-11-01 15:51:25 -07:00 |
|
Antonio Sánchez
|
c5b896c5a3
|
Allow empty matrices to be resized.
|
2022-10-27 20:33:35 +00:00 |
|
Antonio Sánchez
|
886aad1361
|
Disable patan for double on PPC.
|
2022-10-27 17:56:08 +00:00 |
|
Antonio Sánchez
|
ab407b2b6e
|
Fix handmade_aligned_malloc offset computation.
|
2022-10-27 17:33:47 +00:00 |
|
Antonio Sánchez
|
adb30efb25
|
Add assert for invalid outerIndexPtr array in SparseMapBase.
|
2022-10-26 22:51:33 +00:00 |
|
Antonio Sánchez
|
c27d1abe46
|
Fix pragma check for disabling fastmath.
|
2022-10-26 22:50:57 +00:00 |
|
Charles Schlosser
|
a226371371
|
Change handmade_aligned_malloc/realloc/free to store a 1 byte offset instead of absolute address
|
2022-10-22 22:51:31 +00:00 |
|
Antonio Sánchez
|
bf48d46338
|
Explicitly state that indices must be sorted.
|
2022-10-19 18:15:29 +00:00 |
|
Rasmus Munk Larsen
|
3bb6a48d8c
|
Fix bug atan2
|
2022-10-12 23:49:32 +00:00 |
|
Rasmus Munk Larsen
|
14c847dc0e
|
Refactor special values test for pow, and add a similar test for atan2
|
2022-10-12 20:12:08 +00:00 |
|
Rasmus Munk Larsen
|
462758e8a3
|
Don't use generic sign function for sign(complex) unless it is vectorizable
|
2022-10-12 16:03:29 +00:00 |
|
Rasmus Munk Larsen
|
c0d6a72611
|
Use pnegate(pzero(x)) as a generic way to generate -0.0. Some compilers do not handle the literal -0.0 properly in fastmath mode.
|
2022-10-12 01:57:05 +00:00 |
|
Laurent Rineau
|
7846c7387c
|
Eigen/Sparse: fix warnings -Wunused-but-set-variable
|
2022-10-11 17:37:04 +00:00 |
|
Rasmus Munk Larsen
|
3167544873
|
Handle NaN inputs to atan2.
|
2022-10-10 19:36:36 -07:00 |
|
Rasmus Munk Larsen
|
72db3f0fa5
|
Remove references to M_PI_2 and M_PI_4.
|
2022-10-11 00:27:16 +00:00 |
|
Rasmus Munk Larsen
|
5ceed0d57f
|
Guard GCC-specific pragmas with "#ifdef EIGEN_COMP_GNUC"
|
2022-10-10 20:38:53 +00:00 |
|
Rasmus Munk Larsen
|
e95c4a837f
|
Simpler range reduction strategy for atan<float>().
|
2022-10-04 18:11:00 +00:00 |
|
Antonio Sánchez
|
80efbfdeda
|
Unconditionally enable CXX11 math.
|
2022-10-04 17:37:47 +00:00 |
|
Antonio Sánchez
|
e5794873cb
|
Replace assert with eigen_assert.
|
2022-10-04 17:11:23 +00:00 |
|
Antonio Sánchez
|
7d6a9925cc
|
Fix 4x4 inverse when compiling with -Ofast.
|
2022-10-04 16:05:49 +00:00 |
|
Rasmus Munk Larsen
|
1414a76fa9
|
Only vectorize atan<double> for Altivec if VSX is available.
|
2022-10-03 22:06:58 +00:00 |
|
Rasmus Munk Larsen
|
c475228b28
|
Vectorize atan() for double.
|
2022-10-01 01:49:30 +00:00 |
|
Rasmus Munk Larsen
|
1e1848fdb1
|
Add a vectorized implementation of atan2 to Eigen.
|
2022-09-28 20:46:49 +00:00 |
|
Rasmus Munk Larsen
|
b3bf8d6a13
|
Try to reduce size of GEBP kernel for non-ARM targets.
|
2022-09-28 02:37:18 +00:00 |
|
Rasmus Munk Larsen
|
13b69fc1b0
|
Try to reduce compilation time/memory for GEBP kernel using EIGEN_IF_CONSTEXPR
|
2022-09-23 20:09:42 +00:00 |
|
Rasmus Munk Larsen
|
ed8cda3ce4
|
Move EIGEN_NEON_GEBP_NR macro to the right place in GeneralBlockPanelKernel.h
|
2022-09-23 02:24:27 +00:00 |
|
Rasmus Munk Larsen
|
e2ea866515
|
Add a macro to set the nr trait in the GEBP kernel for NEON.
|
2022-09-22 23:56:34 +00:00 |
|
Lianhuang Li
|
23299632c2
|
Use 3px8/2px8/1px8/1x8 gebp_kernel on arm64-neon
|
2022-09-21 16:36:40 +00:00 |
|
Rasmus Munk Larsen
|
7b2901e2aa
|
Add vectorized integer division for int32 with AVX512, AVX or SSE.
|
2022-09-21 00:27:23 +00:00 |
|
Rasmus Munk Larsen
|
f913a40678
|
Revert "Add AVX int32_t pdiv"
This reverts commit ea84e7ad63
|
2022-09-16 22:48:08 +00:00 |
|
Rasmus Munk Larsen
|
273e0c884e
|
Revert "Add constexpr, test for C++14 constexpr."
|
2022-09-16 21:14:29 +00:00 |
|
Charles Schlosser
|
ea84e7ad63
|
Add AVX int32_t pdiv
|
2022-09-16 17:06:29 +00:00 |
|
Rasmus Munk Larsen
|
afc014f1b5
|
Allow mixed types for pow(), as long as the exponent is exactly representable in the base type.
|
2022-09-12 21:55:30 +00:00 |
|
Rasmus Munk Larsen
|
e8a2aa24a2
|
Fix a couple of issues with unary pow():
|
2022-09-09 17:21:11 +00:00 |
|
Rohit Santhanam
|
07d0759951
|
[ROCm] Fix for sparse matrix related breakage on ROCm.
|
2022-09-09 14:41:00 +00:00 |
|
Antonio Sánchez
|
fb212c745d
|
Fix g++-6 constexpr and c++20 constexpr build errors.
|
2022-09-09 03:41:45 +00:00 |
|
Thomas Gloor
|
ec9c7163a3
|
Feature/skew symmetric matrix3
|
2022-09-08 20:44:40 +00:00 |
|
Antonio Sánchez
|
311ba66f7c
|
Fix realloc for non-trivial types.
|
2022-09-08 19:39:36 +00:00 |
|
Rasmus Munk Larsen
|
f9dfda28ab
|
Add missing comparison operators for GPU packets.
|
2022-09-07 21:13:45 +00:00 |
|
Tobias Schlüter
|
133498c329
|
Add constexpr, test for C++14 constexpr.
|
2022-09-07 03:42:34 +00:00 |
|
Antonio Sanchez
|
3e44f960ed
|
Reduce compiler warnings for tests.
|
2022-09-06 18:20:56 +00:00 |
|
Florian Richer
|
b7e21d4e38
|
Call check_that_malloc_is_allowed() in aligned_realloc()
|
2022-09-06 18:00:37 +00:00 |
|
Michael Palomas
|
525f066671
|
fixed msvc compilation error in GeneralizedEigenSolver.h
|
2022-09-04 17:50:43 +00:00 |
|
Antonio Sánchez
|
f241a2c18a
|
Add asserts for index-out-of-bounds in IndexedView.
|
2022-09-02 17:28:03 +00:00 |
|
Antonio Sánchez
|
30c42222a6
|
Fix some test build errors in new unary pow.
|
2022-08-30 17:24:14 +00:00 |
|
Rasmus Munk Larsen
|
bd393e15c3
|
Vectorize acos, asin, and atan for float.
|
2022-08-29 19:49:33 +00:00 |
|
Charles Schlosser
|
e5af9f87f2
|
Vectorize pow for integer base / exponent types
|
2022-08-29 19:23:54 +00:00 |
|
chuckyschluz
|
8acbf5c11c
|
re-enable pow for complex types
|
2022-08-26 17:29:02 -04:00 |
|
Rasmus Munk Larsen
|
7064ed1345
|
Specialize psign<Packet8i> for AVX2, don't vectorize psign<bool>.
|
2022-08-26 17:02:37 +00:00 |
|
Rasmus Munk Larsen
|
98e51c9e24
|
Avoid undefined behavior in array_cwise test due to signed integer overflow
|
2022-08-26 16:19:03 +00:00 |
|
Arthur
|
a7c1cac18b
|
Fix GeneralizedEigenSolver::info() and Asserts
|
2022-08-25 22:05:04 +00:00 |
|
Antonio Sanchez
|
714678fc6c
|
Add missing ptr in realloc call.
|
2022-08-24 22:04:04 -07:00 |
|
Charles Schlosser
|
b2a13c9dd1
|
Sparse Core: Replace malloc/free with conditional_aligned
|
2022-08-23 21:44:22 +00:00 |
|
Rasmus Munk Larsen
|
6aad0f821b
|
Fix psign for unsigned integer types, such as bool.
|
2022-08-22 20:19:35 +00:00 |
|
Rasmus Munk Larsen
|
1a09defce7
|
Protect new pblend implementation with EIGEN_VECTORIZE_AVX2
|
2022-08-22 18:28:03 +00:00 |
|
Rasmus Munk Larsen
|
7c67dc67ae
|
Use proper double word division algorithm for pow<double>. Gives 11-15% speedup.
|
2022-08-17 18:36:23 +00:00 |
|
Matthew Sterrett
|
7a3b667c43
|
Add support for AVX512-FP16 for vectorizing half precision math
|
2022-08-17 18:15:21 +00:00 |
|
Charles Schlosser
|
76a669fb45
|
add fixed power unary operation
|
2022-08-16 21:32:36 +00:00 |
|
Matthew Sterrett
|
39fcc89798
|
Removed unnecessary checks for FP16C
|
2022-08-16 18:14:41 +00:00 |
|
Romain Biessy
|
2f7cce2dd5
|
[SYCL] Fix some SYCL tests
|
2022-08-16 17:37:54 +00:00 |
|
Arthur
|
27367017bd
|
Disable bad "deprecated warning" edge-case in BDCSVD
|
2022-08-11 18:43:31 +00:00 |
|
Lexi Bromfield
|
66ea0c09fd
|
Don't double-define Half functions on aarch64
|
2022-08-09 20:00:34 +00:00 |
|
Rasmus Munk Larsen
|
97e0784dc6
|
Vectorize the sign operator in Eigen.
|
2022-08-09 19:54:57 +00:00 |
|
Arthur
|
be20207d10
|
Fix vectorized Jacobi Rotation
|
2022-08-08 19:29:56 +00:00 |
|
Rasmus Munk Larsen
|
7a87ed1b6a
|
Fix code and unit test for a few corner cases in vectorized pow()
|
2022-08-08 18:48:36 +00:00 |
|
Chip Kerchner
|
9e0afe0f02
|
Fix non-VSX PowerPC build
|
2022-08-08 18:18:17 +00:00 |
|
Chip Kerchner
|
84a9d6fac9
|
Fix use of Packet2d type for non-VSX.
|
2022-08-03 20:48:13 +00:00 |
|
Chip Kerchner
|
ce60a7be83
|
Partial Packet support for GEMM real-only (PowerPC). Also fix compilation warnings & errors for some conditions in new API.
|
2022-08-03 18:15:19 +00:00 |
|
Antonio Sánchez
|
5a1c7807e6
|
Fix inner iterator for sparse block.
|
2022-08-03 17:26:12 +00:00 |
|
Antonio Sánchez
|
7896c7dc6b
|
Use numext::sqrt in ConjugateGradient.
|
2022-07-29 20:17:23 +00:00 |
|
Ilya Tokar
|
e618c4a5e9
|
Improve pblend AVX implementation
|
2022-07-29 18:45:33 +00:00 |
|
sjusju
|
ef4654bae7
|
Add true determinant to QR and its variants
|
2022-07-29 18:24:14 +00:00 |
|
Alexander Richardson
|
b7668c0371
|
Avoid including <sstream> with EIGEN_NO_IO
|
2022-07-29 18:02:51 +00:00 |
|
John Mather
|
7dd3dda3da
|
Updated AccelerateSupport documentation after PR 966.
|
2022-07-29 17:42:31 +00:00 |
|
Julian Kent
|
69714ff613
|
Add Sparse Subset of Matrix Inverse
|
2022-07-28 18:04:35 +00:00 |
|
Antonio Sánchez
|
34780d8bd1
|
Include immintrin.h header for emscripten.
|
2022-07-22 02:27:42 +00:00 |
|
Antonio Sánchez
|
2cf4d18c9c
|
Disable AVX512 GEMM kernels by default.
|
2022-07-20 21:22:48 +00:00 |
|
Charles Schlosser
|
a678a3e052
|
Fix aligned_realloc to call check_that_malloc_is_allowed() if ptr == 0
|
2022-07-19 20:59:07 +00:00 |
|
b-shi
|
4a56359406
|
Add option to disable avx512 GEBP kernels
|
2022-07-18 17:59:09 +00:00 |
|
Mathieu Westphal
|
1092574b26
|
Fix wrong doxygen group usage
|
2022-07-12 13:22:46 +02:00 |
|
Antonio Sánchez
|
bb51d9f4fa
|
Fix ODR violations.
|
2022-07-09 04:56:36 +00:00 |
|
Chip Kerchner
|
84cf3ff18d
|
Add pload_partial, pstore_partial (and unaligned versions), pgather_partial, pscatter_partial, loadPacketPartial and storePacketPartial.
|
2022-06-27 19:18:00 +00:00 |
|
Chip Kerchner
|
c603275dc9
|
Better performance for Power10 using more load and store vector pairs for GEMV
|
2022-06-27 18:11:55 +00:00 |
|
Antonio Sánchez
|
bc2ab81634
|
Eliminate undef warnings when not compiling for AVX512.
|
2022-06-24 15:10:10 +00:00 |
|
Antonio Sánchez
|
0e083b172e
|
Use numext::sqrt in Householder.h.
|
2022-06-21 16:29:59 +00:00 |
|
b-shi
|
37673ca1bc
|
AVX512 TRSM kernels use alloca if EIGEN_NO_MALLOC requested
|
2022-06-17 18:05:26 +00:00 |
|
Chip Kerchner
|
4d1c16eab8
|
Fix tanh and erf to use vectorized version for EIGEN_FAST_MATH in VSX.
|
2022-06-15 16:06:43 +00:00 |
|
Mehdi Goli
|
7ea823e824
|
[SYCL-Spec] According to [SYCL-2020 spec](...
|
2022-06-13 15:52:29 +00:00 |
|
Arthur
|
ba4d7304e2
|
Document DiagonalBase
|
2022-06-08 17:46:32 +00:00 |
|
Binhao Qin
|
95463b59bc
|
Mark index_remap as EIGEN_DEVICE_FUNC in src/Core/Reshaped.h (Fixes #2493)
|
2022-06-07 20:10:47 +00:00 |
|
Shi, Brian
|
28812d2ebb
|
AVX512 TRSM Kernels respect EIGEN_NO_MALLOC
|
2022-06-07 11:28:42 -07:00 |
|
Arthur
|
14aae29470
|
Provide DiagonalMatrix Product and Initializers
|
2022-06-06 21:43:22 +00:00 |
|
aaraujom
|
8fbb76a043
|
Fix build issues with MSVC for AVX512
|
2022-06-03 14:55:40 +00:00 |
|
aaraujom
|
d49ede4dc4
|
Add AVX512 s/dgemm optimizations for compute kernel (2nd try)
|
2022-05-28 02:00:21 +00:00 |
|
Arthur
|
705ae70646
|
Add R-Bidiagonalization step to BDCSVD
|
2022-05-27 02:00:24 +00:00 |
|
Mario Rincon-Nigro
|
e99163e732
|
fix: issue 2481: LDLT produce wrong results with AutoDiffScalar
|
2022-05-25 15:26:10 +00:00 |
|
Chip Kerchner
|
aa8b7e2c37
|
Add subMappers to Power GEMM packing - simplifies the address calculations (10% faster)
|
2022-05-23 15:18:29 +00:00 |
|
Guoqiang QI
|
32a3f9ac33
|
Improve plogical_shift_* implementations and fix typo in SVE/PacketMath.h
|
2022-05-23 09:33:49 +00:00 |
|
Eisuke Kawashima
|
ac5c83a3f5
|
unset executable flag
|
2022-05-22 22:47:43 +09:00 |
|
Antonio Sanchez
|
481a4a8c31
|
Fix BDCSVD condition for failing with numerical issue.
|
2022-05-20 08:18:31 -07:00 |
|
Antonio Sánchez
|
028ab12586
|
Prevent BDCSVD crash caused by index out of bounds.
|
2022-05-19 22:29:48 +00:00 |
|
Antonio Sánchez
|
9b9496ad98
|
Revert "Add AVX512 optimizations for matrix multiply"
This reverts commit 25db0b4a82
|
2022-05-13 18:50:33 +00:00 |
|
aaraujom
|
25db0b4a82
|
Add AVX512 optimizations for matrix multiply
|
2022-05-12 23:41:19 +00:00 |
|
Alex_M
|
2c055f8633
|
make diagonal matrix cols() and rows() methods constexpr
|
2022-05-03 10:13:37 +02:00 |
|
Chip Kerchner
|
c2f15edc43
|
Add load vector_pairs for RHS of GEMM MMA. Improved predux GEMV.
|
2022-04-25 16:23:01 +00:00 |
|
John Mather
|
9e026e5e28
|
Removed need to supply the Symmetric flag to UpLo argument for Accelerate LLT and LDLT
|
2022-04-21 20:02:10 +00:00 |
|
Chip Kerchner
|
44ba7a0da3
|
Fix compiler bugs for GCC 10 & 11 for Power GEMM
|
2022-04-20 15:59:00 +00:00 |
|
Chip Kerchner
|
b02c384ef4
|
Add fused multiply functions for PowerPC - pmsub, pnmadd and pnmsub
|
2022-04-18 16:16:32 +00:00 |
|
Rohit Santhanam
|
3de96caeaa
|
Fix HouseholderSequence.h
|
2022-04-17 02:46:56 +00:00 |
|
Antonio Sánchez
|
f845a8bb1a
|
Fix cwise NaN propagation for scalar input.
|
2022-04-16 05:07:44 +00:00 |
|
Charles Schlosser
|
a4bb513b99
|
Update HouseholderSequence.h
|
2022-04-15 16:56:17 +00:00 |
|
Shi, Brian
|
fc1d888415
|
Remove AVX512VL dependency in trsm
|
2022-04-14 12:44:24 -07:00 |
|
Antonio Sánchez
|
07db964bde
|
Restrict new AVX512 trsm to AVX512VL, rename files for consistency.
|
2022-04-14 16:58:32 +00:00 |
|
Charles Schlosser
|
67eeba6e72
|
Avoidable heap allocation in applyHouseholderToTheLeft
|
2022-04-13 18:45:36 +00:00 |
|
Antonio Sánchez
|
efb08e0bb5
|
Revert "Fix ambiguous DiagonalMatrix constructors."
This reverts commit a81bba962a
|
2022-04-12 03:54:31 +00:00 |
|
Chip Kerchner
|
53eec53d2a
|
Fix Power GEMV order of operations in predux for MMA.
|
2022-04-11 21:29:05 +00:00 |
|
Antonio Sánchez
|
a81bba962a
|
Fix ambiguous DiagonalMatrix constructors.
|
2022-04-11 19:13:25 +00:00 |
|
Tobias Schlüter
|
f3ba220c5d
|
Remove EIGEN_EMPTY_STRUCT_CTOR
|
2022-04-08 18:27:26 +00:00 |
|
Antonio Sánchez
|
5ed7a86ae9
|
Fix MSVC+CUDA issues.
|
2022-04-08 18:05:32 +00:00 |
|
Antonio Sánchez
|
734ed1efa6
|
Fix ODR issues in lapacke_helpers.
|
2022-04-08 15:31:30 +00:00 |
|
Antonio Sánchez
|
2c45a3846e
|
Fix some max size expressions.
|
2022-04-06 22:19:57 +00:00 |
|
Erik Schultheis
|
df87d40e34
|
constexpr reshape helper
|
2022-04-05 17:32:17 +00:00 |
|
Chip Kerchner
|
403fa33409
|
Performance improvements in GEMM for Power
|
2022-04-05 12:18:53 +00:00 |
|
Erik Schultheis
|
e1df3636b2
|
More constexpr helpers
|
2022-04-04 18:38:34 +00:00 |
|
Erik Schultheis
|
64909b82bd
|
static const class members turned into constexpr
|
2022-04-04 17:33:33 +00:00 |
|
William Talbot
|
2c0ef43b48
|
Added Scaling function overload for vector rvalue reference
|
2022-04-04 16:50:09 +00:00 |
|
Antonio Sanchez
|
ba2cb835aa
|
Add back std::remove* aliases - third-party libraries rely on these.
|
2022-04-01 17:02:52 +00:00 |
|
Antonio Sánchez
|
73b2c13bf2
|
Disable f16c scalar conversions for MSVC.
|
2022-03-30 18:35:32 +00:00 |
|
Tobias Schlüter
|
e22d58e816
|
Add is_constant_evaluated, update alignment checks
|
2022-03-25 04:00:58 +00:00 |
|
Erik Schultheis
|
b9d2900e8f
|
added a missing typename and fixed a unused typedef warning
|
2022-03-24 12:07:18 +02:00 |
|
b-shi
|
0611f7fff0
|
Add missing explicit reinterprets
|
2022-03-23 21:10:26 +00:00 |
|
Essex Edwards
|
cd3c81c3bc
|
Add a NNLS solver to unsupported - issue #655
|
2022-03-23 20:20:44 +00:00 |
|
Chip Kerchner
|
0699fa06fe
|
Split general_matrix_vector_product interface for Power into two macros - one ColMajor and RowMajor.
|
2022-03-23 18:09:33 +00:00 |
|
Antonio Sánchez
|
19a6a827c4
|
Optimize visitor traversal in case of RowMajor.
|
2022-03-23 15:27:57 +00:00 |
|
Romain Biessy
|
f2a3e03e9b
|
Fix usages of wrong namespace
|
2022-03-21 15:07:53 +00:00 |
|
Antonio Sánchez
|
4451823fb4
|
Fix ODR violation in trsm.
|
2022-03-20 15:56:53 +00:00 |
|
Antonio Sánchez
|
9a14d91a99
|
Fix AVX512 builds with MSVC.
|
2022-03-18 16:04:53 +00:00 |
|
Chip Kerchner
|
7b10795e39
|
Change EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH and EIGEN_ALTIVEC_DISABLE_MMA flags to be like TensorFlow's...
|
2022-03-17 22:35:27 +00:00 |
|