Antonio Sánchez
bc2ab81634
Eliminate undef warnings when not compiling for AVX512.
2022-06-24 15:10:10 +00:00
Antonio Sánchez
0e083b172e
Use numext::sqrt in Householder.h.
2022-06-21 16:29:59 +00:00
b-shi
37673ca1bc
AVX512 TRSM kernels use alloca if EIGEN_NO_MALLOC requested
2022-06-17 18:05:26 +00:00
Chip Kerchner
4d1c16eab8
Fix tanh and erf to use vectorized version for EIGEN_FAST_MATH in VSX.
2022-06-15 16:06:43 +00:00
Mehdi Goli
7ea823e824
[SYCL-Spec] According to [SYCL-2020 spec](...
2022-06-13 15:52:29 +00:00
Arthur
ba4d7304e2
Document DiagonalBase
2022-06-08 17:46:32 +00:00
Binhao Qin
95463b59bc
Mark index_remap as EIGEN_DEVICE_FUNC in src/Core/Reshaped.h ( Fixes #2493 )
2022-06-07 20:10:47 +00:00
Shi, Brian
28812d2ebb
AVX512 TRSM Kernels respect EIGEN_NO_MALLOC
2022-06-07 11:28:42 -07:00
Arthur
14aae29470
Provide DiagonalMatrix Product and Initializers
2022-06-06 21:43:22 +00:00
aaraujom
8fbb76a043
Fix build issues with MSVC for AVX512
2022-06-03 14:55:40 +00:00
aaraujom
d49ede4dc4
Add AVX512 s/dgemm optimizations for compute kernel (2nd try)
2022-05-28 02:00:21 +00:00
Arthur
705ae70646
Add R-Bidiagonalization step to BDCSVD
2022-05-27 02:00:24 +00:00
Mario Rincon-Nigro
e99163e732
fix: issue 2481: LDLT produce wrong results with AutoDiffScalar
2022-05-25 15:26:10 +00:00
Chip Kerchner
aa8b7e2c37
Add subMappers to Power GEMM packing - simplifies the address calculations (10% faster)
2022-05-23 15:18:29 +00:00
Guoqiang QI
32a3f9ac33
Improve plogical_shift_* implementations and fix typo in SVE/PacketMath.h
2022-05-23 09:33:49 +00:00
Eisuke Kawashima
ac5c83a3f5
unset executable flag
2022-05-22 22:47:43 +09:00
Antonio Sanchez
481a4a8c31
Fix BDCSVD condition for failing with numerical issue.
2022-05-20 08:18:31 -07:00
Antonio Sánchez
028ab12586
Prevent BDCSVD crash caused by index out of bounds.
2022-05-19 22:29:48 +00:00
Antonio Sánchez
9b9496ad98
Revert "Add AVX512 optimizations for matrix multiply"
...
This reverts commit 25db0b4a82
2022-05-13 18:50:33 +00:00
aaraujom
25db0b4a82
Add AVX512 optimizations for matrix multiply
2022-05-12 23:41:19 +00:00
Alex_M
2c055f8633
make diagonal matrix cols() and rows() methods constexpr
2022-05-03 10:13:37 +02:00
Chip Kerchner
c2f15edc43
Add load vector_pairs for RHS of GEMM MMA. Improved predux GEMV.
2022-04-25 16:23:01 +00:00
John Mather
9e026e5e28
Removed need to supply the Symmetric flag to UpLo argument for Accelerate LLT and LDLT
2022-04-21 20:02:10 +00:00
Chip Kerchner
44ba7a0da3
Fix compiler bugs for GCC 10 & 11 for Power GEMM
2022-04-20 15:59:00 +00:00
Chip Kerchner
b02c384ef4
Add fused multiply functions for PowerPC - pmsub, pnmadd and pnmsub
2022-04-18 16:16:32 +00:00
Rohit Santhanam
3de96caeaa
Fix HouseholderSequence.h
2022-04-17 02:46:56 +00:00
Antonio Sánchez
f845a8bb1a
Fix cwise NaN propagation for scalar input.
2022-04-16 05:07:44 +00:00
Charles Schlosser
a4bb513b99
Update HouseholderSequence.h
2022-04-15 16:56:17 +00:00
Shi, Brian
fc1d888415
Remove AVX512VL dependency in trsm
2022-04-14 12:44:24 -07:00
Antonio Sánchez
07db964bde
Restrict new AVX512 trsm to AVX512VL, rename files for consistency.
2022-04-14 16:58:32 +00:00
Charles Schlosser
67eeba6e72
Avoidable heap allocation in applyHouseholderToTheLeft
2022-04-13 18:45:36 +00:00
Antonio Sánchez
efb08e0bb5
Revert "Fix ambiguous DiagonalMatrix constructors."
...
This reverts commit a81bba962a
2022-04-12 03:54:31 +00:00
Chip Kerchner
53eec53d2a
Fix Power GEMV order of operations in predux for MMA.
2022-04-11 21:29:05 +00:00
Antonio Sánchez
a81bba962a
Fix ambiguous DiagonalMatrix constructors.
2022-04-11 19:13:25 +00:00
Tobias Schlüter
f3ba220c5d
Remove EIGEN_EMPTY_STRUCT_CTOR
2022-04-08 18:27:26 +00:00
Antonio Sánchez
5ed7a86ae9
Fix MSVC+CUDA issues.
2022-04-08 18:05:32 +00:00
Antonio Sánchez
734ed1efa6
Fix ODR issues in lapacke_helpers.
2022-04-08 15:31:30 +00:00
Antonio Sánchez
2c45a3846e
Fix some max size expressions.
2022-04-06 22:19:57 +00:00
Erik Schultheis
df87d40e34
constexpr reshape helper
2022-04-05 17:32:17 +00:00
Chip Kerchner
403fa33409
Performance improvements in GEMM for Power
2022-04-05 12:18:53 +00:00
Erik Schultheis
e1df3636b2
More constexpr helpers
2022-04-04 18:38:34 +00:00
Erik Schultheis
64909b82bd
static const class members turned into constexpr
2022-04-04 17:33:33 +00:00
William Talbot
2c0ef43b48
Added Scaling function overload for vector rvalue reference
2022-04-04 16:50:09 +00:00
Antonio Sanchez
ba2cb835aa
Add back std::remove* aliases - third-party libraries rely on these.
2022-04-01 17:02:52 +00:00
Antonio Sánchez
73b2c13bf2
Disable f16c scalar conversions for MSVC.
2022-03-30 18:35:32 +00:00
Tobias Schlüter
e22d58e816
Add is_constant_evaluated, update alignment checks
2022-03-25 04:00:58 +00:00
Erik Schultheis
b9d2900e8f
added a missing typename and fixed an unused typedef warning
2022-03-24 12:07:18 +02:00
b-shi
0611f7fff0
Add missing explicit reinterprets
2022-03-23 21:10:26 +00:00
Essex Edwards
cd3c81c3bc
Add a NNLS solver to unsupported - issue #655
2022-03-23 20:20:44 +00:00
Chip Kerchner
0699fa06fe
Split general_matrix_vector_product interface for Power into two macros - one ColMajor and one RowMajor.
2022-03-23 18:09:33 +00:00
Antonio Sánchez
19a6a827c4
Optimize visitor traversal in case of RowMajor.
2022-03-23 15:27:57 +00:00
Romain Biessy
f2a3e03e9b
Fix usages of wrong namespace
2022-03-21 15:07:53 +00:00
Antonio Sánchez
4451823fb4
Fix ODR violation in trsm.
2022-03-20 15:56:53 +00:00
Antonio Sánchez
9a14d91a99
Fix AVX512 builds with MSVC.
2022-03-18 16:04:53 +00:00
Chip Kerchner
7b10795e39
Change EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH and EIGEN_ALTIVEC_DISABLE_MMA flags to be like TensorFlow's...
2022-03-17 22:35:27 +00:00
Antonio Sánchez
3ca1228d45
Work around MSVC compiler bug dropping const.
2022-03-17 20:50:26 +00:00
Tobias Schlüter
40eb34bc5d
Fix RowMajorBit <-> RowMajor mixup.
2022-03-17 15:28:12 +00:00
Antonio Sanchez
e34db1239d
Fix missing pound
2022-03-16 12:26:12 -07:00
Antonio Sánchez
591906477b
Fix up PowerPC MMA flags so it builds by default.
2022-03-16 19:16:28 +00:00
b-shi
518fc321cb
AVX512 Optimizations for Triangular Solve
2022-03-16 18:04:50 +00:00
Erik Schultheis
421cbf0866
Replace Eigen type metaprogramming with corresponding std types and make use of alias templates
2022-03-16 16:43:40 +00:00
Arthur
514f90c9ff
Remove workarounds for bad GCC-4 warnings
2022-03-16 00:08:16 +00:00
Rasmus Munk Larsen
9ad5661482
Revert "Fix up PowerPC MMA flags so it builds by default."
2022-03-15 20:51:03 +00:00
Antonio Sánchez
65eeedf964
Fix up PowerPC MMA flags so it builds by default.
2022-03-15 20:22:23 +00:00
Tobias Schlüter
cb1e8228e9
Convert bit calculation to constexpr, avoid casts.
2022-03-13 22:38:36 +09:00
Rohit Santhanam
2a6be5492f
Fix construct_at compilation breakage on ROCm.
2022-03-09 16:47:53 +00:00
Duncan McBain
a3b64625e3
Remove ComputeCpp-specific code from SYCL Vptr
2022-03-08 22:44:18 +00:00
Tobias Schlüter
cd2ba9d03e
Add construct_at, destroy_at wrappers. Use throughout.
2022-03-08 20:43:22 +00:00
AlexanderMueller
dfa5176780
make SparseSolverBase and IterativeSolverBase move constructible
2022-03-08 20:03:53 +01:00
Tobias Schlüter
9883108f3a
Remove copy_bool workaround for gcc 4.3
2022-03-08 17:43:11 +00:00
John Mather
3a9d404d76
Add support for Apple's Accelerate sparse matrix solvers
2022-03-08 00:09:18 +00:00
Antonio Sánchez
0ae94456a0
Remove duplicate IsRowMajor declaration.
2022-03-04 21:22:02 +00:00
Rasmus Munk Larsen
0e6f4e43f1
Fix a few confusing comments in psincos_float.
2022-03-04 20:41:49 +00:00
Sean McBride
f1b9692d63
Removed EIGEN_UNUSED decorations from many functions that are in fact used
2022-03-03 20:19:33 +00:00
Arthur
c9ff739af1
Fix JacobiSVD_LAPACKE bindings
2022-03-03 19:24:07 +00:00
Zhuo Zhang
d0b1aef6f6
Speed lscg by using .noalias
2022-03-03 08:52:09 +00:00
Antonio Sanchez
55c7400db5
Fix enum conversion warnings in BooleanRedux.
2022-03-03 04:44:20 +00:00
Antonio Sánchez
9c07e201ff
Modified sqrt/rsqrt for denormal handling.
2022-03-02 17:20:47 +00:00
Antonio Sánchez
b48922cb5c
Fix SVD for MSVC+CUDA.
2022-03-01 21:35:22 +00:00
Yury Gitman
bf6726a0c6
Fix any/all reduction in the case of row-major layout
2022-03-01 05:27:50 +00:00
Antonio Sánchez
f03df0df53
Fix SVD for MSVC.
2022-02-28 19:53:15 +00:00
Antonio Sánchez
19c39bea29
Fix mixingtypes for g++-11.
2022-02-25 19:28:10 +00:00
Rasmus Munk Larsen
8b875dbef1
Changes to fast SQRT/RSQRT
2022-02-23 17:32:21 +00:00
Ramil Sattarov
f9b7564faa
E2K: initial support of LCC MCST compiler for the Elbrus 2000 CPU architecture
2022-02-23 17:07:34 +00:00
Arthur
cd80e04ab7
Add assert for edge case if Thin U Requested at runtime
2022-02-23 05:35:19 +00:00
Martin Heistermann
550af3938c
Fix for crash bug in SPQRSupport: Initialize pointers to nullptr to avoid free() calls of invalid pointers.
2022-02-18 16:13:28 +00:00
Antonio Sánchez
58a90c7463
Use fixed-sized U/V for fixed-sized inputs.
2022-02-16 18:31:47 +00:00
Antonio Sánchez
c367ed26a8
Make FixedInt constexpr, fix ODR of fix<N>
2022-02-16 17:47:51 +00:00
Antonio Sánchez
766087329e
Re-add svd::compute(Matrix, options) method to avoid breaking external projects.
2022-02-16 00:54:02 +00:00
Antonio Sánchez
a58af20d61
Add descriptions to Matrix typedefs.
2022-02-15 21:53:27 +00:00
Antonio Sánchez
28e008b99a
Fix sqrt/rsqrt for NEON.
2022-02-15 21:31:51 +00:00
Antonio Sanchez
23755030c9
Fix MSVC+NVCC 9.2 pragma error.
2022-02-15 10:51:32 -08:00
Erik Schultheis
7197b577fb
Remove unused macros in AVX packetmath.
...
The following macros are removed:
* EIGEN_DECLARE_CONST_Packet8f
* EIGEN_DECLARE_CONST_Packet4d
* EIGEN_DECLARE_CONST_Packet8f_FROM_INT
* EIGEN_DECLARE_CONST_Packet8i
2022-02-14 10:34:23 +00:00
Chip Kerchner
cb5ca1c901
Cleanup compiler warnings, etc from recent changes in GEMM & GEMV for PowerPC
2022-02-09 18:47:08 +00:00
Matt Keeter
cec0005c74
Return alphas() and betas() by const reference
2022-02-08 23:16:10 +00:00
Rasmus Munk Larsen
92d0026b7b
Provide a definition for numeric_limits static data members
2022-02-08 20:34:53 +00:00
Björn Ingvar Dahlgren
b94bddcde0
Typo in COD's doc: matrixR() -> matrixT()
2022-02-07 18:30:25 +00:00
Antonio Sánchez
94bed2b80c
Fix collision with resolve.h.
2022-02-07 18:17:42 +00:00
Antonio Sánchez
9441d94dcc
Revert "Make fixed-size Matrix and Array trivially copyable after C++20"
...
This reverts commit 47eac21072
2022-02-05 04:40:29 +00:00
Rasmus Munk Larsen
979fdd58a4
Add generic fast psqrt and prsqrt impls and make them correct for 0, +Inf, NaN, and negative arguments.
2022-02-05 00:20:13 +00:00
Antonio Sánchez
4bffbe84f9
Restrict GCC<6.3 maxpd workaround to only gcc.
2022-02-04 22:47:34 +00:00
Antonio Sánchez
e7f4a901ee
Define EIGEN_HAS_AVX512_MATH in PacketMath.
2022-02-04 22:25:52 +00:00
Antonio Sánchez
6b60bd6754
Fix 32-bit arm int issue.
2022-02-04 21:59:33 +00:00
Antonio Sánchez
96da541cba
Fix AVX512 math function consistency, enable for ICC.
2022-02-04 19:35:18 +00:00
Antonio Sánchez
cafeadffef
Fix ODR violations.
2022-02-04 19:01:07 +00:00
Arthur
18b50458b6
Update SVD Module with Options template parameter
2022-02-02 00:15:44 +00:00
Erik Schultheis
89c6ab2385
removed some documentation referencing c++98 behaviour
2022-01-30 12:02:18 +00:00
Chip Kerchner
66464bd2a8
Fix number of block columns to NOT overflow the cache (PowerPC) abnormally in GEMV
2022-01-27 20:35:53 +00:00
Rasmus Munk Larsen
8f2c6f0aa6
Make preciprocal IEEE compliant w.r.t. 1/0 and 1/inf.
2022-01-26 20:38:05 +00:00
Erik Schultheis
d271a7d545
reduce float warnings (comparisons and implicit conversions)
2022-01-26 18:16:19 +00:00
Rasmus Munk Larsen
51311ec651
Remove inline assembly for FMA (AVX) and add remaining extensions as packet ops: pmsub, pnmadd, and pnmsub.
2022-01-26 04:25:41 +00:00
Rasmus Munk Larsen
ea2c02060c
Add reciprocal packet op and fast specializations for float with SSE, AVX, and AVX512.
2022-01-21 23:49:18 +00:00
Arthur Feeney
4b0926f99b
Prevent heap allocation in diagonal product
2022-01-21 21:15:44 +00:00
Ilya Tokar
a0fc640c18
Add support for packets of int64 on x86
2022-01-21 19:55:23 +00:00
Erik Schultheis
970640519b
Cleanup
2022-01-21 01:48:59 +00:00
Stephen Pierce
81c928ba55
Silence some MSVC warnings
2022-01-21 00:29:23 +00:00
Sean McBride
c454b8c813
Improve clang warning suppressions by checking if warning is supported
2022-01-21 00:27:43 +00:00
David Gao
fb05198bdd
Port EIGEN_OPTIMIZATION_BARRIER to soft float arm
2022-01-20 00:44:17 +00:00
arthurfeeney
937c3d73cb
Fix implicit conversion warning in GEBP kernel's packing
2022-01-17 17:00:59 -06:00
Essex Edwards
49a8a1e07a
Minor correction/clarification to LSCG solver documentation
2022-01-14 19:48:54 +00:00
Arthur
5fe0115724
Update comment referencing removed macro EIGEN_SIZE_MIN_PREFER_DYNAMIC
2022-01-14 19:29:47 +00:00
Arthur
ff4dffc34d
fix implicit conversion warning in vectorwise_reverse_inplace
2022-01-13 20:30:54 +00:00
Chip Kerchner
708fd6d136
Add MMA and performance improvements for VSX in GEMV for PowerPC.
2022-01-13 13:23:18 +00:00
Jörg Buchwald
d1bf056394
fix compilation issue with gcc < 10 and -std=c++2a
2022-01-13 01:24:20 +01:00
Rasmus Munk Larsen
a30ecb7221
Don't use the fast implementation if EIGEN_GPU_CC, since integer_packet is not defined for float4 used by the GPU compiler (even on host).
2022-01-12 20:16:16 +00:00
Erik Schultheis
5a0a165c09
fix broken asserts
2022-01-12 18:31:53 +00:00
Rasmus Munk Larsen
0b58738938
Fix two corner cases in the new implementation of logistic sigmoid.
2022-01-12 00:41:29 +00:00
Matthias Möller
a32e6a4047
Explicit type casting
2022-01-10 22:06:43 +00:00
Kolja Brix
8d81a2339c
Reduce usage of reserved names
2022-01-10 20:53:29 +00:00
Essex Edwards
c61b3cb0db
Fix IterativeSolverBase referring to itself as ConjugateGradient
2022-01-08 08:25:15 +00:00
Rasmus Munk Larsen
80ccacc717
Fix accuracy of logistic sigmoid
2022-01-08 00:15:14 +00:00
Rasmus Munk Larsen
8b8125c574
Make sure the scalar and vectorized path for array.exp() return consistent values.
2022-01-07 23:31:35 +00:00
Lingzhu Xiang
47eac21072
Make fixed-size Matrix and Array trivially copyable after C++20
...
Making them trivially copyable allows using std::memcpy() without undefined
behaviors.
Only Matrix and Array with trivially copyable DenseStorage are marked as
trivially copyable with an additional type trait.
As described in http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0848r3.html
it requires extremely verbose SFINAE to make the special member functions of
fixed-size Matrix and Array trivial, unless C++20 concepts are available to
simplify the selection of trivial special member functions given template
parameters. Therefore only make this feature available to compilers that support
C++20 P0848R3.
Fix #1855.
2022-01-07 19:04:35 +00:00
Matthias Möller
c4b1dd2f6b
Add support for Cray, Fujitsu, and Intel ICX compilers
...
The following preprocessor macros are added:
- EIGEN_COMP_CPE and EIGEN_COMP_CLANGCPE: version number of the Cray compiler if
  Eigen is compiled with the Cray C++ compiler, 0 otherwise.
- EIGEN_COMP_FCC and EIGEN_COMP_CLANGFCC: version number of the FCC compiler if
  Eigen is compiled with the Fujitsu C++ compiler, 0 otherwise.
- EIGEN_COMP_CLANGICC: version number of the ICX compiler if Eigen is compiled
  with the Intel oneAPI C++ compiler, 0 otherwise.
All three compilers (Cray, Fujitsu, Intel) offer a traditional and a Clang-based
frontend. This is distinguished by the CLANG prefix.
2022-01-07 18:46:16 +00:00
Rasmus Munk Larsen
96dc37a03b
Some fixes/cleanups for numeric_limits & fix for related bug in psqrt
2022-01-07 01:10:17 +00:00
Fabian Keßler
ed27d988c1
Fixes #2411
2022-01-06 20:02:37 +00:00
Rasmus Munk Larsen
7b5a8b6bc5
Improve plog: 20% speedup for float + handle denormals
2022-01-05 23:40:31 +00:00
Andrew Johnson
a491c7f898
Allow specifying inner & outer stride for CwiseUnaryView - fixes #2398
2022-01-05 19:24:46 +00:00
Erik Schultheis
9210e71fb3
ensure that eigen::internal::size is not found by ADL, rename to ssize and...
2022-01-05 00:46:09 +00:00
Lingzhu Xiang
7244a74ab0
Add bounds checking to Eigen serializer
2022-01-03 17:00:24 +08:00
Shiva Ghose
a4098ac676
Fix duplicate include guard *ALTIVEC_H -> *ZVECTOR_H
...
Some header guards were repeated between the `AltiVec` and
`ZVector` packages. This could cause a problem if (for whatever reason) someone
attempts to include headers for both architectures.
2021-12-31 08:43:24 +00:00
David Tellenbach
d705eb5f86
Revert "Select AVX2 even if the data size is not a multiple of 8"
...
Tests are failing for AVX and NEON.
This reverts commit eb85b97339.
2021-12-28 23:57:06 +01:00
Rasmus Munk Larsen
8eab7b6886
Improve exp<float>(): Don't flush denormal results +4% speedup.
...
1. Speed up exp(x) by reducing the polynomial approximant from degree 7 to
degree 6. With exactly representable coefficients computed by the Sollya tool,
this still gives a maximum relative error of 1 ulp, i.e. faithfully rounded, for
arguments where exp(x) is a normalized float. This change results in a speedup
of about 4% for AVX2.
2. Extend the range where exp(x) returns a non-zero result from ~[-88;88] to
~[-104;88], i.e. return denormalized values for large negative arguments instead
of zero. Compared to exp<double>(x) the denormalized results gradually decrease
in accuracy down to 0.033 relative error for arguments around x = -104 where
exp(x) is ~std::numeric_limits<float>::denorm_min(). This is expected and acceptable.
2021-12-28 15:00:19 +00:00
David Tellenbach
c06c3e52a0
Include immintrin.h if F16C is available and vectorization is disabled
...
If EIGEN_DONT_VECTORIZE is defined, immintrin.h is not included even if F16C is available. Trying to use F16C intrinsics thus fails.
This fixes issue #2395.
2021-12-25 19:51:42 +00:00
Erik Schultheis
f7a056bf04
Small fixes
...
This MR fixes a bunch of smaller issues, making the following changes:
* Template parameters in the documentation are documented with `\tparam` instead
of `\param`
* Superfluous semicolon warnings fixed
* Fixed the type of literals used to initialize float variables
2021-12-21 16:46:09 +00:00
Erik Schultheis
dee6428a71
fixed clang warnings about alignment change and floating point precision
2021-12-18 17:18:16 +00:00
Kolja Brix
d0b4b75fbb
Simplify logical_xor()
2021-12-16 20:20:47 +00:00
Erik Schultheis
e939c06b0e
Small speed-up in row-major sparse dense product
2021-12-15 18:46:25 +00:00
Erik Schultheis
c20e908ebc
turn some macros into constexpr functions
2021-12-10 19:27:01 +00:00
Erik Schultheis
0f36e42169
Fix
2021-12-10 16:59:48 +00:00
Rasmus Munk Larsen
f04fd8b168
Make sure exp(-Inf) is zero for vectorized expressions. This fixes #2385.
2021-12-08 17:57:23 +00:00
Erik Schultheis
cc11e240ac
Some further cleanup
2021-12-06 18:01:15 +00:00
Rasmus Munk Larsen
3ffefcb95c
Only include <atomic> if needed.
2021-12-02 23:55:25 +00:00
Erik Schultheis
d60f7fa518
Improved lapacke binding code for HouseholderQR and PartialPivLU
2021-12-02 00:10:58 +00:00
Erik Schultheis
ec2fd0f7ed
Require recent GCC and MSVC and removed EIGEN_HAS_CXX14 and some other feature test macros
2021-12-01 00:48:34 +00:00
Rasmus Munk Larsen
085c2fc5d5
Revert "Update SVD Module to allow specifying computation options with a...
2021-11-30 18:45:54 +00:00
Erik Schultheis
4dd126c630
fixed cholesky with 0 sized matrix (cf. #785)
2021-11-30 17:17:41 +00:00
Rohit Santhanam
4d3e50036f
Fix for HIP compilation breakage in selfAdjoint and triangular view classes.
2021-11-30 14:00:59 +00:00
Erik Schultheis
63abb35dfd
SFINAE'ing away non-const overloads if selfAdjoint/triangular view is not referring to an lvalue
2021-11-29 22:51:26 +00:00
Jakub Gałecki
1b8dce564a
bugfix: issue #2375
2021-11-29 22:26:15 +00:00
Francesco Mazzoli
eb85b97339
Select AVX2 even if the data size is not a multiple of 8
2021-11-29 21:13:24 +00:00
Arthur
eef33946b7
Update SVD Module to allow specifying computation options with a template parameter. Resolves #2051
2021-11-29 20:50:46 +00:00
Erik Schultheis
f33a31b823
removed EIGEN_HAS_CXX11_* and redundant EIGEN_COMP_CXXVER checks
2021-11-29 19:18:57 +00:00
Rohit Santhanam
9d3ffb3fbf
Fix for HIP compilation failure in DenseBase.
2021-11-28 15:59:30 +00:00
David Tellenbach
08da52eb85
Remove DenseBase::nonZeros() which just calls DenseBase::size()
...
Fixes #2382.
2021-11-27 14:31:00 +00:00
Ali Can Demiralp
96e537d6fd
Add EIGEN_DEVICE_FUNC to DenseBase::hasNaN() and DenseBase::allFinite().
2021-11-27 11:27:52 +00:00
Erik Schultheis
b8b6566f0f
Currently, the binding of LLT to Lapacke is done using a large macro. This factors out a large part of the macro's functionality and implements it explicitly.
2021-11-25 16:11:25 +00:00
Erik Schultheis
ec4efbd696
remove EIGEN_HAS_CXX11
2021-11-24 20:08:49 +00:00
Rasmus Munk Larsen
5137a5157a
Make numeric_limits members constexpr as per the newer C++ standards.
...
Author: majnemer@google.com
2021-11-19 15:58:36 +00:00
Erik Schultheis
7e586635ba
don't use deprecated MappedSparseMatrix
2021-11-19 15:58:04 +00:00
Erik Schultheis
b0fb5417d3
Fixed Sparse-Sparse Product in case of mixed StorageIndex types
2021-11-18 18:33:31 +00:00
Pablo Speciale
d04edff570
Update Umeyama.h: src_var is only used when with_scaling == true. Therefore, the actual computation can be avoided when with_scaling == false.
2021-11-16 17:58:22 +00:00
Rasmus Munk Larsen
2b9297196c
Update Transform.h to make transform_construct_from_matrix and transform_take_affine_part callable from device code. Fixes #2377.
2021-11-16 00:58:30 +00:00
Erik Schultheis
ca9c848679
use consistent StorageIndex types in SparseMatrix::Map and `SparseMatrix::TransposedSparseMatrix`
2021-11-15 22:18:26 +00:00
Erik Schultheis
13954c4440
moved pruning code to SparseVector.h
2021-11-15 22:16:01 +00:00
Nathan Luehr
da79095923
Convert diag pragmas to nv_diag.
2021-11-15 03:42:42 +00:00
Erik Schultheis
532cc73f39
fix a typo
2021-11-13 13:11:06 +02:00
Gengxin Xie
5c642950a5
Bug Fix: correct the bug that won't define EIGEN_HAS_FP16_C if the compiler isn't clang
2021-11-04 22:13:01 +00:00
Gilad
0d73440fb2
Documentation of Quaternion constructor from MatrixBase
2021-11-04 16:21:26 +00:00
Xinle Liu
478a1bdda6
Fix total deflation issue in BDCSVD, when & only when M is already diagonal.
2021-11-02 16:53:55 +00:00
Chip Kerchner
9cf34ee0ae
Invert rows and depth in non-vectorized portion of packing (PowerPC).
2021-10-28 21:59:41 +00:00
Ilya Tokar
e1cb6369b0
Add AVX vector path to float2half/half2float
...
Makes e.g. matrix multiplication 2x faster:
name        old cpu/op  new cpu/op  delta
BM_convers  181ms ± 1%  62ms ± 9%   -65.82%  (p=0.016 n=4+5)
Tested on all possible input values (not adding tests, since they
take a long time).
2021-10-28 13:59:01 -04:00
Antonio Sanchez
03d4cbb307
Fix min/max nan-propagation for scalar "other".
...
Copied input type from `EIGEN_MAKE_CWISE_BINARY_OP`.
Fixes #2362.
2021-10-28 09:28:29 -07:00
Antonio Sanchez
e559701981
Fix compile issue for gcc 4.8
2021-10-28 08:23:19 -07:00
Rohit Santhanam
48e40b22bf
Preliminary HIP bfloat16 GPU support.
2021-10-27 18:36:45 +00:00
Antonio Sanchez
40bbe8a4d0
Fix ZVector build.
...
Cross-compiled via `s390x-linux-gnu-g++`, run via qemu. This allows the
packetmath tests to pass.
2021-10-27 16:30:15 +00:00
Alex Druinsky
6bb6a6bf53
Vectorize fp16 tanh and logistic functions on Neon
...
Activates vectorization of the Eigen::half versions of the tanh and
logistic functions when they run on Neon. Both functions convert their
inputs to float before computing the output, and as a result of this
commit, the conversions and the computation in float are vectorized.
2021-10-27 16:09:16 +00:00
Andreas Krebbel
8faafc3aaa
ZVector: Move alignas qualifier to come first
...
We currently have plenty of type definitions with the alignment
qualifier coming after the type. The compiler warns about ignoring
them:
    int EIGEN_ALIGN16 ai[4];
Turn this into:
    EIGEN_ALIGN16 int ai[4];
2021-10-26 15:33:47 +02:00
Alex Druinsky
d0e3791b1a
Fix vectorized reductions for Eigen::half
...
Fixes compiler errors in expressions that look like
Eigen::Matrix<Eigen::half, 3, 1>::Random().maxCoeff()
The error comes from the code that creates the initial value for
vectorized reductions. The fix is to specify the scalar type of the
reduction's initial value.
The change is necessary for Eigen::half because unlike other types,
Eigen::half scalars cannot be implicitly created from integers.
2021-10-25 14:44:33 -07:00
Yann Billeter
6c3206152a
fix(CommaInitializer): pass dims at compile-time
2021-10-25 19:53:38 +00:00
Antonio Sanchez
0578feaabc
Remove const from visitor return type.
...
This seems to interfere with `pload`/`ploadu`, since `pload<const
Packet**>` are not defined.
This should unbreak the arm/ppc builds.
2021-10-25 19:09:50 +00:00
benardp
b63c096fbb
Extend EIGEN_QT_SUPPORT to Qt6
2021-10-23 23:43:06 +00:00
Lennart Steffen
163f11e24a
Included note on inner stride for compile-time vectors. See https://gitlab.com/libeigen/eigen/-/issues/2355#note_711078126
2021-10-22 09:46:43 +00:00
Rasmus Munk Larsen
2d3fec8ff6
Add nan-propagation options to matrix and array plugins.
2021-10-21 19:40:11 +00:00
Antonio Sanchez
b86e013321
Revert bit_cast to use memcpy for CUDA.
...
To elide the memcpy, we need to first load the `src` value into
registers by making a local copy. This avoids the need to resort
to potential UB by using `reinterpret_cast`.
This change doesn't seem to affect CPU (at least not with gcc/clang).
With optimizations on, the copy is also elided.
2021-10-21 08:14:11 -07:00
Antonio Sanchez
45e67a6fda
Use reinterpret_cast on GPU for bit_cast.
...
This seems to be the recommended approach for doing type punning in
CUDA. See for example
- https://stackoverflow.com/questions/47037104/cuda-type-punning-memcpy-vs-ub-union
- https://developer.nvidia.com/blog/faster-parallel-reductions-kepler/
(the latter puns a double to an `int2`).
The issue is that for CUDA, the `memcpy` is not elided, and ends up
being an expensive operation. We already have similar `reinterpret_cast`s across
the Eigen codebase for GPU (as does TensorFlow).
2021-10-20 21:34:40 +00:00
Antonio Sanchez
95bb645e92
Fix MSVC+NVCC EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR compilation.
...
Looks like we need to update the
`EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR` for newer versions of MSVC as
well when compiling with NVCC. Fixes build issues for VS 2017.
2021-10-20 19:38:14 +00:00
Antonio Sanchez
fd5f48e465
Fix tuple compilation for VS2017.
...
VS2017 doesn't like deducing alias types, leading to a bunch of compile
errors for functions involving the `tuple` alias. Replacing with
`TupleImpl` seems to solve this, allowing the test to compile/pass.
2021-10-20 19:18:34 +00:00
Antonio Sanchez
d0d34524a1
Move CUDA/Complex.h to GPU/Complex.h, remove TensorReductionCuda.h
...
The `Complex.h` file applies equally to HIP/CUDA, so placing under the
generic `GPU` folder.
The `TensorReductionCuda.h` has already been deprecated, now removing
for the next Eigen version.
2021-10-20 12:00:19 -07:00
Rasmus Munk Larsen
f2c9c2d2f7
Vectorize Visitor.h.
2021-10-20 16:58:01 +00:00