CFD/eigen

Author	SHA1	Message	Date
Chip Kerchner	211c5dfc67	Add optional offset parameter to ploadu_partial and pstoreu_partial	2023-06-23 19:53:05 +00:00
Charles Schlosser	44c20bbbe3	rint round floor ceil	2023-06-23 16:29:16 +00:00
Charles Schlosser	387175c258	Fix safe_abs in int_pow	2023-06-23 04:12:41 +00:00
Charles Schlosser	969c31eefc	Fix AVX pstore	2023-06-15 01:47:38 +00:00
wilfried.karel	6c1411e521	define a move constructor for Ref<const...>	2023-06-14 20:10:51 +00:00
wilfried.karel	d8f3eb87bf	Compile- and run-time assertions for the construction of Ref<const>.	2023-06-14 15:49:58 +00:00
Charles Schlosser	59b3ef5409	Partially Vectorize Cast	2023-06-09 16:54:31 +00:00
Rasmus Munk Larsen	7d7576f326	Avoid underflow in prsqrt.	2023-06-06 14:06:19 -07:00
Charles Schlosser	b7151ffaab	Fix unary pow error handling and test	2023-06-06 18:46:55 +00:00
Rasmus Munk Larsen	7ac8897431	Reduce max relative error of prsqrt from 3 to 2 ulps.	2023-06-04 22:25:33 +00:00
Charles Schlosser	1d80e23186	Optimize scalar_unary_pow_op error handling	2023-06-02 18:53:06 +00:00
Alexander Shaposhnikov	316eab8deb	Do not set EIGEN_HAS_ARM64_FP16_SCALAR_ARITHMETIC for cuda compilation	2023-05-31 15:15:06 +00:00
Rasmus Munk Larsen	8c43bf2b5b	Clean up Redux.h and fix vectorization_logic test after changes to traversal order in Redux.	2023-05-24 20:26:52 +00:00
Charles Schlosser	da6a71faf0	Add linear redux evaluators	2023-05-24 17:07:25 +00:00
Charles Schlosser	67a1e881d9	Sparse matrix column/row removal	2023-05-24 17:04:45 +00:00
Rasmus Munk Larsen	de1c884687	Add reference to writeup of approach used in canonicalEulerAngles.	2023-05-24 15:52:26 +00:00
Charles Schlosser	307a417e1c	Fix unrolled assignment evaluator	2023-05-22 16:39:24 +00:00
Juraj Oršulić	c18f94e3b0	Geometry/EulerAngles: introduce canonicalEulerAngles	2023-05-19 15:42:22 +00:00
Charles Schlosser	7d9bb90f15	SVD: fix numerous compiler warnings / failures	2023-05-15 16:56:47 +00:00
Rasmus Munk Larsen	96c42771d6	Make it possible to override the synchonization primitives used by the threadpool using macros.	2023-05-09 19:36:17 +00:00
Rasmus Munk Larsen	1321821e86	Add missing braces in Umeyama.h	2023-05-09 19:10:50 +00:00
Rasmus Munk Larsen	524c329ab2	Work around compiler bug in Umeyama.h.	2023-05-09 18:53:56 +00:00
Charles Schlosser	fbf7189bd5	Fix cuda compilation	2023-05-08 16:15:47 +00:00
Mehdi Goli	0623791930	[SYCL-2020] Enabling USM support for SYCL. SYCL-1.2.1 did not have support for USM.	2023-05-05 17:30:36 +00:00
Tobias Wood	94f57867fe	Thread pool	2023-05-05 16:23:34 +00:00
Charles Schlosser	725c11719b	Visitor: fix modulo by zero compiler warning	2023-05-04 18:21:09 +00:00
Chip Kerchner	b8208b363c	Specialized loadColData correctly - fix previous BF16 GEMV MR	2023-05-04 16:38:17 +00:00
Chip Kerchner	fda1373a15	Fix ColMajor BF16 GEMV for when vector is RowMajor	2023-05-03 20:12:50 +00:00
Charles Schlosser	fdc749de2a	JacobiSVD: set m_nonzeroSingularValues to zero if not finite	2023-05-02 17:48:21 +00:00
Chip Kerchner	6418ac0285	Unroll F32 to BF16 loop - 1.8X faster conversions for LLVM. Use vector pairs for GCC.	2023-05-01 16:54:16 +00:00
Charles Schlosser	c9a14f48d9	SSE Packet4ui has pcmp, pmin, pmax	2023-04-28 20:36:08 +00:00
Rasmus Munk Larsen	0b51f763cb	Revert "Geometry/EulerAngles: make sure that returned solution has canonical ranges" This reverts commit `7f06bcae2c`	2023-04-27 00:06:23 +00:00
Antonio Sánchez	2d0c6ad873	Revert "Vectorize cast" This reverts commit `eb5ff1861a`	2023-04-26 18:03:36 +00:00
Charles Schlosser	8999525c29	AVX2: Packet4ul has pmul, abs2	2023-04-26 16:22:16 +00:00
Charles Schlosser	eb5ff1861a	Vectorize cast	2023-04-26 02:50:13 +00:00
Antonio Sánchez	3918768be1	Fix sparse iterator and tests.	2023-04-25 19:05:49 +00:00
Charles Schlosser	f6cf5dca80	Packet4ul does not have Abs2	2023-04-21 19:48:01 +00:00
Chip Kerchner	03f646b7e3	New VSX version of BF16 GEMV (Power) - up to 6.7X faster	2023-04-21 17:06:59 +00:00
Charles Schlosser	29c8e3c754	fix pow for uint32_t, disable pmul<Packet4ul>	2023-04-21 05:47:56 +00:00
Juraj Oršulić	7f06bcae2c	Geometry/EulerAngles: make sure that returned solution has canonical ranges	2023-04-19 19:12:24 +00:00
Rasmus Munk Larsen	a347dbbab2	Delete last few occurences of HasHalfPacket.	2023-04-19 10:36:59 -07:00
Charles Schlosser	2b954be663	fix typo in sse packetmath	2023-04-18 18:17:41 +00:00
Rasmus Munk Larsen	25685c90ad	Fix incorrect packet type for unsigned int version of pfirst() in MSVC workaround in PacketMath.h.	2023-04-18 17:46:23 +00:00
Chip Kerchner	3f3ce214e6	New BF16 pcast functions and move type casting to TypeCasting.h	2023-04-18 02:38:38 +00:00
Pedro Gonnet	17b5b4de58	Add `Packet4ui`, `Packet8ui`, and `Packet4ul` to the `SSE`/`AVX` `PacketMath.h` headers	2023-04-17 23:33:59 +00:00
Charles Schlosser	87300c93ca	Refactor IndexedView	2023-04-17 12:32:50 +00:00
Chip Kerchner	1148f0a9ec	Add dynamic dispatch to BF16 GEMM (Power) and new VSX version	2023-04-14 22:20:42 +00:00
Rasmus Munk Larsen	554fe02ae3	Enable new AVX512 GEMM kernel by default.	2023-04-12 13:39:06 -07:00
Charles Schlosser	0d12fcc34e	Insert from triplets	2023-04-12 20:01:48 +00:00
b-shi	15fbddaf9b	ASAN fixes for AVX512 GEMM/TRSM	2023-04-04 15:54:24 -07:00
Charles Schlosser	178ef8c97f	qualify non-const symbolic indexed view with is_lvalue	2023-04-04 19:06:32 +00:00
Rasmus Munk Larsen	df1049ddf4	Small packet math cleanup.	2023-04-04 16:14:32 +00:00
Antoine Hoarau	9b48d10215	Guard all malloc, realloc and free() fonctions with check_that_malloc_is_allowed()	2023-04-04 04:24:22 +00:00
Rasmus Munk Larsen	c730290fa0	Use the correct truncating intrinsic for double->int casting.	2023-04-03 13:56:41 -07:00
Charles Schlosser	766db02020	disable raw array indexed view access for 1d arrays	2023-03-29 02:39:45 +00:00
Charles Schlosser	bfbc66e078	refactor indexedviewmethods, enable non-const ref access with symbolic indices	2023-03-29 01:35:26 +00:00
Rasmus Munk Larsen	1a5dfd7c0f	Fix incorrect casting in AVX512DQ path.	2023-03-27 09:28:06 -07:00
Charles Schlosser	a08649994f	Optimize generic_rsqrt_newton_step	2023-03-24 22:42:57 +00:00
Rasmus Munk Larsen	b8b8a26145	Add more missing vectorized casts for int on x86, and remove redundant unit tests	2023-03-24 16:02:00 +00:00
unageek	33e206f714	Remove unused declarations of BLAS/LAPACK routines	2023-03-23 21:54:05 +00:00
Rasmus Munk Larsen	d57a79e512	Optimize float->bool cast for AVX2, based on Charles Schlosser's comments.	2023-03-21 20:59:25 -07:00
Rasmus Munk Larsen	a5ae832773	Fix reversal of arguments to _mm256_set_m128() in pcast<Packet4d, Packet8f>.	2023-03-22 03:21:44 +00:00
Rasmus Munk Larsen	09945f2cc1	Optimize casting for x86_64.	2023-03-21 18:24:16 +00:00
Colin Broderick	8f9b8e3630	Replaced all instances of internal::(U)IntPtr with std::(u)intptr_t. Remove ICC workaround.	2023-03-21 16:50:23 +00:00
Antonio Sánchez	2c8011c2dd	Fix arm builds.	2023-03-20 16:59:38 +00:00
Charles Schlosser	fd8f410bbe	Fix 2624 2625	2023-03-20 16:30:04 +00:00
Jonas Schulze	81cb6a51d0	Fix some typos	2023-03-16 23:11:43 +00:00
Rasmus Munk Larsen	0488b708b4	Vectorize tensor.isnan() by using typed predicates.	2023-03-16 04:04:22 +00:00
Rasmus Munk Larsen	f02856c640	Use EIGEN_NOT_A_MACRO macro (oh the irony!) to avoid build issue in TensorFlow.	2023-03-15 11:42:57 -07:00
Rasmus Munk Larsen	690ae9502f	Use C++11 standard features for detecting presence of Inf and NaN	2023-03-15 16:52:44 +00:00
Chip Kerchner	d71ac6a755	Fix recent PowerPC warnings and clang warning	2023-03-15 16:50:46 +00:00
Chip Kerchner	23e1541863	Put deadcode checks back in from previous change.	2023-03-14 00:57:16 +00:00
Chip Kerchner	6c58f0fe1f	Revert changes that made BF16 GEMM to cause bad register spillage for LLVM (Power)	2023-03-13 23:36:06 +00:00
Rasmus Munk Larsen	79de101d23	Handle PropagateFast the same way as PropagateNaN in minmax visitor to	2023-03-13 20:47:11 +00:00
Chip Kerchner	9d72412385	Add MMA to BF16 GEMV - 5.0-6.3X faster (for Power)	2023-03-13 19:37:13 +00:00
Rasmus Munk Larsen	2067b54b13	Fix bug in minmax_coeff_visitor for matrix of all NaNs.	2023-03-13 18:25:22 +00:00
Rasmus Munk Larsen	ee0ff0ab3a	Fix typo in MathFunctions.h	2023-03-13 15:50:40 +00:00
Rasmus Munk Larsen	21c49e8f8e	Delete mystery character from Eigen/src/Core/arch/NEON/MathFunctions.h	2023-03-10 23:27:24 +00:00
Rasmus Munk Larsen	6bb9609bcb	Make new Select implementation backwards compatible.	2023-03-10 23:07:47 +00:00
Antonio Sánchez	394aabb0a3	Fix failing MSVC tests due to compiler bugs.	2023-03-10 22:36:57 +00:00
Rasmus Munk Larsen	d6235d76db	Clean up generic packetmath specializations for various backends with the help of a macro.	2023-03-10 22:02:23 +00:00
Rasmus Munk Larsen	e8fdf127c6	Work around compiler bug in Tridiagonalization.h	2023-03-10 21:21:07 +00:00
Rasmus Munk Larsen	adf26b6840	Add newline to end of file.	2023-03-10 16:53:22 +00:00
Rasmus Munk Larsen	3492d9e2e5	s/Lesser/Less/	2023-03-10 00:28:31 +00:00
Rasmus Munk Larsen	2419632cf5	Revert change to allFinite(), since the new version does not work for complex numbers.	2023-03-09 21:50:43 +00:00
Charles Schlosser	7bf2968fed	Specify Permutation Index for PartialPivLU and FullPivLU	2023-03-07 20:28:05 +00:00
Charles Schlosser	1ce8b25825	Vectorize any() / all()	2023-03-06 23:54:02 +00:00
Charles Schlosser	cb8e6d4975	Fix 2240, 2620	2023-03-06 23:11:06 +00:00
Chip Kerchner	2b513ca2a0	Added partial linear access for LHS & Output - 30% faster for bfloat16 GEMM MMA (Power)	2023-03-02 19:22:43 +00:00
Charles Schlosser	0b396c3167	Scalarize comps	2023-03-02 17:06:23 +00:00
Antonio Sánchez	62d5cfe835	Fix ODR issues with Intel's AVX512 TRSM kernels.	2023-02-27 07:54:52 +00:00
Charles Schlosser	826627f653	vectorize comparisons and select by enabling typed comparisons	2023-02-25 20:52:11 +00:00
Rasmus Munk Larsen	2e9b945baf	Fix bug that disabled vectorization for coeffMin/coeffMax.	2023-02-25 20:03:54 +00:00
Antonio Sánchez	bc5cdc7a67	Guard use of long double on GPU device.	2023-02-24 21:49:59 +00:00
Chip Kerchner	e4598fedbe	Fix compiler versions for certain instructions on Power.	2023-02-23 23:24:41 +00:00
Rasmus Munk Larsen	1c0a6cf228	Get rid of EIGEN_HAS_AVX512_MATH workaround.	2023-02-23 23:16:41 +00:00
Rasmus Munk Larsen	6bcd941ee3	Use pmsub in twoprod. This speeds up pow() on Skylake by ~1%.	2023-02-21 20:09:29 +00:00
Rasmus Munk Larsen	ce62177b5b	Vectorize atanh & add a missing definition and unit test for atan.	2023-02-21 03:14:05 +00:00
Charles Schlosser	049a144798	Add typed logicals	2023-02-18 01:23:47 +00:00
Chip Kerchner	e797974689	Add and enable Packet int divide for Power10.	2023-02-17 19:04:18 +00:00

7147 Commits