CFD/eigen

Author	SHA1	Message	Date
Rasmus Munk Larsen	f6c51d9209	Fix missing header inclusion and colliding definitions for half type casting, which broke build with -march=native on Haswell/Skylake.	2019-08-30 14:03:29 -07:00
Rasmus Munk Larsen	b021cdea6d	Clean up float16 a.k.a. Eigen::half support in Eigen. Move the definition of half to Core/arch/Default and move arch-specific packet ops to their respective sub-directories.	2019-08-27 11:30:31 -07:00
Mehdi Goli	16a56b2ddd	[SYCL] This PR adds the minimum modifications to Eigen core required to run Eigen unsupported modules on devices supporting SYCL. * Adding SYCL memory model * Enabling/Disabling SYCL backend in Core * Supporting Vectorization	2019-06-27 12:25:09 +01:00
Gael Guennebaud	c53eececb0	Implement AVX512 vectorization of std::complex<float/double>	2018-12-06 15:58:06 +01:00
Gael Guennebaud	3fba59ea59	temporarily re-disable SSE/AVX vectorization of complex<> on AVX512 -> this needs to be fixed though!	2018-12-06 00:13:26 +01:00
Gael Guennebaud	f91500d303	Fix pandnot order in AVX512	2018-11-30 14:32:06 +01:00
Gael Guennebaud	aa6097395b	Add missing SSE/AVX type-casting in AVX512 mode	2018-11-28 16:09:08 +01:00
Gael Guennebaud	b131a4db24	bug #1631 : fix compilation with ARM NEON and clang, and cleanup the weird pshiftright_and_cast and pcast_and_shiftleft functions.	2018-11-27 23:45:00 +01:00
Gael Guennebaud	fa7fd61eda	Unify SSE/AVX psin functions. It is based on the SSE version which is much more accurate, though very slightly slower. This changeset also includes the following required changes: - add packet-float to packet-int type traits - add packet float<->int reinterpret casts - add faster pselect for AVX based on blendv	2018-11-27 22:41:51 +01:00
Christian von Schultz	4a40b3785d	Collapsed revision (based on pull request PR-325) * Support compiling without IO streams Add the preprocessor definition EIGEN_NO_IO which, if defined, disables all use of the IO streams part of the standard library.	2018-10-22 21:14:40 +02:00
Gael Guennebaud	1dd1f8e454	bug #65 : add vectorization of partial reductions along the outer-dimension, for instance: colmajor_mat.rowwise().mean()	2018-10-09 23:36:50 +02:00
Gael Guennebaud	b0c66adfb1	bug #231 : initial implementation of STL iterators for dense expressions	2018-10-01 23:21:37 +02:00
Gael Guennebaud	a488d59787	merge with default Eigen	2018-09-21 11:51:49 +02:00
Mehdi Goli	01358300d5	Creating separate SYCL required PR for uncontroversial files.	2018-08-03 16:59:15 +01:00
Rasmus Munk Larsen	2ebcb911b2	Add pcast packet op for NEON.	2018-07-26 14:28:48 -07:00
Alexey Frunze	1f523e7304	Add MIPS changes missing from previous merge.	2018-07-18 12:27:50 -07:00
Gael Guennebaud	308725c3c9	More clearly disable the inclusion of src/Core/arch/CUDA/Complex.h without CUDA	2018-07-18 13:51:36 +02:00
Gael Guennebaud	86d9c0255c	Forward declaring std::array does not work with all std libs, so let's just include <array>	2018-07-13 13:06:44 +02:00
Gael Guennebaud	006e18e52b	Cleanup the mess in Eigen/Core by moving CUDA/HIP stuff at more appropriate places (Macros.h), and alignment/vectorization logic is now in util/ConfigureVectorization.h	2018-07-12 16:57:41 +02:00
Deven Desai	876f392c39	Updates corresponding to the latest round of PR feedback The major changes are 1. Moving CUDA/PacketMath.h to GPU/PacketMath.h 2. Moving CUDA/MathFunctions.h to GPU/MathFunction.h 3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h The above three changes effectively enable the Eigen "Packet" layer for the HIP platform 4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic") 5. Updating the "EIGEN_DEVICE_FUNC" marking in some places The change has been tested on the HIP and CUDA platforms.	2018-07-11 10:39:54 -04:00
Deven Desai	38807a2575	merging updates from upstream	2018-07-11 09:17:33 -04:00
Deven Desai	b6cc0961b1	updates based on PR feedback There are two major changes (and a few minor ones which are not listed here...see PR discussion for details) 1. Eigen::half implementations for HIP and CUDA have been merged. This means that - `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h` - `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h` - `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h` After this change the `HIP/hcc` directory only contains one file `math_constants.h`. That will go away too once that file becomes a part of the HIP install. 2. new macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate. - `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)` - `EIGEN_GPU_DEVICE_COMPILE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)` - `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 or EIGEN_HAS_HIP_FP16)`	2018-06-14 10:21:54 -04:00
Andrea Bocci	f7124b3e46	Extend CUDA support to matrix inversion and selfadjointeigensolver	2018-06-11 18:33:24 +02:00
Deven Desai	8fbd47052b	Adding support for using Eigen in HIP kernels. This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs. Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during Eigen compile (irrespective of whether or not the underlying compiler is CUDACC/NVCC, for e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to using the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor) Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests.	2018-06-06 10:12:58 -04:00
Gael Guennebaud	647b724a36	Define pcast<> for SSE types even when AVX is enabled. (otherwise float are silently reinterpreted as int instead of being converted)	2018-05-29 20:46:46 +02:00
Gael Guennebaud	40b4bf3d32	AVX512: _mm512_rsqrt28_ps is available for AVX512ER only	2018-04-03 14:36:27 +02:00
luz.paz	e3912f5e63	MIsc. source and comment typos Found using `codespell` and `grep` from downstream FreeCAD	2018-03-11 10:01:44 -04:00
nluehr	f9bdcea022	For cuda 9.1 replace math_functions.hpp with cuda_runtime.h	2017-12-18 16:51:15 -08:00
Benoit Steiner	a4089991eb	Added support for CUDA 9.0.	2017-08-31 02:49:39 +00:00
Gael Guennebaud	21633e585b	bug #1462 : remove all occurences of the deprecated __CUDACC_VER__ macro by introducing EIGEN_CUDACC_VER	2017-08-24 11:06:47 +02:00
Gael Guennebaud	b0f55ef85a	merge	2017-02-21 17:04:10 +01:00
Gael Guennebaud	3d200257d7	Add support for automatic-size deduction in reshaped, e.g.: mat.reshaped(4,AutoSize); <-> mat.reshaped(4,mat.size()/4);	2017-02-21 15:57:25 +01:00
Gael Guennebaud	e7ebe52bfb	bug #1391 : include IO.h before DenseBase to enable its usage in DenseBase plugins.	2017-02-13 09:46:20 +01:00
Gael Guennebaud	24409f3acd	Use fix<> API to specify compile-time reshaped sizes.	2017-01-29 15:20:35 +01:00
Gael Guennebaud	9036cda364	Cleanup intitial reshape implementation: - reshape -> reshaped - make it compatible with evaluators.	2017-01-29 14:57:45 +01:00
Gael Guennebaud	0e89baa5d8	import yoco xiao's work on reshape	2017-01-29 14:29:31 +01:00
Gael Guennebaud	25a1703579	Merged in ggael/eigen-flexidexing (pull request PR-294) generalized operator() for indexed access and slicing	2017-01-26 08:04:23 +00:00
Gael Guennebaud	b0db4eff36	bug #1382 : move using std::size_t/ptrdiff_t to Eigen's namespace (still better than the global namespace!)	2017-01-23 22:03:57 +01:00
Gael Guennebaud	7691723e34	Add support for fixed-value in symbolic expression, c++11 only for now.	2017-01-19 19:25:29 +01:00
Benoit Steiner	924600a0e8	Made sure that enabling avx2 instructions enables avx and sse instructions as well.	2017-01-19 09:54:48 -08:00
Gael Guennebaud	4989922be2	Add support for symbolic expressions as arguments of operator()	2017-01-16 22:21:23 +01:00
Gael Guennebaud	752bd92ba5	Large code refactoring: - generalize some utilities and move them to Meta (size(), array_size()) - move handling of all and single indices to IndexedViewHelper.h - several cleanup changes	2017-01-11 17:24:02 +01:00
Gael Guennebaud	b1dc0fa813	Move fix and symbolic to their own file, and improve doxygen compatibility	2017-01-11 14:28:28 +01:00
Gael Guennebaud	ac7e4ac9c0	Initial commit to add a generic indexed-based view of matrices. This version already works as a read-only expression. Numerous refactoring, renaming, extension, tuning passes are expected...	2017-01-06 00:01:44 +01:00
Gael Guennebaud	3182bdbae6	Disable vectorization when compiled by nvcc, even is EIGEN_NO_CUDA is defined	2017-07-17 11:01:28 +02:00
Gael Guennebaud	bbd97b4095	Add a EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH aliases	2017-07-17 01:02:51 +02:00
Gael Guennebaud	b240080e64	bug #1436 : fix compilation of Jacobi rotations with ARM NEON, some specializations of internal::conj_helper were missing.	2017-06-15 10:16:30 +02:00
Benoit Steiner	fb1d0138ec	Include SSE packet instructions when compiling with avx512 enabled.	2016-12-19 07:32:48 -08:00
Benoit Steiner	81151bd474	Fixed merge conflicts	2016-11-19 19:12:59 -08:00
Benoit Steiner	7c30078b9f	Merged eigen/eigen into default	2016-11-17 22:53:37 -08:00
Luke Iwanski	c5130dedbe	Specialised basic math functions for SYCL device.	2016-11-17 11:47:13 +00:00
Benoit Steiner	f2e8b73256	Enable the use of AVX512 instruction by default	2016-11-16 21:28:04 -08:00
Benoit Steiner	d46a36cc84	Merged eigen/eigen into default	2016-11-04 18:22:55 -07:00
Mehdi Goli	0ebe3808ca	Removed the sycl include from Eigen/Core and moved it to Unsupported/Eigen/CXX11/Tensor; added TensorReduction for sycl (full reduction and partial reduction); added TensorReduction test case for sycl (full reduction and partial reduction); fixed the tile size on TensorSyclRun.h based on the device max work group size;	2016-11-04 18:18:19 +00:00
Benoit Steiner	5c3995769c	Improved AVX512 configuration	2016-11-03 04:50:28 -07:00
Benoit Steiner	ca0ba0d9a4	Improved AVX512 support	2016-11-03 04:00:49 -07:00
Benoit Steiner	c80587c92b	Merged eigen/eigen into default	2016-11-03 03:55:11 -07:00
Benoit Steiner	0585b2965d	Disable vectorization on device only when compiling for sycl	2016-11-02 11:44:27 -07:00
Benoit Steiner	cf20b30d65	Merge latest updates from trunk	2016-10-20 09:42:05 -07:00
Benoit Steiner	d3943cd50c	Fixed a few typos in the ternary tensor expressions types	2016-10-19 12:56:12 -07:00
Mehdi Goli	8fb162fc85	Fixing the typo regarding missing #if needed for proper handling of exceptions in Eigen/Core.	2016-10-16 12:52:34 +01:00
Mehdi Goli	15380f9a87	Applyiing Benoit's comment to return the missing line back in Eigen/Core	2016-10-14 16:39:41 +01:00
Mehdi Goli	524fa4c46f	Reducing the code by generalising sycl backend functions/structs.	2016-10-14 12:09:55 +01:00
Benoit Steiner	9f3276981c	Enabling AVX512 should also enable AVX2.	2016-10-06 10:29:48 -07:00
Benoit Steiner	78b569f685	Merged latest updates from trunk	2016-10-05 18:48:55 -07:00
Benoit Steiner	ae1385c7e4	Pull the latest updates from trunk	2016-10-05 14:54:36 -07:00
Benoit Steiner	409e887d78	Added support for constand std::complex numbers on GPU	2016-10-03 11:06:24 -07:00
RJ Ryan	b2c6dc48d9	Add CUDA-specific std::complex<T> specializations for scalar_sum_op, scalar_difference_op, scalar_product_op, and scalar_quotient_op.	2016-09-20 07:18:20 -07:00
Luke Iwanski	b91e021172	Merged with default.	2016-09-19 14:03:54 +01:00
Luke Iwanski	cb81975714	Partial OpenCL support via SYCL compatible with ComputeCpp CE.	2016-09-19 12:44:13 +01:00
Gael Guennebaud	a4c266f827	Factorize the 4 copies of tanh implementations, make numext::tanh consistent with array::tanh, enable fast tanh in fast-math mode only.	2016-08-23 14:23:08 +02:00
Gael Guennebaud	2f7e2614e7	bug #1232 : refactor special functions as a new SpecialFunctions module, currently in unsupported/.	2016-07-08 11:13:55 +02:00
Eugene Brevdo	39baff850c	Add TernaryFunctors and the betainc SpecialFunction. TernaryFunctors and their executors allow operations on 3-tuples of inputs. API fully implemented for Arrays and Tensors based on binary functors. Ported the cephes betainc function (regularized incomplete beta integral) to Eigen, with support for CPU and GPU, floats, doubles, and half types. Added unit tests in array.cpp and cxx11_tensor_cuda.cu Collapsed revision * Merged helper methods for betainc across floats and doubles. * Added TensorGlobalFunctions with betainc(). Removed betainc() from TensorBase. * Clean up CwiseTernaryOp checks, change igamma_helper to cephes_helper. * betainc: merge incbcf and incbd into incbeta_cfe. and more cleanup. * Update TernaryOp and SpecialFunctions (betainc) based on review comments.	2016-06-02 17:04:19 -07:00
Gael Guennebaud	8d97ba6b22	bug #725 : make move ctor/assignment noexcept.	2016-06-03 14:28:25 +02:00
Benoit Steiner	3d0741f027	Include mmintrin.h to make it possible to use mmx instructions when needed. For example, this will enable the definition of a half packet for the Packet4f type.	2016-05-23 20:43:48 -07:00
Benoit Steiner	7d980d74e5	Started to vectorize the processing of 16bit floats on CPU.	2016-05-23 15:21:40 -07:00
Benoit Steiner	07a247dcf4	Pulled latest updates from upstream	2016-04-29 13:41:26 -07:00
Benoit Steiner	80200a1828	Don't attempt to leverage the _cvtss_sh and _cvtsh_ss instructions when compiling with clang since it's unclear which versions of clang actually support these instruction.	2016-04-20 12:10:27 -07:00
Benoit Steiner	1d0238375d	Made sure all the required header files are included when trying to use fp16	2016-04-19 17:44:12 -07:00
Rasmus Larsen	6498dadc2f	Merged eigen/eigen into default	2016-04-11 17:42:05 -07:00
Benoit Steiner	d6e596174d	Pull latest updates from upstream	2016-04-11 17:20:17 -07:00
Gael Guennebaud	fec4c334ba	Remove all references to MKL in BLAS wrappers.	2016-04-11 16:04:09 +02:00
Rasmus Larsen	c34e55c62b	Merged eigen/eigen into default	2016-04-07 20:23:03 -07:00
Benoit Steiner	532fdf24cb	Added support for hardware conversion between fp16 and full floats whenever possible.	2016-04-06 17:11:31 -07:00
Konstantinos Margaritis	2bba4ee2cf	Merged kmargar/eigen/tip into default	2016-04-05 22:22:08 +03:00
Konstantinos Margaritis	988344daf1	enable the other includes as well	2016-04-05 05:59:30 -04:00
Rasmus Larsen	30242b7565	Merged eigen/eigen into default	2016-04-01 17:19:36 -07:00
Rasmus Munk Larsen	1aa89fb855	Add matrix condition estimator module that implements the Higham/Hager algorithm from http://www.maths.manchester.ac.uk/~higham/narep/narep135.pdf used in LPACK. Add rcond() methods to FullPivLU and PartialPivLU.	2016-04-01 10:27:59 -07:00
Benoit Steiner	4f1a7e51c1	Pull math functions from the global namespace only when compiling cuda code with nvcc. When compiling with clang, we want to use the std namespace.	2016-03-30 17:59:49 -07:00
Konstantinos Margaritis	ed6b9d08f1	some primitives ported, but missing intrinsics and crash with asm() are a problem	2016-03-27 18:47:49 -04:00
Benoit Steiner	048c4d6efd	Made half floats usable on hardware that doesn't support them natively.	2016-03-11 17:21:42 -08:00
Benoit Steiner	ac5d706a94	Added support for simple coefficient wise tensor expression using half floats on CUDA devices	2016-02-19 08:19:12 +00:00
Benoit Steiner	0606a0a39b	FP16 on CUDA are only available starting with cuda 7.5. Disable them when using an older version of CUDA	2016-02-18 23:15:23 -08:00
Benoit Steiner	7151bd8768	Reverted unintended changes introduced by a bad merge	2016-02-19 06:20:50 +00:00
Benoit Steiner	17b9fbed34	Added preliminary support for half floats on CUDA GPU. For now we can simply convert floats into half floats and vice versa	2016-02-19 06:16:07 +00:00
Benoit Steiner	6c9cf117c1	Fixed indentation	2016-02-04 10:34:10 -08:00
Benoit Steiner	9a415fb1e2	Preliminary support for AVX512	2015-12-10 15:34:57 -08:00
Eugene Brevdo	fa4f933c0f	Add special functions to Eigen: lgamma, erf, erfc. Includes CUDA support and unit tests.	2015-12-07 15:24:49 -08:00
Gael Guennebaud	0bb12fa614	Add LU::transpose().solve() and LU::adjoint().solve() API.	2015-12-01 14:38:47 +01:00
Gael Guennebaud	d866279364	Clean a bit the implementation of inverse permutations	2015-10-08 18:36:39 +02:00

461 Commits