diff --git a/doc/C01_QuickStartGuide.dox b/doc/C01_QuickStartGuide.dox
index 2f8e9f5c8..3d98e14f5 100644
--- a/doc/C01_QuickStartGuide.dox
+++ b/doc/C01_QuickStartGuide.dox
@@ -24,6 +24,7 @@ namespace Eigen {
- \ref TutorialCoreTransposeAdjoint
- \ref TutorialCoreDotNorm
- \ref TutorialCoreTriangularMatrix
+ - \ref TutorialCoreSelfadjointMatrix
- \ref TutorialCoreSpecialTopics
\n
@@ -577,35 +578,88 @@ vec1.normalize();\endcode
\section TutorialCoreTriangularMatrix Dealing with triangular matrices
-Read/write access to special parts of a matrix can be achieved. See \link MatrixBase::part() const this \endlink for read access and \link MatrixBase::part() this \endlink for write access..
+Currently, Eigen does not provide any explicit triangular matrix class with dedicated storage.
+Instead, we can reference a triangular part of a square matrix or expression to perform special
+treatment on it. This is achieved by the class TriangularView and the MatrixBase::triangularView
+template function. Note that the opposite triangular part of the matrix is never referenced,
+so it can be used to, e.g., store a second triangular matrix.
<table class="tutorial_code">
<tr><td>
-Extract triangular matrices \n from a given matrix m:
+Reference a read/write triangular part of a given \n
+matrix (or expression) m with optional unit diagonal:
</td><td>\code
-m.part<Eigen::UpperTriangular>()
-m.part<Eigen::StrictlyUpperTriangular>()
-m.part<Eigen::UnitUpperTriangular>()
-m.part<Eigen::LowerTriangular>()
-m.part<Eigen::StrictlyLowerTriangular>()
-m.part<Eigen::UnitLowerTriangular>()\endcode
+m.triangularView<Eigen::UpperTriangular>()
+m.triangularView<Eigen::UnitUpperTriangular>()
+m.triangularView<Eigen::LowerTriangular>()
+m.triangularView<Eigen::UnitLowerTriangular>()\endcode
</td></tr>
<tr><td>
-Write to triangular parts \n of a matrix m:
+Writing to a specific triangular part:\n (only the referenced triangular part is evaluated)
</td><td>\code
-m1.part<Eigen::UpperTriangular>() = m2;
-m1.part<Eigen::StrictlyUpperTriangular>() = m2;
-m1.part<Eigen::LowerTriangular>() = m2;
-m1.part<Eigen::StrictlyLowerTriangular>() = m2;\endcode
+m1.triangularView<Eigen::LowerTriangular>() = m2 + m3 \endcode
</td></tr>
<tr><td>
-Special: take advantage of symmetry \n (selfadjointness) when copying \n an expression into a matrix
+Conversion to a dense matrix, setting the opposite triangular part to zero:
</td><td>\code
-m.part<Eigen::SelfAdjoint>() = someSelfadjointMatrix;
-m1.part<Eigen::SelfAdjoint>() = m2 + m2.adjoint(); // m2 + m2.adjoint() is selfadjoint \endcode
+m2 = m1.triangularView<Eigen::UnitUpperTriangular>()\endcode
</td></tr>
+<tr><td>
+Products:
+</td><td>\code
+m3 += s1 * m1.adjoint().triangularView<Eigen::UnitUpperTriangular>() * m2
+m3 -= s1 * m2.conjugate() * m1.adjoint().triangularView<Eigen::LowerTriangular>() \endcode
+</td></tr>
+<tr><td>
+Solving linear equations:\n(\f$ m_2 := m_1^{-1} m_2 \f$)
+</td><td>\code
+m1.triangularView<Eigen::UnitLowerTriangular>().solveInPlace(m2)
+m1.adjoint().triangularView<Eigen::UpperTriangular>().solveInPlace(m2)\endcode
+</td></tr>
+</table>
+
+
+\section TutorialCoreSelfadjointMatrix Dealing with symmetric/selfadjoint matrices
+
+Just as for triangular matrices, you can reference any triangular part of a square matrix or
+expression to treat it as a selfadjoint matrix and perform special, optimized operations.
+
+<table class="tutorial_code">
+<tr><td>
+Conversion to a dense matrix:
+</td><td>\code
+m2 = m.selfadjointView<Eigen::LowerTriangular>();\endcode
+</td></tr>
+<tr><td>
+Product with another general matrix or vector:
+</td><td>\code
+m3  = s1 * m1.conjugate().selfadjointView<Eigen::UpperTriangular>() * m3;
+m3 -= s1 * m3.adjoint() * m1.selfadjointView<Eigen::UpperTriangular>();\endcode
+</td></tr>
+<tr><td>
+Rank 1 and rank K update:
+</td><td>\code
+// fast version of m1 += s1 * m2 * m2.adjoint():
+m1.selfadjointView<Eigen::UpperTriangular>().rankUpdate(m2,s1);
+// fast version of m1 -= m2.adjoint() * m2:
+m1.selfadjointView<Eigen::LowerTriangular>().rankUpdate(m2.adjoint(),-1); \endcode
+</td></tr>
+<tr><td>
+Rank 2 update: (\f$ m += s u v^* + \bar s v u^* \f$)
+</td><td>\code
+m.selfadjointView<Eigen::LowerTriangular>().rankUpdate(u,v,s);
+\endcode
+</td></tr>
+<tr><td>
+Solving linear equations:\n(\f$ m_2 := m_1^{-1} m_2 \f$)
+</td><td>\code
+// via a standard Cholesky factorization
+m1.selfadjointView<Eigen::UpperTriangular>().llt().solveInPlace(m2);
+// via a Cholesky factorization with pivoting
+m1.selfadjointView<Eigen::LowerTriangular>().ldlt().solveInPlace(m2);
+\endcode
+</td></tr>
+</table>
diff --git a/doc/I02_HiPerformance.dox b/doc/I02_HiPerformance.dox
index 012e7d71b..8f23b5d19 100644
--- a/doc/I02_HiPerformance.dox
+++ b/doc/I02_HiPerformance.dox
@@ -3,6 +3,99 @@ namespace Eigen {
/** \page HiPerformance Advanced - Using Eigen with high performance
+In general, achieving good performance with Eigen does not require any special effort:
+simply write your expressions in the most high-level way. This is especially true
+for small fixed-size matrices. For large matrices, however, it may be useful to
+take some care when writing your expressions in order to minimize useless evaluations
+and optimize the performance.
+In this page we give a brief overview of Eigen's internal mechanism for simplifying
+and evaluating complex expressions, and we discuss its current limitations.
+In particular, we will focus on expressions matching level 2 and 3 BLAS routines, i.e.,
+all kinds of matrix products and triangular solvers.
+
+Indeed, Eigen implements a set of highly optimized routines which are very similar
+to BLAS's. Unlike BLAS, however, these routines are made available to the user via a
+high-level and natural API. Each of these routines can perform a wide variety of
+expressions in a single evaluation. Given an expression, the challenge is then to map
+it to a minimal set of such primitives. As explained later, this mechanism has some
+limitations, and knowing them will allow you to write faster code by making your
+expressions more Eigen friendly.
+
+\section GEMM General Matrix-Matrix product (GEMM)
+
+Let's start with the most common primitive: the matrix product of general dense matrices.
+In the BLAS world this corresponds to the GEMM routine. Our equivalent primitive can
+perform the following operation:
+\f$ C += \alpha op1(A) * op2(B) \f$
+where A, B, and C are column and/or row major matrices (or sub-matrices),
+alpha is a scalar value, and op1, op2 can be transpose, adjoint, conjugate, or the identity.
+When Eigen detects a matrix product, it analyzes both sides of the product to extract a
+unique scalar factor alpha, and for each side its effective storage (order and shape) and conjugate state.
+More precisely, each side is simplified by iteratively removing trivial expressions such as
+scalar multiples, negations, and conjugations. Transpose and Block expressions are not evaluated
+and only modify the storage order and shape. All other expressions are immediately evaluated.
+For instance, the following expression:
+\code m1 -= (s1 * m2.adjoint() * (-(s3*m3).conjugate()*s2)).lazy() \endcode
+is automatically simplified to:
+\code m1 += (s1*s2*conj(s3)) * m2.adjoint() * m3.conjugate() \endcode
+which exactly matches our GEMM routine.
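The sign and conjugation bookkeeping in this simplification can be checked on scalars, i.e., on 1x1 matrices, where adjoint() and conjugate() both reduce to complex conjugation. The sketch below is only a numerical sanity check of the identity, not Eigen code (the function name is made up):

```cpp
#include <cassert>
#include <complex>

using cd = std::complex<double>;

// For 1x1 "matrices", the original expression
//   m1 -= s1 * m2.adjoint() * (-(s3*m3).conjugate() * s2)
// and the simplified one
//   m1 += (s1*s2*conj(s3)) * m2.adjoint() * m3.conjugate()
// must apply the same update to m1.
bool same_update(cd s1, cd s2, cd s3, cd m2, cd m3) {
    // update applied by "m1 -= expr" is -expr:
    cd lhs = -(s1 * std::conj(m2) * (-std::conj(s3 * m3) * s2));
    // update applied by the simplified "m1 += expr":
    cd rhs = (s1 * s2 * std::conj(s3)) * std::conj(m2) * std::conj(m3);
    return std::abs(lhs - rhs) < 1e-9;
}
```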
+
+\subsection GEMM_Limitations Limitations
+Unfortunately, this simplification mechanism is not perfect yet and not all expressions which could be
+handled by a single GEMM-like call are correctly detected.
+
+
+<table class="tutorial_code">
+<tr>
+  <td>Not optimal expression</td>
+  <td>Evaluated as</td>
+  <td>Optimal version (single evaluation)</td>
+  <td>Comments</td>
+</tr>
+<tr>
+  <td>\code m1 += m2 * m3; \endcode</td>
+  <td>\code temp = m2 * m3; m1 += temp; \endcode</td>
+  <td>\code m1 += (m2 * m3).lazy(); \endcode</td>
+  <td>Use .lazy() to tell Eigen the result and right-hand sides do not alias.</td>
+</tr>
+<tr>
+  <td>\code m1 += (s1 * (m2 * m3)).lazy(); \endcode</td>
+  <td>\code temp = (m2 * m3).lazy(); m1 += s1 * temp; \endcode</td>
+  <td>\code m1 += (s1 * m2 * m3).lazy(); \endcode</td>
+  <td>This is because m2 * m3 is immediately evaluated by the scalar product.
+      Make sure the matrix product is the topmost expression.</td>
+</tr>
+<tr>
+  <td>\code m1 = m1 + m2 * m3; \endcode</td>
+  <td>\code temp = (m2 * m3).lazy(); m1 = m1 + temp; \endcode</td>
+  <td>\code m1 += (m2 * m3).lazy(); \endcode</td>
+  <td>Here there is no way to detect at compile time that the two m1 are the same,
+      and so the matrix product is immediately evaluated.</td>
+</tr>
+<tr>
+  <td>\code m1 += ((s1 * m2).transpose() * m3).lazy(); \endcode</td>
+  <td>\code temp = (s1*m2).transpose(); m1 += (temp * m3).lazy(); \endcode</td>
+  <td>\code m1 += (s1 * m2.transpose() * m3).lazy(); \endcode</td>
+  <td>This is because our expression analyzer stops at the first expression which cannot
+      be converted to a scalar multiple of a conjugate, and therefore the nested scalar
+      multiple cannot be properly extracted.</td>
+</tr>
+<tr>
+  <td>\code m1 += (m2.conjugate().transpose() * m3).lazy(); \endcode</td>
+  <td>\code temp = m2.conjugate().transpose(); m1 += (temp * m3).lazy(); \endcode</td>
+  <td>\code m1 += (m2.adjoint() * m3).lazy(); \endcode</td>
+  <td>Same reason: use .adjoint() instead of .conjugate().transpose().</td>
+</tr>
+<tr>
+  <td>\code m1 += ((s1*m2).block(....) * m3).lazy(); \endcode</td>
+  <td>\code temp = (s1*m2).block(....); m1 += (temp * m3).lazy(); \endcode</td>
+  <td>\code m1 += (s1 * m2.block(....) * m3).lazy(); \endcode</td>
+  <td>Same reason.</td>
+</tr>
+</table>
+Of course, all these remarks hold for all the other kinds of products described in the following paragraphs.
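The reason .lazy() must only be used when the destination does not alias an operand can be seen directly: a matrix product written in place into one of its own operands reads values it has already overwritten. A minimal sketch with 2x2 row-major matrices, written in plain C++ without Eigen:

```cpp
#include <cassert>
#include <vector>

// C = A * B for 2x2 row-major matrices, writing directly into C.
// If C aliases A, entries of A are overwritten while still needed.
void matmul_into(const double* A, const double* B, double* C) {
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j) {
            double s = 0;
            for (int k = 0; k < 2; ++k) s += A[i*2 + k] * B[k*2 + j];
            C[i*2 + j] = s;  // hazard: clobbers A[i*2+j] when C == A
        }
}

// Returns true if computing m1 = m1 * m2 in place gives a different
// (wrong) result than first evaluating into a temporary.
bool aliasing_matters() {
    std::vector<double> m1 = {1, 2, 3, 4}, m2 = {5, 6, 7, 8};
    std::vector<double> temp(4);
    matmul_into(m1.data(), m2.data(), temp.data());        // correct, via temp
    std::vector<double> inplace = {1, 2, 3, 4};
    matmul_into(inplace.data(), m2.data(), inplace.data()); // aliased, wrong
    return inplace != temp;
}
```

This is exactly why, by default, Eigen evaluates a product into a temporary: .lazy() skips that temporary and is only safe when the user guarantees no aliasing.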
+
+
+
+
| BLAS equivalent routine |
@@ -20,7 +113,7 @@ namespace Eigen {
GEMM |
m1 += s1 * m2.adjoint() * m3 |
m1 += (s1 * m2).adjoint() * m3 |
-This is because our expression analyser stops at the first transpose expression and cannot extract the nested scalar multiple. |
+This is because our expression analyzer stops at the first transpose expression and cannot extract the nested scalar multiple. |
| GEMM |
@@ -36,7 +129,7 @@ namespace Eigen {
| SYR |
-m.sefadjointView().rankUpdate(v,s) |
+m.selfadjointView().rankUpdate(v,s) |
|
Computes m += s * v * v.adjoint() |