%=============================================================================
%=============================================================================

\chapter{Linear-Algebraic System Interface (IJ)}
\label{ch-IJ}

The \code{IJ} interface described in this chapter is the lowest common
denominator for specifying linear systems in \hypre{}.  This interface
provides access to general sparse-matrix solvers in \hypre{}, not
to the specialized solvers that require more problem information.

%-----------------------------------------------------------------------------

\section{IJ Matrix Interface}

As with the other interfaces in \hypre{}, the \code{IJ} interface
expects to get data in distributed form because this is the only
scalable approach for assembling matrices on thousands of processes.
Matrices are assumed to be distributed by blocks of rows as follows:
\begin{equation}
\left[
\begin{array}{c}
~~~~~~~~~~ A_0 ~~~~~~~~~~ \\
A_1 \\
\vdots \\
A_{P-1}
\end{array}
\right]
\end{equation}
In the above example, the matrix is distributed across the $P$
processes, $0, 1, \ldots, P-1$, by blocks of rows.  Each submatrix $A_p$
is ``owned'' by a single process, and its first and last row numbers
are given by the global indices \code{ilower} and \code{iupper} in the
\code{Create()} call below.

The following example code illustrates the basic usage of the
\code{IJ} interface for building matrices:
\begin{display}
\begin{verbatim}

MPI_Comm comm;
HYPRE_IJMatrix ij_matrix;
HYPRE_ParCSRMatrix parcsr_matrix;
int ilower, iupper;
int jlower, jupper;
int nrows;
int *ncols;
int *rows;
int *cols;
double *values;

HYPRE_IJMatrixCreate(comm, ilower, iupper, jlower, jupper, &ij_matrix);
HYPRE_IJMatrixSetObjectType(ij_matrix, HYPRE_PARCSR);
HYPRE_IJMatrixInitialize(ij_matrix);

/* set matrix coefficients */
HYPRE_IJMatrixSetValues(ij_matrix, nrows, ncols, rows, cols, values);
...
/* add-to matrix coefficients, if desired */
HYPRE_IJMatrixAddToValues(ij_matrix, nrows, ncols, rows, cols, values);
...

HYPRE_IJMatrixAssemble(ij_matrix);
HYPRE_IJMatrixGetObject(ij_matrix, (void **) &parcsr_matrix);

\end{verbatim}
\end{display}
The \code{Create()} routine creates an empty matrix object that lives
on the \code{comm} communicator.  This is a collective call (i.e.,
must be called on all processes from a common synchronization point),
with each process passing its own row extents, \code{ilower} and
\code{iupper}.  The row partitioning must be contiguous, i.e.,
\code{iupper} for process \code{i} must equal \code{ilower}$-1$ for
process \code{i}$+1$.  Note that this allows matrices to have 0- or
1-based indexing.  The parameters \code{jlower} and \code{jupper}
define a column partitioning, and should match \code{ilower} and
\code{iupper} when solving square linear systems.  See the Reference
Manual for more information.

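As a minimal sketch of how each process might choose its row extents,
the following C helper splits $N$ global rows, numbered from 0, into $P$
contiguous blocks.  The helper name \code{block_extents} and the policy
of giving one extra row to the first $N \bmod P$ processes are
assumptions made for illustration; \hypre{} only requires that the
blocks be contiguous.  In an MPI program, \code{p} would typically be
the rank of the calling process, and for a square system the same
extents would also be passed as \code{jlower} and \code{jupper}.
\begin{display}
\begin{verbatim}

/* Hypothetical helper (not part of hypre): compute the inclusive
   global row range [ilower, iupper] owned by process p out of P,
   for N rows numbered from 0.  The first N % P processes receive
   one extra row, so block sizes differ by at most one. */
void block_extents(int N, int P, int p, int *ilower, int *iupper)
{
   int base  = N / P;   /* rows every process receives    */
   int extra = N % P;   /* leftover rows to be spread out */

   *ilower = p * base + (p < extra ? p : extra);
   *iupper = *ilower + base + (p < extra ? 1 : 0) - 1;
}

\end{verbatim}
\end{display}
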
The \code{SetObjectType()} routine sets the underlying matrix object
type to \code{HYPRE_PARCSR} (this is the only object type currently
supported).  The \code{Initialize()} routine indicates that the matrix
coefficients (or values) are ready to be set.  This routine may or may
not involve the allocation of memory for the coefficient data,
depending on the implementation.  The optional \code{SetRowSizes()}
and \code{SetDiagOffdSizes()} routines, mentioned later in this
chapter and in the Reference Manual, should be called before this
step.

The \code{SetValues()} routine sets matrix values for some number of
rows (\code{nrows}) and some number of columns in each row
(\code{ncols}).  The actual row and column numbers of the matrix
\code{values} to be set are given by \code{rows} and \code{cols}.
The coefficients can be modified with the
\code{AddToValues()} routine.  If \code{AddToValues()} is used to add
to a value that did not previously exist, it will set this value.
Note that while \code{AddToValues()}
will add to values on other processors, \code{SetValues()} does not set
values on other processors.  Instead, if a user calls \code{SetValues()}
on processor $i$ to set a matrix coefficient belonging to processor $j$,
processor $i$ will
erase all previous occurrences of this matrix coefficient,
so they will not contribute to this coefficient on processor $j$.
The actual coefficient has to be set on processor $j$.

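To make the argument layout concrete, the following sketch fills the
\code{SetValues()} arrays for the locally owned rows of a tridiagonal
matrix with stencil $(-1, 2, -1)$.  The helper
\code{fill_laplacian_rows}, the choice of matrix, and the 0-based
indexing are assumptions made for illustration.  The entries of
\code{cols} and \code{values} are stored row after row, with
\code{ncols[k]} consecutive entries belonging to row \code{rows[k]}.
\begin{display}
\begin{verbatim}

/* Hypothetical helper (not part of hypre): fill the SetValues()
   arguments for rows ilower..iupper of the N x N tridiagonal matrix
   with stencil (-1, 2, -1), using 0-based global indices.
   The caller provides ncols[] and rows[] of length nrows, and
   cols[] and values[] of length 3*nrows.
   Returns the total number of coefficients written. */
int fill_laplacian_rows(int N, int ilower, int iupper,
                        int *ncols, int *rows, int *cols, double *values)
{
   int i, k = 0;

   for (i = ilower; i <= iupper; i++)
   {
      int first = k;   /* position of this row's first entry */

      if (i > 0)     { cols[k] = i - 1; values[k] = -1.0; k++; }
      cols[k] = i; values[k] = 2.0; k++;
      if (i < N - 1) { cols[k] = i + 1; values[k] = -1.0; k++; }

      rows[i - ilower]  = i;
      ncols[i - ilower] = k - first;
   }
   return k;
}

/* With nrows = iupper - ilower + 1, the arrays can then be passed to
   HYPRE_IJMatrixSetValues(ij_matrix, nrows, ncols, rows, cols, values); */

\end{verbatim}
\end{display}
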
The \code{Assemble()} routine is a collective call, and finalizes the
matrix assembly, making the matrix ``ready to use''.  The
\code{GetObject()} routine retrieves the built matrix object so that
it can be passed on to \hypre{} solvers that use the \code{ParCSR}
internal storage format.  Note that this is not an expensive routine;
the matrix already exists in \code{ParCSR} storage format, and the
routine simply returns a ``handle'' or pointer to it.  Although we
currently only support one underlying data storage format, in the
future several different formats may be supported.

One can preset the row sizes of the matrix in order to reduce the
execution time for the matrix specification.  One can specify the
total number of coefficients for each row, the number of coefficients
in the row that couple the diagonal unknown to (\code{Diag}) unknowns
in the same processor domain, and the number of coefficients in the
row that couple the diagonal unknown to (\code{Offd}) unknowns in
other processor domains:

\begin{display}
\begin{verbatim}

HYPRE_IJMatrixSetRowSizes(ij_matrix, sizes);
HYPRE_IJMatrixSetDiagOffdSizes(ij_matrix, diag_sizes, offdiag_sizes);

\end{verbatim}
\end{display}

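As a rough illustration of how the \code{Diag} and \code{Offd} counts
might be computed, the following sketch counts them for the locally
owned rows of a tridiagonal matrix.  The helper
\code{count_diag_offd} is hypothetical, and the sketch assumes a square
system whose column partitioning matches the row partitioning, 0-based
indices, and counts that include the diagonal entry itself.
\begin{display}
\begin{verbatim}

/* Hypothetical helper (not part of hypre): for rows ilower..iupper of
   an N x N tridiagonal matrix, count the coefficients whose column is
   owned locally (Diag) and those owned by other processes (Offd).
   Assumes the column partitioning equals the row partitioning. */
void count_diag_offd(int N, int ilower, int iupper,
                     int *diag_sizes, int *offdiag_sizes)
{
   int i, j;

   for (i = ilower; i <= iupper; i++)
   {
      int d = 0, o = 0;

      for (j = i - 1; j <= i + 1; j++)
      {
         if (j < 0 || j >= N) continue;   /* outside the matrix */
         if (j >= ilower && j <= iupper)
            d++;                          /* column owned locally */
         else
            o++;                          /* column owned elsewhere */
      }
      diag_sizes[i - ilower]    = d;
      offdiag_sizes[i - ilower] = o;
   }
}

\end{verbatim}
\end{display}
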
Once the matrix has been assembled, the sparsity pattern cannot be
altered without completely destroying the matrix object and starting
from scratch.  However, one can modify the matrix values of an already
assembled matrix.  To do this, first call the \code{Initialize()}
routine to re-initialize the matrix, then set or add-to values as
before, and call the \code{Assemble()} routine to re-assemble before
using the matrix.  Re-initialization and re-assembly are very cheap,
essentially a no-op in the current implementation of the code.

%-----------------------------------------------------------------------------

\section{IJ Vector Interface}

The following example code illustrates the basic usage of the
\code{IJ} interface for building vectors:

\begin{display}
\begin{verbatim}
MPI_Comm comm;
HYPRE_IJVector ij_vector;
HYPRE_ParVector par_vector;
int jlower, jupper;
int nvalues;
int *indices;
double *values;

HYPRE_IJVectorCreate(comm, jlower, jupper, &ij_vector);
HYPRE_IJVectorSetObjectType(ij_vector, HYPRE_PARCSR);
HYPRE_IJVectorInitialize(ij_vector);

/* set vector values */
HYPRE_IJVectorSetValues(ij_vector, nvalues, indices, values);
...

HYPRE_IJVectorAssemble(ij_vector);
HYPRE_IJVectorGetObject(ij_vector, (void **) &par_vector);

\end{verbatim}
\end{display}
The \code{Create()} routine creates an empty vector object that lives
on the \code{comm} communicator.  This is a collective call, with each
process passing its own index extents, \code{jlower} and
\code{jupper}.  The names of these extent parameters begin with a
\code{j} because we typically think of matrix-vector multiplies as
the fundamental operation involving both matrices and vectors.  For
matrix-vector multiplies, the vector partitioning should match the
column partitioning of the matrix (which also uses the \code{j}
notation).  For linear system solves, these extents will typically
match the row partitioning of the matrix as well.

The \code{SetObjectType()} routine sets the underlying vector storage
type to \code{HYPRE_PARCSR} (this is the only storage type currently
supported).  The \code{Initialize()} routine indicates that the vector
coefficients (or values) are ready to be set.  This routine may or may
not involve the allocation of memory for the coefficient data,
depending on the implementation.

The \code{SetValues()} routine sets the vector \code{values} for some
number (\code{nvalues}) of \code{indices}.
The values can be modified with the
\code{AddToValues()} routine.
Note that while \code{AddToValues()}
will add to values on other processors, \code{SetValues()} does not set
values on other processors.  Instead, if a user calls \code{SetValues()}
on processor $i$ to set a value belonging to processor $j$,
processor $i$ will
erase all previous occurrences of this vector value,
so they will not contribute to this value on processor $j$.
The actual value has to be set on processor $j$.

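As a small sketch of the vector arguments, the following helper fills
\code{indices} and \code{values} for the locally owned entries
\code{jlower} through \code{jupper} with a constant.  The helper name
\code{fill_local_vector} and the constant right-hand side are
assumptions made for illustration.
\begin{display}
\begin{verbatim}

/* Hypothetical helper (not part of hypre): list the locally owned
   global indices jlower..jupper and give each the same value.
   The caller provides indices[] and values[] of length
   jupper - jlower + 1.  Returns nvalues for the SetValues() call. */
int fill_local_vector(int jlower, int jupper, double value,
                      int *indices, double *values)
{
   int j, nvalues = 0;

   for (j = jlower; j <= jupper; j++)
   {
      indices[nvalues] = j;
      values[nvalues]  = value;
      nvalues++;
   }
   return nvalues;
}

/* The result can then be passed to
   HYPRE_IJVectorSetValues(ij_vector, nvalues, indices, values); */

\end{verbatim}
\end{display}
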
The \code{Assemble()} routine is a trivial collective call, and
finalizes the vector assembly, making the vector ``ready to use''.
The \code{GetObject()} routine retrieves the built vector object so
that it can be passed on to \hypre{} solvers that use the
\code{ParVector} internal storage format.

Vector values can be modified in much the same way as with matrices by
first re-initializing the vector with the \code{Initialize()} routine.

%-----------------------------------------------------------------------------

\section{A Scalable Interface}

As explained in the previous sections, problem data is passed to the
\hypre{} library in its distributed form.  However, as is typically
the case for a parallel software library, some information
regarding the global distribution of the data will be needed for
\hypre{} to perform its function.
In particular, a solver algorithm requires that a processor obtain
``nearby'' data from other processors in order to complete the solve.
While a processor may easily determine what data it needs from other
processors, it may not know which processor owns the data it needs.
Therefore, processors must determine their communication partners, or
neighbors.

The straightforward approach to determining neighbors involves
constructing a global partition of the data, which requires $O(P)$ data
storage.  This storage requirement (as well as the costs of many of the
associated algorithms that access the storage) is not scalable for
machines such as BlueGene/L with tens of thousands of processors.  The
problem of determining inter-processor communication (in the absence
of a global description of the data) in a scalable manner is addressed
in \cite{assumedpartition06}.  When using \hypre{} on many thousands of
processors, compiling the library with the ``no global partition''
option as detailed in Section \ref{config_options} improves
scalability as shown in
\cite{assumedpartition06}.  Note that this optimization is only
recommended when using at least several thousand processors and is
most beneficial when using tens of thousands of processors.

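The storage contrast can be sketched in C as follows.  The first helper
looks up the owner of a global row in an explicit array of $P+1$
partition boundaries, which is the $O(P)$ storage discussed above.  The
second computes an owner from a formula alone, in the spirit of the
approach in \cite{assumedpartition06}.  Both helper names and the
particular formula (block $p$ is assumed to start at row
$\lfloor Np/P \rfloor$) are assumptions made for illustration and are
not taken from the \hypre{} source.  When the actual partition differs
from the assumed one, the formula gives only a first guess at the
owner.
\begin{display}
\begin{verbatim}

/* Hypothetical helper: owner lookup with a global partition.
   starts[] has P+1 entries and process p owns rows
   starts[p] .. starts[p+1]-1.  Needs O(P) storage per process. */
int owner_from_global_partition(const int *starts, int P, int row)
{
   int lo = 0, hi = P - 1;

   while (lo < hi)   /* binary search for the last start <= row */
   {
      int mid = (lo + hi + 1) / 2;
      if (starts[mid] <= row) lo = mid;
      else                    hi = mid - 1;
   }
   return lo;
}

/* Hypothetical helper: owner computed from a formula, with no stored
   partition.  Assumes block p starts at row floor(N*p/P), rows are
   numbered from 0, and 0 <= row < N. */
int assumed_owner(int N, int P, int row)
{
   return (int) (((long long) P * (row + 1) - 1) / N);
}

\end{verbatim}
\end{display}
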