# Introduction

The linear equation solver LINSOL is designed to solve a l-dimensional linear system MAT*x = b with a very large and sparse l x l real matrix MAT. In this manual page we explain the structure of the storage scheme that is used by LINSOL. In the following we use the term A instead of MAT.

# Storing A Matrix On Distributed Memory Parallel Computers

LINSOL is designed to run optimally on parallel distributed-memory computers. Before the computation can start, the GLOBAL matrix has to be distributed to the local memories of the processors. This is done in the following way: Each process(or) p (p=1,2,...,nproc) gets lmatbk(p) consecutive rows of the GLOBAL matrix A starting at row ptrmbk(p)+1. Thus we have a LOCAL lmatbk(p) x l matrix A(p) on every process(or) p. This LOCAL matrix is stored in the sparse matrix storage scheme described in the next chapter. Additionaly, LINSOL assumes a logical column block splitting of the matrix on each process(or). This is explained in the chapter "Column Block Splitting".

# The Sparse Matrix Storage Scheme

The real LOCAL matrix A(p) of size lmatbk(p) x l is represented as sum of LOCAL 8 sub-matrices A(i,p) :

A(p) = A(1,p) + A(2,p) + ... + A(i,p) + ... + A(11,p)

= 1*MAIN + nvt(2,p)*FULLDIAG + nvt(3,p)*PDIAG +

... + nvt(6,p)*SKY + nvt(7,p)*FULLROW +

... + nvt(8,p)*FULLCOL + nvt(9,p)*PROW +

... + nvt(11,p)*PCOL

Each submatrix A(i,p) contains one (and only one) special data structure. The vectors within the allowed data structures are named "vector terms". The number of vector terms for A(i,p) is stored in the (virtual) array nvt(i,p). The different submatrices (and vector terms respectively) can be assembled without cutback to the matrix A(p).

WARNING:

Two elements of two different vector terms can have the same entry in the matrix A(p) - this means that the entry is the sum of the two elements. The overlay of elements in the matrix A(p) does not influence the correct computation of the solution of the linear system; but then the normalization does not work in the specified way - this means that you don't get exactly the chosen normalization (except of the normalization by main diagonal).

In the following, the index p indicating the process(or) number will be dropped. The integer number typ indicates the type of the LOCAL submatrix.

NOTES:

The variables iac, iar and the array index are GLOBAL and relate to the GLOBAL matrix A. The variables l and lvt are local and relate to the LOCAL matrix A. The variables adda, indc and indr are pointers and point to the matrix array mat and to the array of indices index respectively. The following examples show the settings of the varables for the first block of the LOCAL matrix on the first process(or)- this means that for the other blocks and (or) on the other processes (processors) the values of the GLOBAL variables (iac, iar, index) change.

# MAIN : main diagonal (typ=1)

A(1) is the main diagonal of the matrix A. This is the diagonal starting with row- and column-index ptrmbk(p)+1. Gaps with no entries have to be filled by zeroes.
```			12345678
1	*-------
2	-*------
3	--*-----
A(1)	=     4	---*----
5	----*---
6	-----*--
7	------*-
8	-------*
```

# FULLDIAG : full diagonal (typ=2)

Starting at the column iac+1 and the row iar+1, the vector term FULLDIAG is a diagonal portion of length lvt. Gaps with no entries have to be filled by zeroes. In the example iac=3, iar=1 and lvt=4 holds.
```			12345678
1	--------
2	---*----
3	----*---
A(2)	=     4	-----*--
5	------*-
6	--------
7	--------
8	--------
```

# PDIAG : packed diagonal (typ=3)

Starting at the column iac+1 and the row iar+1, the vector term PDIAG is a sparse diagonal portion with length lvt. A pointer indc points to the starting adress of the array index which is used to adress the nonzero elements in the diagonal. In the example iac=3, iar=1, lvt=3 and index(indc+1..lvt)=(1,2,4) holds.
```			12345678
1	--------
2	---*----
3	----*---
A(3)	=     4	--------
5	------*-
6	--------
7	--------
8	--------
```

# SKY : starry sky (typ=6)

lvt entries of nonzero elements are completely defined by two pointers indc,indr. The column-indices are listed in the integer vector index(indc+1..lvt), the row-indices in the vector index(indr+1..lvt). In the example lvt=3, index(indc+1..lvt)=(1,2,6), and index(indr+1..lvt)=(3,1,7) holds.
```			12345678
1	-*------
2	--------
3	*-------
A(6)	=     4	--------
5	--------
6	--------
7	-----*--
8	--------
```

# FULLROW: full row (typ=7)

FULLROW defines lvt contiguous elements in the row iar+1 starting at column iac+1. Gaps with no entries have to be filled by zeroes. In the example iar=7, iac=2 and lvt=5 holds.
```			12345678
1	--------
2	--------
3	--------
A(7)	=     4	--------
5	--------
6	--------
7	--------
8	--*****-
```

# FULLCOL: full column (typ=8)

FULLCOL defines lvt contiguous elements in the column iac+1 starting at row iar+1. Gaps with no entries have to be filled by zeroes. In the example iac=7, iar=2 and lvt=3 holds.
```			12345678
1	--------
2	--------
3	-------*
A(8)	=     4	-------*
5	-------*
6	--------
7	--------
8	--------
```

# PROW: packed row (typ=9)

PROW defines lvt elements adressed by vector index via indc in the row iar+1. In the example iar=7, lvt=3 and index(indc+1..lvt)=(3,5,8) holds.
```			12345678
1	--------
2	--------
3	--------
A(9)	=     4	--------
5	--------
6	--------
7	--------
8	--*-*--*
```

# PCOL: packed column (typ=11)

PCOL defines lvt elements adressed by vector index via indr in the column iac+1. In the example iac=7, lvt=3 and index(indr+1..lvt)=(3,4,7) holds.
```			12345678
1	--------
2	--------
3	-------*
A(11)	=     4	-------*
5	--------
6	--------
7	-------*
8	--------
```
NOTE:

The typ SKY="starry sky" is not allowed on a vector computer.

On each process, all index vectors are stored in the integer array index of length lindex. Indc+1 and indr+1 point to the first element of index vectors relating to vector terms within the array index. The entries of the LOCAL matrix A are stored in the real array MAT of length lmat. Adda+1 points to the first element of a vector term within the array MAT. The integer numbers typ, adda, lvt, iac, iar, indc and indr are handed over in the integer matrix info(ia1,ia2); ia1>=nvt and ia2>=7 must hold. The values info(i,1),...,info(i,7) specify the information for the i-th vector term of typ=info(i,1). The following table shows the meaning of info(i,k) for the different data structures:

```	 shape	  I  k=1     k=2    k=3	   k=4	   k=5	  k=6	 k=7
-------------------------------------------------------------
MAIN	  I   1	     adda    l	    0	    0	   0	  0
FULLDIAG I   2	     adda   lvt	   iac	   iar	   0	  0
PDIAG	  I   3	     adda   lvt	   iac	   iar	 indc	  0
SKY	  I   6	     adda   lvt	    0	    0	 indc	indr
FULLROW  I   7	     adda   lvt	   iac	   iar	   0	  0
FULLCOL  I   8	     adda   lvt	   iac	   iar	   0	  0
PROW	  I   9	     adda   lvt	    0	   iar	 indc	  0
PCOL	  I  11	     adda   lvt	   iac	    0	   0	indr
```

# Column Block Splitting

LINSOL assumes that the matrix is not only partitioned in blocks of rows among the processors (processes), but also that each matrix MAT(p) is partitioned logically in nproc (== number of processors (processes)) column blocks. It must hold: if a vector term of the matrix MAT(p) has its first element in column block i (i=1,..,nproc) - this means that it is between column ptrmbk(i)+1 and ptrmbk(i+1) - , then ALL elements of the vector term have to be in the column block i.

# The Symmetric Case

A symmetric matrix is stored in a special symmetric storage scheme which uses the symmetry of the matrix. Matrix A can be represented as

A = AA + D + AA(T)

where D is the main diagonal of A and AA is a M X M-matrix (not necessarily equal to the upper or lower triangle of A). AA(T) denotes the transposed matrix of AA. In the symmetric case only the matrix AA and the main diagonal (if nonzero) must be handed over to LINSOL.

# A Subdivision Strategy

But there is the problem how to separate the matrix A into the local sub-matrices A(i) with i element of set {1,2,3,6,7,8,9,11} because of different partitioning possibilities. The optimal way depends strongly on the used computer. We recommend the following strategy to find the partitioning: first look to all 'long' full row or column portions in your matrix; second look to all 'long' packed row or column portions. Then look to all occupied diagonals of A (=diagonal with at least one nonzero element). The long diagonals with many nonzero elements are stored in the scheme FULLDIAG. If there are enough nonzero elements in the diagonal i.e. the packed diagonal is 'long', you should store the diagonal in scheme PDIAG. What 'long' means, depends on your computer. Normally 'long' is the vector length where a computer nearly reaches its peak performance for the special operation (e.g. general triad, linked triad, scalar product, ..). For the remaining elements of the matrix you can select the SKY scheme with a vector length less than l, i.e. less than the dimension of the linear system.

# Examples

Here follow two examples for the storage scheme in the non-symmetric and the symmetric case. The example programs exam01.f and exam01s.f in the directory \$LINSOL_ROOT/examples refer to these examples.

# Example (non-symmetric problem on a monoprocessor)

The related FORTRAN program is exam01.f . First we look to a non-symmetric problem. MAT is a 8 X 8 matrix (=>l=8):

```		| 4.  0	  2.  0	  0   0	  5.  0. |
| 0   4. -1.  3.  0   0	  0   0. |
| 0   0	  4. -1.  0  -2.  0   0. |
MAT =	| 3.  0	  0   4. -1.  0	  0   0. |
| 0   0	  0   0	  4. -1.  3.  0. |
| 0   0	  0   0	  0   4.  0   0. |
| 0   0	  0   0	  0   0	  4.  0. |
| 0   0	  1.  1.  1.  1.  1.  4. |
```
The matrix MAT is subdivided into five vector terms (=>nvt=5)

MAT = MAIN + 1*FULLDIAG + 1*IXCOL + 1*SKY + 1*FULLROW

= A(1) + A(2) + A(4) + A(6) + A(7) .

We set ia1=nvt=5 and ia2=7. We get
```		 -------------------------------
| 4.				|     MAIN:
|     4.			|     info(1,1)=typ=1
A(1) =	|	      4.		|     info(1,3)=l=8
|		  4.		|
|		      4.	|
|			  4.	|
|			      4.|
-------------------------------
|				|     FULLDIAG:
|	 -1.			|     info(2,1)=typ=2
A(2) =	|		 -1.		|     info(2,3)=lvt=4
|		     -1.	|     info(2,4)=iac=2
|				|     info(2,5)=iar=1
|				|
|				|
-------------------------------
|	  2.			|     PDIAG:
|	      3.		|     info(3,1)=typ=3
A(3) =	|				|     info(3,3)=lvt=3
|			  3.	|     info(3,4)=iac=2
|				|     info(3,5)=iar=0
|				|     info(3,6)=indc=0
|				|
-------------------------------
|			  5.	|     SKY :
|				|     info(4,1)=typ=6
A(6) =	| 3.				|     info(4,3)=lvt=3
|				|     info(4,6)=indc=3
|				|     info(4,7)=indr=6
|				|
|				|
-------------------------------
|				|     FULLROW:
|				|     info(5,1)=typ=7
A(7) =	|				|     info(5,3)=lvt=5
|				|     info(5,4)=iac=2
|				|     info(5,5)=iar=7
|				|
|	  1.  1.  1.  1.  1.	|     .
-------------------------------
```
The matrix elements are saved in the one-dimensional array MAT (lmat=23):
```[______A(1)_____________][____A(2)_______][__A(3)__]
(4.,4.,4.,4.,4.,4.,4.,4. , -1.,-1.,-1.,-1., 2.,3.,3.,
^                      ^                ^         ^

[__A(6)___][_____A(7)_____]
5.,-2.,3. , 1.,1.,1.,1.,1.)
^
```
The index vectors are stored in the array index (lindex=9):
```[_A(3)][____A(6)_____]
(1,2,5 , 7,6,1, 1,3,4)
^    ^       ^
info(3,6)=0 info(4,7)=6
^
^
info(4,6)=3
```

# Example (symmetric problem on a monoprocessor)

The related FORTRAN program is exam01s.f . Now we look to a symmetric problem. We just mirror all entries except for the main diagonal of the above shown matrix on the main diagonal. Again the matrix is a 8 X 8 matrix (=>l=8):

```		| 4.  0	  2.  3.  0   0	  5.  0. |
| 0   4. -1.  3.  0   0	  0   0. |
| 2  -1.  4. -1.  0  -2.  0   1. |
MAT =	| 3.  3. -1.  4. -1.  0	  0   1. |
| 0   0	  0  -1.  4. -1.  3.  1. |
| 0   0	 -2.  0	 -1.  4.  0   1. |
| 5.  0	  0   0	  3.  0	  4.  1. |
| 0   0	  1.  1.  1.  1.  1.  4. |
```
To store this symmetric matrix, we can use the storage scheme of example 4.1. All we have to do is to set lsym=.true.

# References

[1]

Gross,L.; Sternecker,P.; Schoenauer,W.: The Finite Element Tool Package VECFEM, Universitaet Karlsruhe, Interner Bericht 45/91
[2]
Gross,L.; Roll,C.; Sternecker,P.; Schoenauer,W.: The Extension of VECFEM for the Solution of Nonlinear Functional Equations by the Finite Element Method. Universitaet Karlsruhe, Interner Bericht 47/92
[3]
Gross,L.; Roll,C.; Schoenauer,W.: VECFEM for Mixed Finite Elements. Universitaet Karlsruhe, Interner Bericht 50/93

# Researchers / Programmers / Consulting

Original Program by H. Haefner, L. Grosz, C. Roll, P. Sternecker, R. Weiss 1989-2001

Consulting by

```     Numerikforschung fuer Supercomputer
Rechenzentrum der Universitaet Karlsruhe
Am	Zirkel 2
D-76128 Karlsruhe
Germany
Tel.      :   (+721) 608/4869
Fax.      :   (+721) 32550
e-mail    :   haefner@rz.uni-karlsruhe.de
```