slpdoc.f

      SUBROUTINE SLPDOC
C***BEGIN PROLOGUE  SLPDOC
C***PURPOSE  Sparse Linear Algebra Package Version 2.0.2 Documentation.
C            Routines to solve large sparse symmetric and nonsymmetric
C            positive definite linear systems, Ax = b, using precondi-
C            tioned iterative methods.
C***LIBRARY   SLATEC (SLAP)
C***CATEGORY  D2A4, D2B4, Z
C***TYPE      SINGLE PRECISION (SLPDOC-S, DLPDOC-D)
C***KEYWORDS  BICONJUGATE GRADIENT SQUARED, DOCUMENTATION,
C             GENERALIZED MINIMUM RESIDUAL, ITERATIVE IMPROVEMENT,
C             NORMAL EQUATIONS, ORTHOMIN,
C             PRECONDITIONED CONJUGATE GRADIENT, SLAP,
C             SPARSE ITERATIVE METHODS
C***AUTHOR  Seager, Mark. K., (LLNL)
C             User Systems Division
C             Lawrence Livermore National Laboratory
C             PO BOX 808, L-60
C             Livermore, CA 94550
C             (FTS) 543-3141, (510) 423-3141
C             seager@llnl.gov
C***DESCRIPTION
C                                 The
C                    Sparse Linear Algebra Package
C
C                @@@@@@@  @            @@@    @@@@@@@@
C               @       @ @           @   @   @       @
C               @         @          @     @  @       @
C                @@@@@@@  @         @       @ @@@@@@@@
C                       @ @         @@@@@@@@@ @
C               @       @ @         @       @ @
C                @@@@@@@  @@@@@@@@@ @       @ @
C
C      @       @                            @@@@@@@        @@@@@
C      @       @                           @       @      @    @@
C      @       @  @@@@@@@  @ @@                    @     @    @  @
C      @       @ @       @ @@  @             @@@@@@      @   @   @
C       @     @  @@@@@@@@@ @                @            @  @    @
C        @   @   @         @               @         @@@  @@    @
C         @@@     @@@@@@@  @               @@@@@@@@@ @@@   @@@@@
C
C
C    =================================================================
C    ========================== Introduction =========================
C    =================================================================
C      This package was  originally derived from a set of  iterative
C      routines written by Anne Greenbaum, as announced in "Routines
C      for Solving Large Sparse Linear Systems",  Tentacle, Lawrence
C      Livermore  National  Laboratory,  Livermore  Computing Center
C      (January 1986), pp 15-21.
C
C    This document  contains the specifications for  the  SLAP Version
C    2.0 package, a Fortran 77  package  for  the  solution  of  large
C    sparse   linear systems, Ax  =  b,  via  preconditioned iterative
C    methods.   Included in  this  package are "core"  routines  to do
C    Iterative   Refinement  (Jacobi's  method),  Conjugate  Gradient,
C    Conjugate Gradient on the normal equations, AA'y = b,  (where x =
C    A'y and  A' denotes the  transpose of   A), BiConjugate Gradient,
C    BiConjugate  Gradient  Squared, Orthomin and  Generalized Minimum
C    Residual Iteration.    These "core" routines   do  not  require a
C    "fixed"   data  structure   for storing  the   matrix  A  and the
C    preconditioning   matrix  M.   The  user  is free  to  choose any
C    structure that facilitates  efficient solution  of the problem at
C    hand.  The drawback  to this approach  is that the user must also
C    supply at least two routines  (MATVEC and MSOLVE,  say).   MATVEC
C    must calculate, y = Ax, given x and the user's data structure for
C    A.  MSOLVE must solve,  r = Mz, for z (*NOT*  r) given r  and the
C    user's data  structure for  M (or its  inverse).  The user should
C    choose M so that  inv(M)*A  is approximately the identity and the
C    solution step r = Mz is "easy" to  solve.  For some of the "core"
C    routines (Orthomin,  BiConjugate Gradient and  Conjugate Gradient
C    on the  normal equations)   the user must  also  supply  a matrix
C    transpose times   vector  routine  (MTTVEC,  say)  and (possibly,
C    depending    on the "core"  method)   a  routine  that solves the
C    transpose  of   the   preconditioning    step     (MTSOLV,  say).
C    Specifically, MTTVEC is a routine which calculates y = A'x, given
C    x and the user's data structure for A (A' is the transpose of A).
C    MTSOLV is a routine which solves the system r = M'z for z given r
C    and the user's data structure for M.
C
C    This process of writing the matrix vector operations  can be time
C    consuming and error  prone.  To alleviate  these problems we have
C    written drivers   for  the  "core" methods  that  assume the user
C    supplies one of two specific data structures (SLAP Triad and SLAP
C    Column format), see  below.  Utilizing these  data structures  we
C    have augmented   each  "core" method  with   two preconditioners:
C    Diagonal  Scaling and Incomplete Factorization.  Diagonal scaling
C    is easy to implement, vectorizes very  well and for problems that
C    are  not too  ill-conditioned  reduces the  number  of iterations
C    enough   to warrant its use.  On   the other  hand, an Incomplete
C    factorization  (Incomplete  Cholesky for  symmetric systems   and
C    Incomplete LU for nonsymmetric  systems) may  take much longer to
C    calculate, but it reduces the iteration count (for most problems)
C    significantly.  Our implementations  of IC and ILU  vectorize for
C    machines with hardware gather scatter, but the vector lengths can
C    be quite short if  the  number  of non-zeros  in a column is  not
C    large.
C
C    =================================================================
C    ==================== Supplied Data Structures ===================
C    =================================================================
C    The following describes the data   structures supplied  with  the
C    package: SLAP Triad and Column formats.
C
C    ====================== S L A P Triad format =====================
C
C    In the SLAP Triad format only the non-zeros are stored.  They may
C    appear in *ANY* order.  The user supplies three  arrays of length
C    NELT, where NELT  is the   number of  non-zeros  in the   matrix:
C    (IA(NELT),  JA(NELT), A(NELT)).  If  the matrix is symmetric then
C    one need only store the lower triangle (including  the  diagonal)
C    and NELT would be the corresponding  number  of non-zeros stored.
C    For each non-zero the user puts the row and column  index of that
C    matrix  element   in the  IA  and JA  arrays.  The  value  of the
C    non-zero matrix element is placed  in  the corresponding location
C    of  the A array.   This  is an extremely  easy  data structure to
C    generate.  On the other hand, it is not very  efficient on vector
C    computers for the iterative  solution of  linear systems.  Hence,
C    SLAP changes this input data structure to  the SLAP Column format
C    for the iteration (but does not change it back).
C
C    Here  is an example   of  the  SLAP  Triad storage  format  for a
C    nonsymmetric 5x5 Matrix.  NELT=11.   Recall that the  entries may
C    appear in any order.
C
C     5x5 Matrix       SLAP Triad format for 5x5 matrix on left.
C                           1  2  3  4  5  6  7  8  9 10 11
C    |11 12  0  0 15|   A: 51 12 11 33 15 53 55 22 35 44 21
C    |21 22  0  0  0|  IA:  5  1  1  3  1  5  5  2  3  4  2
C    | 0  0 33  0 35|  JA:  1  2  1  3  5  3  5  2  5  4  1
C    | 0  0  0 44  0|
C    |51  0 53  0 55|
C
C    ====================== S L A P Column format ====================
C
C    In the SLAP Column format  the non-zeros are stored counting down
C    columns (except for the  diagonal entry,  which must appear first
C    in each "column") and are stored  in the real array A.   In other
C    words, for each column in the matrix first put the diagonal entry
C    in A.   Then put in  the other non-zero  elements going  down the
C    column (except the  diagonal) in order.   The IA  array holds the
C    row index for each non-zero.  The JA array holds the offsets into
C    the  IA,  A arrays for the  beginning  of each column.   That is,
C    IA(JA(ICOL)), A(JA(ICOL)) are the  first elements of the  ICOL-th
C    column in IA   and A.  IA(JA(ICOL+1)-1),  A(JA(ICOL+1)-1) are the
C    last elements  of  the ICOL-th column.   Note that we always have
C    JA(N+1) = NELT+1, where N is the number of  columns in the matrix
C    and NELT is the number of non-zeros in the matrix.  If the matrix
C    is  symmetric one need  only store the  lower triangle (including
C    the diagonal) and  NELT  would be the   corresponding   number of
C    non-zeros stored.
C
C    Here is  an  example of the  SLAP   Column storage format  for  a
C    nonsymmetric 5x5 Matrix (in the  A and  IA arrays '|' denotes the
C    end of a column):
C
C       5x5 Matrix      SLAP Column format for 5x5 matrix on left.
C                           1  2  3    4  5    6  7    8    9 10 11
C    |11 12  0  0 15|   A: 11 21 51 | 22 12 | 33 53 | 44 | 55 15 35
C    |21 22  0  0  0|  IA:  1  2  5 |  2  1 |  3  5 |  4 |  5  1  3
C    | 0  0 33  0 35|  JA:  1  4  6    8  9   12
C    | 0  0  0 44  0|
C    |51  0 53  0 55|
C
C    =================================================================
C    ====================== Which Method To Use ======================
C    =================================================================
C
C                          BACKGROUND
C    In solving a large sparse linear system Ax = b using an iterative
C    method, it   is  not necessary to actually   store  the matrix A.
C    Rather, what is needed is a procedure  for multiplying the matrix
C    A times a given vector y to obtain the matrix-vector product, Ay.
C    SLAP has been written to take advantage of this fact.  The higher
C    level routines in the package require storage only of the non-zero
C    elements of   A (and  their  positions), and  even this   can  be
C    avoided, if the  user  writes his own subroutine for  multiplying
C    the matrix times a vector  and   calls the lower-level  iterative
C    routines in the package.
C
C    If  the matrix A is ill-conditioned,  then most iterative methods
C    will be slow to converge (if they converge  at all!).  To improve
C    the  convergence  rate,  one  may use  a "matrix  splitting," or,
C    "preconditioning matrix," say, M.  It is then necessary to solve,
C    at each iteration, a linear system  with coefficient matrix M.  A
C    good preconditioner  M should have  two  properties: (1) M should
C    "approximate" A, in the sense that the  matrix inv(M)*A  (or some
C    variant  thereof) is better conditioned  than the original matrix
C    A; and  (2) linear  systems with coefficient  matrix M should  be
C    much easier  to solve  than  the original system with coefficient
C    matrix   A.   Preconditioning routines  in the   SLAP package are
C    separate from the  iterative   routines,  so   that any of    the
C    preconditioners provided in the package,   or one that the   user
C    codes himself, can be used with any of the iterative routines.
C
C                        CHOICE OF PRECONDITIONER
C    If you  willing   to live with   either the SLAP Triad or  Column
C    matrix data structure  you  can then  choose one  of two types of
C    preconditioners   to   use:   diagonal  scaling    or  incomplete
C    factorization.  To  choose   between these two   methods requires
C    knowing  something  about the computer you're going  to run these
C    codes on  and how well incomplete factorization  approximates the
C    inverse of your matrix.
C
C    Let us  suppose you have   a scalar  machine.   Then,  unless the
C    incomplete factorization is very,  very poor this  is *GENERALLY*
C    the method to choose.  It  will reduce the  number of  iterations
C    significantly and is not all  that expensive  to compute.  So  if
C    you have just one  linear system to solve  and  "just want to get
C    the job  done" then try  incomplete factorization first.   If you
C    are thinking of integrating some SLAP  iterative method into your
C    favorite   "production  code" then  try incomplete  factorization
C    first,  but  also check  to see that  diagonal  scaling is indeed
C    slower for a large sample of test problems.
C
C    Let us now suppose  you have  a  vector  computer  with  hardware
C    gather/scatter support (Cray X-MP, Y-MP, SCS-40 or Cyber 205, ETA
C    10,  ETA Piper,  Convex C-1,  etc.).   Then  it is much harder to
C    choose  between the  two  methods.   The  versions  of incomplete
C    factorization in SLAP do in fact vectorize, but have short vector
C    lengths and the factorization step is relatively  more expensive.
C    Hence,  for  most problems (i.e.,  unless  your  problem  is  ill
C    conditioned,  sic!)  diagonal  scaling is  faster,  with its very
C    fast    set up  time    and  vectorized  (with   long    vectors)
C    preconditioning step (even though  it  may take more iterations).
C    If you have several systems (or  right hand sides) to  solve that
C    can  utilize  the  same  preconditioner  then the   cost   of the
C    incomplete factorization can   be  amortized over these  several
C    solutions.  This situation gives more advantage to the incomplete
C    factorization methods.  If  you have  a  vector  machine  without
C    hardware  gather/scatter (Cray  1,  Cray  2  &  Cray 3) then  the
C    advantages for incomplete factorization are even less.
C
C    If you're trying to shoehorn SLAP into your  favorite "production
C    code" and can not easily generate either the SLAP Triad or Column
C    format  then  you are  left  to   your  own  devices in terms  of
C    preconditioning.  Also,  you may  find that the   preconditioners
C    supplied with SLAP are not sufficient  for your problem.  In this
C    situation we would  recommend  that you   talk  with a  numerical
C    analyst  versed in   iterative   methods   about   writing  other
C    preconditioning  subroutines (e.g.,  polynomial  preconditioning,
C    shifted incomplete factorization,  SOR  or SSOR  iteration).  You
C    can always "roll your own"  by using the "core" iterative methods
C    and supplying your own MSOLVE and MATVEC (and possibly MTSOLV and
C    MTTVEC) routines.
C
C                          SYMMETRIC SYSTEMS
C    If your matrix is symmetric then you would want to use one of the
C    symmetric system  solvers.    If  your  system  is  also positive
C    definite,   (Ax,x) (Ax dot  product  with x) is  positive for all
C    non-zero  vectors x,  then use   Conjugate Gradient (SCG,  SSDCG,
C    SSICSG).  If you're  not sure it's SPD   (symmetric and  Positive
C    Definite)  then try SCG anyway and  if it works, fine.  If you're
C    sure your matrix is not  positive definite  then you  may want to
C    try the iterative refinement   methods  (SIR)  or the  GMRES code
C    (SGMRES) if SIR converges too slowly.
C
C                         NONSYMMETRIC SYSTEMS
C    This   is currently  an  area  of  active research  in  numerical
C    analysis  and   there   are   new  strategies  being   developed.
C    Consequently take the following advice with a grain of salt.   If
C    you matrix is positive definite, (Ax,x)  (Ax  dot product  with x
C    is positive for all non-zero  vectors x), then you can use any of
C    the    methods   for   nonsymmetric   systems (Orthomin,   GMRES,
C    BiConjugate Gradient, BiConjugate Gradient  Squared and Conjugate
C    Gradient applied to the normal equations).  If your system is not
C    too ill conditioned then try  BiConjugate Gradient Squared (BCGS)
C    or GMRES (SGMRES).  Both  of  these methods converge very quickly
C    and do  not require A'  or M' ('  denotes transpose) information.
C    SGMRES  does require  some  additional storage,  though.  If  the
C    system is very  ill conditioned  or   nearly positive  indefinite
C    ((Ax,x) is positive,  but may be  very small),  then GMRES should
C    be the first choice,  but try the  other  methods  if you have to
C    fine tune  the solution process for a  "production code".  If you
C    have a great preconditioner for the normal  equations (i.e., M is
C    an approximation to the inverse of AA' rather than  just  A) then
C    this is not a bad route to travel.  Old wisdom would say that the
C    normal equations are a disaster  (since it squares the  condition
C    number of the system and SCG convergence is linked to this number
C    of    infamy), but   some     preconditioners    (like incomplete
C    factorization) can reduce the condition number back below that of
C    the original system.
C
C    =================================================================
C    ======================= Naming Conventions ======================
C    =================================================================
C    SLAP  iterative  methods,    matrix vector    and  preconditioner
C    calculation  routines   follow a naming   convention  which, when
C    understood, allows one to determine the iterative method and data
C    structure(s) used.  The  subroutine  naming convention  takes the
C    following form:
C                          P[S][M]DESC
C    where
C        P  stands for the precision (or data type) of the routine and
C           is required in all names,
C        S  denotes whether or not the routine requires the SLAP Triad
C           or Column format (it does if the second letter of the name
C           is S and does not otherwise),
C        M  stands for the type of preconditioner used (only appears
C           in drivers for "core" routines), and
C     DESC  is some number of letters describing the method or purpose
C           of the routine.  The following is a list of the "DESC"
C           fields for iterative methods and their meaning:
C             BCG,BC:       BiConjugate Gradient
C             CG:           Conjugate Gradient
C             CGN,CN:       Conjugate Gradient on the Normal equations
C             CGS,CS:       biConjugate Gradient Squared
C             GMRES,GMR,GM: Generalized Minimum RESidual
C             IR,R:         Iterative Refinement
C             JAC:          JACobi's method
C             GS:           Gauss-Seidel
C             OMN,OM:       OrthoMiN
C
C    In the single precision version of SLAP, all routine names start
C    with an S. The brackets around the S and M designate that these
C    fields are optional.
C
C    Here are some examples of the routines:
C    1) SBCG: Single precision BiConjugate Gradient "core" routine.
C       One can deduce that this is a "core" routine, because the S and
C       M fields are missing and BiConjugate Gradient is an iterative
C       method.
C    2) SSDBCG: Single precision, SLAP data structure BCG with Diagonal
C       scaling.
C    3) SSLUBC: Single precision, SLAP data structure BCG with incom-
C       plete LU factorization as the preconditioning.
C    4) SCG: Single precision Conjugate Gradient "core" routine.
C    5) SSDCG: Single precision, SLAP data structure Conjugate Gradient
C       with Diagonal scaling.
C    6) SSICCG: Single precision, SLAP data structure Conjugate Gra-
C       dient with Incomplete Cholesky factorization preconditioning.
C
C
C    =================================================================
C    ===================== USER CALLABLE ROUTINES ====================
C    =================================================================
C    The following is a list of  the "user callable" SLAP routines and
C    their one line descriptions.  The headers denote  the  file names
C    where the routines can be found, as distributed for UNIX systems.
C
C    Note:  Each core routine, SXXX, has a corresponding stop routine,
C         ISSXXX.  If the stop routine does not have the specific stop
C         test the user requires (e.g., weighted infinity norm),  then
C         the user should modify the source for ISSXXX accordingly.
C
C    ============================= sir.f =============================
C    SIR: Preconditioned Iterative Refinement Sparse Ax = b Solver.
C    SSJAC: Jacobi's Method Iterative Sparse Ax = b Solver.
C    SSGS: Gauss-Seidel Method Iterative Sparse Ax = b Solver.
C    SSILUR: Incomplete LU Iterative Refinement Sparse Ax = b Solver.
C
C    ============================= scg.f =============================
C    SCG: Preconditioned Conjugate Gradient Sparse Ax=b Solver.
C    SSDCG: Diagonally Scaled Conjugate Gradient Sparse Ax=b Solver.
C    SSICCG: Incomplete Cholesky Conjugate Gradient Sparse Ax=b Solver.
C
C    ============================= scgn.f ============================
C    SCGN: Preconditioned CG Sparse Ax=b Solver for Normal Equations.
C    SSDCGN: Diagonally Scaled CG Sparse Ax=b Solver for Normal Eqn's.
C    SSLUCN: Incomplete LU CG Sparse Ax=b Solver for Normal Equations.
C
C    ============================= sbcg.f ============================
C    SBCG: Preconditioned BiConjugate Gradient Sparse Ax = b Solver.
C    SSDBCG: Diagonally Scaled BiConjugate Gradient Sparse Ax=b Solver.
C    SSLUBC: Incomplete LU BiConjugate Gradient Sparse Ax=b Solver.
C
C    ============================= scgs.f ============================
C    SCGS: Preconditioned BiConjugate Gradient Squared Ax=b Solver.
C    SSDCGS: Diagonally Scaled CGS Sparse Ax=b Solver.
C    SSLUCS: Incomplete LU BiConjugate Gradient Squared Ax=b Solver.
C
C    ============================= somn.f ============================
C    SOMN: Preconditioned Orthomin Sparse Iterative Ax=b Solver.
C    SSDOMN: Diagonally Scaled Orthomin Sparse Iterative Ax=b Solver.
C    SSLUOM: Incomplete LU Orthomin Sparse Iterative Ax=b Solver.
C
C    ============================ sgmres.f ===========================
C    SGMRES: Preconditioned GMRES Iterative Sparse Ax=b Solver.
C    SSDGMR: Diagonally Scaled GMRES Iterative Sparse Ax=b Solver.
C    SSLUGM: Incomplete LU GMRES Iterative Sparse Ax=b Solver.
C
C    ============================ smset.f ============================
C       The following routines are used to set up preconditioners.
C
C    SSDS: Diagonal Scaling Preconditioner SLAP Set Up.
C    SSDSCL: Diagonally Scales/Unscales a SLAP Column Matrix.
C    SSD2S: Diagonal Scaling Preconditioner SLAP Normal Eqns Set Up.
C    SS2LT: Lower Triangle Preconditioner SLAP Set Up.
C    SSICS: Incomplete Cholesky Decomp. Preconditioner SLAP Set Up.
C    SSILUS: Incomplete LU Decomposition Preconditioner SLAP Set Up.
C
C    ============================ smvops.f ===========================
C       Most of the incomplete  factorization  (LL' and LDU) solvers
C       in this  file require an  intermediate routine  to translate
C       from the SLAP MSOLVE(N, R, Z, NELT, IA,  JA, A, ISYM, RWORK,
C       IWORK) calling  convention to the calling  sequence required
C       by  the solve routine.   This generally  is  accomplished by
C       fishing out pointers to the preconditioner (stored in RWORK)
C       from the  IWORK  array and then making a call to the routine
C       that actually does the backsolve.
C
C    SSMV: SLAP Column Format Sparse Matrix Vector Product.
C    SSMTV: SLAP Column Format Sparse Matrix (transpose) Vector Prod.
C    SSDI: Diagonal Matrix Vector Multiply.
C    SSLI: SLAP MSOLVE for Lower Triangle Matrix (set up for SSLI2).
C    SSLI2: Lower Triangle Matrix Backsolve.
C    SSLLTI: SLAP MSOLVE for LDL' (IC) Fact. (set up for SLLTI2).
C    SLLTI2: Backsolve routine for LDL' Factorization.
C    SSLUI: SLAP MSOLVE for LDU Factorization (set up for SSLUI2).
C    SSLUI2: SLAP Backsolve for LDU Factorization.
C    SSLUTI: SLAP MTSOLV for LDU Factorization (set up for SSLUI4).
C    SSLUI4: SLAP Backsolve for LDU Factorization.
C    SSMMTI: SLAP MSOLVE for LDU Fact of Normal Eq (set up for SSMMI2).
C    SSMMI2: SLAP Backsolve for LDU Factorization of Normal Equations.
C
C    =========================== slaputil.f ==========================
C       The following utility routines are useful additions to SLAP.
C
C    SBHIN: Read Sparse Linear System in the Boeing/Harwell Format.
C    SCHKW: SLAP WORK/IWORK Array Bounds Checker.
C    SCPPLT: Printer Plot of SLAP Column Format Matrix.
C    SS2Y: SLAP Triad to SLAP Column Format Converter.
C    QS2I1R: Quick Sort Integer array, moving integer and real arrays.
C            (Used by SS2Y.)
C    STIN: Read in SLAP Triad Format Linear System.
C    STOUT: Write out SLAP Triad Format Linear System.
C
C
C***REFERENCES  1. Mark K. Seager, A SLAP for the Masses, in
C                  G. F. Carey, Ed., Parallel Supercomputing: Methods,
C                  Algorithms and Applications, Wiley, 1989, pp.135-155.
C***ROUTINES CALLED  (NONE)
C***REVISION HISTORY  (YYMMDD)
C   880715  DATE WRITTEN
C   890404  Previous REVISION DATE
C   890915  Made changes requested at July 1989 CML Meeting.  (MKS)
C   890921  Removed TeX from comments.  (FNF)
C   890922  Numerous changes to prologue to make closer to SLATEC
C           standard.  (FNF)
C   890929  Numerous changes to reduce SP/DP differences.  (FNF)
C           -----( This produced Version 2.0.1. )-----
C   891003  Rearranged list of user callable routines to agree with
C           order in source deck.  (FNF)
C   891004  Updated reference.
C   910411  Prologue converted to Version 4.0 format.  (BAB)
C           -----( This produced Version 2.0.2. )-----
C   910506  Minor improvements to prologue.  (FNF)
C   920511  Added complete declaration section.  (WRB)
C   920929  Corrected format of reference.  (FNF)
C   921019  Improved one-line descriptions, reordering some.  (FNF)
C***END PROLOGUE  SLPDOC