Symmetric Eigensolvers in Sca/LAPACK Osni Marques ([email protected])

15
Symmetric Eigensolvers in Sca/LAPACK Osni Marques ([email protected])
  • date post

    20-Dec-2015
  • Category

    Documents

  • view

    219
  • download

    0

Transcript of Symmetric Eigensolvers in Sca/LAPACK Osni Marques ([email protected])

Symmetric Eigensolvers in

Sca/LAPACK

Osni Marques

([email protected])

03/17/2008 ParLab – Symmetric Eigensolvers 2

LAPACK Symmetric Tridiagonal Eigensolvers

• QR (STEQR): all eigenvectors, O(n3)• Bisection plus inverse iteration (STEVX): subset of eigenvectors, O(n2)• Divide-and-conquer (STEDC): all eigenvectors, faster than the the

previous two but needs more workspace.• Multiple relative robust representations (STEGR): faster than all above for

most matrices from industrial and scientific applications, least workspace.

1989

2005Dhillon, Parlett, Voemel, Marques

Typical performance (timing) of different eigensolvers on matrices coming from industrial applications. In the picture, “old” refers to the version currently available in LAPACK, which will be soon replaced by a “new” and more robust implementation; n ranges from 1824 to 8012.

03/17/2008 ParLab – Symmetric Eigensolvers 3

The Essence of the MRRR Algorithm

Factor TI=LDLT, (L,D) is a relative robust representation RRR for the eigenvalue subset •determines all eigenvalues in to high relative accuracy

•small relative changes in entries of L and D cause small relative changes in each eigenvalue in

Given an RRR for a set of eigenvalues:•For each eigenvalue with a large relative gap

• Compute eigenvalue to high relative accuracy

• Compute the FP (Fernando Parlett) vector (eigenvector)

•For each of the remaining groups of eigenvalues

• Choose shift outside the group

• Compute new RRR, L+D+L+T=LDLT new I

• Refine the eigenvalues.

03/17/2008 ParLab – Symmetric Eigensolvers 4

Testing LAPACK functionalities

• At installation time: optional and limited number of test cases to verify the integrity of the installation (LAPACK/TESTING)

• During the development phase: intensive and stressful tests on a variety of computer architectures

03/17/2008 ParLab – Symmetric Eigensolvers 5

Intensive Testing: Requirements and Goals

• Generation of difficult test cases • Bookkeeping of test cases (so that

new or competing algorithm can stressed in a similar way)

• Various platforms• AMD Athlon • AMD Opteron• Itanium 2• Pentium III• Pentium 4• POWER 3• SGI IP35 • SUN sparcv9 • CRAY X1

• Various (Fortran) compilers: Intel,

SUN, SGI, IBM…

• Accuracy

• Performance (time)

• Tuning of parameters (automatic or manual)

• Algorithmic choices (different IEEE variants)

Reveal different numerical behaviors(in particular IEEE arithmetic features), as well as performance issues

03/17/2008 ParLab – Symmetric Eigensolvers 6

Matrix Types

• Built-in matrices• 1-2-1 tridiagonal matrix

(1D Poisson equation)• Wilkinson tridiagonal matrix

(eigenvalues clustered in pairs)

• Built-in eigenvalue distributions• repeated eigenvalues

1=1 and i=1/k, i=2,3…n

1=1 and i=1, i=1,2…n-1, n=1/k• geometric distribution

i = k (1- i)/(n-1), i=1,2…n• different condition numbers (k)• different random number distributions can be multiplied by random signs

• Glued matrices

• combinations of the above cases• very tight eigenvalue clusters

• Eigenvalue distributions (D) read from files: QTDQT with random orthogonal Q

• Tridiagonal matrices from real world applications• Chemistry

(analysis of molecules)

• Harwell-Boeing Collection (structural engineering, etc)

• University of Florida Collection

(FEM analysis, NASA)

• Matrices from LAPACK users

• Lanczos algorithm without

reorthogonalization to provoke very close eigenvalues

Examples of eigenvalue distributions of matrices from applications

Accuracy and timings for families of matrices, for a number of different computer architectures

03/17/2008 ParLab – Symmetric Eigensolvers 9

Profiles

03/17/2008 ParLab – Symmetric Eigensolvers 10

What have we found?

• LAPACK 3.0 STEGR (and STEDC!) fails on some of the new test matrices

• Different matrix classes with different challenges • STEGR about 10 times slower than STEDC for glued Wilkinson

matrices

• Architecture differences • Pentium slows when infinity occurs

• Vectorization issues on CRAY

Reference tester for future development

03/17/2008 ParLab – Symmetric Eigensolvers 11

Parallel Eigensolvers

• PDSYEVX: bisection + inverse iteration

• PDSYEVD: parallel divide and conquer (F. Tisseur)

• PDSYEVR: MRRR (C. Vömel)

03/17/2008 ParLab – Symmetric Eigensolvers 12

Pitfalls of Parallelization

• Straightforward approach: n eigenpairs, p processors cyclic assignment of n/p eigenpairs to each processor

• Each processor computes orthogonal eigenvectors

• Orthogonality between processors is not guaranteed

• ScaLAPACK: PDSYEVX can break!

03/17/2008 ParLab – Symmetric Eigensolvers 13

Parallelization: the right way

03/17/2008 ParLab – Symmetric Eigensolvers 14

MRRR versus DC

(Tridiagonal part of PDSYEVR and PDSYEVD)

Lapw (n=22908, A. Tate). Runtime and efficiency of the tridiagonal MRRR/D&C part on the IBM SP5.

Hubbard (n=63504, Ward and Bai). Runtime and efficiency of the tridiagonal MRRR/D&C part on the IBM SP5.

03/17/2008 ParLab – Symmetric Eigensolvers 15

References

• Performance and Accuracy of LAPACK's Symmetric Tridiagonal Eigensolvers, J. Demmel, O. Marques, B. Parlett, and C. Vömel. SIAM J. Sci. Comp., 30:1508–1526, 2008.

• A Testing Infrastructure for Symmetric Tridiagonal Eigensolvers, J. Demmel, O. Marques, B. Parlett, and C. Vömel. ACM TOMS, 35, 2008.

• Computations of Eigenpair Subsets with the MRRR Algorithm, B. Parlett, O. Marques and C. Voemel. Numerical Linear Algebra with Applications, 13:643-653, 2006.

The Design and Implementation of the MRRR Algorithm, I. Dhillon, B. Parlett, and C. Vömel. Technical Report UT-CS-04-541, December, 2004.

ScaLAPACK’S MRRR Algorithm, C. Vömel, LAPACK Working Note 195, November 2007.

♦ http://crd.lbl.gov/~osni/Codes/stetester (source code available upon request)