8/22/2006

Trilinos Tutorial: Overview and Basic Concepts

Michael A. Heroux
Sandia National Laboratories

ACTS Toolkit Workshop 2006

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.

Trilinos Development Team

• Ross Bartlett: Lead Developer of Thyra; Developer of Rythmos
• Paul Boggs: Developer of Thyra
• Todd Coffey: Lead Developer of Rythmos
• Jason Cross: Developer of Jpetra
• David Day: Developer of Komplex
• Clark Dohrmann: Developer of CLAPS
• Michael Gee: Developer of ML, NOX
• Bob Heaphy: Lead Developer of Trilinos SQA
• Mike Heroux: Trilinos Project Leader; Lead Developer of Epetra, AztecOO, Kokkos, Komplex, IFPACK, Thyra, Tpetra; Developer of Amesos, Belos, EpetraExt, Jpetra
• Ulrich Hetmaniuk: Developer of Anasazi
• Robert Hoekstra: Lead Developer of EpetraExt; Developer of Epetra, Thyra, Tpetra
• Russell Hooper: Developer of NOX
• Vicki Howle: Lead Developer of Meros; Developer of Belos and Thyra
• Jonathan Hu: Developer of ML
• Sarah Knepper: Developer of Komplex
• Tammy Kolda: Lead Developer of NOX
• Joe Kotulski: Lead Developer of Pliris
• Rich Lehoucq: Developer of Anasazi and Belos
• Kevin Long: Lead Developer of Thyra; Developer of Belos and Teuchos
• Roger Pawlowski: Lead Developer of NOX
• Michael Phenow: Trilinos Webmaster; Lead Developer of New_Package
• Eric Phipps: Developer of LOCA and NOX
• Marzio Sala: Lead Developer of Didasko and IFPACK; Developer of ML, Amesos
• Andrew Salinger: Lead Developer of LOCA
• Paul Sexton: Developer of Epetra and Tpetra
• Bill Spotz: Lead Developer of PyTrilinos; Developer of Epetra, New_Package
• Ken Stanley: Lead Developer of Amesos and New_Package
• Heidi Thornquist: Lead Developer of Anasazi, Belos and Teuchos
• Ray Tuminaro: Lead Developer of ML and Meros
• Jim Willenbring: Developer of Epetra and New_Package; Trilinos library manager
• Alan Williams: Developer of Epetra, EpetraExt, AztecOO, Tpetra

Outline of Talk

• Background/Motivation.
• Trilinos Package Concepts.
• ANAs, LALs and APPs.
• Managing Parallel Data: Petra Object Model.
• Whirlwind Tour of Some Trilinos Packages.
• Generic Programming.
• SQA/SQE Issues.
• Concluding remarks.

Target Problems: PDEs and More…

[Figure panels: PDEs; Circuits; Inhomogeneous Fluids; And More…]

Motivation For Trilinos

• Sandia does LOTS of solver work.
• When I started at Sandia in May 1998:
  • Aztec was a mature package. Used in many codes.
  • FETI, PETSc, DSCPack, Spooles, ARPACK, DASPK, and many other codes were (and are) in use.
  • New projects were underway or planned in multi-level preconditioners, eigensolvers, non-linear solvers, etc…
• The challenges:
  • Little or no coordination was in place to:
    • Efficiently reuse existing solver technology.
    • Leverage new development across various projects.
    • Support solver software processes.
    • Provide consistent solver APIs for applications.
  • ASCI was forming software quality assurance/engineering (SQA/SQE) requirements:
    • Daunting requirements for any single solver effort to address alone.

Evolving Trilinos Solution

• Trilinos [1] is an evolving framework to address these challenges:
  • Fundamental atomic unit is a package.
  • Includes core set of vector, graph and matrix classes (Epetra/Tpetra packages).
  • Provides a common abstract solver API (Thyra package).
  • Provides a ready-made package infrastructure (new_package package):
    • Source code management (cvs, bonsai).
    • Build tools (autotools).
    • Automated regression testing (queue directories within repository).
    • Communication tools (mailman mail lists).
  • Specifies requirements and suggested practices for package SQA.
• In general allows us to categorize efforts:
  • Efforts best done at the Trilinos level (useful to most or all packages).
  • Efforts best done at a package level (peculiar or important to a package).
  • Allows package developers to focus only on things that are unique to their package.

[1] Trilinos loose translation: “A string of pearls”

Full “Vertical” Solver Coverage (Trilinos packages in parentheses)

• Transient Problems:
  • DAEs/ODEs (Rythmos): Solve $f(\dot{x}(t), x(t), t) = 0$, $t \in [0, T]$, $x(0) = x_0$, $\dot{x}(0) = \dot{x}_0$, for $x(t) \in \mathbb{R}^n$, $t \in [0, T]$.
• Nonlinear Problems: Given nonlinear op $c(x, u): \mathbb{R}^{n+m} \to \mathbb{R}^n$:
  • Nonlinear equations (NOX): Solve $c(x) = 0$ for $x \in \mathbb{R}^n$.
  • Stability analysis (LOCA): For $c(x, u) = 0$ find space $u \in U$ such that $\partial c / \partial x$ is singular.
• Explicit Linear Problems (Epetra, Tpetra):
  • Matrix/graph equations: Compute $y = Ax$; $A = A(G)$; $A \in \mathbb{R}^{m \times n}$, $G$ an $m \times n$ graph.
  • Vector problems: Compute $y = \alpha x + \beta w$; $\alpha = \langle x, y \rangle$; $x, y \in \mathbb{R}^n$.
• Implicit Linear Problems: Given linear ops (matrices) $A, B \in \mathbb{R}^{n \times n}$:
  • Linear equations (AztecOO, Belos, Ifpack, ML, etc.): Solve $Ax = b$ for $x \in \mathbb{R}^n$.
  • Eigen problems (Anasazi): Solve $Av = \lambda Bv$ for (all) $v \in \mathbb{R}^n$ and $\lambda$.
• Optimization Problems (MOOCHO):
  • Unconstrained: Find $u \in \mathbb{R}^n$ that minimizes $f(u)$.
  • Constrained: Find $y \in \mathbb{R}^m$ and $u \in \mathbb{R}^n$ that minimizes $f(y, u)$ s.t. $c(y, u) = 0$.

Trilinos Statistics

[Bar chart: Trilinos Statistics by Release. Axis: Counts (0 to 40). Categories: Packages in repository; Limited release packages; General release packages; Source lines (100K); Downloads (100s); Automated regression tested packages; Developers. Series: Release 3.1 (9/03), Release 4.0 (6/04), Release 5.0 (3/05), Release 6.0 (9/05).]

Registered Users by Type (1045 Total): University, 568; Government, 239; Personal, 114; Industry, 99; Other, 25.

Registered Users by Region (1045 Total): Europe, 359; US (except Sandia), 319; Sandia (includes unregistered), 176; Asia, 126; Americas (except US), 42; Australia/NZ, 13; Africa, 10.

Stats: Trilinos Download Page 8/18/2006.

Trilinos Package Concepts

Package: The Atomic Unit

Trilinos Packages

• Trilinos is a collection of Packages.
• Each package is:
  • Focused on important, state-of-the-art algorithms in its problem regime.
  • Developed by a small team of domain experts.
  • Self-contained: No explicit dependencies on any other software packages (with some special exceptions).
  • Configurable/buildable/documented on its own.
• Sample packages: NOX, AztecOO, ML, IFPACK, Meros.
• Special package collections:
  • Petra (Epetra, Tpetra, Jpetra): Concrete Data Objects.
  • Thyra: Abstract Conceptual Interfaces.
  • Teuchos: Common Tools.
  • New_package: Jumpstart prototype.

Greek Names

Trilinos Package Summary

Objective                       Package(s)
Linear algebra objects          Epetra, Jpetra, Tpetra
Krylov solvers                  AztecOO, Belos, Komplex
ILU-type preconditioners        AztecOO, IFPACK
Multilevel preconditioners      ML, CLAPS
Eigenvalue problems             Anasazi
Block preconditioners           Meros
Direct sparse linear solvers    Amesos
Direct dense solvers            Epetra, Teuchos, Pliris
Abstract interfaces             Thyra
Nonlinear system solvers        NOX, LOCA
Time Integrators/DAEs           Rythmos
C++ utilities, (some) I/O       Teuchos, EpetraExt, Kokkos
Trilinos Tutorial               Didasko
“Skins”                         PyTrilinos, WebTrilinos, Star-P, Stratimikos, ForTrilinos
Optimization (SAND)             MOOCHO
Archetype package               NewPackage
Other new in 7.0 release        Galeri, Isorropia, Moertel, RTOp

[Slide callouts: Basic Linear Algebra classes; Block Krylov Methods; Scalable Preconditioners; 3rd Party Direct Solver Package; Abstract Interfaces.]

Why Packages?

Let’s Do Some Matrix Analysis!

Trilinos Development Team (Again)

(Roster repeated from the earlier Trilinos Development Team slide.)

Developer and Package Node IDs

Developer IDs

RossBartlett = 1; PaulBoggs = 2; ToddCoffey = 3; JasonCross = 4;
DavidDay = 5; ClarkDohrmann = 6; MichaelGee = 7; BobHeaphy = 8;
MikeHeroux = 9; UlrichHetmaniuk = 10; RobHoekstra = 11; RussellHooper = 12;
VickiHowle = 13; JonathanHu = 14; SarahKnepper = 15; TammyKolda = 16;
JoeKotulski = 17; RichLehoucq = 18; KevinLong = 19; RogerPawlowski = 20;
MichaelPhenow = 21; EricPhipps = 22; MarzioSala = 23; AndrewSalinger = 24;
PaulSexton = 25; BillSpotz = 26; KenStanley = 27; HeidiThornquist = 28;
RayTuminaro = 29; JimWillenbring = 30; AlanWilliams = 31;

Package IDs

PyTrilinos = 1; amesos = 2; anasazi = 3; aztecoo = 4;
belos = 5; claps = 6; didasko = 7; epetra = 8;
epetraext = 9; ifpack = 10; jpetra = 11; kokkos = 12;
komplex = 13; meros = 14; loca = 15; ml = 16;
new_package = 17; nox = 18; pliris = 19; rythmos = 20;
teuchos = 21; thyra = 22; tpetra = 23; trilinosframework = 24;

Developer-Package Edges

A(i,j) = 1 if developer i contributes to package j

A = sparse(31,24);

A(RossBartlett,rythmos) = 1; A(RossBartlett,thyra) = 1;
A(PaulBoggs,thyra) = 1;
A(ToddCoffey,rythmos) = 1;
A(JasonCross,jpetra) = 1;
A(DavidDay,komplex) = 1;
A(ClarkDohrmann,claps) = 1;
A(MichaelGee,ml) = 1; A(MichaelGee,nox) = 1;
A(BobHeaphy,trilinosframework) = 1;
A(MikeHeroux,trilinosframework) = 1; A(MikeHeroux,epetra) = 1; A(MikeHeroux,aztecoo) = 1;
A(MikeHeroux,kokkos) = 1; A(MikeHeroux,komplex) = 1; A(MikeHeroux,ifpack) = 1;
A(MikeHeroux,thyra) = 1; A(MikeHeroux,tpetra) = 1; A(MikeHeroux,amesos) = 1;
A(MikeHeroux,belos) = 1; A(MikeHeroux,epetraext) = 1; A(MikeHeroux,jpetra) = 1;
A(UlrichHetmaniuk,anasazi) = 1;
A(RobHoekstra,epetra) = 1; A(RobHoekstra,thyra) = 1; A(RobHoekstra,tpetra) = 1; A(RobHoekstra,epetraext) = 1;
A(RussellHooper,nox) = 1;
A(VickiHowle,meros) = 1; A(VickiHowle,belos) = 1; A(VickiHowle,thyra) = 1;
A(JonathanHu,ml) = 1;
A(SarahKnepper,komplex) = 1;
A(TammyKolda,nox) = 1; A(TammyKolda,trilinosframework) = 1;
A(JoeKotulski,pliris) = 1;
A(RichLehoucq,anasazi) = 1; A(RichLehoucq,belos) = 1;
A(KevinLong,thyra) = 1; A(KevinLong,belos) = 1; A(KevinLong,teuchos) = 1;
A(RogerPawlowski,nox) = 1;
A(MichaelPhenow,trilinosframework) = 1; A(MichaelPhenow,trilinosframework) = 1;
A(EricPhipps,loca) = 1; A(EricPhipps,nox) = 1;
A(MarzioSala,didasko) = 1; A(MarzioSala,ifpack) = 1; A(MarzioSala,ml) = 1; A(MarzioSala,amesos) = 1;
A(AndrewSalinger,loca) = 1;
A(PaulSexton,epetra) = 1; A(PaulSexton,tpetra) = 1;
A(BillSpotz,PyTrilinos) = 1; A(BillSpotz,epetra) = 1; A(BillSpotz,new_package) = 1;
A(KenStanley,amesos) = 1; A(KenStanley,new_package) = 1;
A(HeidiThornquist,anasazi) = 1; A(HeidiThornquist,belos) = 1; A(HeidiThornquist,teuchos) = 1;
A(RayTuminaro,ml) = 1; A(RayTuminaro,meros) = 1;
A(JimWillenbring,epetra) = 1; A(JimWillenbring,new_package) = 1; A(JimWillenbring,trilinosframework) = 1;
A(AlanWilliams,epetra) = 1; A(AlanWilliams,epetraext) = 1; A(AlanWilliams,aztecoo) = 1; A(AlanWilliams,tpetra) = 1;

Developer-Package Matrix

spy(A);

• Number of developers per package:
  • Maximum: max(sum(A)) = 6
  • Average: sum(sum(A))/24 = 2.875
• Number of package affiliations per developer:
  • Maximum: max(sum(A')) = 12 (me).
    • Minus outlier: = 4.
  • Average: sum(sum(A'))/31 = 2.26
• Observations:
  • Small package teams.
  • Multiple packages per developer.

Developer-Developer Matrix (A*A')

• B = A*A'
• Apply Symmetric AMD reordering.
• Two developers connected if co-authors of a package.
• Only two developers “disconnected”:
  • Clark Dohrmann: CLAPS.
  • Joe Kotulski: Pliris.

[Figure labels on the reordered matrix: Komplex Subteam; ML Subteam; Anasazi Subteam; Epetra Subteam; NOX Subteam; Thyra Subteam.]
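The developer-developer product can be reproduced on a small scale without MATLAB. The following is a minimal sketch in plain Python, assuming a hand-picked subset of five developers from the edge list (the subset, the dictionary layout and the helper name `matmul_transpose` are illustrative choices, not part of the talk). It builds the 0/1 incidence matrix, forms B = A*A', and reports the developers whose rows have no off-diagonal entries.

```python
# Subset of the developer-package edges from the slides (chosen for illustration).
edges = {
    "MikeHeroux": {"epetra", "aztecoo", "ifpack"},
    "MarzioSala": {"ifpack", "ml", "amesos"},
    "RayTuminaro": {"ml", "meros"},
    "ClarkDohrmann": {"claps"},
    "JoeKotulski": {"pliris"},
}

developers = sorted(edges)
packages = sorted(set().union(*edges.values()))

# Dense 0/1 incidence matrix A: rows are developers, columns are packages.
A = [[1 if p in edges[d] else 0 for p in packages] for d in developers]


def matmul_transpose(M):
    """Return M * M^T for a list-of-lists matrix."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in M] for ri in M]


# B[i][j] counts the packages shared by developers i and j.
B = matmul_transpose(A)

# A developer is "disconnected" when every off-diagonal entry in the row is zero.
disconnected = [
    developers[i]
    for i, row in enumerate(B)
    if all(v == 0 for j, v in enumerate(row) if j != i)
]
print(disconnected)
```

The diagonal of B holds each developer's package count, so the same matrix also yields the per-developer affiliation numbers quoted on the previous slide.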

Package-Package Matrix (A'*A)
(Not sure what to say about this)

• B = A'*A, apply symamd.
• Two packages connected if they share a developer.
• Only two packages “disconnected”:
  • Clark Dohrmann: CLAPS.
  • Joe Kotulski: Pliris.

[Figure labels: Me; Epetra, Amesos, Didasko, Ifpack; Without Me; Meros, AztecOO, Tpetra, EpetraExt.]

Why Packages?

Partial Answer: Small Inter-related teams.

Package Interoperability

Interoperability (“Can Use”) vs. Dependence (“Depends On”)

• Although most Trilinos packages have no explicit dependence, each package must interact with some other packages:
  • NOX needs operator, vector and solver objects.
  • AztecOO needs preconditioner, matrix, operator and vector objects.
  • Interoperability is enabled at configure time. For example, NOX:

      --enable-nox-lapack    compile NOX lapack interface libraries
      --enable-nox-epetra    compile NOX epetra interface libraries
      --enable-nox-petsc     compile NOX petsc interface libraries

• Trilinos configure script is vehicle for:
  • Establishing interoperability of Trilinos components…
  • Without compromising individual package autonomy.

      ../configure --enable-python

• Trilinos offers seven basic interoperability mechanisms.

Trilinos Interoperability Mechanisms (Acquired as Package Matures)

1. Package builds under Trilinos configure scripts.
   Package can be built as part of a suite of packages; cross-package interfaces enable/disable automatically.
2. Package accepts user data as Epetra or Thyra objects.
   Applications using Epetra/Thyra can use package.
3. Package accepts parameters from Teuchos ParameterLists.
   Applications using Teuchos ParameterLists can drive package.
4. Package can be used via Thyra abstract solver classes.
   Applications or other packages using Thyra can use package.
5. Package can use Epetra for private data.
   Package can then use other packages that understand Epetra.
6. Package accesses solver services via Thyra interfaces.
   Package can then use other packages that implement Thyra interfaces.
7. Package available via PyTrilinos.
   Package can be used with other Trilinos packages via Python.
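To illustrate what "driving a package" through a parameter list means, here is a minimal sketch in plain Python. It is not the Teuchos ParameterList API: the nested-dictionary representation, the function `merge_parameters` and every parameter name below are assumptions made for this example. The idea shown is that a package publishes defaults, the application supplies a (possibly nested) list of overrides, and unknown names are rejected.

```python
def merge_parameters(defaults, overrides):
    """Return defaults updated by overrides, recursing into nested sublists."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if key not in defaults:
            # Rejecting unknown names catches misspelled parameters early.
            raise KeyError(f"unknown parameter: {key!r}")
        if isinstance(defaults[key], dict) and isinstance(value, dict):
            merged[key] = merge_parameters(defaults[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical defaults published by a solver package.
solver_defaults = {
    "max iterations": 100,
    "tolerance": 1e-8,
    "preconditioner": {"type": "none", "sweeps": 1},
}

# Hypothetical list supplied by the application.
user_list = {"tolerance": 1e-10, "preconditioner": {"type": "multilevel"}}

params = merge_parameters(solver_defaults, user_list)
print(params)
```

Because the list is plain data, the same overrides could come from a file or from another package, which is what lets one component configure another without compile-time coupling.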

“Can Use” vs. “Depends On”

“Can Use”:
• Interoperable without dependence.
• Dense is Good.
• Encouraged.

“Depends On”:
• OK, if essential.
• Epetra: 9 clients.
• Thyra, Teuchos, NOX: 2 clients.
• Discouraged.

Interoperability Example: ML

• ML: Multi-level Preconditioner Package.
• Primary Developers: Ray Tuminaro, Jonathan Hu, Marzio Sala.
• No explicit, essential dependence on other Trilinos packages.
  • Uses abstract interfaces to matrix/operator objects.
  • Has independent configure/build process (but can be invoked at Trilinos level).
• Interoperable with other Trilinos packages and other libraries:
  • Accepts user data as Epetra matrices/vectors.
  • Can use Epetra for internal matrices/vectors.
  • Can be used via Thyra abstract interfaces.
  • Can be built via Trilinos configure/build process.
  • Available as preconditioner to all other Trilinos packages.
  • Can use IFPACK, Amesos, AztecOO objects as smoothers, coarse solvers.
  • Can be driven via Teuchos ParameterLists.
  • Available to PETSc users without dependence on any other Trilinos packages.

Package Maturation Process

Asynchronicity

Day 1 of Package Life

• CVS: Each package is self-contained in Trilinos/package/ directory.
• Bugzilla: Each package has its own Bugzilla product.
• Bonsai: Each package is browsable via Bonsai interface.
• Mailman: Each Trilinos package, including Trilinos itself, has four mail lists:
  • [email protected]: CVS commit emails. “Finger on the pulse” list.
  • [email protected]: Mailing list for developers.
  • [email protected]: Issues for package users.
  • [email protected]: Releases and other announcements specific to the package.
• New_package (optional): Customizable boilerplate for Autoconf/Automake/Doxygen/Python/Thyra/Epetra/TestHarness/Website.

Sample Package Maturation Process

(Rows on the slide are grouped into Startup Steps and Maturation Steps.)

• Step: Package added to CVS: Import existing code or start with new_package.
  Example: ML CVS repository migrated into Trilinos (July 2002).
• Step: Mail lists, Bugzilla Product, Bonsai database created.
  Example: ml-announce, ml-users, ml-developers, ml-checkins, ml-regression @software.sandia.gov created, linked to CVS (July 2002).
• Step: Package builds with configure/make, Trilinos-compatible.
  Example: ML adopts Autoconf, Automake starting from new_package (June 2003).
• Step: Epetra objects recognized by package.
  Example: ML accepts user data as Epetra matrices and vectors (October 2002).
• Step: Package accessible via Thyra interfaces.
  Example: ML adaptors written for TSFCore_LinOp (Thyra) interface (May 2003).
• Step: Package uses Epetra for internal data.
  Example: ML able to generate Epetra matrices. Allows use of AztecOO, Amesos, Ifpack, etc. as smoothers and coarse grid solvers (Feb-June 2004).
• Step: Package parameters settable via Teuchos ParameterList.
  Example: ML gets manager class, driven via ParameterLists (June 2004).
• Step: Package usable from Python (PyTrilinos).
  Example: ML Python wrappers written using new_package template (April 2005).

Maturation Jumpstart: NewPackage

• NewPackage provides jump start to develop/integrate a new package.
• NewPackage is a “Hello World” program and website:
  • Simple but it does work with autotools.
  • Compiles and builds.
• NewPackage directory contains:
  • Commonly used directory structure: src, test, doc, example, config.
  • Working Autoconf/Automake files.
  • Documentation templates (doxygen).
  • Working regression test setup.
  • Working Python and Thyra adaptors.
• Substantially cuts down on:
  • Time to integrate new package.
  • Variation in package integration details.
  • Development of website.

NOTE: NewPackage can be used independently of Trilinos.

What Trilinos is not

• Trilinos is not a single monolithic piece of software. Each package:
  • Can be built independent of Trilinos.
  • Has its own self-contained CVS structure.
  • Has its own Bugzilla product and mail lists.
  • Development team is free to make its own decisions about algorithms, coding style, release contents, testing process, etc.
• Trilinos top layer is not a large amount of source code:
  • Trilinos repository (6.0 branch) contains: 660,378 source lines of code (SLOC).
  • Sum of the packages SLOC counts: 648,993.
  • Trilinos top layer SLOC count: 11,385 (1.7%).
• Trilinos is not “indivisible”:
  • You don’t need all of Trilinos to get things done.
  • Any collection of packages can be combined and distributed.
  • Current public release contains only 16 of the 20+ Trilinos packages.

Insight from History: A Philosophy for Future Directions

• In the early 1800s the U.S. had many new territories.
• Question: How to incorporate into U.S.?
  • Colonies? No.
  • Expand boundaries of existing states? No.
  • Create process for self-governing regions. Yes.
  • Theme: Local control drawing on national resources.
• Trilinos package architecture has some similarities:
  • Asynchronous maturation.
  • Packages decide degree of interoperations, use of Trilinos facilities.
• Strength of each: Scalable growth with local control.

Solver Collaboration: The Big Picture

Solver Software Components and Interfaces

Types of Software Components

1) ANA: Abstract Numerical Algorithm (e.g. linear solvers, eigen solvers, nonlinear solvers, stability analysis, uncertainty quantification, transient solvers, optimization etc.)
   Example Trilinos Packages:
   • Belos (linear solvers)
   • Anasazi (eigen solvers)
   • NOX (nonlinear equations)
   • Rythmos (ODEs, DAEs)
   • MOOCHO (Optimization)
   • …

2) LAL: Linear Algebra Library (e.g. vectors, sparse matrices, sparse factorizations, preconditioners)
   Example Trilinos Packages:
   • Epetra/Tpetra (Mat, Vec)
   • Ifpack, AztecOO, ML (Preconditioners)
   • Meros (Preconditioners)
   • Pliris (Interface to direct solvers)
   • Amesos (Direct solvers)
   • Komplex (Complex/Real forms)
   • …

3) APP: Application (the model: physics, discretization method etc.)
   Examples:
   • SIERRA
   • NEVADA
   • Xyce
   • Sundance
   • …

Interfaces shown in the diagram:
• Thyra: ANA Interfaces to Linear Algebra (ANA Vector Interface, ANA Linear Operator Interface)
• Thyra::Nonlin: ANA/APP Interface
• FEI/Thyra: APP to LAL Interfaces
• Custom/Thyra: LAL to LAL Interfaces

[Diagram: ANA, APP and LAL boxes; the LAL box contains Matrix, Preconditioner and Vector.]

LAL Foundation: Petra

• Petra provides a “common language” for distributed linear algebra objects (operator, matrix, vector).
• Petra provides distributed matrix and vector services.
• Has 3 implementations under development.

Petra Object Model

[Class diagram; the annotations attached to its classes are listed below.]

Abstract Interface to Parallel Machine
• Shameless mimic of MPI interface.
• Keeps MPI dependence to a single class (through all of Trilinos!).
• Allow trivial serial implementation.
• Opens door to novel parallel libraries (shmem, UPC, etc…)

Abstract Interface for Sparse All-to-All Communication
• Supports construction of pre-recorded “plan” for data-driven communications.
• Examples:
  • Supports gathering/scatter of off-processor x/y values when computing y = Ax.
  • Gathering overlap rows for Overlapping Schwarz.
  • Redistribution of matrices, vectors, etc…

Describes layout of distributed objects:
• Vectors: Number of vector entries on each processor and global ID.
• Matrices/graphs: Rows/Columns managed by a processor.
• Called “Maps” in Epetra.

Perform redistribution of distributed objects:
• Parallel permutations.
• “Ghosting” of values for local computations.
• Collection of partial results from remote processors.

Dense Distributed Vector and Matrices:
• Simple local data structure.
• BLAS-able, LAPACK-able.
• Ghostable, redistributable.
• RTOp-able.

Base Class for All Distributed Objects:
• Performs all communication.
• Requires Check, Pack, Unpack methods from derived class.

Graph class for structure-only computations:
• Reusable matrix structure.
• Pattern-based preconditioners.
• Pattern-based load balancing tools.

Basic sparse matrix class:
• Flexible construction process.
• Arbitrary entry placement on parallel machine.
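The "map" and pre-recorded communication "plan" ideas can be sketched without any parallel library. The following plain-Python simulation is an illustration only: `block_map`, `import_plan` and the tridiagonal example are assumptions made here, not Epetra classes or methods. It assigns global IDs to simulated ranks and then records, for each rank, which off-processor entries of x it would need to gather before computing its rows of y = Ax.

```python
def block_map(num_global, num_ranks):
    """Assign contiguous global IDs to ranks; returns one list of IDs per rank."""
    base, extra = divmod(num_global, num_ranks)
    layout, start = [], 0
    for rank in range(num_ranks):
        count = base + (1 if rank < extra else 0)
        layout.append(list(range(start, start + count)))
        start += count
    return layout


def import_plan(layout, needed_by_rank):
    """For each rank, record which off-processor IDs to fetch from which owner."""
    owner = {gid: rank for rank, gids in enumerate(layout) for gid in gids}
    plan = []
    for rank, needed in enumerate(needed_by_rank):
        fetch = {}
        for gid in sorted(needed):
            if owner[gid] != rank:
                fetch.setdefault(owner[gid], []).append(gid)
        plan.append(fetch)
    return plan


num_global = 10
layout = block_map(num_global, 3)

# Tridiagonal y = A x: computing row i needs x[i-1], x[i] and x[i+1].
needed = [
    {j for i in gids for j in (i - 1, i, i + 1) if 0 <= j < num_global}
    for gids in layout
]

plan = import_plan(layout, needed)
for rank, fetch in enumerate(plan):
    print(rank, layout[rank], fetch)
```

The plan depends only on the layout and the sparsity pattern, so it can be built once and reused for every matrix-vector product, which is the point of pre-recording it.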

Petra Implementations

Three versions under development:

• Epetra (Essential Petra):
  • Current production version.
  • Restricted to real, double precision arithmetic.
  • Uses stable core subset of C++ (circa 2000).
  • Interfaces accessible to C and Fortran users.
• Tpetra (Templated Petra):
  • Next generation C++ version.
  • Templated scalar and ordinal fields.
  • Uses namespaces and STL: Improved usability/efficiency.
• Jpetra (Java Petra):
  • Pure Java. Portable to any JVM.
  • Accesses Java versions of MPI, LAPACK and BLAS through interfaces.

Data Model Wrap-Up

• Our target apps require flexible data model.
• Petra Object Model (POM) supports:
  • Arbitrary placement of vector, graph, matrix entries on parallel machine.
  • Arbitrary redistribution, ghosting and collection of distributed data.
• This flexibility is needed by LALs:
  • Algebraic and multi-level preconditioners.
  • Concrete distributed matrix kernels.
  • Direct methods.
• POM is complex:
  • Non-LALs (ANAs) do not rely on it.

Abstract Numerical Algorithms (ANAs)

Full “Vertical” Solver Coverage

(Slide repeated from the earlier Full “Vertical” Solver Coverage slide: transient, nonlinear, explicit linear, implicit linear and optimization problems with their Trilinos packages.)

Goal of Abstraction: Flexibility

• Linear solvers (Direct, Iterative, Preconditioners)
  • Abstraction of basic vector/matrix operations (dot, axpy, mv).
  • Can use any concrete linear algebra library (Epetra, PETSc, BLAS).
• Nonlinear solvers (Newton, etc.)
  • + Abstraction of linear solve (solve Ax=b).
  • Can use any concrete linear solver library (AztecOO, ML, PETSc, LAPACK).
• Transient/DAE solvers (implicit)
  • + Abstraction of nonlinear solve.
• … and so on.

But how do we really make it work, and retain high performance? Our answer: Thyra.

Page 41: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Introducing Abstract Numerical Algorithms

What is an abstract numerical algorithm (ANA)?

An ANA is a numerical algorithm that can be expressed abstractly solely in terms of vectors, vector spaces, and linear operators (i.e., not with direct element access).

Example Linear ANA (LANA): Linear Conjugate Gradients

[Figure: Linear Conjugate Gradient Algorithm, annotated with the types of operations and types of objects it uses]

Types of operations:
• linear operator applications
• vector-vector operations
• scalar operations
• scalar product <x,y> defined by vector space
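As an illustration of the ANA idea, the sketch below writes conjugate gradients so that it touches vectors only through `dot`, `axpy`, `scale` and an operator `apply`. This is plain C++ with hypothetical names (`Vector`, `LinearOp`, `DenseOp`, `conjugateGradient`), not the Thyra interfaces; the dense operator is just one possible concrete implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Concrete vector type; the CG routine below never indexes into it.
struct Vector {
  std::vector<double> data;
};

// Scalar product and vector-vector operations (the only vector access CG needs).
double dot(const Vector& x, const Vector& y) {
  double s = 0.0;
  for (std::size_t i = 0; i < x.data.size(); ++i) s += x.data[i] * y.data[i];
  return s;
}
void axpy(double alpha, const Vector& x, Vector& y) {  // y <- y + alpha*x
  for (std::size_t i = 0; i < x.data.size(); ++i) y.data[i] += alpha * x.data[i];
}
void scale(double alpha, Vector& x) {                   // x <- alpha*x
  for (double& xi : x.data) xi *= alpha;
}

// Abstract linear operator: y <- A x.
struct LinearOp {
  virtual ~LinearOp() = default;
  virtual void apply(const Vector& x, Vector& y) const = 0;
};

// One possible concrete operator: a small dense matrix.
struct DenseOp : LinearOp {
  std::vector<std::vector<double>> rows;
  explicit DenseOp(std::vector<std::vector<double>> r) : rows(std::move(r)) {}
  void apply(const Vector& x, Vector& y) const override {
    for (std::size_t i = 0; i < rows.size(); ++i) {
      y.data[i] = 0.0;
      for (std::size_t j = 0; j < rows[i].size(); ++j)
        y.data[i] += rows[i][j] * x.data[j];
    }
  }
};

// Linear CG for symmetric positive definite A; returns the iteration count.
int conjugateGradient(const LinearOp& A, const Vector& b, Vector& x,
                      double tol, int maxIters) {
  Vector r = b;
  Vector q = b;
  A.apply(x, q);
  axpy(-1.0, q, r);                 // r = b - A x
  Vector p = r;
  double rho = dot(r, r);
  int it = 0;
  for (; it < maxIters && std::sqrt(rho) > tol; ++it) {
    A.apply(p, q);                  // linear operator application
    const double alpha = rho / dot(p, q);
    axpy(alpha, p, x);              // x = x + alpha p
    axpy(-alpha, q, r);             // r = r - alpha A p
    const double rhoNew = dot(r, r);
    scale(rhoNew / rho, p);         // p = r + beta p
    axpy(1.0, r, p);
    rho = rhoNew;
  }
  return it;
}
```

Because `conjugateGradient` never reads an element directly, the same routine would work unchanged with a distributed vector type that implements the same four operations.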

Page 42: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Examples of Nonlinear Abstract Numerical Algorithms

Class of Numerical Problem    Example ANA
Linear equations              GMRES (e.g. Belos)
Nonlinear equations           Newton’s method (e.g. NOX, MOOCHO); requires linear equations
Initial Value DAE/ODE         Backward Euler method (e.g. Rythmos); requires nonlinear equations

Page 43: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Basic Thyra Vector and Operator Interfaces

[UML class diagram: OpBase, LinearOp, VectorSpace, MultiVector, Vector, RTOpT; associations: domain, range, space, columns (1..*), <<create>>]

• A linear operator is a kind of operator
• An operator knows its domain and range spaces
• LinearOps apply to Vectors
• A Vector knows its VectorSpace
• VectorSpaces create Vectors!

Page 44: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Basic Thyra Vector and Operator Interfaces

[UML class diagram: OpBase, LinearOp, VectorSpace, MultiVector, Vector, RTOpT; associations: domain, range, space, columns (1..*), <<create>>]

• Basically compatible with HCL (Symes et al.) and many other operator/vector interfaces

• Near optimal for many but not all abstract numerical algorithms (ANAs)

• What’s missing?

=> Multi-vectors!

Page 45: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Introducing Multi-Vectors

What is a multi-vector?
• An m multi-vector V is a tall thin dense matrix composed of m column vectors $v_j$
• Example: m = 4 columns, $V = [\, v_1 \; v_2 \; v_3 \; v_4 \,]$

What ANAs can exploit multi-vectors?
• Block linear solvers (e.g. block GMRES)
• Block eigen solvers (i.e. block Arnoldi)
• Compact limited memory quasi-Newton
• Tensor methods for nonlinear equations

Why are multi-vectors important?
• Cache performance
• Reduce global communication

Examples of multi-vector operations
• Block dot products ($m^2$): $Q = X^T Y$
• Block update: $Y = Y + X Q$
• Operator applications (i.e. mat-vecs): $Y = A X$
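The block dot product and block update listed above can be sketched in a few lines of plain C++. The `MultiVector` struct and the function names here are hypothetical (this is not the Thyra or Epetra multi-vector class); storage is assumed column-major, and the small result matrix Q is stored row-major.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Tall thin dense matrix: n rows, m columns, stored column by column.
struct MultiVector {
  std::size_t n, m;
  std::vector<double> v;  // entry (i, j) lives at v[i + j*n]
  MultiVector(std::size_t rows, std::size_t cols)
      : n(rows), m(cols), v(rows * cols, 0.0) {}
  double& operator()(std::size_t i, std::size_t j) { return v[i + j * n]; }
  double operator()(std::size_t i, std::size_t j) const { return v[i + j * n]; }
};

// Block dot product Q = X^T Y: all X.m * Y.m dot products in one sweep
// over the rows. Q is row-major: Q[a * Y.m + b] = <x_a, y_b>.
std::vector<double> blockDot(const MultiVector& X, const MultiVector& Y) {
  std::vector<double> Q(X.m * Y.m, 0.0);
  for (std::size_t i = 0; i < X.n; ++i)
    for (std::size_t a = 0; a < X.m; ++a)
      for (std::size_t b = 0; b < Y.m; ++b)
        Q[a * Y.m + b] += X(i, a) * Y(i, b);
  return Q;
}

// Block update Y = Y + X Q, with Q an X.m-by-Y.m row-major matrix.
void blockUpdate(const MultiVector& X, const std::vector<double>& Q,
                 MultiVector& Y) {
  for (std::size_t i = 0; i < X.n; ++i)
    for (std::size_t b = 0; b < Y.m; ++b) {
      double s = 0.0;
      for (std::size_t a = 0; a < X.m; ++a) s += X(i, a) * Q[a * Y.m + b];
      Y(i, b) += s;
    }
}
```

In a distributed setting each process would compute its local Q and a single global sum of the small Q matrix would follow, instead of one reduction per dot product; that is the "reduce global communication" point.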

Page 46: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Basic Thyra Vector and Operator Interfaces

[UML class diagram: OpBase, LinearOp, VectorSpace, MultiVector, Vector, RTOpT; associations: domain, range, space, columns (1..*), <<create>>]

Where do multi-vectors fit in?

Page 47: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Basic Thyra Vector and Operator Interfaces

[UML class diagram: OpBase, LinearOp, VectorSpace, MultiVector, Vector, RTOpT; associations: domain, range, space, columns (1..*), <<create>>]

• A MultiVector is a linear operator
• A MultiVector has a collection of column vectors
• A MultiVector is a tall thin dense matrix
• VectorSpaces create MultiVectors
• LinearOps apply to MultiVectors
• A Vector is a MultiVector

Page 48: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Basic Thyra Vector and Operator Interfaces

[UML class diagram: OpBase, LinearOp, VectorSpace, MultiVector, Vector, RTOpT; associations: domain, range, space, columns (1..*), <<create>>]

What about standard vector ops?
• Reductions (norm, dot etc.)?
• Transformations (axpy, scaling etc.)?

What about specialized vector ops?
• e.g. Interior point methods for opt

Page 49: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

What’s the Big Deal about Vector-Vector Operations?

Examples from OOQP (Gertz, Wright)

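$y_i \leftarrow y_i / x_i, \quad i = 1 \ldots n$

$y_i \leftarrow y_i \, x_i \, z_i, \quad i = 1 \ldots n$

$y_i \leftarrow \begin{cases} y^{\min} - y_i & \text{if } y_i < y^{\min} \\ y^{\max} - y_i & \text{if } y_i > y^{\max} \\ 0 & \text{if } y^{\min} \le y_i \le y^{\max} \end{cases}, \quad i = 1 \ldots n$

$\alpha \leftarrow \max \{ \alpha : x + \alpha d \ge \beta \}$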

Example from TRICE (Dennis, Heinkenschloss, Vicente)

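$d_i \leftarrow \begin{cases} (b - u)_i^{1/2} & \text{if } w_i < 0 \text{ and } b_i < +\infty \\ 1 & \text{if } w_i < 0 \text{ and } b_i = +\infty \\ (u - a)_i^{1/2} & \text{if } w_i \ge 0 \text{ and } a_i > -\infty \\ 1 & \text{if } w_i \ge 0 \text{ and } a_i = -\infty \end{cases}, \quad i = 1 \ldots n$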

Example from IPOPT (Waechter)

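[Formula: element-wise update of $x_i$, $i = 1 \ldots n$, with three cases that compare $x_i$ against perturbed bounds $\hat{x}_i^L$ and $\hat{x}_i^U$; the perturbed bounds are defined through min and max expressions in $x_i^L$ and $x_i^U$]

Currently in MOOCHO: > 40 vector operations!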

Many different and unusual vector operations are needed by interior point methods for optimization!

Page 50: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Vector Reduction/Transformation Operators Defined

Reduction/Transformation Operators (RTOp) Defined
• $z_1^i \ldots z_q^i \leftarrow \mathrm{op}_t(i, v_1^i \ldots v_p^i, z_1^i \ldots z_q^i)$: element-wise transformation
• $\beta \leftarrow \mathrm{op}_r(i, v_1^i \ldots v_p^i, z_1^i \ldots z_q^i)$: element-wise reduction
• $\beta_2 \leftarrow \mathrm{op}_{rr}(\beta_1, \beta_2)$: reduction of intermediate reduction objects
• $v_1 \ldots v_p \in \mathbb{R}^n$: p non-mutable (constant) input vectors
• $z_1 \ldots z_q \in \mathbb{R}^n$: q mutable (non-constant) input/output vectors
• $\beta$: reduction target object (may be non-scalar (e.g. $\{y_k, k\}$), or NULL)

Key to Optimal Performance
• $\mathrm{op}_t(\ldots)$ and $\mathrm{op}_r(\ldots)$ applied to entire sets of subvectors ($i = a \ldots b$) independently: $z_1^{a:b} \ldots z_q^{a:b}, \beta \leftarrow \mathrm{op}(a, b, v_1^{a:b} \ldots v_p^{a:b}, z_1^{a:b} \ldots z_q^{a:b}, \beta)$
• Communication between sets of subvectors only for $\beta \ne$ NULL, $\mathrm{op}_{rr}(\beta_1, \beta_2) \to \beta_2$

Software Implementation (see RTOp paper on Trilinos website “Publications”)
• “Visitor” design pattern (a.k.a. function forwarding)

[UML class diagram]
RTOpT: apply_op(in sv[1..p], inout sz[1…q], inout beta)
  · ROpDotProd: apply_op(…)
  · TOpAssignVectors: apply_op(…)
Vector: applyOp(in : RTOpT, in v[1..p], inout z[1…q], inout beta)
  · SerialVectorStd: applyOp(…)
  · MPIVectorStd: applyOp(…)
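A rough sketch of the visitor idea, under simplifying assumptions: the reduction target is a single `double` combined by addition, and "parallel" data is modeled as a vector stored in several chunks. All type and function names (`RTOp`, `DotProdOp`, `AssignOp`, `ChunkedVector`, `applyOp`) are hypothetical and only loosely follow the class names on the slide; this is not the RTOp package API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A contiguous piece of a vector, e.g. the part stored on one process.
struct SubVector { const double* values; std::size_t size; };
struct MutableSubVector { double* values; std::size_t size; };

// Simplified operator interface: scalar reduction target only.
struct RTOp {
  virtual ~RTOp() = default;
  // Apply the operator to one set of subvectors, updating *beta.
  virtual void apply_op(const std::vector<SubVector>& v,
                        const std::vector<MutableSubVector>& z,
                        double* beta) const = 0;
  // Combine two intermediate reduction objects (default: sum).
  virtual void reduce(double in, double* inout) const { *inout += in; }
};

// Reduction: beta += sum_i v0[i] * v1[i].
struct DotProdOp : RTOp {
  void apply_op(const std::vector<SubVector>& v,
                const std::vector<MutableSubVector>&,
                double* beta) const override {
    for (std::size_t i = 0; i < v[0].size; ++i)
      *beta += v[0].values[i] * v[1].values[i];
  }
};

// Transformation: z0[i] = v0[i]; the reduction target is left untouched.
struct AssignOp : RTOp {
  void apply_op(const std::vector<SubVector>& v,
                const std::vector<MutableSubVector>& z,
                double*) const override {
    for (std::size_t i = 0; i < v[0].size; ++i)
      z[0].values[i] = v[0].values[i];
  }
};

// Stand-in for a distributed vector: data split into chunks.
struct ChunkedVector {
  std::vector<std::vector<double>> chunks;
};

// The vector side of the visitor: hand each chunk to the operator, then
// combine the per-chunk reduction objects. All vectors share one layout.
double applyOp(const RTOp& op,
               const std::vector<const ChunkedVector*>& v,
               const std::vector<ChunkedVector*>& z) {
  const std::size_t numChunks =
      v.empty() ? z.front()->chunks.size() : v.front()->chunks.size();
  double total = 0.0;
  for (std::size_t c = 0; c < numChunks; ++c) {
    std::vector<SubVector> sv;
    for (const ChunkedVector* x : v)
      sv.push_back({x->chunks[c].data(), x->chunks[c].size()});
    std::vector<MutableSubVector> sz;
    for (ChunkedVector* y : z)
      sz.push_back({y->chunks[c].data(), y->chunks[c].size()});
    double beta = 0.0;
    op.apply_op(sv, sz, &beta);   // independent per chunk
    op.reduce(beta, &total);      // communication only happens here
  }
  return total;
}
```

New vector operations are added by writing another `RTOp` subclass; the vector implementation itself does not change.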

Page 51: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Basic Thyra Vector and Operator Interfaces

[UML class diagram: OpBase, LinearOp, VectorSpace, MultiVector, Vector, RTOpT; associations: domain, range, space, columns (1..*)]

The missing piece: Reduction/Transformation Operators
• Supports all needed vector operations
• Data/parallel independence
• Optimal performance

R. A. Bartlett, B. G. van Bloemen Waanders and M. A. Heroux. Vector Reduction/Transformation Operators, ACM TOMS, March 2004

Page 52: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.


Managing Parallel Data

The Petra Object Model

Page 53: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.


A Simple Epetra/AztecOO Program

// Header files omitted…
int main(int argc, char *argv[]) {
  MPI_Init(&argc,&argv); // Initialize MPI, MpiComm
  Epetra_MpiComm Comm( MPI_COMM_WORLD );

  // ***** Map puts same number of equations on each pe *****
  int NumMyElements = 1000 ;
  Epetra_Map Map(-1, NumMyElements, 0, Comm);
  int NumGlobalElements = Map.NumGlobalElements();

  // ***** Create an Epetra_Matrix tridiag(-1,2,-1) *****
  Epetra_CrsMatrix A(Copy, Map, 3);
  double negOne = -1.0; double posTwo = 2.0;

  for (int i=0; i<NumMyElements; i++) {
    int GlobalRow = A.GRID(i);
    int RowLess1 = GlobalRow - 1;
    int RowPlus1 = GlobalRow + 1;
    if (RowLess1!=-1)
      A.InsertGlobalValues(GlobalRow, 1, &negOne, &RowLess1);
    if (RowPlus1!=NumGlobalElements)
      A.InsertGlobalValues(GlobalRow, 1, &negOne, &RowPlus1);
    A.InsertGlobalValues(GlobalRow, 1, &posTwo, &GlobalRow);
  }
  A.FillComplete(); // Transform from GIDs to LIDs

  // ***** Create x and b vectors *****
  Epetra_Vector x(Map);
  Epetra_Vector b(Map);
  b.Random(); // Fill RHS with random #s

  // ***** Create Linear Problem *****
  Epetra_LinearProblem problem(&A, &x, &b);

  // ***** Create/define AztecOO instance, solve *****
  AztecOO solver(problem);
  solver.SetAztecOption(AZ_precond, AZ_Jacobi);
  solver.Iterate(1000, 1.0E-8);

  // ***** Report results, finish ***********************
  cout << "Solver performed " << solver.NumIters() << " iterations." << endl
       << "Norm of true residual = " << solver.TrueResidual() << endl;

  MPI_Finalize() ;
  return 0;
}

Page 54: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Typical Flow of Epetra Object Construction

Construct Comm
• Any number of Comm objects can exist.
• Comms can be nested (e.g., serial within MPI).

Construct Map
• Maps describe parallel layout.
• Maps typically associated with more than one comp object.
• Two maps (source and target) define an export/import object.

Construct x, Construct b, Construct A
• Computational objects.
• Compatibility assured via common map.

Page 55: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Parallel Data Redistribution

• Epetra vectors, multivectors, graphs and matrices are distributed via one of the map objects.
• A map is basically a partitioning of a list of global IDs:
  · IDs are simply labels, no need to use contiguous values (Directory class handles details for general ID lists).
  · No a priori restriction on replicated IDs.
• If we are given:
  · A source map and
  · A set of vectors, multivectors, graphs and matrices (or other distributable objects) based on source map.
• Redistribution is performed by:
  1. Specifying a target map with a new distribution of the global IDs.
  2. Creating Import or Export object using the source and target maps.
  3. Creating vectors, multivectors, graphs and matrices that are redistributed (to target map layout) using the Import/Export object.

Page 56: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Example: epetra/ex9.cpp

int main(int argc, char *argv[]) {
  MPI_Init(&argc, &argv);
  Epetra_MpiComm Comm(MPI_COMM_WORLD);
  int NumGlobalElements = 4; // global dimension of the problem
  int NumMyElements; // local nodes
  Epetra_IntSerialDenseVector MyGlobalElements;

  if( Comm.MyPID() == 0 ) {
    NumMyElements = 3;
    MyGlobalElements.Size(NumMyElements);
    MyGlobalElements[0] = 0;
    MyGlobalElements[1] = 1;
    MyGlobalElements[2] = 2;
  } else {
    NumMyElements = 3;
    MyGlobalElements.Size(NumMyElements);
    MyGlobalElements[0] = 1;
    MyGlobalElements[1] = 2;
    MyGlobalElements[2] = 3;
  }
  // create a map
  Epetra_Map Map(-1,MyGlobalElements.Length(),
                 MyGlobalElements.Values(),0, Comm);

  // create a vector based on map
  Epetra_Vector xxx(Map);
  for( int i=0 ; i<NumMyElements ; ++i )
    xxx[i] = 10*( Comm.MyPID()+1 );
  if( Comm.MyPID() == 0 ){
    double val = 12;
    int pos = 3;
    xxx.SumIntoGlobalValues(1,0,&val,&pos);
  }
  cout << xxx;

  // create a target map, in which all elements are on proc 0
  int NumMyElements_target;
  if( Comm.MyPID() == 0 )
    NumMyElements_target = NumGlobalElements;
  else
    NumMyElements_target = 0;
  Epetra_Map TargetMap(-1,NumMyElements_target,0,Comm);
  Epetra_Export Exporter(Map,TargetMap);

  // work on vectors
  Epetra_Vector yyy(TargetMap);
  yyy.Export(xxx,Exporter,Add);
  cout << yyy;

  MPI_Finalize();
  return( EXIT_SUCCESS );
}

Page 57: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Output: epetra/ex9.cpp

> mpirun -np 2 ./ex9.exe
Epetra::Vector
     MyPID   GID   Value
         0     0      10
         0     1      10
         0     2      10
Epetra::Vector
         1     1      20
         1     2      20
         1     3      20
Epetra::Vector
     MyPID   GID   Value
         0     0      10
         0     1      30
         0     2      30
         0     3      20
Epetra::Vector

Before Export:
• PE 0: xxx(0)=10, xxx(1)=10, xxx(2)=10
• PE 1: xxx(1)=20, xxx(2)=20, xxx(3)=20

Export/Add

After Export:
• PE 0: yyy(0)=10, yyy(1)=30, yyy(2)=30, yyy(3)=20
• PE 1: (no entries)

Page 58: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.


Import vs. Export

Import (Export) means calling processor knows what it wants to receive (send).

Distinction between Import/Export is important to user, almost identical in implementation.

Import (Export) objects can be used to do an Export (Import) as a reverse operation.

When mapping is bijective (1-to-1 and onto), either Import or Export is appropriate.

Page 59: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.


Example: 1D Matrix Assembly

Problem: $-u_{xx} = f$ on $(a, b)$, $u(a) = 0$, $u(b) = 1$

[Figure: interval from a to b with nodes x1, x2, x3, split between PE 0 and PE 1]

• 3 Equations: Find u at x1, x2 and x3

• Equation for u at x2 gets a contribution from PE 0 and PE 1.

• Would like to compute partial contributions independently.

• Then combine partial results.

Page 60: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Two Maps

We need two maps:

Assembly map:
• PE 0: { 1, 2 }.
• PE 1: { 2, 3 }.

Solver map:
• PE 0: { 1, 2 } (we arbitrate ownership of 2).
• PE 1: { 3 }.

Page 61: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

End of Assembly Phase

At the end of assembly phase we have AssemblyMatrix:

On PE 0:
  Equation 1:   2 1 0
  Equation 2:   1 1 0

On PE 1:
  Equation 2:   0 1 1
  Equation 3:   0 1 2

Row 2 is shared.

Want to assign all of Equation 2 to PE 0 for use with solver.

NOTE: For a class of Neumann-Neumann preconditioners, the above layout is exactly what we want.

Page 62: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.


Export Assembly Matrix to Solver Matrix

Epetra_Export Exporter(AssemblyMap, SolverMap);

Epetra_CrsMatrix SolverMatrix (Copy, SolverMap, 0);

SolverMatrix.Export(AssemblyMatrix, Exporter, Add);

SolverMatrix.FillComplete();
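What the Export with the Add combine mode does to the shared row can be modeled without MPI. The sketch below is plain C++, not Epetra: `Row`, `LocalRows` and `exportAdd` are hypothetical names, each "PE" is just an entry in a `std::vector`, and the row values are the ones printed on the assembly slide.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

using Row = std::array<double, 3>;      // one matrix row of the 3x3 example
using LocalRows = std::map<int, Row>;   // GID -> row held by one PE

// Model of Export with combine mode Add: every source row is sent to the
// PE that owns its GID in the target map, and contributions are summed.
std::vector<LocalRows> exportAdd(const std::vector<LocalRows>& source,
                                 const std::map<int, int>& ownerOfGid) {
  std::vector<LocalRows> target(source.size());
  for (const LocalRows& rows : source) {
    for (const auto& entry : rows) {
      const int gid = entry.first;
      Row& dst = target[ownerOfGid.at(gid)][gid];  // zero-filled on first access
      for (std::size_t k = 0; k < dst.size(); ++k) dst[k] += entry.second[k];
    }
  }
  return target;
}
```

With the assembly map {1, 2} / {2, 3} and the solver map {1, 2} / {3}, the two partial copies of row 2 end up summed on PE 0 and PE 1 keeps only row 3.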

Page 63: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Matrix Export

Before Export:
• PE 0:
  Equation 1:   2 1 0
  Equation 2:   1 1 0
• PE 1:
  Equation 2:   0 1 1
  Equation 3:   0 1 2

Export/Add

After Export:
• PE 0:
  Equation 1:   2 1 0
  Equation 2:   1 2 1
• PE 1:
  Equation 3:   0 1 2

Page 64: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Example: epetraext/ex2.cpp

int main(int argc, char *argv[]) {

MPI_Init(&argc,&argv);

Epetra_MpiComm Comm (MPI_COMM_WORLD);

int MyPID = Comm.MyPID();

int n=4;

// Generate Laplacian2d gallery matrix

Trilinos_Util::CrsMatrixGallery G("laplace_2d", Comm);

G.Set("problem_size", n*n);

G.Set("map_type", "linear"); // Linear map initially

// Get the LinearProblem.

Epetra_LinearProblem *Prob = G.GetLinearProblem();

// Get the exact solution.

Epetra_MultiVector *sol = G.GetExactSolution();

// Get the rhs (b) and lhs (x)

Epetra_MultiVector *b = Prob->GetRHS();

Epetra_MultiVector *x = Prob->GetLHS();

// Repartition graph using Zoltan

EpetraExt::Zoltan_CrsGraph * ZoltanTrans = new EpetraExt::Zoltan_CrsGraph();

EpetraExt::LinearProblem_GraphTrans * ZoltanLPTrans =

new EpetraExt::LinearProblem_GraphTrans( *(dynamic_cast<EpetraExt::StructuralSameTypeTransform<Epetra_CrsGraph>*>(ZoltanTrans)) );

cout << "Creating Load Balanced Linear Problem\n";

Epetra_LinearProblem &BalancedProb = (*ZoltanLPTrans)(*Prob);

// Get the rhs (b) and lhs (x) of the balanced problem

Epetra_MultiVector *Balancedb = BalancedProb.GetRHS();

Epetra_MultiVector *Balancedx = BalancedProb.GetLHS();

cout << "Balanced b: " << *Balancedb << endl;

cout << "Balanced x: " << *Balancedx << endl;

MPI_Finalize() ;

return 0 ;

}

Page 65: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Need for Import/Export

Solvers for complex engineering applications need expressive, easy-to-use parallel data redistribution:
• Allows better scaling for non-uniform overlapping Schwarz.
• Necessary for robust solution of multiphysics problems.

We have found import and export facilities to be a very natural and powerful technique to address these issues.

Page 66: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.


Details about Epetra Maps

Getting beyond standard use case…

Page 67: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

1-to-1 Maps

1-to-1 map (defn): A map is 1-to-1 if each GID appears only once in the map (and is therefore associated with only a single processor).

Certain operations in parallel data repartitioning require 1-to-1 maps. Specifically:
• The source map of an import must be 1-to-1.
• The target map of an export must be 1-to-1.
• The domain map of a 2D object must be 1-to-1.
• The range map of a 2D object must be 1-to-1.
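The definition can be checked mechanically. The following plain C++ sketch (the function name `isOneToOne` and the list-of-lists representation are illustrative assumptions, not Epetra code) takes the GIDs held by each processor and reports whether any GID appears more than once.

```cpp
#include <cassert>
#include <set>
#include <vector>

// gidsPerProc[p] lists the GIDs that processor p holds in the map.
// The map is 1-to-1 if no GID appears twice, on the same or another processor.
bool isOneToOne(const std::vector<std::vector<int>>& gidsPerProc) {
  std::set<int> seen;
  for (const std::vector<int>& gids : gidsPerProc)
    for (int gid : gids)
      if (!seen.insert(gid).second) return false;  // GID already seen
  return true;
}
```

Applied to the earlier assembly example, the assembly map { 1, 2 } / { 2, 3 } is not 1-to-1 (GID 2 is replicated), while the solver map { 1, 2 } / { 3 } is, which is why the solver map can serve as the target of the export.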

Page 68: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

2D Objects: Four Maps

Epetra 2D objects:
• CrsMatrix
• CrsGraph
• VbrMatrix

Have four maps:
• RowMap: On each processor, the GIDs of the rows that processor will “manage”. (Typically a 1-to-1 map)
• ColMap: On each processor, the GIDs of the columns that processor will “manage”. (Typically NOT a 1-to-1 map)
• DomainMap: The layout of domain objects (the x vector/multivector in y=Ax). (Must be a 1-to-1 map!!!)
• RangeMap: The layout of range objects (the y vector/multivector in y=Ax). (Must be a 1-to-1 map!!!)

Page 69: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

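Sample Problem

$$
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}}_{y}
=
\underbrace{\begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}}_{x}
$$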

Page 70: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Case 1: Standard Approach

First 2 rows of A, elements of y and elements of x, kept on PE 0. Last row of A, element of y and element of x, kept on PE 1.

PE 0 Contents:
$y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \ldots \; A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \end{bmatrix}, \ldots \; x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$
RowMap = {0, 1} ColMap = {0, 1, 2} DomainMap = {0, 1} RangeMap = {0, 1}

PE 1 Contents:
$y = [\, y_3 \,], \ldots \; A = [\, 0 \;\; 1 \;\; 2 \,], \ldots \; x = [\, x_3 \,]$
RowMap = {2} ColMap = {1, 2} DomainMap = {2} RangeMap = {2}

Notes:
• Rows are wholly owned.
• RowMap=DomainMap=RangeMap (all 1-to-1).
• ColMap is NOT 1-to-1.
• Call to FillComplete: A.FillComplete(); // Assumes

Original Problem: $y = A x$ with $A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{bmatrix}$
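The ColMap entries listed for Case 1 follow from the rows each PE owns: a PE needs every column in which one of its rows has a nonzero. A small plain C++ sketch of that rule (the function name `columnMap` and the dense-matrix input are assumptions made for illustration; Epetra derives this internally from the sparse rows):

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Given the global matrix (dense here for brevity) and the rows one PE owns,
// collect the global column indices that hold a nonzero in any owned row.
std::set<int> columnMap(const std::vector<std::vector<double>>& A,
                        const std::vector<int>& rowMap) {
  std::set<int> cols;
  for (int r : rowMap)
    for (std::size_t c = 0; c < A[static_cast<std::size_t>(r)].size(); ++c)
      if (A[static_cast<std::size_t>(r)][c] != 0.0)
        cols.insert(static_cast<int>(c));
  return cols;
}
```

For the sample matrix, rows {0, 1} give columns {0, 1, 2} and row {2} gives columns {1, 2}; columns 1 and 2 appear on both PEs, which is why the ColMap is not 1-to-1.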

Page 71: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Case 2: Twist 1

First 2 rows of A, first element of y and last 2 elements of x, kept on PE 0. Last row of A, last 2 elements of y and first element of x, kept on PE 1.

PE 0 Contents:
$y = [\, y_1 \,], \ldots \; A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \end{bmatrix}, \ldots \; x = \begin{bmatrix} x_2 \\ x_3 \end{bmatrix}$
RowMap = {0, 1} ColMap = {0, 1, 2} DomainMap = {1, 2} RangeMap = {0}

PE 1 Contents:
$y = \begin{bmatrix} y_2 \\ y_3 \end{bmatrix}, \ldots \; A = [\, 0 \;\; 1 \;\; 2 \,], \ldots \; x = [\, x_1 \,]$
RowMap = {2} ColMap = {1, 2} DomainMap = {0} RangeMap = {1, 2}

Notes:
• Rows are wholly owned.
• RowMap is NOT = DomainMap is NOT = RangeMap (all 1-to-1).
• ColMap is NOT 1-to-1.
• Call to FillComplete: A.FillComplete(DomainMap, RangeMap);

Original Problem: $y = A x$ with $A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{bmatrix}$

Page 72: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Case 2: Twist 2

First row of A, part of second row of A, first element of y and last 2 elements of x, kept on PE 0.

Last row, part of second row of A, last 2 elements of y and first element of x, kept on PE 1.

PE 0 Contents:
$y = [\, y_1 \,], \ldots \; A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 1 & 0 \end{bmatrix}, \ldots \; x = \begin{bmatrix} x_2 \\ x_3 \end{bmatrix}$
RowMap = {0, 1} ColMap = {0, 1} DomainMap = {1, 2} RangeMap = {0}

PE 1 Contents:
$y = \begin{bmatrix} y_2 \\ y_3 \end{bmatrix}, \ldots \; A = \begin{bmatrix} 0 & 1 & 1 \\ 0 & 1 & 2 \end{bmatrix}, \ldots \; x = [\, x_1 \,]$
RowMap = {1, 2} ColMap = {1, 2} DomainMap = {0} RangeMap = {1, 2}

Notes:
• Rows are NOT wholly owned.
• RowMap is NOT = DomainMap is NOT = RangeMap (all 1-to-1).
• RowMap and ColMap are NOT 1-to-1.
• Call to FillComplete: A.FillComplete(DomainMap, RangeMap);

Original Problem: $y = A x$ with $A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{bmatrix}$

Page 73: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.


Overview of Trilinos Packages

Page 74: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Trilinos Common Language: Petra

• Petra provides a “common language” for distributed linear algebra objects (operator, matrix, vector)
• Petra¹ provides distributed matrix and vector services.
• Exists in basic form as an object model:
  · Describes basic user and support classes in UML, independent of language/implementation.
  · Describes objects and relationships to build and use matrices, vectors and graphs.
  · Has 3 implementations under development.

¹ Petra is Greek for “foundation”.

Page 75: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Petra Implementations

Three versions under development:

• Epetra (Essential Petra):
  · Current production version.
  · Restricted to real, double precision arithmetic.
  · Uses stable core subset of C++ (circa 2000).
  · Interfaces accessible to C and Fortran users.
• Tpetra (Templated Petra):
  · Next generation C++ version.
  · Templated scalar and ordinal fields.
  · Uses namespaces, and STL: Improved usability/efficiency.
• Jpetra (Java Petra):
  · Pure Java. Portable to any JVM.
  · Interfaces to Java versions of MPI, LAPACK and BLAS.

Developers: Mike Heroux, Rob Hoekstra, Alan Williams, Paul Sexton

Page 76: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Epetra

• Basic Stuff: What you would expect.
• Variable block matrix data structures.
• Multivectors.
• Arbitrary index labeling.
• Flexible, versatile parallel data redistribution.
• Language support for inheritance, polymorphism and extensions.
• View vs. Copy.

Page 77: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

AztecOO

• Krylov subspace solvers: CG, GMRES, Bi-CGSTAB,…
• Incomplete factorization preconditioners
• Aztec is the workhorse solver at Sandia:
  · Extracted from the MPSalsa reacting flow code.
  · Installed in dozens of Sandia apps.
  · 1900+ external licenses.
• AztecOO improves on Aztec by:
  · Using Epetra objects for defining matrix and RHS.
  · Providing more preconditioners/scalings.
  · Using C++ class design to enable more sophisticated use.
• AztecOO interface allows:
  · Continued use of Aztec for functionality.
  · Introduction of new solver capabilities outside of Aztec.

Developers: Mike Heroux, Alan Williams, Ray Tuminaro

Page 78: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

IFPACK: Algebraic Preconditioners

• Overlapping Schwarz preconditioners with incomplete factorizations, block relaxations, block direct solves.
• Accept user matrix via abstract matrix interface (Epetra versions).
• Uses Epetra for basic matrix/vector calculations.
• Supports simple perturbation stabilizations and condition estimation.
• Separates graph construction from factorization, improves performance substantially.
• Compatible with AztecOO, ML, Amesos. Can be used by NOX and ML.

Developers: Marzio Sala, Mike Heroux

Page 79: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Amesos

Interface to direct solvers for distributed sparse linear systems

• Challenges:
  · No single solver dominates
  · Different interfaces and data formats, serial and parallel
  · Interfaces often change between revisions
• Amesos offers:
  · A single, clear, consistent interface to various packages
  · Common look-and-feel for all classes
  · Separation from specific solver details
  · Use serial and distributed solvers; Amesos takes care of data redistribution

Developers: Ken Stanley, Marzio Sala, Tim Davis

Page 80: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Calling Amesos

// A, x, b: the matrix and the solution and right-hand-side vectors
Epetra_LinearProblem Problem( A, x, b);
Amesos Afact;
// Ask the factory for the KLU solver interface
Amesos_BaseSolver* Solver = Afact.Create( "Klu", Problem ) ;
Solver->SymbolicFactorization( ) ;
Solver->NumericFactorization( ) ;
Solver->Solve( ) ;

Page 81: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Amesos: Supported Libraries

Library name        Language   Communicator   # procs for solution¹
KLU                 C          Serial         1
Paraklete           C          MPI            any
UMFPACK 4.4         C          Serial         1
SuperLU 3.0         C          Serial         1
SuperLU_DIST 2.0    C          MPI            any
MUMPS 4.6.2         F90        MPI            any
ScaLAPACK 1.7       F77        MPI            any

Others: DSCPACK, PARDISO, TAUCS

¹ Matrices can be distributed over any number of processes

Page 82: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

ML: Multi-level Preconditioners

• Smoothed aggregation, multigrid and domain decomposition preconditioning package
• Critical technology for scalable performance of some key apps.
• ML compatible with other Trilinos packages:
  · Accepts user data as Epetra_RowMatrix object (abstract interface). Any implementation of Epetra_RowMatrix works.
  · Implements the Epetra_Operator interface. Allows ML preconditioners to be used with AztecOO and TSF.
• Can also be used completely independent of other Trilinos packages.

Developers: Ray Tuminaro, Jonathan Hu, Marzio Sala

Page 83: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.


NOX: Nonlinear Solvers

Suite of nonlinear solution methods

NOX uses abstract vector and “group” interfaces:
• Allows flexible selection and tuning of various strategies:
  · Directions.
  · Line searches.
• Epetra/AztecOO/ML, LAPACK, PETSc implementations of abstract vector/group interfaces.

Designed to be easily integrated into existing applications.

Developers: Tammy Kolda, Roger Pawlowski

Page 84: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

LOCA

Library of continuation algorithms

Provides:
• Zero order continuation
• First order continuation
• Arc length continuation
• Multi-parameter continuation (via Henderson's MF Library)
• Turning point continuation
• Pitchfork bifurcation continuation
• Hopf bifurcation continuation
• Phase transition continuation
• Eigenvalue approximation (via ARPACK or Anasazi)

Developers: Andy Salinger, Eric Phipps

Page 85: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

EpetraExt: Extensions to Epetra

• Library of useful classes not needed by everyone
• Most classes are types of “transforms”.
• Examples:
  · Graph/matrix view extraction.
  · Epetra/Zoltan interface.
  · Explicit sparse transpose.
  · Singleton removal filter, static condensation filter.
  · Overlapped graph constructor, graph colorings.
  · Permutations.
  · Sparse matrix-matrix multiply.
  · Matlab, MatrixMarket I/O functions.
  · …
• Most classes are small, useful, but non-trivial to write.

Developers: Robert Hoekstra, Alan Williams, Mike Heroux

Page 86: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Teuchos

• Utility package of commonly useful tools:
  · ParameterList class: key/value pair database, recursive capabilities.
  · LAPACK, BLAS wrappers (templated on ordinal and scalar type).
  · Dense matrix and vector classes (compatible with BLAS/LAPACK).
  · FLOP counters, Timers.
  · Ordinal, Scalar Traits support: Definition of ‘zero’, ‘one’, etc.
  · Reference counted pointers, and more…
• Takes advantage of advanced features of C++:
  · Templates
  · Standard Template Library (STL)
• ParameterList: Allows easy control of solver parameters.

Developers: Roscoe Bartlett, Kevin Long, Heidi Thornquist, Mike Heroux, Paul Sexton, Kris Kampshoff

Page 87: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.


Belos and Anasazi

Next generation linear solvers (Belos) and eigensolvers (Anasazi) libraries, written in templated C++.

Provide a generic interface to a collection of algorithms for solving large-scale linear problems and eigenproblems.

Algorithm implementation is accomplished through the use of abstract base classes (mini interface). Interfaces are derived from these base classes to operator-vector products, status tests, and any arbitrary linear algebra library.

Includes block linear solvers and eigensolvers.

Developers: Heidi Thornquist, Mike Heroux, Chris Baker, Rich Lehoucq

Page 88: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

Kokkos

• Goal: Isolate key non-BLAS kernels for the purposes of optimization.
• Kernels:
  · Dense vector/multivector updates and collective ops (not in BLAS/Teuchos).
  · Sparse MV, MM, SV, SM.
• Serial-only for now.
• Reference implementation provided (templated).
• Mechanism for improving performance:
  · Default is aggressive compilation of reference source.
  · BeBOP: Jim Demmel, Kathy Yelick, Rich Vuduc, UC Berkeley.
  · Vector version: Cray.

Developer: Mike Heroux

Page 89: 8/22/2006 1 Trilinos Tutorial Overview and Basic Concepts Michael A. Heroux Sandia National Laboratories ACTS Toolkit Workshop 2006 Sandia is a multiprogram.

NewPackage Package

• NewPackage provides a jump start to develop/integrate a new package
• NewPackage is a “Hello World” program and website:
  · Simple but it does work with autotools.
  · Compiles and builds.
• NewPackage directory contains:
  · Commonly used directory structure: src, test, doc, example, config.
  · Working autotools files.
  · Documentation templates (doxygen).
  · Working regression test setup.
• Substantially cuts down on:
  · Time to integrate new package.
  · Variation in package integration details.
  · Development of website.


CLAPS Package

Clark Dohrmann:

Expert in computational structural mechanics. Unfamiliar with parallel computing. Limited C++ experience.

Epetra: Advanced parallel basic linear algebra package. Supports sparse, dense linear algebra and parallel data redistribution.

Clark+Epetra: CLOP: Overlapping Schwarz preconditioner. Scales with increasing number of constraints. sleg[1,2,4]x joint models, 12 modes, vplant cluster (previously unsolvable).

# DOF        # Constraints   # procs   Time
33,145       4,956           4         11m 36s
227,170      18,366          24        29m 22s
1,674,502    70,566          144       29m 22s

3 Months to develop. Depends only on Epetra. Compatible with rest of Trilinos.

CLAPS=Clark’s Linear Algebra Suite


Pliris Package

Migration of Linpack benchmark code to supported environment. Used new_package as starting point. Put under source control for the first time. Portable build process for the first time. Documented distributed data model (Epetra). Has its own Bugzilla product, mail lists, regression tests. Source code browsable via Bonsai. And more… But Pliris is still an independent piece of code… and better off after the process.


Interface Packages

Thyra and PyTrilinos


template<class Scalar>
bool sillyCgSolve(
  const Thyra::LinearOpBase<Scalar> &A
  ,const Thyra::VectorBase<Scalar> &b
  ,const int maxNumIters
  ,const typename Teuchos::ScalarTraits<Scalar>::magnitudeType tolerance
  ,Thyra::VectorBase<Scalar> *x
  ,std::ostream *out = NULL
  )
{
  using Teuchos::RefCountPtr;
  typedef Teuchos::ScalarTraits<Scalar> ST; // We need to use ScalarTraits to support arbitrary types.
  typedef typename ST::magnitudeType ScalarMag; // This is the type returned from a vector norm.
  // Get the vector space (domain and range spaces should be the same)
  RefCountPtr<const Thyra::VectorSpaceBase<Scalar> > space = A.domain();
  // Compute initial residual : r = b - A*x
  RefCountPtr<Thyra::VectorBase<Scalar> > r = createMember(space);
  Thyra::assign(&*r,b); // r = b
  Thyra::apply(A,Thyra::NOTRANS,*x,&*r,Scalar(-ST::one()),ST::one()); // r = -A*x + r
  const ScalarMag r0_nrm = Thyra::norm(*r); // Compute ||r0|| = sqrt(<r0,r0>) for convergence test
  if(r0_nrm == ST::zero()) return true; // Trivial RHS and initial LHS guess?
  // Create workspace vectors and scalars
  RefCountPtr<Thyra::VectorBase<Scalar> > p = createMember(space), q = createMember(space);
  Scalar rho_old;
  // Perform the iterations
  for( int iter = 0; iter <= maxNumIters; ++iter ) {
    // Check convergence and output iteration
    const ScalarMag r_nrm = Thyra::norm(*r); // Compute ||r|| = sqrt(<r,r>)
    if( r_nrm/r0_nrm < tolerance ) return true; // Converged to tolerance, Success!
    // Compute iteration
    const Scalar rho = space->scalarProd(*r,*r); // <r,r> -> rho
    if(iter==0) Thyra::assign(&*p,*r); // r -> p (iter == 0)
    else Thyra::Vp_V( &*p, *r, Scalar(rho/rho_old) ); // r+(rho/rho_old)*p -> p (iter > 0)
    Thyra::apply(A,Thyra::NOTRANS,*p,&*q); // A*p -> q
    const Scalar alpha = rho/space->scalarProd(*p,*q); // rho/<p,q> -> alpha
    Thyra::Vp_StV( x, Scalar(+alpha), *p ); // +alpha*p + x -> x
    Thyra::Vp_StV( &*r, Scalar(-alpha), *q ); // -alpha*q + r -> r
    rho_old = rho; // rho -> rho_old (remember rho for next iter)
  }
  return false; // Failure
} // end sillyCgSolve

Thyra: sillyCGSolve

Generic CG code. To use, provide:

Thyra::LinearOp. Thyra::Vector.

Works on any parallel machine.

Allows mixing of vectors and matrices from different libraries: PETSc & Epetra.

Works with any scalar datatype.
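The "works with any scalar datatype" point can be illustrated outside Thyra with a small Python sketch. The routine below follows the same CG steps but is written only in terms of +, -, * and / on the scalars, so it runs unchanged on floats and on exact rationals. The names `cg_solve` and `make_operator` are hypothetical, and the stopping test compares squared residuals (an assumption made here to avoid a square root, which exact rationals do not support).

```python
from fractions import Fraction


def cg_solve(apply_a, b, x, max_iters, tol_sq):
    """Conjugate gradients using only +, -, *, / on the scalar type.

    apply_a : callable returning A*v for a vector v (list of scalars)
    b, x    : right-hand side and initial guess (lists of scalars)
    tol_sq  : stop when <r,r> <= tol_sq * <r0,r0>
    Returns (x, converged).
    """
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    r = [bi - ai for bi, ai in zip(b, apply_a(x))]  # r = b - A*x
    rho0 = dot(r, r)
    if rho0 == 0:
        return x, True
    p, rho_old = None, None
    for it in range(max_iters + 1):
        rho = dot(r, r)
        if rho <= tol_sq * rho0:
            return x, True
        if it == 0:
            p = list(r)
        else:
            beta = rho / rho_old
            p = [ri + beta * pi for ri, pi in zip(r, p)]
        q = apply_a(p)
        alpha = rho / dot(p, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rho_old = rho
    return x, False


def make_operator(rows):
    """Wrap a dense row list as a matrix-free operator."""
    return lambda v: [sum(a * vj for a, vj in zip(row, v)) for row in rows]


A = [[4, 1], [1, 3]]  # symmetric positive definite
b = [1, 2]

# Same routine, two scalar types.
x_float, ok_float = cg_solve(make_operator(A), [float(v) for v in b],
                             [0.0, 0.0], 10, 1e-24)
x_exact, ok_exact = cg_solve(make_operator(A), [Fraction(v) for v in b],
                             [Fraction(0), Fraction(0)], 10, Fraction(0))
```

With exact rationals the residual becomes exactly zero after two iterations for this 2-by-2 system, which is why a tolerance of zero is usable there.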


PyTrilinos

PyTrilinos provides Python access to Trilinos packages. Uses SWIG to generate bindings. Epetra, AztecOO, IFPACK, ML, NOX, LOCA, Amesos and NewPackage are supported.

Possible to: Define RowMatrix implementation in Python. Use from Trilinos C++ code.

Performance for coarse-grained operations is equivalent to C++. Very fine-grained code can be several times slower.


#include <cassert>
#include <cstdlib>
#include <iostream>
#include "mpi.h"
#include "Epetra_MpiComm.h"
#include "Epetra_Vector.h"
#include "Epetra_Time.h"
#include "Epetra_RowMatrix.h"
#include "Epetra_CrsMatrix.h"
#include "Epetra_LinearProblem.h"
#include "Trilinos_Util_CrsMatrixGallery.h"
using namespace std;
using namespace Trilinos_Util;

int main(int argc, char *argv[])
{
  MPI_Init(&argc, &argv);
  Epetra_MpiComm Comm(MPI_COMM_WORLD);
  int nx = 1000;
  int ny = 1000 * Comm.NumProc();
  CrsMatrixGallery Gallery("laplace_2d", Comm);
  Gallery.Set("ny", ny);
  Gallery.Set("nx", nx);
  Gallery.Set("problem_size", nx*ny);
  Gallery.Set("map_type", "linear");
  Epetra_LinearProblem* Problem = Gallery.GetLinearProblem();
  assert (Problem != 0);
  // retrieve pointers to solution (lhs), right-hand side (rhs)
  // and matrix itself (A)
  Epetra_MultiVector* lhs = Problem->GetLHS();
  Epetra_MultiVector* rhs = Problem->GetRHS();
  Epetra_RowMatrix* A = Problem->GetMatrix();
  Epetra_Time Time(Comm);
  for (int i = 0 ; i < 10 ; ++i)
    A->Multiply(false, *lhs, *rhs);
  cout << Time.ElapsedTime() << endl;
  MPI_Finalize();
  return(EXIT_SUCCESS);
} // end of main()

#! /usr/bin/env python
from PyTrilinos import Epetra, Triutils
Epetra.Init()
Comm = Epetra.PyComm()
nx = 1000
ny = 1000 * Comm.NumProc()
Gallery = Triutils.CrsMatrixGallery("laplace_2d", Comm)
Gallery.Set("nx", nx)
Gallery.Set("ny", ny)
Gallery.Set("problem_size", nx * ny)
Gallery.Set("map_type", "linear")
Matrix = Gallery.GetMatrix()
LHS = Gallery.GetStartingSolution()
RHS = Gallery.GetRHS()
Time = Epetra.Time(Comm)
for i in xrange(10):
    Matrix.Multiply(False, LHS, RHS)
print Time.ElapsedTime()
Epetra.Finalize()

C++ vs. Python: Equivalent Code


Templates and Generic Programming


Investment in Generic Programming

2nd Generation Trilinos packages are templated on: OrdinalType (think int). ScalarType (think double).

Examples: Teuchos::SerialDenseMatrix<int, double> A; Teuchos::SerialDenseMatrix<short, float> B;

Scalar/Ordinal Traits mechanism completes support for genericity. The following packages support templates:

Teuchos (Basic Tools), Thyra (Abstract Interfaces), Tpetra (including MPI support), Belos (Krylov and Block Krylov Linear), IFPACK (algebraic preconditioners, next version), Anasazi (Eigensolvers).


Potential Benefits of Templated Types

Templated scalar (and ordinal) types have great potential: Generic: Algorithms expressed over abstract field. High Performance: Partial template specialization + Fortran. Facilitate variety of algorithmic studies. Allow study of asymptotic behavior of discretizations. Facilitate debugging: higher precision reduces floating-point error as a possible source of error. Use your imagination… All new Trilinos packages are templated.


Kokkos Results: Sparse MV

Pentium M 1.6GHz, Cygwin/GCC 3.2 (WinXP Laptop)

Data Set   Dimension   Nonzeros   # RHS   MFLOPS
DIE3D      9873        1733371    1       247.62
                                  2       418.94
                                  3       577.79
                                  4       691.62
                                  5       787.90
dday01     21180       923006     1       230.75
                                  2       407.96
                                  3       553.80
                                  4       668.24
                                  5       738.40
FIDAP035   19716       218308     1       171.22
                                  2       349.29
                                  3       374.24
                                  4       498.99
                                  5       545.77
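The MFLOPS figures grow with the number of right-hand sides. One common explanation, offered here as an assumption rather than something stated on the slide, is that each matrix entry fetched from memory is reused for every right-hand-side column. A minimal sketch of a sparse matrix times multivector kernel in compressed sparse row (CSR) form shows where that reuse happens; the function name `csr_matmultivec` and the data layout are hypothetical and not the Kokkos interface.

```python
def csr_matmultivec(row_ptr, col_idx, values, X):
    """Y = A*X for a CSR matrix A and a multivector X.

    X is a list of rows; each row holds one entry per right-hand side.
    """
    num_rhs = len(X[0])
    Y = []
    for i in range(len(row_ptr) - 1):
        acc = [0.0] * num_rhs
        for k in range(row_ptr[i], row_ptr[i + 1]):
            a, xrow = values[k], X[col_idx[k]]
            # One matrix entry is loaded once and applied to every RHS column.
            for j in range(num_rhs):
                acc[j] += a * xrow[j]
        Y.append(acc)
    return Y


# 3x3 example matrix: [[2, 0, 1], [0, 3, 0], [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [2.0, 1.0, 3.0, 4.0, 5.0]
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]  # two right-hand sides
Y = csr_matmultivec(row_ptr, col_idx, values, X)
```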


ARPREC

The ARPREC library uses arrays of 64-bit floating-point numbers to represent high-precision floating-point numbers.

ARPREC values behave just like any other floating-point datatype, except the maximum working precision (in decimal digits) must be specified before any calculations are done, e.g. mp::mp_init(200);

Illustrate the use of ARPREC with an example using Hilbert matrices.


Hilbert Matrices

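A Hilbert matrix H_N is a square N-by-N matrix such that:

$$(H_N)_{ij} = \frac{1}{i + j - 1}$$

For example:

$$H_3 = \begin{pmatrix} 1 & \frac{1}{2} & \frac{1}{3} \\ \frac{1}{2} & \frac{1}{3} & \frac{1}{4} \\ \frac{1}{3} & \frac{1}{4} & \frac{1}{5} \end{pmatrix}$$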
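As a quick check of the definition, the entries 1/(i+j-1) can be generated exactly with rational arithmetic. This is a minimal sketch; the helper name `hilbert` is chosen here for illustration.

```python
from fractions import Fraction


def hilbert(n):
    """Exact n-by-n Hilbert matrix; entry (i, j) is 1/(i+j-1) with 1-based indices."""
    return [[Fraction(1, i + j - 1) for j in range(1, n + 1)]
            for i in range(1, n + 1)]


H3 = hilbert(3)
for row in H3:
    print([str(entry) for entry in row])
```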


Hilbert Matrices

Notoriously ill-conditioned:

κ(H_3) ≈ 524

κ(H_5) ≈ 476610

κ(H_10) ≈ 1.6025 × 10^13

κ(H_20) ≈ 7.8413 × 10^17

κ(H_100) ≈ 1.7232 × 10^20

Hilbert matrices introduce large amounts of error


Hilbert Matrices and Cholesky Factorization

With double-precision arithmetic, Cholesky factorization will fail for H_N for all N > 13.

Can we improve on this using arbitrary-precision floating-point numbers?

Precision                    Largest N for which Cholesky factorization is successful
Single Precision             8
Double Precision             13
Arbitrary Precision (20)     29
Arbitrary Precision (40)     40
Arbitrary Precision (200)    145
Arbitrary Precision (400)    233
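The experiment can be imitated without ARPREC using Python's standard `decimal` module, which also lets the working precision be set in decimal digits. This is only a sketch under that substitution: the helper names are hypothetical, and because the arithmetic differs from ARPREC and from IEEE single/double, the values of N it reports should not be expected to match the table above.

```python
from decimal import Decimal, localcontext


def cholesky_succeeds(n, digits):
    """Try a Cholesky factorization of the n-by-n Hilbert matrix using
    `digits` significant decimal digits; return False on a non-positive pivot."""
    with localcontext() as ctx:
        ctx.prec = digits
        # 0-based indices, so entry (i, j) is 1/(i+j+1).
        h = [[Decimal(1) / Decimal(i + j + 1) for j in range(n)] for i in range(n)]
        L = [[Decimal(0)] * n for _ in range(n)]
        for j in range(n):
            pivot = h[j][j] - sum(L[j][k] * L[j][k] for k in range(j))
            if pivot <= 0:
                return False  # matrix is numerically not positive definite
            L[j][j] = pivot.sqrt()
            for i in range(j + 1, n):
                s = sum(L[i][k] * L[j][k] for k in range(j))
                L[i][j] = (h[i][j] - s) / L[j][j]
        return True


def largest_n(digits, n_max):
    """Largest n <= n_max such that every size 1..n factors successfully."""
    n = 0
    while n < n_max and cholesky_succeeds(n + 1, digits):
        n += 1
    return n


for digits in (6, 16, 50):
    print(digits, "digits -> largest N (capped at 20):", largest_n(digits, 20))
```

Raising the number of digits pushes the failure point to larger N, which is the qualitative behavior the table reports.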


New Trilinos 7.0 Packages


Galeri & Isorropia

Galeri: Generates Epetra maps and matrices for testing. A set of functionalities very close to that of MATLAB's gallery() function. Capabilities to create several well-known finite difference matrices. A simple finite element code that can be used to discretize scalar second order elliptic problems using Galerkin and SUPG techniques, on both 2D and 3D unstructured grids.

Isorropia: Repartitioning/rebalancing. Assists with redistributing objects such as:

• Matrices and matrix-graphs in a parallel execution setting.

• Allows for more efficient computations.

• Primarily an interface to the Zoltan library.

• Can be built and used with minimal capability without Zoltan.


Moertel & Moocho

Moertel: Capabilities for nonconforming mesh tying and contact formulations in 2 and 3 dimensions using Mortar methods. Mortar methods are types of Lagrange Multiplier constraints that can be used in contact formulations and in non-conforming or conforming mesh tying as well as in domain decomposition techniques.

Used in a large class of nonconforming situations such as the surface coupling of different physical models, discretization schemes or non-matching triangulations along interior interfaces of a domain.

MOOCHO: (Multifunctional Object-Oriented arCHitecture for Optimization) Large-scale invasive simultaneous analysis and design (SAND) using primarily SQP-related methods.


RTOp & Rythmos

RTOp: (reduction/transformation operators) Provides basic mechanism for implementing vector operations in a flexible and efficient manner.
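The reduction/transformation idea can be sketched as follows: a vector operation is packaged as an operator object, and the vector implementation only needs to know how to hand each locally owned chunk to the operator and, for reductions, how to combine the partial results. Everything below is a hypothetical illustration in plain Python (class names, method names and the chunked layout are assumptions), not the RTOp interface.

```python
from functools import reduce


class DotProdOp:
    """Hypothetical reduction operator: sum of x[i]*y[i]."""
    initial = 0.0

    def apply_chunk(self, x, y):
        return sum(a * b for a, b in zip(x, y))

    def combine(self, left, right):
        return left + right


class AxpyOp:
    """Hypothetical transformation operator: y[i] <- alpha*x[i] + y[i]."""

    def __init__(self, alpha):
        self.alpha = alpha

    def apply_chunk(self, x, y):
        for i in range(len(y)):
            y[i] += self.alpha * x[i]


def apply_reduction(op, x_chunks, y_chunks):
    # Each chunk stands in for the locally owned part of a distributed vector.
    partials = [op.apply_chunk(x, y) for x, y in zip(x_chunks, y_chunks)]
    return reduce(op.combine, partials, op.initial)


def apply_transformation(op, x_chunks, y_chunks):
    for x, y in zip(x_chunks, y_chunks):
        op.apply_chunk(x, y)


x = [[1.0, 2.0], [3.0]]
y = [[4.0, 5.0], [6.0]]
dot = apply_reduction(DotProdOp(), x, y)
apply_transformation(AxpyOp(2.0), x, y)
```

New vector operations are then added by writing a new operator class, without touching the vector implementation.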

Rythmos: ODE/DAE. Includes: backward Euler, forward Euler, explicit Runge-Kutta, and implicit BDF at this time. Native support for operator split methods. Highly modular. Forward sensitivity computations will be included in the first release, with adjoint sensitivities coming in the near future.
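To make the difference between the explicit and implicit steppers listed above concrete, here is a minimal sketch of forward and backward Euler on the scalar test equation y' = λy. It does not use Rythmos; the function names and the choice λ = -50 are assumptions made for illustration.

```python
import math


def forward_euler(lam, y0, dt, steps):
    """Explicit update y_{n+1} = y_n + dt*lam*y_n for y' = lam*y."""
    y = y0
    for _ in range(steps):
        y = y + dt * lam * y
    return y


def backward_euler(lam, y0, dt, steps):
    """Implicit update y_{n+1} = y_n + dt*lam*y_{n+1}, solved in closed form."""
    y = y0
    for _ in range(steps):
        y = y / (1.0 - dt * lam)
    return y


lam, y0 = -50.0, 1.0
exact = y0 * math.exp(lam * 1.0)               # solution at t = 1
fe_small = forward_euler(lam, y0, 0.001, 1000)
be_small = backward_euler(lam, y0, 0.001, 1000)
fe_large = forward_euler(lam, y0, 0.1, 10)     # dt*|lam| = 5: explicit step is unstable
be_large = backward_euler(lam, y0, 0.1, 10)    # implicit step still decays
```

With the large step the explicit iterate grows in magnitude while the implicit one decays, which is the usual motivation for offering implicit methods for stiff problems.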


WebTrilinos

WebTrilinos: Web interface to Trilinos Generate test problems or read from file. Generate C++ or Python code fragments and click-run. Hand modify code fragments and re-run. Will use during hands-on.


Dynamic External Package Support

New directory Trilinos/packages/external. Supports seamless integration of externally developed packages via package registration. Completes the goal of Trilinos-compatibility.


Software Quality


SQA/SQE

Software Quality Assurance/Engineering is important. No longer sufficient to say “We do a good job”. Trilinos facilitates SQA/SQE development/processes for packages: 32 of 47 ASCI SQE practices are directly handled by Trilinos (no requirements on packages). Trilinos provides significant support for the remaining 15. Trilinos Dev Guide Part II: Specific to ASCI requirements. Trilinos software engineering policies provide a ready-made infrastructure for new packages.

Trilinos philosophy: Few requirements. Instead mostly suggested practices. Provides package with option to provide alternate process.


Trilinos Service / SQE Practices Impact

Yearly Trilinos User Group Meeting (TUG) and Developer Forum:

Once a year gathering for tutorials, package feature updates, user/developer requirements discussion and developer training.

— All Requirements steps: gathering, derivation, documentation, feasibility,etc.— User and Developer training.

Monthly Trilinos leaders meetings:

Trilinos leaders, including package development leaders, key managers, funding sources and other stakeholders participate in monthly phone meetings to discuss any timely issues related to the Trilinos Project.

—Developer Training.

—Design reviews.

—Policy decisions across all development phases.

Trilinos and package mail lists:

Trilinos lists for leaders, announcements, developers, users, checkins and similar lists at the package level support a variety of communication. All lists are archived, providing critical artifacts for assessments and audits.

—Developer/user/client communication.

—Requirements/design/testing artifacts.

—Announcement/documenting of releases.

Trilinos and Trilinos3PL source repositories:

All source code, development and user documentation is retained and tracked. In addition, reference versions of all external software, including BLAS, LAPACK, Umfpack, etc. are retained in Trilinos3PL.

—Source management.

—Versioning.

—Third-party software management.

Bugzilla Products:

Each package has its own Bugzilla Product with standard components.

—Requirements/faults capturing and tracking.

Trilinos configure script and M4 macros:

The Trilinos configure script and related macros support portable installation of Trilinos and its packages

—Portability.

—Software release.

Trilinos test harness:

Trilinos provides a base testing plan and automated testing across multiple platforms, plus creation of testing artifacts. Test harness results are used to derive a variety of metrics for SQE.

—Pre-checkin and regression testing.

—Software metrics.


The Trilinos Tutorial Package

Trilinos is large (and still growing…) More than 600K code lines Many packages

• a lot of functionalities…

• … and also a lot to learn

What is the best way for a new user to learn Trilinos? Using a Trilinos package of course: Didasko!


What DIDASKO is not

DIDASKO, as a tutorial, cannot cover the most advanced features of Trilinos: Only the “stable” features are covered

DIDASKO is not a substitute for each package’s documentation and examples, which remain of fundamental importance


Covered Packages

At present, we cover

Epetra, Triutils, IFPACK, Teuchos, AztecOO, ML, NOX, Amesos, EpetraExt, Anasazi, Tpetra


Trilinos Availability/Information

Trilinos and related packages are available via LGPL. Current release (6.0) is “click release”. Unlimited availability. Trilinos Release 7: September 2006.

Trilinos Awards: 2004 R&D 100 Award. SC2004 HPC Software Challenge Award. Sandia Team Employee Recognition Award. Lockheed-Martin Nova Award Nominee.

More information: http://software.sandia.gov http://software.sandia.gov/trilinos Additional documentation at my website: http://www.cs.sandia.gov/~mheroux. 4th Annual Trilinos User Group Meeting: November 7-9, 2006 at SNL.


Summary

Trilinos package model provides a welcoming environment for solver developers: Extremely flexible. Building up package, not making it conform. Scalable growth model for complex multi-physics solver suites. Working toward full vertical coverage of capabilities.

Basic principles can be applied to any similar “project of projects”: Modularity Interfaces Common infrastructure.

Trilinos new_package package is a self-contained software environment: Package and website useful to any project wanting to ramp up in use of standard tools. Available separately, Open Source. Used by groups outside of Trilinos as well as within.


Trilinos Might be of Interest to you if…

Your application includes non-PDE equations. Multiple RHS, Block Krylov linear, eigen solvers, SAND needed. You develop an independent Trilinos-compatible solver. Interested in multi-physics. App is in C++. Interested in collaborating on Fortran interface. Unstructured Data redistribution is important. Multi-precision useful. Parallel Python interest.


Extras


Algorithms Research: Truly Useful Multi-level Methods

Fly-through of next 4 slides. Theme:

Multi-level preconditioning has come of age across broad spectrum of problems.


1542/1414 (Gee,Heinstein,Key,Pierson,Tuminaro)

Novel nonlinear AMG

ML, NOX, Epetra, Amesos, AztecOO

Constraints projected out in F(x)

Improved coloring performance by 10x

Initial testing excellent convergence

Pressurize tire with 5 loadsteps 

Nonlinear AMG/ADAGIO

Iterations (its):

Jacobi   22
MG       13

# elmts   its
91        13
559       14
1241      25
2331      23
3925      30
6119      35

• Large reduction in iteration count.

• Slow growth as size increases.


Helium plume V&V Project

260K, 16 processor run

ML/GMRES is ~25% faster than without ML

As problem size increases, ML is expected to be more beneficial

ASC SIERRA Applications

SIERRA/Fuego/Syrinx

SIERRA/Aria: Multiphysics coupled potential/thermal/displacement DP MEMS problem. ML reduced solve time (40%) from ~20 minutes to ~12 minutes compared to actual ANSYS runs & ARIA re-creation of ANSYS scheme


Premo

premo (Latin) – to squeeze (compress)

Compressible flow to determine aerodynamic characteristics for the Nuclear Weapons Complex.

Additional issues that have been addressed: FEI/Nox/Trilinos interface development. Block algorithm improvements: Block Gauss-Seidel, Block Grid Transfers. Numerical issues: nonlinear path, time step path, setup time, linear sys. difficulty.

B61 problem (6.5 million dofs, 64 procs). Pseudo-transient + Newton. Euler flow, Mach .8. Linear solver: 173 solves, tolerance = 10^-4.

GMRES/ILU 7494 sec. GMRES/MG 3704 sec.


Premo

premo (Latin) – to squeeze (compress)

Falcon problem (13+ million dofs, 150 procs). Pseudo-transient + Newton. Euler flow, Mach .75. Linear solver: 109 solves, tolerance = 10^-4.

GMRES/ILU 7620 sec. GMRES/MG 3787 sec. 10x improvement on final linear solve. > 5x gains on some problems over entire sequence.

New Grid Transfers (220+K Falcon, 1 proc). Pseudo-transient + Newton, Euler flow @ Mach .75. Last linear solve, tolerance = 10^-9.

GMRES/MG/old transfers: 47 its, 49 sec. GMRES/MG/new transfers: 24 its, 26 sec.


Multi-level Methods Summary Solving hard, real problems fast, scalably. Still need more…


Algorithms Research: Specialized Solvers

Next wave of capabilities: Specialized solvers. Examples in Trilinos:

Optimal domain decomposition preconditioners for structures: CLAPS

Mortar methods for interface coupling: Moertel. Segregated Preconditioners for Navier-Stokes: Meros.

Examples in Applications: EMU (with Boeing). Tramonto.


Lipid Bi-Layer Problem

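[Figure: sparsity structure of the system matrix, partitioned into blocks A11, A12, A21, A22, with block dimensions labeled 19n, 3n and 2n; one block is labeled F and most remaining blocks are zero.]

• A11 solve easily applied in parallel.

• Apply GMRES to S = A22 – A21*inv(A11)*A12.

• Need a preconditioner for S.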
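The Schur complement reduction in the bullets above can be checked on a tiny example. The sketch below uses a made-up 4-by-4 system with a diagonal A11 block (so inv(A11) is trivial to apply), applies S without ever forming it, and substitutes a direct 2-by-2 solve for GMRES. All matrices, names and the direct solve are assumptions for illustration only, not the Tramonto solver.

```python
from fractions import Fraction as Fr

# Hypothetical system [[A11, A12], [A21, A22]] [x1; x2] = [b1; b2],
# with A11 diagonal so that inv(A11) is cheap to apply.
a11_diag = [Fr(2), Fr(4)]
A12 = [[Fr(1), Fr(0)], [Fr(0), Fr(2)]]
A21 = [[Fr(1), Fr(1)], [Fr(0), Fr(1)]]
A22 = [[Fr(3), Fr(1)], [Fr(1), Fr(3)]]
b1, b2 = [Fr(1), Fr(2)], [Fr(3), Fr(4)]


def matvec(M, v):
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]


def apply_inv_a11(v):
    return [vi / d for vi, d in zip(v, a11_diag)]


def apply_schur(v):
    """S*v = A22*v - A21*inv(A11)*A12*v, without forming S."""
    t = matvec(A21, apply_inv_a11(matvec(A12, v)))
    return [a - b for a, b in zip(matvec(A22, v), t)]


def solve_2x2(apply_op, rhs):
    """Direct solve standing in for GMRES: probe the operator, then Cramer's rule."""
    c0, c1 = apply_op([Fr(1), Fr(0)]), apply_op([Fr(0), Fr(1)])
    det = c0[0] * c1[1] - c1[0] * c0[1]
    return [(rhs[0] * c1[1] - c1[0] * rhs[1]) / det,
            (c0[0] * rhs[1] - rhs[0] * c0[1]) / det]


# Reduced right-hand side: b2 - A21*inv(A11)*b1
g = [p - q for p, q in zip(b2, matvec(A21, apply_inv_a11(b1)))]
x2 = solve_2x2(apply_schur, g)
# Back-substitute: x1 = inv(A11)*(b1 - A12*x2)
x1 = apply_inv_a11([p - q for p, q in zip(b1, matvec(A12, x2))])
```

Because S is only ever applied to vectors, an iterative method such as GMRES needs nothing beyond `apply_schur` and a preconditioner.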


Preconditioner for S

$$A_{22} = \begin{pmatrix} D_{11} & F \\ D_{21} & D_{22} \end{pmatrix} \approx \begin{pmatrix} D_{11} & F \\ 0 & D_{22} \end{pmatrix}$$

• D11, D22 = O(1), D21 = O(1e-10).

• Ignore D21 for preconditioning.

• P(S) requires: 2 diagonal scalings, matvec with F.

• All distributed operations.


Tramonto Solver Summary

3-level linear operator structure. Matlab to fully parallel: 1 month. Complex orchestration: Preconditioner: 100+ distributed Epetra matrices used in sequence. ML, IFPACK, Amesos used on subproblems.

Utilizes 8 Trilinos packages in total. 566 Lines of Code (Polymer Solver). Polymorphic Design.


3D Polymer Results

Same Iteration Counts as Processor Count Increases

Approximately Linear Performance Improvement For Fixed Size Problem


Properties of New Solver

Uniform distributed mesh → uniform distributed work. Preconditioner sub-steps naturally parallel:

→ Results invariant to processor count up to round-off.

Preconditioner requires almost no extra memory:→ 4-10X reduction over previous approach.

GMRES subspace and storage reduced 6X-10X or more. Speedup 20-2X. Solver has:

No tuning parameters. Near linear scaling.

Laurie’s Favorite Properties!

Michael A. Heroux, Laura J. D. Frink, and Andrew G. Salinger. Schur complement based approaches to solving density functional theories for inhomogeneous fluids on parallel computers. SIAM J. Sci. Comput. Submitted.


Specialized Solver API: EMU

EMU: Peridynamics modeling code. Boeing: Modeling of composite material failure. Onset to failure: Explicit time stepping. Prior to onset: Desire implicit (large-time step). Previous capability: Serial Y12M. Present: Trilinos serial and (almost) parallel. Status:

Solver API production-ready (290 LOC). Serial results complete. Parallel soon (requires work in EMU also).

First results (smallest problem): 3.5X faster (complete run).


Specialized Solvers Summary

Trilinos: Rich, configurable foundation. Parallel data model. Large and growing collection of solvers in scalable package framework.

Specialized solvers: Next-level advances Exploit inherent problem properties. Few Lines of Code: Leverage Trilinos. Robust, scalable, efficient.

Michael A. Heroux. A Solver-Independent API for multi-DOF Applications using Trilinos. Int. J. of Computational Science and Engineering. Submitted.