An Overview of TSFCore

Roscoe A. Bartlett

9211, Optimization and Uncertainty Estimation

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.

TSFCore SAND Reports

Get most recent copy at: Trilinos/doc/TSFCore

Nonlinear Equations : Foundation for all our Work!

Applications
• Discretized PDEs (e.g. finite element, finite volume, finite difference etc.)
• Network problems (e.g. Xyce)

Nonlinear Equations : Sensitivities

Related Algorithms
• Gradient-based optimization
  • SAND
  • NAND
• Nonlinear equations (NLS)
• Multidisciplinary analysis
• Linear (matrix) analysis
• Block iterative solvers
• Eigenvalue problems
• Uncertainty quantification
  • SFE
• Stability analysis / continuation
• Transients (ODEs, DAEs)

B. van Bloemen Waanders, R. A. Bartlett, K. R. Long and P. T. Boggs. Large Scale Non-Linear Programming: PDE Applications and Dynamical Systems, Sandia National Laboratories, SAND2002-3198, 2002

Applications, Algorithms, Linear-Algebra Software

APP : Application (e.g. MPSalsa, Xyce, SIERRA, NEVADA etc.)
LAL : Linear-Algebra Library (e.g. Petra/Ifpack, PETSc, Aztec etc.)
ANA : Abstract Numerical Algorithm (e.g. optimization, nonlinear solvers, stability analysis, SFE, transient solvers etc.)

[Diagram: APP Interface, Vec, Mat and Preconditioner boxes; multiplicities 1..*, 1..*, 1]

Computes functions

Key points
• Complex algorithms
• Complex software
• Complex interfaces
• Complex computers
• Duplication of effort?

Examples: Epetra_RowMatrix, fei::Matrix, TSF::MatrixOperator

TSFCore

[Diagram: APP Interface, Vec, Mat and Preconditioner boxes; multiplicities 1..*, 1..*]

Computes functions

Key points
• Maximizing development impact
• Software can be run on more sophisticated computers
• Fosters improved algorithm development

TSFCore

TSFCore::Nonlin

Requirements for TSFCore

TSFCore should:

Be portable to ASCI platforms

Provide for stable and accurate numerical computations

Represent a minimal but complete interface that will result in implementations that are:

Near optimal in computational speed

Near optimal in storage

Be independent of computing environment (SPMD, MS, CS etc.)

Be easy to develop adapters for existing libraries (e.g. Epetra, PETSc etc.)

Example ANA : Linear Conjugate Gradient Solver

TSFCore : Basic Linear Algebra Interfaces

[UML diagram: OpBase, LinearOp, VectorSpace, Vector, MultiVector; association labels: range, domain, space, columns (1..*)]

An operator knows its domain and range spaces

A linear operator is a kind of operator

Warning! Unified Modeling Language (UML) Notation!

A Vector knows its VectorSpace

<<create>>

VectorSpaces create Vectors!

A MultiVector is a linear operator!

A MultiVector has a collection of column vectors!

A MultiVector is a tall thin dense matrix

<<create>>

VectorSpaces create MultiVectors!

The Key to success! Reduction/Transformation Operators
• Supports all needed vector operations
• Data/parallel independence
• Optimal performance

R. A. Bartlett, B. G. van Bloemen Waanders and M. A. Heroux. Vector Reduction/Transformation Operators, Accepted to ACM TOMS, 2003

Background for TSFCore

1996 : Hilbert Class Library (HCL), [Symes and Gockenbach]

Abstract vector spaces, vectors, linear operators

2000 : Epetra, [Heroux]

Concrete multi-vectors

2001 : Trilinos Solver Framework (TSF) 0.1, [Long]

2001 : AbstractLinAlgPack (ALAP) (MOOCHO LA interfaces), [Bartlett]

Reduction/transformation operators (RTOp)

Abstract multi-vectors

TSFCore::VectorSpace
  dim : int
  createMember() : Vector
  createMembers(in numMembers : int) : MultiVector
  isCompatible(in vecSpc : VectorSpace) : bool
  scalarProd(in x : Vector, in y : Vector) : Scalar

TSFCore::Vector
  applyOp(in op : RTOpT, inout ...)

RTOpPack::RTOpT
  apply_op(inout ...)
  reduce_reduct_objs(inout ...)

TSFCore::MultiVector
  applyOp(in op : RTOpT, inout ...)
  subView(in col_rng : Range1D) : MultiVector
  subView(in numCols : int, in cols[1..numCols] : int) : MultiVector

TSFCore::OpBase
  opSupported(in M_trans) : bool

TSFCore::LinearOp
  apply(in M_trans, in x : Vector, out y : Vector, in ...)
  apply(in M_trans, in X : MultiVector, out Y : MultiVector, in ...)

TSFCore::VectorSpaceFactory
  createVecSpc(in dim : int) : VectorSpace

Association labels in the diagram: columns (1..*), domain, space, smallVecSpcFcty, «create»

VectorSpaceFactory is related to MultiVectors

VectorSpaces create Vectors and MultiVectors!

MultiVector subviews can be created!

Vector and MultiVector versions of apply(…)!

Adjoints supported but are optional!

Only one vector method!

TSFCore Details

All interfaces are templated on Scalar type (support real and complex)

Smart reference counted pointer class Teuchos::RefCountPtr<> used for all dynamic memory management

Many operations have default implementations based on very few pure virtual methods

RTOp operators (and wrapper functions) are provided for many common level-1 vector and multi-vector operations

Default implementation provided for MultiVector (MultiVectorCols)

Default implementations provided for serial computation: VectorSpace (SerialVectorSpace), VectorSpaceFactory (SerialVectorSpaceFactory), Vector (SerialVector)
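The points above (a vector space acts as a factory for its vectors, a vector exposes a single operator-application method, and other operations are built on top of it) can be illustrated with a minimal sketch. This is not the TSFCore API: `ElementOp`, `AssignOp`, `SumOp` and the `double`-only, non-templated classes are hypothetical simplifications, and `std::shared_ptr` stands in for `Teuchos::RefCountPtr`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an RTOp: it visits a contiguous chunk of elements.
struct ElementOp {
  virtual ~ElementOp() = default;
  virtual void apply(std::size_t offset, double* z, std::size_t n) = 0;
};

// The only pure virtual operation on a vector: apply an operator to its data.
class Vector {
public:
  virtual ~Vector() = default;
  virtual void applyOp(ElementOp& op) = 0;
};

// A vector space knows its dimension and creates compatible vectors.
class VectorSpace {
public:
  virtual ~VectorSpace() = default;
  virtual std::size_t dim() const = 0;
  virtual std::shared_ptr<Vector> createMember() const = 0;
};

// Serial (single-process) implementations, assumed here for illustration.
class SerialVector : public Vector {
public:
  explicit SerialVector(std::size_t n) : data_(n, 0.0) {}
  void applyOp(ElementOp& op) override { op.apply(0, data_.data(), data_.size()); }
private:
  std::vector<double> data_;
};

class SerialVectorSpace : public VectorSpace {
public:
  explicit SerialVectorSpace(std::size_t n) : n_(n) {}
  std::size_t dim() const override { return n_; }
  std::shared_ptr<Vector> createMember() const override {
    return std::make_shared<SerialVector>(n_);
  }
private:
  std::size_t n_;
};

// Two operators written purely against the chunk interface.
struct AssignOp : ElementOp {
  double alpha;
  explicit AssignOp(double a) : alpha(a) {}
  void apply(std::size_t, double* z, std::size_t n) override {
    for (std::size_t i = 0; i < n; ++i) z[i] = alpha;
  }
};

struct SumOp : ElementOp {
  double total = 0.0;
  void apply(std::size_t, double* z, std::size_t n) override {
    for (std::size_t i = 0; i < n; ++i) total += z[i];
  }
};

// Wrapper functions: the caller never touches the vector's storage directly.
void assign(Vector* y, double alpha) { AssignOp op(alpha); y->applyOp(op); }
double sum(Vector& v) { SumOp op; v.applyOp(op); return op.total; }
```

A caller would create a space, ask it for a member, and use only the wrapper functions; swapping `SerialVector` for a parallel implementation would not change that calling code.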

Vector-Vector Operations Provided with TSFCore

namespace TSFCore {

  template<class Scalar> Scalar sum( const Vector<Scalar>& v ); // result = sum(v(i))
  template<class Scalar> Scalar norm_1( const Vector<Scalar>& v ); // result = ||v||1
  template<class Scalar> Scalar norm_2( const Vector<Scalar>& v ); // result = ||v||2
  template<class Scalar> Scalar norm_inf( const Vector<Scalar>& v_rhs ); // result = ||v||inf
  template<class Scalar> Scalar dot( const Vector<Scalar>& x, const Vector<Scalar>& y ); // result = x'*y
  template<class Scalar> Scalar get_ele( const Vector<Scalar>& v, Index i ); // result = v(i)
  template<class Scalar> void set_ele( Index i, Scalar alpha, Vector<Scalar>* v ); // v(i) = alpha
  template<class Scalar> void assign( Vector<Scalar>* y, const Scalar& alpha ); // y = alpha
  template<class Scalar> void assign( Vector<Scalar>* y, const Vector<Scalar>& x ); // y = x
  template<class Scalar> void Vp_S( Vector<Scalar>* y, const Scalar& alpha ); // y += alpha
  template<class Scalar> void Vt_S( Vector<Scalar>* y, const Scalar& alpha ); // y *= alpha
  template<class Scalar> void Vp_StV( Vector<Scalar>* y, const Scalar& alpha, const Vector<Scalar>& x ); // y = alpha*x + y
  template<class Scalar> void ele_wise_prod( const Scalar& alpha, const Vector<Scalar>& x, const Vector<Scalar>& v, Vector<Scalar>* y ); // y(i)+=alpha*x(i)*v(i)
  template<class Scalar> void ele_wise_divide( const Scalar& alpha, const Vector<Scalar>& x, const Vector<Scalar>& v, Vector<Scalar>* y ); // y(i)=alpha*x(i)/v(i)
  template<class Scalar> void seed_randomize( unsigned int ); // Seed for randomize()
  template<class Scalar> void randomize( Scalar l, Scalar u, Vector<Scalar>* v ); // v(i) = random(l,u)

} // end namespace TSFCore
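To make the semantics in the comments above concrete, here is a small sketch of four of these operations on plain `std::vector<double>`. It mirrors only the documented behavior (`y += alpha*x`, `y(i) += alpha*x(i)*v(i)` and so on); the `Vec` alias and the non-templated signatures are assumptions for illustration and do not reproduce the TSFCore implementations, which are built on RTOp operators.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;  // assumed stand-in for Vector<Scalar>

// result = x'*y
double dot(const Vec& x, const Vec& y) {
  assert(x.size() == y.size());
  double result = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) result += x[i] * y[i];
  return result;
}

// result = ||v||2
double norm_2(const Vec& v) { return std::sqrt(dot(v, v)); }

// y = alpha*x + y
void Vp_StV(Vec* y, double alpha, const Vec& x) {
  assert(y->size() == x.size());
  for (std::size_t i = 0; i < x.size(); ++i) (*y)[i] += alpha * x[i];
}

// y(i) += alpha*x(i)*v(i)
void ele_wise_prod(double alpha, const Vec& x, const Vec& v, Vec* y) {
  assert(x.size() == v.size() && x.size() == y->size());
  for (std::size_t i = 0; i < x.size(); ++i) (*y)[i] += alpha * x[i] * v[i];
}
```

Note that `ele_wise_prod` accumulates into `y` while `ele_wise_divide` in the listing overwrites it, so the two are not symmetric.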

TSFCore : Vectors and Vector Spaces

C++ code:

template<class Scalar>
Scalar foo( const VectorSpace<Scalar>& S )
{
  Teuchos::RefCountPtr<Vector<Scalar> >
    x = S.createMember(),      // create x
    y = S.createMember();      // create y
  assign( &*x, 1.0 );          // x = 1
  randomize( -1.0, +1.0, &*y ); // y = rand(-1,1)
  Vp_StV( &*y, -2.0, *x );     // y += -2.0 * x
  Scalar gamma = dot(*x,*y);   // gamma = x'*y
  return gamma;
}

Mathematical notation: x ← 1, y ← rand(−1, +1), y ← y − 2x, γ ← xᵀy

TSFCore : Applying a Linear Operator

C++ Prototype:

namespace TSFCore {
  enum ETransp { NOTRANS, TRANS, CONJTRANS };
  template<class Scalar>
  class LinearOp : public virtual OpBase<Scalar> {
  public:
    virtual void apply(
      ETransp M_trans, const Vector<Scalar> &x, Vector<Scalar> *y
      ,Scalar alpha = 1.0, Scalar beta = 0.0
      ) const = 0;
  };
}

Example:

template<class Scalar>
void myOp( const Vector<Scalar> &x, const LinearOp<Scalar> &M, Vector<Scalar> *y )
{
  M.apply( NOTRANS, x, y );
}
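A concrete operator helps show what the `alpha` and `beta` defaults in the prototype do. The sketch below assumes the usual BLAS-style meaning, y = alpha*op(M)*x + beta*y, which the slide does not spell out. `DenseOp`, the `std::vector<double>` vectors and the real-only `ETransp` are hypothetical simplifications, not TSFCore classes.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

enum ETransp { NOTRANS, TRANS };

// Simplified, non-templated analogue of the LinearOp prototype.
class LinearOp {
public:
  virtual ~LinearOp() = default;
  // Assumed semantics: y = alpha*op(M)*x + beta*y
  virtual void apply(ETransp M_trans, const std::vector<double>& x,
                     std::vector<double>* y,
                     double alpha = 1.0, double beta = 0.0) const = 0;
};

// Hypothetical dense operator stored in row-major order.
class DenseOp : public LinearOp {
public:
  DenseOp(std::size_t rows, std::size_t cols, std::vector<double> a)
    : rows_(rows), cols_(cols), a_(std::move(a)) {
    assert(a_.size() == rows_ * cols_);
  }
  void apply(ETransp M_trans, const std::vector<double>& x,
             std::vector<double>* y, double alpha, double beta) const override {
    const bool t = (M_trans == TRANS);
    const std::size_t m = t ? cols_ : rows_;  // length of y
    const std::size_t n = t ? rows_ : cols_;  // length of x
    assert(x.size() == n && y->size() == m);
    for (std::size_t i = 0; i < m; ++i) {
      double s = 0.0;
      for (std::size_t j = 0; j < n; ++j)
        s += (t ? a_[j * cols_ + i] : a_[i * cols_ + j]) * x[j];
      // With beta == 0 the old contents of y are ignored entirely.
      (*y)[i] = (beta == 0.0) ? alpha * s : alpha * s + beta * (*y)[i];
    }
  }
private:
  std::size_t rows_, cols_;
  std::vector<double> a_;
};

// Same shape as the slide's myOp(): client code sees only the abstract type.
void myOp(const std::vector<double>& x, const LinearOp& M, std::vector<double>* y) {
  M.apply(NOTRANS, x, y);
}
```

Because `myOp` takes a `const LinearOp&`, the default arguments of the base-class declaration apply and the call computes y = M*x.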

Example ANA : Linear Conjugate Gradient Solver

Multi-vector Conjugate-Gradient Solver : Single Iteration

template<class Scalar>
void CGSolver<Scalar>::doIteration(
  const LinearOp<Scalar> &M, ETransp opM_notrans, ETransp opM_trans
  ,MultiVector<Scalar> *X, Scalar a
  ,const LinearOp<Scalar> *M_tilde_inv
  ,ETransp opM_tilde_inv_notrans, ETransp opM_tilde_inv_trans
  ) const
{
  const Index m = currNumSystems_;
  int j;
  if( M_tilde_inv )
    M_tilde_inv->apply( opM_tilde_inv_notrans, *R_, &*Z_ ); // Z = M_tilde_inv * R (preconditioned)
  else
    assign( &*Z_, *R_ );                                     // Z = R (no preconditioner)
  dot( *Z_, *R_, &rho_[0] );                                 // rho[j] = Z(:,j)' * R(:,j)
  if( currIteration_ == 1 ) {
    assign( &*P_, *Z_ );                                     // P = Z
  }
  else {
    for(j=0;j<m;++j) beta_[j] = rho_[j]/rho_old_[j];
    update( *Z_, &beta_[0], 1.0, &*P_ );                     // P(:,j) = Z(:,j) + beta[j]*P(:,j)
  }
  M.apply( opM_notrans, *P_, &*Q_ );                         // Q = M * P
  dot( *P_, *Q_, &gamma_[0] );                               // gamma[j] = P(:,j)' * Q(:,j)
  for(j=0;j<m;++j) alpha_[j] = rho_[j]/gamma_[j];
  update( &alpha_[0], +1.0, *P_, X );                        // X(:,j) += alpha[j]*P(:,j)
  update( &alpha_[0], -1.0, *Q_, &*R_ );                     // R(:,j) -= alpha[j]*Q(:,j)
}
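For readers who want to run the iteration, the following is a self-contained single-system, unpreconditioned version of the same recurrence (Z = R, then the rho, beta, P, Q, gamma, alpha updates). It is a sketch under stated assumptions: `DenseSym`, `cg` and the stopping test on the residual norm are hypothetical and are not part of `CGSolver`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Hypothetical dense symmetric positive-definite operator (row-major storage).
struct DenseSym {
  std::size_t n;
  Vec a;
  void apply(const Vec& x, Vec* y) const {
    assert(x.size() == n && y->size() == n && a.size() == n * n);
    for (std::size_t i = 0; i < n; ++i) {
      double s = 0.0;
      for (std::size_t j = 0; j < n; ++j) s += a[i * n + j] * x[j];
      (*y)[i] = s;
    }
  }
};

double dot(const Vec& x, const Vec& y) {
  double s = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) s += x[i] * y[i];
  return s;
}

// Solves M*x = b starting from the given x; returns the number of iterations used.
int cg(const DenseSym& M, const Vec& b, Vec* x, double tol, int maxIter) {
  const std::size_t n = b.size();
  Vec r(n), p(n), q(n);
  M.apply(*x, &q);
  for (std::size_t i = 0; i < n; ++i) r[i] = b[i] - q[i];  // r = b - M*x
  double rho_old = 0.0;
  for (int iter = 1; iter <= maxIter; ++iter) {
    const double rho = dot(r, r);              // z = r, so rho = z'*r = r'*r
    if (std::sqrt(rho) <= tol) return iter - 1;
    if (iter == 1) {
      p = r;                                   // p = z
    } else {
      const double beta = rho / rho_old;
      for (std::size_t i = 0; i < n; ++i) p[i] = r[i] + beta * p[i];
    }
    M.apply(p, &q);                            // q = M*p
    const double alpha = rho / dot(p, q);      // alpha = rho / gamma
    for (std::size_t i = 0; i < n; ++i) {
      (*x)[i] += alpha * p[i];
      r[i] -= alpha * q[i];
    }
    rho_old = rho;
  }
  return maxIter;
}
```

The multi-vector version on the slide runs the same recurrence for m systems at once, which is why `rho_`, `beta_`, `gamma_` and `alpha_` are arrays there.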

The TSFCore Trilinos package

packages/TSFCore
  src
    interfaces
      Core : VectorSpace, Vector, LinearOp etc …
      Solvers : Iterative linear solver interfaces (unofficial!)
      Nonlin : Nonlinear problem interfaces (unofficial!)
    utilities
      Core : Testing etc …
      Solvers : Some iterative solvers (CG, BiCG, GMRES)
      Nonlin : Testing etc …
    adapters
      mpi-base : Node classes for MPI-based vector spaces
      Epetra : EpetraVectorSpace, EpetraVector etc …
  examples
    …

TSFCore::Nonlin : Interfaces to Nonlinear Problems

Supported Areas
• NAND optimization
• SAND optimization
• Nonlinear equations
• Multidisciplinary analysis
• Stability analysis / continuation
• SFE

Function evaluations (NonlinearProblem): state constraints c(y, u) and response functions g(y, u)

(Nonsingular) state Jacobian evaluations (NonlinearProblemFirstOrder): DcDy

Auxiliary Jacobian evaluations (NonlinearProblemFirstOrder): DcDu, DgDy, DgDu

TSFCore::Nonlin : Interfaces to Nonlinear Problems

Nonlin::NonlinearProblem
  Nu : int
  numResponseFunctions : Index
  yL : Vector
  yU : Vector
  uL[1..Nu] : Vector
  uU[1..Nu] : Vector
  gL : Vector
  gU : Vector
  y0 : Vector
  u0[1..Nu] : Vector
  initialize()
  isInitialized() : bool
  set_c(in c : Vector)
  set_g(in g : Vector)
  unsetQuantities()
  calc_c(in y : Vector, in u[1..Nu] : Vector = NULL, in newPoint : bool = true)
  calc_g(in y : Vector, in u[1..Nu] : Vector = NULL, in newPoint : bool = true)
  Associations to TSFCore::VectorSpace: space_y (1), space_u (1..*), space_c (1), space_g (1)

Nonlin::NonlinearProblemFirstOrder
  adjointSupported() : bool
  set_DcDy(in DcDy : LinearOpWithSolve)
  set_DcDu(in l : int, in DcDu : LinearOp)
  set_DgDy(in DgDy : MultiVector)
  set_DgDu(in l : int, in DgDu : MultiVector)
  calc_DcDy(in y : Vector, in u[1..Nu] : Vector = NULL, in newPoint : bool = true)
  calc_DcDu(in l : int, in y : Vector, in u[1..Nu] : Vector = NULL, in newPoint : bool = true)
  calc_DgDy(in y : Vector, in u[1..Nu] : Vector = NULL, in newPoint : bool = true)
  calc_DgDu(in l : int, in y : Vector, in u[1..Nu] : Vector = NULL, in newPoint : bool = true)
  Factories: factory_DcDy (1) : AbstractFactory<LinearOpNonsing>, factory_DcDu : AbstractFactory<LinearOp>

Nonlin::LinearSolveOp
  nonsingStatus() : ENonsingStatus
  solve(in M_trans, in x : Vector, out y : Vector, in ...)
  solve(in M_trans, in X : MultiVector, out Y : MultiVector, in ...)

Nonlin::LinearOpWithSolve

Other diagram elements: TSFCore::VectorSpace, TSFCore::LinearOp, TSFCore::OpBase, Solvers::ConvergenceTester, preconditioner, «create»

Supported Areas
• SAND
• Nonlinear equations
• Multidisciplinary analysis
• Stability analysis / continuation
• SFE
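As a rough illustration of how a nonlinear solver could be written against interfaces of this kind, the sketch below runs Newton's method on the state equations c(y, u) = 0 for fixed u, using only a constraint evaluation and a solve with the state Jacobian. Everything here is hypothetical: the simplified classes, `solve_DcDy`, `newton`, and the example constraint c_j = y_j (u_j − 1) − 10 u_j are chosen for illustration and do not reproduce the TSFCore::Nonlin signatures.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Simplified analogue of NonlinearProblem: evaluates the state constraints c(y, u).
struct NonlinearProblem {
  virtual ~NonlinearProblem() = default;
  virtual void calc_c(const Vec& y, const Vec& u, Vec* c) const = 0;
};

// Simplified analogue of NonlinearProblemFirstOrder: adds a solve with the
// (nonsingular) state Jacobian DcDy evaluated at (y, u).
struct NonlinearProblemFirstOrder : NonlinearProblem {
  virtual void solve_DcDy(const Vec& y, const Vec& u, const Vec& rhs, Vec* dy) const = 0;
};

// Illustrative problem: c_j(y, u) = y_j*(u_j - 1) - 10*u_j, so DcDy = diag(u_j - 1).
struct ExampleProblem : NonlinearProblemFirstOrder {
  void calc_c(const Vec& y, const Vec& u, Vec* c) const override {
    assert(y.size() == u.size() && c->size() == y.size());
    for (std::size_t j = 0; j < y.size(); ++j)
      (*c)[j] = y[j] * (u[j] - 1.0) - 10.0 * u[j];
  }
  void solve_DcDy(const Vec&, const Vec& u, const Vec& rhs, Vec* dy) const override {
    for (std::size_t j = 0; j < u.size(); ++j) {
      assert(u[j] != 1.0);  // the Jacobian must be nonsingular
      (*dy)[j] = rhs[j] / (u[j] - 1.0);
    }
  }
};

// Newton's method written only against the abstract interface.
int newton(const NonlinearProblemFirstOrder& prob, const Vec& u, Vec* y,
           double tol, int maxIter) {
  Vec c(y->size()), dy(y->size());
  for (int iter = 0; iter < maxIter; ++iter) {
    prob.calc_c(*y, u, &c);
    double norm = 0.0;
    for (double cj : c) norm += cj * cj;
    if (std::sqrt(norm) <= tol) return iter;
    prob.solve_DcDy(*y, u, c, &dy);                       // DcDy * dy = c
    for (std::size_t j = 0; j < dy.size(); ++j) (*y)[j] -= dy[j];  // y = y - dy
  }
  return maxIter;
}
```

Since the example constraint is linear in y, a single Newton step reaches the solution y_j = 10 u_j / (u_j − 1).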

Summary

SAND Reports

R. A. Bartlett, M. A. Heroux and K. R. Long. TSFCore : A Package of Light-Weight Object-Oriented Abstractions for the Development of Abstract Numerical Algorithms and Interfacing to Linear Algebra Libraries and Applications, Sandia National Laboratories, SAND2003-1378, 2003

R. A. Bartlett, TSFCore::Nonlin : An Extension of TSFCore for the Development of Nonlinear Abstract Numerical Algorithms and Interfacing to Nonlinear Applications, Sandia National Laboratories, SAND2003-1377, 2003

Location: Trilinos/doc/TSFCore

The End

Thank You!

Extra Slides

Examples of Non-Standard Vector Operations

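Examples from OOQP (Gertz, Wright)

$y_i \leftarrow y_i / x_i, \quad i = 1 \ldots n$

$y_i \leftarrow y_i x_i z_i, \quad i = 1 \ldots n$

$y_i \leftarrow \begin{cases} y^{\min} - y_i & \text{if } y_i < y^{\min} \\ y^{\max} - y_i & \text{if } y_i > y^{\max} \\ 0 & \text{if } y^{\min} \le y_i \le y^{\max} \end{cases}, \quad i = 1 \ldots n$

$\alpha \leftarrow \max\{\alpha : x + \alpha d \ge \beta\}$

Example from TRICE (Dennis, Heinkenschloss, Vicente)

[Equation: piecewise element-wise definition with four cases, conditioned on w and on the bounds a and b, involving u; not recoverable from the extracted text]

Example from IPOPT (Waechter)

[Equation: piecewise element-wise update of x using min/max against bound vectors, i = 1 … n; not recoverable from the extracted text]

Currently in MOOCHO : > 40 vector operations!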

Goals for a Vector Interface

Compute efficiency => Near optimal performance

Optimization developers add new operations => Independence of linear algebra library developers

Compute environment independence => Flexible optimization software

Minimal number of methods => Easy to write adapters

Approaches to Developing Vector Interfaces

(1) Linear algebra library allows direct access to vector elements

(2) Optimizer-specific interfaces

(3) General-purpose primitive vector operations

Vector Reduction/Transformation Operators Defined

Reduction/Transformation Operators (RTOp) Defined

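$z^1_i \ldots z^q_i \leftarrow \mathrm{op}_t(i, v^1_i \ldots v^p_i, z^1_i \ldots z^q_i)$ : element-wise transformation

$\beta \leftarrow \mathrm{op}_r(i, v^1_i \ldots v^p_i, z^1_i \ldots z^q_i)$ : element-wise reduction

$\beta_2 \leftarrow \mathrm{op}_{rr}(\beta_1, \beta_2)$ : reduction of intermediate reduction objects

• $v^1 \ldots v^p \in \mathbb{R}^n$ : p non-mutable input vectors
• $z^1 \ldots z^q \in \mathbb{R}^n$ : q mutable input/output vectors
• $\beta$ : reduction target object (may be non-scalar (e.g. $\{y_k, k\}$), or NULL)

Key to Optimal Performance

• $\mathrm{op}_t(\ldots)$ and $\mathrm{op}_r(\ldots)$ applied to entire sets of subvectors (i = a…b) independently: $z^1_{a:b} \ldots z^q_{a:b}, \beta \leftarrow \mathrm{op}(a, b, v^1_{a:b} \ldots v^p_{a:b}, z^1_{a:b} \ldots z^q_{a:b}, \beta)$
• Communication between sets of subvectors only for $\beta \ne$ NULL, $\beta_2 \leftarrow \mathrm{op}_{rr}(\beta_1, \beta_2)$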
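The chunk-wise application and the separate reduction of intermediate reduction objects can be sketched in a few lines. The example uses a dot product as the reduction. `DotOp`, `applyChunked` and the serial loop over chunks are assumptions for illustration; in a parallel setting each chunk would live on a different process and the `reduce` step would be the only communication.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Hypothetical reduction operator: element-wise reduction over a subvector
// plus a rule for combining two intermediate reduction objects.
struct DotOp {
  using Reduct = double;
  Reduct init() const { return 0.0; }
  // Reduction over one subvector of length n (two non-mutable inputs).
  void apply(const double* v0, const double* v1, std::size_t n, Reduct* beta) const {
    for (std::size_t i = 0; i < n; ++i) *beta += v0[i] * v1[i];
  }
  // Combine: beta2 <- reduce(beta1, beta2).
  void reduce(const Reduct& beta1, Reduct* beta2) const { *beta2 += beta1; }
};

// Applies the operator to numChunks independent subvectors, then combines
// the per-chunk reduction objects.
template <class Op>
typename Op::Reduct applyChunked(const Op& op, const Vec& v0, const Vec& v1,
                                 std::size_t numChunks) {
  assert(v0.size() == v1.size() && numChunks > 0);
  const std::size_t n = v0.size();
  typename Op::Reduct result = op.init();
  for (std::size_t c = 0; c < numChunks; ++c) {
    const std::size_t a = c * n / numChunks;        // chunk start
    const std::size_t b = (c + 1) * n / numChunks;  // chunk end (exclusive)
    typename Op::Reduct partial = op.init();
    op.apply(v0.data() + a, v1.data() + a, b - a, &partial);
    op.reduce(partial, &result);
  }
  return result;
}
```

The operator never learns how the vector is partitioned, which is the data/parallel independence the slides refer to.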

Object-Oriented Design for User Defined RTOp Operators

Advantages:
• Functionality
  • Linear-algebra implementations can be changed with no impact on optimizer
  • Optimizer developers can unilaterally add new vector operations
• Performance
  • Near optimal performance (large subvectors)
  • Multiple simultaneous global reductions => no sequential bottlenecks
  • No unnecessary temporary vectors or multiple vector read/writes

Disadvantages:
• New concepts, initially harder to understand interfaces?

[UML diagram]

Vector (abstracts / encapsulates vectors)
  apply_op( op:RTOp, ... , reduct_obj:ReductionTarget, ... )
  Implementations: SerialVector, MPIVector, OutOfCoreVector, ... (each provides apply_op(...))

RTOp (implements vector operations)
  apply_op( ..., reduct_obj:ReductionTarget )
  reduce_reduct_objs( ... )

ReductionTarget

AssignScalarOp
  set_alpha(alpha)
  apply_op(...)

DotProductOp
  apply_op(...)
  reduce_reduct_objs(...)

MaxStepOp
  set_beta(beta)
  apply_op(...)
  reduce_reduct_objs(...)

Optimizer: applies RTOp operators to Vector objects

RTOp vs. Primitives : Communication

• Compare
– RTOp (all-at-once reduction (i.e. ISIS++ QMR solver)): the five quantities $\{(x^T x)^{1/2}, (v^T v)^{1/2}, (w^T w)^{1/2}, w^T v, v^T t\}$ in a single reduction
– Primitives (5 separate reductions): $(x^T x)^{1/2}$, $(v^T v)^{1/2}$, $(w^T w)^{1/2}$, $w^T v$, $v^T t$

[Plot: horizontal axis num_axpys (0 to 400); local dim = 50,000]

* 128 processors on CPlant®

RTOp vs. Primitives : Multiple Ops and Temporaries

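• Compare
– RTOp (all-at-once reduction): $\{\max \alpha : x + \alpha d \ge \beta\} = \min\{\max((\beta - x_i)/d_i, 0), \text{ for } i = 1 \ldots n\}$
– Primitives (5 temporaries, 6 vector operations): $-x_i \to u_i$, $x_i + \beta \to v_i$, $v_i / d_i \to w_i$, $0 \to y_i$, $\max\{w_i, y_i\} \to z_i$, $\min\{z_i, i = 1 \ldots n\} \to \alpha$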

* 1 processor (gcc 3.1 under Linux)
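The difference between the two approaches can be seen in a short sketch. It assumes the max-step quantity is min over i of max((β − x_i)/d_i, 0), with β a scalar bound and all d_i nonzero. The function names and the particular chain of temporaries (u, v, w, y, z) are illustrative, not taken from MOOCHO.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Vec = std::vector<double>;

// All-at-once version: one pass over the data, no temporary vectors.
double maxStepFused(const Vec& x, const Vec& d, double beta) {
  assert(x.size() == d.size());
  double alpha = std::numeric_limits<double>::infinity();
  for (std::size_t i = 0; i < x.size(); ++i)
    alpha = std::min(alpha, std::max((beta - x[i]) / d[i], 0.0));
  return alpha;
}

// Primitive-by-primitive version: six vector operations and five
// temporaries, each a separate sweep over memory.
double maxStepPrimitives(const Vec& x, const Vec& d, double beta) {
  assert(x.size() == d.size());
  const std::size_t n = x.size();
  Vec u(n), v(n), w(n), y(n), z(n);
  for (std::size_t i = 0; i < n; ++i) u[i] = -x[i];               // u = -x
  for (std::size_t i = 0; i < n; ++i) v[i] = u[i] + beta;         // v = u + beta
  for (std::size_t i = 0; i < n; ++i) w[i] = v[i] / d[i];         // w = v ./ d
  for (std::size_t i = 0; i < n; ++i) y[i] = 0.0;                 // y = 0
  for (std::size_t i = 0; i < n; ++i) z[i] = std::max(w[i], y[i]); // z = max(w, y)
  double alpha = std::numeric_limits<double>::infinity();
  for (std::size_t i = 0; i < n; ++i) alpha = std::min(alpha, z[i]); // alpha = min(z)
  return alpha;
}
```

Both functions return the same value; the fused form simply avoids the extra memory traffic and allocations.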

Question: Does OO C++ allow for good scalability for massively parallel computing (i.e. 100 to 10000 processors)?

Parallel Scalability of MOOCHO

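Scalable example NLP (m = n/2):

$\min\ f(x) = \tfrac{1}{2} x^T x$

$\text{s.t.}\ c_j(x) = x_j (x_{m+j} - 1) - 10\, x_{m+j} = 0, \quad j = 1, \ldots, n/2$

Variable reduction range / null space decomposition:

$A = \begin{bmatrix} C & N \end{bmatrix}^T, \qquad Z = \begin{bmatrix} -C^{-1} N \\ I \end{bmatrix}$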

• Diagonal matrices => All vector ops!
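When the basis and non-basis blocks are diagonal, the variable-reduction null-space matrix reduces to element-wise vector arithmetic. The sketch below assumes the standard form Z = [−C⁻¹N; I] with A = [C N]ᵀ, stores the diagonals of C and N as plain vectors, and forms D = −C⁻¹N and the reduced gradient Zᵀg. The names `nullSpaceDiag` and `reducedGradient` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// D = -C^{-1} N for diagonal C and N (stored as their diagonals).
Vec nullSpaceDiag(const Vec& c, const Vec& n) {
  assert(c.size() == n.size());
  Vec d(c.size());
  for (std::size_t j = 0; j < c.size(); ++j) {
    assert(c[j] != 0.0);  // C must be nonsingular
    d[j] = -n[j] / c[j];
  }
  return d;
}

// Reduced gradient Z^T g = D * g_dep + g_indep, where g = [g_dep; g_indep].
Vec reducedGradient(const Vec& d, const Vec& g) {
  const std::size_t m = d.size();
  assert(g.size() == 2 * m);
  Vec r(m);
  for (std::size_t j = 0; j < m; ++j) r[j] = d[j] * g[j] + g[m + j];
  return r;
}
```

A quick consistency check is that [C N] Z = C D + N vanishes element-wise, i.e. c_j d_j + n_j = 0 for every j.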

Where is the parallel bottleneck?

Is it OO C++ or MPI?

* Red Hat Linux cluster (4 nodes)
• 2.0 GHz Intel P4 processors
• MPICH 1.2.2.1

Answer => MPI

Serial overhead of MOOCHO (n=2, Np=1) 0.41 milliseconds per rSQP iteration

Overhead of MPI communication (Np=4) 0.42 milliseconds per global reduction

[Plot: horizontal axis Np (number of processors), 1 to 4; one curve per n (global dim) = 20,000, 200,000 and 2,000,000]
