The Trilinos Project Exascale Roadmap
Michael A. Heroux
Sandia National Laboratories
Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525.
Contributors: Mark Hoemmen, Siva Rajamanickam, Tobias Wiesner, Alicia Klinvex, Heidi Thornquist, Kate Evans, Andrey Prokopenko, Lois McInnes
trilinos.github.io
Outline
• Challenges.
• Brief overview of Trilinos.
• On-node parallelism.
• Parallel algorithms.
• Trilinos products organization.
• ForTrilinos.
• The xSDK.
Challenges
• On-node concurrency expression:
  – Vectorization/SIMT.
  – On-node tasking.
  – Portability.
• Parallel algorithms:
  – Vector/SIMT expressible.
  – Latency tolerant.
  – Highly scalable.
• Multi-scale and multi-physics.
  – Preconditioning.
  – Software composition.
• Resilience (discussed tomorrow morning, MS40).
Trilinos Overview
What is Trilinos?
• Object-oriented software framework for…
• Solving big complex science & engineering problems.
• Large collection of reusable scientific capabilities.
• More like LEGO™ bricks than Matlab™.
Optimal Kernels to Optimal Solutions:
• Geometry, Meshing.
• Discretizations, Load Balancing.
• Scalable Linear, Nonlinear, Eigen, Transient, Optimization, UQ solvers.
• Scalable I/O, GPU, Manycore.
• 60+ Packages.
• Other distributions:
  – Cray LIBSCI.
  – GitHub repo.
• Thousands of Users.
• Worldwide distribution.
Laptops to Leadership systems
Trilinos Package Summary
Objective | Package(s)
Discretizations
  Meshing & Discretizations | STK, Intrepid, Pamgen, Sundance, ITAPS, Mesquite
  Time Integration | Rythmos
Methods
  Automatic Differentiation | Sacado
  Mortar Methods | Moertel
Services
  Linear algebra objects | Epetra, Tpetra, Kokkos, Xpetra
  Interfaces | Thyra, Stratimikos, RTOp, FEI, Shards
  Load Balancing | Zoltan, Isorropia, Zoltan2
  “Skins” | PyTrilinos, WebTrilinos, ForTrilinos, CTrilinos, Optika
  C++ utilities, I/O, thread API | Teuchos, EpetraExt, Kokkos, Triutils, ThreadPool, Phalanx, Trios
Solvers
  Iterative linear solvers | AztecOO, Belos, Komplex
  Direct sparse linear solvers | Amesos, Amesos2, ShyLU
  Direct dense linear solvers | Epetra, Teuchos, Pliris
  Iterative eigenvalue solvers | Anasazi, RBGen
  ILU-type preconditioners | AztecOO, IFPACK, Ifpack2, ShyLU
  Multilevel preconditioners | ML, CLAPS, MueLu
  Block preconditioners | Meros, Teko
  Nonlinear system solvers | NOX, LOCA, Piro
  Optimization (SAND) | MOOCHO, Aristos, TriKota, GlobiPack, OptiPack
  Stochastic PDEs | Stokhos
Unique features of Trilinos
• Huge library of algorithms
  – Linear and nonlinear solvers, preconditioners, …
  – Optimization, transients, sensitivities, uncertainty, …
• Growing support for multicore & hybrid CPU/GPU
  – Built into the new Tpetra linear algebra objects
    – Therefore into iterative solvers with zero effort!
  – Unified intranode programming model: Kokkos
  – Spreading into the whole stack:
    – Multigrid, sparse factorizations, element assembly…
• Support for mixed and arbitrary precisions
  – Don’t have to rebuild Trilinos to use it
• Support for flexible 2D sparse partitioning
  – Useful for graph analytics, other data science apps.
• Support for huge (> 2B unknowns) problems
Anasazi Snapshot
• Abbreviations key
  – “Gen.” = generalized eigenvalue system.
  – “Prec:” = can use a preconditioner.
  – BKS = Block Krylov Schur.
  – LOBPCG = Locally Optimal Block Preconditioned Conjugate Gradient.
  – RTR = Riemannian Trust-Region method.
• ^ denotes that these features may be implemented if there is sufficient interest.
• $ denotes that the TraceMin family of solvers is currently experimental.
Core team: Alicia Klinvex, Rich Lehoucq, Heidi Thornquist
Works with: Epetra, Tpetra, Custom
https://trilinos.github.io/anasazi.html
Trilinos linear solvers
• Sparse linear algebra (Kokkos/KokkosKernels/Tpetra)
  – Threaded construction, sparse graphs, (block) sparse matrices, dense vectors, parallel solve kernels, parallel communication & redistribution
• Iterative (Krylov) solvers (Belos)
  – CG, GMRES, TFQMR, recycling methods
• Sparse direct solvers (Amesos2)
• Algebraic iterative methods (Ifpack2)
  – Jacobi, SOR, polynomial, incomplete factorizations, additive Schwarz
• Shared-memory factorizations (ShyLU)
  – LU, ILU(k), ILUt, IC(k), iterative ILU(k)
  – Direct+iterative preconditioners
• Segregated block solvers (Teko)
• Algebraic multigrid (MueLu)
On-node data & execution
Must support > 3 architectures
• Coming systems to support
  – Trinity (Intel Haswell & KNL)
  – Sierra: NVIDIA GPUs + IBM multicore CPUs
  – Plus “everything else”
• 3 different architectures
  – Multicore CPUs (big cores)
  – Manycore CPUs (small cores)
  – GPUs (highly parallel)
• MPI only, & MPI + threads
  – Threads don’t always pay on non-GPU architectures today
  – Porting to threads must not slow down the MPI-only case
Kokkos: Performance, Portability, & Productivity
[Figure: diagram labeled Kokkos, with applications LAMMPS, Sierra, Albany, and Trilinos, and node memory labeled DDR and HBM.]
Kokkos Programming Model
• Machine model:
  – N execution spaces + M memory spaces
  – NxM matrix for memory access performance/possibility
  – Asynchronous execution allowed
• Implementation approach
  – A C++ template library
  – C++11 now required
  – Target different back-ends for different hardware architectures
  – Abstract hardware details and execution mapping details away
• Distribution
  – Open Source library
  – Soon (i.e. in the next few weeks) available on GitHub
• Long Term Vision
  – Move features into the C++ standard (Carter Edwards voting committee member)
Goal: One Code gives good performance on every platform
Abstraction Concepts
Execution Pattern: parallel_for, parallel_reduce, parallel_scan, task, …
Execution Policy: how (and where) a user function is executed
  – E.g., data parallel range: concurrently call function(i) for i = [0..N)
  – User’s function is a C++ functor or C++11 lambda
Execution Space: where functions execute
  – Encapsulates hardware resources; e.g., cores, GPU, vector units, ...
Memory Space: where data resides
  – AND what execution space can access that data
  – Also differentiated by access performance; e.g., latency & bandwidth
Memory Layout: how data structures are ordered in memory
  – provide mapping from logical to physical index space
Memory Traits: how data shall be accessed
  – allow specialisation for different usage scenarios (read only, random, atomic, …)
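The Memory Layout concept (a mapping from logical to physical index space) can be illustrated without Kokkos itself. The following is a minimal sketch in plain C++, not Kokkos code: `LayoutLeftMap`, `LayoutRightMap`, and `at` are hypothetical names chosen only to echo Kokkos' layout terminology, and they show how one logical index (i, j) can map to two different physical offsets.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration types; these are NOT the Kokkos classes.
// Column-major style: the leftmost index i is contiguous in memory.
struct LayoutLeftMap {
  std::size_t rows, cols;
  std::size_t operator()(std::size_t i, std::size_t j) const {
    assert(i < rows && j < cols);
    return i + rows * j;
  }
};

// Row-major style: the rightmost index j is contiguous in memory.
struct LayoutRightMap {
  std::size_t rows, cols;
  std::size_t operator()(std::size_t i, std::size_t j) const {
    assert(i < rows && j < cols);
    return cols * i + j;
  }
};

// The same logical access is written once; the layout picks the offset.
template <class Layout>
double& at(double* data, const Layout& map, std::size_t i, std::size_t j) {
  return data[map(i, j)];
}
```

Code written against `at(data, map, i, j)` does not change when the layout does, which is the point of separating the logical index from the physical one: the layout can be chosen per architecture while the kernel source stays the same.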
Execution Pattern

#include <Kokkos_Core.hpp>
#include <cstdio>

int main(int argc, char* argv[]) {
  // Initialize Kokkos, analogous to MPI_Init().
  // Takes arguments which set hardware resources (number of threads, GPU id).
  Kokkos::initialize(argc, argv);

  // A parallel_for executes the body in parallel over the index space,
  // here a simple range 0 <= i < 10.
  // It takes an execution policy (here an implicit range as an int) and a
  // functor or lambda. The lambda operator has one argument, an index type
  // (here a simple int for a range).
  // KOKKOS_LAMBDA captures by value and adds device annotations when a GPU
  // back-end is enabled.
  Kokkos::parallel_for(10, KOKKOS_LAMBDA(int i) {
    printf("Hello %i\n", i);
  });

  // A parallel_reduce executes the body in parallel over the index space,
  // here a simple range 0 <= i < 10, and performs a reduction over the
  // values given to the second argument.
  // It takes an execution policy (here an implicit range as an int), a
  // functor or lambda, and a return value. The lambda's second argument must
  // have the same type as the result (double here).
  double sum = 0;
  Kokkos::parallel_reduce(10, KOKKOS_LAMBDA(int i, double& lsum) {
    lsum += i;
  }, sum);
  printf("Result %lf\n", sum);

  // A parallel_scan executes the body in parallel over the index space,
  // here a simple range 0 <= i < 10, and performs a scan operation over the
  // values given to the second argument.
  // If final == true, lsum contains the prefix sum.
  Kokkos::parallel_scan(10, KOKKOS_LAMBDA(int i, int& lsum, const bool final) {
    if (final) printf("ScanValue %i\n", lsum);
    lsum += i;
  });

  Kokkos::finalize();
}
Kokkos is our hedge
• Kokkos protects us against…
  – Hardware divergence
  – Programming model diversity
  – Threads at all
    – Kokkos::Serial back-end
    – Kokkos’ semantics require vectorizable (ivdep) loops
    – Expose parallelism to exploit later
    – Hierarchical parallelism model encourages exploiting locality
• Kokkos protects our HUGE time investment of porting Trilinos
• Note: Kokkos is not magic, cannot make bad algorithms scale.
Parallel Algorithms
Foundational Technology: KokkosKernels
• Provide BLAS (1, 2, 3); Sparse; Graph and Tensor Kernels
• Kokkos based: Performance Portable
• Interfaces to vendor libraries if applicable (MKL, cuSPARSE, …)
• Goal: Provide kernels for all levels of node hierarchy
Vector Lane
  – Elemental Functions, Serial, e.g. 3x3 DGEMM
Hyper Thread
  – Vector Parallelism, Synch free, e.g. Matrix Row x Vector
Core
  – Thread Parallelism, Shared L1/L2, e.g. Subdomain Solve
Socket
  – Thread Teams, Shared L3, e.g. Full Solve
KokkosKernels: Sparse Matrix-Matrix Multiplication (SpGEMM)
• SpGEMM is the most expensive part of the multigrid setup.
• New portable write-avoiding algorithm in KokkosKernels is ~14x faster than NVIDIA’s cuSPARSE on K80 GPUs.
• ~2.8x faster than Intel’s MKL on Intel’s Knights Landing (KNL).
• Memory scalable: Solves larger problems that cannot be solved by codes like NVIDIA’s CUSP and Intel’s MKL when using a large number of threads.
• Up is good.
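For readers unfamiliar with SpGEMM, the sketch below shows a textbook row-by-row (Gustavson-style) sparse product C = A·B on CSR matrices in serial C++. It is not the write-avoiding KokkosKernels algorithm described on this slide; the `Csr` struct and the `spgemm` function are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal CSR container (illustration only).
struct Csr {
  std::size_t rows = 0, cols = 0;
  std::vector<std::size_t> row_ptr;  // size rows + 1
  std::vector<std::size_t> col_idx;  // size nnz
  std::vector<double> val;           // size nnz
};

// Serial Gustavson-style product C = A * B, one row of C at a time.
Csr spgemm(const Csr& A, const Csr& B) {
  assert(A.cols == B.rows);
  Csr C;
  C.rows = A.rows;
  C.cols = B.cols;
  C.row_ptr.assign(A.rows + 1, 0);

  std::vector<double> acc(B.cols, 0.0);      // dense accumulator for one row
  std::vector<bool> touched(B.cols, false);  // columns present in this row
  std::vector<std::size_t> pattern;          // touched columns, first-seen order

  for (std::size_t i = 0; i < A.rows; ++i) {
    pattern.clear();
    for (std::size_t a = A.row_ptr[i]; a < A.row_ptr[i + 1]; ++a) {
      const std::size_t k = A.col_idx[a];
      for (std::size_t b = B.row_ptr[k]; b < B.row_ptr[k + 1]; ++b) {
        const std::size_t j = B.col_idx[b];
        if (!touched[j]) {
          touched[j] = true;
          pattern.push_back(j);
        }
        acc[j] += A.val[a] * B.val[b];
      }
    }
    // Flush row i into C and reset the accumulator for the next row.
    for (std::size_t j : pattern) {
      C.col_idx.push_back(j);
      C.val.push_back(acc[j]);
      acc[j] = 0.0;
      touched[j] = false;
    }
    C.row_ptr[i + 1] = C.col_idx.size();
  }
  return C;
}
```

The number of nonzeros in C is not known until the product has been formed, which is one reason memory footprint matters so much for SpGEMM at high thread counts.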
[Figure: Geometric mean of the GFLOPs for various multiplications on KNL (strong scaling), KokkosKernels vs. MKL, NoReuse and Reuse cases, x-axis 1 to 256; annotations 1.22x and 2.55x.]
[Figure: Sparse matrix-matrix multiplication GFLOPs on K80, cuSPARSE vs. KokkosKernels; annotation 14.89x.]
Matrix | Multiplication | cuSPARSE | KokkosKernels
Laplace | AxP | 0.100 | 1.489
Laplace | RX(AP) | 0.229 | 1.458
Brick | AxP | 0.291 | 2.234
Brick | RX(AP) | 0.542 | 2.118
Empire | AxP | 0.646 | 2.381
Empire | RX(AP) | 0.715 | 1.678
• Goal: Identify independent data that can be processed in parallel.
• Performance: Better quality (4x on average) and run time (1.5x speedup) w.r.t. cuSPARSE.
• Enables parallelization of preconditioners: Gauss-Seidel: 82x speedup on KNC, 136x on K20 GPUs
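To make the link between coloring and parallel Gauss-Seidel concrete, here is a minimal serial greedy coloring in plain C++. It is a textbook first-fit scheme, not the parallel KokkosKernels algorithm; `greedy_color` and the adjacency-list input format are hypothetical. Vertices that share a color have no edge between them, so their Gauss-Seidel updates do not depend on each other and could be processed concurrently, one color at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// First-fit greedy coloring of an undirected graph given as adjacency lists.
// Returns color[v] in [0, n) such that adjacent vertices differ in color.
std::vector<std::size_t> greedy_color(
    const std::vector<std::vector<std::size_t>>& adj) {
  const std::size_t n = adj.size();
  std::vector<std::size_t> color(n, n);  // n means "not colored yet"
  // forbidden[c] == v marks color c as already used by a neighbor of v.
  std::vector<std::size_t> forbidden(n + 1, n);
  for (std::size_t v = 0; v < n; ++v) {
    for (std::size_t u : adj[v]) {
      assert(u < n);
      if (color[u] != n) forbidden[color[u]] = v;
    }
    std::size_t c = 0;
    while (forbidden[c] == v) ++c;  // smallest color no neighbor uses
    color[v] = c;
  }
  return color;
}
```

Fewer colors mean fewer sequential sweeps per Gauss-Seidel iteration, which is why the slide reports coloring quality alongside run time.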
[Figure: Speedup w.r.t. cuSPARSE (log scale, 0.13 to 64.00) for a set of test matrices, KokkosKernels vs. cuSPARSE.]
[Figure: Normalized number of colors w.r.t. cuSPARSE (0.00 to 1.20) for the same test matrices, KokkosKernels vs. cuSPARSE.]
KokkosKernels: Graph Coloring and Symmetric Gauss-Seidel
Kokkos and Kokkos Kernels are available independent of Trilinos: https://github.com/kokkos
ShyLU and Subdomain Solvers: Overview
• MPI+X based subdomain solvers
  – Decouple the notion of one MPI rank as one subdomain: Subdomains can span multiple MPI ranks, each with its own subdomain solver using X or MPI+X
• Subpackages of ShyLU: Multiple Kokkos-based options for on-node parallelism
  – Basker: LU or ILU(t) factorization
  – Tacho: Incomplete Cholesky - IC(k)
  – Fast-ILU: Fast-ILU factorization for GPUs
• KokkosKernels: Coloring based Gauss-Seidel (M. Deveci), Triangular Solves (A. Bradley)
[Figure: diagram with boxes labeled Amesos2, Ifpack2; ShyLU; Tacho, Basker, FAST-ILU, KLU2; KokkosKernels – SGS, Tri-Solve (HTS).]
Object Construction Modifications: Making MPI+X truly scalable (lots of work)
• Pattern:
  1. Count / estimate allocation size; may use Kokkos parallel_scan
  2. Allocate; use Kokkos::View for best data layout & first touch
  3. Fill: parallel_reduce over error codes; if you run out of space, keep going, count how much more you need, & return to (2)
  4. Compute (e.g., solve the linear system) using filled data structures
• Compare to Fill, Setup, Solve sparse linear algebra use pattern
• Fortran <= 77 coders should find this familiar
• Semantics change: Running out of memory not an error!
  – Always return: Either no side effects, or correct result
  – Callers must expect failure & protect against infinite loops
  – Generalizes to other kinds of failures, even fault tolerance
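The count, allocate, fill, compute pattern can be sketched in serial C++. This is an illustration under stated assumptions, not Trilinos or Kokkos code: `try_fill`, `build_with_retry`, and the per-row length callback are hypothetical, and plain loops stand in for Kokkos parallel_scan and parallel_reduce. The fill step does not treat running out of space as an error; it keeps going, reports how many entries did not fit, and the caller reallocates and retries within a bounded number of attempts.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Fill step: write one value per entry into `out`, which has fixed capacity.
// Returns how many entries did NOT fit (0 means success).
// In a Kokkos code this count would come from a parallel_reduce.
std::size_t try_fill(std::size_t num_rows,
                     const std::function<std::size_t(std::size_t)>& row_len,
                     std::vector<double>& out, std::size_t& used) {
  std::size_t shortfall = 0;
  used = 0;
  for (std::size_t i = 0; i < num_rows; ++i) {
    for (std::size_t k = 0; k < row_len(i); ++k) {
      if (used < out.size()) {
        out[used++] = static_cast<double>(i);
      } else {
        ++shortfall;  // out of space: keep going and count what is missing
      }
    }
  }
  return shortfall;
}

// Count -> Allocate -> Fill, returning to Allocate when the estimate was low.
// On failure returns false and leaves `out` untouched (no side effects).
bool build_with_retry(std::size_t num_rows,
                      const std::function<std::size_t(std::size_t)>& row_len,
                      std::size_t estimate_per_row, int max_attempts,
                      std::vector<double>& out) {
  std::size_t capacity = num_rows * estimate_per_row;  // (1) count / estimate
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    std::vector<double> buf(capacity);                 // (2) allocate
    std::size_t used = 0;
    const std::size_t shortfall =
        try_fill(num_rows, row_len, buf, used);        // (3) fill
    if (shortfall == 0) {
      buf.resize(used);
      out.swap(buf);
      return true;                                     // ready for (4) compute
    }
    capacity += shortfall;                             // grow by what was missing
  }
  return false;  // bounded attempts guard against an infinite retry loop
}
```

Filling into a scratch buffer and swapping only on success is one simple way to get the "either no side effects, or correct result" guarantee; `max_attempts` is the caller-side protection against infinite loops that the slide asks for.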
Trilinos Product Organization
Product Leaders: Maximize cohesion, control coupling
• Product:
  – Framework (J. Willenbring).
  – Data Services (K. Devine).
  – Linear Solvers (S. Rajamanickam).
  – Nonlinear Solvers (R. Pawlowski).
  – Discretizations (M. Perego).
• Product focus:
  – New, stronger leadership model.
  – Focus:
    – Published APIs
    – High cohesion within product. Low coupling across.
    – Deliberate product-level upstream planning & design.
ForTrilinos: Full, sustainable Fortran access to Trilinos
History
Via ad hoc interfaces, Fortran apps have benefitted from Trilinos capabilities for many years, most notably climate.
Why ForTrilinos
Trilinos provides a large collection of robust, production scientific C++ software, including state-of-the-art manycore/GPU capabilities. ForTrilinos will give access to Fortran apps.
Research Details
– Access via ad hoc APIs has existed for years, especially in the climate community.
– ForTrilinos provides native, sustainable APIs, including support for user-provided Fortran physics-based preconditioning and user-defined Fortran operators.
– Use of robust auto-generation code and API tools makes future extensibility feasible.
– Auto-generation tools apply to other projects.
– ForTrilinos will provide early access to latest scalable manycore/GPU functionality, sophisticated solvers.
Michael Heroux (PI, SNL), Kate Evans (co-PI, ORNL), dev team based at ORNL.
https://github.com/flang-compiler
The Extreme-Scale Scientific Software Development Kit (xSDK)
Extreme-scale Scientific Software Ecosystem
Focus of key accomplishments: xSDK foundations
Extreme-scale Scientific Software Development Kit (xSDK)
Libraries
• Solvers, etc.
• Interoperable.
Frameworks & tools
• Doc generators.
• Test, build framework.
SW engineering
• Productivity tools.
• Models, processes.
Domain components
• Reacting flow, etc.
• Reusable.
Documentation content
• Source markup.
• Embedded examples.
Testing content
• Unit tests.
• Test fixtures.
Build content
• Rules.
• Parameters.
Library interfaces
• Parameter lists.
• Interface adapters.
• Function calls.
Shared data objects
• Meshes.
• Matrices, vectors.
Native code & data objects
• Single use code.
• Coordinated component use.
• Application specific.
Extreme-scale Science Applications
Domain component interfaces
• Data mediator interactions.
• Hierarchical organization.
• Multiscale/multiphysics coupling.
Building the foundation of a highly effective extreme-scale scientific software ecosystem
Focus: Increasing the functionality, quality, and interoperability of important scientific libraries, domain components, and development tools
• xSDK release 0.2.0: April 2017 (soon)
  – Spack package installation
    – ./spack install xsdk
  – Package interoperability
    – Numerical libraries: hypre, PETSc, SuperLU, Trilinos
    – Domain components: Alquimia, PFLOTRAN
• xSDK community policies:
  – Address challenges in interoperability and sustainability of software developed by diverse groups at different institutions
Impact:
• Improved code quality, usability, access, sustainability
• Inform potential users that an xSDK member package can be easily used with other xSDK packages
• Foundation for work on performance portability, deeper levels of package interoperability
website: xSDK.info
xSDK community policies
Draft 0.3, Dec 2016
xSDK compatible package: Must satisfy mandatory xSDK policies:
M1. Support xSDK community GNU Autoconf or CMake options.
M2. Provide a comprehensive test suite.
M3. Employ user-provided MPI communicator.
M4. Give best effort at portability to key architectures.
M5. Provide a documented, reliable way to contact the development team.
M6. Respect system resources and settings made by other previously called packages.
M7. Come with an open source license.
M8. Provide a runtime API to return the current version number of the software.
M9. Use a limited and well-defined symbol, macro, library, and include file name space.
M10. Provide an accessible repository (not necessarily publicly available).
M11. Have no hardwired print or IO statements.
M12. Allow installing, building, and linking against an outside copy of external software.
M13. Install headers and libraries under <prefix>/include/ and <prefix>/lib/.
M14. Be buildable using 64 bit pointers. 32 bit is optional.
Also specify recommended policies, which currently are encouraged but not required:
R1. Have a public repository.
R2. Possible to run test suite under valgrind in order to test for memory corruption issues.
R3. Adopt and document consistent system for error conditions/exceptions.
R4. Free all system resources it has acquired as soon as they are no longer needed.
R5. Provide a mechanism to export ordered list of library dependencies.
xSDK member package: Must be an xSDK-compatible package, and it uses or can be used by another package in the xSDK, and the connecting interface is regularly tested for regressions.
https://xsdk.info/policies
xSDK4 Snapshot
Status
ASCR xSDK Releases 0.1 and 0.2 give application teams single-point installation, access to Trilinos, hypre, PETSc and SuperLU.
Value
The xSDK is essential for multi-scale/physics application coupling and interoperability, and broadest access to critical libraries. xSDK efforts expand collaboration scope to all labs.
Research Details
– xSDK started under DOE ASCR (Ndousse-Fetter, 2014).
– Prior to xSDK, very difficult to use xSDK member libs together, ad hoc approach required, versioning hard.
– Latest release (and future) uses Spack for builds.
– xSDK4ECP will include 3 additional libs.
– xSDK efforts enable new scope of cross-lab coordination and collaboration.
– Long-term goal: Create community and policy-based library and component ecosystems for compositional application development.
Ref: xSDK Foundations: Toward an Extreme-scale Scientific Software Development Kit, Feb 2017, https://arxiv.org/abs/1702.08425, to appear in Supercomputing Frontiers and Innovations.
xsdk.info
[Figure: xSDK release timeline. Released Feb 2017: hypre, Trilinos, PETSc, SuperLU; tested on key machines at ALCF, OLCF, NERSC, also Linux and Mac OS X. Release Oct 2017: SLATE, DTK, SUNDIALS. Future: all ECP math and scientific libraries, many community libraries.]
Michael Heroux (co-lead PI, SNL), Lois Curfman McInnes (co-lead PI, ANL), James Willenbring (Release lead, SNL)
More xSDK info
• Paper: xSDK Foundations: Toward an Extreme-scale Scientific Software Development Kit
  – R. Bartlett, I. Demeshko, T. Gamblin, G. Hammond, M. Heroux, J. Johnson, A. Klinvex, X. Li, L.C. McInnes, D. Osei-Kuffuor, J. Sarich, B. Smith, J. Willenbring, U.M. Yang
  – https://arxiv.org/abs/1702.08425
  – To appear in Supercomputing Frontiers and Innovations, 2017
• CSE17 Posters:
  – xSDK: Working toward a Community Software Ecosystem
    – https://doi.org/10.6084/m9.figshare.4531526
  – Managing the Software Ecosystem with Spack
    – https://doi.org/10.6084/m9.figshare.4702294
Final Take-Away Points
• Intra-node parallelism is biggest challenge right now:
  – Kokkos provides vehicle for reasoning about and implementing on-node parallelism.
  – Eventual goal: Search and replace Kokkos:: with std::
  – Node-parallel algorithms are already available.
  – Fully node parallel execution is hard work.
• Inter-node parallelism:
  – MueLu framework provides flexibility:
    – Pluggable, customizable components.
    – Multi-physics.
• Trilinos Products:
  – Improves upstream planning abilities.
  – Enables increased cohesion, reduced (incidental) coupling.
• Community expansion:
  – ForTrilinos provides native access and extensibility for Fortran users.
  – xSDK provides turnkey, consistent access to growing set of libraries.
  – Contact if interested in joining.
• EuroTUG 2018: Between ISC and PASC?