CAQR ( Demmel et al, 2008)
description
Transcript of CAQR ( Demmel et al, 2008)
CAQR (Demmel et al, 2008)
MapReduce-enabled Model Reduction for Large Scale Simulation Data
Sandia National Laboratories is a multi-program laboratory operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin company, for the U.S. Department of Energy’s National Nuclear
Security Administration under contract DE-AC04-94AL85000.
Motivation
• Given input parameters, a physical simulation approximates a space/time dependent solution.• Each solution is computationally expensive, so exhaustive parameter studies (e.g., UQ/SA/optimization) are infeasible.• Cheaper reduced order models enable guesstimates for parameter inquiries.
solution
space time input parameters
fixed space/time discretization
experimental design
truncation amplitude
s
space-time basis
vectors
unknown parameter basis
functions
• The left singular vectors are the space-time basis.• The singular values give the amplitudes.• The truncation is determined by examining the decay of the singular values. • Treat each right singular vector as samples from the unknown basis functions.
Approximation Model
Tuning the Approximation
SVD
Interpolating the Parameter Basis
• Construct parameter basis functions by interpolating the samples. • Can use any appropriate interpolation method (e.g., polynomials, radial basis functions, Gaussian process models, piecewise linear, splines, … )
A B
Which section would you rather try to interpolate,
A or B?
Ordered by Increasing Oscillations• From the SVD, it is natural to expect that the parameter bases will become more oscillatory as the index increases. • Therefore, we expect the gradient between sample points to increase as the index increases.
“PREDICTABLE” “UNPREDICTABLE”
“predictable” “unpredictable”
Space-time prediction variance:
The truncation is the largest such that
For and
Heat Conduction Example
Paul G. ConstantineStanford University
David F. GleichPurdue University
Joe RuthruffSandia National Labs
Jeremy TempletonSandia National Labs
SVD in MapReduce
http://github.com/dgleich/mrtsqr
Simulation Informatics!SUPERCOMPUTER DATA CENTER ENGINEER
Each HPC simulation generates gigabytes of data.
A data cluster canhold hundreds or thousands of simulations …
… enabling engineers to query simulation data for statistical studies and uncertainty quantification.
• We implement the SVD via a communication-avoiding QR factorization for tall, skinny matrices in MapReduce.
Constantine and Gleich, Tall and Skinny QR Factorizations in MapReduce Architectures, 2011
• 80k nodes, 300 time steps• 104 basis runs
• SVD of 24m x 104 data matrix (18G)• 500x reduction in wall clock time
Truth
ROM
Variance