IMA Thematic Year on Mathematics of Materials and ...

IMA Thematic Year on Mathematics of Materials and Macromolecules

Thanks to Local Organizers:

Mitch Luskin, Maria Calderer, Dick James

Effective Theories for Materials and Macromolecules

Sloppy Models: Universality in Data Fitting

Kevin S. Brown, JPS, Rick Cerione, Chris Myers, Kelvin Lee, Josh Waterfall, Fergal Casey, Ryan Gutenkunst, Søren Frederiksen, Karsten Jacobsen, Colin Hill, Guillermo Calero

[Figure panels: +NGF; Error Bars for Interatomic Potentials; Cell Dynamics; Fitting Exponentials, Polynomials; Ensemble: Extrapolation; Ensemble: Interpolation]

Fitting Decaying Exponentials

$$\Gamma(t) = A_1 e^{-\gamma_1 t} + A_2 e^{-\gamma_2 t} + A_3 e^{-\gamma_3 t}$$

Classic Ill-Posed Inverse Problem

Given Geiger counter measurements from a radioactive pile, can we recover the identity of the elements and/or predict future radioactivity? Good fits with bad decay rates!

Fit

$$C(\vec\theta) = \sum_{i=1}^{N_D} \frac{(y(\vec\theta) - y_i)^2}{\sigma_i^2}$$

³²P, ³⁵S, ¹²⁵I

6 Parameter Fit
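A minimal sketch of why this six-parameter fit is ill-posed. The assumptions are mine, not the slide's: unit error bars (σ_i = 1), 50 time points on [0, 5], made-up amplitudes and decay rates, and parameters taken as logarithms. At a perfect fit the Hessian of the cost reduces to 2 JᵀJ, where J is the Jacobian of the model, so its eigenvalues follow from the singular values of J.

```python
import numpy as np

def log_jacobian(amplitudes, rates, t):
    """Jacobian of Gamma(t) = sum_k A_k exp(-gamma_k t) with respect to
    (log A_1..3, log gamma_1..3), evaluated at each time in t."""
    decay = np.exp(-np.outer(t, rates))                      # shape (len(t), 3)
    d_log_amp = amplitudes * decay                           # dGamma / dlog A_k
    d_log_rate = -amplitudes * rates * t[:, None] * decay    # dGamma / dlog gamma_k
    return np.hstack([d_log_amp, d_log_rate])

# Illustrative values only (hypothetical, not taken from the talk).
t = np.linspace(0.0, 5.0, 50)
amplitudes = np.array([1.0, 1.0, 1.0])
rates = np.array([0.5, 1.0, 2.0])

J = log_jacobian(amplitudes, rates, t)
singular_values = np.linalg.svd(J, compute_uv=False)   # sorted largest first
hessian_eigenvalues = 2.0 * singular_values ** 2        # eigenvalues of 2 J^T J

print("eigenvalues relative to the stiffest:")
print(hessian_eigenvalues / hessian_eigenvalues[0])
```

The printed ratios fall off over many decades: a few combinations of parameters are pinned down by the data, while the rest can change substantially without spoiling the fit.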

PC12 Differentiation

[Figure: ERK* versus Time (10' scale), two panels, +NGF and +EGF. Pathway diagram: EGFR, NGFR, Sos, Ras, Raf-1, MEK1/2, ERK1/2. Annotations: "Pumps up signal (Mek)", "Tunes down signal (Raf-1)".]

Biologists study which proteins talk to which. Modeling?

48 Parameter Fit

‘Sloppy Model’ Errors for Atoms

Bayesian Ensemble Approach to Error Estimation of Interatomic Potentials

Søren Frederiksen, Karsten W. Jacobsen, Kevin Brown, JPS

Atomistic potential: 820,000 Mo atoms (Jacobsen, Schiøtz)

Quantum Electronic Structure (Si), 90 atoms (Mo) (Arias)

Interatomic Potentials V(r1,r2,…)
• Fast to compute
• Limit me/M → 0 justified
• Guess functional form
  Pair potential ∑ V(ri-rj) poor
  Bond angle dependence
  Coordination dependence
• Fit to experiment (old)
• Fit to forces from electronic structure calculations (new)

17 Parameter Fit

Why the Name Sloppy Model?

Huge Fluctuations around Best Fit

[Figure: Best Fit, with bare parameter and eigenparameter axes]

Hessian ∂²C/∂θ² at Best Fit
Sloppy Directions ⇔ Small Eigenvalues

Eigenvalues Span Huge Range
Each eigenvalue ~three times next

Ill-conditioned
Stiff: 1 cm
Sloppy: ~meters, km
Local Collinearity of Parameters
Many alternative fits just as good
Huge ranges of allowed parameters

[Figure: eigenvalue spectra, Stiff to Sloppy, for the Tyson, Brown, and Kholodenko models]

Sloppy Model Eigenvalues

Many fitting problems are sloppy

[Figure: eigenvalue spectra. Panel labels: Molybdenum Interatomic Potential, Cell Dynamics, Signal Transduction, Polynomial Fitting, Anharmonic, Perfect (Fake) Data, H]

Lessons:
• Sloppy Due to Insufficient Data? No: Perfect Data Sloppy Too
• Survives Anharmonicity? Yes: Principal Component Analysis

Ensemble of Models

We want to consider not just minimum cost fits, but all parameter sets consistent with the available data. New level of abstraction: statistical mechanics in model space.

Generate an ensemble of states with Boltzmann weights exp(-C/T) and compute for an observable:

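$$\langle O \rangle = \frac{1}{N_E} \sum_{i=1}^{N_E} O(\vec\theta_i)$$

$$\sigma_O^2 = \langle O(\vec\theta)^2 \rangle - \langle O(\vec\theta) \rangle^2$$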

O is chemical concentration, or rate constant …

$$H = V \Lambda V^T$$

[Figure: bare and eigen parameter axes]

Don’t trust predictions that vary
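A minimal sketch of the ensemble recipe on this slide, using a plain Metropolis random walk to sample parameter sets with weight exp(-C/T). The two-parameter quadratic `toy_cost`, the step size, and the temperature T = 1 are illustrative assumptions, not the costs or settings used in the talk.

```python
import math
import random

def metropolis_ensemble(cost, theta0, temperature, n_steps, step_size, rng):
    """Sample parameter vectors with Boltzmann weight exp(-cost / temperature)."""
    theta = list(theta0)
    current = cost(theta)
    samples = []
    for _ in range(n_steps):
        proposal = [x + rng.gauss(0.0, step_size) for x in theta]
        proposed = cost(proposal)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if proposed <= current or rng.random() < math.exp(-(proposed - current) / temperature):
            theta, current = proposal, proposed
        samples.append(list(theta))
    return samples

def ensemble_mean_and_sigma(observable, samples):
    """<O> and sigma_O = sqrt(<O^2> - <O>^2) over the ensemble."""
    values = [observable(s) for s in samples]
    mean = sum(values) / len(values)
    mean_sq = sum(v * v for v in values) / len(values)
    return mean, math.sqrt(max(mean_sq - mean * mean, 0.0))

def toy_cost(theta):
    # Hypothetical cost: one stiff direction (width 1) and one sloppy direction (width 10).
    return 0.5 * (theta[0] ** 2 + (theta[1] / 10.0) ** 2)

rng = random.Random(0)
samples = metropolis_ensemble(toy_cost, [0.0, 0.0], 1.0, 20000, 1.0, rng)
mean_stiff, sigma_stiff = ensemble_mean_and_sigma(lambda th: th[0], samples)
mean_sloppy, sigma_sloppy = ensemble_mean_and_sigma(lambda th: th[1], samples)
print("sigma along stiff direction: ", sigma_stiff)
print("sigma along sloppy direction:", sigma_sloppy)
```

An observable that depends only on the stiff parameter comes out with a tight error bar, while one that depends on the sloppy parameter varies widely across the ensemble.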

48 Parameter “Fit” to Data

[Figure: ERK* versus Time (10' scale), two panels, +EGF and +NGF; bare and eigen parameter axes]

$$C(\vec\theta) = \frac{1}{2} \sum_{i=1}^{N_D} \frac{(y(\vec\theta) - y_i)^2}{\sigma_i^2}$$

Cost is Energy

Ensemble of Fits Gives Error Bars

Error Bars from Data Uncertainty

Does the Erk Model Predict Experiments?

Model predicts that the left branch isn’t important

[Figure: Brown’s Experiment versus Model Prediction]

Predictive Despite Sloppy Fluctuations!

Which Rate Constants are in the Stiffest Eigenvector?

[Figure: components of the stiffest and 2nd stiffest eigenvectors, with some components marked *. Labels: Ras, Raf1, Oncogenes]

Eigenvector components along the bare parameters reveal which ones are most important for a given eigenvector.

Interatomic Potential Error Bars

Best fit is sloppy: ensemble of fits that aren’t much worse than best fit. Ensemble in Model Space!

T0 set by equipartition energy = best cost

Error Bars from quality of best fit

Ensemble of Acceptable Fits to Data

Not transferable
Unknown errors
• 3% elastic constant
• 10% forces
• 100% fcc-bcc, dislocation core

Green = DFT, Red = Fits
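One way to read the equipartition statement, sketched under a harmonic approximation. The symbols N_p (number of fitted parameters) and C_0 (cost at the best fit) are introduced here for illustration and do not appear on the slide. Each quadratic mode of the cost carries an average thermal energy T/2, and T0 is chosen so that the total thermal cost equals the best-fit cost:

```latex
\langle C \rangle - C_0 \approx \frac{N_p\, T}{2}
\quad\Longrightarrow\quad
\frac{N_p\, T_0}{2} = C_0
\quad\Longrightarrow\quad
T_0 = \frac{2\, C_0}{N_p}
```

With this choice a poor best fit (large C_0) gives a hot ensemble and wide error bars, which is the sense in which the error bars come from the quality of the best fit.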

Sloppy Molybdenum: Does it Work?

Comparing Predicted and Actual Errors

Sloppy model error σ_i gives total error if the ratio r = error_i/σ_i is distributed as a Gaussian: cumulative distribution P(r) = Erf(r/√2)

Three potentials
• Force errors
• Elastic moduli
• Surfaces
• Structural
• Dislocation core
• 7% < σ_i < 200%

Note: tails…
MEAM errors underestimated by ~ factor of 2

“Sloppy model” systematic error most of total: ~2 << 200%/7%
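A short sketch of the comparison described above, run on synthetic numbers only. The predicted errors σ_i are drawn uniformly between 7% and 200% and the "actual" errors are drawn as Gaussians of that width; both are stand-ins of mine, not data from the talk. If the predicted σ_i capture the total error, the fraction of ratios with |r| below a threshold should follow Erf(r/√2).

```python
import math
import random

def gaussian_cumulative(r):
    """Fraction of a unit Gaussian with |x| < r, i.e. Erf(r / sqrt(2))."""
    return math.erf(r / math.sqrt(2.0))

def empirical_cumulative(ratios, r):
    """Fraction of observed ratios with magnitude below r."""
    return sum(1 for x in ratios if abs(x) < r) / len(ratios)

rng = random.Random(1)
# Synthetic stand-ins: predicted errors between 7% and 200%, actual errors Gaussian.
sigmas = [rng.uniform(0.07, 2.0) for _ in range(20000)]
errors = [rng.gauss(0.0, s) for s in sigmas]
ratios = [e / s for e, s in zip(errors, sigmas)]

for r in (0.5, 1.0, 2.0):
    print(r, empirical_cumulative(ratios, r), gaussian_cumulative(r))
```

If the predicted errors were too small by a factor of 2, as noted for MEAM, the empirical curve would instead track Erf(r/(2√2)) and fall below the Gaussian curve.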

Fitting Polynomials: Hilbert

What is Sloppiness?

Sloppiness as Perverse, Skewed Choice of Preferred Basis (Human or Biological)

Polynomial fit: L2 norm

$$C(\theta) = \int_0^1 \Big( \sum_n \theta_n x^n - f(x) \Big)^2 \, dx$$

• Hessian ∝ Hilbert matrix, H_ij = 1/(i+j+1) (Classic ill-conditioned matrix)
• Monomial coefficients θ_n sloppy.
• Orthonormal shifted Legendre: Model = ∑ a_n P_n(x)
• Coefficients a_n not sloppy
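A minimal numerical check of the basis argument, assuming six coefficients (the slide does not give a count) and the L2 cost on [0, 1] without a factor of 1/2, so that the monomial Hessian is twice the Hilbert matrix. The Legendre-basis Hessian is computed by Gauss-Legendre quadrature.

```python
import numpy as np

def monomial_hessian(n_params):
    """Hessian of the L2 cost in the monomial basis: 2 * integral of x^(i+j) on [0, 1]."""
    idx = np.arange(n_params)
    return 2.0 / (idx[:, None] + idx[None, :] + 1.0)

def legendre_hessian(n_params, n_quad=64):
    """Hessian of the same cost in the orthonormal shifted Legendre basis."""
    nodes, weights = np.polynomial.legendre.leggauss(n_quad)   # nodes u in [-1, 1]
    # Substituting x = (u + 1) / 2 maps [0, 1] to [-1, 1] and halves the weights.
    w = 0.5 * weights
    # Columns are P_0(u) .. P_{n-1}(u); scaling by sqrt(2n + 1) normalizes them on [0, 1].
    basis = np.polynomial.legendre.legvander(nodes, n_params - 1)
    basis = basis * np.sqrt(2.0 * np.arange(n_params) + 1.0)
    return 2.0 * (basis * w[:, None]).T @ basis

n = 6  # illustrative number of coefficients
eig_monomial = np.linalg.eigvalsh(monomial_hessian(n))
eig_legendre = np.linalg.eigvalsh(legendre_hessian(n))
print("monomial basis eigenvalues:", eig_monomial)
print("Legendre basis eigenvalues:", eig_legendre)
```

The same fitting problem has eigenvalues spread over many decades in the monomial basis and all equal in the orthonormal basis: the sloppiness here comes from the choice of parameters, not from the function being fit.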

Exploring Parameter Space

Rugged? More like Grand Canyon (Josh)

Glasses: Rugged Landscape
Metastable Local Valleys
Transition State Passes

Optimization Hell: Golf Course

Sloppy Models
Minima: 5 stiff, N-5 sloppy
Search: Flat planes with cliffs

Ensemble Fluctuations Along Eigendirections

[Figure: log e fluctuations along eigendirection, from stiff to sloppy; annotation: 3x previous]

Monte Carlo Fluctuations Suppressed in Soft Directions: Anharmonicity or Convergence?

Work In Progress

Error Bars

Stochastic versus Sensitivity

Sensitivity Analysis = Harmonic Approximation for Errors
• Yields Much Larger Prediction Fluctuations
• Anharmonicity Constrains Soft Modes
• Mimic w/ modest prior (fluctuations < 10⁶, one σ)
• Sensitivity w/ Prior Fluctuations Now Close to Monte Carlo
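A sketch of the harmonic (sensitivity) estimate behind these bullets. It assumes the ensemble weight exp(-C/T), a Hessian H with eigenvalues λ_n and eigenvectors v_n, an observable O linearized about the best fit, and a Gaussian prior of width σ_p on each parameter. The symbol σ_p is hypothetical; the slide gives only the bound on the fluctuations.

```latex
\begin{aligned}
\sigma_O^2 &\approx T\,(\nabla O)^{T} H^{-1}\, \nabla O
            = \sum_n \frac{(\vec v_n \cdot \nabla O)^2}{\lambda_n / T}
  && \text{(harmonic, no prior)} \\
\sigma_O^2 &\approx \sum_n \frac{(\vec v_n \cdot \nabla O)^2}{\lambda_n / T + 1/\sigma_p^2}
  && \text{(harmonic, with prior)}
\end{aligned}
```

The smallest λ_n dominate the first sum, which is why the pure harmonic estimate is so large. The prior caps the contribution of each soft mode at σ_p² (v_n · ∇O)², mimicking the constraint that anharmonicity imposes in the Monte Carlo ensemble.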

Work In Progress

Sloppy Model Universality?

Why are all these problems so similar?

Work In Progress

Random Matrix GOE Ensemble: many different NxN random symmetric matrices have level repulsion, universal ~Wigner-Dyson spacings as N→∞

Product ensemble: equally spaced logs! Stronger level repulsion

Fitting exponentials: very strong level repulsion! New random matrix ensemble?

[Figure: Fitting exponentials, stiffest minus second: Strong Level Repulsion]
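A small sketch of level repulsion in the GOE, for comparison with uncorrelated levels. Instead of the Wigner-Dyson spacing histogram it uses the ratio of adjacent spacings, a swapped-in diagnostic chosen because it needs no rescaling of the spectrum. The matrix size, sample count, and seed are arbitrary choices, not values from the talk.

```python
import numpy as np

def goe_matrix(n, rng):
    """Random real symmetric matrix with Gaussian entries (GOE up to normalization)."""
    a = rng.normal(size=(n, n))
    return (a + a.T) / 2.0

def spacing_ratios(levels):
    """min(s_k, s_{k+1}) / max(s_k, s_{k+1}) for consecutive level spacings s_k."""
    s = np.diff(np.sort(levels))
    return np.minimum(s[:-1], s[1:]) / np.maximum(s[:-1], s[1:])

rng = np.random.default_rng(0)
goe_ratios = np.concatenate(
    [spacing_ratios(np.linalg.eigvalsh(goe_matrix(100, rng))) for _ in range(50)]
)
# Uncorrelated (Poisson) levels for comparison: no repulsion between neighbors.
poisson_ratios = np.concatenate(
    [spacing_ratios(rng.uniform(size=100)) for _ in range(50)]
)

print("mean ratio, GOE:    ", goe_ratios.mean())
print("mean ratio, Poisson:", poisson_ratios.mean())
print("fraction of ratios < 0.1, GOE:    ", np.mean(goe_ratios < 0.1))
print("fraction of ratios < 0.1, Poisson:", np.mean(poisson_ratios < 0.1))
```

Repulsion shows up as a shortage of small ratios: near-degenerate neighbors are common for uncorrelated levels and rare for GOE eigenvalues. The same statistic could be applied to the logarithms of sloppy-model Hessian eigenvalues to probe the stronger repulsion suggested on the slide.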
