Adaptive Sampling with Topological Scores

Dan Maljovec¹, Bei Wang¹, Ana Kupresanin², Gardar Johannesson², Valerio Pascucci¹, and Peer-Timo Bremer² (¹SCI Institute, University of Utah; ²Lawrence Livermore National Laboratory)


Transcript of Adaptive Sampling with Topological Scores

Page 1: Adaptive Sampling with Topological Scores

Adaptive Sampling with Topological Scores

Dan Maljovec¹, Bei Wang¹, Ana Kupresanin², Gardar Johannesson², Valerio Pascucci¹, and Peer-Timo Bremer²

¹SCI Institute, University of Utah
²Lawrence Livermore National Laboratory

Page 2: Adaptive Sampling with Topological Scores

Framework:

1. Select training data & run simulation
2. Fit a Predicting Model (PM)
3. Select candidates & predict response values from PM
4. Score candidates & select candidate to add to training data
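The four framework steps can be sketched as a loop. This is only an illustration under stated assumptions: `simulate`, `fit_model`, and `score` are hypothetical stand-ins (a cheap 1D function, piecewise-linear interpolation, and a distance-based score), not the simulation, models, or scores used in the talk.

```python
import numpy as np

def simulate(x):
    # Hypothetical stand-in for the expensive simulation (1D response).
    return np.sin(3.0 * x) + 0.5 * x

def fit_model(X, y):
    # Hypothetical predicting model: piecewise-linear interpolation.
    order = np.argsort(X)
    Xs, ys = X[order], y[order]
    return lambda q: np.interp(q, Xs, ys)

def score(candidates, predictions, X, y):
    # Placeholder score: distance to the nearest training point.
    # A real scoring function would also use the predicted responses.
    return np.min(np.abs(candidates[:, None] - X[None, :]), axis=1)

def adaptive_sampling(n_init=5, n_rounds=5, n_candidates=50, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 3.0, n_init)                # 1. select training data
    y = simulate(X)                                  #    and run the simulation
    for _ in range(n_rounds):
        model = fit_model(X, y)                      # 2. fit a predicting model
        cand = rng.uniform(0.0, 3.0, n_candidates)   # 3. select candidates
        pred = model(cand)                           #    and predict responses
        best = cand[np.argmax(score(cand, pred, X, y))]  # 4. score and select
        X = np.append(X, best)
        y = np.append(y, simulate(best))
    return X, y
```

Swapping in a different `score` changes the sampling strategy without touching the loop.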

Page 3: Adaptive Sampling with Topological Scores

Choosing the “Right” Points

4 models fit using 10 training points:

True Response Model:

Page 4: Adaptive Sampling with Topological Scores

Space-filling Sampling

No prior knowledge of the dataset

Where should we sample the model?

Page 5: Adaptive Sampling with Topological Scores

Space-filling Sampling

Fill the domain space as evenly as possible

Page 6: Adaptive Sampling with Topological Scores

Space-filling Sampling

Obtain responses and fit model

Page 7: Adaptive Sampling with Topological Scores

Space-filling Sampling

1 round:

Page 8: Adaptive Sampling with Topological Scores

Space-filling Sampling

2 rounds:

Page 9: Adaptive Sampling with Topological Scores

Space-filling Sampling

3 rounds:

Page 10: Adaptive Sampling with Topological Scores

Space-filling Sampling

4 rounds:

Page 11: Adaptive Sampling with Topological Scores

Space-filling Sampling

5 rounds:

Page 12: Adaptive Sampling with Topological Scores

Space-filling Sampling

Refit after 5 rounds of “filling voids”:

Page 13: Adaptive Sampling with Topological Scores

Space-filling Sampling

What have we learned?

[Figure panels: Initial fit; Refit after adding 5 points]

Page 14: Adaptive Sampling with Topological Scores

Topologically-Inspired Adaptive Sampling

Starting over

Page 15: Adaptive Sampling with Topological Scores

Topologically-Inspired Adaptive Sampling

1 round:

Page 16: Adaptive Sampling with Topological Scores

Topologically-Inspired Adaptive Sampling

2 rounds:

Page 17: Adaptive Sampling with Topological Scores

Topologically-Inspired Adaptive Sampling

3 rounds:

Page 18: Adaptive Sampling with Topological Scores

Topologically-Inspired Adaptive Sampling

4 rounds:

Page 19: Adaptive Sampling with Topological Scores

Topologically-Inspired Adaptive Sampling

5 rounds:

Page 20: Adaptive Sampling with Topological Scores

Topologically-Inspired Adaptive Sampling

Refit after 5 rounds of choosing the most topologically significant point:

Page 21: Adaptive Sampling with Topological Scores

Topologically-Inspired Adaptive Sampling

What have we learned?

[Figure panels: Initial fit; Refit after adding 5 points]

Page 22: Adaptive Sampling with Topological Scores

Comparison

Initial Fit

[Figure panels: Space-filling Method; Adaptive Method]

Page 23: Adaptive Sampling with Topological Scores

Comparison

True Response Model

[Figure panels: Space-filling Method; Adaptive Method]

Page 24: Adaptive Sampling with Topological Scores

Measuring Topological Significance

These points were selected in topologically significant regions:

How do we measure topological impact?

Page 25: Adaptive Sampling with Topological Scores

Morse-Smale Complex

A partition of the domain into monotonic regions

Stable Manifolds ∩ Unstable Manifolds = Morse-Smale Complex

Page 26: Adaptive Sampling with Topological Scores

Computing the Morse-Smale Complex

Stable Manifolds ∩ Unstable Manifolds = Morse-Smale Complex

Traditionally:
1. Compute steepest ascent/descent gradient from each point in dataset
2. Color points twice: once by following the ascending gradient to a maximum, and once by following the descending gradient to a minimum
3. Intersect partitions to form Morse-Smale Complex

Page 27: Adaptive Sampling with Topological Scores

Computing the Morse-Smale Complex

Traditionally:
1.a. Compute steepest ascent gradient from each point in dataset
2.a. Color points based on the maximum where flow halts

Page 28: Adaptive Sampling with Topological Scores

Computing the Morse-Smale Complex

Traditionally:
1.b. Compute steepest descent gradient from each point in dataset
2.b. Color points based on the minimum where flow halts

Page 29: Adaptive Sampling with Topological Scores

Computing the Morse-Smale Complex

Stable Manifolds ∩ Unstable Manifolds = Morse-Smale Complex

Traditionally:
3. Intersect partitions to form Morse-Smale Complex

Page 30: Adaptive Sampling with Topological Scores

Computing the Morse-Smale Complex

Stable Manifolds ∩ Unstable Manifolds = Morse-Smale Complex

Traditionally:
1. Compute steepest ascent/descent gradient from each point in dataset (assumes points have a neighborhood)
2. Color points twice: once by following the ascending gradient to a maximum, and once by following the descending gradient to a minimum
3. Intersect partitions to form Morse-Smale Complex

Page 31: Adaptive Sampling with Topological Scores

Computing the Approximate Morse-Smale Complex

Stable Manifolds ∩ Unstable Manifolds = Morse-Smale Complex

Modified Algorithm:
1. Compute approximate KNN for a point
2. Compute steepest ascent/descent gradient from each point’s neighborhood in dataset
3. Color points by gradient flows
4. Intersect partitions to form Morse-Smale Complex
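The modified algorithm can be sketched on a point cloud as follows. This is a minimal illustration, not the authors' implementation: it uses an exact brute-force neighbor search as a stand-in for the approximate KNN, assumes all points are distinct, and every function name here is hypothetical.

```python
import numpy as np

def knn_graph(X, k):
    # Exact brute-force k nearest neighbors (stand-in for approximate KNN).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    nbrs = np.argsort(d, axis=1)[:, :k]
    return nbrs, d

def steepest_pointers(f, nbrs, d, sign):
    # For each point, point to the neighbor with the steepest ascent
    # (sign=+1) or descent (sign=-1); extrema point to themselves.
    nxt = np.arange(len(f))
    for i in range(len(f)):
        slopes = sign * (f[nbrs[i]] - f[i]) / d[i, nbrs[i]]
        j = int(np.argmax(slopes))
        if slopes[j] > 0:
            nxt[i] = nbrs[i, j]
    return nxt

def follow_flow(nxt):
    # Follow pointers until every point is labeled by the extremum where
    # its flow halts. The function value changes strictly along each
    # pointer, so there are no cycles and the loop terminates.
    labels = nxt.copy()
    while True:
        advanced = nxt[labels]
        if np.array_equal(advanced, labels):
            return labels
        labels = advanced

def approximate_morse_smale(X, f, k):
    nbrs, d = knn_graph(X, k)
    max_label = follow_flow(steepest_pointers(f, nbrs, d, +1.0))
    min_label = follow_flow(steepest_pointers(f, nbrs, d, -1.0))
    # Intersecting the two partitions: each (max, min) pair is one cell.
    return max_label, min_label
```

Each distinct `(max_label[i], min_label[i])` pair identifies one cell of the approximate complex. The result depends on `k`: too few neighbors can create spurious extrema.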

Page 32: Adaptive Sampling with Topological Scores

Tracking Persistence

[Figure: plot of y against x, both axes from 0 to 3, with points labeled a, b, c, d]

Count the number of connected components belonging to the sub-level sets of the function

Page 33: Adaptive Sampling with Topological Scores

Tracking Persistence

[Figure: plot of y against x, both axes from 0 to 3, with points labeled a, b, c, d]

Every connected component has a birth

Page 34: Adaptive Sampling with Topological Scores

Tracking Persistence

[Figure: plot of y against x, both axes from 0 to 3, with points labeled a, b, c, d]

Every connected component has a birth

Page 35: Adaptive Sampling with Topological Scores

Tracking Persistence

[Figure: plot of y against x, both axes from 0 to 3, with points labeled a, b, c, d]

And every connected component has a death

Page 36: Adaptive Sampling with Topological Scores

Tracking Persistence

[Figure: plot of y against x, both axes from 0 to 3, with points labeled a, b, c, d]

Births and deaths of connected components are tied to critical points of the function

Page 37: Adaptive Sampling with Topological Scores

Tracking Persistence

[Figure: plot of y against x, both axes from 0 to 3, with points labeled a, b, c, d]

Births and deaths of connected components are tied to critical points of the function

Page 38: Adaptive Sampling with Topological Scores

Tracking Persistence

[Figure: plot of y against x, both axes from 0 to 3, with points labeled a, b, c, d]

Page 39: Adaptive Sampling with Topological Scores

Tracking Persistence

[Figure: plot of y against x with points labeled a, b, c, d, next to a persistence diagram with birth and death axes; all axes from 0 to 3]

A persistence diagram plots birth-death pairings

We define the persistence of a feature as the difference in function value between its paired critical points
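For a function sampled along a line, the birth-death pairing of sub-level set components can be sketched with a union-find sweep. This is an illustrative sketch, not the code behind the slides: the function name is hypothetical, and the component born at the global minimum, which never dies, is omitted from the output.

```python
def sublevel_persistence_1d(y):
    """Return (birth, death) pairs of sub-level set components of a 1D sequence."""
    n = len(y)
    parent = [None] * n      # None means "not yet in the sub-level set"
    birth = {}               # birth value, looked up by component root
    pairs = []

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Sweep the threshold upward: add samples in increasing order of value.
    for i in sorted(range(n), key=lambda idx: y[idx]):
        parent[i] = i
        birth[i] = y[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Elder rule: the component born later dies at this value.
                    young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                    if birth[young] < y[i]:
                        pairs.append((birth[young], y[i]))
                    parent[young] = old
    return pairs
```

For the samples `[0, 2, 1, 3, 0.5]`, the local minimum at 1 is paired with the maximum at 2 (persistence 1), and the minimum at 0.5 with the maximum at 3 (persistence 2.5).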

Page 40: Adaptive Sampling with Topological Scores

Persistence Simplification in 2D

On surfaces (and their higher-dimensional analogues), we also have saddle points

Page 41: Adaptive Sampling with Topological Scores

Persistence Simplification in 2D

We can smooth away noise features by keeping only the critical point pairs above a certain persistence threshold and canceling the pairs below it
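At the level of the persistence diagram, thresholding amounts to filtering birth-death pairs. The sketch below shows only that filtering step, with a hypothetical function name. Simplifying the Morse-Smale complex itself would also merge the cells of the canceled pairs, which is not shown here.

```python
def threshold_pairs(pairs, threshold):
    """Split (birth, death) pairs into kept features and canceled noise."""
    kept = [(b, d) for b, d in pairs if d - b >= threshold]
    canceled = [(b, d) for b, d in pairs if d - b < threshold]
    return kept, canceled
```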

Page 42: Adaptive Sampling with Topological Scores

Persistence Simplification in 2D

Page 43: Adaptive Sampling with Topological Scores

Bottleneck Distance

[Figure: function plot (x, y) next to a persistence diagram (birth, death); all axes from 0 to 3]

Comparing the persistence diagrams of two similar functions

Page 44: Adaptive Sampling with Topological Scores

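Bottleneck Distance

[Figure: function plot (x, y) next to a persistence diagram (birth, death); all axes from 0 to 3]

Optimization: match the points of the two diagrams, where a point in either diagram may also be matched to the line y=x, so that the largest distance between matched points is as small as possible

Bottleneck distance: the maximum distance resulting from this pairing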
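For very small diagrams, the matching can be found by brute force. The sketch below assumes the L∞ distance between diagram points, a common convention that the slide does not state, and it enumerates all matchings, so it is practical only for a handful of points. The function name is hypothetical.

```python
from itertools import permutations

def bottleneck_distance(A, B):
    """Brute-force bottleneck distance between two lists of (birth, death) pairs."""
    n, m = len(A), len(B)
    size = n + m  # pad each diagram with diagonal slots for the other's points

    def cost(i, j):
        if i < n and j < m:
            # Point-to-point: L-infinity distance (assumed convention).
            return max(abs(A[i][0] - B[j][0]), abs(A[i][1] - B[j][1]))
        if i < n:
            # Point of A matched to the diagonal y = x.
            return (A[i][1] - A[i][0]) / 2.0
        if j < m:
            # Point of B matched to the diagonal y = x.
            return (B[j][1] - B[j][0]) / 2.0
        return 0.0  # diagonal matched to diagonal

    best = float("inf")
    for perm in permutations(range(size)):
        worst = max((cost(i, perm[i]) for i in range(size)), default=0.0)
        best = min(best, worst)
    return best
```

For example, the diagrams `[(0, 2)]` and `[(0, 3)]` are at distance 1: matching the two points directly costs 1, while sending both to the diagonal costs 1.5.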

Page 45: Adaptive Sampling with Topological Scores

Framework:

1. Select training data & run simulation
2. Fit a Predicting Model (PM)
3. Select candidates & predict response values from PM
4. Score candidates & select candidate to add to training data

Page 46: Adaptive Sampling with Topological Scores

Selecting Initial Training Data

Tests use Latin Hypercube Sampling
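A basic Latin hypercube design on the unit cube can be sketched in a few lines of NumPy. This is a generic textbook construction with a hypothetical function name, not necessarily the variant used in the tests.

```python
import numpy as np

def latin_hypercube(n_samples, n_dims, seed=None):
    """Return n_samples points in [0, 1)^n_dims, one per stratum in each dimension."""
    rng = np.random.default_rng(seed)
    # Draw one point inside each of n_samples equal-width strata per dimension.
    strata = np.arange(n_samples)[:, None]
    u = (strata + rng.uniform(size=(n_samples, n_dims))) / n_samples
    # Shuffle each dimension independently so strata are paired at random.
    for d in range(n_dims):
        u[:, d] = rng.permutation(u[:, d])
    return u
```

Projected onto any single dimension, the samples land one per stratum, which is what distinguishes this design from plain uniform sampling.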

Page 47: Adaptive Sampling with Topological Scores

Fitting a Statistical Model

Experimented with:
• Gaussian Process Model
– Sparse Online Gaussian Process (C++)
• 2 implementations of MARS with bootstrapping
– Earth (CRAN)
– MDA (CRAN)
• Neural Network with bootstrapping
– NNet (CRAN)

Page 48: Adaptive Sampling with Topological Scores

Classical Scoring

Active Learning MacKay (ALM)
– Sample high-frequency or low-confidence regions
Delta
– Distribute samples in the range space or areas of steep gradient
Expected Improvement (EI)
– Select points with high uncertainty or large discrepancy with existing data
Distance (*DP)
– Scaling factor applied to the above, creating 3 new scoring functions (ALMDP, DeltaDP, EIDP)
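As an illustration of an ALM-style score, the sketch below uses the variance of bootstrapped model predictions as the confidence estimate and picks out where it is largest. Several things here are assumptions rather than details from the slides: the quadratic polynomial model, the 1D inputs, and the form of the distance scaling (multiplying by the distance to the nearest training point). All function names are hypothetical.

```python
import numpy as np

def bootstrap_predictions(X, y, candidates, n_boot=50, degree=2, seed=0):
    # Fit one polynomial per bootstrap resample and predict at the candidates.
    rng = np.random.default_rng(seed)
    preds = np.empty((n_boot, len(candidates)))
    for b in range(n_boot):
        idx = rng.integers(0, len(X), len(X))   # resample with replacement
        coeffs = np.polyfit(X[idx], y[idx], degree)
        preds[b] = np.polyval(coeffs, candidates)
    return preds

def alm_score(preds):
    # ALM-style score: predictive variance across the bootstrap ensemble.
    return preds.var(axis=0)

def distance_scaled(scores, candidates, X):
    # Assumed form of the distance scaling: multiply each score by the
    # candidate's distance to its nearest training point.
    nearest = np.min(np.abs(candidates[:, None] - X[None, :]), axis=1)
    return scores * nearest
```

The candidate with the largest score would be the next one sent to the simulation. With the distance scaling, a candidate that coincides with a training point always scores zero.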

Page 49: Adaptive Sampling with Topological Scores

Topological Scoring

TOPOP
– Average change in persistence of each point, computed by comparing the Morse-Smale complex before and after inserting a candidate
TOPOB
– Bottleneck distance between the persistence diagrams of the Morse-Smale complex before and after inserting a candidate point
TOPOHP
– Highest persistence critical point in the Morse-Smale complex constructed from combined training and predicted candidate data

Page 50: Adaptive Sampling with Topological Scores

Topological Scoring

Average Change in Persistence (TOPOP)
– Average change in persistence of each point, computed by comparing the Morse-Smale complex before and after inserting a candidate

Page 51: Adaptive Sampling with Topological Scores

Topological Scoring

Average Change in Persistence (TOPOP)
– Average change in persistence of each point, computed by comparing the Morse-Smale complex before and after inserting a candidate

Page 52: Adaptive Sampling with Topological Scores

Topological Scoring

Bottleneck Distance in Persistence (TOPOB)
– Bottleneck distance between the persistence diagrams of the Morse-Smale complex before and after inserting a candidate point

Page 53: Adaptive Sampling with Topological Scores

Topological Scoring

Bottleneck Distance in Persistence (TOPOB)
– Bottleneck distance between the persistence diagrams of the Morse-Smale complex before and after inserting a candidate point

Page 54: Adaptive Sampling with Topological Scores

Topological Scoring

Bottleneck Distance in Persistence (TOPOB)
– Bottleneck distance between the persistence diagrams of the Morse-Smale complex before and after inserting a candidate point

Page 55: Adaptive Sampling with Topological Scores

Topological Scoring

Highest Persistence (TOPOHP)
– Find highest persistence critical point in Morse-Smale complex constructed from combined training and predicted candidate data

Page 56: Adaptive Sampling with Topological Scores

Initial Test Functions

Page 57: Adaptive Sampling with Topological Scores

Initial Test Functions

Diagnosis: Little hope of modeling the complexity of some of these functions, especially in higher dimensions.

Page 58: Adaptive Sampling with Topological Scores

Test Functions

• Ackley
• Diagonal 2 Max
• Diagonal 4 Max
• 2-Component GMM
• 5-Component GMM
• Mis-Scaled Generalized Rastrigin (MGR)
• Salomon
• Generalized Rosenbrock
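Two of the listed functions have widely used textbook definitions, sketched below for reference. The scaling, domain, and exact variants used in the talk are not given in the slides, so these standard forms are an assumption.

```python
import numpy as np

def ackley(x):
    # Standard Ackley function; global minimum of 0 at the origin.
    x = np.asarray(x, dtype=float)
    return (-20.0 * np.exp(-0.2 * np.sqrt(np.mean(x ** 2)))
            - np.exp(np.mean(np.cos(2.0 * np.pi * x)))
            + 20.0 + np.e)

def generalized_rosenbrock(x):
    # Standard generalized Rosenbrock; global minimum of 0 at (1, ..., 1).
    x = np.asarray(x, dtype=float)
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2)
```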

Page 59: Adaptive Sampling with Topological Scores

Comparing Model Results

[Figure panels: GPM; Earth; NNet; MDA]

Page 60: Adaptive Sampling with Topological Scores

GPM Results

Page 61: Adaptive Sampling with Topological Scores

GPM Results

Page 62: Adaptive Sampling with Topological Scores

GPM Results

Page 63: Adaptive Sampling with Topological Scores

Moving Forward

Limitations
– The approximate k-nearest-neighbor graph is too expensive to rebuild for every candidate point (TOPOB & TOPOP)
• Morse-Smale computation is limited by the number of extrema that need to be processed
– Topology is based on the current prediction model

Unknowns
– Good parameter selection (k for KNN, parameters of prior)

Future Work
– Integrate visualizations to better understand how points are being selected and to gain intuition about how to improve results
– Better neighborhood estimates
– Investigate other means of measuring topological impact
– Fitting local models based on topological partitioning
– Using better models
– Hybrid scoring or hybrid sampling techniques