Surrogate model based design optimization

Aerospace design is synonymous with the use of long-running and computationally intensive simulations, which are employed in the search for optimal designs in the presence of multiple, competing objectives and constraints. The difficulty of this search is often exacerbated by numerical 'noise' and inaccuracies in simulation data, and by the frailties of complex simulations, that is, they often fail to return a result. Surrogate-based optimization methods can be employed to solve, mitigate, or circumvent problems associated with such searches.

Alex Forrester, Rolls-Royce UTC for Computational Engineering
Bern, 22nd November 2010

Transcript of the slides:

  • Slide 1
  • Surrogate model based design optimization. Aerospace design is synonymous with the use of long-running and computationally intensive simulations, which are employed in the search for optimal designs in the presence of multiple, competing objectives and constraints. The difficulty of this search is often exacerbated by numerical 'noise' and inaccuracies in simulation data and the frailties of complex simulations, that is, they often fail to return a result. Surrogate-based optimization methods can be employed to solve, mitigate, or circumvent problems associated with such searches. Alex Forrester, Rolls-Royce UTC for Computational Engineering. Bern, 22nd November 2010
  • Slide 2
  • Coming up before the break: surrogate model based optimization (the basic idea); Kriging (an intuitive perspective); alternatives to Kriging; optimization using surrogates; constraints; missing data; parallel function evaluations; problems with Kriging error-based methods
  • Slide 3
  • Surrogate model based optimization: the surrogate is used to expedite the search for the global optimum; global accuracy of the surrogate is not a priority. [Flowchart: SAMPLING PLAN / PRELIMINARY EXPERIMENTS → OBSERVATIONS → CONSTRUCT SURROGATE(S) (design sensitivities available? multi-fidelity data?) → SEARCH INFILL CRITERION, i.e. optimization using the surrogate(s) (constraints present? noise in data? multiple design objectives?) → ADD NEW DESIGN(S) → back to OBSERVATIONS]. A code sketch of this loop follows below.
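The flowchart maps onto a short loop; a minimal sketch, in which `expensive_simulation`, `build_surrogate`, and `search_infill` are placeholder callables of my own naming, not functions from the slides:

```python
import numpy as np

def sbo_loop(expensive_simulation, build_surrogate, search_infill,
             sampling_plan, budget):
    """Surrogate-based optimization loop, following the slide's flowchart."""
    X = np.asarray(sampling_plan)                       # SAMPLING PLAN
    y = np.array([expensive_simulation(x) for x in X])  # OBSERVATIONS
    while y.size < budget:
        model = build_surrogate(X, y)                   # CONSTRUCT SURROGATE(S)
        x_new = search_infill(model)                    # SEARCH INFILL CRITERION
        X = np.vstack([X, x_new])                       # ADD NEW DESIGN(S)
        y = np.append(y, expensive_simulation(x_new))   # re-observe and repeat
    return X[np.argmin(y)], y.min()
```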
  • Slide 4
  • Kriging (with a little help from Donald Jones)
  • Slide 5
  • Intuition is important! People are reluctant to use a tool they can't understand. Recall how basic probability was motivated by various games of chance involving dice, balls, and cards? In the same way, we can also make Kriging intuitive. Therefore, we will now describe the Kriging Game.
  • Slide 6
  • Game equipment: 16 function cards (A1, A2, …, D4). [Figure: the cards laid out in a 4×4 grid, columns A–D, rows 1–4]
  • Slide 7
  • Rules of the Kriging Game: The Dealer shuffles the cards and draws one at random. He does not show it. The Player gets to ask the value at either x=1, x=2, x=3, or x=4. Based on the answer, the Player must guess the values of the function at all of x=1, x=2, x=3, and x=4. The Dealer reveals the card. The Player's score is the sum of squared differences between the guesses and the actual values (lower is better). The Player and Dealer switch roles and repeat. After 100 rounds, the person with the lowest score wins. What's the best strategy?
  • Slide 8
  • Example: ask the value at x=2 and the answer is y=1. [Figure: the 4×4 grid of cards, columns A–D, rows 1–4]
  • Slide 9
  • The value at x=2 rules out all but 4 functions: C1, A2, A3, B3. At any value other than x=2, we aren't sure what the value of the function is, but we know the possible values. What guess will minimize our squared error?
  • Slide 10
  • Yes, it's the mean. But why?
  • Slide 11
  • The best predictor is the mean: our best predictor is the mean of the functions that match the sampled values, because for any guess g the expected squared error E[(y − g)²] is minimized by g = E[y]. Using the range or standard deviations of the values, we could also give a confidence interval for our prediction, as in the sketch below.
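A minimal sketch of this strategy; the card values here are hypothetical stand-ins, since the real cards appear only as a figure on slide 6:

```python
import numpy as np

# Each row is one "function card": its values at x = 1, 2, 3, 4.
cards = np.random.default_rng(0).integers(0, 5, size=(16, 4))

x_asked, y_answer = 2, cards[0, 1]                   # ask at x = 2; Dealer answers
matching = cards[cards[:, x_asked - 1] == y_answer]  # condition on the data

guess = matching.mean(axis=0)   # the mean minimizes expected squared error
spread = matching.std(axis=0)   # basis for a confidence interval on the guess
```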
  • Slide 12
  • Why could we predict with a confidence interval? We had a set of possible functions and a probability distribution over them; in this case, all equally likely. Given the data on the sampled points, we could subset out those functions that match, that is, we could condition on the sampled data. To do this for more than a finite set of functions, we need a way to describe a probability distribution over an infinite set of possible functions: a stochastic process. Each element of this infinite set of functions would be a random function. But how do we describe and/or generate a random function?
  • Slide 13
  • How about a purely random function? Here we have x values 0, 0.01, 0.02, …, 0.99, 1.00. At each of these we have generated a random number. Clearly this is not the kind of function we want.
  • Slide 14
  • What's wrong with a purely random function? No continuity! The values y(x) and y(x+d) for small d can be very different. Root cause: the values at these points are independent. To fix this, we must assume the values are correlated, and that C(d) = Correlation(y(x+d), y(x)) → 1 as d → 0, where the correlation is over all possible random functions. OK. Great. I need a correlation function C(d) with C(0) = 1. But how do I use such a correlation function to generate a continuous random function?
  • Slide 15
  • Making a random function
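A minimal sketch of one standard construction, assuming a Gaussian correlation function (the slide itself shows only the result): build the correlation matrix over a grid and multiply its Cholesky factor by independent normal draws.

```python
import numpy as np

x = np.linspace(0, 1, 101)         # the grid 0, 0.01, ..., 1.00 from slide 13
theta = 50.0                       # width parameter (an assumed value)
d = x[:, None] - x[None, :]
C = np.exp(-theta * d**2)          # C(0) = 1 and C(d) -> 0 for large |d|
L = np.linalg.cholesky(C + 1e-6 * np.eye(x.size))  # jitter for stability
f = L @ np.random.randn(x.size)    # one continuous-looking random function
```

Because nearby grid points are highly correlated, the draws vary smoothly, unlike the purely random function of slide 13.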
  • Slide 16
  • The correlation function
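The slide itself is a figure; a widely used form for the Kriging correlation (e.g. in Forrester, Sóbester and Keane's textbook) over a k-dimensional distance d is

```latex
C(\mathbf{d}) = \exp\left( -\sum_{j=1}^{k} \theta_j \,\lvert d_j \rvert^{p_j} \right),
\qquad C(\mathbf{0}) = 1,
```

where the θ_j control how quickly the correlation decays in each dimension and the p_j control smoothness.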
  • Slide 17
  • We are ready! Assuming we have estimates of the correlation parameters (more on this later), we have a way of generating a set of functions: the equivalent of the cards in the Kriging Game. Using statistical methods involving conditional probability, we can condition on the data to get an (infinite) set of random functions that agree with the data.
  • Slide 18
  • Random Functions Conditioned on Sampled Points
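A sketch of this conditioning step, using standard Gaussian process conditioning formulas; the variable names and the Gaussian correlation function are mine, not the slides':

```python
import numpy as np

def conditioned_draws(x_grid, X_s, y_s, theta=50.0, n_draws=5):
    """Draw random functions that pass through the sampled points (X_s, y_s)."""
    corr = lambda a, b: np.exp(-theta * (a - b) ** 2)
    K = corr(x_grid[:, None], x_grid[None, :])            # grid-grid correlations
    Ks = corr(x_grid[:, None], X_s[None, :])              # grid-sample correlations
    Kss = corr(X_s[:, None], X_s[None, :]) + 1e-6 * np.eye(X_s.size)
    A = Ks @ np.linalg.inv(Kss)
    mean = A @ y_s                      # conditional mean (the predictor)
    cov = K - A @ Ks.T                  # conditional covariance
    L = np.linalg.cholesky(cov + 1e-6 * np.eye(x_grid.size))
    return mean[:, None] + L @ np.random.randn(x_grid.size, n_draws)
```

Every draw agrees with the data at the sampled points; their spread away from the samples is what gives the confidence intervals of the next slides.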
  • Slide 19
  • Slide 20
  • The Predictor and Confidence Intervals
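The slide's formulas are not in the transcript; the standard ordinary Kriging predictor and error estimate, in the usual notation (Ψ is the n×n matrix of correlations between the sample points, ψ(x) the vector of correlations between x and the samples), are

```latex
\hat{\mu} = \frac{\mathbf{1}^{\mathsf T}\Psi^{-1}\mathbf{y}}{\mathbf{1}^{\mathsf T}\Psi^{-1}\mathbf{1}},
\qquad
\hat{y}(\mathbf{x}) = \hat{\mu} + \boldsymbol{\psi}(\mathbf{x})^{\mathsf T}\,\Psi^{-1}\left(\mathbf{y} - \mathbf{1}\hat{\mu}\right),
\qquad
\hat{s}^{2}(\mathbf{x}) = \hat{\sigma}^{2}\left(1 - \boldsymbol{\psi}(\mathbf{x})^{\mathsf T}\,\Psi^{-1}\boldsymbol{\psi}(\mathbf{x})\right),
```

with a confidence interval of, for example, ŷ(x) ± 1.96 ŝ(x).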
  • Slide 21
  • What it looks like in practice: sample the function to be predicted at a set of points, i.e. run your experiments/simulations
  • Slide 22
  • 20 Gaussian bumps with appropriate widths (chosen to maximize the likelihood of the data) centred around the sample points
  • Slide 23
  • Multiply by weightings (again chosen to maximize the likelihood of the data)
  • Slide 24
  • Add together, with a mean term, to predict the function. [Figure: Kriging prediction vs. true function]
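Slides 21–24 together describe the predictor as a mean term plus a weighted sum of Gaussian bumps; a minimal sketch with a fixed width parameter (in practice θ and the weightings come from maximizing the likelihood, as the slides say):

```python
import numpy as np

def kriging_predict(x, X_s, y_s, theta=50.0):
    """Mean term plus weighted Gaussian bumps centred on the samples (sketch)."""
    bump = lambda a, b: np.exp(-theta * (a - b) ** 2)  # a Gaussian bump
    Psi = bump(X_s[:, None], X_s[None, :])             # sample-sample correlations
    mu = (np.linalg.solve(Psi, y_s).sum()              # estimated mean term
          / np.linalg.solve(Psi, np.ones_like(y_s)).sum())
    w = np.linalg.solve(Psi, y_s - mu)                 # the weightings
    return mu + bump(x[:, None], X_s[None, :]) @ w     # add the bumps together
```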
  • Slide 25
  • Alternatives to Kriging
  • Slide 26
  • Moving least squares: quick; nice regularization parameter; no useful confidence intervals; how to choose the polynomial and decay function?
  • Slide 27
  • Slide 28
  • Support vector regression: quick predictions in large design spaces; slow training (extra quadratic programming problem); good noise filtering; lovely maths!
  • Slide 29
  • Slide 30
  • Multiple surrogates: a surrogate built using a committee machine (also called ensembles). The hope is to choose the best model from a committee, or to combine a number of methods. Often not mathematically rigorous, and it is difficult to get confidence intervals. Blind Kriging is perhaps a good compromise: its mean function is selected by a data-analytic procedure.
  • Slide 31
  • Blind Kriging (mean function selected using Bayesian forward selection)
  • Slide 32
  • RMSE ~50% better than ordinary Kriging in this example
  • Slide 33
  • Slide 34
  • Slide 35
  • Slide 36
  • Optimization Using Surrogates
  • Slide 37
  • Polynomial regression based search (as Devil's advocate)
  • Slide 38
  • Gaussian process prediction based optimization
  • Slide 39
  • Slide 40
  • Slide 41
  • Gaussian process prediction based optimization (as Devil's advocate)
  • Slide 42
  • But we have error estimates with Gaussian processes
  • Slide 43
  • Error estimates are used to construct improvement criteria: the probability of improvement and the expected improvement
  • Slide 44
  • Probability of improvement: the probability that there will be any improvement at all. Can be extended to constrained and multi-objective problems.
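In the usual notation, with ŷ(x) and ŝ(x) the Kriging prediction and error estimate, y_min the best value observed so far, and Φ the standard normal CDF:

```latex
P[I(\mathbf{x})] = \Phi\!\left( \frac{y_{\min} - \hat{y}(\mathbf{x})}{\hat{s}(\mathbf{x})} \right)
```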
  • Slide 45
  • Expected improvement: a useful metric that balances prediction and uncertainty. Can be extended to constrained and multi-objective problems.
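The standard closed form, with φ the standard normal density:

```latex
E[I(\mathbf{x})] =
\left( y_{\min} - \hat{y}(\mathbf{x}) \right)
\Phi\!\left( \frac{y_{\min} - \hat{y}(\mathbf{x})}{\hat{s}(\mathbf{x})} \right)
+ \hat{s}(\mathbf{x})\,
\phi\!\left( \frac{y_{\min} - \hat{y}(\mathbf{x})}{\hat{s}(\mathbf{x})} \right)
```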
  • Slide 46
  • Constrained EI
  • Slide 47
  • The probability of constraint satisfaction is just like the probability of improvement. [Figure labels: probability of satisfaction; prediction of the constraint function; constraint function; constraint limit]
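For a constraint g(x) ≤ g_limit, with ĝ(x) and ŝ_g(x) the prediction and error estimate of a surrogate of the constraint function, the probability of satisfaction mirrors the probability of improvement:

```latex
P[F(\mathbf{x})] = \Phi\!\left( \frac{g_{\text{limit}} - \hat{g}(\mathbf{x})}{\hat{s}_g(\mathbf{x})} \right)
```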
  • Slide 48
  • Constrained expected improvement: simply multiply by the probability of constraint satisfaction.
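That is, using the quantities defined above:

```latex
E[I_{c}(\mathbf{x})] = E[I(\mathbf{x})] \times P[F(\mathbf{x})]
```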
  • Slide 49
  • A 2D example
  • Slide 50
  • Slide 51
  • Slide 52
  • Missing Data
  • Slide 53
  • What if design evaluations fail? No infill point is added, the surrogate model is unchanged, and the optimization stalls. We need to add some information or perturb the model: add a random point? Impute a value based on the prediction at the failed point, so that EI goes to zero there? Use a penalized imputation (prediction + error estimate)? A sketch of the imputation options follows below.
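A sketch of the imputation options; the `predict` method returning both a prediction and an error estimate is an assumption of this sketch, not an interface from the slides:

```python
def impute_failed(model, X_failed, penalty=1.0):
    """Impute values at failed design points so the search can continue.

    penalty = 0: plain imputation; EI goes to zero at the failed point.
    penalty = 1: penalized imputation (prediction + error estimate),
                 which also discourages re-sampling near past failures.
    """
    y_hat, s_hat = model.predict(X_failed)  # assumed surrogate interface
    return y_hat + penalty * s_hat          # imputed observations to append
```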
  • Slide 54
  • Aerofoil design problem: two shape functions (f1, f2) altered; the potential flow solver (VGK) has a ~35% failure rate; a 20-point optimal Latin hypercube, with max{E[I(x)]} updates until within one drag count of the optimum.