Transcript of "Hyperparameter optimization strategies" · 2018. 4. 1.
meetup.com/IASI-AI/ facebook.com/AI.Iasi/ iasiai.net
Hyperparameter optimization strategies
git clone https://github.com/IASIAI/hyperparameter-optimization-strategies.git
Gabriel Marchidan (Software architect)
Bogdan Burlacu (AI researcher, PhD)
“Algorithms are conceived in analytic purity in the high citadels of academic research, heuristics are midwifed by expediency in the dark corners of the practitioner’s lair”
Fred Glover, 1977
Contents
● Problem statement
● Disclaimer
● Random search
● Grid search
● Bayesian optimization
● Covariance Matrix Adaptation Evolution Strategy (CMA-ES)
Problem statement
● Hyperparameters are parameters whose values are set before the learning process begins.
● By contrast, the values of other parameters are derived via training.
● The problem of (hyper)parameter optimization is not specific to ML.
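To make the distinction concrete, here is a minimal sketch (not from the slides; the toy model and all constants are illustrative). The weight `w` is a parameter learned from data, while the learning rate and epoch count are hyperparameters fixed before training starts:

```python
# Toy example: fit y = w * x by gradient descent.
# w is a *parameter* (learned); lr and epochs are *hyperparameters*
# (chosen before the learning process begins).

def train(data, lr, epochs):
    """Learn w minimizing squared error; lr and epochs are hyperparameters."""
    w = 0.0
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # gradient of (w*x - y)^2 w.r.t. w
            w -= lr * grad
    return w

data = [(x, 3.0 * x) for x in [1.0, 2.0, 3.0]]  # true w = 3
w = train(data, lr=0.05, epochs=100)
```

Hyperparameter optimization asks: which `lr` and `epochs` produce the best model?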
Disclaimer
Things that will improve a real-life algorithm more than in-depth parameter optimization:
● Having better data
● Having more data
● Changing the algorithm (or the weights with which multiple algorithms' results are combined)
So don't start with hyperparameter optimization!
Grid search
● Scans the parameter space in a grid pattern with a certain step size
● Hence the name “Grid search”
Grid search
● Grid search probes parameter configurations deterministically, by laying down a grid of all possible configurations inside your parameter space
● For each continuous dimension of the parameter space, a step size is chosen; it defines the resolution of the grid
Grid search
● Requires a lot of function evaluations
● Is highly impractical for algorithms with more than 4 parameters
● The number of function evaluations grows exponentially with each additional parameter (curse of dimensionality)
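The procedure can be sketched in a few lines of Python, using the Rastrigin function (the objective of the simulation later in the talk). The bounds and step count below are illustrative choices, not values from the slides:

```python
import itertools
import math

def rastrigin(xs):
    """Rastrigin function; its global minimum is 0, at the origin."""
    return 10 * len(xs) + sum(x * x - 10 * math.cos(2 * math.pi * x) for x in xs)

def grid_search(f, dims, lo=-5.12, hi=5.12, steps=21):
    """Evaluate f at every point of a regular grid and keep the best."""
    axis = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    best_x, best_y = None, float("inf")
    for point in itertools.product(axis, repeat=dims):
        y = f(point)
        if y < best_y:
            best_x, best_y = point, y
    return best_x, best_y

best_x, best_y = grid_search(rastrigin, dims=2)  # 21**2 = 441 evaluations
# With 10 parameters the same grid would need 21**10 (about 1.7e13) evaluations.
```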
Curse of dimensionality
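A quick back-of-the-envelope illustration: with a fixed per-dimension resolution, the number of grid points is the per-dimension count raised to the number of dimensions (the 21-step resolution here is an arbitrary example):

```python
steps = 21  # grid points per continuous dimension
for dims in (1, 2, 4, 8, 10):
    # total evaluations = steps ** dims: 21, 441, 194481, ...
    print(f"{dims} parameters -> {steps ** dims} grid evaluations")
```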
Random search
● Random points are chosen from the parameter space
● Evaluating each of them yields, in turn, a random sample from the solution space
Random search
● Random Search suggests configurations randomly from your parameter space
● The best result is saved along with the corresponding parameters
● The next candidate is sampled either uniformly from the whole parameter space, or from a sphere around the current best
● The process is repeated until a termination criterion is met (usually, number of iterations)
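A minimal sketch of the pure (whole-space) variant described above, again on the Rastrigin function; the iteration budget, bounds, and seed are arbitrary choices:

```python
import math
import random

def rastrigin(xs):
    """Rastrigin function; its global minimum is 0, at the origin."""
    return 10 * len(xs) + sum(x * x - 10 * math.cos(2 * math.pi * x) for x in xs)

def random_search(f, dims, n_iter=1000, lo=-5.12, hi=5.12, seed=42):
    """Sample configurations uniformly at random; keep the best seen so far."""
    rng = random.Random(seed)
    best_x, best_y = None, float("inf")
    for _ in range(n_iter):
        x = [rng.uniform(lo, hi) for _ in range(dims)]
        y = f(x)
        if y < best_y:  # save the best result with its parameters
            best_x, best_y = x, y
    return best_x, best_y  # termination criterion: number of iterations

best_x, best_y = random_search(rastrigin, dims=2)
```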
Random search
● Can be applied to functions that are not continuous or differentiable
● It makes no assumptions about the properties of the function
● Has multiple variants: fixed step, optimum step, adaptive step etc.
Bayesian optimization
● With each observation we are improving the model of the objective function
● We sample the points that have the highest chance of improving on the best objective value found so far
Bayesian optimization
● To use Bayesian optimization, we need a way to flexibly model distributions over objective functions
● For this problem, Gaussian Processes are a particularly elegant technique
● Used for problems where each evaluation is costly in time or resources
● Historically Gaussian Processes were developed to help search for gold
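As a rough illustration (this is a toy sketch, not the scikit-optimize implementation mentioned later): a 1-D Gaussian-process surrogate with an RBF kernel, and a lower-confidence-bound acquisition that picks where to evaluate next. The kernel length scale, candidate grid, kappa, and seed are all illustrative assumptions:

```python
import math
import random

def kernel(a, b, length=1.0):
    """Squared-exponential (RBF) covariance between two 1-D points."""
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior(X, y, x_star, noise=1e-6):
    """Posterior mean and variance of the GP surrogate at x_star."""
    K = [[kernel(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    k_star = [kernel(a, x_star) for a in X]
    mean = sum(ks * al for ks, al in zip(k_star, solve(K, y)))
    var = kernel(x_star, x_star) - sum(
        ks * vi for ks, vi in zip(k_star, solve(K, k_star)))
    return mean, max(var, 0.0)

def bayes_opt(f, lo, hi, n_init=3, n_iter=10, kappa=2.0, seed=0):
    """Minimize f by evaluating where the lower confidence bound is smallest."""
    rng = random.Random(seed)
    X = [rng.uniform(lo, hi) for _ in range(n_init)]
    y = [f(x) for x in X]  # each observation improves the surrogate model
    grid = [lo + i * (hi - lo) / 200 for i in range(201)]
    for _ in range(n_iter):
        def lcb(x):  # acquisition: mean minus kappa * standard deviation
            m, v = gp_posterior(X, y, x)
            return m - kappa * math.sqrt(v)
        x_next = min(grid, key=lcb)  # point most likely to improve
        X.append(x_next)
        y.append(f(x_next))
    i_best = min(range(len(y)), key=lambda i: y[i])
    return X[i_best], y[i_best]

x_best, y_best = bayes_opt(lambda x: (x - 2.0) ** 2, lo=0.0, hi=5.0)
```

The key property: each costly evaluation updates the surrogate, so the next sample is spent where the model predicts the best trade-off between low mean and high uncertainty.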
CMA-ES
● CMA-ES stands for Covariance Matrix Adaptation Evolution Strategy
● It is an evolutionary algorithm for difficult non-linear, non-convex, black-box optimisation problems in continuous domains
● CMA-ES is considered state-of-the-art in evolutionary computation and has been adopted as one of the standard tools for continuous optimisation
CMA-ES
● It is an evolutionary algorithm
● Solutions are represented by parameter vectors with real number values
● Initial solutions are randomly generated
● Subsequent solutions are generated from the fittest solutions of the previous generation by recombination and mutation
CMA-ES
● Runs have the same population size each step (λ, λ > 4, usually λ > 20)
● Each value of the solution's parameter vector is modified by sampling from a multivariate normal distribution
● The distribution is updated by covariance matrix adaptation (CMA) based on the best solutions found in the current step (ES)
CMA-ES
● The mean is updated each time to provide a new centroid for new solutions
● Two evolution paths (also called search paths) record the time evolution of the distribution mean of the strategy
● The two paths contain information about the correlation between consecutive iterations
CMA-ES
● Each iteration the mean is adjusted
● The two evolution paths are updated
● A new step size is calculated
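Full CMA-ES is involved, so as a simplified illustration of the evolutionary loop only, here is a basic (mu/mu, lambda) evolution strategy. It replaces covariance-matrix adaptation and path-based step-size control with an isotropic distribution and a crude step-size decay; all constants are arbitrary:

```python
import math
import random

def rastrigin(xs):
    """Rastrigin function; its global minimum is 0, at the origin."""
    return 10 * len(xs) + sum(x * x - 10 * math.cos(2 * math.pi * x) for x in xs)

def simple_es(f, dims, lam=20, mu=5, sigma=2.0, n_gen=100, seed=1):
    """Simplified (mu/mu, lambda) evolution strategy: sample lambda offspring
    around the mean, recombine the mu fittest into a new mean, shrink the
    step size.  Real CMA-ES additionally adapts a full covariance matrix and
    controls sigma via the two evolution paths."""
    rng = random.Random(seed)
    mean = [rng.uniform(-5.12, 5.12) for _ in range(dims)]  # random init
    for _ in range(n_gen):
        # Sample lambda offspring from N(mean, sigma^2 * I).
        pop = [[m + sigma * rng.gauss(0, 1) for m in mean] for _ in range(lam)]
        pop.sort(key=f)  # selection: fittest first
        # Recombination: new mean (centroid) averages the mu best offspring.
        mean = [sum(ind[d] for ind in pop[:mu]) / mu for d in range(dims)]
        sigma *= 0.95  # crude decay instead of path-based step-size adaptation
    return mean, f(mean)

best_x, best_y = simple_es(rastrigin, dims=2)
```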
Running the simulations
● Windows: WinPython, https://winpython.github.io/
● Linux and macOS: Python 3.5+, SciPy, scikit-learn, skopt (scikit-optimize)
● A Python virtualenv is recommended on Linux and macOS
● To test your setup, you should be able to run the examples at https://scikit-optimize.github.io/
git clone https://github.com/IASIAI/hyperparameter-optimization-strategies.git
Simulation
● Trying to find the maximum value of the Rastrigin function
● Will run, in turn:
○ Grid Search
○ Random Search
○ Bayesian optimization
○ CMA-ES
Bibliography
● https://www.lri.fr/~hansen/cmaesintro.html
● https://blog.sigopt.com/posts/evaluating-hyperparameter-optimization-strategies
● https://cloud.google.com/blog/big-data/2017/08/hyperparameter-tuning-in-cloud-machine-learning-engine-using-bayesian-optimization
● https://en.wikipedia.org/wiki/Hyperparameter_(machine_learning)
● https://en.wikipedia.org/wiki/CMA-ES
● https://en.wikipedia.org/wiki/Rastrigin_function
Questions?
Thank You!