
Page 1: Improving Model Predictions via Stacking and Hyper-parameters Tuning

Improving Model Predictions via Stacking and Hyper-Parameters Tuning

Jo-fai (Joe) Chow
Data Scientist
[email protected]




About Me

• 2005 - 2015

• Water Engineer
  o Consultant for Utilities
  o EngD Research

• 2015 - Present

• Data Scientist
  o Virgin Media
  o Domino Data Lab



Mango Data Science Radar



About This Talk

• Predictive modelling
  o Kaggle as an example

• Improve predictions with simple tricks

• Use data science for social good 👍



About Kaggle

• World’s biggest predictive modelling competition platform

• 560k members

• Competition types:
  o Featured (prize)
  o Recruitment
  o Playground
  o 101



Predicting Shelter Animal Outcomes

• X: Predictors
  o Name
  o Gender
  o Type (🐱 or 🐶)
  o Date & Time
  o Age
  o Breed
  o Colour

• Y: Outcomes (5 types)
  o Adoption
  o Died
  o Euthanasia
  o Return to Owner
  o Transfer

• Data
  o Training (27k samples)
  o Test (11k)



Basic Feature Engineering

X | Raw (Before) | Reformatted (After)
Name | Elsa, Steve, Lassie | [name_len]: 4, 5, 6
Date & Time | 2014-02-12 18:22:00 | [year]: 2014, [month]: 2, [weekday]: 4, [hour]: 18
Age | 1 year, 3 weeks, 2 days | [age_day]: 365, 21, 2
Breed | German Shepherd, Pit Bull Mix | [is_mix]: 0, 1
Colour | Brown Brindle/White | [simple_colour]: brown
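
The transformations in the table can be sketched in Python as follows. This is only an illustration of the idea, not the code used in the talk: the function names, the 30-day month, and the rule for simplifying colours (keep the first word of the first colour) are assumptions chosen to reproduce the values shown on the slide.

```python
from datetime import datetime

# Rough unit-to-days conversion; the slide maps "1 year" to 365 days.
# The 30-day month is an assumption, it does not appear on the slide.
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30, "year": 365}


def age_to_days(age_text):
    """Convert strings such as '3 weeks' into an approximate number of days."""
    count, unit = age_text.split()
    return int(count) * DAYS_PER_UNIT[unit.rstrip("s")]


def engineer_features(name, timestamp, age, breed, colour):
    """Return the reformatted features for one animal record."""
    when = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return {
        "name_len": len(name),
        "year": when.year,
        "month": when.month,
        # Sunday = 1 ... Saturday = 7, which gives weekday 4 for the slide's date.
        "weekday": when.isoweekday() % 7 + 1,
        "hour": when.hour,
        "age_day": age_to_days(age),
        "is_mix": int("mix" in breed.lower()),
        # Assumption: keep the first word of the first listed colour.
        "simple_colour": colour.split("/")[0].split()[0].lower(),
    }


features = engineer_features(
    "Elsa", "2014-02-12 18:22:00", "1 year", "Pit Bull Mix", "Brown Brindle/White"
)
```

Each raw column becomes one or more numeric or low-cardinality features that tree-based models can split on directly.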



Common Machine Learning Techniques

• Ensembles
  o Bagging/boosting of decision trees
  o Reduces variance and increases accuracy
  o Popular R Packages (used in next example)
    • “randomForest”
    • “xgboost”

• There are a lot more machine learning packages in R:
  o “caret”, “caretEnsemble”
  o “h2o”, “h2oEnsemble”
  o “mlr”



Simple Trick – Model Averaging

• Stratified sampling
  o 80% for training
  o 20% for validation

• Evaluation metric
  o Multi-class Log Loss
  o Lower is better
  o 0 = Perfect

• 50 runs
  o Different random seed for each run
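
The evaluation metric and the averaging trick can be sketched in a few lines of Python. This is a minimal illustration rather than the code behind the slide (the talk used R packages); the function names and the toy probabilities below are made up for the example.

```python
import math


def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class (lower is better)."""
    total = 0.0
    for label, row in zip(y_true, probs):
        # Clip to avoid log(0) when a model is overconfident.
        p = min(max(row[label], eps), 1 - eps)
        total -= math.log(p)
    return total / len(y_true)


def average_predictions(prediction_sets):
    """Average class probabilities element-wise across several models."""
    n_models = len(prediction_sets)
    return [
        [sum(values) / n_models for values in zip(*rows)]
        for rows in zip(*prediction_sets)
    ]


# Toy validation set: two samples, two classes, two hypothetical models.
y_valid = [0, 1]
model_a = [[0.9, 0.1], [0.6, 0.4]]
model_b = [[0.5, 0.5], [0.2, 0.8]]

averaged = average_predictions([model_a, model_b])
loss_a = multiclass_log_loss(y_valid, model_a)
loss_b = multiclass_log_loss(y_valid, model_b)
loss_avg = multiclass_log_loss(y_valid, averaged)
```

Because the logarithm is concave, the log loss of the averaged probabilities is never worse than the mean of the individual log losses, which is one reason plain averaging is a safe first trick.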



More Advanced Methods

• Model Stacking
  o Uses a second-level metalearner to learn the optimal combination of base learners
  o R Packages:
    • “SuperLearner”
    • “subsemble”
    • “h2oEnsemble”
    • “caretEnsemble”

• Hyper-parameters Tuning
  o Improves the performance of individual machine learning algorithms
  o Grid search
    • Full / Random
  o R Packages:
    • “caret”
    • “h2o”
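
To make the idea of a second-level metalearner concrete, here is a small, self-contained Python sketch of stacking on a toy regression problem. It is not the API of any of the R packages listed above. The two base learners, the fold assignment, and the unconstrained least-squares metalearner are all assumptions chosen to keep the example short.

```python
def fit_mean(X, y):
    """Base learner 1: always predict the training mean."""
    mean_y = sum(y) / len(y)
    return lambda x: mean_y


def fit_linear(X, y):
    """Base learner 2: one-feature ordinary least squares."""
    mean_x, mean_y = sum(X) / len(X), sum(y) / len(y)
    sxx = sum((x - mean_x) ** 2 for x in X)
    sxy = sum((x - mean_x) * (t - mean_y) for x, t in zip(X, y))
    slope = sxy / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def out_of_fold_predictions(fit, X, y, n_folds=5):
    """Predict every row with a model that never saw that row during training."""
    preds = [None] * len(X)
    for fold in range(n_folds):
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            preds[i] = predict(X[i])
    return preds


def fit_meta_weights(p1, p2, y):
    """Metalearner: least-squares weights for two columns of predictions."""
    a11 = sum(a * a for a in p1)
    a12 = sum(a * b for a, b in zip(p1, p2))
    a22 = sum(b * b for b in p2)
    b1 = sum(a * t for a, t in zip(p1, y))
    b2 = sum(b * t for b, t in zip(p2, y))
    det = a11 * a22 - a12 * a12
    # Solve the 2x2 normal equations with Cramer's rule.
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det)


# Toy data where the linear learner is exactly right.
X = list(range(10))
y = [2 * x + 1 for x in X]

oof_mean = out_of_fold_predictions(fit_mean, X, y)
oof_linear = out_of_fold_predictions(fit_linear, X, y)
weights = fit_meta_weights(oof_mean, oof_linear, y)
```

The metalearner is trained on out-of-fold predictions so that it cannot reward a base learner for memorising the training rows. On this toy data it puts essentially all the weight on the linear learner.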

For more info, see



Trade-Off of Advanced Methods

• Strength
  o Model tuning + stacking won nearly all Kaggle competitions.
  o Multi-algorithm ensemble may better approximate the true predictive function than any single algorithm.

• Weakness
  o Increased training and prediction times.
  o Increased model complexity.
  o Requires large machines or clusters for big data.



R + H2O = Scalable Machine Learning

• H2O is an open-source, distributed machine learning library written in Java with APIs in R, Python and more.

• “h2oEnsemble” is the scalable implementation of the Super Learner algorithm for H2O.



H2O Random Grid Search Example

Define search range and criteria

Best models
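
The two steps named on this slide, defining a search range with stopping criteria and then ranking the best models, can be illustrated with a generic random search in Python. This sketch does not use or reproduce the H2O API. The function `random_grid_search`, the hyper-parameter names, and the toy scoring function (a stand-in for a validation log loss) are hypothetical.

```python
import random


def random_grid_search(score, search_space, max_models, seed=1234):
    """Sample up to max_models distinct combinations and rank them by score."""
    rng = random.Random(seed)
    total = 1
    for values in search_space.values():
        total *= len(values)
    tried = set()
    results = []
    while len(results) < min(max_models, total):
        params = {name: rng.choice(values) for name, values in search_space.items()}
        key = tuple(sorted(params.items()))
        if key in tried:
            continue  # Skip combinations that were already evaluated.
        tried.add(key)
        results.append((score(params), params))
    # Lower score is better, as with log loss.
    return sorted(results, key=lambda item: item[0])


def toy_score(params):
    """Hypothetical validation score with its minimum at depth 6, rate 0.05."""
    return (params["max_depth"] - 6) ** 2 + 10 * abs(params["learn_rate"] - 0.05)


search_space = {"max_depth": [3, 6, 9], "learn_rate": [0.01, 0.05, 0.1]}
ranked = random_grid_search(toy_score, search_space, max_models=4)
best_score, best_params = ranked[0]
```

With `max_models` smaller than the full grid, only a random subset is evaluated, which trades some chance of missing the best combination for a much shorter search.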



H2O Model Stacking Example

17 out of 717 teams (≈ top 2%)

Getting reasonable results

Using h2o.stack(…) to combine multiple models




• Many R packages for predictive modelling.

• Use hyper-parameters tuning to improve individual models.

• Use model averaging / stacking to improve predictions.

• Trade-off between model performance and computational costs.

• Use R + H2O for scalable machine learning.

• H2O random grid search and stacking.

• Use data science for social good 👍



Big Thank You!

• Mango Solutions
• RStudio
• Domino Data Lab
• H2O
  o Erin LeDell
  o Raymond Peck
  o Arno Candel

1st LondonR Talk: Crime Map Shiny

2nd LondonR Talk: Domino API



Any Questions?

• Contact
  o [email protected]
  o @matlabulous

• Slides & Code

• H2O in London
  o Meetups / Office (soon)

• More H2O at Strata tomorrow
  o Innards of H2O (11:15)
  o Intro to Generalised Low-Rank Models (14:05)



Extra Slide (Stratified Sampling)
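
Stratified sampling, as in the 80% / 20% split used for model averaging earlier in the talk, can be sketched in Python as below. This is an illustrative assumption about how such a split might be implemented, not the talk's code; the function name and the toy label counts are invented for the example.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.8, seed=42):
    """Return (train_idx, valid_idx) so each class keeps roughly the same share."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train_idx, valid_idx = [], []
    for indices in by_class.values():
        # Shuffle within each class, then cut at the requested fraction.
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        valid_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(valid_idx)


# Toy outcome labels with unequal class sizes.
labels = ["Adoption"] * 50 + ["Transfer"] * 30 + ["Died"] * 20
train_idx, valid_idx = stratified_split(labels)
```

Splitting inside each class keeps rare outcomes represented in both the training and validation sets, which a purely random split does not guarantee.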